Skipping NULL values in JSON in R

Time: 2018-03-12 16:25:49

Tags: json r rjson

Hey, guys! I have a simple R script to parse a JSON file:

json <- rjson::fromJSON(
  readLines('http://data.rada.gov.ua/ogd/zpr/skl8/bills-skl8.json', warn = FALSE))
bills <-  data.frame(
  id = numeric(), 
  title = character(),
  type = character(), 
  subject = character(), 
  rubric = character(),
  executive = character(),
  sesion = character(),
  result = character() 
)
for (row in json) 
{
  bill <- data.frame(
    id = row$id, 
    title = row$title, 
    type = row$type,
    subject = row$subject, 
    rubric = row$rubric,
    executive = row$mainExecutives$executive$department,
    sesion = row$registrationSession,
    result = row$currentPhase$title
  )
  bills <- rbind(bills, bill)
}

But I get an error: Error in data.frame(id = row$id, title = row$title, type = row$type, subject = row$subject, : arguments imply differing number of rows: 1, 0

So, my JSON file has NULL values in row 277. Can I skip this error or replace the NULL values in the loop? Thanks!
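
For what it's worth, the error can be reproduced without the JSON file. The sketch below uses a made-up record (the `row` list and its values are assumptions, not data from the actual file): a field that is NULL has length 0, so data.frame() is asked to combine a one-row column with a zero-row one.

```r
# Hypothetical record standing in for one element of `json`;
# the nested department field is absent, so the lookup yields NULL.
row <- list(id = 1, title = "Some bill", mainExecutives = NULL)

department <- row$mainExecutives$executive$department
print(is.null(department))  # chained $ lookups on NULL stay NULL
print(length(department))   # NULL has length zero

# data.frame() cannot combine a length-1 column with a length-0 one,
# so capture the error instead of letting it stop the script.
res <- tryCatch(
  data.frame(id = row$id, executive = department),
  error = function(e) e
)
print(inherits(res, "error"))
if (inherits(res, "error")) print(conditionMessage(res))
```

Any fix therefore has to turn the NULL into a length-1 value (an empty string or NA, for example) before it reaches data.frame().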

2 Answers:

Answer 0: (score: 1)

To answer your immediate question, I'd wrap the lookup in a small function that returns a string when the executive department is missing.

protect_against_null <- function( x ) {
  if( is.null(x) )
    return( "" ) # Replace with whatever string you'd like.
  else 
    return( x )
}

for (row in json) {
  bill <- data.frame(
    id = row$id, 
    title = row$title, 
    type = row$type,
    subject = row$subject, 
    rubric = row$rubric,
    executive = protect_against_null(row$mainExecutives$executive$department),
    sesion = row$registrationSession,
    result = row$currentPhase$title
  )
  bills <- rbind(bills, bill)
}

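Long-term advice: since this dataset has 11,000 nested records, I'd steer clear of the loop. Check out the purrr package for mapping nested JSON/lists into rectangular data frames, in particular purrr::map_dfr().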
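
A rough sketch of what that could look like, assuming purrr is installed along with dplyr (which map_dfr() relies on for the row binding). The `records` list is a hypothetical stand-in for the parsed JSON, and using purrr::pluck() with `.default` for the missing department is one possible choice, not the only one.

```r
library(purrr)

# Hypothetical stand-in for the parsed JSON: a list of nested records,
# one of which has no executive department.
records <- list(
  list(id = 1, title = "Bill A",
       mainExecutives = list(executive = list(department = "Committee X"))),
  list(id = 2, title = "Bill B", mainExecutives = NULL)
)

# Build a one-row data frame per record and row-bind the results.
bills <- map_dfr(records, function(row) {
  data.frame(
    id = row$id,
    title = row$title,
    # pluck() returns .default when the nested element is missing or NULL
    executive = pluck(row, "mainExecutives", "executive", "department",
                      .default = NA_character_),
    stringsAsFactors = FALSE
  )
})

print(bills)
```

Compared with calling rbind() inside a for loop, this avoids growing `bills` one row at a time, which tends to get slow with thousands of records.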

Answer 1: (score: 1)

For this, fromJSON (from the jsonlite package) may be handy.

library(jsonlite)

url <- 'http://data.rada.gov.ua/ogd/zpr/skl8/bills-skl8.json'
df <- jsonlite::fromJSON(url)   

df1 <- data.frame(
  id = df$id, 
  title = df$title, 
  type = df$type,
  subject = df$subject, 
  rubric = df$rubric,
  executive = df$mainExecutives$executive$department,
  sesion = df$registrationSession,
  result = df$currentPhase$title
)
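
A small illustration of why this sidesteps the error, using made-up inline JSON rather than the real URL (the `txt` string and its values are assumptions). As far as I can tell, when jsonlite simplifies an array of records into a data frame it fills null or missing fields with NA, so every column ends up with one value per record.

```r
library(jsonlite)

# Made-up JSON standing in for the real file; the second record
# has a null where the nested executive object would be.
txt <- '[
  {"id": 1, "title": "Bill A",
   "mainExecutives": {"executive": {"department": "Committee X"}}},
  {"id": 2, "title": "Bill B", "mainExecutives": null}
]'

df <- fromJSON(txt)

# Nested objects become nested data frame columns, so the chained $
# access returns one value per record instead of NULL.
department <- df$mainExecutives$executive$department
print(department)
print(length(department) == nrow(df))
```

If the nested columns get in the way, fromJSON(txt, flatten = TRUE) should give ordinary columns with dotted names instead.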