R: JSON package - importing data & missing values / null

Time: 2013-11-27 15:04:26

Tags: arrays json r import null

I am using the JSON package to read data.

Basically, the data format is as follows:

{"a":1,"b":2,"c":3}
{"a": null,"b":2,"c":3}

I store the data in R as follows:

library(data.table)
# fromJSON() comes from whichever JSON package is loaded (not named in the question)
DAT <- data.table(read.csv("D:/file.csv"))

# create unified variable names
OUT <- list()
vnames <- NULL
for (i in seq_len(nrow(DAT))) {
  OUT[[i]] <- fromJSON(as.character(DAT[i]$results))
  vnames <- c(vnames, names(OUT[[i]]))
}

# create the corresponding content
content <- NULL
Applicant <- NULL
for (i in seq_len(nrow(DAT))) {
  temp <- fromJSON(as.character(DAT[i]$results))
  for (j in seq_along(temp)) {
    # a NULL element becomes character(0), so it adds nothing to content
    content <- c(content, as.character(temp[[j]]))
  }
}

Then I want to combine the two vectors (to get the data in a typical format):

assets_mren = data.frame(asset_class=vnames, value=content)

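But I get an error message saying that vnames and content differ in their number of rows. I think the problem in the data being read is the "null". Do you know how to read the "null" above, or how to read the data in a better way?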

1 Answer:

Answer 0: (score: 0)

Yes, the problem is the null. Each line ends up with a different structure.

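library(RJSONIO)   # assumed: the output below matches RJSONIO's fromJSON()
ll <- '{"a":1,"b":2,"c":3}
       {"a": null,"b":2,"c":3}'
ll <- readLines(textConnection(ll))          ## one JSON object per element
res <- lapply(ll,function(x)str(fromJSON(x)))
 Named num [1:3] 1 2 3                       ## named vector for the first line
 - attr(*, "names")= chr [1:3] "a" "b" "c"
List of 3
 $ a: NULL                                   ## list for the second line
 $ b: num 2
 $ c: num 3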
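
To see why this breaks the data.frame() call in the question, here is a minimal base-R sketch. The list `temp` is a hand-written stand-in for what fromJSON() returns for the second line (an assumption for illustration, not actual parser output); the loop mirrors the one in the question.

```r
# Stand-in for the parsed second line: element "a" is NULL
temp <- list(a = NULL, b = 2, c = 3)

vnames <- names(temp)
content <- NULL
for (j in seq_along(temp)) {
  # as.character(NULL) is character(0), so c() appends nothing for "a"
  content <- c(content, as.character(temp[[j]]))
}

length(vnames)    # 3
length(content)   # 2
```

The names keep all three entries while the values lose one, so the two vectors no longer line up row by row.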

So you have to make the output of every line uniform. Here are 2 options:

1- Replace null with a dummy value (0 or -1), for example:

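## note: gsub() would also rewrite the text "null" inside string values
ll <- readLines(textConnection(gsub("null",-1,ll)))
res <- do.call(rbind,lapply(ll,function(x)
    fromJSON(x)))
res
     a b c
[1,]  1 2 3
[2,] -1 2 3
res[res==-1] <- NA    ## replace the dummy value with NA afterwards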
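
A variation on this option, sketched in base R under stated assumptions: instead of editing the JSON text, swap NULL for NA after parsing. The helper `null_to_na` is hypothetical (not part of any package), and `parsed` is a hand-written stand-in for the fromJSON() results of the two lines.

```r
# Hypothetical helper: replace NULL elements of a parsed record with NA
null_to_na <- function(x) lapply(x, function(v) if (is.null(v)) NA else v)

# Stand-in for lapply(ll, fromJSON); written by hand for illustration
parsed <- list(
  list(a = 1, b = 2, c = 3),
  list(a = NULL, b = 2, c = 3)
)

# Every record now has the same three fields, so rbind() lines them up
res <- do.call(rbind, lapply(parsed, function(p) unlist(null_to_na(p))))
res
```

This avoids picking a dummy value that might collide with real data, at the cost of one extra pass over each record.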

2- Keep the null, but then you should use rbind.fill to get a data.frame:

ll <- '{"a":1,"b":2,"c":3}
{"a": null,"b":2,"c":3}'
ll <- readLines(textConnection(ll))
res <- lapply(ll,function(x)
    as.data.frame(t(as.matrix(unlist(fromJSON(x))))))
library(plyr)
rbind.fill(res)

   a b c
1  1 2 3
2 NA 2 3
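
If the long asset_class/value layout from the question is still wanted, it can be derived from the wide result. This is a minimal base-R sketch; `res` is rebuilt by hand here so the snippet runs on its own, and `long` is a name chosen for illustration.

```r
# Wide result as printed above, rebuilt by hand for a self-contained example
res <- data.frame(a = c(1, NA), b = c(2, 2), c = c(3, 3))

# Walk the table row by row: names repeat once per row, values follow row order
long <- data.frame(
  asset_class = rep(names(res), times = nrow(res)),
  value = as.vector(t(as.matrix(res))),
  stringsAsFactors = FALSE
)
long
```

Because the missing value is now an explicit NA, both columns have the same length and data.frame() no longer complains.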