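I'm inexperienced with R and I have to finish a script for a school project. I have a JSON file with nested lists, and I need to retrieve the values held in two attributes. The problem is that these attributes sit in different lists within my file. So I tried this: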
`grabInfo <- function(var) {
  sapply(json_data.rows, function(x) returnData(x, var))
}
returnData <- function(x, var) {
  if (!is.null(x$doc$payload[[var]])) {
    return(trim(x$doc$payload[[var]]))
  } else if (!is.null(x$doc$sensorType == 'movement')) {
    return(trim(rbind(x$doc$value, x$doc$date_inmilli)))
  } else {
    return(NA)
  }
}
fmDataDF <- data.frame(sapply(c(1, 3), grabInfo), stringsAsFactors = FALSE)`
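In the first if (so the first kind of list) I extract the values contained in var (specifically the first and third attributes). When it comes to the second kind of list, I have to check that it is the right sensor (the data is dirty), but I cannot use var because "value" and "date_inmilli" are at different positions (c(3,5)), so I had to hard-code where to look... I know... horrible. In my data frame, instead of two values in two columns, each column holds two vectors (value, date).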
An example of my file json_data with only two rows (in the script I work on a subset, json_data.rows <- json_data[['rows']]) looks like this:
{"total_rows":99019,"offset":0,"rows":[
{"id":"5f238411f1d877723c8e5f7a40c1fa3f","key":"5f238411f1d877723c8e5f7a40c1fa3f","value":{"rev":"1-7625857a2f79af0e46c2b73fdbde986e"},"doc":{"_id":"5f238411f1d877723c8e5f7a40c1fa3f","_rev":"1-7625857a2f79af0e46c2b73fdbde986e","value":0,"date_human":"2017-04-12T03:44:20.803Z","date_inmilli":1491968660803,"sensorType":"movement","date":"2017-04-12T03:44:19.902Z"}},
{"id":"0006d0a04d14c4cf0158db1a9a185dac","key":"0006d0a04d14c4cf0158db1a9a185dac","value":{"rev":"1-496c4d06dff82c1ad8a03cbcdf47f10b"},"doc":{"_id":"0006d0a04d14c4cf0158db1a9a185dac","_rev":"1-496c4d06dff82c1ad8a03cbcdf47f10b","topic":"iot-2/type/node-red-wiotp/id/f5f7aa27.368c48/evt/update/fmt/json","payload":{"value":0,"date_human":"2017-04-20T17:55:43.788Z","date_inmilli":1492710943788,"sensorType":"movement"},"deviceId":"f5f7aa27.368c48","deviceType":"node-red-wiotp","eventType":"update","format":"json","increment":0}}]}
The sample output is:
Values Date
93 0 1493750501405
94 c(1, 1491568686336) c(1, 1491568686336)
Answer 0: (score: 0)
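Your sample data is missing a final "}". Here is a simpler approach: read the data with jsonlite::fromJSON and flatten = TRUE, then inspect the result with str. This gives a consistent array structure that is easier to navigate, with missing values set to NA. In the code below, json is assumed to hold the JSON text (a path to the file works too).
library(jsonlite)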
myList <- fromJSON(json, flatten = TRUE)
str(myList)
# List of 3
# $ total_rows: int 99019
# $ offset : int 0
# $ rows :'data.frame': 2 obs. of 20 variables:
# ..$ id : chr [1:2] "5f238411f1d877723c8e5f7a40c1fa3f" "0006d0a04d14c4cf0158db1a9a185dac"
# ..$ key : chr [1:2] "5f238411f1d877723c8e5f7a40c1fa3f" "0006d0a04d14c4cf0158db1a9a185dac"
# ..$ value.rev : chr [1:2] "1-7625857a2f79af0e46c2b73fdbde986e" "1-496c4d06dff82c1ad8a03cbcdf47f10b"
# ..$ doc._id : chr [1:2] "5f238411f1d877723c8e5f7a40c1fa3f" "0006d0a04d14c4cf0158db1a9a185dac"
# <snip>
# ..$ doc.sensorType : chr [1:2] "movement" NA
# <snip>
# ..$ doc.payload.date_inmilli: num [1:2] NA 1.49e+12
# ..$ doc.payload.sensorType : chr [1:2] NA "movement"
result <- data.frame(sensorType = myList$rows$doc.payload.sensorType,
date_inmilli = myList$rows$doc.payload.date_inmilli)
# sensorType date_inmilli
# 1 <NA> NA
# 2 movement 1.492711e+12
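The result above only picks up records that carry a payload. To also cover records that keep value and date_inmilli directly under doc, one option is to take the payload column where it is filled and fall back to the top-level column otherwise. This is a minimal sketch, not part of the original answer: the helper name pick, the trimmed two-row JSON and the ids "a" and "b" are made up for illustration, and it assumes both the doc.* and doc.payload.* columns exist after flattening.

```r
library(jsonlite)

# Trimmed stand-in for the sample data: one record of each layout (ids are hypothetical)
json <- '{"total_rows":2,"offset":0,"rows":[
  {"id":"a","doc":{"value":0,"date_inmilli":1491968660803,"sensorType":"movement"}},
  {"id":"b","doc":{"payload":{"value":1,"date_inmilli":1492710943788,"sensorType":"movement"}}}]}'

rows <- fromJSON(json, flatten = TRUE)$rows

# Hypothetical helper: use the payload column where present, else the doc-level column
pick <- function(df, field) {
  nested <- df[[paste0("doc.payload.", field)]]
  top    <- df[[paste0("doc.", field)]]
  ifelse(is.na(nested), top, nested)
}

sensor <- pick(rows, "sensorType")
result <- data.frame(Values = pick(rows, "value"),
                     Date   = pick(rows, "date_inmilli"))

# Keep only movement records, since the data is described as dirty
result <- result[!is.na(sensor) & sensor == "movement", ]
```

Each column then holds a single value per row, which is the shape the question asks for.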
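If you would rather stay with the nested list, it may help to see why returnData fills each cell with two values. In R, NULL == 'movement' evaluates to logical(0), which is not NULL, so the !is.null(...) test in the second branch passes for every record; and because that branch returns rbind(value, date_inmilli), both values come back no matter which var was asked for. The following is a sketch in base R only, under stated assumptions: the helper get_field and the hand-built rows list are illustrative, fields are looked up by name rather than by position, and the sensor check is applied to payload records as well, which the original code does not do.

```r
# Two hand-built records mirroring the two layouts in the sample data
rows <- list(
  list(doc = list(value = 0, date_inmilli = 1491968660803,
                  sensorType = "movement")),
  list(doc = list(payload = list(value = 0, date_inmilli = 1492710943788,
                                 sensorType = "movement")))
)

# NULL == "movement" is logical(0), which is not NULL, so this is always TRUE
always_true <- !is.null(NULL == "movement")

# Hypothetical helper: return one named field, from the payload if there is one,
# otherwise from the doc level; NA for non-movement records or missing fields
get_field <- function(x, field) {
  src <- if (!is.null(x$doc$payload)) x$doc$payload else x$doc
  if (!identical(src$sensorType, "movement") || is.null(src[[field]])) {
    return(NA)
  }
  src[[field]]
}

fm <- data.frame(
  Values = sapply(rows, get_field, field = "value"),
  Date   = sapply(rows, get_field, field = "date_inmilli")
)
```

Because get_field returns exactly one value per call, sapply yields a plain vector and each data frame column ends up with one value per row.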