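In my current project, I am trying to read data from a CSV file and, in R, build a hierarchical JSON array from that data. Sample data is shown below (the data set has been reduced for simplicity):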
Country  Provider  2 G Data  3 G Data  LTE  FP0  anfang0  2G  3G  FP1  anfang1
ABC      A1        n         n         n    fp0  j        NA  NA  NA   NA
ABC      A2        NA        NA        NA   NA   NA       j   j   fp1  n
ABC      A3        n         n         n    fp0  j        NA  NA  NA   NA
DEF      A7        j         j         j    fp0  n        j   j   fp1  n
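Understanding the data: n stands for no, j stands for yes, and NA means the value is missing. FP0 and FP1 hold information about the same provider, but in different areas. A single row therefore contains two groups of data: 2 G Data, 3 G Data, LTE, FP0 and anfang0 belong to one group, while 2G, 3G, FP1 and anfang1 belong to the other group. If all of the information in a group is n (i.e. no), then we have to consider the corresponding anfang0 or anfang1 value instead.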
The sample output looks like this (based on the explanation above):
{
  "ABC": {
    "fp0": [
      {
        "provider": "A1",
        "anfrage": "j"
      },
      {
        "provider": "A3",
        "anfrage": "j"
      }
    ],
    "fp1": [
      {
        "provider": "A2",
        "2G": "j",
        "3G": "j"
      }
    ]
  },
  "DEF": {
    "fp1": [
      {
        "provider": "A7",
        "2G": "j",
        "3G": "j"
      }
    ],
    "fp0": [
      {
        "provider": "A7",
        "2G": "j",
        "3G": "j",
        "LTE": "j"
      }
    ]
  }
}
In the JSON format above, there should be exactly one JSON block per Country, as shown. So far I have tried to follow this link, but I could not find any workable solution.
a <- c()  # accumulator for the per-row JSON strings
for (i in 1:nrow(data)) {
  row <- list("provider" = data$Provider[i], "2g" = data$`2 G Data`[i],
              "3g" = data$`3 G Data`[i], "LTE" = data$LTE[i])
  a <- c(a, jsonlite::toJSON(list(list("fp0" = row)), pretty = TRUE))
}
jsonlite::toJSON(a, pretty = TRUE, auto_unbox = TRUE)
Please let me know if anything needs to be clarified.
Answer 0 (score: 1)
One approach could be
library(dplyr)
library(jsonlite)

# data pre-processing (stack the data of the different areas as rows)
df1 <- df[, 1:7] %>%                      # columns holding the data for one area, i.e. fp0
  na.omit() %>%
  `colnames<-`(c("country", "provider", "2G", "3G", "LTE", "fp", "anfang")) %>%
  bind_rows(
    df[, c(1:2, 8:ncol(df))] %>%          # columns holding the data for the other area, i.e. fp1
      na.omit() %>%
      `colnames<-`(c("country", "provider", "2G", "3G", "fp", "anfang"))
  )

# convert every "n" to NA, since "n" values are not wanted in the final output
df1[df1 == "n"] <- NA

# convert the processed data frame to a nested list (country, then fp)
dfList <- lapply(split(df1, df1$country),
                 function(x) split(x[, c("provider", "2G", "3G", "LTE", "anfang")], x$fp))

# final result (convert the list to JSON)
json_out <- toJSON(dfList, auto_unbox = TRUE)
which gives
> json_out
{"ABC":{"fp0":[{"provider":"A1","anfang":"j"},{"provider":"A3","anfang":"j"}],"fp1":[{"provider":"A2","2G":"j","3G":"j"}]},"DEF":{"fp0":[{"provider":"A7","2G":"j","3G":"j","LTE":"j"}],"fp1":[{"provider":"A7","2G":"j","3G":"j"}]}}
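The output above contains no "n" entries because, by default, jsonlite omits NA fields when it writes the rows of a data frame, which is why the "n" values were converted to NA first. A minimal sketch of that behaviour on its own, using a small made-up data frame (`demo`, `json_demo` and `parsed` are illustrative names, not part of the answer):

```r
library(jsonlite)

# Illustrative data only: one provider with a flag set, one with only anfang
demo <- data.frame(provider = c("A7", "A1"),
                   `2G`     = c("j", NA),
                   anfang   = c(NA, "j"),
                   check.names = FALSE, stringsAsFactors = FALSE)

# NA fields are left out of each record instead of being written as null
json_demo <- toJSON(demo, auto_unbox = TRUE)
json_demo

# Round trip: parsing the JSON back restores the missing fields as NA
parsed <- fromJSON(as.character(json_demo))
parsed
```

If the missing fields should appear explicitly, `toJSON()` accepts an `na` argument (for example `na = "null"`), but that would not match the output requested in the question.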
Sample data
df <- structure(list(Country = c("ABC", "ABC", "ABC", "DEF"), Provider = c("A1",
"A2", "A3", "A7"), `2 G Data` = c("n", NA, "n", "j"), `3 G Data` = c("n",
NA, "n", "j"), LTE = c("n", NA, "n", "j"), FP0 = c("fp0", NA,
"fp0", "fp0"), anfang0 = c("j", NA, "j", "n"), `2G` = c(NA, "j",
NA, "j"), `3G` = c(NA, "j", NA, "j"), FP1 = c(NA, "fp1", NA,
"fp1"), anfang1 = c(NA, "n", NA, "n")), .Names = c("Country",
"Provider", "2 G Data", "3 G Data", "LTE", "FP0", "anfang0",
"2G", "3G", "FP1", "anfang1"), class = "data.frame", row.names = c(NA,
-4L))
# Country Provider 2 G Data 3 G Data LTE FP0 anfang0 2G 3G FP1 anfang1
#1 ABC A1 n n n fp0 j <NA> <NA> <NA> <NA>
#2 ABC A2 <NA> <NA> <NA> <NA> <NA> j j fp1 n
#3 ABC A3 n n n fp0 j <NA> <NA> <NA> <NA>
#4 DEF A7 j j j fp0 n j j fp1 n
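For readers who would rather not depend on dplyr, the stacking step of the pre-processing can also be sketched in base R. This is only an illustration under the same column layout as the sample data; the two-row `df` below is a reduced stand-in, and `area0`, `area1` and `long` are hypothetical names.

```r
# Reduced stand-in for the sample data (two rows only, illustrative)
df <- data.frame(
  Country = c("ABC", "DEF"), Provider = c("A2", "A7"),
  `2 G Data` = c(NA, "j"), `3 G Data` = c(NA, "j"), LTE = c(NA, "j"),
  FP0 = c(NA, "fp0"), anfang0 = c(NA, "n"),
  `2G` = c("j", "j"), `3G` = c("j", "j"),
  FP1 = c("fp1", "fp1"), anfang1 = c("n", "n"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Area fp0: first seven columns, dropping providers with no fp0 data
area0 <- na.omit(df[, 1:7])
names(area0) <- c("country", "provider", "2G", "3G", "LTE", "fp", "anfang")

# Area fp1: country, provider and the remaining columns
area1 <- na.omit(df[, c(1:2, 8:ncol(df))])
names(area1) <- c("country", "provider", "2G", "3G", "fp", "anfang")
area1$LTE <- NA_character_   # fp1 has no LTE column; add it so rbind() lines up

# Stack both areas into one long data frame with identical columns
long <- rbind(area0, area1[, names(area0)])
long[long == "n"] <- NA      # same "n" to NA conversion as in the answer
```

From here, the `split()` and `toJSON()` steps of the answer can be applied to `long` in the same way as to `df1`.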