I've run into the following problem:
My data frame has a variable that contains JSON objects (in var2):
var1 var2
1 1 {"property1": "val1", "property2": 5}
2 2 {"property1": "val2", "property2": 8}
3 3 {"property1": "val3", "property2": 7}
4 4 {"property1": "val4", "property2": 0}
5 5 {"property1": "val5", "property3": 9}
(Code on pastebin here)
I would like to extract the JSON properties in var2 and add them to the data frame as new columns, like this:
var1 var2 prop1 prop2 prop3
1 1 {"property1": "val1", "property2": 5} val1 5 NA
2 2 {"property1": "val2", "property2": 8} val2 8 NA
3 3 {"property1": "val3", "property2": 7} val3 7 NA
4 4 {"property1": "val4", "property2": 0} val4 0 NA
5 5 {"property1": "val5", "property3": 9} val5 NA 9
When every row has the same properties in the same order, I found that this approach works:
library(jsonlite)  # provides fromJSON()
library(magrittr)  # provides the %>% pipe
jsonProps <- sapply(df$var2, function(x) fromJSON(x)) %>%
t() %>%
as.data.frame()
rownames(jsonProps) <- NULL
y <- cbind(df, jsonProps)
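(If possible, I'd be glad to receive suggestions on how to make this more efficient.)
However, when the rows do not all share the same properties (as in row 5), this no longer works.
I'm at a loss as to how to create the columns dynamically from the properties I find and transfer the property values correctly, so any suggestions on how to solve this are welcome.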
Answer 0: (score: 3)
You can do this:
library(plyr)
library(jsonlite)
ll = lapply(df$var2, function(x) jsonlite::fromJSON(as.character(x)))
cbind(df, ldply(ll, data.frame))
# var1 var2 property1 property3 property2
#1 a {"property1": "val1", "property3": 8} val1 8 NA
#2 a {"property1": "val1", "property2": 5} val1 NA 5
Data:
df = structure(list(var1 = structure(c(1L, 1L), .Label = "a", class = "factor"),
var2 = structure(1:2, .Label = c("{\"property1\": \"val1\", \"property3\": 8}",
"{\"property1\": \"val1\", \"property2\": 5}"), class = "factor")), .Names = c("var1",
"var2"), class = "data.frame", row.names = 1:2)
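As a possible alternative that also speaks to the efficiency question, the per-row JSON strings can be joined into a single JSON array and parsed once. The sketch below rebuilds the question's sample data and assumes that every value in var2 is a valid, non-missing JSON object; the names json_array, props and result are only illustrative.

```r
library(jsonlite)

# Sample data mirroring the question: row 5 carries property3 instead of property2.
df <- data.frame(
  var1 = 1:5,
  var2 = c(
    '{"property1": "val1", "property2": 5}',
    '{"property1": "val2", "property2": 8}',
    '{"property1": "val3", "property2": 7}',
    '{"property1": "val4", "property2": 0}',
    '{"property1": "val5", "property3": 9}'
  ),
  stringsAsFactors = FALSE
)

# Join the per-row objects into one JSON array so a single parse sees every key.
# Assumption: no element of var2 is NA or an empty string.
json_array <- paste0("[", paste(as.character(df$var2), collapse = ","), "]")

# fromJSON() simplifies an array of objects to a data frame; a key that is
# absent from an object becomes NA in that row.
props <- jsonlite::fromJSON(json_array)

# Attach the extracted properties as new columns.
result <- cbind(df, props)
print(result)
```

Because the parse happens once for the whole column, this avoids calling fromJSON() per row, and it does not need plyr.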
Answer 1: (score: 0)
This doesn't do everything you want, but it may be a better approach:
library("dplyr")
library("jsonlite")
get_it <- function(x) {
jsonlite::fromJSON(as.character(x))
}
tbl_df(test) %>%
rowwise() %>%
mutate(one = get_it(var2)[[1]],
two = get_it(var2)[[2]])
Source: local data frame [5 x 4]
Groups: <by row>
var1 var2 one two
(dbl) (fctr) (chr) (int)
1 1 {"property1": "val1", "property2": 5} val1 5
2 2 {"property1": "val2", "property2": 8} val2 8
3 3 {"property1": "val3", "property2": 7} val3 7
4 4 {"property1": "val4", "property2": 0} val4 0
5 5 {"property1": "val5", "property3": 9} val5 9
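Note that the positional indexing above puts row 5's property3 value (9) into the column two, because [[2]] picks the second property whatever its name is. A minimal sketch of looking properties up by name instead, with NA for a missing property, could look like this; get_prop is a hypothetical helper and the data frame is rebuilt from the question's sample.

```r
library(jsonlite)

# Sample data mirroring the question.
df <- data.frame(
  var1 = 1:5,
  var2 = c(
    '{"property1": "val1", "property2": 5}',
    '{"property1": "val2", "property2": 8}',
    '{"property1": "val3", "property2": 7}',
    '{"property1": "val4", "property2": 0}',
    '{"property1": "val5", "property3": 9}'
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse one JSON string and return the named property,
# or NA when the object does not contain it.
get_prop <- function(json, name) {
  value <- jsonlite::fromJSON(as.character(json))[[name]]
  if (is.null(value)) NA else value
}

# One column per property name; USE.NAMES = FALSE keeps the vectors unnamed.
for (p in c("property1", "property2", "property3")) {
  df[[p]] <- sapply(df$var2, get_prop, name = p, USE.NAMES = FALSE)
}
print(df)
```

This parses each string once per property, so it trades speed for explicit control over which properties become columns.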