Append JSON data from one column to another using a key/ID

Time: 2018-10-30 01:44:47

Tags: r

I have three tables like the ones below. Each has a key and a field holding arbitrary data that describes that particular key:

> json1
       key                                        field
1 hg8oxoi4 "components":{"a": "21","b": "12","c": "34"}
2 gic3bv14 "components":{"a": "78","b": "66","c": "54"}
3 yo47wglq  "components":{"a": "6","b": "12","c": "12"}
4 vibidd0l   "components":{"a": "45","b": "5","c": "1"}
> json2
       key                                          field
1 hg8oxoi4 "last_recall": {"date": "012118","size": "43"}
2 vibidd0l "last_recall": {"date": "101618","size": "12"}
> json3
       key                           field
1 hg8oxoi4 "other_fields":{"people": "11"}
2 gic3bv14 "other_fields":{"people": "10"}
3 yo47wglq  "other_fields":{"people": "4"}

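What is the best way to combine all three tables into one, making sure that all keys are matched against each other and that it copes with some keys having data in a given table while others do not? Ideally, each field would be appended to the others, so that the field column of the new table is a JSON object containing the different pieces of data.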

Edit: here is the expected output.

> json4
       key
1 hg8oxoi4
2 gic3bv14
3 yo47wglq
4 vibidd0l
                                                                                                                       field
1 {"components":{"a": "21","b": "12","c": "34"},"last_recall": {"date": "012118","size": "43"},"other_fields":{"people": "11"}}
2                                                {"components":{"a": "78","b": "66","c": "54"},"other_fields":{"people": "10"}}
3                                                  {"components":{"a": "6","b": "12","c": "12"},"other_fields":{"people": "4"}}
4                                   {"components":{"a": "45","b": "5","c": "1"},"last_recall": {"date": "101618","size": "12"}}

Edit 2: dput output of json1 and json2

> dput(json1)
structure(list(key = c("hg8oxoi4", "gic3bv14", "yo47wglq", "vibidd0l"
), field = c("\"components\":{\"a\": \"21\",\"b\": \"12\",\"c\": \"34\"}", 
"\"components\":{\"a\": \"78\",\"b\": \"66\",\"c\": \"54\"}", 
"\"components\":{\"a\": \"6\",\"b\": \"12\",\"c\": \"12\"}", 
"\"components\":{\"a\": \"45\",\"b\": \"5\",\"c\": \"1\"}")), .Names = c("key", 
"field"), row.names = c(NA, -4L), class = "data.frame")

> dput(json2)
structure(list(key = c("hg8oxoi4", "vibidd0l"), field = c("\"last_recall\": {\"date\": \"012118\",\"size\": \"43\"}", 
"\"last_recall\": {\"date\": \"101618\",\"size\": \"12\"}")), .Names = c("key", 
"field"), row.names = c(NA, -2L), class = "data.frame")

1 answer:

Answer 0: (score: 1)

After getting the datasets into a list, we merge them by "key"

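# Collect every object named json<digits> into a named list, then
# full-outer-join them pairwise on "key" (all = TRUE keeps unmatched keys)
out <- Reduce(function(...) merge(..., all = TRUE, by = "key"), 
       mget(ls(pattern ="^json\\d+$")))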

Then, paste the non-NA elements together row by row

# For each row, drop the NA fields and join the remaining fragments into one string
out$field <- apply(out[-1], 1, function(x) paste(x[!is.na(x)], collapse=", "))
out[c("key", "field")]
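
Note that the expected output in the question wraps each merged value in braces and joins the fragments with a bare comma, whereas the snippet above joins with ", " and adds no braces. Below is a minimal sketch of one way to get closer to that output. It assumes the three tables are rebuilt by hand from the printed data (json3 has no dput in the question, so its construction here is an assumption), and the names tables, merged and json4 are only illustrative.

```r
# Rebuild the example tables from the question (json3 is reconstructed
# from its printed form; this is an assumption, not the asker's dput).
json1 <- data.frame(
  key = c("hg8oxoi4", "gic3bv14", "yo47wglq", "vibidd0l"),
  field = c('"components":{"a": "21","b": "12","c": "34"}',
            '"components":{"a": "78","b": "66","c": "54"}',
            '"components":{"a": "6","b": "12","c": "12"}',
            '"components":{"a": "45","b": "5","c": "1"}'),
  stringsAsFactors = FALSE
)
json2 <- data.frame(
  key = c("hg8oxoi4", "vibidd0l"),
  field = c('"last_recall": {"date": "012118","size": "43"}',
            '"last_recall": {"date": "101618","size": "12"}'),
  stringsAsFactors = FALSE
)
json3 <- data.frame(
  key = c("hg8oxoi4", "gic3bv14", "yo47wglq"),
  field = c('"other_fields":{"people": "11"}',
            '"other_fields":{"people": "10"}',
            '"other_fields":{"people": "4"}'),
  stringsAsFactors = FALSE
)

tables <- list(json1, json2, json3)

# Give each table's field column a unique name (field1, field2, ...) so that
# repeated merges never produce clashing column names.
for (i in seq_along(tables)) {
  names(tables[[i]])[2] <- paste0("field", i)
}

# Full outer join on "key"; keys missing from a table get NA in its column.
merged <- Reduce(function(x, y) merge(x, y, by = "key", all = TRUE), tables)

# Per row: drop NA fragments, join the rest with "," and wrap in braces.
merged$field <- apply(merged[-1], 1, function(x) {
  paste0("{", paste(x[!is.na(x)], collapse = ","), "}")
})

# merge() sorts by key, so restore the key order of json1.
json4 <- merged[match(json1$key, merged$key), c("key", "field")]
rownames(json4) <- NULL
json4
```

Renaming the field columns up front is a design choice rather than a requirement: with only three tables the default merge suffixes happen to stay unique, but with more tables they can collide. If the jsonlite package is available, passing each resulting string to jsonlite::fromJSON would be one way to check that it parses as valid JSON.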