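I imported a .json file using library(jsonlite) and stream_in(file(".json")), but one of the columns still shows up in JSON format. I am not sure how to go on to unnest the email and computer fields from the json column.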
My example:
date <- as.Date(as.character( c("2015-02-13",
"2015-02-14",
"2015-02-14")))
ID <- c(1,2,3)
name <- c("John","Michael","Thomas")
drinks <- c("Beer","Coffee","Tee")
consumed <- c(2,5,3)
john<- "{\"employeID\":\"1\",\"other_details\":{\"email\":\"john@gmx.com\"},\"computer\":\"yes\"}"
michael<- "{\"employeID\":\"2\",\"other_details\":{\"email\":\"michael@yahoo.com\"},\"computer\":\"yes\"}"
thomas<- "{\"employeID\":\"3\",\"other_details\":{\"email\":\"thomas@gmail.com\"},\"computer\":\"yes\"}"
json <- c(john,michael,thomas)
df <- data.frame(date,ID,name,drinks,consumed,json)
In the resulting data.frame, the json column still holds the raw JSON strings. I would like to get the following format:
   date       ID name    drinks consumed email             computer
#1 2015-02-13 1  John    Beer   2        john@gmx.com      yes
#2 2015-02-14 2  Michael Coffee 5        michael@yahoo.com yes
#3 2015-02-14 3  Thomas  Tee    3        thomas@gmail.com  yes
What I tried first was to use fromJSON again in different variations, but it always results in:
fromJSON(df$json[1])
Error: Argument 'txt' must be a JSON string, URL or file.
How do I extract these fields correctly?
Answer 0 (score: 2)
df$json is a factor vector, whereas fromJSON only accepts a JSON string, URL or file. You can try
fromJSON(as.character(df$json[1]))
or add stringsAsFactors=FALSE when you create df.
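As a minimal sketch of the point above: the variable json_factor is hypothetical and only simulates a factor column like df$json, which is what data.frame() produced by default before R 4.0. Converting to character first lets fromJSON parse the value.

```r
library(jsonlite)

# One JSON string shaped like the values in df$json
john <- '{"employeID":"1","other_details":{"email":"john@gmx.com"},"computer":"yes"}'

# Hypothetical stand-in for df$json stored as a factor
json_factor <- factor(john)

# as.character() recovers the underlying string, which fromJSON() accepts
parsed <- fromJSON(as.character(json_factor[1]))

# The nested object comes back as a nested list
parsed$other_details$email
parsed$computer
```

Passing json_factor[1] directly is what triggers the "Argument 'txt' must be a JSON string, URL or file" error from the question.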
To accomplish your task, you can try:
library(tidyverse)
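df %>%
  filter(json != "{}") %>% # Drop rows with json == "{}"
  rowwise() %>%
  # as.character() keeps this working when json is a factor column
  do(data.frame(ID = .$ID, jsonlite::fromJSON(as.character(.$json)), stringsAsFactors=FALSE)) %>%
  # Join the parsed fields back onto the remaining columns by ID
  merge(df %>% select(-json), by="ID", all.y=TRUE)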
Output:
ID employeID email computer date name drinks consumed
1 1 1 john@gmx.com yes 2015-02-13 John Beer 2
2 2 2 michael@yahoo.com yes 2015-02-14 Michael Coffee 5
3 3 3 thomas@gmail.com yes 2015-02-14 Thomas Tee 3
It can also handle the case of "{}" in the json column.
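# Add a row whose json value is an empty object
df2 <- df %>%
  rbind(data.frame(date=as.Date("2015-02-14"), ID=4, name="Kitman",
                   drinks="Chocolate", consumed=1, json="{}"))
df2 %>%
  filter(json != "{}") %>% # Drop rows with json == "{}"
  rowwise() %>%
  do(data.frame(ID = .$ID, jsonlite::fromJSON(as.character(.$json)), stringsAsFactors=FALSE)) %>%
  # all.y=TRUE keeps the dropped row, filled with NA for the JSON fields
  merge(df2 %>% select(-json), by="ID", all.y=TRUE)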
Output:
ID employeID email computer date name drinks consumed
1 1 1 john@gmx.com yes 2015-02-13 John Beer 2
2 2 2 michael@yahoo.com yes 2015-02-14 Michael Coffee 5
3 3 3 thomas@gmail.com yes 2015-02-14 Thomas Tee 3
4 4 <NA> <NA> <NA> 2015-02-14 Kitman Chocolate 1
Obsolete:
cbind(
  df %>% select(-json),
  # Parse each JSON string into a one-row data.frame, then stack the rows
  as.character(df$json) %>%
    map(~as.data.frame(jsonlite::fromJSON(.))) %>%
    do.call("rbind", .)
)
Output:
date ID name drinks consumed employeID email computer
1 2015-02-13 1 John Beer 2 1 john@gmx.com yes
2 2015-02-14 2 Michael Coffee 5 2 michael@yahoo.com yes
3 2015-02-14 3 Thomas Tee 3 3 thomas@gmail.com yes
Answer 1 (score: 1)
First, try:
ndjson::stream_in("filename.json")
The ndjson package is faster than jsonlite and is designed for flattening (it is very task-specific, unlike the Swiss-army-knife jsonlite pkg).
Alternatively, we can stick with tidy idioms all the way, and you get your character columns.
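One way a tidy version might look, as a sketch rather than the answerer's exact code. It assumes dplyr, purrr and jsonlite are installed and uses a small stand-in df; the names parsed and result are illustrative only.

```r
library(dplyr)
library(purrr)
library(jsonlite)

# Small stand-in for the df from the question (assumption: json is character)
df <- data.frame(
  ID = c(1, 2),
  name = c("John", "Michael"),
  json = c(
    '{"employeID":"1","other_details":{"email":"john@gmx.com"},"computer":"yes"}',
    '{"employeID":"2","other_details":{"email":"michael@yahoo.com"},"computer":"yes"}'
  ),
  stringsAsFactors = FALSE
)

# Parse every JSON string once into a nested list
parsed <- map(df$json, fromJSON)

# Pull the wanted fields out as character columns and drop the raw JSON
result <- df %>%
  mutate(
    email = map_chr(parsed, ~ .x$other_details$email),
    computer = map_chr(parsed, ~ .x$computer)
  ) %>%
  select(-json)

result
```

map_chr() guarantees character columns, but it errors if a field is missing, so rows such as "{}" would need to be filtered out or given a default first.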