Answer 0 (score: 1)
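I'm not entirely clear on your expected output (see the comments and discussion on @user5783745's answer). Your JSON strings contain some nested objects, which produce a nested list structure when you parse them with jsonlite::fromJSON. Since you haven't provided expected output that matches the sample data you give, there may be different ways to deal with these nested entries.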
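To make the nesting concrete, here is a minimal sketch that parses a trimmed-down record and inspects the result. The `tweet` string is a hypothetical miniature of the sample data, not one of the original records.

```r
library(jsonlite)

# Hypothetical miniature of one tweet record, trimmed for illustration
tweet <- '{"state": "New Jersey", "id": 1, "entities": {"hashtags": [{"text": "WeThePeopleMarch", "indices": [42, 59]}], "urls": []}}'

parsed <- fromJSON(tweet)

# Top-level scalars come back as length-1 vectors; "entities" is itself a list
str(parsed, max.level = 2)

# The nested array of objects is simplified to a data frame inside that list
is.list(parsed$entities)
is.data.frame(parsed$entities$hashtags)
```

Because `entities` holds further lists and data frames, the parsed record cannot be bound into a single row without first deciding what to do with those inner levels.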
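One possibility is to parse each JSON string, flatten the resulting list twice, and then bind the rows (the json vector is defined in the sample data at the end of this answer).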
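library(tidyverse)
library(jsonlite)
# jsonlite::flatten() masks purrr::flatten() here and only accepts data frames, so call purrr's version explicitly
map(json, ~fromJSON(.x) %>% purrr::flatten() %>% purrr::flatten()) %>% bind_rows()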
## A tibble: 2 x 15
# state text has_emoji created_at id indices screen_name name id_str
# <chr> <chr> <lgl> <chr> <dbl> <list> <chr> <chr> <chr>
#1 New … WeTh… FALSE Mon Sep 0… 2.75e7 <int [… joncoopert… Jon … 27493…
#2 Indi… "RT … FALSE Mon Sep 0… 1.68e9 <int [… dariusherr… Dari… 16808…
## … with 6 more variables: source <chr>, location <chr>, verified <lgl>,
## url <chr>, expanded_url <chr>, display_url <chr>
The resulting object is a tibble with some list columns. To store it as a CSV, you can exclude those list columns.
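Dropping the list columns before writing could look like the sketch below. The `df` tibble is a hypothetical stand-in for the flattened result, and the output path is a temporary file chosen only for illustration.

```r
library(dplyr)
library(readr)
library(tibble)

# Hypothetical stand-in for the flattened result: one list column among atomic ones
df <- tibble(
  state   = c("New Jersey", "Indiana"),
  id_str  = c("27493883", "1680891876"),
  indices = list(c(3L, 19L), c(3L, 17L))
)

# Keep only the columns that are not lists
df_flat <- df %>% select(where(~ !is.list(.x)))

# Write to a temporary file; swap in a real path as needed
out_file <- tempfile(fileext = ".csv")
write_csv(df_flat, out_file)
```

If the information in the list columns matters, an alternative is to collapse each entry into a single string before writing, at the cost of having to parse it again later.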
Sample data:
json <- c(
'{"state": "New Jersey", "text": "RT @joncoopertweets: Register to join the #WeThePeopleMarch on September 21st in Washington, D.C. \u2014 or one of the 50+ marches that will be\u2026", "has_emoji": false, "created_at": "Mon Sep 02 16:32:05 +0000 2019", "id": 1168562246349467649, "entities": {"hashtags": [{"text": "WeThePeopleMarch", "indices": [42, 59]}], "urls": [], "user_mentions": [{"screen_name": "joncoopertweets", "name": "Jon Cooper", "id": 27493883, "id_str": "27493883", "indices": [3, 19]}], "symbols": []}, "source": "Twitter for iPad", "location": "Leonia, NJ", "verified": false, "geocode": null}',
'{"state": "Indiana", "text": "RT @dariusherron1: Don\u2019t nobody love they girl like Mexicans ", "has_emoji": false, "created_at": "Mon Sep 02 16:32:05 +0000 2019", "id": 1168562246378827776, "entities": {"hashtags": [], "urls": [{"url": "", "expanded_url": "", "display_url": "", "indices": [61, 84]}], "user_mentions": [{"screen_name": "dariusherron1", "name": "Darius Herron", "id": 1680891876, "id_str": "1680891876", "indices": [3, 17]}], "symbols": []}, "source": "Twitter for iPhone", "location": "Indianapolis, IN", "verified": false, "geocode": null}')
Answer 1 (score: 0)
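You can easily convert it into a data format that is easier to work with (a list), but how you handle it from there is up to you. In this case the list won't automatically become a data.frame; you have to think about how to convert it, given that some list items are single values while others are themselves data.frames.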
a <- '{"state": "New Jersey", "text": "RT @joncoopertweets: Register to join the #WeThePeopleMarch on September 21st in Washington, D.C. \u2014 or one of the 50+ marches that will be\u2026", "has_emoji": false, "created_at": "Mon Sep 02 16:32:05 +0000 2019", "id": 1168562246349467649, "entities": {"hashtags": [{"text": "WeThePeopleMarch", "indices": [42, 59]}], "urls": [], "user_mentions": [{"screen_name": "joncoopertweets", "name": "Jon Cooper", "id": 27493883, "id_str": "27493883", "indices": [3, 19]}], "symbols": []}, "source": "Twitter for iPad", "location": "Leonia, NJ", "verified": false, "geocode": null}'
library(jsonlite)
library(dplyr)
a <- a %>% fromJSON
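# Column names follow the order of the values assigned below
new_dataframe <- data.frame(state=character(),
text=character(),
has_emoji=character(),
created_at=character(),
id=character(), stringsAsFactors = FALSE)
# c() coerces every value to character, so has_emoji and id are stored as strings
new_dataframe[1, ] <- c(a$state, a$text, a$has_emoji, a$created_at, a$id)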
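As one possible way to handle the mix of single values and nested items, the sketch below keeps only the length-1 atomic elements of a parsed record and turns them into a one-row data frame. The `rec` record is a hypothetical trimmed example, and simply dropping the nested parts is an assumption about what is wanted, not the only option.

```r
library(jsonlite)

# Hypothetical trimmed record with scalar, nested, and null fields
rec <- fromJSON('{"state": "New Jersey", "has_emoji": false, "entities": {"hashtags": [{"text": "WeThePeopleMarch"}]}, "geocode": null}')

# Flag the elements that are single atomic values (nested lists and NULLs fail this check)
is_scalar <- vapply(rec, function(x) is.atomic(x) && length(x) == 1, logical(1))

# Build a one-row data frame from the scalar fields only
one_row <- as.data.frame(rec[is_scalar], stringsAsFactors = FALSE)
```

Unlike filling a row with `c()`, this keeps each column's original type, so `has_emoji` stays logical.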