I am very new to JSON files. I scraped a txt file containing a few million JSON objects, for example:
{
"created_at":"Mon Oct 14 21:04:25 +0000 2013",
"default_profile":true,
"default_profile_image":true,
"description":"...",
"followers_count":5,
"friends_count":560,
"geo_enabled":true,
"id":1961287134,
"lang":"de",
"name":"Peter Schmitz",
"profile_background_color":"C0DEED",
"profile_background_image_url":"http://abs.twimg.com/images/themes",
"utc_offset":-28800,
...
}
{
"created_at":"Fri Oct 17 20:04:25 +0000 2015",
...
}
I would like to extract the fields into a data frame in R, like this:
Variable Value
created_at X
default_profile Y
…
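In general, I am looking for something similar to what has been done in Python (multiple Json objects in one file extract by python). If anyone has an idea or a suggestion, help would be much appreciated! Thanks!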
Answer 0 (score: 2)
Here is an example of how to go about it, using two objects. I assume you are able to read the JSON from the file; otherwise, see here.
myjson = '{"created_at": "Mon Oct 14 21:04:25 +0000 2013", "default_profile": true,
"default_profile_image": true, "description": "...", "followers_count":
5, "friends_count": 560, "geo_enabled": true, "id": 1961287134, "lang":
"de", "name": "Peter Schmitz", "profile_background_color": "C0DEED",
"profile_background_image_url": "http://abs.twimg.com/images/themes", "utc_offset": -28800}
{"created_at": "Mon Oct 15 21:04:25 +0000 2013", "default_profile": true,
"default_profile_image": true, "description": "...", "followers_count":
5, "friends_count": 560, "geo_enabled": true, "id": 1961287134, "lang":
"de", "name": "Peter Schmitz", "profile_background_color": "C0DEED",
"profile_background_image_url": "http://abs.twimg.com/images/themes", "utc_offset": -28800}
'
library("rjson")
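# Split the text into a list of JSON objects by appending a marker after every '}' and splitting on it.
# The marker '!x!x!' is arbitrary; there may be better ways of keeping the braces while splitting.
# Note: this only works if no object is nested and no string value contains '}'.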
my_json_objects = head(strsplit(gsub('\\}','\\}!x!x!', myjson),'!x!x!')[[1]],-1)
# read the text as JSON objects
json_data <- lapply(my_json_objects, function(x) {fromJSON(x)})
# Transform to dataframes
json_data <- lapply(json_data, function(x) {data.frame(val=unlist(x))})
Output:
[[1]]
val
created_at Mon Oct 14 21:04:25 +0000 2013
default_profile TRUE
default_profile_image TRUE
description ...
followers_count 5
friends_count 560
geo_enabled TRUE
id 1961287134
lang de
name Peter Schmitz
profile_background_color C0DEED
profile_background_image_url http://abs.twimg.com/images/themes
utc_offset -28800
[[2]]
val
created_at Mon Oct 15 21:04:25 +0000 2013
default_profile TRUE
default_profile_image TRUE
description ...
followers_count 5
friends_count 560
geo_enabled TRUE
id 1961287134
lang de
name Peter Schmitz
profile_background_color C0DEED
profile_background_image_url http://abs.twimg.com/images/themes
utc_offset -28800
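The result above is a list with one data frame per object. If a single Variable/Value table is wanted instead, the parsed objects could be stacked into one long data frame. The following is only a minimal sketch in base R: `records` is a hypothetical stand-in for the list of named lists that `fromJSON` returns (before the final "Transform to dataframes" step), and the extra `object` column is an assumption added here to keep track of which object each row came from.

```r
# Hypothetical parsed records, shaped like the named lists returned by rjson::fromJSON.
records <- list(
  list(created_at = "Mon Oct 14 21:04:25 +0000 2013", default_profile = TRUE,
       followers_count = 5, lang = "de"),
  list(created_at = "Mon Oct 15 21:04:25 +0000 2013", default_profile = TRUE,
       followers_count = 5, lang = "de")
)

# Flatten each record into a named character vector (unlist coerces all values to text).
flat <- lapply(records, unlist)

# Stack everything into one long data frame: one row per (object, variable) pair.
long_df <- data.frame(
  object   = rep(seq_along(flat), lengths(flat)),  # index of the source object
  Variable = unlist(lapply(flat, names)),          # field names
  Value    = unlist(flat, use.names = FALSE),      # field values as character
  stringsAsFactors = FALSE
)

print(long_df)
```

Because `unlist` converts every value to character, numeric fields such as `followers_count` would need to be converted back (for example with `as.numeric`) before any calculation.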
Hope this helps!
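A note on the splitting step: inserting a marker after every `}` breaks as soon as an object contains a nested object or a string with a `}` in it. One possible alternative, sketched here in base R under the assumption that the whole text fits in memory, is to walk through the characters and cut only where the brace depth returns to zero outside of a string. The function name `split_json_objects` is made up for this illustration and is not part of rjson.

```r
# Split a string of concatenated JSON objects at top-level closing braces.
# Braces inside string values and nested objects do not trigger a split.
split_json_objects <- function(txt) {
  chars <- strsplit(txt, "")[[1]]  # one element per character
  depth <- 0L                      # current nesting depth of { }
  in_string <- FALSE               # TRUE while inside a "..." value
  escaped <- FALSE                 # TRUE right after a backslash in a string
  start <- NA_integer_             # position of the current top-level '{'
  out <- character(0)
  for (i in seq_along(chars)) {
    ch <- chars[i]
    if (in_string) {
      if (escaped) {
        escaped <- FALSE
      } else if (ch == "\\") {
        escaped <- TRUE
      } else if (ch == "\"") {
        in_string <- FALSE
      }
    } else if (ch == "\"") {
      in_string <- TRUE
    } else if (ch == "{") {
      if (depth == 0L) start <- i
      depth <- depth + 1L
    } else if (ch == "}") {
      depth <- depth - 1L
      if (depth == 0L) {
        # A top-level object just closed: keep it, braces included.
        out <- c(out, paste(chars[start:i], collapse = ""))
      }
    }
  }
  out
}

# Small made-up input with a nested object and a '}' inside a string.
txt <- '{"a": {"b": 1}, "s": "x}y"}\n{"c": 2}\n'
parts <- split_json_objects(txt)
print(parts)
```

Each element of `parts` can then be passed to `fromJSON` in the same `lapply` call as above. A character-by-character loop in R is slow, so for a file with millions of objects this sketch would likely need to be run on chunks of the file or replaced by a streaming parser.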