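Question:
I have a 20,000-line JSON file in which each line is a web log entry representing a particular user activity. I want to create a data frame in R
to work with this data. Here is an example of one (randomly chosen) line of the JSON: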
{"_type":"verifiedProductDetail","ts":1431820984214,"did":"7cd80696-4ede-49e4-a267-b887e684de32","profileId":"33021589-c159-4ec6-8c22-c0e5d9b600d9","preferenceIds":[],"price":115.0,"itemId":"10645","category":"/Binnenverlichting/Wandlampen","currency":1,"language":1,"name":"Wandlamp Linea 60 aluminium","url":"http://www.shop1.be/pagea/wandlampen.html_be","imageUrl":"http://vhetnevnejk.cloudfont.net/media/catalog/product/cache/7/thumbnail/450x/9df78eab33525dcdehl6e5fb8d27136e95/i/m/image_14583/Wandlamp.jpg","id":"871d275a-c856-4280-9cbd-f163b9f749e7","product":{"_id":"625363f4-0d80-3ff5-b091-174de3f9c9b2","domainId":"7cd80696-4ede-49e4-a267-b887e684de32","created":1427806290512,"updated":1436870460905,"itemId":"10645","prices":{"4":299.99,"1":69.99,"2":69.99,"5":299.99},"ratings":{"4":{"rate":1.0,"count":1,"created":1433447796660,"lan":4},"1":{"rate":0.9,"count":2,"created":1434355924529,"lan":1}},"categories":[{"language":3,"text":" Destockage","created":1427820384334},{"language":2,"text":" Outlet","created":1427883890399},{"language":1,"text":"/Binnenverlichting/Wandlampen","created":1431545171151},{"language":6,"text":" Outlet","created":1427876074772},{"language":4,"text":" Outlet","created":1427901573250},{"language":4,"text":" Beleuchtung nach Raum","created":1427827783211},{"language":11,"text":" Outlet","created":1427809161244}],"names":[{"language":3,"text":"Applique murale Linea 60cm en aluminium","created":1427820384334},{"language":2,"text":"Wall Lamp Linea 60 Aluminium","created":1427826729309},{"language":1,"text":"Wandlamp Linea 60 aluminium","created":1435695901730},{"language":6,"text":"Aplique de pared LINEA 60 aluminio ","created":1427819228360},{"language":11,"text":"Kinkiet Linea 60 aluminium","created":1427806290512},{"language":4,"text":"Wandleuchte Linea 60 Aluminium","created":1436870460905}],"imageUrl":"hhttp://vhetnevnejk.cloudfont.net/media/catalog/product/cache/7/thumbnail/450x/9df78eab335evwnrf5fb8d27136e95/i/m/image_14083/LineaWandlamp.jpg","url":"http://www.lampyiswiatlo.pl/kinkiet-linea.html","overwritePrinciples":{},"sku":"10645","stock":-1},"preferences":[]}
This is what I did in R:
install.packages("rjson")
library("rjson")
SampleFile <- "filesample.json"
json_data <- fromJSON(paste(readLines(SampleFile), collapse=""))
str(json_data)
summary(json_data)
This reads the file into R and extracts the variables:
> str(json_data)
List of 18
$ _type : chr "verifiedProductDetail"
$ ts : num 1.43e+12
$ did : chr "7cd80696-4ede-49e4-a267-b887e684de32"
$ profileId : chr "8be1a552-9124-453d-a0aa-7124c99b56c6"
$ preferenceIds: list()
$ price : num 26.9
$ itemId : chr "9858"
$ category : chr ""
$ currency : num 1
$ language : num 6
$ name : chr "up Weiss"
$ profile :List of 13
..$ _id : chr "8be1a552-9124-453d-a0aa-7124c99b56c6"
..$ created : num 1.43e+12
..$ updated : num 1.43e+12
[and so on]
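My problem: as you can see, every variable has length 1, which means each variable holds only a single value (the one from the first entry of the JSON file). The values from all other entries are gone. The summary() function shows this more clearly.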
> summary(json_data)
              Length Class  Mode
_type         1      -none- character
ts            1      -none- numeric
did           1      -none- character
profileId     1      -none- character
preferenceIds 0      -none- list
price         1      -none- numeric
itemId        1      -none- character
category      1      -none- character
currency      1      -none- numeric
language      1      -none- numeric
name          1      -none- character
url           1      -none- character
imageUrl      1      -none- character
id            1      -none- character
profile       13     -none- list
product       14     -none- list
group         10     -none- list
preferences   0      -none- list
Summary: could you tell me what is wrong with my code, such that it keeps only the first value of each variable and drops all the others?
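
For comparison, here is a minimal sketch of reading the file one line at a time. It assumes that every line of the file is a complete JSON object on its own, so that pasting all lines together with collapse="" yields several objects back to back rather than one valid JSON document, which would be consistent with only the first entry surviving. The two sample lines, the temporary file, and the choice of fields are hypothetical and only stand in for filesample.json.

```r
library(rjson)

# Two hypothetical log lines standing in for filesample.json (assumption)
sample_lines <- c(
  '{"_type":"verifiedProductDetail","ts":1431820984214,"price":115.0,"itemId":"10645","preferenceIds":[]}',
  '{"_type":"verifiedProductDetail","ts":1431820990000,"price":26.9,"itemId":"9858","preferenceIds":[]}'
)
sample_file <- tempfile(fileext = ".json")
writeLines(sample_lines, sample_file)

# Parse each line separately instead of collapsing the file into one string
records <- lapply(readLines(sample_file), fromJSON)

# Build a data frame from a few scalar top-level fields (chosen for illustration);
# nested fields such as product or profile would need separate handling
log_df <- data.frame(
  type   = vapply(records, function(rec) rec[["_type"]], character(1)),
  ts     = vapply(records, function(rec) rec[["ts"]], numeric(1)),
  price  = vapply(records, function(rec) rec[["price"]], numeric(1)),
  itemId = vapply(records, function(rec) rec[["itemId"]], character(1)),
  stringsAsFactors = FALSE
)

str(log_df)
```

With this approach records is a list with one element per log line, so length(records) should match the number of lines in the file. A field that is missing from some lines would make vapply fail, so real data may need a check for NULL first.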