Extracting data from a list in R

Time: 2016-04-06 10:39:42

Tags: json r list


I would like to get the data into tidy columns, but many levels of the list are "unnamed". Any idea how to organize the data?

1 answer:

Answer 0: (score: 1)

Take a look at this difference and let me know what you think. This is what your object looks like when parsed with rjson:

library(RCurl)
library(rjson)
json <- getURL('https://extraction.import.io/query/runtime/17d882b5-c118-4f27-8ce1-90085ec0b116?_apikey=d5a8a01e20174e95887dc0f385e4e3f6d7ef5ca1428d5a029f2aa352509948ade8e5d7fb0dc941f4769a32b541ca6b38a7cd6578dfd81b357fbc4f2e008f5154f1dbfcff31878798fa887b70b1ff59dd&url=http%3A%2F%2Fwww.numbeo.com%2Fcost-of-living%2Fcompare_cities.jsp%3Fcountry1%3DSingapore%26country2%3DAustralia%26city1%3DSingapore%26city2%3DMelbourne')
obj <- rjson::fromJSON(json)
str(obj)

List of 2
 $ extractorData:List of 3
  ..$ url       : chr "http://www.numbeo.com/cost-of-living/compare_cities.jsp?country1=Singapore&country2=Australia&city1=Singapore&city2=Melbourne"
  ..$ resourceId: chr "b1250747011ee774e7c881617c86a5a9"
  ..$ data      :List of 1
  .. ..$ :List of 1
  .. .. ..$ group:List of 52
  .. .. .. ..$ :List of 6
  .. .. .. .. ..$ COL VALUE        :List of 1
  .. .. .. .. .. ..$ :List of 1
  .. .. .. .. .. .. ..$ text: chr "Meal, Inexpensive Restaurant"

There are indeed many nested lists you don't need. Now try the fromJSON function from the jsonlite package instead:

library(jsonlite)
obj2 <- jsonlite::fromJSON(json)
str(obj2)

List of 2
 $ extractorData:List of 3
  ..$ url       : chr "http://www.numbeo.com/cost-of-living/compare_cities.jsp?country1=Singapore&country2=Australia&city1=Singapore&city2=Melbourne"
  ..$ resourceId: chr "b1250747011ee774e7c881617c86a5a9"
  ..$ data      :'data.frame':  1 obs. of  1 variable:
  .. ..$ group:List of 1
  .. .. ..$ :'data.frame':  52 obs. of  6 variables:
  .. .. .. ..$ COL VALUE        :List of 52
  .. .. .. .. ..$ :'data.frame':    1 obs. of  1 variable:
  .. .. .. .. .. ..$ text: chr "Meal, Inexpensive Restaurant"
  .. .. .. .. ..$ :'data.frame':    1 obs. of  1 variable:
  .. .. .. .. .. ..$ text: chr "Meal for 2 People, Mid-range Restaurant, Three-course"
  .. .. .. .. ..$ :'data.frame':    1 obs. of  1 variable:
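
If you want to reproduce the difference without calling the API, here is a minimal sketch. The JSON string `txt` is made up for illustration and only loosely imitates the real response. It uses jsonlite's `simplifyVector` argument: with `simplifyVector = FALSE` you get plain nested lists, roughly the shape rjson returns, while the default turns arrays of records into data frames.

```r
library(jsonlite)

# Hypothetical JSON with the same flavour as the real response:
# an array of records whose fields are themselves arrays of objects.
txt <- '{"group": [
  {"item": [{"text": "Meal"}], "price": [{"text": "12.00"}]},
  {"item": [{"text": "Milk"}]}
]}'

# No simplification: nested, mostly unnamed lists
nested <- fromJSON(txt, simplifyVector = FALSE)
str(nested)

# Default simplification: the array of records becomes a data frame
simplified <- fromJSON(txt)
class(simplified$group)
nrow(simplified$group)
```

Note that the second record has no `price` field, which is the situation that produces missing cells in the real data.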

Even so, this JSON is not pretty, and we need to work around that. I think what you want is that data frame, so start with:

df <- obj2$extractorData$data$group[[1]]

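And there is your data frame. The catch is that every cell in it is wrapped in a list, including the NULL values. You can't simply unlist those: the NULLs disappear, and the column they were in ends up shorter than the others.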
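
To see why the NULL entries are a problem, here is a small base R sketch. The list `cells` is a made-up stand-in for one list column, not data from the API.

```r
# Hypothetical list column with one missing entry
cells <- list("Meal", NULL, "Milk")

n_before <- length(cells)   # number of cells in the column

# unlist() contributes nothing for NULL, so one value is silently lost
flat <- unlist(cells)
n_after <- length(flat)

n_before
n_after
```

Because `n_after` is smaller than `n_before`, the flattened vector no longer lines up with the rows of the data frame.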

Edit: here is how to handle the columns that contain list(NULL) values.

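# Replace list(NULL) cells with NA so that unlist() keeps every row
df[sapply(df[,2],is.null),2] <- NA
df[sapply(df[,3],is.null),3] <- NA
df[sapply(df[,4],is.null),4] <- NA
df[sapply(df[,5],is.null),5] <- NA
# Flatten each list column; a plain nested call avoids needing magrittr for %>%
df2 <- as.data.frame(sapply(df, unlist))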

This could certainly be written more elegantly, but it will get you going and it is easy to follow.
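
One possible way to generalize the column-by-column approach is sketched below. The helper name `flatten_list_columns` and the toy data frame are my own, not part of the original code, and the sketch assumes every cell holds at most one value (otherwise the flattened column would not match the number of rows).

```r
# Hypothetical helper: replace NULL cells with NA in every list column,
# then flatten that column to an atomic vector.
flatten_list_columns <- function(df) {
  for (col in names(df)) {
    if (is.list(df[[col]])) {
      cells <- df[[col]]
      cells[vapply(cells, is.null, logical(1))] <- NA
      df[[col]] <- unlist(cells, use.names = FALSE)
    }
  }
  df
}

# Toy data frame with list columns (made-up values, not the API data)
toy <- data.frame(id = 1:3)
toy$name  <- list("Meal", "Milk", "Bread")
toy$price <- list("12.00", NULL, "2.50")

flat_df <- flatten_list_columns(toy)
str(flat_df)
```

This loops over all columns instead of hard-coding columns 2 to 5, so it keeps working if the set of list columns changes.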