Parsing a data table from a website that uses JSON

Time: 2017-03-03 05:39:28

Tags: json r web-scraping rvest

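I'm trying to parse data from a Minnesota DNR page, which says it uses JSON. I want to build a script that downloads the data tables from many different pages, but I'm focusing on a single page first. I've tried rvest, JSONIO, and many other packages to no avail. The most frustrating error I get is: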

  

Error in UseMethod("xml_find_first") : no applicable method for 'xml_find_first' applied to an object of class "list"

Here is my code:

nofile               # The total number of files that can be opened  
npro                 # The total number of threads that can be opened  
core unlimited       # not limited  
memlock unlimited    # not limited

How can I download this table, including its headers?

1 answer:

Answer 0 (score: 0)

Just fetch the actual data the table is built from. It's JSON, and not too complicated:

library(httr)

res <- GET("http://maps2.dnr.state.mn.us/cgi-bin/lakefinder/detail.cgi", 
           query=list(type="lake_survey", id="56003100"))

str(content(res))
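
To illustrate why working from the JSON is easier than scraping the rendered HTML, here is a minimal sketch of how jsonlite turns a JSON payload into R structures. The JSON string is made up for this example: the field names (`status`, `result`, `lakeName`, `surveys`, `surveyDate`, `species`, `count`) are assumptions, not the real DNR schema, so check the output of `str(content(res))` for the actual names.

```r
library(jsonlite)

# Hypothetical payload: these field names are invented for illustration
# and are NOT the real DNR schema.
json_txt <- '{
  "status": "SUCCESS",
  "result": {
    "lakeName": "Example Lake",
    "surveys": [
      {"surveyDate": "2010-06-01", "species": "WAE", "count": 12},
      {"surveyDate": "2015-06-08", "species": "NOP", "count": 7}
    ]
  }
}'

parsed <- fromJSON(json_txt)

# An array of same-shaped objects is simplified into a data frame,
# with the JSON keys as column names (the table "headers").
surveys <- parsed$result$surveys
str(surveys)
```

With a real response, the same idea would likely be `fromJSON(content(res, as = "text", encoding = "UTF-8"))`, followed by picking out whichever element holds the table.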

This lets you fetch lake metadata by county name:

get_lake_metadata <- function(county_name) {

  require(httr)
  require(dplyr)
  require(jsonlite)

  # tibble() replaces the deprecated data_frame(); dplyr re-exports it
  xlate_df <- tibble(
    id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
           11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 
           24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 
           37L, 38L, 39L, 40L, 41L, 42L, 44L, 45L, 46L, 43L, 47L, 48L, 49L, 
           50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 
           63L, 64L, 65L, 66L, 67L, 68L, 70L, 71L, 72L, 69L, 73L, 74L, 75L, 
           76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L), 
    county = c("Aitkin", "Anoka", "Becker", "Beltrami", "Benton", 
               "Big Stone", "Blue Earth", "Brown", "Carlton", "Carver", 
               "Cass", "Chippewa", "Chisago", "Clay", "Clearwater", "Cook", 
               "Cottonwood", "Crow Wing", "Dakota", "Dodge", "Douglas", 
               "Faribault", "Fillmore", "Freeborn", "Goodhue", "Grant", 
               "Hennepin", "Houston", "Hubbard", "Isanti", "Itasca", "Jackson", 
               "Kanabec", "Kandiyohi", "Kittson", "Koochiching", "Lac Qui Parle", 
               "Lake", "Lake of the Woods", "Le Sueur", "Lincoln", "Lyon", 
               "Mahnomen", "Marshall", "Martin", "McLeod", "Meeker", "Mille Lacs", 
               "Morrison", "Mower", "Murray", "Nicollet", "Nobles", "Norman", 
               "Olmsted", "Otter Tail", "Pennington", "Pine", "Pipestone", 
               "Polk", "Pope", "Ramsey", "Red Lake", "Redwood", "Renville", 
               "Rice", "Rock", "Roseau", "Scott", "Sherburne", "Sibley", 
               "St. Louis", "Stearns", "Steele", "Stevens", "Swift", "Todd", 
               "Traverse", "Wabasha", "Wadena", "Waseca", "Washington", 
               "Watonwan", "Wilkin", "Winona", "Wright", "Yellow Medicine"))

  target <- filter(xlate_df, tolower(county) == tolower(county_name))

  if (nrow(target) == 1) {
    res <- GET("http://maps2.dnr.state.mn.us/cgi-bin/lakefinder_json.cgi",
               query=list(county=target$id))
    # Read the body as raw text so fromJSON() always receives a JSON string
    jsonlite::fromJSON(content(res, as="text", encoding="UTF-8"))
  } else {
    message("County not found")
  }

}

get_lake_metadata("Anoka")
get_lake_metadata("Steele")
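
Since the goal is to pull tables from many pages, one possible way to loop over several counties is sketched below. `fetch_counties` and `fake_fetch` are hypothetical helpers written for this example, and the pause length is an arbitrary assumption. The stand-in fetcher lets the sketch run without touching the network.

```r
# Apply a fetch function to each county name, pausing between calls,
# and return the results as a list named by county.
fetch_counties <- function(county_names, fetch_fun, pause = 1) {
  results <- lapply(county_names, function(nm) {
    Sys.sleep(pause)  # be polite to the server between requests
    # tryCatch keeps one failed request from aborting the whole run;
    # a failure is recorded as NULL for that county
    tryCatch(fetch_fun(nm), error = function(e) NULL)
  })
  names(results) <- county_names
  results
}

# Stand-in fetcher used only to demonstrate the loop offline
fake_fetch <- function(nm) {
  if (nm == "Nowhere") stop("request failed")
  data.frame(county = nm, n_chars = nchar(nm))
}

res_list <- fetch_counties(c("Anoka", "Nowhere", "Steele"), fake_fetch, pause = 0)
str(res_list)
```

With the function above, the real call would look like `fetch_counties(c("Anoka", "Steele"), get_lake_metadata)`.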