Is there a way to edit the values of a scraped tibble?

Time: 2020-10-15 22:42:11

Tags: r tibble

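I scraped river flow data from the USGS website. Each flow value is followed by the letter A, P, or E to mark it as an actual, predicted, or estimated flow. Is there a way to truncate the values so that the A, P, and/or E are not shown?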

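library(rvest)   # read_html(), html_nodes(), html_table(), and the %>% pipe
library(tibble)  # as_tibble()

flows_raw <- "https://waterdata.usgs.gov/ut/nwis/dv?cb_00060=on&format=html&site_no=10163000&referred_module=sw&period=&begin_date=2010-10-14&end_date=2020-10-14" %>%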
  read_html() %>%
  html_nodes("table") %>%
  .[2] %>%
  html_table() %>%
  .[[1]] %>%
  as_tibble()
flows_raw

1 Answer:

Answer 0: (score: 1)

OK, I solved it myself, but if anyone has a cleaner approach I would still love to see it.

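library(rvest)  # read_html(), html_nodes(), html_table()
library(tidyr)  # separate()
library(dplyr)  # select(), as_tibble(), and the %>% pipe

flows_raw <- "https://waterdata.usgs.gov/ut/nwis/dv?cb_00060=on&format=html&site_no=10163000&referred_module=sw&period=&begin_date=2008-11-14&end_date=2020-10-14" %>%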
  read_html() %>%
  html_nodes("table") %>%
  .[2] %>%
  html_table() %>%
  .[[1]] %>%

  # remove extraneous info from the values marked by the letters A, P, or E

  separate(`Dis-charge, ft3/s,(Mean)`, into = c("edit1", "extra"), convert = TRUE, sep = "A") %>% 
  separate(edit1, into = c("edit2", "extra"), convert = TRUE, sep = "P") %>% 
  separate(edit2, into = c("Cubic Ft/Sec (mean)", "extra"), convert = TRUE, sep = "E") %>% 

  # delete the column that we moved the A, P, and E's into
  
  select(-extra) %>%
  as_tibble()
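
One possibly cleaner alternative to the three chained separate() calls is to strip the trailing letter with a regular expression and convert what remains to a number. The sketch below uses base R only and runs on a small made-up sample instead of the live USGS page; the column names `discharge_raw` and `discharge_cfs`, the helper `strip_qualifier()`, and the sample values (including the comma thousands separator) are assumptions for illustration, not taken from the real table.

```r
# Hypothetical sample standing in for the scraped discharge column.
# The real values come from the USGS table and may be formatted differently.
flows_sample <- data.frame(
  Date = c("10/14/2010", "10/15/2010", "10/16/2010"),
  discharge_raw = c("152A", "1,480P", "97.5E"),
  stringsAsFactors = FALSE
)

# Remove a single trailing qualifier letter (A, P, or E), along with any
# whitespace around it, from the end of each string.
strip_qualifier <- function(x) {
  sub("[[:space:]]*[APE][[:space:]]*$", "", x)
}

flows_clean <- flows_sample

# Drop the qualifier, remove thousands separators (assumed to be commas),
# and convert the remaining text to numeric.
flows_clean$discharge_cfs <- as.numeric(
  gsub(",", "", strip_qualifier(flows_sample$discharge_raw))
)

print(flows_clean)
```

On the scraped tibble, the same two steps could be applied to the `Dis-charge, ft3/s,(Mean)` column, which would avoid creating and then dropping the temporary `extra` column. If the qualifier itself is worth keeping, it could be captured into its own column before it is stripped.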