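I scraped river discharge data from the USGS website. Each discharge value is followed by the letter A, P, or E, indicating actual, predicted, or estimated flow. Is there a way to strip those letters so that the A, P, and/or E are not displayed?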
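library(rvest)  # read_html(), html_nodes(), html_table()
library(dplyr)  # %>% and as_tibble()

flows_raw <- "https://waterdata.usgs.gov/ut/nwis/dv?cb_00060=on&format=html&site_no=10163000&referred_module=sw&period=&begin_date=2010-10-14&end_date=2020-10-14" %>%
  read_html() %>%
  html_nodes("table") %>%
  # the second table on the page holds the daily values
  .[2] %>%
  html_table() %>%
  .[[1]] %>%
  as_tibble()
flows_raw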
Answer 0 (score: 1)
OK, I solved it myself, but if anyone has a cleaner approach, I'd still be happy to see it.
library(rvest)  # read_html(), html_nodes(), html_table()
library(dplyr)  # %>%, select(), as_tibble()
library(tidyr)  # separate()

flows_raw <- "https://waterdata.usgs.gov/ut/nwis/dv?cb_00060=on&format=html&site_no=10163000&referred_module=sw&period=&begin_date=2008-11-14&end_date=2020-10-14" %>%
  read_html() %>%
  html_nodes("table") %>%
  .[2] %>%
  html_table() %>%
  .[[1]] %>%
  # split each value at the letter A, P, or E so the number ends up alone in the first column
  separate(`Dis-charge, ft3/s,(Mean)`, into = c("edit1", "extra"), convert = TRUE, sep = "A") %>%
  separate(edit1, into = c("edit2", "extra"), convert = TRUE, sep = "P") %>%
  separate(edit2, into = c("Cubic Ft/Sec (mean)", "extra"), convert = TRUE, sep = "E") %>%
  # delete the column that we moved the A, P, and E's into
  select(-extra) %>%
  as_tibble()
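
One possibly cleaner option, as a minimal sketch: strip the qualifier with a single regular expression instead of three separate() calls. Here strip_qualifier is a hypothetical helper name, and the sample values are made up to resemble the scraped column. I'm assuming each cell is a number followed by one or more letter codes, possibly with thousands separators.

```r
# Hypothetical helper: drop the trailing qualifier letters and return numbers.
# Assumes each value looks like "<number><letters>", e.g. "123A" or "1,230P".
strip_qualifier <- function(x) {
  x <- sub("[A-Za-z].*$", "", x)  # remove everything from the first letter onward
  x <- gsub(",", "", x)           # remove thousands separators, if any
  as.numeric(trimws(x))           # convert the remaining text to numeric
}

# Made-up sample values standing in for the scraped discharge column
discharge_raw <- c("123A", "45.6P", "7.8E", "1,230A")
discharge_num <- strip_qualifier(discharge_raw)
print(discharge_num)
```

In the pipeline above, this could replace the separate() and select() steps with something like mutate(flow_cfs = strip_qualifier(`Dis-charge, ft3/s,(Mean)`)), where flow_cfs is just an assumed column name. Note that this discards the qualifier entirely; if you want to keep it, extract it into its own column before stripping.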