我以前(并且仍然)从开头link导航时遇到问题,以便到达下一个链接以填写数据请求表单,然后从结果表中提取信息。我的编码尝试是here。
我被告知CDC Wonder确实有一个API,以便在R中提交XML请求表单。有关如何提交表单的所有详细信息的链接是here。
但是,我不知道如何使用R发送XML请求表单并尝试搜索解决方案。如果有人可以根据API的说明和他们列出的XML表单示例之一让我开始,那么我想我可以弄清楚其余部分。
下面的代码显示了我尝试使用第一个示例XML请求表单:
request_xml <-
"<?xml version="1.0" encoding="UTF-8"?>
<request-parameters>
<parameter>
<name>accept_datause_restrictions</name>
<value>true</value>
</parameter>
<parameter>
<name>B_1</name>
<value>D76.V1-level1</value>
</parameter>
<parameter>
<name>B_2</name>
<value>D76.V8</value>
</parameter>
<parameter>
<name>B_3</name>
<value>*None*</value>
</parameter>
<parameter>
<name>B_4</name>
<value>*None*</value>
</parameter>
<parameter>
<name>B_5</name>
<value>*None*</value>
</parameter>
<parameter>
<name>F_D76.V1</name>
<value>2009</value>
<value>2010</value>
<value>2011</value>
<value>2012</value>
<value>2013</value>
</parameter>
<parameter>
<name>F_D76.V10</name>
<value>*All*</value>
</parameter>
<parameter>
<name>F_D76.V2</name>
<value>C00-D48</value>
</parameter>
<parameter>
<name>F_D76.V27</name>
<value>*All*</value>
</parameter>
<parameter>
<name>F_D76.V9</name>
<value>*All*</value>
</parameter>
<parameter>
<name>I_D76.V1</name>
<value>
2009 (2009) 2010 (2010) 2011 (2011) 2012 (2012) 2013 (2013)
</value>
</parameter>
<parameter>
<name>I_D76.V10</name>
<value>*All* (The United States)</value>
</parameter>
<parameter>
<name>I_D76.V2</name>
<value>C00-D48 (Neoplasms)</value>
</parameter>
<parameter>
<name>I_D76.V27</name>
<value>*All* (The United States)</value>
</parameter>
<parameter>
<name>I_D76.V9</name>
<value>*All* (The United States)</value>
</parameter>
<parameter>
<name>M_1</name>
<value>D76.M1</value>
</parameter>
<parameter>
<name>M_2</name>
<value>D76.M2</value>
</parameter>
<parameter>
<name>M_3</name>
<value>D76.M3</value>
</parameter>
<parameter>
<name>M_41</name>
<value>D76.M41</value>
</parameter>
<parameter>
<name>M_42</name>
<value>D76.M42</value>
</parameter>
<parameter>
<name>O_V10_fmode</name>
<value>freg</value>
</parameter>
<parameter>
<name>O_V1_fmode</name>
<value>freg</value>
</parameter>
<parameter>
<name>O_V27_fmode</name>
<value>freg</value>
</parameter>
<parameter>
<name>O_V2_fmode</name>
<value>freg</value>
</parameter>
<parameter>
<name>O_V9_fmode</name>
<value>freg</value>
</parameter>
<parameter>
<name>O_aar</name>
<value>aar_std</value>
</parameter>
<parameter>
<name>O_aar_pop</name>
<value>0000</value>
</parameter>
<parameter>
<name>O_age</name>
<value>D76.V5</value>
</parameter>
<parameter>
<name>O_javascript</name>
<value>on</value>
</parameter>
<parameter>
<name>O_location</name>
<value>D76.V9</value>
</parameter>
<parameter>
<name>O_precision</name>
<value>1</value>
</parameter>
<parameter>
<name>O_rate_per</name>
<value>100000</value>
</parameter>
<parameter>
<name>O_show_totals</name>
<value>true</value>
</parameter>
<parameter>
<name>O_timeout</name>
<value>300</value>
</parameter>
<parameter>
<name>O_title</name>
<value>Example1</value>
</parameter>
<parameter>
<name>O_ucd</name>
<value>D76.V2</value>
</parameter>
<parameter>
<name>O_urban</name>
<value>D76.V19</value>
</parameter>
<parameter>
<name>VM_D76.M6_D76.V10</name>
<value/>
</parameter>
<parameter>
<name>VM_D76.M6_D76.V17</name>
<value>*All*</value>
</parameter>
<parameter>
<name>VM_D76.M6_D76.V1_S</name>
<value>*All*</value>
</parameter>
<parameter>
<name>VM_D76.M6_D76.V7</name>
<value>*All*</value>
</parameter>
<parameter>
<name>VM_D76.M6_D76.V8</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V1</name>
<value/>
</parameter>
<parameter>
<name>V_D76.V10</name>
<value/>
</parameter>
<parameter>
<name>V_D76.V11</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V12</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V17</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V19</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V2</name>
<value/>
</parameter>
<parameter>
<name>V_D76.V20</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V21</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V22</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V23</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V24</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V25</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V27</name>
<value/>
</parameter>
<parameter>
<name>V_D76.V4</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V5</name>
<value>1</value>
<value>1-4</value>
<value>5-14</value>
<value>15-24</value>
<value>25-34</value>
<value>35-44</value>
<value>45-54</value>
<value>55-64</value>
<value>65-74</value>
<value>75-84</value>
<value>85+</value>
</parameter>
<parameter>
<name>V_D76.V51</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V52</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V6</name>
<value>00</value>
</parameter>
<parameter>
<name>V_D76.V7</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V8</name>
<value>*All*</value>
</parameter>
<parameter>
<name>V_D76.V9</name>
<value/>
</parameter>
<parameter>
<name>action-Send</name>
<value>Send</value>
</parameter>
<parameter>
<name>finder-stage-D76.V1</name>
<value>codeset</value>
</parameter>
<parameter>
<name>finder-stage-D76.V10</name>
<value>codeset</value>
</parameter>
<parameter>
<name>finder-stage-D76.V2</name>
<value>codeset</value>
</parameter>
<parameter>
<name>finder-stage-D76.V27</name>
<value>codeset</value>
</parameter>
<parameter>
<name>finder-stage-D76.V9</name>
<value>codeset</value>
</parameter>
<parameter>
<name>stage</name>
<value>request</value>
</parameter>
</request-parameters>"
library(RCurl)
url <- "http://wonder.cdc.gov/controller/datarequest/D76"
data <- getURL(
url = url,
postfields = request_xml,
verbose = TRUE
)
感谢您的时间!
ACE
答案 0 :(得分:3)
尽管有些人宣称CDC WONDER API非常棒,但对我来说这似乎是一个非常糟糕的API接口。但是,由于针对SQL的过多攻击方法,我可以看出为什么他们可能会因为暴露SQL界面(这是真正需要的)而犹豫不决。
实际上,必须手工制作XML才能进行这些查询,这简直太可怕了。因此,您应该使用wondr
,一个与API一起使用的R包。
它仍然很难看(这是他们的一个示例查询):
library(wondr) ## devtools::install_github("hrbrmstr/wondr")
wondr() %>%
add_param("B_1", "D76.V22") %>%
add_param("B_2", "D76.V23") %>%
add_param("B_3", "*None*") %>%
add_param("B_4", "*None*") %>%
add_param("B_5", "*None*") %>%
add_param("F_D76.V1", "*All*") %>%
add_param("F_D76.V10", "*All*") %>%
add_param("F_D76.V2", "*All*") %>%
add_param("F_D76.V27", "*All*") %>%
add_param("F_D76.V9", "*All*") %>%
add_param("I_D76.V1", "*All* (All Dates)") %>%
add_param("I_D76.V10", "*All* (The United States)") %>%
add_param("I_D76.V2", "*All* (All Causes of Death)") %>%
add_param("I_D76.V27", "*All* (The United States)") %>%
add_param("I_D76.V9", "*All* (The United States)") %>%
add_param("M_1", "D76.M1") %>%
add_param("M_2", "D76.M2") %>%
add_param("M_3", "D76.M3") %>%
add_param("O_V10_fmode", "freg") %>%
add_param("O_V1_fmode", "freg") %>%
add_param("O_V27_fmode", "freg") %>%
add_param("O_V2_fmode", "freg") %>%
add_param("O_V9_fmode", "freg") %>%
add_param("O_aar", "aar_none") %>%
add_param("O_aar_pop", "0000") %>%
add_param("O_age", "D76.V52") %>%
add_param("O_javascript", "on") %>%
add_param("O_location", "D76.V9") %>%
add_param("O_precision", "1") %>%
add_param("O_rate_per", "100000") %>%
add_param("O_show_totals", "true") %>%
add_param("O_timeout", "300") %>%
add_param("O_title", "Example2") %>%
add_param("O_ucd", "D76.V22") %>%
add_param("O_urban", "D76.V19") %>%
add_param("VM_D76.M6_D76.V10", "") %>%
add_param("VM_D76.M6_D76.V17", "*All*") %>%
add_param("VM_D76.M6_D76.V1_S", "*All*") %>%
add_param("VM_D76.M6_D76.V7", "*All*") %>%
add_param("VM_D76.M6_D76.V8", "*All*") %>%
add_param("V_D76.V1", "") %>%
add_param("V_D76.V10", "") %>%
add_param("V_D76.V11", "*All*") %>%
add_param("V_D76.V12", "*All*") %>%
add_param("V_D76.V17", "*All*") %>%
add_param("V_D76.V19", "*All*") %>%
add_param("V_D76.V2", "") %>%
add_param("V_D76.V20", "*All*") %>%
add_param("V_D76.V21", "*All*") %>%
add_param("V_D76.V22", "1", "2", "3", "4", "5") %>%
add_param("V_D76.V23", "*All*") %>%
add_param("V_D76.V24", "*All*") %>%
add_param("V_D76.V25", "*All*") %>%
add_param("V_D76.V27", "") %>%
add_param("V_D76.V4", "*All*") %>%
add_param("V_D76.V5", "*All*") %>%
add_param("V_D76.V51", "*All*") %>%
add_param("V_D76.V52", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18") %>%
add_param("V_D76.V6", "00") %>%
add_param("V_D76.V7", "*All*") %>%
add_param("V_D76.V8", "*All*") %>%
add_param("V_D76.V9", "") %>%
add_param("action-Send", "Send") %>%
add_param("finder-stage-D76.V1", "codeset") %>%
add_param("finder-stage-D76.V10", "codeset") %>%
add_param("finder-stage-D76.V2", "codeset") %>%
add_param("finder-stage-D76.V27", "codeset") %>%
add_param("finder-stage-D76.V9", "codeset") %>%
add_param("stage", "request") %>%
make_query("D76") -> query_result
您需要解析结果:
library(xml2)
library(purrr)
xml_find_all(query_result, ".//response/data-table/r") %>%
map_df(function(row) {
xml_find_all(row, ".//c") %>%
xml_attrs() %>%
as.list() %>%
setNames(sprintf("V%d", 1:length(.))) %>%
as.data.frame(stringsAsFactors=FALSE)
}) -> df
print(df)
## V1 V2 V3 V4 V5
## 1 Unintentional Cut/Pierce 97 1,243,249,173 0.0
## 2 Unintentional Drowning 15,945 1,243,249,173 1.3
## 3 Unintentional Fall 2,213 1,243,249,173 0.2
## 4 Unintentional Fire/Flame 7,301 1,243,249,173 0.6
## 5 Unintentional Hot object/Substance 96 1,243,249,173 0.0
## 6 Unintentional Firearm 2,042 1,243,249,173 0.2
## 7 Unintentional Machinery 484 1,243,249,173 0.0
## 8 Unintentional Motor Vehicle Traffic 74,997 1,243,249,173 6.0
## 9 Unintentional Other Pedal cyclist 458 1,243,249,173 0.0
## 10 Unintentional Other Pedestrian 3,221 1,243,249,173 0.3
## 11 Unintentional Other land transport 3,449 1,243,249,173 0.3
## 12 Unintentional Other transport 1,344 1,243,249,173 0.1
## 13 Unintentional Natural/Environmental 1,681 1,243,249,173 0.1
## 14 Unintentional Overexertion 2 1,243,249,173 Unreliable
## 15 Unintentional Poisoning 7,326 1,243,249,173 0.6
## 16 Unintentional Struck by or against 1,378 1,243,249,173 0.1
## 17 Unintentional Suffocation 17,356 1,243,249,173 1.4
## 18 Unintentional Other specified, classifiable Injury 1,199 1,243,249,173 0.1
## 19 Unintentional Other specified, not elsewhere classified Injury 518 1,243,249,173 0.0
## 20 Unintentional Unspecified Injury 1,540 1,243,249,173 0.1
## 21 Unintentional 1 142,647 1,243,249,173 11.5
## 22 Suicide Cut/Pierce 64 1,243,249,173 0.0
## 23 Suicide Drowning 111 1,243,249,173 0.0
## 24 Suicide Fall 412 1,243,249,173 0.0
## 25 Suicide Fire/Flame 50 1,243,249,173 0.0
## 26 Suicide Firearm 9,956 1,243,249,173 0.8
## 27 Suicide Other land transport 139 1,243,249,173 0.0
## 28 Suicide Poisoning 1,336 1,243,249,173 0.1
## 29 Suicide Suffocation 10,559 1,243,249,173 0.8
## 30 Suicide Other specified, classifiable Injury 379 1,243,249,173 0.0
## 31 Suicide Other specified, not elsewhere classified Injury 103 1,243,249,173 0.0
## 32 Suicide Unspecified Injury 76 1,243,249,173 0.0
## 33 Suicide 1 23,185 1,243,249,173 1.9
## 34 Homicide Cut/Pierce 2,375 1,243,249,173 0.2
## 35 Homicide Drowning 459 1,243,249,173 0.0
## 36 Homicide Fall 33 1,243,249,173 0.0
## 37 Homicide Fire/Flame 561 1,243,249,173 0.0
## 38 Homicide Hot object/Substance 47 1,243,249,173 0.0
## 39 Homicide Firearm 20,897 1,243,249,173 1.7
## 40 Homicide Other land transport 131 1,243,249,173 0.0
## 41 Homicide Other transport 8 1,243,249,173 Unreliable
## 42 Homicide Poisoning 494 1,243,249,173 0.0
## 43 Homicide Struck by or against 338 1,243,249,173 0.0
## 44 Homicide Suffocation 1,582 1,243,249,173 0.1
## 45 Homicide Other specified, classifiable Injury 2,912 1,243,249,173 0.2
## 46 Homicide Other specified, not elsewhere classified Injury 1,137 1,243,249,173 0.1
## 47 Homicide Unspecified Injury 5,821 1,243,249,173 0.5
## 48 Homicide 1 36,795 1,243,249,173 3.0
## 49 Undetermined Cut/Pierce 10 1,243,249,173 Unreliable
## 50 Undetermined Drowning 377 1,243,249,173 0.0
## 51 Undetermined Fall 96 1,243,249,173 0.0
## 52 Undetermined Fire/Flame 256 1,243,249,173 0.0
## 53 Undetermined Hot object/Substance 4 1,243,249,173 Unreliable
## 54 Undetermined Firearm 501 1,243,249,173 0.0
## 55 Undetermined Other land transport 33 1,243,249,173 0.0
## 56 Undetermined Poisoning 1,217 1,243,249,173 0.1
## 57 Undetermined Struck by or against 2 1,243,249,173 Unreliable
## 58 Undetermined Suffocation 1,234 1,243,249,173 0.1
## 59 Undetermined Other specified, classifiable Injury 23 1,243,249,173 0.0
## 60 Undetermined Other specified, not elsewhere classified Injury 267 1,243,249,173 0.0
## 61 Undetermined Unspecified Injury 743 1,243,249,173 0.1
## 62 Undetermined 1 4,763 1,243,249,173 0.4
## 63 Legal Intervention / Operations of War Firearm 251 1,243,249,173 0.0
## 64 Legal Intervention / Operations of War Other specified, classifiable Injury 1 1,243,249,173 Unreliable
## 65 Legal Intervention / Operations of War Other specified, not elsewhere classified Injury 10 1,243,249,173 Unreliable
## 66 Legal Intervention / Operations of War Unspecified Injury 2 1,243,249,173 Unreliable
## 67 Legal Intervention / Operations of War 1 264 1,243,249,173 0.0
## 68 2 207,654 1,243,249,173 16.7 <NA>
而且,你仍然需要清理它,但是你现在至少有办法制作,执行和解析查询。
可能值得请求rOpenSci人员采用这个pkg(我不记得在他们庞大的图书馆看到它)。我也会在几周内打它。