bind_rows error when creating a tibble from list elements using xml2 and purrr

Date: 2017-08-08 11:22:28

Tags: r tidyverse purrr xml2 tibble

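My goal is to extract lists of lists from an XML file and store them as tibbles for further use. I have successfully used the tidyverse option described here, but I get an error message for some of the lists.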

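I entered the error message into my online search engine and used the following search terms on Stack Overflow:

  • map_df(flatten) bind_rows
  • map_df getCharCE
  • bind_rows getCharCE
  • bind_rows CHARSXP

but I did not manage to identify a solution to my problem.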

Code:

library(tidyverse)
library(purrr)
library(xml2)

url_deu <- "http://uwwtd.oieau.fr/Germany/sites/uwwtd.oieau.fr.Germany/files/data_sources/ekommude20145thdeploymentrelease_1475165194.xml"

# Parse the XML document and convert it to a nested R list
dat_deu <- read_xml(url_deu)

list_deu <- as_list(dat_deu)

# Names of the child elements of the report
vector_deu <- names(list_deu$UWWTD_Report)

# Pick element 12, drop one level of nesting, then row-bind the records
list_deu %>% 
  map(vector_deu[12]) %>% 
  flatten() %>% 
  map_df(flatten)

Error message:

Error in bind_rows_(x, .id) : 'getCharCE' must be called on a CHARSXP

It works for most of the other list elements in vector_deu, see:

list_deu %>% 
  map(vector_deu[7]) %>% 
  flatten() %>% 
  map_df(flatten)

First five rows of the result:

 # A tibble: 4,372 x 23
   aggState        repCode      aggCode           aggName aggNUTS aggLatitude aggLongitude aggGenerated
      <chr>          <chr>        <chr>             <chr>   <chr>       <chr>        <chr>        <chr>
 1        1 DE_UWWT_2014_1  DEAG_SH1000         Flensburg   DEF01     54.8049     9.446056       140000
 2        1 DE_UWWT_2014_1  DEAG_SH3000 Hansestadt Lübeck   DEF03    53.90011      10.6945       395152
 3        1 DE_UWWT_2014_1 DEAG_SH3000b    Lübeck-Priwall   DEF03    53.77016     10.85915        18317
 4        1 DE_UWWT_2014_1 DEAG_SH51001        Albersdorf   DEF05    54.14533      9.27315         4870
 5        1 DE_UWWT_2014_1 DEAG_SH51011       Brunsbüttel   DEF05    53.89388     9.180801        13886
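One way to look into why element 7 works while element 12 fails is to compare how deeply each element is nested. The sketch below is only an illustration on a small hand-made list: `toy_report` and `list_depth` are hypothetical names, and the toy structure is an assumption about how `as_list()` might represent the two elements, not a copy of the real file.

```r
# Hypothetical stand-in for list_deu$UWWTD_Report (assumed structure):
# one container holding repeated records, and one element holding a single record.
toy_report <- list(
  Agglomerations = list(
    Agglomeration = list(aggCode = list("DEAG_SH1000"), aggName = list("Flensburg")),
    Agglomeration = list(aggCode = list("DEAG_SH51001"), aggName = list("Albersdorf"))
  ),
  MSLevel = list(
    repCode = list("DE_UWWT_2014_1"),
    mslWWReusePerc = list("0")
  )
)

# Hypothetical helper: number of list levels above the innermost values.
list_depth <- function(x) {
  if (!is.list(x) || length(x) == 0) return(0L)
  1L + max(vapply(x, list_depth, integer(1)))
}

depth_agg <- list_depth(toy_report$Agglomerations)
depth_ms  <- list_depth(toy_report$MSLevel)

print(c(Agglomerations = depth_agg, MSLevel = depth_ms))
```

If the real elements differ in depth in the same way, applying the same two flattening steps to both would leave them in different shapes, which may be worth checking with `str(list_deu$UWWTD_Report[[12]], max.level = 2)`.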

I also used unlist, which works:

unlist(list_deu$UWWTD_Report[12])

and returns the following:

    MSLevel.repCode MSLevel.mslSludgeProduction      MSLevel.mslWWReusePerc 
   "DE_UWWT_2014_1"                   "1463463"                         "0" 
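The names in this output (`MSLevel.repCode` and so on) suggest that element 12 holds a single record whose fields sit directly under `MSLevel`, rather than a container of repeated records. Under that assumption, a possible workaround is to build one row per record explicitly instead of flattening a fixed number of times. This is a minimal sketch on toy data: `toy_report` and `record_to_row` are hypothetical names, and it has not been run against the real file.

```r
library(purrr)
library(tibble)
library(dplyr)

# Hypothetical stand-in for list_deu$UWWTD_Report (assumed structure).
toy_report <- list(
  Agglomerations = list(
    Agglomeration = list(aggCode = list("DEAG_SH1000"), aggName = list("Flensburg")),
    Agglomeration = list(aggCode = list("DEAG_SH51001"), aggName = list("Albersdorf"))
  ),
  MSLevel = list(
    repCode = list("DE_UWWT_2014_1"),
    mslSludgeProduction = list("1463463"),
    mslWWReusePerc = list("0")
  )
)

# Hypothetical helper: turn one record (a named list of one-element lists)
# into a one-row tibble. Assumes every field holds exactly one value.
record_to_row <- function(record) {
  as_tibble(map(record, ~ .x[[1]]))
}

# Single record: convert it directly.
ms_level <- record_to_row(toy_report$MSLevel)

# Repeated records: convert each one, then row-bind the results.
agglomerations <- bind_rows(map(toy_report$Agglomerations, record_to_row))

print(ms_level)
print(agglomerations)
```

Fields that are empty in the XML would make `.x[[1]]` fail, so real data would probably need a guard for that case.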

Any help would be highly appreciated.

Session info

sessionInfo()

R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 17.04

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.7.0
LAPACK: /usr/lib/lapack/liblapack.so.3.7.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=de_CH.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=de_CH.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_CH.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] xml2_1.1.1      dplyr_0.7.2     purrr_0.2.2.2   readr_1.1.1     tidyr_0.6.3     tibble_1.3.3   
[7] ggplot2_2.2.1   tidyverse_1.1.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12     cellranger_1.1.0 compiler_3.4.1   plyr_1.8.4       bindr_0.1        forcats_0.2.0   
 [7] tools_3.4.1      jsonlite_1.5     lubridate_1.6.0  nlme_3.1-131     gtable_0.2.0     lattice_0.20-35 
[13] pkgconfig_2.0.1  rlang_0.1.1      psych_1.7.5      curl_2.8.1       parallel_3.4.1   haven_1.1.0     
[19] bindrcpp_0.2     stringr_1.2.0    httr_1.2.1       hms_0.3          grid_3.4.1       glue_1.1.1      
[25] R6_2.2.2         readxl_1.0.0     foreign_0.8-69   modelr_0.1.1     reshape2_1.4.2   magrittr_1.5    
[31] scales_0.4.1     rvest_0.3.2      assertthat_0.2.0 mnormt_1.5-5     colorspace_1.3-2 stringi_1.1.5   
[37] lazyeval_0.2.0   munsell_0.4.3    broom_0.4.2   

0 answers:

No answers yet.