R. How to append loop (for) results into a Data Frame?

Date: 2016-04-04 17:05:56

Tags: r loops web service append

I'm trying to build a data frame of Brazilian addresses by querying a web service with a zip code. I can already retrieve a single result and store it in a data frame, but when I search for multiple zip codes (e.g. in a vector), my data frame only keeps the last element. Could anybody help me, please?

See the code below:

###############
library(httr)
library(RCurl)
library(XML)
library(dplyr)
###############

# ZIPs I want to search for:
vectorzip <- c("71938360", "70673052", "71020510")
j <- length(vectorzip)

# loop:
for(i in 1:j) {

# Save the URL of the xml file in a variable:
xml.url <- getURL(paste("http://cep.republicavirtual.com.br/web_cep.php?cep=",vectorzip[i], sep = ""), encoding = "ISO-8859-1")
xml.url

# Use the xmlTreeParse-function to parse xml file directly from the web:
xmlfile <- xmlTreeParse(xml.url)
xmlfile
# the xml file is now saved as an object you can easily work with in R:
class(xmlfile)

# Use the xmlRoot-function to access the top node:
xmltop = xmlRoot(xmlfile)

# have a look at the XML-code of the first subnodes:
print(xmltop)

# To extract the XML-values from the document, use xmlSApply:
zips <- xmlSApply(xmlfile, function(x) xmlSApply(x, xmlValue))
zips
# Finally, get the data in a data-frame and have a look at the first rows and columns:
zips <- NULL
zips <- rbind(zips_df, data.frame(t(zips),row.names=NULL))

View(zips_df)}

2 answers:

Answer 0 (score: 0)

You want to:

a) define zips_df,
b) define it outside of the loop, and
c) not reset zips to NULL inside the loop, and assign the rbind result to zips_df :)

###############
library(httr)
library(RCurl)
library(XML)
library(dplyr)
###############

# ZIPs I want to search for:
vectorzip <- c("71938360", "70673052", "71020510")
j <- length(vectorzip)
zips_df <- data.frame()

i<-1
# loop:
for(i in 1:j) {

  # Save the URL of the xml file in a variable:
  xml.url <- getURL(paste("http://cep.republicavirtual.com.br/web_cep.php?cep=",vectorzip[i], sep = ""), encoding = "ISO-8859-1")
  xml.url

  # Use the xmlTreeParse-function to parse xml file directly from the web:
  xmlfile <- xmlTreeParse(xml.url)
  xmlfile
  # the xml file is now saved as an object you can easily work with in R:
  class(xmlfile)

  # Use the xmlRoot-function to access the top node:
  xmltop = xmlRoot(xmlfile)

  # have a look at the XML-code of the first subnodes:
  print(xmltop)

  # To extract the XML-values from the document, use xmlSApply:
  zips <- xmlSApply(xmlfile, function(x) xmlSApply(x, xmlValue))
  zips
  # Finally, get the data in a data-frame and have a look at the first rows and columns:

  zips_df <- rbind(zips_df, data.frame(t(zips),row.names=NULL))
}

  View(zips_df)

You get this:

> zips_df
  resultado.text     resultado_txt.text uf.text cidade.text         bairro.text tipo_logradouro.text  logradouro.text
1              1 sucesso - cep completo      DF  Taguatinga Sul (Ãguas Claras)                  Rua               09
2              1 sucesso - cep completo      DF    Cruzeiro      Setor Sudoeste               Quadra      300 Bloco O
3              1 sucesso - cep completo      DF      Guará            Guará I               Quadra QI 11 Conjunto U
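
As an aside, a common alternative to growing the data frame with rbind inside the loop is to collect one-row data frames in a list and bind them once at the end. This is only a minimal sketch: lookup_zip() is a hypothetical helper that stands in for the getURL()/xmlTreeParse() steps above, and the columns it returns are placeholders rather than the real fields of the web service.

```r
# Hypothetical stand-in for the web-service call and XML parsing;
# it returns a one-row data frame so the sketch runs without network access.
lookup_zip <- function(zip) {
  data.frame(cep = zip, n_chars = nchar(zip), stringsAsFactors = FALSE)
}

vectorzip <- c("71938360", "70673052", "71020510")

# Build one data frame per ZIP, then bind them all in a single call
rows <- lapply(vectorzip, lookup_zip)
zips_df <- do.call(rbind, rows)

print(zips_df)
```

With this shape there is no accumulator to initialise or accidentally reset, and iterating over vectorzip directly avoids the 1:j index arithmetic.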

Answer 1 (score: 0)

Please try to provide a minimal working example. Your example has many lines of code that are unrelated to your actual problem. If you had removed that unnecessary code, you would probably have spotted the zips <- NULL line, which erases the zips information just before it is saved. Secondly, you reference a zips_df object that is never created in your code.
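
For instance, the "only the last element is kept" symptom can be reproduced without the web service at all. The following is a minimal sketch of one way it can arise; the names acc_bug and acc_ok are made up for illustration and do not appear in the original code.

```r
vectorzip <- c("71938360", "70673052", "71020510")

# Buggy pattern: the accumulator is reset inside the loop,
# so whatever earlier iterations stored is discarded.
for (z in vectorzip) {
  acc_bug <- NULL
  acc_bug <- rbind(acc_bug, data.frame(zip = z, stringsAsFactors = FALSE))
}

# Fixed pattern: create the accumulator once, before the loop.
acc_ok <- data.frame()
for (z in vectorzip) {
  acc_ok <- rbind(acc_ok, data.frame(zip = z, stringsAsFactors = FALSE))
}

nrow(acc_bug)  # 1: only the last ZIP survives
nrow(acc_ok)   # 3
```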

To answer your question:

  • Add a line creating the zips_df as an empty dataframe object before you start the loop:

    vectorzip <- c("71938360", "70673052", "71020510")
    j <- length(vectorzip)
    zips_df <- data.frame()
    
  • Remove the line where you erase the zips object (zips <- NULL)

  • Change the line where you grow the zips_df data.frame so that the result is assigned to zips_df itself, not to the temporary zips variable:

    zips_df <- rbind(zips_df, data.frame(t(zips),row.names=NULL))
    

I also recommend removing the "View" line and inspecting the data.frame with print:

print(zips_df)
  resultado.text     resultado_txt.text uf.text cidade.text              bairro.text tipo_logradouro.text  logradouro.text
1              1 sucesso - cep completo      DF  Taguatinga Sul (Ã\u0081guas Claras)                  Rua               09
2              1 sucesso - cep completo      DF    Cruzeiro           Setor Sudoeste               Quadra      300 Bloco O
3              1 sucesso - cep completo      DF      Guará                 Guará I               Quadra QI 11 Conjunto U