I'm trying to build a data frame of Brazilian addresses by querying a web service with zip codes. I can already retrieve a single result and store it in a data frame, but when I search for multiple zip codes (e.g. from a vector), my data frame keeps only the last element. Could anybody help me, please?
See the code below:
###############
library(httr)
library(RCurl)
library(XML)
library(dplyr)
###############
# ZIPs I want to search for:
vectorzip <- c("71938360", "70673052", "71020510")
j <- length(vectorzip)
# loop:
for(i in 1:j) {
# Save the URL of the xml file in a variable:
xml.url <- getURL(paste("http://cep.republicavirtual.com.br/web_cep.php?cep=",vectorzip[i], sep = ""), encoding = "ISO-8859-1")
xml.url
# Use the xmlTreeParse-function to parse xml file directly from the web:
xmlfile <- xmlTreeParse(xml.url)
xmlfile
# the xml file is now saved as an object you can easily work with in R:
class(xmlfile)
# Use the xmlRoot-function to access the top node:
xmltop = xmlRoot(xmlfile)
# have a look at the XML-code of the first subnodes:
print(xmltop)
# To extract the XML-values from the document, use xmlSApply:
zips <- xmlSApply(xmlfile, function(x) xmlSApply(x, xmlValue))
zips
# Finally, get the data in a data-frame and have a look at the first rows and columns:
zips <- NULL
zips <- rbind(zips_df, data.frame(t(zips),row.names=NULL))
View(zips_df)}
Answer 0 (score: 0)
You want to:
a) define zips_df,
b) define it outside of the loop, and
c) not reset zips to NULL inside the loop :)
###############
library(httr)
library(RCurl)
library(XML)
library(dplyr)
###############
# ZIPs I want to search for:
vectorzip <- c("71938360", "70673052", "71020510")
j <- length(vectorzip)
zips_df <- data.frame()
i<-1
# loop:
for(i in 1:j) {
# Save the URL of the xml file in a variable:
xml.url <- getURL(paste("http://cep.republicavirtual.com.br/web_cep.php?cep=",vectorzip[i], sep = ""), encoding = "ISO-8859-1")
xml.url
# Use the xmlTreeParse-function to parse xml file directly from the web:
xmlfile <- xmlTreeParse(xml.url)
xmlfile
# the xml file is now saved as an object you can easily work with in R:
class(xmlfile)
# Use the xmlRoot-function to access the top node:
xmltop = xmlRoot(xmlfile)
# have a look at the XML-code of the first subnodes:
print(xmltop)
# To extract the XML-values from the document, use xmlSApply:
zips <- xmlSApply(xmlfile, function(x) xmlSApply(x, xmlValue))
zips
# Finally, get the data in a data-frame and have a look at the first rows and columns:
zips_df <- rbind(zips_df, data.frame(t(zips),row.names=NULL))
}
View(zips_df)
You get this:
> zips_df
  resultado.text     resultado_txt.text uf.text cidade.text        bairro.text tipo_logradouro.text  logradouro.text
1              1 sucesso - cep completo      DF  Taguatinga Sul (Ãguas Claras)                  Rua               09
2              1 sucesso - cep completo      DF    Cruzeiro     Setor Sudoeste               Quadra      300 Bloco O
3              1 sucesso - cep completo      DF       Guará            Guará I               Quadra QI 11 Conjunto U
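As an alternative to growing zips_df inside the loop, the same result can be built by collecting one data frame per zip code in a list and binding them once at the end. This is only a minimal sketch: fetch_zip() is a hypothetical stand-in for the web-service call and its columns (cep, uf) are made up, so that the example runs without network access.

```r
# Hypothetical stand-in for the web-service call: returns a one-row
# data.frame for a zip code (the columns here are illustrative only).
fetch_zip <- function(zip) {
  data.frame(cep = zip, uf = "DF", stringsAsFactors = FALSE)
}

vectorzip <- c("71938360", "70673052", "71020510")

# Collect one data.frame per zip code in a list ...
rows <- lapply(vectorzip, fetch_zip)

# ... then bind all of them into a single data.frame in one step.
zips_df <- do.call(rbind, rows)

print(zips_df)
```

With this pattern there is no accumulator to initialise or accidentally reset, and the data frame is not copied on every iteration.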
Answer 1 (score: 0)
Please try to provide a minimal working example. Your example has many lines of code that are unrelated to your actual problem. If you had tried to remove the unnecessary code, you would probably have spotted the zips <- NULL line, which erases the zips information just before it is saved. Secondly, you reference a zips_df object that is never created in your code.
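For illustration, a stripped-down example of this kind might look like the sketch below. It uses hypothetical stand-in data (the values vector and the acc / acc_fixed names are not from the original code) and no web service, and it only shows the general pattern of resetting an accumulator inside a loop versus creating it once before the loop.

```r
# Hypothetical stand-in data; no web service involved.
values <- c("a", "b", "c")

# Problematic pattern: the accumulator is reset on every iteration,
# so only the row from the last iteration survives.
for (v in values) {
  acc <- NULL
  acc <- rbind(acc, data.frame(value = v, stringsAsFactors = FALSE))
}
print(nrow(acc))        # 1

# Fixed pattern: create the accumulator once, before the loop.
acc_fixed <- data.frame()
for (v in values) {
  acc_fixed <- rbind(acc_fixed, data.frame(value = v, stringsAsFactors = FALSE))
}
print(nrow(acc_fixed))  # 3
```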
To answer your question:
Add a line creating zips_df as an empty data.frame object before you start the loop:
vectorzip <- c("71938360", "70673052", "71020510")
j <- length(vectorzip)
zips_df <- data.frame()
Remove the line where you erase the zips object (zips <- NULL).
Change the line where you grow the zips_df data.frame so that the result is saved to zips_df itself, not to the temporary zips variable:
zips_df <- rbind(zips_df, data.frame(t(zips),row.names=NULL))
I also recommend removing the "View" line and inspecting the data.frame with print:
print(zips_df)
  resultado.text     resultado_txt.text uf.text cidade.text              bairro.text tipo_logradouro.text  logradouro.text
1              1 sucesso - cep completo      DF  Taguatinga Sul (Ã\u0081guas Claras)                  Rua               09
2              1 sucesso - cep completo      DF    Cruzeiro           Setor Sudoeste               Quadra      300 Bloco O
3              1 sucesso - cep completo      DF       Guará                  Guará I               Quadra QI 11 Conjunto U