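When I download a gz file directly from http://prices.shufersal.co.il/ (any link in the leftmost column of the table will do),
the file is treated as corrupted, but when I download it manually from the website, it reads fine.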
>shufersal.url='http://pricesprodpublic.blob.core.windows.net/price/Price7290027600007-001-201505220240.gz?sv=2014-02-14&sr=b&sig=VUHTJAzWEBqMJXO%2BwHE4WAh3DJNWkw4w03%2BLk8c6dUw%3D&se=2015-05-23T14%3A47%3A42Z&sp=r'
> temp <- tempfile()
> download.file(shufersal.url,temp,quiet = T)
> gzfile(temp)
class       "gzfile"
mode        "rb"
text        "text"
opened      "closed"
can read    "yes"
can write   "yes"
> readLines(gzfile(temp))
character(0)
Warning message:
invalid or incomplete compressed data
> unlink(temp)
shufersal.url="https://github.com/yonicd/supermarketprices/raw/master/shufersal/Price7290027600007-001-201505220240.gz"
temp <- tempfile()
download.file(shufersal.url,temp,quiet = T)
readLines(gzfile(temp),encoding = "UTF-8")
unlink(temp)
> readLines(gzfile(temp),encoding = "UTF-8")
[1] "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
[2] "<root>"
[3] " <ChainId>7290027600007</ChainId>"
[4] " <SubChainId>001</SubChainId>"
[5] " <StoreId>001</StoreId>"
[6] " <BikoretNo>6</BikoretNo>"
[7] " <Items>"
[8] " <Item>"
[9] " <PriceUpdateDate>2015-05-21 11:11</PriceUpdateDate>"
[10] " <ItemCode>7290000000046</ItemCode>"
...
[77] " <AllowDiscount>1</AllowDiscount>"
[78] " <ItemStatus>1</ItemStatus>"
[79] " </Item>"
[80] " </Items>"
Answer 0 (score: 1)
# Pass mode = "wb" to download.file so the file is written in binary mode
download.file(shufersal.url, temp, mode = "wb", quiet = TRUE)
See help(download.file) for details.
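As a rough sketch of how the fix could be checked, the snippet below downloads in binary mode and confirms that the file starts with the gzip magic bytes (1f 8b) before reading it. The helper names is_gzip and read_remote_gz are hypothetical, not part of base R. The likely cause, as far as help(download.file) describes it, is that on Windows the default mode = "w" performs a text transfer that alters line endings, and the automatic switch to binary only applies when the URL itself ends in an extension such as .gz, which the blob URL with its query string does not.

```r
# Hypothetical helper: TRUE when the file starts with the gzip magic bytes 1f 8b.
is_gzip <- function(path) {
  magic <- readBin(path, what = "raw", n = 2L)
  identical(magic, as.raw(c(0x1f, 0x8b)))
}

# Hypothetical wrapper: download in binary mode, check the header, read the lines.
read_remote_gz <- function(url) {
  dest <- tempfile(fileext = ".gz")
  on.exit(unlink(dest))
  download.file(url, dest, mode = "wb", quiet = TRUE)
  if (!is_gzip(dest)) stop("downloaded file does not start with a gzip header")
  readLines(gzfile(dest), encoding = "UTF-8")
}

# Local check of the helper, no network needed: write a small gzip file
# and a plain-text file, then compare the two.
good <- tempfile(fileext = ".gz")
con <- gzfile(good, open = "w")
writeLines(c("<root>", "</root>"), con)
close(con)

bad <- tempfile()
writeLines("plain text", bad)

good_is_gz <- is_gzip(good)
bad_is_gz <- is_gzip(bad)
gz_lines <- readLines(gzfile(good))
```

With the URL from the question, the call would be read_remote_gz(shufersal.url). The header check only looks at the first two bytes, so it catches a file that is not gzip at all but not one that is truncated or damaged further in.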