EOFError raised when using urllib.urlretrieve

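Time: 2016-04-06 12:30:06

Tags: python ftp urllib

I have to download multiple files from FTP links. Regardless of the order, the download stops completely after 5 files with the error above. Can anyone suggest a solution?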

import pandas as pd
import os
import urllib
import zipfile

zipFilePath=['ftp://ftp.sec.gov/edgar/data/1000069/000089418911000620/0000894189-11-000620-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1000180/000100018011000006/0001000180-11-000006-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1000228/000100022811000014/0001000228-11-000014-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1000229/000100022911000015/0001000229-11-000015-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1000351/000089418911000615/0000894189-11-000615-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1000351/000089418911000655/0000894189-11-000655-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1000697/000095012311018381/0000950123-11-018381-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1000753/000114036111008714/0001140361-11-008714-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1001039/000119312511027450/0001193125-11-027450-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1001082/000110465911009436/0001104659-11-009436-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/100122/000095012311020431/0000950123-11-020431-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1001250/000110465911005139/0001104659-11-005139-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1001288/000095012311019815/0000950123-11-019815-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1001604/000100160411000022/0001001604-11-000022-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1001838/000110465911011083/0001104659-11-011083-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1002047/000119312511056223/0001193125-11-056223-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1002517/000095012311011086/0000950123-11-011086-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1002638/000119312511022882/0001193125-11-022882-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1002718/000119312511040571/0001193125-11-040571-xbrl.zip',
 'ftp://ftp.sec.gov/edgar/data/1002718/000119312511042365/0001193125-11-042365-xbrl.zip']

tempFolderPath = "<give some path>"
tempDownloadPath=os.path.join(tempFolderPath,"xbrl.zip")
xbrlFinal=pd.DataFrame()
for inds,paths in enumerate(zipFilePath):
    print "processing xmls " + str(inds+1) +" of " + str(len(zipFilePath))
    urllib.urlretrieve(paths,tempDownloadPath)
    fh=open(tempDownloadPath,'rb')
    z=zipfile.ZipFile(fh)
    files=z.extract(z.namelist()[0], tempFolderPath)
    z.close()
    fh.close()

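1 answer:

Answer 0 (score: 0)

I found the answer. The download actually works fine in R, so the site is not imposing any request limit. In Python I tried several packages: urllib, wget, and requests did not work, but urllib2 did. The code is as follows: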

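The answer's own Python snippet is not shown here, so the following is only a minimal sketch of what a urllib2-based download could look like, not the answerer's actual code. The helper name `download_and_extract_first` and the 60-second timeout are assumptions made for illustration. The import falls back to `urllib.request.urlopen`, the Python 3 counterpart of `urllib2.urlopen`. Each response is closed explicitly before the next file is requested.

```python
import contextlib
import os
import shutil
import tempfile
import zipfile

try:
    from urllib2 import urlopen          # Python 2, as used in the answer
except ImportError:
    from urllib.request import urlopen   # Python 3 counterpart


def download_and_extract_first(url, dest_dir, timeout=60):
    """Download the zip at `url` and extract its first member into `dest_dir`.

    Hypothetical helper: the name and the default timeout are illustrative.
    Returns the path of the extracted file.
    """
    # Reserve a temporary file name inside dest_dir for the downloaded archive.
    fd, tmp_path = tempfile.mkstemp(suffix=".zip", dir=dest_dir)
    os.close(fd)
    try:
        # closing() releases the connection even if the copy fails part-way.
        with contextlib.closing(urlopen(url, timeout=timeout)) as response:
            with open(tmp_path, "wb") as out:
                shutil.copyfileobj(response, out)
        # Same extraction step as in the question: first archive member only.
        with zipfile.ZipFile(tmp_path) as archive:
            return archive.extract(archive.namelist()[0], dest_dir)
    finally:
        # The downloaded archive is no longer needed once it has been read.
        os.remove(tmp_path)
```

In the question's loop, the `urllib.urlretrieve` call and the manual open/extract/close steps would then be replaced by `download_and_extract_first(paths, tempFolderPath)`.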

And urllib2 was 5 times faster than the others.