Question

I have already looked at the other answers I found on this forum under "In Python, given a URL to a text file, what is the simplest way to read the contents of the text file?".

That was useful, but if you look at my URL file here, http://baldboybakery.com/courses/phys2300/resources/CDO6674605799016.txt, you will notice that it contains a huge amount of data. So when I use this code:

import urllib2

data = urllib2.urlopen('http://baldboybakery.com/courses/phys2300/resources/CDO6674605799016.txt').read(69700) # read only 69700 chars
data = data.split("\n") # then split it into lines

for line in data:
    print line

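With this code Python reads 69700 characters from the URL file, column headers included, but my problem is that I need all of the data in it, which is roughly 30000000 characters.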

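When I pass in a much larger character count, I only get one chunk of the data rather than all of it, and the headers for each column of the URL file's data disappear. Can anyone help me fix this?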

Answer 1

What you want to do here is read and process the data in chunks, for example:

import urllib2
f = urllib2.urlopen('http://baldboybakery.com/courses/phys2300/resources/CDO6674605799016.txt')
while True:
    next_chunk = f.read(4096) #read next 4k
    if not next_chunk: #all data has been read
        break
    process_chunk(next_chunk) #arbitrary processing
f.close()
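One thing to watch with fixed-size chunks is that a chunk can end in the middle of a line. Below is a minimal sketch of one way to handle that, written for Python 3, where urllib.request.urlopen takes the place of urllib2.urlopen. The names iter_lines_from_chunks and count_lines_at are hypothetical helpers made up for this illustration, and the sketch assumes lines end in "\n" (a trailing "\r" from Windows line endings would be left on each line).

```python
import io
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen


def iter_lines_from_chunks(stream, chunk_size=4096):
    """Yield complete lines (bytes, newline removed) from a binary stream read in chunks."""
    pending = b""  # tail of the data read so far that has no newline yet
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means all data has been read
            break
        pending += chunk
        # Everything before the last "\n" is complete; the remainder waits for more data.
        *lines, pending = pending.split(b"\n")
        for line in lines:
            yield line
    if pending:  # final line without a trailing newline
        yield pending


def count_lines_at(url, chunk_size=4096):
    """Hypothetical helper: count lines behind a URL without loading the whole body."""
    with urlopen(url) as response:
        return sum(1 for _ in iter_lines_from_chunks(response, chunk_size))


# Offline demonstration: an in-memory stream stands in for the HTTP response.
sample = io.BytesIO(b"a,b\n1,2\n3,4")
demo_lines = list(iter_lines_from_chunks(sample, chunk_size=3))
```

In this sketch, the per-line work plays the role of process_chunk above: each yielded line can be handled on its own, so memory use stays near chunk_size plus the longest line.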

Answer 2

The simple approach works fine.

If you want to go through the file line by line:

import urllib2

for line in urllib2.urlopen('http://baldboybakery.com/courses/phys2300/resources/CDO6674605799016.txt'):
    # Do something, like maybe print the data:
    print line,
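Since the question mentions losing the column headers, here is a rough Python 3 sketch of the same line-by-line idea that keeps the first line separate from the rest. It assumes the first line of the file holds the column names and that columns are separated by whitespace; neither is confirmed by the question, so adjust the splitting to the real layout. The helper names and the sample field names are made up for illustration.

```python
import io
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen


def decoded_lines(stream, encoding="utf-8"):
    """Yield text lines from a binary stream, without line endings."""
    for raw in stream:  # file-like objects, including HTTP responses, iterate by line
        yield raw.decode(encoding).rstrip("\r\n")


def header_and_rows(stream, encoding="utf-8"):
    """Return (header_fields, row_iterator); assumes whitespace-separated columns."""
    lines = decoded_lines(stream, encoding)
    header = next(lines, "").split()  # first line is assumed to be the header
    rows = (line.split() for line in lines if line.strip())  # skip blank lines
    return header, rows


def header_and_rows_from_url(url):
    """Hypothetical helper: fetch a URL and return its header plus all rows as a list."""
    with urlopen(url) as response:
        header, rows = header_and_rows(response)
        return header, list(rows)  # materialize before the response is closed


# Offline demonstration with invented field names and values.
sample = io.BytesIO(b"STATION DATE TMAX\r\nX1 20100101 12\r\n\r\nX1 20100102 15\n")
demo_header, demo_row_iter = header_and_rows(sample)
demo_rows = list(demo_row_iter)
```

The rows come from a generator, so only one line is held in memory at a time until you choose to collect them.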

Or, if you want to download all of the data:

import sys
import urllib2

data = urllib2.urlopen('http://baldboybakery.com/courses/phys2300/resources/CDO6674605799016.txt')
data = data.read()  # no size argument, so everything is read
sys.stdout.write(data)
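If the goal is just to get the whole file, an alternative to holding it all in one string is to stream it to disk. This is a sketch for Python 3 and a different technique from the read() call above: it uses shutil.copyfileobj from the standard library. The helper names save_stream and download, and the 64 KiB buffer size, are choices made for this example only.

```python
import io
import os
import shutil
import tempfile
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen


def save_stream(stream, path, chunk_size=64 * 1024):
    """Copy a binary stream to a file on disk and return the number of bytes written."""
    with open(path, "wb") as out:
        shutil.copyfileobj(stream, out, chunk_size)  # copies in chunk_size pieces
    return os.path.getsize(path)


def download(url, path):
    """Hypothetical helper: save the body behind a URL to a local file."""
    with urlopen(url) as response:
        return save_stream(response, path)


# Offline demonstration: an in-memory stream stands in for the HTTP response.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "sample.txt")
    demo_size = save_stream(io.BytesIO(b"header\nrow 1\nrow 2\n"), demo_path)
    with open(demo_path, "rb") as f:
        demo_bytes = f.read()
```

Once the file is on disk it can be opened and read line by line like any local text file, which may be more convenient for something around 30000000 characters.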

Given the URL of a text file, what is the simplest way to read the contents of a text file that holds a large amount of data?
