Reading a CSV file directly from a website in Python 3

Time: 2017-10-05 17:27:23

Tags: python python-3.x csv

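I am trying to read a CSV file directly from a website (from a downloadable link) and then get one of its columns as a list so that I can work with it further. I cannot get the code right. The closest I have come is: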

import csv
import urllib.request as urlRequest

url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers = headers)
# open the url 
x = urlRequest.urlopen(req)
sourceCode = x.read()

1 answer:

Answer 0 (score: 1)

You are very close.

Just split the CSV data you read into lines and pass them to csv.DictReader():

import csv
import urllib.request as urlRequest

url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers = headers)
# open the url 
x = urlRequest.urlopen(req)
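# read() returns bytes; decode to str because the csv module only accepts text
# (utf-8 is assumed here, use the encoding the file actually has)
sourceCode = x.read().decode('utf-8')

# parse the lines as CSV keyed by the header row, then collect one column
cr = csv.DictReader(sourceCode.splitlines())
l = [row['Series'] for row in cr]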

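Note, however, that x.read() returns bytes, not str, and the csv module in Python 3 only accepts text. The data therefore has to be decoded before it is split into lines and passed to csv.DictReader():

x.read().decode('utf-8')  # or whatever encoding the file actually uses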
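
The decode-then-parse step can be tried without a network connection. The following is only a minimal sketch: the helper name column_from_csv_bytes and the sample rows are made up for illustration and are not the real contents of the NSE file, and UTF-8 is an assumed default encoding.

```python
import csv
import io


def column_from_csv_bytes(raw, column, encoding="utf-8"):
    """Return the values of one column from CSV data given as bytes.

    Hypothetical helper: `raw` stands in for what x.read() returns.
    """
    # Decode first: csv.DictReader needs str lines, not bytes.
    # newline="" leaves line endings untouched so the csv module handles them.
    text = io.StringIO(raw.decode(encoding), newline="")
    reader = csv.DictReader(text)
    return [row[column] for row in reader]


# Made-up sample standing in for the downloaded bytes
sample = b"Company Name,Symbol,Series\r\nExample One,EX1,EQ\r\nExample Two,EX2,EQ\r\n"
series = column_from_csv_bytes(sample, "Series")
print(series)
```

With a real download, the bytes from x.read() would be passed in place of sample. Wrapping the decoded text in io.StringIO rather than calling splitlines() is a design choice that also keeps quoted fields containing line breaks intact.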