How do I read a .dat file directly from a URL and access its columns?

Time: 2019-04-24 16:29:03

Tags: python

I am trying to access this file through its URL:

https://data.princeton.edu/wws509/datasets/copen.dat

However, I have not been able to read it and split it into training and test sets.

Does anyone have a solution for this?

Thanks

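I ran the following code, which converts the data to html. How do I access the data now? For example, what should I do if I want to access particular rows and columns?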

import urllib.request
weburl=urllib.request.urlopen('https://data.princeton.edu/wws509/datasets/cuse.dat')

print('result code:'+ str(weburl.getcode()))
data=weburl.read()
print(data)

1 answer:

Answer 0: (score: 1)

To do this, you need to install the requests module for Python.

As @nekomatic suggested, you can follow the linked question "Getting list of lists into pandas DataFrame" to convert the data into the proper format:
import requests

response = requests.get('https://data.princeton.edu/wws509/datasets/copen.dat')
# The URL serves the data as text/plain, so response.json() would not work here;
# read the body as text instead.
data = response.text

print("data is ")
print(data)

# Split the text into lines, then split each line on runs of whitespace.
data_by_line = [line.split() for line in data.split('\n')]

# data_by_line is now a list of lists, indexed as data_by_line[row][column].
print(data_by_line[2][2])  # the original answer reports "low" as the output
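
The question also asks about picking out rows and columns and about splitting the data for training and testing, which the list of lists above does not cover directly. The following is only a minimal sketch using the standard library. It assumes the file has one header line followed by rows that begin with a row number. The names parse_table and train_test_split are hypothetical helpers, and SAMPLE_TEXT is placeholder text standing in for response.text, not a verified copy of copen.dat.

```python
import random

# Placeholder text laid out like the parsed file: a header line, then rows
# that start with a row number. Not a verified copy of copen.dat.
SAMPLE_TEXT = """\
   housing influence contact satisfaction  n
 1   tower       low     low          low 21
 2   tower       low     low       medium 21
 3   tower       low     low         high 28
 4   tower       low    high          low 14
 5   tower       low    high       medium 19
"""


def parse_table(text):
    """Split whitespace-delimited text into a header and a list of row dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, body = lines[0], lines[1:]
    records = []
    for fields in body:
        # Assumption: any extra leading field is a row number, so keep only
        # the last len(header) fields.
        values = fields[-len(header):]
        records.append(dict(zip(header, values)))
    return header, records


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and cut it into train and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


header, records = parse_table(SAMPLE_TEXT)
satisfaction = [row["satisfaction"] for row in records]  # one column
first_row = records[0]                                   # one row
train, test = train_test_split(records, test_fraction=0.2, seed=0)
```

To try this on the real file, pass response.text from the request above to parse_table in place of SAMPLE_TEXT. All values stay as strings, so a numeric column such as n would still need int() applied before modelling.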