I am facing the following problem. I have a list of lists that I fetch from a remote URL with the following code:
import csv
import urllib.request
text_url = 'https://www.emidius.eu/fdsnws/event/1/query?starttime=1899-01-01T00:00:00&endtime=1899-01-31T23:59:59&minmag=4&maxmag=9&orderby=time-asc&limit=100&format=text'
with urllib.request.urlopen(text_url) as response:
    my_text = response.read().decode()
lines = my_text.splitlines()
reader = csv.reader(lines, delimiter='|')
I can convert the reader into a list of lists with:
my_list = list(reader)
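What I would like to do is convert the list of lists (or the reader itself) into a dictionary of lists. The items of the first list should become the dictionary keys, and the second through last lists should supply the dictionary values, as lists: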
my_list[0] # dict keys
['#EventID',
'Time',
'Latitude',
'Longitude',
'Depth/km',
'Author',
'Catalog',
'Contributor',
'ContributorID',
'MagType',
'Magnitude',
'MagAuthor',
'EventLocationName']
my_list[1:] # dict values as list
[['quakeml:eu.ahead/event/18990105_0245_000',
'1899-01-05T02:45:--',
'41.500',
'13.783',
'',
'AHEAD',
'SHEEC',
'CPTI04',
'1309',
'Mw',
'4.63',
'SHEEC',
'Pignataro'],
['quakeml:eu.ahead/event/18990118_2048_000',
'1899-01-18T20:48:--',
'46.180',
'14.500',
'4.8',
'AHEAD',
'SHEEC',
'RIBA982',
'',
'Mw',
'4.51',
'SHEEC',
'Vodice Brnik'],
['quakeml:eu.ahead/event/18990122_0956_000',
'1899-01-22T09:56:--',
'37.200',
'21.600',
'',
'AHEAD',
'SHEEC',
'PAPA003',
'',
'Mw',
'6.50',
'SHEEC',
'Kyparissia'],
['quakeml:eu.ahead/event/18990131_1112_000',
'1899-01-31T11:12:--',
'66.300',
'-19.900',
'',
'AHEAD',
'SHEEC',
'AMBSI000',
'',
'Mw',
'5.80',
'SHEEC',
'[N. Iceland]'],
['quakeml:eu.ahead/event/18990131_2345_000',
'1899-01-31T23:45:--',
'60.100',
'5.500',
'30',
'AHEAD',
'SHEEC',
'FEN007',
'',
'Mw',
'4.60',
'SHEEC',
'[Biornafjorden]']]
Basically, the output should look something like this:
d['#EventID'] = ['quakeml:eu.ahead/event/18990105_0245_000', 'quakeml:eu.ahead/event/18990105_0245_000', 'quakeml:eu.ahead/event/18990105_0245_000']
Answer 0 (score: 2)
Try this:
>>> a, b = my_list[0], my_list[1:]  # a: header row (keys), b: data rows
>>> result_dict = {}
>>> for idx, key in enumerate(a):
...     for val in b:
...         result_dict.setdefault(key, []).append(val[idx])
...
Output:
>>> result_dict
{'#EventID': ['quakeml:eu.ahead/event/18990105_0245_000', 'quakeml:eu.ahead/event/18990118_2048_000', 'quakeml:eu.ahead/event/18990122_0956_000', 'quakeml:eu.ahead/event/18990131_1112_000', 'quakeml:eu.ahead/event/18990131_2345_000'], 'Time': ['1899-01-05T02:45:--', '1899-01-18T20:48:--', '1899-01-22T09:56:--', '1899-01-31T11:12:--', '1899-01-31T23:45:--'], 'Latitude': ['41.500', '46.180', '37.200', '66.300', '60.100'], 'Longitude': ['13.783', '14.500', '21.600', '-19.900', '5.500'], 'Depth/km': ['', '4.8', '', '', '30'], 'Author': ['AHEAD', 'AHEAD', 'AHEAD', 'AHEAD', 'AHEAD'], 'Catalog': ['SHEEC', 'SHEEC', 'SHEEC', 'SHEEC', 'SHEEC'], 'Contributor': ['CPTI04', 'RIBA982', 'PAPA003', 'AMBSI000', 'FEN007'], 'ContributorID': ['1309', '', '', '', ''], 'MagType': ['Mw', 'Mw', 'Mw', 'Mw', 'Mw'], 'Magnitude': ['4.63', '4.51', '6.50', '5.80', '4.60'], 'MagAuthor': ['SHEEC', 'SHEEC', 'SHEEC', 'SHEEC', 'SHEEC'], 'EventLocationName': ['Pignataro', 'Vodice Brnik', 'Kyparissia', '[N. Iceland]', '[Biornafjorden]']}
Answer 1 (score: 2)
Use csv.DictReader and dict.setdefault, for example:
import csv
d = {}
reader = csv.DictReader(lines, delimiter='|')
for row in reader:  # iterate over each row
    for k, v in row.items():  # iterate over key-value pairs
        d.setdefault(k, []).append(v)
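To try this approach without network access, here is a minimal sketch that feeds csv.DictReader a few hard-coded lines in the same pipe-delimited layout. The name sample_lines and the trimmed three-column data are assumptions made for illustration; the real feed has thirteen columns.

```python
import csv

# Hypothetical, trimmed stand-in for `lines` from the question (3 of the 13 columns).
sample_lines = [
    "#EventID|Time|Magnitude",
    "quakeml:eu.ahead/event/18990105_0245_000|1899-01-05T02:45:--|4.63",
    "quakeml:eu.ahead/event/18990118_2048_000|1899-01-18T20:48:--|4.51",
]

d = {}
# DictReader consumes the first line as field names, so only data rows are yielded.
for row in csv.DictReader(sample_lines, delimiter="|"):
    for key, value in row.items():
        d.setdefault(key, []).append(value)

print(d["Magnitude"])
```

All values stay strings, exactly as csv delivers them, so numeric columns such as Magnitude would still need an explicit conversion.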
Answer 2 (score: 1)
A naive option would be something like this:
l = [["a","b","c"],[1,2,3],[4,5,6],[7,8,9]]
d = {k:[] for k in l[0]}
for i in l[1:]:
    dummy = {k: v for k, v in zip(l[0], i)}
    for k in d.keys():
        d[k].append(dummy[k])
Answer 3 (score: 1)
A list of lists can be rotated 90 degrees (transposed) with zip():
d = {key:val for key, val in zip(my_list[0], zip(*my_list[1:]))}
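Note that zip() yields tuples, so the comprehension above maps each key to a tuple rather than a list. If lists are required, as the question asks, each column can be wrapped in list(). A minimal sketch, using a hypothetical shortened sample in place of my_list:

```python
# Hypothetical, shortened stand-in for `my_list` from the question.
sample = [
    ["#EventID", "Latitude", "Longitude"],
    ["quakeml:eu.ahead/event/18990105_0245_000", "41.500", "13.783"],
    ["quakeml:eu.ahead/event/18990118_2048_000", "46.180", "14.500"],
]

header, *rows = sample
# zip(*rows) transposes rows into columns; list() turns each tuple column into a list.
d = {key: list(column) for key, column in zip(header, zip(*rows))}

print(d["Latitude"])
```

One caveat: zip() stops at the shortest input, so if any row has fewer fields than the others, the trailing columns are silently dropped.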
Answer 4 (score: 1)
Another way to solve the problem, without using a dictionary, is to load the CSV data into a Pandas DataFrame:
import pandas as pd
import urllib.request
text_url = 'https://www.emidius.eu/fdsnws/event/1/query?starttime=1899-01-01T00:00:00&endtime=1899-01-31T23:59:59&minmag=4&maxmag=9&orderby=time-asc&limit=100&format=text'
with urllib.request.urlopen(text_url) as response:
    df = pd.read_csv(response, sep='|')
Now the data is in a structured format:
>>> df
#EventID ... EventLocationName
0 quakeml:eu.ahead/event/18990105_0245_000 ... Pignataro
1 quakeml:eu.ahead/event/18990118_2048_000 ... Vodice Brnik
2 quakeml:eu.ahead/event/18990122_0956_000 ... Kyparissia
3 quakeml:eu.ahead/event/18990131_1112_000 ... [N. Iceland]
4 quakeml:eu.ahead/event/18990131_2345_000 ... [Biornafjorden]
[5 rows x 13 columns]
>>> df['#EventID']
0 quakeml:eu.ahead/event/18990105_0245_000
1 quakeml:eu.ahead/event/18990118_2048_000
2 quakeml:eu.ahead/event/18990122_0956_000
3 quakeml:eu.ahead/event/18990131_1112_000
4 quakeml:eu.ahead/event/18990131_2345_000
Name: #EventID, dtype: object
>>> df.Latitude * df.Longitude
0 571.9945
1 669.6100
2 803.5200
3 -1319.3700
4 330.5500
dtype: float64