Convert a list of lists into a dictionary in Python

Date: 2019-11-26 09:06:53

Tags: python list dictionary

I am facing the following problem. I have a list of lists that I fetch from a remote URL with the following code:

import csv
import urllib.request

text_url = 'https://www.emidius.eu/fdsnws/event/1/query?starttime=1899-01-01T00:00:00&endtime=1899-01-31T23:59:59&minmag=4&maxmag=9&orderby=time-asc&limit=100&format=text'

with urllib.request.urlopen(text_url) as response:
    my_text = response.read().decode()

lines = my_text.splitlines()
reader = csv.reader(lines, delimiter='|')

I can convert the reader into a list of lists with:

my_list = list(reader)

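What I would like to do is convert the list of lists (or the reader itself) into a dictionary of lists. The items of the first list should become the dictionary keys, and for each key the value should be a list collecting the corresponding items from the second list through the last: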

my_list[0] # dict keys
['#EventID',
 'Time',
 'Latitude',
 'Longitude',
 'Depth/km',
 'Author',
 'Catalog',
 'Contributor',
 'ContributorID',
 'MagType',
 'Magnitude',
 'MagAuthor',
 'EventLocationName']

my_list[1:] # dict values as list
[['quakeml:eu.ahead/event/18990105_0245_000',
  '1899-01-05T02:45:--',
  '41.500',
  '13.783',
  '',
  'AHEAD',
  'SHEEC',
  'CPTI04',
  '1309',
  'Mw',
  '4.63',
  'SHEEC',
  'Pignataro'],
 ['quakeml:eu.ahead/event/18990118_2048_000',
  '1899-01-18T20:48:--',
  '46.180',
  '14.500',
  '4.8',
  'AHEAD',
  'SHEEC',
  'RIBA982',
  '',
  'Mw',
  '4.51',
  'SHEEC',
  'Vodice Brnik'],
 ['quakeml:eu.ahead/event/18990122_0956_000',
  '1899-01-22T09:56:--',
  '37.200',
  '21.600',
  '',
  'AHEAD',
  'SHEEC',
  'PAPA003',
  '',
  'Mw',
  '6.50',
  'SHEEC',
  'Kyparissia'],
 ['quakeml:eu.ahead/event/18990131_1112_000',
  '1899-01-31T11:12:--',
  '66.300',
  '-19.900',
  '',
  'AHEAD',
  'SHEEC',
  'AMBSI000',
  '',
  'Mw',
  '5.80',
  'SHEEC',
  '[N. Iceland]'],
 ['quakeml:eu.ahead/event/18990131_2345_000',
  '1899-01-31T23:45:--',
  '60.100',
  '5.500',
  '30',
  'AHEAD',
  'SHEEC',
  'FEN007',
  '',
  'Mw',
  '4.60',
  'SHEEC',
  '[Biornafjorden]']]

Basically, the output should look similar to:

d['#EventID'] = ['quakeml:eu.ahead/event/18990105_0245_000', 'quakeml:eu.ahead/event/18990105_0245_000', 'quakeml:eu.ahead/event/18990105_0245_000']

5 answers:

Answer 0 (score: 2)

Try this:

>>> a, b = my_list[0], my_list[1:]   # a: header row (keys), b: data rows
>>> result_dict = {}
>>> for idx, key in enumerate(a):
...     for val in b:
...         result_dict.setdefault(key, []).append(val[idx])
...

Output:

>>> result_dict
{'#EventID': ['quakeml:eu.ahead/event/18990105_0245_000', 'quakeml:eu.ahead/event/18990118_2048_000', 'quakeml:eu.ahead/event/18990122_0956_000', 'quakeml:eu.ahead/event/18990131_1112_000', 'quakeml:eu.ahead/event/18990131_2345_000'], 'Time': ['1899-01-05T02:45:--', '1899-01-18T20:48:--', '1899-01-22T09:56:--', '1899-01-31T11:12:--', '1899-01-31T23:45:--'], 'Latitude': ['41.500', '46.180', '37.200', '66.300', '60.100'], 'Longitude': ['13.783', '14.500', '21.600', '-19.900', '5.500'], 'Depth/km': ['', '4.8', '', '', '30'], 'Author': ['AHEAD', 'AHEAD', 'AHEAD', 'AHEAD', 'AHEAD'], 'Catalog': ['SHEEC', 'SHEEC', 'SHEEC', 'SHEEC', 'SHEEC'], 'Contributor': ['CPTI04', 'RIBA982', 'PAPA003', 'AMBSI000', 'FEN007'], 'ContributorID': ['1309', '', '', '', ''], 'MagType': ['Mw', 'Mw', 'Mw', 'Mw', 'Mw'], 'Magnitude': ['4.63', '4.51', '6.50', '5.80', '4.60'], 'MagAuthor': ['SHEEC', 'SHEEC', 'SHEEC', 'SHEEC', 'SHEEC'], 'EventLocationName': ['Pignataro', 'Vodice Brnik', 'Kyparissia', '[N. Iceland]', '[Biornafjorden]']}
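The same idea can also be written row by row, pairing each header name with the value in the same position. This is only a minimal sketch: the three-column sample and the `event_1`/`event_2` IDs are made up to stand in for `my_list` from the question.

```python
# Shortened, made-up sample in the same shape as my_list from the question.
my_list = [
    ['#EventID', 'Latitude', 'Depth/km'],
    ['event_1', '41.500', ''],
    ['event_2', '46.180', '4.8'],
]

header, rows = my_list[0], my_list[1:]

result_dict = {}
for row in rows:
    # Pair each header name with the value at the same position in the row.
    for key, value in zip(header, row):
        result_dict.setdefault(key, []).append(value)

print(result_dict)
```

Iterating over the rows once avoids indexing by position, at the cost of silently ignoring extra fields if a row is longer than the header.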

Answer 1 (score: 2)

Use csv.DictReader and dict.setdefault.

For example:

import csv

d = {}
reader = csv.DictReader(lines, delimiter='|')
for row in reader:                              #Iterate Each row
    for k, v in row.items():                    #Iterate Key-Value
        d.setdefault(k, []).append(v)
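To try this without a network request, the snippet can be run against an inline sample. The sketch below is a variant rather than the answer's exact code: it swaps `dict.setdefault` for `collections.defaultdict`, and `sample_text` with its `event_1`/`event_2` IDs is made-up data in the same pipe-delimited shape.

```python
import csv
from collections import defaultdict

# Made-up, shortened sample standing in for the text fetched from the URL.
sample_text = (
    "#EventID|Latitude|Depth/km\n"
    "event_1|41.500|\n"
    "event_2|46.180|4.8\n"
)
lines = sample_text.splitlines()

# defaultdict(list) creates the empty list the first time a key is seen.
d = defaultdict(list)
for row in csv.DictReader(lines, delimiter='|'):
    for key, value in row.items():
        d[key].append(value)

print(dict(d))
```

Empty fields come through as empty strings, just as in the list-of-lists version.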

Answer 2 (score: 1)

A naive option would be something like this:

l = [["a","b","c"],[1,2,3],[4,5,6],[7,8,9]]
d = {k:[] for k in l[0]}
for i in l[1:]:
    dummy = {k:v for k,v in zip(l[0],i)}
    for k in d.keys():
        d[k].append(dummy[k])

Answer 3 (score: 1)

zip() can rotate the list of lists by 90 degrees (the resulting values are tuples):

d = {key:val for key, val in zip(my_list[0], zip(*my_list[1:]))}
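If the values need to be lists, as the question asks, rather than the tuples that zip() yields, each column can be wrapped in list(). A minimal sketch on a made-up, shortened sample standing in for `my_list`:

```python
# Shortened, made-up sample in the same shape as my_list from the question.
my_list = [
    ['#EventID', 'Latitude', 'Depth/km'],
    ['event_1', '41.500', ''],
    ['event_2', '46.180', '4.8'],
]

header, *rows = my_list

# zip(*rows) transposes rows into columns; list() turns each tuple into a list.
d = {key: list(column) for key, column in zip(header, zip(*rows))}

print(d['Latitude'])
```

Note that zip() stops at the shortest input, so a row with missing fields would truncate every column it reaches.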

Answer 4 (score: 1)

Another way to solve the problem, without using a dictionary, is to load the CSV data into a Pandas DataFrame:

import pandas as pd
import urllib.request

text_url = 'https://www.emidius.eu/fdsnws/event/1/query?starttime=1899-01-01T00:00:00&endtime=1899-01-31T23:59:59&minmag=4&maxmag=9&orderby=time-asc&limit=100&format=text'

with urllib.request.urlopen(text_url) as response:
    df = pd.read_csv(response, sep='|')

The data is now in a structured format:

>>> df
                                   #EventID  ... EventLocationName
0  quakeml:eu.ahead/event/18990105_0245_000  ...         Pignataro
1  quakeml:eu.ahead/event/18990118_2048_000  ...      Vodice Brnik
2  quakeml:eu.ahead/event/18990122_0956_000  ...        Kyparissia
3  quakeml:eu.ahead/event/18990131_1112_000  ...      [N. Iceland]
4  quakeml:eu.ahead/event/18990131_2345_000  ...   [Biornafjorden]

[5 rows x 13 columns]
>>> df['#EventID']
0    quakeml:eu.ahead/event/18990105_0245_000
1    quakeml:eu.ahead/event/18990118_2048_000
2    quakeml:eu.ahead/event/18990122_0956_000
3    quakeml:eu.ahead/event/18990131_1112_000
4    quakeml:eu.ahead/event/18990131_2345_000
Name: #EventID, dtype: object
>>> df.Latitude * df.Longitude
0     571.9945
1     669.6100
2     803.5200
3   -1319.3700
4     330.5500
dtype: float64
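If a plain dictionary of lists is still wanted after loading the data with pandas, `DataFrame.to_dict(orient='list')` produces one. This is a minimal sketch, assuming pandas is installed; `sample_text` and its `event_1`/`event_2` IDs are made-up data used in place of the remote URL.

```python
import io

import pandas as pd

# Made-up, shortened sample standing in for the response from the URL.
sample_text = (
    "#EventID|Latitude|Longitude\n"
    "event_1|41.500|13.783\n"
    "event_2|46.180|14.500\n"
)

df = pd.read_csv(io.StringIO(sample_text), sep='|')

# orient='list' maps each column name to a list of that column's values.
d = df.to_dict(orient='list')

print(d['#EventID'])
```

Unlike the csv-based approaches, pandas infers column types, so numeric columns become floats and empty fields become NaN instead of empty strings.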