Getting and storing adjacent values from a .csv file (Python)

Time: 2013-08-14 11:02:44

Tags: python csv

This is probably easier to explain with the .csv in question:

https://www.dropbox.com/s/iswvm4xyjnlhj2w/speciesandbss.csv

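The file above is a list of bivalve species from the end of the Cretaceous, each with the bed shear stress value at the location where it was collected.

I am trying to create an occurrence plot, so I need to format my data with the species name in one column and the corresponding lowest and highest bed shear stress values next to it (in the dataset, the same species occurs multiple times).

Obviously, doing this by hand would be very tedious.

How can I write a loop that appends each occurrence to a separate list, named after the species that the bed shear stress value belongs to? I could then iterate over each list to find the highest and lowest values.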

Input:

eggs 0.1
ham 0.2
ham 0.5
eggs 0.7
eggs 0.3

Output:

eggs = [0.1, 0.7, 0.3]
ham = [0.2, 0.5]

1 answer:

Answer 0 (score: 0)

Collect the values into a dictionary of lists; a collections.defaultdict() object makes this easiest:

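from collections import defaultdict
import csv

# Map each species name to the list of its bed shear stress values.
species = defaultdict(list)

# newline='' is how the csv module expects text-mode files to be opened (Python 3).
with open('speciesandbss.csv', newline='') as inputfile:
    for row in csv.reader(inputfile):
        species[row[0]].append(row[1])  # values are kept as strings here

# Print the names in case-insensitive alphabetical order.
for name in sorted(species, key=str.lower):
    print('{} = {}'.format(name, species[name]))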

Output:

acutata = ['0.16509', '0.16509', '0.16509']
acutocostata = ['0.03145', '0.01936', '0.01781', '0.01698', '0.01684', '0.01077']
adkinsi = ['0.16509']
Aenona = ['0.01311', '0.01311']
aequilateralis = ['0.00495', '0.00445', '0.00368', '0.00356']
agdjakendensis = ['0.00628']
Agerostrea = ['0.01764']
albertensis = ['0.00852', '0.00356', '0.00495', '0.00461', '0.00445', '0.0041']
alta = ['0.00328', '0.33148', '0.33148', '0.43129', '0.33148', '0.325', '0.17882', '0.00307']
alternata = ['0.04929', '0.03373', '0.01311']
americana = ['0.01497', '0.00436', '0.01497', '0.00495', '0.00461', '0.00445', '0.00105']
anacachoensis = ['0.05696', '0.05696', '0.05172', '0.03373']
angulatum = ['0.01179']
anomala = ['0.00852']
Anomia = ['0.00852', '0.00506', '0.02955', '0.00786']
anteradiata = ['0.43129', '0.16509']
antroea = ['0.01373']
antrosa = ['0.01103']
Aphrodina = ['0.43129', '0.01311']
apressus = ['0.01564']
Arca = ['0.01179', '0.01311', '0.01311', '0.01311', '0.01224', '0.01224']
archeri = ['0.16509', '0.16509', '0.16509', '0.16509', '0.16509', '0.16509', '0.16509']
Arctica = ['0.00203']
argentaria = ['0.01233', '0.33148', '0.33148', '0.33148', '0.33148', '0.43129', '0.43129', '0.21502', '0.01311', '0.01224', '0.00352', '0.01311', '0.01311', '0.01179', '0.01373', '0.01311', '0.01311', '0.01224', '0.01224', '0.01224', '0.01224', '0.16509', '0.01564']
armatum = ['0.33148', '0.33148', '0.33148', '0.33148']
Ascaulocardium = ['0.43129', '0.21502']
assiniboiensis = ['0.00401', '0.00436', '0.00436', '0.00685', '0.00495', '0.00495', '0.00486', '0.00461', '0.00453', '0.00445']
assiniboinensis = ['0.00117']
Astarte = ['0.01497', '0.01311']
balchii = ['0.00786']
balticus = ['0.05696', '0.05696', '0.05696', '0.05238', '0.03623', '0.03373', '0.00724', '0.04574']
barabini = ['0.01233', '0.00852', '0.00506']
Barbatia = ['0.05696', '0.03373', '0.18121', '0.17882', '0.01224']
bartoni = ['0.16509']
bartrami = ['0.325', '0.26095', '0.25697', '0.17882', '0.01311']
bella = ['0.01311', '0.01311', '0.01311', '0.01311', '0.01311', '0.01311', '0.01311', '0.01311', '0.01764']
bellisculptus = ['0.05696', '0.05696', '0.03373', '0.25697', '0.01311', '0.01224', '0.01311', '0.01224', '0.01179', '0.01311', '0.01311', '0.01311', '0.01311', '0.01311']
berryi = ['0.43129', '0.17882']
biplicata = ['0.05696', '0.03373', '0.33148', '0.33148', '0.43129', '0.01224', '0.16509', '0.16509', '0.16509', '0.16509', '0.16509', '0.16509']
bisulcata = ['0.01311', '0.01224', '0.01224']
borealis = ['0.01233', '0.00852', '0.00452', '0.00401', '0.00852', '0.00452', '0.00401', '0.00436', '0.01497', '0.02971']
bowiei = ['0.16509', '0.16509']
Breviarca = ['0.01311', '0.01311']
Brevicardium = ['0.16509']
brevifrons = ['0.43129']
bryani = ['0.16509']
bulbosa = ['0.16509', '0.16509', '0.16509']
burlingtonensis = ['0.05696', '0.05696', '0.04929', '0.03373', '0.325', '0.01311', '0.01179', '0.04574']
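
To check the grouping on the small sample from the question, here is a minimal Python 3 sketch. It assumes the sample is space-delimited exactly as shown in the question (if the real file uses commas, the delimiter argument can be dropped), and it converts each value to a float while reading. The name group_values is only illustrative and is not part of the answer's code.

```python
import csv
import io
from collections import defaultdict

# Sample rows from the question; treated as space-delimited by assumption.
SAMPLE = "eggs 0.1\nham 0.2\nham 0.5\neggs 0.7\neggs 0.3\n"

def group_values(lines, delimiter=' '):
    """Group the second column by the name found in the first column."""
    groups = defaultdict(list)
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        groups[row[0]].append(float(row[1]))
    return groups

groups = group_values(io.StringIO(SAMPLE))
for name in sorted(groups, key=str.lower):
    print('{} = {}'.format(name, groups[name]))
```

With the sample above this prints eggs = [0.1, 0.7, 0.3] and ham = [0.2, 0.5], which matches the output asked for in the question.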

To write out the lowest and highest values:

with open('outputfile.csv', 'w', newline='') as outputfile:
    writer = csv.writer(outputfile)
    for name, values in species.items():
        numbers = [float(v) for v in values]  # compare as numbers, not as strings
        writer.writerow([name, min(numbers), max(numbers)])
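
If only the extremes are needed, the intermediate lists can be skipped. The following is an alternative sketch, not the approach used above: it tracks a (lowest, highest) pair per name in a single pass. It again assumes the space-delimited sample from the question and writes to an in-memory buffer rather than a real file; min_max_by_name is a hypothetical helper name.

```python
import csv
import io

# Sample rows from the question; treated as space-delimited by assumption.
SAMPLE = "eggs 0.1\nham 0.2\nham 0.5\neggs 0.7\neggs 0.3\n"

def min_max_by_name(lines, delimiter=' '):
    """Return {name: (lowest, highest)} in a single pass over the rows."""
    extremes = {}
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        name, value = row[0], float(row[1])
        if name in extremes:
            low, high = extremes[name]
            extremes[name] = (min(low, value), max(high, value))
        else:
            extremes[name] = (value, value)
    return extremes

extremes = min_max_by_name(io.StringIO(SAMPLE))

# Write name, lowest, highest to an in-memory buffer instead of a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
for name in sorted(extremes, key=str.lower):
    low, high = extremes[name]
    writer.writerow([name, low, high])
print(buffer.getvalue())
```

This keeps memory use proportional to the number of distinct names rather than the number of rows, at the cost of no longer having the full list of values per species.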