Calculating the first occurrence of a field in a CSV file

Time: 2015-04-29 10:15:46

Tags: python dictionary

Given a CSV file with the following format:

Pos   ID    Name
1   0001L01 50293
2   0002L01 128864
3   0003L01 172937
4   0004L01 12878
5   0005L01 demo
6   0004L01 12878
7   0004L01 12878
8   0005L01 demo

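I want to add [ID], {Pos, Name, FirstTime} to a dictionary, where FirstTime is the position (Pos) at which the ID first appears in the CSV file. For example, ID = 0005L01 would have: [0005L01], {5, demo, 5}, {8, demo, 5}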
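To make the target concrete, here is a sketch of the dictionary the sample rows above would produce. It assumes each entry is stored as a [Pos, Name, FirstTime] list and that all values stay as strings, as a CSV reader would yield them; the name `expected` is only illustrative.

```python
# Desired result for the sample data: ID -> list of [Pos, Name, FirstTime].
# Values are kept as strings (an assumption; a CSV reader yields strings).
expected = {
    "0001L01": [["1", "50293", "1"]],
    "0002L01": [["2", "128864", "2"]],
    "0003L01": [["3", "172937", "3"]],
    "0004L01": [["4", "12878", "4"], ["6", "12878", "4"], ["7", "12878", "4"]],
    "0005L01": [["5", "demo", "5"], ["8", "demo", "5"]],
}

# Every entry for a given ID shares the same FirstTime value.
for entries in expected.values():
    assert len({first for _, _, first in entries}) == 1
```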

I have managed to store [ID], {Pos, Name}, but I am struggling with FirstTime. So far I have:

# From the csv reader, save it to a list
dlist = []
for row in reader:
    # store only the non-empty lines
    if any(row):
        dlist.append(row)
d = {}
for row in dlist:
    d.setdefault(row[1], []).append([row[0], row[2]])

2 answers:

Answer 0 (score: 1)

It is easier if you compute firstTime first and then fill in the dictionary:

# From the csv reader, save it to a list
dlist=[]
for row in reader:
    # store only the non empty lines
    if any(row):
        dlist.append(row)
firstTime={}
for row in dlist:
    if row[1] not in firstTime:
        firstTime[row[1]] = row[0]
d={}
for row in dlist:
    d.setdefault(row[1],[]).append([row[0],row[2],firstTime[row[1]]])
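
For reference, a self-contained sketch of this two-pass approach run against the sample data. It assumes the file is tab-delimited with a header row; the in-memory `SAMPLE` text and the `build_index` function name are hypothetical stand-ins, not part of the original code.

```python
import csv
import io

# Hypothetical in-memory stand-in for the CSV file (assumed tab-delimited).
SAMPLE = (
    "Pos\tID\tName\n"
    "1\t0001L01\t50293\n"
    "2\t0002L01\t128864\n"
    "3\t0003L01\t172937\n"
    "4\t0004L01\t12878\n"
    "5\t0005L01\tdemo\n"
    "\n"
    "6\t0004L01\t12878\n"
    "7\t0004L01\t12878\n"
    "8\t0005L01\tdemo\n"
)


def build_index(rows):
    """Map each ID to a list of [Pos, Name, FirstTime] entries."""
    rows = [row for row in rows if any(row)]  # drop empty lines
    # Pass 1: remember the Pos at which each ID first appears.
    first_time = {}
    for pos, id_, _name in rows:
        first_time.setdefault(id_, pos)  # later rows do not overwrite
    # Pass 2: build the entries, attaching the recorded first Pos.
    index = {}
    for pos, id_, name in rows:
        index.setdefault(id_, []).append([pos, name, first_time[id_]])
    return index


reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t")
next(reader)  # skip the header row
d = build_index(reader)
print(d["0005L01"])  # [['5', 'demo', '5'], ['8', 'demo', '5']]
```

The values stay as strings; converting Pos with `int()` would be a reasonable change if numeric comparison is needed.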

Answer 1 (score: 1)

from collections import defaultdict

d = defaultdict(list)
first = {}

for row in reader:
    if any(row):
        pos, ID, name = row
        if ID not in first:
            first[ID] = pos
        # list.append takes a single argument, so wrap the values in a list
        d[ID].append([pos, name, first[ID]])
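
As a variation on the single-pass idea (not what the answer itself does), the separate `first` dictionary can be dropped: because rows arrive in file order, the first entry already stored for an ID holds its first position. A minimal sketch, assuming each row is a `[pos, ID, name]` sequence; `index_rows` and the inline sample rows are illustrative names.

```python
from collections import defaultdict


def index_rows(rows):
    """Single pass: ID -> list of (pos, name, first_pos) tuples."""
    d = defaultdict(list)
    for row in rows:
        if not any(row):
            continue  # skip empty lines
        pos, id_, name = row
        entries = d[id_]
        # The first stored entry for this ID carries its first position.
        first_pos = entries[0][0] if entries else pos
        entries.append((pos, name, first_pos))
    return d


# A few rows from the sample data, plus one empty line.
rows = [
    ["4", "0004L01", "12878"],
    ["5", "0005L01", "demo"],
    [],
    ["6", "0004L01", "12878"],
    ["8", "0005L01", "demo"],
]
result = index_rows(rows)
print(result["0005L01"])  # [('5', 'demo', '5'), ('8', 'demo', '5')]
```

This only works when the input is processed in file order, which is also what both answers rely on.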