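Python: match a list and return the values found

Time: 2013-02-08 13:43:03

Tags: python list csv string-matching

I am trying to count the most common values in a CSV file and append the number of occurrences next to each item in the CSV file. For example:

CSV file: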

  * 8 Values in Column 1*
  HelloWorld
  HelloWorld
  HelloSaturn
  HelloMars
  HelloPluto
  HelloSaturn
  HelloMoon
  HelloMoon

Python code that counts the most common values:

  import collections

  #Removed Code - Take each row in CSV and append to list#
  #Create new list, count common occurrences out of 8 items
  newList = []
  counter = collections.Counter(newList)
  d = counter.most_common(8)
  print d

Printed output (the most common values in the CSV above have been counted; for example, there are two 'HelloWorld' entries):

  [('HelloWorld', 2), ('HelloMars', 1), ('HelloSaturn', 2), ('HelloPluto', 1), ('HelloMoon', 2)]

I am now trying to append/insert these counts into the CSV file next to each value, for example:

  * 8 Values in Column 1* *Occurrence*
  HelloWorld 2
  HelloWorld 2
  HelloSaturn 2
  HelloMars 1
  HelloPluto 1
  HelloSaturn 2
  HelloMoon 2
  HelloMoon 2

How can I do this?

2 answers:

Answer 0: (score: 2)

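You need to rewrite the CSV file using a csv.writer object:

  1. Read the CSV file into memory (as a list of rows, for example) using csv.reader.
  2. Calculate the frequency of occurrence using your existing code.
  3. Iterate over each row you read in step 1. Use csv.writer to output each column in the row. At the end of the row, output the corresponding frequency you calculated in step 2.

The code would look something like this (completely untested):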

    import csv

    list_of_rows = []
    with open(filename) as fin:
        reader = csv.reader(fin)
        for row in reader:
            list_of_rows.append(row)

    # calculate frequency of occurrence (your existing collections.Counter code)
    counter = ...

    with open(filename, "w") as fout:
        writer = csv.writer(fout)
        for row in list_of_rows:
            # append the frequency of the first-column value to the original row
            writer.writerow(row + [counter[row[0]]])
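
The three steps in this answer can be sketched end to end as follows. This is a minimal Python 3 sketch, not the answerer's code: the function name `append_occurrences`, its `src_path`/`dst_path`/`column` parameters, and the choice to write to a separate output file are all assumptions made for illustration. It also assumes the file has no header row.

```python
import csv
from collections import Counter


def append_occurrences(src_path, dst_path, column=0):
    """Copy src_path to dst_path, appending each row's value count.

    Hypothetical helper for illustration; assumes no header row.
    """
    # Step 1: read every row into memory (newline="" is what the csv
    # module documentation recommends when opening files in Python 3).
    with open(src_path, newline="") as fin:
        rows = [row for row in csv.reader(fin) if row]  # skip blank lines

    # Step 2: count how often each value appears in the chosen column.
    counts = Counter(row[column] for row in rows)

    # Step 3: write each original row followed by the count of its value.
    with open(dst_path, "w", newline="") as fout:
        writer = csv.writer(fout)
        for row in rows:
            writer.writerow(row + [counts[row[column]]])
    return counts
```

Looking up `counts[row[column]]` for every row, rather than iterating over `most_common()`, is what keeps one output line per input line. Writing to a different path than the input is a cautious choice so the source file is not truncated if something fails midway.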
    

Answer 1: (score: 1)

import csv

# I fake here the opening and extracting from a CSV file
# to obtain a list of the words of the first column
ss = """HelloWorld
HelloWorld
HelloSaturn
HelloMars
HelloPluto
HelloSaturn
HelloMoon
HelloMoon"""
column = ss.splitlines()


# Now, the counting
from collections import Counter
c = Counter(column) 

# Seeing the counter we got
print '\n'.join(c)

# Putting the result in a CSV file
with open('resu.csv','wb') as g:
    gw = csv.writer(g)
    gw.writerows([item,c[item]] for item in column)
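
The snippet above is Python 2 (`print` statement, `'wb'` file mode). A rough Python 3 equivalent might look like the sketch below. Writing into an in-memory `io.StringIO` buffer instead of `resu.csv` is an assumption made here so the example has no side effects; for a real file, `open('resu.csv', 'w', newline='')` would take its place.

```python
import csv
import io
from collections import Counter

# Same stand-in for the first CSV column as in the answer above
column = """HelloWorld
HelloWorld
HelloSaturn
HelloMars
HelloPluto
HelloSaturn
HelloMoon
HelloMoon""".splitlines()

# Count how many times each word occurs
c = Counter(column)

# print is a function in Python 3; show each word with its count
for word, n in c.items():
    print(word, n)

# csv.writer emits text in Python 3, so the target must be text-mode
# (binary mode 'wb' would raise a TypeError on write).
buf = io.StringIO()
csv.writer(buf).writerows([item, c[item]] for item in column)
result = buf.getvalue()
```

`result` then holds one `word,count` line per input word, in the original order.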