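I'm trying to write a Python program that converts a simple text file into a .csv file.
Each relevant line of input contains from: followed by a name. Lines that do not start with from: are ignored.
Input: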
from: Lance Cummins
This line is ignored by the program
from: Jackie Cohen
Hello world
from: Chris Paul
Lalala
from: Jackie Cohen
Message
The program's output should be a CSV file listing each person's name followed by the number of times it appears in the input file:
Lance Cummins,1
Chris Paul,1
Jackie Cohen,2
However, the actual output of the program is:
{"Chris Paul": 1, "Lance Cummins": 1, "Jackie Cohen": 2}
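What confuses me is that I had someone else run my program on their computer, and the result was correct. Why would that happen?
Here is my actual program: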
def is_field(field_name, s):
    if s[:len(field_name)] == field_name:
        return True
    else:
        return False

def contributor_counts(file_name):
    fname = open(file_name, "r" )
    counts = {}
    for x in fname:
        if is_field("from: ", x):
            x = x.strip("from: ")
            x = x.rstrip()
            if x in counts:
                counts[x] = counts[x] + 1
            else:
                counts[x] = 1
    return counts

def print_contributors(counts):
    for x in counts:
        if counts[x] > 1:
            print str(x) + " posted " + str(counts[x]) + " times"
        else:
            print str(x) + " posted once"

def save_contributors(counts, output_file_name):
    f = open(output_file_name, "w")
    for value in counts:
        number = counts[value]
        y = str(value) + "," + str(number)
        f.write(y + "\n")
    f.close()

contributions = contributor_counts("long182feed.txt")
print_contributors(contributions)
save_contributors(contributions, 'contributors.csv')
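Answer 0 (score: 0)
Is the real problem the order of the rows in the generated CSV? Note that the items in a Python dictionary can come back in arbitrary order. collections.OrderedDict, available since Python 2.7, preserves insertion order.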
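A minimal sketch of that idea, assuming the goal is to keep names in the order they first appear in the input. The helper name count_in_order and the in-memory sample list are illustrative and not part of the question's program:

```python
from collections import OrderedDict


def count_in_order(lines):
    """Count 'from: ' names, keeping first-appearance order (hypothetical helper)."""
    counts = OrderedDict()
    prefix = "from: "
    for line in lines:
        if line.startswith(prefix):
            # Slice off the literal prefix, then drop surrounding whitespace.
            name = line[len(prefix):].strip()
            counts[name] = counts.get(name, 0) + 1
    return counts


# Sample lines taken from the question's input.
sample = [
    "from: Lance Cummins\n",
    "This line is ignored by the program\n",
    "from: Jackie Cohen\n",
    "from: Chris Paul\n",
    "from: Jackie Cohen\n",
]

for name, number in count_in_order(sample).items():
    print("%s,%d" % (name, number))
```

Rows then come out in first-appearance order on every machine. On Python 3.7 and later a plain dict also preserves insertion order, so OrderedDict mainly matters on older interpreters.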
Answer 1 (score: 0)
You can take advantage of the Python standard library, namely the csv module and collections.Counter:
#!/usr/bin/env python
import csv
import fileinput
from collections import Counter

c = Counter()
for line in fileinput.input():  # read stdin or file(s) provided at a command-line
    if line.startswith('from:'):  # ignore lines that do not start with 'from:'
        name = line.partition(':')[2].strip()  # extract name from the line
        c[name] += 1  # count number of occurrences

# write csv file
# 'wb' is the Python 2 convention; on Python 3 use open(..., 'w', newline='')
with open('contributors.csv', 'wb') as f:
    csv.writer(f, delimiter=',').writerows(c.most_common())
$ python write-csv.py input.txt
Jackie Cohen,2
Lance Cummins,1
Chris Paul,1
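One reason to prefer partition over str.strip("from: ") for extracting the name: strip treats its argument as a set of characters to remove from both ends, not as a literal prefix. A small sketch, using a made-up name chosen to trigger the difference:

```python
# Hypothetical input line; the name starts with characters found in "from: ".
line = "from: mo Farah\n"

# str.strip(chars) removes any of the given characters from both ends,
# so it also eats the leading "mo " of the name.
stripped = line.strip("from: ").rstrip()

# partition splits once at the first ':' and leaves the name intact.
partitioned = line.partition(":")[2].strip()

print(repr(stripped))
print(repr(partitioned))
```

With the names in the sample input the two approaches agree, because none of those names begins or ends with one of the stripped characters.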
As an alternative, you could use a regular expression to parse the input:
#!/usr/bin/env python
import csv
import fileinput
import re
import sys
from collections import Counter
text = ''.join(fileinput.input()) # read input
c = Counter(re.findall(r'^from:\s*(.*?)\s*$', text, re.MULTILINE)) # count names
csv.writer(sys.stdout, delimiter=',').writerows(c.most_common()) # write csv
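For readers on Python 3, here is a rough equivalent of the first script. It is only a sketch: the function names count_names and save_counts are made up for illustration, and the sample is an in-memory list written to a temporary directory rather than the real input file:

```python
import csv
import os
import tempfile
from collections import Counter


def count_names(lines):
    """Count names on lines that start with 'from:' (hypothetical helper)."""
    c = Counter()
    for line in lines:
        if line.startswith("from:"):
            c[line.partition(":")[2].strip()] += 1
    return c


def save_counts(counts, path):
    """Write name,count rows, most frequent first (hypothetical helper)."""
    # Python 3: open in text mode with newline='' so the csv module
    # controls the line endings itself.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(counts.most_common())


# Sample lines taken from the question's input.
sample = [
    "from: Lance Cummins\n",
    "This line is ignored by the program\n",
    "from: Jackie Cohen\n",
    "Hello world\n",
    "from: Chris Paul\n",
    "from: Jackie Cohen\n",
]

path = os.path.join(tempfile.mkdtemp(), "contributors.csv")
save_counts(count_names(sample), path)

# Read the file back to check what was written.
with open(path, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

Going through csv.writer rather than joining strings by hand also means a name containing a comma is quoted correctly.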