I have a CSV that looks like this:
FILE.CSV
"File is","NameofFileA"
"randomdata","1" <-- size of file
"randomdata","32"
"randomdata","43"
<---->[this is a blank line found in the file]
"File is","NameofFileB"
"randomdata","4"
"randomdata","3"
"randomdata","1"
What I want to end up with is a list like this:
NameofFileA Total = 76
NameofFileB Total = 8
...
..
.
I worked out how to get a total of the second column (row[1]), but it is not broken down by NameofFileX:
with open(csvInput, "r") as inputFile, open(csvOutput, "w") as outputFile:
    data = csv.reader(inputFile, delimiter=',', quotechar='"')
    total = 0
    headerline = next(inputFile)
    for row in data:
        print(', '.join(row))
        total += int(row[1])
    print(total)
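Question: how do I say, in a Pythonic way, "add up all the items for NameofFileA"?
The file name is always preceded by "File is", and that "header" row always comes right after a blank line.
I don't know how to detect a blank line, store row[1] of the row that follows it as the file name, then sum row[1] of every row after that, stopping at the next blank line.
Thanks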
Answer 0: (score: 2)
To get the numbers for each file, you can use itertools.takewhile to return rows from the CSV until a blank line is found. Then just sum the numbers and read the next file name from the CSV:
import csv
from itertools import takewhile

res = []
with open('file.csv') as in_f:
    reader = csv.reader(in_f, delimiter=',', quotechar='"')
    # Read next name from CSV
    for _, name in reader:
        # Read rows and return numbers until blank line is found
        total = sum(int(row[1]) for row in takewhile(bool, reader))
        res.append((name, total))
print(res)
Output:
[('NameofFileA', 76), ('NameofFileB', 8)]
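In the code above, bool is used as the predicate that tells takewhile whether a row should be returned. The CSV reader returns a blank line as [], and an empty list is False in a boolean context, so takewhile stops there.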
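To see that stopping behaviour in isolation, here is a small sketch. The rows list is a made-up stand-in for what csv.reader would yield, with the empty list playing the part of the blank line; the names rows, first_block and next_row are only for illustration.

```python
from itertools import takewhile

# Hypothetical rows, shaped like csv.reader output; [] stands for a blank line.
rows = iter([
    ["randomdata", "1"],
    ["randomdata", "32"],
    [],
    ["File is", "NameofFileB"],
])

# bool([]) is False, so takewhile stops at the blank row (and consumes it).
first_block = list(takewhile(bool, rows))

# The shared iterator resumes at the row after the blank line.
next_row = next(rows)
```

Because takewhile consumes the blank row itself, the outer loop picks up directly at the next "File is" row.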
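Then, for every row that takewhile returns, the generator expression takes the value from the second column and converts it to int. Finally, those numbers are summed to get the total.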
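If the file might contain several blank lines in a row, or a blank line at the very end, the two-value unpacking in the takewhile loop would fail on the empty row. A plain loop is one possible alternative. This is only a sketch, not the approach above: it assumes the first non-blank row is a "File is" row, and SAMPLE and totals_per_file are names made up here, with the sample data copied from the question.

```python
import csv
import io

# Hypothetical in-memory stand-in for FILE.CSV, using the question's sample data.
SAMPLE = '''"File is","NameofFileA"
"randomdata","1"
"randomdata","32"
"randomdata","43"

"File is","NameofFileB"
"randomdata","4"
"randomdata","3"
"randomdata","1"
'''


def totals_per_file(lines):
    """Return [(name, total), ...], one entry per "File is" section."""
    totals = []
    for row in csv.reader(lines):
        if not row:                  # blank line: nothing to add
            continue
        if row[0] == "File is":      # header row starts a new section
            totals.append([row[1], 0])
        else:                        # data row: add its size to the current section
            totals[-1][1] += int(row[1])
    return [tuple(entry) for entry in totals]


result = totals_per_file(io.StringIO(SAMPLE))
print(result)
```

With a real file you would pass the open file object instead of io.StringIO(SAMPLE).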