I have a CSV file like this:
col1,col2,col3
t,t,t
f,f,f
t,f,t
The file is very large (50 MB) and contains many columns.
I need to count the number of t values in each column.
I tried this:
import csv
import collections

col1 = collections.Counter()
with open('file.csv') as input_file:
    for row in csv.reader(input_file, delimiter=','):
        # Tally every distinct value seen in the first column
        col1[row[0]] += 1
print('Number of t in col1: %s' % col1['t'])
But this only counts the first column (col1). How can I count multiple columns?
Answer 0: (score: 1)
import csv

totals = {}  # maps column index -> number of 't' cells seen so far
with open('file.csv') as input_file:
    for row in csv.reader(input_file, delimiter=','):
        for column, cell in enumerate(row):
            # Make sure every column has an entry, even if it never contains 't'
            if column not in totals:
                totals[column] = 0
            if cell == 't':
                totals[column] += 1

for column in totals:
    print('column %d has %d trues' % (column, totals[column]))
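The same idea can report totals by column name instead of by index if the header row is read first. The following is only a sketch under assumptions: `count_value_per_column` is a hypothetical helper name, and the in-memory `sample` stands in for the real file so the example is self-contained.

```python
import csv
import io


def count_value_per_column(lines, value='t'):
    """Count the cells equal to `value` in each column, keyed by header name."""
    reader = csv.reader(lines, delimiter=',')
    header = next(reader)                # first row holds the column names
    totals = dict.fromkeys(header, 0)    # every column starts at zero
    for row in reader:
        for name, cell in zip(header, row):
            if cell == value:
                totals[name] += 1
    return totals


# Hypothetical sample mirroring the data in the question
sample = io.StringIO("col1,col2,col3\nt,t,t\nf,f,f\nt,f,t\n")
totals = count_value_per_column(sample)
for name in sorted(totals):
    print('%s has %d t values' % (name, totals[name]))
```

For the real file, the open file object could be passed in place of `sample`, for example `with open('file.csv') as f: totals = count_value_per_column(f)`. Rows are processed one at a time, so the whole file is never held in memory.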
Answer 1: (score: 0)
This counts the number of t values in each column. I am assuming they are all lowercase, but if that is not the case you can easily change it.
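t_count = []  # t_count[i] holds the number of 't' cells in column i
with open('file.csv') as f:
    for line in f:
        for col_num, col in enumerate(line.rstrip().split(',')):
            # Grow the list the first time a new column index appears
            if len(t_count) < col_num + 1:
                t_count.append(0)
            if col == "t":
                t_count[col_num] += 1
print(t_count)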
[2, 1, 2]
This gives the number of t values for each column, so index 0 is col1, and so on.
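If the values are not all lowercase, one possible adjustment is to normalize each cell before comparing it. This is a minimal sketch, assuming that both `T` and `t` should count and that whitespace around a cell can be ignored; `count_t_per_column` is a hypothetical helper name, and the `rows` list stands in for the lines of the real file.

```python
def count_t_per_column(lines):
    """Return a list where index i holds the number of 't'/'T' cells in column i."""
    t_count = []
    for line in lines:
        for col_num, col in enumerate(line.rstrip('\r\n').split(',')):
            # Grow the list the first time a new column index appears
            if len(t_count) < col_num + 1:
                t_count.append(0)
            # Assumption: ignore case and surrounding whitespace when comparing
            if col.strip().lower() == 't':
                t_count[col_num] += 1
    return t_count


# Hypothetical rows with mixed case and a stray space
rows = ["col1,col2,col3", "T,t,t", "f,f,f", "t,F, T"]
print(count_t_per_column(rows))
```

An open file object can be passed directly as well, since iterating over it yields one line at a time.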