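I have a comma-separated CSV file. I need to read the file, find the rows whose colour field has a specific value (for example, blue), and calculate the percentage of rows that match.
Here is my code so far: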
myfile = open('3517315a.csv','r')
myfilecount = 0
linecount = 0
firstline = True
for line in myfile:
    if firstline:
        firstline = False
        continue
    fields = line.split(',')
    linecount += 1
    count = int(fields[0])
    colour = str(fields[1])
    channels = int(fields[2])
    code = str(fields[3])
    correct = str(fields[4])
    reading = float(fields[5])
I don't know how to set up the condition and calculate the percentage.
Answer 0 (score: 1)
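Try this :) It is easier to configure than the other answers, and because it uses the csv module it should work with all kinds of CSV files. Tested with Python 3.6.1.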
import csv
import io  # needed because our file is not really a file

CSVFILE = """name,occupation,birthyear
John,Salesman,1992
James,Intern,1997
Abe,Salesman,1983
Michael,Salesman,1994"""
f = io.StringIO(CSVFILE)  # needed because our file is not really a file

# This is the name of the column we want to know about
our_row = 'occupation'
# If we want to limit the output to one value, put it here.
our_value = None  # For example, try 'Intern'
# This will hold the total number of rows
row_total = 0
# Maps each distinct value in the column to the number of rows containing it
totals = dict()

for row in csv.DictReader(f):
    v = row[our_row]
    # If we've already come across a row with this value before, add 1 to it
    if v in totals:
        totals[v] += 1
    else:  # Set this value's total to 1
        totals[v] = 1
    row_total += 1

for k, v in totals.items():
    # When our_value is set, skip every other value
    if our_value:
        if k != our_value: continue
    print("{}: {:.2f}%".format(k, v/row_total*100))
Output:
Salesman: 75.00%
Intern: 25.00%
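The same tally can be written more compactly with collections.Counter from the standard library. This is only a sketch of an alternative, not part of the tested code above; the helper name column_percentages is made up for illustration, and the sample data is the same as in the answer.

```python
import csv
import io
from collections import Counter

CSVFILE = """name,occupation,birthyear
John,Salesman,1992
James,Intern,1997
Abe,Salesman,1983
Michael,Salesman,1994"""


def column_percentages(csv_file, column):
    """Map each distinct value in `column` to its share of the rows, in percent.

    Hypothetical helper for illustration only.
    """
    # Counter tallies how many rows carry each value of the chosen column.
    counts = Counter(row[column] for row in csv.DictReader(csv_file))
    total = sum(counts.values())
    # With no data rows, counts is empty and no division takes place.
    return {value: 100.0 * n / total for value, n in counts.items()}


percentages = column_percentages(io.StringIO(CSVFILE), "occupation")
for value, pct in percentages.items():
    print("{}: {:.2f}%".format(value, pct))
```

To report a single value only, look it up directly, for example percentages.get("Intern", 0.0).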
Answer 1 (score: 0)
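Well, there are basically three steps:
1. Count the data lines in the file; you already do this with linecount.
2. Count the lines that meet your condition (the occurences).
3. Compute the ratio: occurences / linecount
It looks something like this: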
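myfile = open('3517315a.csv','r')
myfilecount = 0
linecount = 0
occurences_blue = 0  # must be initialised before it is incremented
firstline = True
for line in myfile:
    if firstline:  # skip the header row
        firstline = False
        continue
    fields = line.split(',')
    linecount += 1
    count = int(fields[0])
    colour = str(fields[1])
    channels = int(fields[2])
    code = str(fields[3])
    correct = str(fields[4])
    reading = float(fields[5])
    if colour == 'Blue':
        occurences_blue += 1
myfile.close()
# Computed once, after the loop. Multiplying by 100.0 turns the ratio into a
# percentage and also avoids integer division on Python 2.
percentage_blue = 100.0 * occurences_blue / linecount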
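But this is a very basic example. In any case, you should probably use the Python csv library to read the fields from the CSV, as suggested in a comment on the post (https://docs.python.org/2/library/csv.html). I would also expect that there are libraries out there that solve your problem more efficiently.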
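As a rough sketch of what the csv-based version might look like for the colour question: the header names (count, colour, channels, code, correct, reading) are assumed from the variable names in the question, the sample rows are made up, and percentage_matching is a hypothetical helper, not something from the csv module.

```python
import csv
import io


def percentage_matching(csv_file, column, value):
    """Return the percentage of data rows whose `column` equals `value`.

    Hypothetical helper for illustration only.
    """
    total = 0
    matches = 0
    # DictReader consumes the header row itself, so no firstline flag is needed.
    for row in csv.DictReader(csv_file):
        total += 1
        # Strip whitespace and ignore case so " Blue" and "blue" both match.
        if row[column].strip().lower() == value.lower():
            matches += 1
    if total == 0:
        return 0.0  # no data rows: avoid dividing by zero
    return 100.0 * matches / total


# Made-up sample data; the header names are assumed, not taken from the real file.
SAMPLE = """count,colour,channels,code,correct,reading
1,Blue,3,A1,yes,0.5
2,Red,3,B2,no,1.25
3,Blue,1,C3,yes,2.0
4,Green,2,D4,no,0.75
"""

result = percentage_matching(io.StringIO(SAMPLE), "colour", "blue")
print("{:.2f}% of the rows are blue".format(result))
```

With the sample data this prints 50.00% of the rows are blue. For a real file, pass an open file object instead of the StringIO, for example open('3517315a.csv', newline='') on Python 3.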
Answer 2 (score: 0)
If you are willing to use a third-party module, I strongly recommend Pandas. The code would be roughly:
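import pandas as pd

df = pd.read_csv("my_data.csv")
# Count the rows whose colour column equals "blue"
blues = len(df[df.colour == "blue"])
# Multiply by 100 to turn the fraction into a percentage
percentage = 100 * blues / len(df)
print(f"{percentage:.2f}% of the colours are blue")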