Question

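Goal: find the average of the "budget" column, using data imported from a csv file.

So far, my program opens the spreadsheet, prints its contents, and cleans up some of the formatting.

I'm not sure how to attach the file to this post, but the header row shows up as ['title,year,length,budget,rating,votes\r\n']

How do I start manipulating the data? More specifically, how do I locate the "budget" column and start doing math on it?

PS: I have been asked to solve this without using the csv module (no 'import csv').

My work so far: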

f = open("movies.csv") 
lines = f.readlines()

i = 0

while i < len(lines):
    line = lines[i]
    line = line.rstrip('\r\n')  # remove trailing \r\n from line
    print("%4d   %s" % (i+1, line))
    i = i + 1

Answer 1

Use the csv module from the standard library: https://docs.python.org/3/library/csv.html

import csv
with open('movies.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    budgets = [row['budget'] for row in reader]

All of your budgets are now in the budgets variable (as strings), and you can manipulate them any way you like.
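The values that DictReader returns are strings, so they need converting before any arithmetic. Here is a minimal sketch of the averaging step. It assumes the column is named budget and that missing values appear as 'NA' or as an empty cell. The average_budget helper and the sample rows are made up for illustration.

```python
import csv
import io

def average_budget(csvfile, column="budget", missing="NA"):
    """Return the mean of a numeric column, skipping missing entries."""
    reader = csv.DictReader(csvfile)
    values = [float(row[column]) for row in reader
              if row[column] not in ("", missing)]
    if not values:
        return None  # nothing to average
    return sum(values) / len(values)

# Made-up sample data standing in for movies.csv
sample = io.StringIO(
    "title,year,length,budget,rating,votes\r\n"
    "A,1999,100,1000,6.5,10\r\n"
    "B,2001,90,NA,7.0,20\r\n"
    "C,2003,120,3000,5.5,30\r\n"
)
print(average_budget(sample))
```

With a real file you would pass the object returned by open('movies.csv', newline='') instead of the StringIO sample.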

Answer 2

You can split each line of the csv file like this to get only the column you need. Here I am printing the price column of a .csv file.

>>> for line in open("SalesJan2009.csv"):
...     csv_row = line.split('\r')[0].split(',')
...     print(csv_row[2])

This prints each value in that column. Instead of printing them, you can add them up.
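Summing rather than printing could look something like the sketch below, which does not use the csv module. It assumes the column index is already known, that the first line is a header, and that every remaining cell in that column is numeric. The column_average name and the sample lines are hypothetical.

```python
def column_average(lines, index):
    """Average the values at position `index` in comma-separated lines."""
    total = 0.0
    count = 0
    for line in lines[1:]:           # skip the header line
        line = line.rstrip("\r\n")   # drop the line ending
        if not line:                 # ignore blank lines
            continue
        csv_row = line.split(",")
        total += float(csv_row[index])
        count += 1
    return total / count if count else None

# Made-up lines standing in for the result of f.readlines()
sample_lines = [
    "Date,Product,Price\r\n",
    "1/2/09,Product1,1200\r\n",
    "1/3/09,Product2,3600\r\n",
]
print(column_average(sample_lines, 2))
```

Note that a plain split(',') will mis-split any field that contains a quoted comma, so this only suits simple files.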

Answer 3

with open('movies.csv', 'r') as f:
    lines = f.read().split('\n')  # create a list from the lines read

# The first item of the list is the header line; split it into a list
# so that you can get the index of the 'budget' column.
header_list = lines[0].strip().split(',')
index = header_list.index('budget')

total_budget = 0
count = 0

for item in lines[1:]:  # skip the first line, which is the header
    if item != '':
        value = item.strip().split(',')[index]
        if value != 'NA':  # compare strings with !=, not 'is not'
            total_budget += float(value)
            count += 1

print(total_budget)
avg_budget = total_budget / count
print(avg_budget)

How do I locate a column of data in a csv?

3 answers: