Calculating the mean and standard deviation of each column of a CSV file in Python

Time: 2017-03-06 01:13:07

Tags: python csv

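I am trying to pre-process a dataset by removing the "?" from each data point and then calculating the mean and standard deviation of each column. I get the following error: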

  

IOError: [Errno 13] Permission denied: 'outputFile'

Here is my code:

import csv
import sys
import numpy as np
from collections import Counter
class PreProcessDataSet:
    def standardize(self) :
        special_chars = set('?')
        inputFile = open(sys.argv[1], 'rb')
        print ('Input file as entered is : ', inputFile)
        outputFile = open(sys.argv[2],'wb')
        print ('Output file as entered is : ', outputFile)
        writer = csv.writer(outputFile)
        for row in csv.reader(inputFile):
            if not set(''.join(row)) & special_chars:
                writer.writerow(row)
                print row


    column_totals = Counter()
    with open('outputFile') as f:
        reader = csv.reader(f)
        row_count = 0.0
        for row in reader:
            for column_idx, column_value in enumerate(row):
                try:
                    n = float(column_value)
                    column_totals[column_idx] += n
                except ValueError:
                    print "Error -- ({}) Column({}) could not be converted to float!".format(column_value,
                                                                                                 column_idx)
            row_count += 1.0

    # row_count is now 1 too many so decrement it back down
    row_count -= 1.0

    column_indexes = column_totals.keys()
    column_indexes.sort()

    # calculate per column averages using a list comprehension
    averages = [column_totals[idx] / row_count for idx in column_indexes]
    print averages
obj = PreProcessDataSet()
obj.standardize()

Can someone point out where I am going wrong? Thanks in advance!

1 Answer:

Answer 0: (score: 3)

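If the error is "Permission denied", then you most likely do not have full access to the system you are working on,

OR

check the logic of the list indices you are using; a wrong iteration or range may also produce an error like yours,

OR

you do not have sufficient permission to write to the root directory.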
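
One more thing that may be worth checking: the second open call in the question passes the literal string 'outputFile' rather than sys.argv[2], and that literal name is the one shown in the error message. The question's code also never closes the file it writes before reading it back. The following is only a minimal sketch of the whole task under some assumptions: it targets Python 3 (the question uses Python 2), the helper names can_write, filter_rows and column_stats are hypothetical, rows containing "?" are dropped as in the question, and the population standard deviation (statistics.pstdev) is assumed because the question does not say which kind is wanted.

```python
import csv
import os
import statistics


def can_write(path):
    """Return True if the directory that would hold `path` exists and is writable."""
    directory = os.path.dirname(os.path.abspath(path))
    return os.path.isdir(directory) and os.access(directory, os.W_OK)


def filter_rows(input_path, output_path, special_chars="?"):
    """Copy rows to output_path, skipping every row that contains a special character."""
    bad = set(special_chars)
    # The with-block closes both files, so the output is fully written
    # before anything tries to read it back.
    with open(input_path, newline="") as src, \
            open(output_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if not set("".join(row)) & bad:
                writer.writerow(row)


def column_stats(path):
    """Return {column_index: (mean, population_std)} for the numeric cells in `path`."""
    columns = {}
    with open(path, newline="") as f:
        for row in csv.reader(f):
            for idx, value in enumerate(row):
                try:
                    columns.setdefault(idx, []).append(float(value))
                except ValueError:
                    # Cells that are not numbers (for example a header) are skipped.
                    pass
    return {idx: (statistics.mean(vals), statistics.pstdev(vals))
            for idx, vals in sorted(columns.items())}


def main(argv):
    """Expects argv = [script, input_path, output_path], like the question's script."""
    in_path, out_path = argv[1], argv[2]
    if not can_write(out_path):
        print("Cannot write to", out_path)
        return 1
    filter_rows(in_path, out_path)
    # Read back the path that was actually written, not the literal 'outputFile'.
    for idx, (mean, std) in column_stats(out_path).items():
        print(idx, mean, std)
    return 0
```

To run it as a script, one could end the file with sys.exit(main(sys.argv)) after importing sys. If the sample standard deviation is wanted instead, statistics.stdev can be swapped in for statistics.pstdev; it needs at least two values per column.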