I want to get the variance of each column in a CSV file, and I wrote the following:
import numpy as np
import csv
import collections

Training = 'Training.csv'
inputFile = open(Training, 'r', newline='')
cols_values = collections.defaultdict(list)
numericalValues = []
reader = csv.reader(inputFile)
row = next(reader)  # skip the header row
for row in reader:
    for col, value in enumerate(row):
        cols_values[col].append(value)  # value is still a str here
        numericalValues.append(cols_values[col])
np.var(numericalValues[0], dtype=np.float64)
I get an error on the np.var line:
TypeError: cannot perform reduce with flexible type
Is there something I'm missing? The values are definitely numeric!
Answer 0: (score: 1)
Is there any reason not to use Pandas for this?
import numpy as np
import pandas as pd
Training = 'Training.csv'
df = pd.read_csv(Training)
df.apply(np.var, axis=0) # can also use `df.var(...)`
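You will want to make sure that all columns contain numeric values. If you like, you can also use np.nanvar to ignore NaN values. Note that np.var defaults to ddof=0 (population variance) while df.var() defaults to ddof=1 (sample variance), so the two may not give identical results unless you pass ddof explicitly.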
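For what it's worth, the TypeError in the question most likely comes from csv.reader returning every field as a string, so np.var is handed a list of str rather than numbers. Below is a minimal sketch of both routes: converting the values by hand, and letting pandas coerce them. The inline `sample` text and the column names `a` and `b` are made up for illustration and stand in for Training.csv; treating empty cells as NaN is an assumption about how you want missing data handled.

```python
import csv
import io

import numpy as np
import pandas as pd

# Hypothetical sample standing in for Training.csv (header row plus data rows).
sample = "a,b\n1,10\n2,20\n3,\n"

# Route 1: csv module. Every field arrives as a str, so convert before np.var.
reader = csv.reader(io.StringIO(sample))
header = next(reader)  # skip the header row
columns = [[] for _ in header]
for row in reader:
    for col, value in enumerate(row):
        # Assumption: empty fields become NaN; float() raises ValueError
        # on any other non-numeric text.
        columns[col].append(float(value) if value.strip() else np.nan)

# np.nanvar ignores NaN entries and uses ddof=0 by default.
csv_variances = [float(np.nanvar(values)) for values in columns]

# Route 2: pandas. Coerce each column to numeric; unparseable cells become NaN.
df = pd.read_csv(io.StringIO(sample))
numeric_df = df.apply(pd.to_numeric, errors="coerce")
pandas_variances = {
    name: float(np.nanvar(numeric_df[name].to_numpy(dtype=float)))
    for name in numeric_df.columns
}

print(csv_variances)
print(pandas_variances)
```

With a real file you would replace `io.StringIO(sample)` with the path or open file handle. Using `errors="coerce"` silently turns bad cells into NaN, which is convenient but can hide data problems, so it may be worth checking `numeric_df.isna().sum()` first.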