How do I compute the mean across multiple CSV files?

Time: 2019-08-11 15:22:29

Tags: python csv

I have 100 similarly formatted CSV files, each containing only two values, mean and std:

file1.csv

mean  0.21  
std   0.54

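I need to extract the mean and the standard deviation from every CSV file and compute the overall averages, e.g. mean([mean1, mean2, ...]) and mean([std1, std2, ...]). Copying the mean and std out of each file by hand and then averaging them would be tedious.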

3 answers:

Answer 0: (score: 1)

Assuming the file names are in your_files:

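means, deviations = [], []
for file_name in your_files:
    with open(file_name) as f:
        # Lazily parse the second whitespace-separated field of each line.
        lines = (float(line.split()[1]) for line in f)
        means.append(next(lines))       # first line: mean
        deviations.append(next(lines))  # second line: std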

Then you can compute the averages with the usual formula (sum divided by count).
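As a minimal sketch of that last step, the two lists can be averaged with the standard library's statistics.fmean (Python 3.8+). The sample values below are hypothetical stand-ins for the means and deviations lists built by the loop above; they are not taken from real files.

```python
from statistics import fmean

# Hypothetical sample values standing in for the lists filled by the loop.
means = [0.21, 0.23]
deviations = [0.54, 0.56]

# Arithmetic mean of each list: sum(values) / len(values).
mean_of_means = fmean(means)
mean_of_stds = fmean(deviations)

print(f"Mean of means: {mean_of_means}")
print(f"Mean of stds: {mean_of_stds}")
```

fmean raises StatisticsError on an empty list, so an empty your_files fails loudly instead of dividing by zero.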

Answer 1: (score: 1)

I would call this the "caveman" approach, but it should work:

import os
means = []
stds = []
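for file in os.listdir():
    # Only process the data files (file1.csv, file2.csv, ...).
    if not file.startswith('file'):
        continue
    # Each file has exactly two lines: "mean <value>" and "std <value>".
    with open(file) as f:
        mean, std = [float(l.split()[1]) for l in f]
    means.append(mean)
    stds.append(std)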

print('mean mean', sum(means)/len(means))
print('mean stds', sum(stds)/len(stds))

Test:

$ echo "mean 0.21
> std 0.54" > file1.csv
$ echo "mean 0.23
> std 0.56" > file2.csv
$ python -q
>>> import os
>>> means = []
>>> stds = []
>>> for file in os.listdir():
...     if not file.startswith('file'):
...         continue
...     mean, std = [float(l.split()[1]) for l in open(file).readlines()]
...     means.append(mean)
...     stds.append(std)
... 
>>> print('mean mean', sum(means)/len(means))
mean mean 0.22
>>> print('mean stds', sum(stds)/len(stds))
mean stds 0.55

Answer 2: (score: 1)

If file1.csv through file100.csv are all in the same directory, you can use the following Python script:

#!/usr/bin/env python3

N = 100
mean_sum = 0
std_sum = 0
for i in range(1, N + 1):
    with open(f"file{i}.csv") as f:
        mean_sum += float(f.readline().split(",")[1])
        std_sum += float(f.readline().split(",")[1])

print(f"Mean of means: {mean_sum / N}")
print(f"Mean of stds: {std_sum / N}")

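This assumes the files really are formatted as CSV, with a comma delimiter. If the fields are only separated by whitespace, as in your snippet, use .split() instead of .split(",").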
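If you are not sure which delimiter the files use, one option is a small helper that accepts both. This is only a sketch: parse_value is a hypothetical name, and it assumes every line has a label followed by a single number.

```python
import re


def parse_value(line: str) -> float:
    """Return the numeric field from a line like 'mean,0.21' or 'mean  0.21'."""
    # Split on commas and/or runs of whitespace, dropping empty pieces.
    fields = [field for field in re.split(r"[,\s]+", line.strip()) if field]
    return float(fields[1])


print(parse_value("mean,0.21"))
print(parse_value("std   0.54\n"))
```

In the script above, float(f.readline().split(",")[1]) could then be replaced with parse_value(f.readline()).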
