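I have a .csv file (mydb.csv) with entries like the ones below (10,000+ rows). The 7th column of the table contains the date. Each date repeats many times because the dataset holds hourly records.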
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,08,21.4,95,994.3,3.3,0,0,,
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,09,21.3,95,994.1,1.2,0,0,,
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,10,21.2,94,994.5,2.1,0,0,,
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,11,20.9,94,994.7,1.3,0,0,,
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,12,20.9,93,995.6,1.7,0,0,0.0,0.0
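I need to compute the daily average of each recorded observation.
Can I do this in Python, or should I convert the .csv file to an SQLite file and query it?
Answer 0: (score: 0)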
You can do this quickly with the pandas library in Python. It would look something like this:
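import pandas as pd
# Assumes mydb.csv has no header row, as in the sample shown in the question.
df = pd.read_csv("mydb.csv", header=None)
# Column 6 (0-based) is the 7th column, which holds the date.
# numeric_only=True leaves the text columns out of the average.
avgd_df = df.groupby(6).mean(numeric_only=True)
avgd_df.to_csv("averaged.csv")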
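On the SQLite part of the question: converting the file is not required, but if SQL feels more natural, Python's built-in sqlite3 module can run the same aggregation without leaving Python. Below is a minimal sketch, assuming the file has no header row and loading only the date plus the 9th and 10th fields. The table name obs and the column names day, col9 and col10 are made up for illustration, since the question does not say what each field measures.

```python
import csv
import io
import sqlite3

# Sample rows copied from the question; a real run would pass open("mydb.csv", newline="").
SAMPLE = """\
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,08,21.4,95,994.3,3.3,0,0,,
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,09,21.3,95,994.1,1.2,0,0,,
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,10,21.2,94,994.5,2.1,0,0,,
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,11,20.9,94,994.7,1.3,0,0,,
QTEwOA==,81881,-7.610773,-72.681333,220,A108,2016-06-11,12,20.9,93,995.6,1.7,0,0,0.0,0.0
"""


def daily_averages(lines):
    """Load the date and two numeric fields into SQLite, then average per day."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: day is the 7th field, col9 and col10 are the 9th and 10th.
    conn.execute("CREATE TABLE obs (day TEXT, col9 REAL, col10 REAL)")
    rows = ((r[6], float(r[8]), float(r[9])) for r in csv.reader(lines) if r)
    conn.executemany("INSERT INTO obs VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT day, AVG(col9), AVG(col10) FROM obs GROUP BY day ORDER BY day"
    ).fetchall()
    conn.close()
    return result


averages = daily_averages(io.StringIO(SAMPLE))
for day, avg9, avg10 in averages:
    print(day, round(avg9, 2), round(avg10, 2))
```

For a one-off daily average, the pandas route is shorter. The SQLite route pays off mostly if the data will be queried repeatedly or no longer fits comfortably in memory.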