I have multiple CSV files that I want to combine into one df.
They all share this general format, with two index columns:
1 2
CU0112-005287-7 Output Energy, (Wh/h) 0.064 0.066
CU0112-005287-7 Lights (Wh) 0 0
1 2
CU0112-001885-L Output Energy, (Wh/h) 1.33 1.317
CU0112-001885-L Lights (Wh) 1.33 1.317
and so on...
The merged df would be:
1 2
CU0112-005287-7 Output Energy, (Wh/h) 0.064 0.066
CU0112-005287-7 Lights (Wh) 0 0
CU0112-001885-L Output Energy, (Wh/h) 1.33 1.317
CU0112-001885-L Lights (Wh) 1.33 1.317
I am trying this code:
import os
import pandas as pd
import glob
files = glob.glob(r'2017-12-05\Aggregated\*.csv')  # folder which contains all the csv files
df = pd.merge([pd.read_csv(f, index_col=[0,1]) for f in files], how='outer')
df.to_csv(r'\merged.csv')
But I get this error:
TypeError: merge() takes at least 2 arguments (2 given)
Answer 0 (score: 3)
I think you need concat instead of merge:
df = pd.concat([pd.read_csv(f, index_col=[0,1]) for f in files])
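To illustrate why concat fits here: pd.merge joins exactly two frames (a left and a right one), while pd.concat accepts a list of frames and stacks their rows. The following is a minimal sketch on in-memory data rather than the real files; the index column names "meter" and "measure" are assumptions made up for the example, since the question does not show the header of the index columns.

```python
import io
import pandas as pd

# Hypothetical stand-ins for two of the CSV files. The header names
# "meter" and "measure" are assumptions, not taken from the question.
csv_a = io.StringIO(
    'meter,measure,1,2\n'
    'CU0112-005287-7,"Output Energy, (Wh/h)",0.064,0.066\n'
    'CU0112-005287-7,Lights (Wh),0,0\n'
)
csv_b = io.StringIO(
    'meter,measure,1,2\n'
    'CU0112-001885-L,"Output Energy, (Wh/h)",1.33,1.317\n'
    'CU0112-001885-L,Lights (Wh),1.33,1.317\n'
)

# Read each file with the first two columns as a two-level index,
# then stack the rows of all frames into one DataFrame.
frames = [pd.read_csv(f, index_col=[0, 1]) for f in (csv_a, csv_b)]
merged = pd.concat(frames)
print(merged)
```

Because both frames have the same columns ("1" and "2"), the rows are simply appended and the two-level index is preserved.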
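Answer 1 (score: 0)
You can try the following. I made some changes to the DataFrame combining logic:
import os
import glob
from functools import reduce  # reduce is not a builtin in Python 3
import pandas as pd
files = glob.glob(r'2017-12-05\Aggregated\*.csv')  # folder which contains all the csv files
# Merge the frames pairwise; this assumes every file has a column or index level named 'id'
df = reduce(lambda df1, df2: pd.merge(df1, df2, on='id', how='outer'), [pd.read_csv(f, index_col=[0,1]) for f in files])
df.to_csv(r'\merged.csv')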
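As a rough sketch of what the reduce pattern does, here it is on three small made-up frames. The "id" key and the columns "x", "y", "z" are hypothetical; the files in the question may not contain such a key at all.

```python
from functools import reduce
import pandas as pd

# Hypothetical frames that share an "id" key column (an assumption for illustration).
a = pd.DataFrame({"id": [1, 2], "x": [10, 20]})
b = pd.DataFrame({"id": [2, 3], "y": [200, 300]})
c = pd.DataFrame({"id": [1, 3], "z": [1000, 3000]})

# reduce applies the two-frame merge repeatedly: ((a merge b) merge c).
wide = reduce(
    lambda left, right: pd.merge(left, right, on="id", how="outer"),
    [a, b, c],
)
print(wide)
```

Note that an outer merge lines rows up by key and adds columns side by side, filling gaps with NaN. That widens the table rather than stacking rows, so for the layout shown in the question concat is probably the closer fit.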
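Answer 2 (score: 0)
A simple approach:
Create a list named csvs:
from os import listdir
files = listdir()  # entries in the current working directory
csvs = list()
for file in files:
    if file.endswith(".csv"):
        csvs.append(file)
Concatenate the csvs:
import pandas as pd
data = pd.DataFrame()
for i in csvs:
    table = pd.read_csv(i)
    data = pd.concat([data, table])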
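A possible variation on the loop above is to collect all frames in a list and call pd.concat once, which avoids copying the growing frame on every iteration. This is only a sketch: the helper name concat_csvs and the temporary demo files are made up for illustration.

```python
import glob
import os
import tempfile
import pandas as pd

def concat_csvs(folder):
    """Read every .csv file in `folder` and stack the rows into one DataFrame.

    Hypothetical helper for illustration; pd.concat raises ValueError
    if the folder contains no CSV files.
    """
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    frames = [pd.read_csv(p) for p in paths]
    # ignore_index=True renumbers the rows 0..n-1 in the combined frame.
    return pd.concat(frames, ignore_index=True)

# Demo on two throwaway files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2]}).to_csv(os.path.join(tmp, "one.csv"), index=False)
    pd.DataFrame({"a": [3]}).to_csv(os.path.join(tmp, "two.csv"), index=False)
    data = concat_csvs(tmp)

print(data)
```

To keep the two index columns from the question, pass index_col=[0, 1] to pd.read_csv and drop ignore_index=True.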