I have 3 pandas DataFrames, each with a different number of rows and some columns in common, and I need to merge all of the data together:
import numpy as np
import pandas as pd
mydata = [0]*3
dataA = {'First': [500],'Second': ['Sone']}
mydata[0] = pd.DataFrame(dataA,columns=['First','Second'])
dataB = {'First': [500,500],'Third': [0.5,0.6]}
mydata[1] = pd.DataFrame(dataB,columns=['First','Third'])
dataC = {'First': [500,500,500],'Fourth': ['Fone', 'Ftwo','Fthree'],'Fifth': [23, 24, 25]}
mydata[2] = pd.DataFrame(dataC,columns=['First','Fourth','Fifth'])
The merged data should look like this:
merge_data = {'First': [500,500,500,500,500,500],'Second': ['Sone','Sone','Sone','Sone','Sone','Sone'],'Third': [0.5,0.6,0.5,0.6,0.5,0.6],'Fourth': ['Fone', 'Fone', 'Ftwo', 'Ftwo', 'Fthree','Fthree'],'Fifth': [23, 23, 24, 24, 25, 25]}
merge_df = pd.DataFrame(merge_data,columns=['First','Second','Third','Fourth','Fifth'])
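Appending the data produces rows with NaN:
merge_data = mydata[0].copy()
for i in np.arange(1, len(mydata)):
    # DataFrame.append was removed in pandas 2.0; pd.concat stacks rows the same way
    merge_data = pd.concat([merge_data, mydata[i]], sort=False)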
and merging on the index loses rows:
merge_data = pd.merge(mydata[0], mydata[1], left_index=True, right_index=True)
Is it possible to merge them into merge_df?
Answer 0 (score: 1)
You have to merge on the 'First' column:
pd.merge(mydata[0], mydata[1], on='First').merge(mydata[2], on='First')
This gives:
   First Second  Third  Fourth  Fifth
0    500   Sone    0.5    Fone     23
1    500   Sone    0.5    Ftwo     24
2    500   Sone    0.5  Fthree     25
3    500   Sone    0.6    Fone     23
4    500   Sone    0.6    Ftwo     24
5    500   Sone    0.6  Fthree     25
Only the Fourth and Fifth columns are still aligned here, whereas they are not in the merge_df dataframe...
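If the list grows beyond three frames, the chained merge can be folded over the whole list, and the rows can then be reordered to match merge_df. The following is only a sketch: the use of functools.reduce, the names merged and result, and the sort order (by Fifth, then by Third) are my assumptions, chosen because that ordering happens to reproduce the merge_df shown in the question.

```python
from functools import reduce

import pandas as pd

# The same three frames as in the question.
mydata = [
    pd.DataFrame({'First': [500], 'Second': ['Sone']}),
    pd.DataFrame({'First': [500, 500], 'Third': [0.5, 0.6]}),
    pd.DataFrame({'First': [500, 500, 500],
                  'Fourth': ['Fone', 'Ftwo', 'Fthree'],
                  'Fifth': [23, 24, 25]}),
]

# Fold the list with successive inner merges on 'First'.
# Every row has First == 500, so each merge pairs every left row
# with every right row (1 * 2 * 3 = 6 rows in total).
merged = reduce(lambda left, right: pd.merge(left, right, on='First'), mydata)

# Assumption: the desired row order is "by Fifth, then by Third".
result = merged.sort_values(['Fifth', 'Third']).reset_index(drop=True)

# The target frame from the question, for comparison.
merge_df = pd.DataFrame({
    'First': [500, 500, 500, 500, 500, 500],
    'Second': ['Sone', 'Sone', 'Sone', 'Sone', 'Sone', 'Sone'],
    'Third': [0.5, 0.6, 0.5, 0.6, 0.5, 0.6],
    'Fourth': ['Fone', 'Fone', 'Ftwo', 'Ftwo', 'Fthree', 'Fthree'],
    'Fifth': [23, 23, 24, 24, 25, 25],
}, columns=['First', 'Second', 'Third', 'Fourth', 'Fifth'])

# Raises AssertionError if the two frames differ.
pd.testing.assert_frame_equal(result, merge_df)
```

Note that this only works as a full combination because 'First' has the same value in every row; with distinct key values, the merge would pair only the rows whose keys match.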