Concatenating CSV files with different and overlapping columns

Time: 2016-07-01 10:40:04

Tags: python csv concatenation

I have three CSV files.

One has these columns:

names1=['Date','Conc','Flow','SZ','SB','RZ','RB','Fraction','Attenuation','Conc_less_-200_flag','Conc_greater_500_flag','Wind Speed','Wind Direction','Wind_direction_Flag','Wind_Speed_Less_than_4','Middle','Wind_Speed_Greater_than_10','Multiple conditions']

First rows:

Date,Conc,Flow,SZ,SB,RZ,RB,Fraction,Attenuation,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions
2004-02-27 00:00:00,,,,,,,,,,,6.524999999999999,177.75,0.0,0.0,1.0,0.0,0.0
2004-02-27 01:00:00,,,,,,,,,,,6.991666666666667,197.83333333333334,0.0,0.0,1.0,0.0,0.0

The other two have:

names2=['Date','Chanel0','Chanel1','Chanel2','Chanel3','Chanel4','Chanel5','Chanel6','Chanel7','Conc_less_-200_flag','Conc_greater_500_flag','Wind Speed','Wind Direction','Wind_direction_Flag','Wind_Speed_Less_than_4','Middle','Wind_Speed_Greater_than_10','Multiple conditions']

First rows:

Date,Chanel0,Chanel1,Chanel2,Chanel3,Chanel4,Chanel5,Chanel6,Chanel7,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions
2012-01-23 08:00:00,-2402.3575757575754,-2418.8121212121237,-2423.983863636366,-2422.913745454546,-2423.983863636366,-2422.814151515151,-2423.242424242424,-2422.4842121212123,1.0,1.0,,,,,,,
2012-01-23 09:00:00,6.5666666666666655,6.8849999999999945,0.02130000000000001,1.4343266666666665,0.02130000000000001,1.5671516666666663,1.0,2.085166666666667,1.0,1.0,,,,,,,

I would like the output to be a CSV file with this header:

['Date','Conc','Flow','SZ','SB','RZ','RB','Fraction','Attenuation','Chanel0','Chanel1','Chanel2','Chanel3','Chanel4','Chanel5','Chanel6','Chanel7','Conc_greater_500_flag','Wind Speed','Wind Direction','Wind_direction_Flag','Wind_Speed_Less_than_4','Middle','Wind_Speed_Greater_than_10','Multiple conditions']

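Obviously, the rows contributed by the latter two files will have blanks (or better, NaN or 0) in the Flow, SZ, etc. columns, and the rows from the first file will have them in the Chanel0 to Chanel7 columns.

Note that Date is the index column.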

I tried df_merged=pd.concat(df1,df2,df3), but that seems to overlap the headers.

I also tried:

df_merged = pd.concat([df1,df2,df3],axis=1)

But this turns the CSV output into:

,Conc,Flow,SZ,SB,RZ,RB,Fraction,Attenuation,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions,Chanel0,Chanel1,Chanel2,Chanel3,Chanel4,Chanel5,Chanel6,Chanel7,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions,Chanel0,Chanel1,Chanel2,Chanel3,Chanel4,Chanel5,Chanel6,Chanel7,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions
2004-02-27 00:00:00,,,,,,,,,,,6.524999999999999,177.75,0.0,0.0,1.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

This is close, but there are extra columns at the end, and I don't think the common columns have been overlapped.

1 answer:

Answer 0 (score: 0)

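Here is an idea. First read the headers of the files into lists, then convert them to sets and take the union of the lists.

You can look at this answer: How to get the union of two lists using list comprehension?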
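A minimal sketch of this idea using only the standard library. The helper names `ordered_union` and `merge_csv` and the tiny in-memory files are hypothetical, made up for illustration. It builds the union of the headers in first-seen order (a plain `set` would lose the column order) and writes every row under that combined header, leaving columns a file does not have blank.

```python
import csv
import io


def ordered_union(*headers):
    """Union of several header lists, keeping first-seen order."""
    seen = {}
    for header in headers:
        for name in header:
            seen.setdefault(name, None)
    return list(seen)


def merge_csv(sources, out, missing=""):
    """Write the rows of every source under the union of all headers.

    `missing` fills columns that a source file does not contain at all.
    """
    readers = [csv.DictReader(src) for src in sources]
    fieldnames = ordered_union(*(r.fieldnames for r in readers))
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval=missing)
    writer.writeheader()
    for reader in readers:
        writer.writerows(reader)
    return fieldnames


# Hypothetical toy data standing in for the real files.
file_a = io.StringIO("Date,Conc,Wind Speed\n2004-02-27 00:00:00,,6.5\n")
file_b = io.StringIO("Date,Chanel0,Wind Speed\n2012-01-23 08:00:00,-2402.4,\n")

merged = io.StringIO()
columns = merge_csv([file_a, file_b], merged)
print(merged.getvalue())
```

With real files, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`.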
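Since the question already uses pandas, a row-wise `pd.concat` may be a simpler alternative to building the union by hand. This is a different technique from the list union suggested above. With `axis=1`, the frames are aligned on the index and their columns are placed side by side, which is why the shared column names are repeated in the output shown in the question. With the default `axis=0`, rows are stacked and columns are aligned by name, so shared columns overlap and columns missing from a frame are filled with NaN. The toy data below is made up for illustration, and `sort=False` assumes a pandas version that accepts that parameter.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for the real files.
csv_a = io.StringIO(
    "Date,Conc,Wind Speed\n"
    "2004-02-27 00:00:00,1.5,6.5\n"
)
csv_b = io.StringIO(
    "Date,Chanel0,Wind Speed\n"
    "2012-01-23 08:00:00,-2402.4,7.0\n"
)

df1 = pd.read_csv(csv_a, index_col="Date", parse_dates=True)
df2 = pd.read_csv(csv_b, index_col="Date", parse_dates=True)

# Default axis=0 stacks the rows; columns are matched by name and
# sort=False keeps them in first-seen order.
df_merged = pd.concat([df1, df2], sort=False)

# Missing values are written as "NaN"; pass a path to write a file instead.
csv_text = df_merged.to_csv(na_rep="NaN")
print(csv_text)
```

If a specific column order is wanted, such as the header listed in the question, `df_merged.reindex(columns=wanted)` with a list `wanted` of column names should arrange the columns accordingly.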