How to merge the Date columns of a concatenated DataFrame that contains N/As

Time: 2019-01-28 20:25:47

Tags: python pandas python-2.7 merge concat

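I have a DataFrame that I built by starting from an empty DataFrame and concatenating several DataFrames onto it in a loop, using the following:

final = pd.concat([final, out], axis=1, sort=True)

That gives me something like this: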
Date       Count  Date       Count  Date       Count  Date       Count
1/1/2019   1      1/1/2019   1      N/A        N/A    1/1/2019   1
1/2/2019   1      1/2/2019   1      1/2/2019   1      1/2/2019   1
1/3/2019   1      1/3/2019   1      1/3/2019   1      1/3/2019   1
N/A        N/A    1/4/2019   1      1/4/2019   1      1/4/2019   1
1/5/2019   1      1/5/2019   1      1/5/2019   1      1/5/2019   1
1/6/2019   1      1/6/2019   1      1/6/2019   1      N/A        N/A
N/A        N/A    1/7/2019   1      1/7/2019   1      1/7/2019   1
1/8/2019   1      1/8/2019   1      N/A        N/A    1/8/2019   1
1/9/2019   1      1/9/2019   1      1/9/2019   1      1/9/2019   1
N/A        N/A    N/A        N/A    1/10/2019  1      1/10/2019  1
1/11/2019  1      1/11/2019  1      1/11/2019  1      1/11/2019  1
1/12/2019  1      1/12/2019  1      1/12/2019  1      1/12/2019  1
1/13/2019  1      1/13/2019  1      1/13/2019  1      N/A        N/A

But my goal is to get this:

Date       Count  Count  Count  Count
1/1/2019   1      1      N/A    1
1/2/2019   1      1      1      1
1/3/2019   1      1      1      1
1/4/2019   N/A    1      1      1
1/5/2019   1      1      1      1
1/6/2019   1      1      1      N/A
1/7/2019   N/A    1      1      1
1/8/2019   1      1      N/A    1
1/9/2019   1      1      1      1
1/10/2019  N/A    N/A    1      1
1/11/2019  1      1      1      1
1/12/2019  1      1      1      1
1/13/2019  1      1      1      N/A

2 Answers:

Answer 0: (score: 0)

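You are using concat where you want merge. I assume some of the frames are missing data for certain dates. Each round of the loop should instead be: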

 final = final.merge(out, on='Date', how='outer')

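You may also want to pass suffixes that make sense for your data, for example suffixes=['','new_data'] in the merge (i.e. final = final.merge(out, on='Date', how='outer', suffixes=['','new_data'])). This helps you tell where each column came from.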
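As a rough sketch of the loop above, the following merges a list of frames on Date with an outer join. The `frames` list and the `Count_0`, `Count_1`, ... column names are made up for illustration. It renames each Count column up front instead of relying on suffixes, and it seeds the result from the first frame, since merging on 'Date' into a completely empty `final` (one with no Date column) would raise a KeyError.

```python
from functools import reduce

import pandas as pd

# Hypothetical stand-ins for the per-iteration `out` frames; each lacks one date.
frames = [
    pd.DataFrame({"Date": ["1/2/2019", "1/3/2019"], "Count": [1, 1]}),
    pd.DataFrame({"Date": ["1/1/2019", "1/2/2019"], "Count": [1, 1]}),
    pd.DataFrame({"Date": ["1/1/2019", "1/3/2019"], "Count": [1, 1]}),
]

# Give every Count column a distinct name so repeated merges do not collide.
renamed = [
    frame.rename(columns={"Count": f"Count_{i}"})
    for i, frame in enumerate(frames)
]

# Outer-merge the frames pairwise on Date; dates missing from a frame become NaN.
final = reduce(
    lambda left, right: left.merge(right, on="Date", how="outer"),
    renamed,
)

# Sort chronologically using a temporary parsed-date column (strings sort lexically).
final["_key"] = pd.to_datetime(final["Date"], format="%m/%d/%Y")
final = final.sort_values("_key").drop(columns="_key").reset_index(drop=True)

print(final)
```

Renaming before merging keeps the column names predictable no matter how many frames are combined, whereas fixed suffixes can clash once the same suffix is needed a second time.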

Answer 1: (score: 0)

As I understand it, you want to combine the Date columns so that the first Date column has no missing values.

Here is the input data:

df = pd.read_clipboard()
print(df)
         Date  Count     Date.1  Count.1     Date.2  Count.2     Date.3  Count.3
0    1/1/2019    1.0   1/1/2019      1.0        NaN      NaN   1/1/2019      1.0
1    1/2/2019    1.0   1/2/2019      1.0   1/2/2019      1.0   1/2/2019      1.0
2    1/3/2019    1.0   1/3/2019      1.0   1/3/2019      1.0   1/3/2019      1.0
3         NaN    NaN   1/4/2019      1.0   1/4/2019      1.0   1/4/2019      1.0
4    1/5/2019    1.0   1/5/2019      1.0   1/5/2019      1.0   1/5/2019      1.0
5    1/6/2019    1.0   1/6/2019      1.0   1/6/2019      1.0        NaN      NaN
6         NaN    NaN   1/7/2019      1.0   1/7/2019      1.0   1/7/2019      1.0
7    1/8/2019    1.0   1/8/2019      1.0        NaN      NaN   1/8/2019      1.0
8    1/9/2019    1.0   1/9/2019      1.0   1/9/2019      1.0   1/9/2019      1.0
9         NaN    NaN        NaN      NaN  1/10/2019      1.0  1/10/2019      1.0
10  1/11/2019    1.0  1/11/2019      1.0  1/11/2019      1.0  1/11/2019      1.0
11  1/12/2019    1.0  1/12/2019      1.0  1/12/2019      1.0  1/12/2019      1.0
12  1/13/2019    1.0  1/13/2019      1.0  1/13/2019      1.0        NaN      NaN

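One possible approach is to fill the NaN values in Date from the other Date columns, one column at a time (with this data, Date.3 turns out not to be needed):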

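# Assign the result back; fillna(..., inplace=True) on df['Date'] may not update df in newer pandas
df['Date'] = df['Date'].fillna(df['Date.1'])
df['Date'] = df['Date'].fillna(df['Date.2'])
# Drop the now-redundant Date columns
df = df.drop(['Date.1', 'Date.2', 'Date.3'], axis=1)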

Output:

print(df)
         Date  Count  Count.1  Count.2  Count.3
0    1/1/2019    1.0      1.0      NaN      1.0
1    1/2/2019    1.0      1.0      1.0      1.0
2    1/3/2019    1.0      1.0      1.0      1.0
3    1/4/2019    NaN      1.0      1.0      1.0
4    1/5/2019    1.0      1.0      1.0      1.0
5    1/6/2019    1.0      1.0      1.0      NaN
6    1/7/2019    NaN      1.0      1.0      1.0
7    1/8/2019    1.0      1.0      NaN      1.0
8    1/9/2019    1.0      1.0      1.0      1.0
9   1/10/2019    NaN      NaN      1.0      1.0
10  1/11/2019    1.0      1.0      1.0      1.0
11  1/12/2019    1.0      1.0      1.0      1.0
12  1/13/2019    1.0      1.0      1.0      NaN
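
The same idea can be extended to any number of Date columns without naming each one by hand. The sketch below is only an illustration: it uses a small made-up frame and assumes the duplicate columns follow the `Date`, `Date.1`, `Date.2`, ... naming that `read_clipboard` produced above.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame shaped like the input above (three Date/Count pairs).
df = pd.DataFrame({
    "Date":    ["1/1/2019", np.nan, np.nan],
    "Count":   [1.0, np.nan, np.nan],
    "Date.1":  ["1/1/2019", "1/4/2019", np.nan],
    "Count.1": [1.0, 1.0, np.nan],
    "Date.2":  [np.nan, "1/4/2019", "1/10/2019"],
    "Count.2": [np.nan, 1.0, 1.0],
})

# Collect every Date-like column, in their existing left-to-right order.
date_cols = [col for col in df.columns if col.startswith("Date")]

# Fill gaps in the first Date column from each later Date column in turn.
combined = df[date_cols[0]]
for col in date_cols[1:]:
    combined = combined.fillna(df[col])
df["Date"] = combined

# Drop the redundant Date columns, keeping only the combined one.
df = df.drop(columns=[col for col in date_cols if col != "Date"])

print(df)
```

A row keeps a missing Date only if every one of its Date columns was missing.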