I have a DataFrame that I built by starting from an empty DataFrame and concatenating several DataFrames onto it in a loop, using the following:
final = pd.concat([final, out], axis=1, sort=True)
That gives me something like this:
Date Count Date Count Date Count Date Count
1/1/2019 1 1/1/2019 1 N/A N/A 1/1/2019 1
1/2/2019 1 1/2/2019 1 1/2/2019 1 1/2/2019 1
1/3/2019 1 1/3/2019 1 1/3/2019 1 1/3/2019 1
N/A N/A 1/4/2019 1 1/4/2019 1 1/4/2019 1
1/5/2019 1 1/5/2019 1 1/5/2019 1 1/5/2019 1
1/6/2019 1 1/6/2019 1 1/6/2019 1 N/A N/A
N/A N/A 1/7/2019 1 1/7/2019 1 1/7/2019 1
1/8/2019 1 1/8/2019 1 N/A N/A 1/8/2019 1
1/9/2019 1 1/9/2019 1 1/9/2019 1 1/9/2019 1
N/A N/A N/A N/A 1/10/2019 1 1/10/2019 1
1/11/2019 1 1/11/2019 1 1/11/2019 1 1/11/2019 1
1/12/2019 1 1/12/2019 1 1/12/2019 1 1/12/2019 1
1/13/2019 1 1/13/2019 1 1/13/2019 1 N/A N/A
But my goal is to get this:
Date Count Count Count Count
1/1/2019 1 1 N/A 1
1/2/2019 1 1 1 1
1/3/2019 1 1 1 1
1/4/2019 N/A 1 1 1
1/5/2019 1 1 1 1
1/6/2019 1 1 1 N/A
1/7/2019 N/A 1 1 1
1/8/2019 1 1 N/A 1
1/9/2019 1 1 1 1
1/10/2019 N/A N/A 1 1
1/11/2019 1 1 1 1
1/12/2019 1 1 1 1
1/13/2019 1 1 1 N/A
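Answer 0 (score: 0)
You are using concat when what you want is merge. concat with axis=1 lines rows up by index position, not by the value of Date, so I assume some dates are missing from some of the frames. Each pass of the loop should be: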
final = final.merge(out, on='Date', how='outer')
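You may also want to pass suffixes that are meaningful for your data, for example suffixes=['', 'new_data'], in the merge:
final = final.merge(out, on='Date', how='outer', suffixes=['', 'new_data'])
This will help you see where each column came from.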
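To make the loop concrete, here is a minimal sketch under a few assumptions: the three sample frames are made up for illustration, and the `Count_0`, `Count_1`, ... names are hypothetical labels rather than anything from the question. It renames `Count` before each merge instead of relying on a fixed `suffixes` pair, since reusing the same suffixes on every pass can yield duplicate column names once more than two frames are merged. It also starts from the first frame, because a truly empty DataFrame has no `Date` column to merge on.

```python
import pandas as pd

# Hypothetical sample frames standing in for the `out` built on each loop pass.
frames = [
    pd.DataFrame({"Date": ["1/1/2019", "1/2/2019", "1/3/2019"], "Count": [1, 1, 1]}),
    pd.DataFrame({"Date": ["1/2/2019", "1/3/2019", "1/4/2019"], "Count": [1, 1, 1]}),
    pd.DataFrame({"Date": ["1/1/2019", "1/4/2019"], "Count": [1, 1]}),
]

final = None
for i, out in enumerate(frames):
    # Give each Count column a unique name so repeated merges never collide.
    out = out.rename(columns={"Count": f"Count_{i}"})
    # The first frame seeds the result; later frames are outer-merged on Date.
    final = out if final is None else final.merge(out, on="Date", how="outer")

# Order rows chronologically (the dates are strings, so parse them for sorting).
final = final.sort_values("Date", key=pd.to_datetime).reset_index(drop=True)
print(final)
```

Dates missing from a given frame show up as NaN in that frame's count column, which matches the layout asked for in the question.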
Answer 1 (score: 0)
As I understand it, you want to combine the Date columns so that the first Date column has no missing values.
Here is the input data:
df = pd.read_clipboard()
print(df)
Date Count Date.1 Count.1 Date.2 Count.2 Date.3 Count.3
0 1/1/2019 1.0 1/1/2019 1.0 NaN NaN 1/1/2019 1.0
1 1/2/2019 1.0 1/2/2019 1.0 1/2/2019 1.0 1/2/2019 1.0
2 1/3/2019 1.0 1/3/2019 1.0 1/3/2019 1.0 1/3/2019 1.0
3 NaN NaN 1/4/2019 1.0 1/4/2019 1.0 1/4/2019 1.0
4 1/5/2019 1.0 1/5/2019 1.0 1/5/2019 1.0 1/5/2019 1.0
5 1/6/2019 1.0 1/6/2019 1.0 1/6/2019 1.0 NaN NaN
6 NaN NaN 1/7/2019 1.0 1/7/2019 1.0 1/7/2019 1.0
7 1/8/2019 1.0 1/8/2019 1.0 NaN NaN 1/8/2019 1.0
8 1/9/2019 1.0 1/9/2019 1.0 1/9/2019 1.0 1/9/2019 1.0
9 NaN NaN NaN NaN 1/10/2019 1.0 1/10/2019 1.0
10 1/11/2019 1.0 1/11/2019 1.0 1/11/2019 1.0 1/11/2019 1.0
11 1/12/2019 1.0 1/12/2019 1.0 1/12/2019 1.0 1/12/2019 1.0
12 1/13/2019 1.0 1/13/2019 1.0 1/13/2019 1.0 NaN NaN
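One possible approach is to fill the NaN values in the Date column from the other Date columns, one column at a time (with this data, Date.3 turns out not to be needed):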
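# Assign the result back instead of using inplace=True on a column selection,
# which is not guaranteed to update df in recent pandas versions.
df['Date'] = df['Date'].fillna(df['Date.1'])
df['Date'] = df['Date'].fillna(df['Date.2'])
# The extra Date columns are now redundant.
df = df.drop(['Date.1','Date.2','Date.3'], axis=1)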
Output:
print(df)
Date Count Count.1 Count.2 Count.3
0 1/1/2019 1.0 1.0 NaN 1.0
1 1/2/2019 1.0 1.0 1.0 1.0
2 1/3/2019 1.0 1.0 1.0 1.0
3 1/4/2019 NaN 1.0 1.0 1.0
4 1/5/2019 1.0 1.0 1.0 1.0
5 1/6/2019 1.0 1.0 1.0 NaN
6 1/7/2019 NaN 1.0 1.0 1.0
7 1/8/2019 1.0 1.0 NaN 1.0
8 1/9/2019 1.0 1.0 1.0 1.0
9 1/10/2019 NaN NaN 1.0 1.0
10 1/11/2019 1.0 1.0 1.0 1.0
11 1/12/2019 1.0 1.0 1.0 1.0
12 1/13/2019 1.0 1.0 1.0 NaN
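If the number of Date columns is not known in advance, the same idea can be written as a loop. This is only a sketch: the small frame below is a made-up miniature of the data above, and it assumes every extra date column has a name starting with "Date", as read_clipboard produced here.

```python
import pandas as pd

# Hypothetical miniature of the wide frame shown above.
df = pd.DataFrame({
    "Date":    ["1/1/2019", None, None],
    "Count":   [1.0, None, None],
    "Date.1":  ["1/1/2019", "1/2/2019", None],
    "Count.1": [1.0, 1.0, None],
    "Date.2":  [None, "1/2/2019", "1/3/2019"],
    "Count.2": [None, 1.0, 1.0],
})

# Every column whose name starts with "Date" (assumption: Date, Date.1, ...).
date_cols = [c for c in df.columns if c.startswith("Date")]

# Fill gaps in the first Date column from each later Date column in turn.
date = df[date_cols[0]]
for col in date_cols[1:]:
    date = date.fillna(df[col])

df["Date"] = date
# Drop the now-redundant extra Date columns.
df = df.drop(columns=date_cols[1:])
print(df)
```

A row keeps NaN in Date only if every Date column was empty for that row.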