I have multiple datasets, each with multiple columns, say data1:-
datetime ch1[d1] ch2[d1] ......
2019-07-27 16:40:28 20 21
2019-07-27 16:41:28 22 13
2019-07-27 16:42:28 12 21
.
.
.
.
data2:-
datetime ch1[d2] .....
2019-07-27 16:43:28 20
2019-07-27 16:44:28 22
2019-07-27 16:45:28 12
.
.
.
.
Final data:-
datetime ch1[d1] ch2[d1] ch1[d2] ......
2019-07-27 16:40:28 20 21 nan
2019-07-27 16:41:28 22 13 nan
2019-07-27 16:42:28 12 21 nan
2019-07-27 16:43:28 nan nan 20
2019-07-27 16:44:28 nan nan 22
2019-07-27 16:45:28 nan nan 12
.
.
.
.
Any suggestions on how to achieve this? I am using Java as the programming language, but I am open to any third-party utility that might be useful.
Answer 0 (score: 0)
As mentioned in the comments above, you can do what you are trying to do with an outer join, shown here with pandas:
import pandas as pd

# First dataset: two channels, indexed by timestamp
data1 = pd.DataFrame({"datetime": ["2019-07-27 16:40:28", "2019-07-27 16:41:28", "2019-07-27 16:42:28"],
                      "ch1d1": [20, 22, 12],
                      "ch2d1": [21, 13, 21]})
data1 = data1.set_index("datetime")
data1
                     ch1d1  ch2d1
datetime
2019-07-27 16:40:28     20     21
2019-07-27 16:41:28     22     13
2019-07-27 16:42:28     12     21
# Second dataset: one channel, with timestamps that do not overlap data1
data2 = pd.DataFrame({"datetime": ["2019-07-27 16:43:28", "2019-07-27 16:44:28", "2019-07-27 16:45:28"],
                      "ch1d2": [20, 22, 12]})
data2 = data2.set_index("datetime")
data2
                     ch1d2
datetime
2019-07-27 16:43:28     20
2019-07-27 16:44:28     22
2019-07-27 16:45:28     12
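# Outer join on the index keeps every timestamp from both frames;
# cells with no matching row are filled with NaN (so integers become floats)
data1.join(data2, how="outer")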
                     ch1d1  ch2d1  ch1d2
datetime
2019-07-27 16:40:28   20.0   21.0    NaN
2019-07-27 16:41:28   22.0   13.0    NaN
2019-07-27 16:42:28   12.0   21.0    NaN
2019-07-27 16:43:28    NaN    NaN   20.0
2019-07-27 16:44:28    NaN    NaN   22.0
2019-07-27 16:45:28    NaN    NaN   12.0
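If pandas is not an option (the question mentions Java), the same outer join can be written by hand. The following is only a rough sketch using the Python standard library, not a pandas API: the helper `outer_join` and the dict-of-dicts layout are my own assumptions for illustration. Each dataset maps a timestamp string to its column values, and the result takes the union of all timestamps, filling missing cells with NaN.

```python
import math

# Hypothetical in-memory layout: timestamp -> {column name: value}
data1 = {
    "2019-07-27 16:40:28": {"ch1d1": 20, "ch2d1": 21},
    "2019-07-27 16:41:28": {"ch1d1": 22, "ch2d1": 13},
    "2019-07-27 16:42:28": {"ch1d1": 12, "ch2d1": 21},
}
data2 = {
    "2019-07-27 16:43:28": {"ch1d2": 20},
    "2019-07-27 16:44:28": {"ch1d2": 22},
    "2019-07-27 16:45:28": {"ch1d2": 12},
}


def outer_join(*tables):
    """Outer-join tables keyed by timestamp; missing cells become NaN."""
    # Collect column names in first-seen order across all tables
    columns = []
    for table in tables:
        for row in table.values():
            for name in row:
                if name not in columns:
                    columns.append(name)

    # Union of all timestamps; ISO-style strings sort chronologically
    keys = sorted(set().union(*tables))

    merged = {}
    for key in keys:
        row = {}
        for table in tables:
            row.update(table.get(key, {}))
        merged[key] = [row.get(name, math.nan) for name in columns]
    return columns, merged


columns, merged = outer_join(data1, data2)
print("datetime", *columns)
for key, values in merged.items():
    print(key, *values)
```

The same structure should carry over to Java fairly directly, for example with a `TreeMap<String, Map<String, Double>>` to keep the timestamps sorted, though I have not tested a Java version.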