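I don't know how to merge on the index and several column names at the same time.
I have dates in the index and 3 columns that serve as merge keys.
The expected result should be: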
                          A  B  C    x    y
timestamp
2019-06-10T20:00:00.000Z  a  b  c  1.0  1.0
2019-06-10T21:00:00.000Z  a  b  c  1.0  NaN
This is what I get instead:
                          A  B  C    x    y
timestamp
2019-06-10T20:00:00.000Z  a  b  c  NaN  1.0
2019-06-10T21:00:00.000Z  a  b  c  1.0  NaN
2019-06-10T21:00:00.000Z  a  b  c  1.0  NaN
Here is my code:
import pandas as pd
data_list = []
left = {}
left['timestamp'] = '2019-06-10T20:00:00.000Z'
left['A'] = 'a'
left['B'] = 'b'
left['C'] = 'c'
left['x'] = 1
data_list.append(left)
left['timestamp'] = '2019-06-10T21:00:00.000Z'
left['A'] = 'a'
left['B'] = 'b'
left['C'] = 'c'
left['x'] = 1
data_list.append(left)
df_left = pd.DataFrame(data_list)
df_left = df_left.set_index('timestamp')
print(df_left.to_string())
print()
data_list = []
right = {}
right['timestamp'] = '2019-06-10T20:00:00.000Z'
right['A'] = 'a'
right['B'] = 'b'
right['C'] = 'c'
right['y'] = 1
data_list.append(right)
df_right = pd.DataFrame(data_list)
df_right = df_right.set_index('timestamp')
merged_df = pd.merge(df_left, df_right, left_index=True, right_index=True, on=['A','B','C'], how="outer")
print(merged_df.to_string())
Answer 0 (score: -1)
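Your df_left DataFrame is built incorrectly.
It contains the same timestamp twice (2019-06-10T21:00:00.000Z), because the same `left` dict is appended to data_list twice and then modified, so both list entries point to one object. You could build the DataFrames as in the code below instead. Hope this helps!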
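To see where the duplicate comes from, here is a minimal sketch that reproduces the aliasing and shows one possible fix. The names `rows`, `aliased` and `fixed_rows` are only for illustration and do not appear in the question.

```python
import pandas as pd

# Appending the same dict twice stores two references to one object,
# so the later assignment also overwrites the "first" row.
rows = []
left = {}
left['timestamp'] = '2019-06-10T20:00:00.000Z'
left['x'] = 1
rows.append(left)
left['timestamp'] = '2019-06-10T21:00:00.000Z'
rows.append(left)
print(rows[0] is rows[1])  # the two list entries are the same dict object
aliased = pd.DataFrame(rows)
print(aliased.to_string())

# One possible fix: create a fresh dict for every row.
fixed_rows = [
    {'timestamp': '2019-06-10T20:00:00.000Z', 'A': 'a', 'B': 'b', 'C': 'c', 'x': 1},
    {'timestamp': '2019-06-10T21:00:00.000Z', 'A': 'a', 'B': 'b', 'C': 'c', 'x': 1},
]
df_left = pd.DataFrame(fixed_rows).set_index('timestamp')
print(df_left.to_string())
```

Re-assigning `left = {}` before filling in the second row would work just as well. The version that follows builds each frame from a dict of lists.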
import pandas as pd

# Build each frame from a dict of lists so every row gets its own values
left = {
    'timestamp': ['2019-06-10T20:00:00.000Z', '2019-06-10T21:00:00.000Z'],
    'A': ['a', 'a'],
    'B': ['b', 'b'],
    'C': ['c', 'c'],
    'x': [1.0, 1.0],
}
df_left = pd.DataFrame(left)
df_left

right = {
    # same timestamp as the right-hand frame in the question
    'timestamp': ['2019-06-10T20:00:00.000Z'],
    'A': ['a'],
    'B': ['b'],
    'C': ['c'],
    'y': [1.0],
}
df_right = pd.DataFrame(right)
df_right

# With no `on` argument, merge joins on all shared columns: timestamp, A, B, C
merged_df = df_left.merge(df_right, how='outer')
merged_df
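The merge above leaves timestamp as an ordinary column, while the question wants it as the index. A minimal sketch of one way to get there, assuming the data from the question: move the index into a column, merge on all four keys, then restore the index. The `keys` list is a name introduced here for illustration. As far as I know, recent pandas versions raise a MergeError when `on` is combined with `left_index`/`right_index`, which is why the sketch avoids that combination.

```python
import pandas as pd

# Frames shaped like the ones in the question: timestamp is the index.
df_left = pd.DataFrame(
    {'A': ['a', 'a'], 'B': ['b', 'b'], 'C': ['c', 'c'], 'x': [1.0, 1.0]},
    index=pd.Index(
        ['2019-06-10T20:00:00.000Z', '2019-06-10T21:00:00.000Z'],
        name='timestamp',
    ),
)
df_right = pd.DataFrame(
    {'A': ['a'], 'B': ['b'], 'C': ['c'], 'y': [1.0]},
    index=pd.Index(['2019-06-10T20:00:00.000Z'], name='timestamp'),
)

# Turn the index into a regular column, merge on every key,
# then make timestamp the index again.
keys = ['timestamp', 'A', 'B', 'C']
merged_df = (
    df_left.reset_index()
    .merge(df_right.reset_index(), on=keys, how='outer')
    .set_index('timestamp')
)
print(merged_df.to_string())
```

With this data the result should have one row per timestamp, with y missing for the 21:00 row, which is the shape the question asks for.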