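I have tried several different ways to horizontally concatenate DataFrame objects from the Python data analysis library (pandas), but so far all of my attempts have failed.
I have two DataFrames:
df_1: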
      col2    col3
col1
str1     1  1.5728
str2     2  2.4627
str3     3  3.6143
df_2:
      col2    col3
col1
str1     4  4.5345
str2     5  5.1230
str3     6  6.1233
I would like the final DataFrame to be df_1 and df_2 side by side:
      col2    col3  col1  col2    col3
col1
str1     1  1.5728  str1     4  4.5345
str2     2  2.4627  str2     5  5.1230
str3     3  3.6143  str3     6  6.1233
Here is some code that creates the DataFrames:
import pandas as pd
column_headers = ["col1", "col2", "col3"]
d_1 = dict.fromkeys(column_headers)
d_1["col1"] = ["str1", "str2", "str3"]
d_1["col2"] = [1, 2, 3]
d_1["col3"] = [1.5728, 2.4627, 3.6143]
df_1 = pd.DataFrame(d_1)
df_1 = df_1.set_index("col1")
print("df_1:")
print(df_1)
print()
d_2 = dict.fromkeys(column_headers)
d_2["col1"] = ["str1", "str2", "str3"]
d_2["col2"] = [4, 5, 6]
d_2["col3"] = [4.5345, 5.123, 6.1233]
df_2 = pd.DataFrame(d_2)
df_2 = df_2.set_index("col1")
print("df_2:")
print(df_2)
print()
An outer join does not concatenate df_1 and df_2 horizontally:
merged_df = df_1.join(df_2, how='outer')
We get the following error message:
ValueError: columns overlap but no suffix specified: Index(['col2', 'col3'], dtype='object')
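The error arises because both frames have columns named col2 and col3, and join will not produce overlapping labels unless it is told how to rename them. A minimal sketch of the suffix route, where the suffixes "_1" and "_2" are chosen here purely for illustration:

```python
import pandas as pd

# Rebuild the two frames from the question so the snippet runs on its own.
df_1 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}
).set_index("col1")
df_2 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}
).set_index("col1")

# lsuffix/rsuffix rename the overlapping columns; "_1" and "_2" are illustrative names.
joined_df = df_1.join(df_2, how="outer", lsuffix="_1", rsuffix="_2")
print(joined_df)
```

This changes the column names, so it only fits if renamed columns are acceptable in the result.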
Making a dictionary of dictionaries does not work:
# Make a dictionary of dictionaries
merged_d = dict()
merged_d[1] = d_1
merged_d[2] = d_2
merged_df = pd.DataFrame(merged_d)
print(merged_df)
The resulting DataFrame looks like this:
                             1                        2
col1        [str1, str2, str3]       [str1, str2, str3]
col2                 [1, 2, 3]                [4, 5, 6]
col3  [1.5728, 2.4627, 3.6143]  [4.5345, 5.123, 6.1233]
Making a dictionary of DataFrames does not seem to work either:
merged_d = dict()
merged_d[1] = df_1
merged_d[2] = df_2
merged_df = pd.DataFrame(merged_d)
print(merged_df)
We get the following error message:
ValueError: If using all scalar values, you must pass an index
Passing an index to the DataFrame constructor does not help much:
merged_df = pd.DataFrame(data = merged_d, index = [1, 2])
We get the error:
ValueError: cannot copy sequence with size 2 to array axis with dimension 3
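A dictionary of DataFrames can still be combined, just not through the DataFrame constructor. As a sketch, pd.concat accepts a mapping and uses its keys as an outer column level; the keys 1 and 2 mirror the dictionary built above, and the frames are rebuilt here only so the snippet is self-contained:

```python
import pandas as pd

# Rebuild the two frames from the question so the snippet runs on its own.
df_1 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}
).set_index("col1")
df_2 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}
).set_index("col1")

# The mapping's keys (1 and 2) become the top level of a column MultiIndex.
merged_df = pd.concat({1: df_1, 2: df_2}, axis=1)
print(merged_df)

# Selecting a key returns that frame's columns again.
print(merged_df[2])
```

The two-level columns keep the frames distinguishable, at the cost of a MultiIndex that downstream code has to handle.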
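Answer 0 (score: 6)
Use concat with axis=1 instead of a join or merge. It aligns the frames on their index and keeps both sets of columns, even when the names overlap: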
ndf = pd.concat([df_1, df_2], axis=1)
      col2    col3  col2    col3
col1
str1     1  1.5728     4  4.5345
str2     2  2.4627     5  5.1230
str3     3  3.6143     6  6.1233
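The output above differs slightly from the layout requested in the question, which repeats col1 as an ordinary column before the second frame's values. One possible way to get that layout, sketched under the assumption that duplicate column labels are acceptable, is to keep col1 as both index and column on the right-hand frame before concatenating (the names right and side_by_side are illustrative):

```python
import pandas as pd

# Rebuild the two frames from the question so the snippet runs on its own.
df_1 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}
).set_index("col1")
df_2 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}
).set_index("col1")

# Turn col1 back into a column, then index by it again without dropping it.
right = df_2.reset_index().set_index("col1", drop=False)

# Concatenating along axis=1 aligns on the shared col1 index.
side_by_side = pd.concat([df_1, right], axis=1)
print(side_by_side)

# With duplicate labels, selecting "col2" returns both matching columns.
print(side_by_side["col2"])
```

Because col2 and col3 each appear twice, label-based selection returns a two-column DataFrame rather than a Series, which is worth keeping in mind before relying on this layout.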