Question

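I have tried several different ways to horizontally concatenate DataFrame objects from the Python data analysis library (pandas), but so far all of my attempts have failed.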

Expected output for the given input:

I have two dataframes:
df_1:

      col2    col3
col1                
str1     1  1.5728
str2     2  2.4627
str3     3  3.6143

df_2:

      col2    col3
col1              
str1     4  4.5345
str2     5  5.1230
str3     6  6.1233

I would like the final dataframe to show df_1 and df_2 side by side:

      col2    col3    col1  col2   col3
col1                                  
str1     1  1.5728    str1     4  4.5345
str2     2  2.4627    str2     5  5.1230
str3     3  3.6143    str3     6  6.1233

Creating the test input:

Here is some code to create the dataframes:

import pandas as pd

column_headers = ["col1", "col2", "col3"]
d_1 = dict.fromkeys(column_headers)
d_1["col1"] = ["str1", "str2", "str3"]
d_1["col2"] = [1, 2, 3]
d_1["col3"] = [1.5728, 2.4627, 3.6143]
df_1 = pd.DataFrame(d_1)
df_1 = df_1.set_index("col1")
print("df_1:")
print(df_1)
print()


d_2 = dict.fromkeys(column_headers)
d_2["col1"] = ["str1", "str2", "str3"]
d_2["col2"] = [4, 5, 6]
d_2["col3"] = [4.5345, 5.123, 6.1233]
df_2 = pd.DataFrame(d_2)
df_2 = df_2.set_index("col1")
print("df_2:")
print(df_2)
print()

Failed attempts:

Failed attempt 1:

An outer join does not concatenate df_1 and df_2 horizontally:

merged_df = df_1.join(df_2, how='outer')

We get the following error message:

ValueError: columns overlap but no suffix specified: Index(['col2', 'col3'], dtype='object')
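
The error message itself hints at one way to make the join go through: give the overlapping column names a suffix. A minimal sketch, assuming the same data as above (rebuilt here so the snippet runs on its own); the suffixes "_left" and "_right" are arbitrary labels chosen for illustration and are not part of the original attempt:

```python
import pandas as pd

# Rebuild the two test frames, indexed by col1 as in the question.
index = pd.Index(["str1", "str2", "str3"], name="col1")
df_1 = pd.DataFrame({"col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}, index=index)
df_2 = pd.DataFrame({"col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}, index=index)

# lsuffix/rsuffix rename the overlapping columns so the join no longer raises.
# The suffix strings are hypothetical names picked for this example.
joined_df = df_1.join(df_2, how="outer", lsuffix="_left", rsuffix="_right")
print(joined_df)
```

Note that this renames the columns, so it is close to, but not exactly, the side-by-side layout shown in the expected output.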

Failed attempt 2:

Making a dictionary of dictionaries does not work:

# Make a dictionary of dictionaries
merged_d = dict()
merged_d[1] = d_1
merged_d[2] = d_2
merged_df = pd.DataFrame(merged_d)
print(merged_df)

The resulting DataFrame looks like this:

                             1                        2
col1        [str1, str2, str3]       [str1, str2, str3]
col2                 [1, 2, 3]                [4, 5, 6]
col3  [1.5728, 2.4627, 3.6143]  [4.5345, 5.123, 6.1233]
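
The output above is consistent with how the DataFrame constructor appears to read a nested dictionary: the outer keys become column labels, the inner keys become row labels, and each inner value is stored as a single cell. A small sketch of that behaviour, using a cut-down nested dictionary (the variable names nested and frame are made up for this example):

```python
import pandas as pd

# Outer keys (1, 2) become columns; inner keys ("col1", "col2") become the index.
nested = {
    1: {"col1": ["str1", "str2", "str3"], "col2": [1, 2, 3]},
    2: {"col1": ["str1", "str2", "str3"], "col2": [4, 5, 6]},
}
frame = pd.DataFrame(nested)

# Each cell holds an entire Python list rather than one value per row.
print(frame.shape)
print(type(frame.loc["col2", 1]))
```

Because every list lands in one cell, the lists are never expanded into rows, which is why this route cannot yield the side-by-side layout.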

Failed attempt 3:

Subattempt 3a:

Making a dictionary of DataFrames does not seem to work either:

merged_d = dict()
merged_d[1] = df_1
merged_d[2] = df_2
merged_df = pd.DataFrame(merged_d)
print(merged_df)

We get the following error message:

ValueError: If using all scalar values, you must pass an index

Subattempt 3b:

Passing an index to the DataFrame constructor does not help much:

merged_df = pd.DataFrame(data = merged_d, index = [1, 2])

We get the error:

ValueError: cannot copy sequence with size 2 to array axis with dimension 3

Answer 1

Use concat with axis=1 instead of merge:

ndf = pd.concat([df_1, df_2], axis=1)

     col2    col3  col2    col3
col1                            
str1     1  1.5728     4  4.5345
str2     2  2.4627     5  5.1230
str3     3  3.6143     6  6.1233
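
Two variations may be worth knowing, sketched here under the assumption that the frames are built as in the question. First, pd.concat also accepts a mapping of DataFrames, in which case the keys label each block in a two-level column index; this is roughly what the dictionary-of-DataFrames attempt was aiming for. Second, the expected output in the question repeats col1 as a regular column in the right-hand block, which plain concat does not do; copying the index into a column first gets that layout. The names labelled_df, right and side_by_side are illustrative only:

```python
import pandas as pd

# Rebuild the two test frames, indexed by col1 as in the question.
index = pd.Index(["str1", "str2", "str3"], name="col1")
df_1 = pd.DataFrame({"col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}, index=index)
df_2 = pd.DataFrame({"col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}, index=index)

# Variation 1: a mapping of frames; the keys 1 and 2 become the top column level.
labelled_df = pd.concat({1: df_1, 2: df_2}, axis=1)
print(labelled_df)

# Variation 2: copy df_2's index into a leading "col1" column, then concatenate.
right = df_2.copy()
right.insert(0, "col1", right.index)
side_by_side = pd.concat([df_1, right], axis=1)
print(side_by_side)
```

With the labelled version, labelled_df[2] selects the block that came from df_2, which avoids the ambiguity of having two columns both named col2.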

How to horizontally concatenate pandas DataFrames in Python