I have two datasets:
df1:
Name Answers Questions People-reached Reputation
Alex Gaynor 154 44 ~1.4m 8,871
df2:
Project Total-score Post
python 337 93
django-templates 22 4
slug 12 1
google-app-engine 8 1
django 235 57
clang 22 2
Is there any way in Python (pandas or another library) to merge the two data frames so that df2 becomes a new column in df1?
The desired output is:
Name Answers Questions People-reached Reputation Project-details
Alex Gaynor 154 44 ~1.4m 8,871 python 337 93
django-templates 22 4
slug 12 1
google-app-engine 8 1
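Answer 0 (score: 1)
If you need to preserve the column structure of the added fields, you can create MultiIndex columns.
If you only need to store the information from df2 as a column in df1, you can create a single column holding a list that contains df2.values.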
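To try either option, the two frames from the question have to exist first. The following is a minimal sketch of how they could be built by hand; it assumes that the values containing "~" or "," are kept as plain strings, which the question does not state explicitly.

```python
import pandas as pd

# Values copied from the question. "~1.4m" and "8,871" are stored as
# strings here (an assumption; the question does not give the dtypes).
df1 = pd.DataFrame(
    {
        "Name": ["Alex Gaynor"],
        "Answers": [154],
        "Questions": [44],
        "People-reached": ["~1.4m"],
        "Reputation": ["8,871"],
    }
)

df2 = pd.DataFrame(
    {
        "Project": [
            "python",
            "django-templates",
            "slug",
            "google-app-engine",
            "django",
            "clang",
        ],
        "Total-score": [337, 22, 12, 8, 235, 22],
        "Post": [93, 4, 1, 1, 57, 2],
    }
)

print(df1.shape, df2.shape)
```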
Option 1: Preserve the column structure
# first merge df1 and df2
df2.index = ["Alex Gaynor"] * len(df2)
merged = df1.merge(df2, left_on="Name", right_index=True)
# now create multi-index columns
top_lvl = df1.columns.tolist() + ["project_details"]*3
bottom_lvl = [" "]*len(df1.columns) + df2.columns.tolist()
merged.columns = [top_lvl, bottom_lvl]
merged
Name Answers Questions People-reached Reputation project_details \
Project
0 Alex Gaynor 154 44 ~1.4m 8,871 python
0 Alex Gaynor 154 44 ~1.4m 8,871 django-templates
0 Alex Gaynor 154 44 ~1.4m 8,871 slug
0 Alex Gaynor 154 44 ~1.4m 8,871 google-app-engine
0 Alex Gaynor 154 44 ~1.4m 8,871 django
0 Alex Gaynor 154 44 ~1.4m 8,871 clang
Total-score Post
0 337 93
0 22 4
0 12 1
0 8 1
0 235 57
0 22 2
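Once the columns are a MultiIndex, the added fields can be selected as a group through the top-level label. The sketch below repeats the Option 1 steps on a shortened version of the data (two columns of df1 and two rows of df2, chosen only to keep the example small) and builds the MultiIndex explicitly with pd.MultiIndex.from_arrays instead of assigning a list of lists. The name df2_keyed is a hypothetical helper introduced so that the original df2 is not modified.

```python
import pandas as pd

# Shortened sample data, for illustration only.
df1 = pd.DataFrame({"Name": ["Alex Gaynor"], "Answers": [154]})
df2 = pd.DataFrame(
    {"Project": ["python", "slug"], "Total-score": [337, 12], "Post": [93, 1]}
)

# Key every row of df2 by the person it belongs to, then merge.
df2_keyed = df2.copy()
df2_keyed.index = ["Alex Gaynor"] * len(df2_keyed)
merged = df1.merge(df2_keyed, left_on="Name", right_index=True)

# Build the two column levels and attach them as a MultiIndex.
top_lvl = df1.columns.tolist() + ["project_details"] * len(df2.columns)
bottom_lvl = [" "] * len(df1.columns) + df2.columns.tolist()
merged.columns = pd.MultiIndex.from_arrays([top_lvl, bottom_lvl])

# Selecting the top-level label returns only the df2 fields.
details = merged["project_details"]
print(details.columns.tolist())

# A single sub-column is addressed with a (top, bottom) tuple.
total = merged[("project_details", "Total-score")].sum()
print(total)
```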
If you really need all of the df1 entries below the first row to be blank, you can do the following:
merged.iloc[1:, :5] = ""
merged
Name Answers Questions People-reached Reputation project_details \
Project
0 Alex Gaynor 154 44 ~1.4m 8,871 python
0 django-templates
0 slug
0 google-app-engine
0 django
0 clang
Total-score Post
0 337 93
0 22 4
0 12 1
0 8 1
0 235 57
0 22 2
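One caveat with blanking: writing an empty string into a numeric column such as Answers changes what that column can hold, and recent pandas versions warn about or reject such an assignment. A cautious sketch, assuming a small stand-in for the merged frame with plain (non-MultiIndex) columns, is to cast a copy to object dtype first. The name blanked is hypothetical.

```python
import pandas as pd

# Small stand-in for the merged frame, for illustration only.
merged = pd.DataFrame(
    {
        "Name": ["Alex Gaynor"] * 3,
        "Answers": [154] * 3,
        "Project": ["python", "slug", "clang"],
    },
    index=[0, 0, 0],
)

# Cast a copy to object dtype so "" can be written into the integer column.
blanked = merged.astype(object)

# Blank the first two columns (the df1 part) on every row after the first.
blanked.iloc[1:, :2] = ""

print(blanked)
```

The blanked frame is only suitable for display, since the df1 columns no longer hold numbers on every row.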
Option 2: Just store the df2 information in a column
df1["project_details"] = [df2.values]
df1
Name Answers Questions People-reached Reputation \
0 Alex Gaynor 154 44 ~1.4m 8,871
project_details
0 [[python, 337, 93], [django-templates, 22, 4],...
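Data stored this way can be turned back into a frame when needed. The sketch below is a variation on Option 2, not the exact code above: it stores df2.values.tolist() (a nested Python list) instead of the raw array, and uses shortened sample data. The name restored is hypothetical.

```python
import pandas as pd

# Shortened sample data, for illustration only.
df1 = pd.DataFrame({"Name": ["Alex Gaynor"], "Answers": [154]})
df2 = pd.DataFrame(
    {"Project": ["python", "slug"], "Total-score": [337, 12], "Post": [93, 1]}
)

# Store all rows of df2 as one nested list in the single row of df1.
df1["project_details"] = [df2.values.tolist()]

# Read the cell back and rebuild a DataFrame with the original column names.
stored = df1.loc[0, "project_details"]
restored = pd.DataFrame(stored, columns=df2.columns)

print(restored)
```

The column names of df2 are not stored in the cell, so they have to be supplied again when rebuilding.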
Answer 1 (score: 1)
You can convert the data frame to a string and then add that value to the first row of a new column:
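import numpy as np

# render df2 as a plain string, without index or header
df_string = df2.to_string(index=False, header=False)
# create the new column, cast to object dtype so it can hold a string
df1["project_details"] = np.nan
df1["project_details"] = df1["project_details"].astype(object)
# put df_string into the first row of the new column
df1.iloc[0, df1.columns.get_loc("project_details")] = df_string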
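The string approach is mainly useful for display, because the cell holds text rather than structured data. The sketch below, on shortened sample data, stores the string and then splits it back into rows to show what survives the round trip. The splitting step assumes that no value contains whitespace, which holds for the project names in the question; the name rows is hypothetical.

```python
import pandas as pd

# Shortened sample data, for illustration only.
df1 = pd.DataFrame({"Name": ["Alex Gaynor"], "Answers": [154]})
df2 = pd.DataFrame(
    {"Project": ["python", "slug"], "Total-score": [337, 12], "Post": [93, 1]}
)

# Render df2 as text and store it in the first row of a new column.
df_string = df2.to_string(index=False, header=False)
df1["project_details"] = None
df1.iloc[0, df1.columns.get_loc("project_details")] = df_string

# Split the stored text back into rows; every value comes back as a string.
rows = [line.split() for line in df1.loc[0, "project_details"].splitlines()]
print(rows)
```

Numeric values come back as strings, so they would need to be converted again before any calculation.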