Converting a DataFrame into a column of another DataFrame in Python

Time: 2018-08-17 02:54:35

Tags: python pandas dataframe

I have two datasets:

df1:

Name        Answers Questions People-reached Reputation  
Alex Gaynor   154        44          ~1.4m     8,871 

df2:

 Project               Total-score Post     
 python                    337      93  
 django-templates          22       4  
 slug                      12       1  
 google-app-engine         8        1  
 django                    235      57  
 clang                     22       2  
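
For anyone who wants to reproduce this, here is a minimal sketch of the two frames. It assumes the values are typed as displayed, so "~1.4m" and "8,871" stay strings while the plain counts are integers; the original dtypes are not stated in the question.

```python
import pandas as pd

# Values copied from the tables above. "~1.4m" and "8,871" are kept as
# strings because that is how they are displayed (an assumption).
df1 = pd.DataFrame({
    "Name": ["Alex Gaynor"],
    "Answers": [154],
    "Questions": [44],
    "People-reached": ["~1.4m"],
    "Reputation": ["8,871"],
})

df2 = pd.DataFrame({
    "Project": ["python", "django-templates", "slug",
                "google-app-engine", "django", "clang"],
    "Total-score": [337, 22, 12, 8, 235, 22],
    "Post": [93, 4, 1, 1, 57, 2],
})
```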

Is there a way in Python (pandas or another library) to combine the two DataFrames so that df2 becomes a new column in df1?

The desired output is:

Name       Answers     Questions   People-reached    Reputation   Project-details
Alex Gaynor   154        44          ~1.4m             8,871   python 337 93  
                                                              django-templates 22 4   
                                                               slug   12  1  
                                                              google-app-engine 8 1

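2 answers:

Answer 0: (score: 1)

If you need to keep the column structure of the added fields, you can create MultiIndex columns.

If you only need to store the information from df2 as a single column in df1, you can create a column holding a list that contains df2.values.

Option 1: keep the column structure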

# first merge df1 and df2
df2.index = ["Alex Gaynor"] * len(df2)
merged = df1.merge(df2, left_on="Name", right_index=True)

# now create multi-index columns
top_lvl = df1.columns.tolist() + ["project_details"] * len(df2.columns)
bottom_lvl = [" "] * len(df1.columns) + df2.columns.tolist()
merged.columns = [top_lvl, bottom_lvl]

merged

          Name Answers Questions People-reached Reputation    project_details  \
                                                                      Project   
0  Alex Gaynor     154        44          ~1.4m      8,871             python   
0  Alex Gaynor     154        44          ~1.4m      8,871   django-templates   
0  Alex Gaynor     154        44          ~1.4m      8,871               slug   
0  Alex Gaynor     154        44          ~1.4m      8,871  google-app-engine   
0  Alex Gaynor     154        44          ~1.4m      8,871             django   
0  Alex Gaynor     154        44          ~1.4m      8,871              clang   


  Total-score Post  
0         337   93  
0          22    4  
0          12    1  
0           8    1  
0         235   57  
0          22    2  

If you really need all df1 entries below the first row to be blank, you can do the following:

merged.iloc[1:, :5] = ""
merged
          Name Answers Questions People-reached Reputation    project_details  \
                                                                      Project   
0  Alex Gaynor     154        44          ~1.4m      8,871             python   
0                                                            django-templates   
0                                                                        slug   
0                                                           google-app-engine   
0                                                                      django   
0                                                                       clang   


  Total-score Post  
0         337   93  
0          22    4  
0          12    1  
0           8    1  
0         235   57  
0          22    2  
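
With the two-level columns in place, the df2 part can be pulled back out by its top-level key. The sketch below is self-contained and repeats the Option 1 steps on rebuilt sample frames. The sample values are retyped from the question, and the `right` and `projects` names are only illustrative.

```python
import pandas as pd

# Sample frames retyped from the question (dtypes are an assumption).
df1 = pd.DataFrame({"Name": ["Alex Gaynor"], "Answers": [154], "Questions": [44],
                    "People-reached": ["~1.4m"], "Reputation": ["8,871"]})
df2 = pd.DataFrame({"Project": ["python", "django-templates", "slug",
                                "google-app-engine", "django", "clang"],
                    "Total-score": [337, 22, 12, 8, 235, 22],
                    "Post": [93, 4, 1, 1, 57, 2]})

# Same merge as Option 1, done on a copy so df2 keeps its default index.
right = df2.copy()
right.index = ["Alex Gaynor"] * len(right)
merged = df1.merge(right, left_on="Name", right_index=True)

# Build the two-level column index explicitly.
top_lvl = df1.columns.tolist() + ["project_details"] * len(df2.columns)
bottom_lvl = [" "] * len(df1.columns) + df2.columns.tolist()
merged.columns = pd.MultiIndex.from_arrays([top_lvl, bottom_lvl])

# Selecting the top-level key returns the df2 columns as a sub-frame.
projects = merged["project_details"]
print(projects)
```

This is the main practical benefit of the MultiIndex layout: the project fields stay as real columns, so they can still be filtered or aggregated.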

Option 2: simply store the df2 information in a column

df1["project_details"] = [df2.values]
df1
          Name  Answers  Questions People-reached Reputation  \
0  Alex Gaynor      154         44          ~1.4m      8,871   

                                     project_details  
0  [[python, 337, 93], [django-templates, 22, 4],...  
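
If the stored table needs to be used as a DataFrame again later, it can be rebuilt from the cell. This is a rough sketch on sample frames retyped from the question. It stores a nested list (`df2.values.tolist()`) instead of the raw array, which is a small variation on the line above, and `cell` and `restored` are illustrative names.

```python
import pandas as pd

# Sample frames retyped from the question (dtypes are an assumption).
df1 = pd.DataFrame({"Name": ["Alex Gaynor"], "Answers": [154], "Questions": [44],
                    "People-reached": ["~1.4m"], "Reputation": ["8,871"]})
df2 = pd.DataFrame({"Project": ["python", "django-templates", "slug",
                                "google-app-engine", "django", "clang"],
                    "Total-score": [337, 22, 12, 8, 235, 22],
                    "Post": [93, 4, 1, 1, 57, 2]})

# Wrap the nested list in a one-element list so that the single df1 row
# receives the whole table as one cell.
df1["project_details"] = [df2.values.tolist()]

# The column names are not stored with the values, so they have to be
# supplied again when rebuilding the frame.
cell = df1.loc[0, "project_details"]
restored = pd.DataFrame(cell, columns=df2.columns)
print(restored)
```

Note that the column holds generic Python objects, so vectorized operations on the project fields are not available until the cell is unpacked like this.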

Answer 1: (score: 1)

You can convert the DataFrame to a string and then put that value in the first row of a new column:

import numpy as np

# convert df2 to a string
df_string = df2.to_string(index=False, header=False)

# make new column
df1["project_details"] = np.nan

# add df_string to first row in new column
df1.iloc[0, df1.columns.get_loc('project_details')] = df_string
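
On recent pandas versions, writing a string into a float column created from np.nan may trigger a dtype warning. A possible variation, sketched here on sample frames retyped from the question, creates the new column with object dtype first. The `lines` name is only illustrative.

```python
import pandas as pd

# Sample frames retyped from the question (dtypes are an assumption).
df1 = pd.DataFrame({"Name": ["Alex Gaynor"], "Answers": [154], "Questions": [44],
                    "People-reached": ["~1.4m"], "Reputation": ["8,871"]})
df2 = pd.DataFrame({"Project": ["python", "django-templates", "slug",
                                "google-app-engine", "django", "clang"],
                    "Total-score": [337, 22, 12, 8, 235, 22],
                    "Post": [93, 4, 1, 1, 57, 2]})

# Render df2 as plain text without the index or the header row.
df_string = df2.to_string(index=False, header=False)

# Create the column with object dtype so that a string can be stored in it.
df1["project_details"] = pd.Series([None] * len(df1), index=df1.index, dtype=object)

# Put the rendered text in the first row of the new column.
df1.loc[df1.index[0], "project_details"] = df_string

# Each project occupies one line of the stored string.
lines = df1.loc[0, "project_details"].splitlines()
print(len(lines))
```

The result is display-oriented: the cell is plain text, so the individual scores would have to be parsed back out before any further computation.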