Question

This problem is similar to a simple MySQL operation:

UPDATE hpaai_month_div t, fahafa_monthly s 
SET t.col1=s.col1 WHERE t.col2=s.col2 AND t.year=s.year AND t.month=s.month;

Data:

CSV A
month    year   col2      col1
abc      2000   DEFSSDS   190
def      2001   GHISFDS   210
ghi      2002   SJDYHGF   910

CSV B
month   year    col2      col1    stat_fips
abc     2000    DEFSSDS   0       a
def     2001    GHISFDS   0       b
ghi     2002    SJDYHGF   0       c


Resulting CSV:

month   year    col2      col1    stat_fips
abc     2000    DEFSSDS   190     a
def     2001    GHISFDS   210     b
ghi     2002    SJDYHGF   910     c

What I have so far

Code (does not work as expected):

   df_a = pd.read_csv('a.csv')
   df_b = pd.read_csv('b.csv')
   merged_df = pd.merge(df_a, df_b, on="col1", how="left")

   merged_df = pd.concat([merged_df], axis=1)
   merged_df.to_csv('final_output.csv', encoding='utf-8', index=False)
   print(open('final_output.csv').read())

How can I get the data shown in the resulting CSV?

Answer 1

It seems you need merge on the three shared key columns, then drop the column col1_ at the end:

# default inner join; the outer parentheses let the chained call span two lines
df = (pd.merge(df1, df2, on=['col2','year','month'], suffixes=('','_'))
        .drop('col1_', axis=1))
print (df)
  month  year     col2  col1 stat_fips
0   abc  2000  DEFSSDS   190         a
1   def  2001  GHISFDS   210         b
2   ghi  2002  SJDYHGF   910         c

df = pd.merge(df1, df2, on=['col2','year','month'])
print (df)
  month  year     col2  col1_x  col1_y stat_fips
0   abc  2000  DEFSSDS     190       0         a
1   def  2001  GHISFDS     210       0         b
2   ghi  2002  SJDYHGF     910       0         c


df = pd.merge(df1, df2, on=['col2','year','month'], suffixes=('','_'))
print (df)
  month  year     col2  col1  col1_ stat_fips
0   abc  2000  DEFSSDS   190      0         a
1   def  2001  GHISFDS   210      0         b
2   ghi  2002  SJDYHGF   910      0         c
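
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. It assumes the two files are comma-separated and inlines the sample rows from the question through io.StringIO instead of reading a.csv and b.csv; the names csv_a and csv_b are only illustrative.

```python
import io

import pandas as pd

# Sample data from the question, assumed to be comma-separated.
csv_a = """month,year,col2,col1
abc,2000,DEFSSDS,190
def,2001,GHISFDS,210
ghi,2002,SJDYHGF,910
"""
csv_b = """month,year,col2,col1,stat_fips
abc,2000,DEFSSDS,0,a
def,2001,GHISFDS,0,b
ghi,2002,SJDYHGF,0,c
"""

df1 = pd.read_csv(io.StringIO(csv_a))
df2 = pd.read_csv(io.StringIO(csv_b))

# Inner join on the three key columns. The empty left suffix keeps df1's
# col1 under its original name; df2's copy becomes col1_ and is dropped.
df = (
    pd.merge(df1, df2, on=["col2", "year", "month"], suffixes=("", "_"))
    .drop(columns="col1_")
)

# Serialize without the index, as in the question's to_csv call.
csv_text = df.to_csv(index=False)
print(csv_text)
```

To write the result to disk instead, pass a path such as 'final_output.csv' as the first argument of to_csv.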

Answer 2

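If you drop 'col1' from df_b beforehand, you can use merge with its default settings, because it then joins on the remaining shared columns (month, year, col2).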

df_a.merge(df_b.drop(columns='col1'))

  month  year     col2  col1 stat_fips
0   abc  2000  DEFSSDS   190         a
1   def  2001  GHISFDS   210         b
2   ghi  2002  SJDYHGF   910         c
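
One caveat: both answers use an inner join, so a row of CSV B with no match in CSV A disappears from the result, whereas the SQL UPDATE in the question would leave it untouched. If that difference matters, the following sketch shows one possible way to keep every row of df_b. It is not part of either answer; the sample data is modified on purpose (the 2002 row is removed from df_a), and the suffix '_a' is an arbitrary choice.

```python
import pandas as pd

# Hypothetical variation of the question's data: df_a lacks the 2002 row.
df_a = pd.DataFrame({
    "month": ["abc", "def"],
    "year": [2000, 2001],
    "col2": ["DEFSSDS", "GHISFDS"],
    "col1": [190, 210],
})
df_b = pd.DataFrame({
    "month": ["abc", "def", "ghi"],
    "year": [2000, 2001, 2002],
    "col2": ["DEFSSDS", "GHISFDS", "SJDYHGF"],
    "col1": [0, 0, 0],
    "stat_fips": ["a", "b", "c"],
})

keys = ["col2", "year", "month"]

# Left join from df_b so unmatched rows survive; df_a's col1 arrives as col1_a.
merged = df_b.merge(df_a, on=keys, how="left", suffixes=("", "_a"))

# Take df_a's value where a match exists, otherwise keep df_b's original.
# The left join introduces NaN, so cast back to the original dtype.
merged["col1"] = (
    merged["col1_a"].fillna(merged["col1"]).astype(df_b["col1"].dtype)
)
result = merged.drop(columns="col1_a")
print(result)
```

With the unmodified data from the question, where every row matches, this gives the same table as the inner-join versions.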

How to merge CSVs that share more than one column using Python pandas DataFrames, adding only the columns that differ
