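How do I change a column's value based on other columns in pandas?

Time: 2020-07-03 10:01:23

Tags: python python-3.x pandas dataframe

I am trying to change the value of one column based on the values of other columns. Could you help me with how to do this?

Example: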

table                sql                                                  object_type
VW_MDCL_INSIGT       select * from MEDAFF_REF_SPECTRUM.MEDICAL_INSIGHT    VIEW
TBL_MDCL_INSIGT      select * from medaff_ref_spectrum.CALL
FBMS_INTERACTION     select * from medaff_ref_spectrum.TERR
VW_FBMS_INTERACTIONS select * from MEDAFF_REF_SPECTRUM.FBMS_INTERACTIONS  VIEW

Expected output:

table                sql                                                            object_type
VW_MDCL_INSIGT       create or replace VW_MDCL_INSIGT as select * from MEDAFF_REF_SPECTRUM.MEDICAL_INSIGHT              VIEW
TBL_MDCL_INSIGT      select * from medaff_ref_spectrum.CALL
FBMS_INTERACTION     select * from medaff_ref_spectrum.TERR
VW_FBMS_INTERACTIONS create or replace VW_FBMS_INTERACTIONS as select * from MEDAFF_REF_SPECTRUM.FBMS_INTERACTIONS      VIEW

Wherever object_type = VIEW, the sql value should be prefixed with "create or replace " + table + " as ".

I tried the code below. Can you tell me what I am doing wrong?

import pandas as pd 
import csv


df = pd.read_csv("D:/Users/SPate233/Downloads/iMedical/sqoop/metadata_consump.txt", delimiter='|')
for index, row in df.iterrows():
    if(row['object_type'] == 'VIEW'):
        row['sql'] = 'create or replace '+row['tablename']+' as '+row['sql']

print(df['sql'])

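1 Answer:

Answer 0 (score: 0)

Your solution does not work because iterrows() can hand you a copy of each row rather than a view, so assigning to row is not guaranteed to change the DataFrame itself; see the helpful link for more on setting new DataFrame values while looping. A better option is np.where: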

import numpy as np

# Prefix sql with "create or replace <table> as " only where object_type is VIEW
df['sql'] = np.where(df['object_type'].eq('VIEW'),
                     'create or replace ' + df['table'] + ' as ' + df['sql'],
                     df['sql'])

print(df)

Output:

                  table                                                sql object_type
0        VW_MDCL_INSIGT  create or replace VW_MDCL_INSIGT as select * f...        VIEW
1       TBL_MDCL_INSIGT             select * from medaff_ref_spectrum.CALL            
2      FBMS_INTERACTION             select * from medaff_ref_spectrum.TERR            
3  VW_FBMS_INTERACTIONS  create or replace VW_FBMS_INTERACTIONS as sele...        VIEW
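
As an alternative to np.where, the same update can be written with a boolean mask and .loc. If a loop really is needed, writing back through .at on the DataFrame (rather than assigning to row) also works. The sketch below is only an illustration: the DataFrame is a hypothetical in-memory copy of the sample data from the question (the real data comes from a '|'-delimited file), and it assumes the column is named table, as in the sample, and that blank object_type cells are empty strings.

```python
import pandas as pd

# Hypothetical in-memory copy of the sample data shown in the question.
df = pd.DataFrame({
    "table": ["VW_MDCL_INSIGT", "TBL_MDCL_INSIGT",
              "FBMS_INTERACTION", "VW_FBMS_INTERACTIONS"],
    "sql": [
        "select * from MEDAFF_REF_SPECTRUM.MEDICAL_INSIGHT",
        "select * from medaff_ref_spectrum.CALL",
        "select * from medaff_ref_spectrum.TERR",
        "select * from MEDAFF_REF_SPECTRUM.FBMS_INTERACTIONS",
    ],
    "object_type": ["VIEW", "", "", "VIEW"],
})
looped = df.copy()  # separate copy for the loop-based variant

# Vectorized variant: select the VIEW rows and assign through .loc,
# which writes into df itself.
is_view = df["object_type"].eq("VIEW")
df.loc[is_view, "sql"] = (
    "create or replace " + df.loc[is_view, "table"]
    + " as " + df.loc[is_view, "sql"]
)

# Loop variant: read from row, but write through the DataFrame with .at.
for index, row in looped.iterrows():
    if row["object_type"] == "VIEW":
        looped.at[index, "sql"] = (
            "create or replace " + row["table"] + " as " + row["sql"]
        )

print(df["sql"].tolist())
```

The mask-based form is usually preferable on larger frames because it avoids a Python-level loop; the .at loop is shown only because it stays closest to the code in the question.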