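I am trying to change the value of one column based on the values of some other columns. Can you help me with how to do this?
Example: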
table                 sql                                                  object_type
VW_MDCL_INSIGT        select * from MEDAFF_REF_SPECTRUM.MEDICAL_INSIGHT    VIEW
TBL_MDCL_INSIGT       select * from medaff_ref_spectrum.CALL
FBMS_INTERACTION      select * from medaff_ref_spectrum.TERR
VW_FBMS_INTERACTIONS  select * from MEDAFF_REF_SPECTRUM.FBMS_INTERACTIONS  VIEW
Expected output:
table                 sql                                                                                            object_type
VW_MDCL_INSIGT        create or replace VW_MDCL_INSIGT as select * from MEDAFF_REF_SPECTRUM.MEDICAL_INSIGHT          VIEW
TBL_MDCL_INSIGT       select * from medaff_ref_spectrum.CALL
FBMS_INTERACTION      select * from medaff_ref_spectrum.TERR
VW_FBMS_INTERACTIONS  create or replace VW_FBMS_INTERACTIONS as select * from MEDAFF_REF_SPECTRUM.FBMS_INTERACTIONS  VIEW
Whenever object_type = VIEW, the sql value should simply be prefixed with create or replace + table + as.
I tried the following code. Can you tell me what I am doing wrong?
import pandas as pd
import csv
df = pd.read_csv("D:/Users/SPate233/Downloads/iMedical/sqoop/metadata_consump.txt", delimiter='|')
for index, row in df.iterrows():
    if(row['object_type'] == 'VIEW'):
        row['sql'] = 'create or replace '+row['tablename']+' as '+row['sql']
print(df['sql'])
Answer 0 (score: 0)
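Your solution does not work because you are modifying a copy of each DataFrame row: iterrows() yields every row as a separate Series, generally a copy, so assigning to row['sql'] never reaches df. See the helpful link for more on setting new DataFrame values while looping. It is better to use np.where, which picks the prefixed string where object_type is VIEW and the original sql everywhere else:
import numpy as np
df['sql'] = np.where(df['object_type'].eq('VIEW'), 'create or replace ' + df['table'] + ' as ' + df['sql'], df['sql'])
print(df)
Output (long sql values abbreviated with ...):
   table                 sql                                                   object_type
0  VW_MDCL_INSIGT        create or replace VW_MDCL_INSIGT as select ...        VIEW
1  TBL_MDCL_INSIGT       select * from medaff_ref_spectrum.CALL
2  FBMS_INTERACTION      select * from medaff_ref_spectrum.TERR
3  VW_FBMS_INTERACTIONS  create or replace VW_FBMS_INTERACTIONS as select ...  VIEW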
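If you would rather not pull in numpy, here are two other ways to write the values back, as a rough sketch rather than the definitive fix. It rebuilds the sample rows from the question by hand instead of reading the file, and it assumes the column is named table (as in the sample data) and that a blank object_type is an empty string; the names looped, masked and is_view are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question. Assumption: a blank object_type is "".
df = pd.DataFrame({
    "table": ["VW_MDCL_INSIGT", "TBL_MDCL_INSIGT",
              "FBMS_INTERACTION", "VW_FBMS_INTERACTIONS"],
    "sql": [
        "select * from MEDAFF_REF_SPECTRUM.MEDICAL_INSIGHT",
        "select * from medaff_ref_spectrum.CALL",
        "select * from medaff_ref_spectrum.TERR",
        "select * from MEDAFF_REF_SPECTRUM.FBMS_INTERACTIONS",
    ],
    "object_type": ["VIEW", "", "", "VIEW"],
})

# Variant 1: keep the loop, but write through the DataFrame with .at
# instead of assigning to the row object that iterrows() hands out.
looped = df.copy()
for index, row in df.iterrows():
    if row["object_type"] == "VIEW":
        looped.at[index, "sql"] = (
            "create or replace " + row["table"] + " as " + row["sql"]
        )

# Variant 2: no loop; select the VIEW rows with a boolean mask and
# assign only to those rows through .loc.
masked = df.copy()
is_view = masked["object_type"].eq("VIEW")
masked.loc[is_view, "sql"] = (
    "create or replace "
    + masked.loc[is_view, "table"]
    + " as "
    + masked.loc[is_view, "sql"]
)

# Print the full strings, one per line, without display truncation.
for statement in masked["sql"]:
    print(statement)
```

Both variants should leave the non-VIEW rows untouched. The mask version is usually the one to prefer on larger frames, since it avoids a Python-level loop over the rows.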