Answer 0 (score: 0)
You can do it like this:
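import numpy as np
import pandas as pd
df = pd.DataFrame([[1, 'select from [mary_flowers]'], [2, 'select from [esther_pots]'], [3, 'select from [somthing]']], columns=['item_id', 'Column_x'])
# Assumed stand-in for your second DataFrame; replace it with your real df2
df2 = pd.DataFrame({'view_name': ['mary_flowers', 'esther_pots']})
df['view_name'] = df['Column_x'].str.extract(r'\[(\w*)\]', expand=False)  # name inside the brackets
df.loc[~df['view_name'].isin(df2['view_name']), 'view_name'] = np.nan  # drop names not in df2
df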
Output:
   item_id                    Column_x     view_name
0        1  select from [mary_flowers]  mary_flowers
1        2   select from [esther_pots]   esther_pots
2        3      select from [somthing]           NaN
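Explanation: this extracts the table name from inside the [], then checks whether it appears in your second df; if it does not, the value is changed to np.nan.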
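To see what the pattern \[(\w*)\] captures on its own, here is a minimal sketch that uses Python's standard re module instead of pandas. The helper name first_bracketed_name and the sample strings are illustrative assumptions, not part of the original answer.

```python
import re

# Same pattern as in the answer: a literal '[', a captured run of word
# characters (letters, digits, underscore), then a literal ']'.
PATTERN = re.compile(r'\[(\w*)\]')

def first_bracketed_name(query):
    """Return the first name found inside square brackets, or None."""
    match = PATTERN.search(query)
    return match.group(1) if match else None

print(first_bracketed_name('select from [mary_flowers]'))  # mary_flowers
print(first_bracketed_name('select from mary_flowers'))    # None
# \w does not match dots or spaces, so this bracketed name is not captured
print(first_bracketed_name('select from [dbo.sales]'))     # None
```

If your view names can contain dots or spaces, a broader pattern such as \[([^\]]+)\] may be a better fit; whether that applies depends on your data.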
Edit: if 'Column_x' can contain more than one table name, use:
import pandas as pd
df = pd.DataFrame([[1, 'select from [mary_flowers] join [tom_trucks]'], [2, 'select from [esther_pots]'], [3, 'select from [somthing]']], columns=['item_id', 'Column_x'])
# Names to keep; with a second DataFrame this would be list(df2['view_name'])
names = ['mary_flowers', 'esther_pots', 'tom_trucks']
# Collect every bracketed name in each row as a list
df['view_name'] = df['Column_x'].str.findall(r'\[(\w*)\]')
# Keep only the names that appear in the allowed list
df['view_name'] = df['view_name'].map(lambda views: [v for v in views if v in names])
df
Output:
   item_id                                      Column_x                   view_name
0        1  select from [mary_flowers] join [tom_trucks]  [mary_flowers, tom_trucks]
1        2                     select from [esther_pots]               [esther_pots]
2        3                        select from [somthing]                          []
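The same filtering logic can be sketched for a single string without pandas, which may make the list comprehension easier to follow. The function name known_views and the allowed set are hypothetical; in the answer's setting the allowed names would come from the second DataFrame.

```python
import re

def known_views(query, allowed):
    """Return every bracketed name in `query` that is also in `allowed`."""
    found = re.findall(r'\[(\w*)\]', query)  # all bracketed names, in order
    return [name for name in found if name in allowed]

# Assumed stand-in for the names held in the second DataFrame
allowed = {'mary_flowers', 'esther_pots', 'tom_trucks'}

print(known_views('select from [mary_flowers] join [tom_trucks]', allowed))
# ['mary_flowers', 'tom_trucks']
print(known_views('select from [somthing]', allowed))
# []
```

Using a set for the allowed names keeps each membership check cheap, which can matter when the second DataFrame is large.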