How to check whether all substrings in a pandas column are the same?

Time: 2018-03-03 13:09:37

Tags: python string pandas

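I have a column, and I want to check whether all of its strings contain the substring anr12. How can I check this? And if every string does contain the same substring, how can I remove that specific substring?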

1 Answer:

Answer 0: (score: 0)

I think you need str.contains together with all to check that every value is True, and then str.replace:

import pandas as pd

df = pd.DataFrame({'A': ['123anr12', '345anr12']})
print(df)
          A
0  123anr12
1  345anr12

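# Remove 'anr12' only if every value in the column contains it;
# regex=False treats the substring as literal text
if df['A'].str.contains('anr12', regex=False).all():
    df['A'] = df['A'].str.replace('anr12', '', regex=False)
print(df)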
     A
0  123
1  345
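
The check-and-remove step can also be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: the name strip_if_all_contain and the extra column B are hypothetical, and it assumes the substring should be matched literally (regex=False) and that missing values count as "does not contain" (na=False).

```python
import pandas as pd


def strip_if_all_contain(series: pd.Series, sub: str) -> pd.Series:
    """Return `series` with `sub` removed, but only if every value contains it.

    Hypothetical helper: `sub` is matched as literal text (regex=False) and
    missing values are treated as not containing it (na=False).
    """
    if series.str.contains(sub, regex=False, na=False).all():
        return series.str.replace(sub, '', regex=False)
    return series


df = pd.DataFrame({'A': ['123anr12', '345anr12'],
                   'B': ['anr12x', 'yyy']})
df['A'] = strip_if_all_contain(df['A'], 'anr12')  # every value matches, so it is stripped
df['B'] = strip_if_all_contain(df['B'], 'anr12')  # 'yyy' does not match, so B is unchanged
print(df)
```

Returning the original series when the check fails keeps the call safe to use in a plain assignment, as above.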

EDIT1: You can use a dictionary to look up the substring for each column:

train_df = pd.DataFrame({'477': ['123nbf12', '34nbf12'],
                         '479': ['tt1', '32'],
                         '482': ['anr1234', '345anr12a12']})

obj_features = ['477', '479', '482']  # column names
substring = ['nbf', 'tt1', 'anr12']   # substring to remove from each column
d = dict(zip(obj_features, substring))  # map column name -> substring
print(d)
{'477': 'nbf', '479': 'tt1', '482': 'anr12'}

# Strip each substring only from columns where every value contains it
for k, v in d.items():
    if train_df[k].str.contains(v, regex=False).all():
        train_df[k] = train_df[k].str.replace(v, '', regex=False)
print(train_df)
     477  479     482
0  12312  tt1      34
1   3412   32  345a12
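
Note that str.contains and str.replace can interpret the pattern either as a regular expression or as a literal string, which matters once a substring contains characters such as '.'. The following is a small sketch with made-up data (the series s is not from the question) showing how the two interpretations can give different answers to the "do all values contain it?" check:

```python
import pandas as pd

s = pd.Series(['a.b1', 'axb2'])

# As a regular expression, '.' matches any character, so both rows match.
as_regex = s.str.contains('a.b', regex=True).all()

# As a literal substring, only the first row contains 'a.b'.
as_literal = s.str.contains('a.b', regex=False).all()

print(as_regex, as_literal)
```

Passing regex explicitly in both calls avoids relying on defaults, which for str.replace differ between pandas versions.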