Question

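I have a DataFrame df with a column Column1. If the value in Column1 is 'Not Applicable', return False; otherwise check whether the first two characters of the element are letters and return True or False.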

df['Column1'].apply(lambda x : False if x in ['Not Applicable'] else x[0:2] should be alphabetic)

How can I check, in the else part of the lambda function, whether the first two characters are letters?

Answer 1

This solution does not need a regular expression. To check whether the first two characters are letters, use the str.isalpha() method.

df['Column1'].apply(lambda x : False if x in ['Not Applicable'] else x[0:2].isalpha())
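One edge case to keep in mind: for a value shorter than two characters, x[0:2] is just that shorter string, so a single letter such as 'Z' passes isalpha(). The sketch below adds a length check on the assumption that such values should count as False; the sample data and the helper name first_two_alpha are hypothetical and not from the question.

```python
import pandas as pd

# Hypothetical sample data for illustration only
df = pd.DataFrame({'Column1': ['Not Applicable', 'AB123', 'A1B2', '12AB', 'Z']})

def first_two_alpha(value):
    """Return True only if value has at least two characters and both are letters."""
    if value == 'Not Applicable':
        return False
    prefix = value[:2]
    # Assumption: a one-character value does not qualify
    return len(prefix) == 2 and prefix.isalpha()

result = df['Column1'].apply(first_two_alpha)
print(result.tolist())
```

Note that str.isalpha() also accepts non-ASCII letters; if only A-Z and a-z should count, a regular expression is the stricter choice.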

As the OP requested, with re.match:

import re
df['Column1'].apply(lambda x : False if x in ['Not Applicable'] else re.match('[a-z]{2}', x[0:2].lower()) )

re.match returns a match object if the pattern matches and None otherwise, so you can rely on the truthiness of the return value.
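If the column should hold real booleans rather than match objects and None, the result can be wrapped in bool(). This is a minimal sketch with hypothetical sample data; the names pattern and flags are illustrative only.

```python
import re
import pandas as pd

# Hypothetical sample data for illustration only
df = pd.DataFrame({'Column1': ['Not Applicable', 'ab12', 'A1', '99x']})

# re.match anchors at the start of the string, so this tests the first two characters
pattern = re.compile(r'[A-Za-z]{2}')

flags = df['Column1'].apply(
    lambda x: False if x == 'Not Applicable' else bool(pattern.match(x))
)
print(flags.tolist())
```

Unlike str.isalpha(), the character class [A-Za-z] accepts ASCII letters only.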

Answer 2

I think you need numpy.where and str.isalpha, using indexing with str to take the first two characters:

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1':['Not Applicable dds','*7df Not Applicable','sd ds', '#@( 444']})

df['a'] = np.where(df['col1'].str.contains('Not Applicable'), False,
                   df['col1'].str[:2].str.isalpha())
print (df)
                  col1      a
0   Not Applicable dds  False
1  *7df Not Applicable  False
2                sd ds   True
3              #@( 444  False
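Note that str.contains flags any value that contains 'Not Applicable' as a substring, while the question compares the whole value. The following sketch contrasts the two readings on hypothetical data; which one is wanted is an assumption about the requirement.

```python
import pandas as pd

# Hypothetical sample data for illustration only
df = pd.DataFrame({'col1': ['Not Applicable', 'Not Applicable dds', 'sd ds', '#@( 444']})

# True where the first two characters are both letters
starts_alpha = df['col1'].str[:2].str.isalpha()

# Exact comparison: only the value 'Not Applicable' itself is excluded
exact = df['col1'].ne('Not Applicable') & starts_alpha

# Substring comparison: any value containing 'Not Applicable' is excluded
substring = ~df['col1'].str.contains('Not Applicable', regex=False) & starts_alpha

print(exact.tolist())
print(substring.tolist())
```

The two results differ only for 'Not Applicable dds', which starts with two letters but merely contains the phrase.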

Regex: first two characters as letters in Python
