How to pass a DataFrame column to the commonregex library and mask the data that matches the regex
import pandas as pd
from commonregex import CommonRegex
address = ['61 Park Street, Camden, ME, 04843, US', '1208 BEECHCRAFT BLVD', '6704 BEECHCRAFT', 'PO BOX 469', '6461 44TH AVE', '11026 BELLE HAVEN DR']
df = pd.DataFrame(address, columns = ['Address'])
parser = CommonRegex()
parser.street_addresses(df.Address)
It raises an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
df['new'] = df.Address.apply(lambda x: next(iter(parser.street_addresses(x)), ''))
print(df)
                                 Address                  new
0  61 Park Street, Camden, ME, 04843, US      XXXXXXXXXXXXXXX
1                   1208 BEECHCRAFT BLVD
2                        6704 BEECHCRAFT
3                             PO BOX 469
4                          6461 44TH AVE        XXXXXXXXXXXXX
5                   11026 BELLE HAVEN DR  XXXXXXXXXXXXXXXXXXX
Answer 0 (score: 1)
The parser methods expect a single string rather than a Series, so call them element-wise with Series.apply:
df['new'] = df.Address.apply(parser.street_addresses)
Or:
df['new'] = df.Address.apply(lambda x: parser.street_addresses(x))
Or a list comprehension:
df['new'] = [parser.street_addresses(x) for x in df.Address]
print(df)
                                 Address                    new
0  61 Park Street, Camden, ME, 04843, US      [61 Park Street,]
1                   1208 BEECHCRAFT BLVD                     []
2                        6704 BEECHCRAFT                     []
3                             PO BOX 469                     []
4                          6461 44TH AVE        [6461 44TH AVE]
5                   11026 BELLE HAVEN DR  [1026 BELLE HAVEN DR]
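If you want only the first matched value, use next with iter, passing an empty string as the default for rows where the list of matches is empty: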
df['new'] = df.Address.apply(lambda x: next(iter(parser.street_addresses(x)), ''))
print(df)
                                 Address                  new
0  61 Park Street, Camden, ME, 04843, US      61 Park Street,
1                   1208 BEECHCRAFT BLVD
2                        6704 BEECHCRAFT
3                             PO BOX 469
4                          6461 44TH AVE        6461 44TH AVE
5                   11026 BELLE HAVEN DR  1026 BELLE HAVEN DR
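The next/iter idiom used above can be looked at in isolation. The following is a minimal sketch in plain Python; the helper name first_match is hypothetical and only wraps the expression from the lambda so its behaviour on empty and non-empty lists is easy to see.

```python
def first_match(matches, default=""):
    """Return the first item of an iterable, or `default` if it is empty."""
    # iter() builds an iterator over the matches; next() takes its first item
    # and returns `default` instead of raising StopIteration when there is none.
    return next(iter(matches), default)


print(first_match(["6461 44TH AVE"]))  # a non-empty list yields its first item
print(repr(first_match([])))           # an empty list falls back to the default
```

Without the default argument, next would raise StopIteration on every row where no address was found, which is why the empty string is passed.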
Or, if needed, join all matched values in the list with a separator:
df['new'] = df.Address.apply(parser.street_addresses).str.join(', ')
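The snippets above extract the matches, while the question asks for them to be masked. One possible way to do that, as a sketch only, is to replace every matched substring with X characters of the same length. So that the example runs without commonregex installed, it uses a hypothetical, simplified stand-in pattern (STREET_RE, find_street_addresses); this is not the library's own regex, and mask_matches is likewise an illustrative name.

```python
import re

# Hypothetical, simplified stand-in for parser.street_addresses so the sketch
# runs on its own. It is NOT the pattern that commonregex actually uses.
STREET_RE = re.compile(
    r"\d{1,4} [\w\s]{1,20}(?:street|st|avenue|ave|road|rd|drive|dr)\W?(?=\s|$)",
    re.IGNORECASE,
)


def find_street_addresses(text):
    """Return every substring of `text` that the stand-in pattern matches."""
    return [match.strip() for match in STREET_RE.findall(text)]


def mask_matches(text, finder, mask_char="X"):
    """Replace each substring returned by `finder(text)` with mask characters."""
    masked = text
    for match in finder(text):
        # Keep the original length so the masked text lines up with the input.
        masked = masked.replace(match, mask_char * len(match))
    return masked


for address in ["6461 44TH AVE", "PO BOX 469"]:
    print(repr(mask_matches(address, find_street_addresses)))
```

With the real library, the same helper could presumably be applied per row as df['new'] = df.Address.apply(lambda x: mask_matches(x, parser.street_addresses)). Note that this masks the match inside the full address, whereas the table in the question shows only the masked match itself.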