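I'm building a small finance management program that imports my transactions from a CSV into Python. I want to assign values to a new column, 'category', based on strings found in the 'details' column. I can do this for a single string, but my problem is: what if I have many possible strings? For example, str.contains('RALPHS') would set the column value to 'groceries', and so on.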
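For example, given a list of strings such as:
dining = ['CARLS', 'SUBWAY', 'DOMINOS']
if any of these strings is found in my series, the corresponding row of the category series should be updated to 'dining'.
Below is a small runnable example.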
import pandas as pd
import numpy as np
data = [
[-68.23 , 'PAYPAL TRANSFER'],
[-12.46, 'RALPHS #0079'],
[-8.51, 'SAVE AS YOU GO'],
[25.34, 'VENMO CASHOUT'],
[-2.23 , 'PAYPAL TRANSFER'],
[-64.29 , 'PAYPAL TRANSFER'],
[-7.06, 'SUBWAY'],
[-7.03, 'CARLS JR'],
[-2.35, 'SHELL OIL'],
[-35.23, 'CHEVRON GAS']
]
df = pd.DataFrame(data, columns=['amount', 'details'])
df['category'] = np.nan
str_xfer = 'TRANSFER'
df['category'] = (df['details'].str.contains(str_xfer)).astype(int)
df['category'] = df['category'].replace(
to_replace=1,
value='transfer')
df
amount details category
0 -68.23 PAYPAL TRANSFER transfer
1 -12.46 RALPHS #0079 0
2 -8.51 SAVE AS YOU GO 0
3 25.34 VENMO CASHOUT 0
4 -2.23 PAYPAL TRANSFER transfer
5 -64.29 PAYPAL TRANSFER transfer
6 -7.06 SUBWAY 0
7 -7.03 CARLS JR 0
8 -2.35 SHELL OIL 0
9 -35.23 CHEVRON GAS 0
Thanks a lot.
Answer 0 (score: 4)
If you have a single value, we can use str.extract:
df['category'] = df['details'].str.extract(f'({str_xfer})')
amount details category
0 -68.23 PAYPAL TRANSFER TRANSFER
1 -12.46 RALPHS #0079 NaN
2 -8.51 SAVE AS YOU GO NaN
3 25.34 VENMO CASHOUT NaN
4 -2.23 PAYPAL TRANSFER TRANSFER
5 -64.29 PAYPAL TRANSFER TRANSFER
If you want to match multiple strings, we first have to join them with |, which is the or operator in regular expressions.
str_xfer = ['TRANSFER', 'RALPHS', 'CASHOUT']
str_xfer = '|'.join(str_xfer)
df['category'] = df['details'].str.extract(f'({str_xfer})')
amount details category
0 -68.23 PAYPAL TRANSFER TRANSFER
1 -12.46 RALPHS #0079 RALPHS
2 -8.51 SAVE AS YOU GO NaN
3 25.34 VENMO CASHOUT CASHOUT
4 -2.23 PAYPAL TRANSFER TRANSFER
5 -64.29 PAYPAL TRANSFER TRANSFER
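Note that str.extract leaves the matched keyword (such as RALPHS) in the column, not a label like 'groceries'. A minimal sketch of one way to get labels is to build one str.contains mask per keyword list and let np.select choose the label. The `categories` dict, the 'other' fallback, and the shortened sample data are assumptions made for illustration; they are not part of the question.

```python
import re

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "amount": [-68.23, -12.46, -7.06, -7.03, -2.35],
        "details": ["PAYPAL TRANSFER", "RALPHS #0079", "SUBWAY", "CARLS JR", "SHELL OIL"],
    }
)

# Hypothetical mapping: category label -> keywords that identify it.
categories = {
    "transfer": ["TRANSFER"],
    "groceries": ["RALPHS"],
    "dining": ["CARLS", "SUBWAY", "DOMINOS"],
}

# One boolean mask per category; re.escape keeps each keyword literal
# in case it contains regex metacharacters.
conditions = [
    df["details"].str.contains("|".join(map(re.escape, keywords)))
    for keywords in categories.values()
]

# For each row, np.select takes the label of the first mask that is True,
# and falls back to the (assumed) default label "other".
df["category"] = np.select(conditions, list(categories), default="other")
print(df)
```

If a row matches keywords from more than one list, the category listed first in the dict wins, so the order of the entries matters.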
Answer 1 (score: 1)
I think you need str.findall:
df['category']=df.details.str.findall('TRANSFER').str[0].fillna(0)
df
amount details category
0 -68.23 PAYPAL TRANSFER TRANSFER
1 -12.46 RALPHS #0079 0
2 -8.51 SAVE AS YOU GO 0
3 25.34 VENMO CASHOUT 0
4 -2.23 PAYPAL TRANSFER TRANSFER
5 -64.29 PAYPAL TRANSFER TRANSFER
If you join multiple strings in str_xfer with '|', then:
df.details.str.findall('TRANSFER|VENMO').str[0]
0 TRANSFER
1 NaN
2 NaN
3 VENMO
4 TRANSFER
5 TRANSFER
Name: details, dtype: object
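Once the first matching keyword has been pulled out, it can be translated into a category name with a lookup table. The following is only a sketch: the `keyword_to_category` dict and the sample series are hypothetical, and it uses str.extract with expand=False to get a Series directly instead of taking the first element of the findall result.

```python
import re

import pandas as pd

details = pd.Series(
    ["PAYPAL TRANSFER", "RALPHS #0079", "SAVE AS YOU GO", "VENMO CASHOUT", "CARLS JR"]
)

# Hypothetical lookup: keyword found in the details -> category label.
keyword_to_category = {
    "TRANSFER": "transfer",
    "RALPHS": "groceries",
    "CARLS": "dining",
    "SUBWAY": "dining",
}

# Build a single capture group of alternatives: (TRANSFER|RALPHS|CARLS|SUBWAY).
pattern = "(" + "|".join(map(re.escape, keyword_to_category)) + ")"

# expand=False returns a Series holding the first match, or NaN when nothing matches.
matched = details.str.extract(pattern, expand=False)

# Translate each matched keyword into its label; unmatched rows stay NaN.
category = matched.map(keyword_to_category)
print(category)
```

Rows without any keyword stay NaN, so they can be filled afterwards with fillna if a default label is wanted.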