Suppose my code looks something like this:
import pandas as pd

df = pd.DataFrame({'Name': ['Jay Leno', 'JayLin', 'Jay-Jameson', 'LinLeno', 'Lin Jameson', 'Python Leno', 'Python Lin', 'Python Jameson', 'Lin Jay', 'Python Monte'],
                   'Class': ['Rat', 'L', 'H', 'L', 'L', 'H', 'H', 'L', 'L', 'Circus']})
df['status'] = ''

# Raw strings keep the \s escape intact for the regex engine.
pattern1 = [r'^Jay(\s|-)?(Leno|Lin|Jameson)$', r'^Python(\s|-)?(Jay|Leno|Lin|Jameson|Monte)$', r'^Lin(\s|-)?(Leno|Jay|Jameson|Monte)$']
pattern2 = [r'^Python(\s|-)?(Jay|Leno|Lin|Jameson|Monte)$']
pattern3 = [r'^Lin(\s|-)?(Leno|Jay|Jameson|Monte)$']

# Each loop overwrites 'status' for the rows whose Name matches a pattern.
for i in range(len(pattern1)):
    df.loc[df.Name.str.contains(pattern1[i]), 'status'] = 'A'
for i in range(len(pattern2)):
    df.loc[df.Name.str.contains(pattern2[i]), 'status'] = 'B'
for i in range(len(pattern3)):
    df.loc[df.Name.str.contains(pattern3[i]), 'status'] = 'C'
print(df)
Which prints:
C:\Python33\lib\site-packages\pandas\core\strings.py:184: UserWarning: This pattern has match groups. To actually get the groups, use str.extract.
" groups, use str.extract.", UserWarning)
    Class            Name  status
0     Rat        Jay Leno       A
1       L          JayLin       A
2       H     Jay-Jameson       A
3       L         LinLeno       C
4       L     Lin Jameson       C
5       H     Python Leno       B
6       H      Python Lin       B
7       L  Python Jameson       B
8       L         Lin Jay       C
9  Circus    Python Monte       B

[10 rows x 3 columns]
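My question is how to get rid of the warning, and whether there is a way to loop more efficiently with less code. I know there is something called list comprehensions, but I am confused about how to use them.
I know that some warnings can be suppressed with a setting such as pd.options.mode.chained_assignment = None.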
Answer 0 (score: 8)
Use non-capturing parentheses, (?:...):
pattern1 = [r'^Jay(?:\s|-)?(?:Leno|Lin|Jameson)$', r'^Python(?:\s|-)?(?:Jay|Leno|Lin|Jameson|Monte)$', r'^Lin(?:\s|-)?(?:Leno|Jay|Jameson|Monte)$']
pattern2 = [r'^Python(?:\s|-)?(?:Jay|Leno|Lin|Jameson|Monte)$']
pattern3 = [r'^Lin(?:\s|-)?(?:Leno|Jay|Jameson|Monte)$']
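One way to check that a pattern no longer contains capturing groups is to compile it with the standard re module and look at its groups attribute. The following is only a minimal sketch; the variable names are illustrative and not from the original post.

```python
import re

# The same pattern written with capturing and with non-capturing parentheses.
capturing = r'^Jay(\s|-)?(Leno|Lin|Jameson)$'
non_capturing = r'^Jay(?:\s|-)?(?:Leno|Lin|Jameson)$'

# Pattern.groups is the number of capturing groups in a compiled pattern.
capturing_groups = re.compile(capturing).groups
non_capturing_groups = re.compile(non_capturing).groups

# Both forms should accept and reject exactly the same names.
names = ['Jay Leno', 'JayLin', 'Jay-Jameson', 'Jay Monte']
same_matches = all(
    bool(re.search(capturing, name)) == bool(re.search(non_capturing, name))
    for name in names
)

print(capturing_groups, non_capturing_groups, same_matches)
```

Switching to (?:...) changes only whether the parentheses record a group, not which strings match.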
The warning comes from this code:
if regex.groups > 0:
    warnings.warn("This pattern has match groups. To actually get the"
                  " groups, use str.extract.", UserWarning)
So as long as the pattern has no capturing groups, there is no warning.
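As for the second part of the question (less code for the loops), one possible approach is to keep the status labels and their patterns together in one list and iterate over that. The sketch below uses only the standard re module so that it runs on its own; RULES and status_for are hypothetical names introduced for illustration, and the "later match wins" behavior is an assumption chosen to mirror the three consecutive loops in the question.

```python
import re

# (status, patterns) pairs, checked in order. A later match overwrites an
# earlier one, mirroring the three consecutive loops in the question.
RULES = [
    ('A', [r'^Jay(?:\s|-)?(?:Leno|Lin|Jameson)$',
           r'^Python(?:\s|-)?(?:Jay|Leno|Lin|Jameson|Monte)$',
           r'^Lin(?:\s|-)?(?:Leno|Jay|Jameson|Monte)$']),
    ('B', [r'^Python(?:\s|-)?(?:Jay|Leno|Lin|Jameson|Monte)$']),
    ('C', [r'^Lin(?:\s|-)?(?:Leno|Jay|Jameson|Monte)$']),
]


def status_for(name, rules=RULES):
    """Return the label of the last rule with a pattern matching name."""
    status = ''
    for label, patterns in rules:
        if any(re.search(pattern, name) for pattern in patterns):
            status = label
    return status


names = ['Jay Leno', 'JayLin', 'Jay-Jameson', 'LinLeno', 'Lin Jameson',
         'Python Leno', 'Python Lin', 'Python Jameson', 'Lin Jay',
         'Python Monte']

# A list comprehension builds the whole status column in one expression.
statuses = [status_for(name) for name in names]
print(statuses)
```

With the DataFrame from the question, the same function could presumably be applied as df['status'] = df['Name'].map(status_for). Whether that is faster than the str.contains loops depends on the data, so it is worth timing both.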