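I want to use re together with str.contains.
The pattern is made up of several alternatives joined with |, but I would also like to express a "but not" condition.
I want to classify types of work.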
import re
import pandas as pd

# Data:
v=pd.Series(['New wiring system for an extra room','Build a wall and add a new door',
             'Fix a shelving unit'])
v=v.str.lower()
print(v)
# I construct these patterns:
pattern_cons='ret|wall|ceiling|buil|holes|cons'
pattern_nrg= 'wiring|media|elect'
pattern_plumb='water'
pattern_carp= 'shelving|table|door'
pattern_work=pd.Series([pattern_cons,pattern_nrg,pattern_plumb,
                        pattern_carp])
# Class labels, in the same order as the patterns (taken from the output below)
class_str=['cons','nrg','plumb','carp']
# Use this code (I loop over the patterns):
for x in range(4):
    vector={'pattern': pattern_work[x],'type_work':class_str[x]}
    print(vector)
    s=v.str.contains(vector['pattern'], flags=re.IGNORECASE, regex=True)
    print(s)
I get the following output:
0 new wiring system for an extra room
1 build a wall and add a new door
2 fix a shelving unit
dtype: object
{'pattern': 'ret|wall|ceiling|buil|holes|cons', 'type_work': 'cons'}
0 False
1 True
2 False
dtype: bool
{'pattern': 'wiring|media|elect', 'type_work': 'nrg'}
0 True
1 False
2 False
dtype: bool
{'pattern': 'water', 'type_work': 'plumb'}
0 False
1 False
2 False
dtype: bool
{'pattern': 'shelving|table|door', 'type_work': 'carp'}
0 False
1 True # ------- **** I WANT THIS "False" **** ------- #
2 True
dtype: bool
'Build a wall and add a new door' is classified under both the cons class and the carp class.
But I want this string to be False for the pattern_carp pattern. Is something like ?!buil what I need?
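One way to express the "but not" condition inside the regex itself is a negative lookahead. The following is only a minimal sketch on the three sample strings. The name pattern_carp_not_buil is made up for illustration, and it assumes each description is a single line of text (the dot does not match newlines by default).

```python
import re
import pandas as pd

v = pd.Series([
    'New wiring system for an extra room',
    'Build a wall and add a new door',
    'Fix a shelving unit',
])

# Hypothetical variant of pattern_carp: the negative lookahead (?!.*buil)
# rejects any string that contains "buil" anywhere; only if that check
# passes are the carpentry keywords searched for.
# A non-capturing group (?:...) is used for the alternatives.
pattern_carp_not_buil = r'^(?!.*buil).*(?:shelving|table|door)'

is_carp = v.str.contains(pattern_carp_not_buil, flags=re.IGNORECASE, regex=True)
print(is_carp)
```

With this pattern the second string should no longer match, while 'Fix a shelving unit' still does. The drawback is that every exclusion has to be written into each pattern by hand.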
Answer 0 (score: 0)
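Well, it is not the most elegant approach, but I have found a workaround. How would you improve it? In practice I am losing information.
I added this:
v=v.replace("Build a wall and add a new door", "Build a wall and add a new")
The code is now: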
import re
import pandas as pd

v=pd.Series(['New wiring system for an extra room','Build a wall and add a new door',
             'Fix a shelving unit'])
# Workaround: drop the word "door" from the one string that was classified twice
v=v.replace("Build a wall and add a new door", "Build a wall and add a new")
print(v)
v=v.str.lower()
print(v)
pattern_cons='ret|wall|ceiling|buil|holes|cons'
pattern_nrg= 'wiring|media|elect'
pattern_plumb='water'
pattern_carp= 'shelving|table|door'
pattern_work=pd.Series([pattern_cons,pattern_nrg,
                        pattern_plumb, pattern_carp])
# Class labels, in the same order as the patterns (taken from the output below)
class_str=['cons','nrg','plumb','carp']
# Use this code (I loop over the patterns):
for x in range(4):
    vector={'pattern': pattern_work[x],'type_work':class_str[x]}
    print(vector)
    s=v.str.contains(vector['pattern'], flags=re.IGNORECASE, regex=True)
    print(s)
Now I get the right answer (the one I wanted):
0 New wiring system for an extra room
1 Build a wall and add a new
2 Fix a shelving unit
dtype: object
0 new wiring system for an extra room
1 build a wall and add a new
2 fix a shelving unit
dtype: object
{'pattern': 'ret|wall|ceiling|buil|holes|cons', 'type_work': 'cons'}
0 False
1 True
2 False
dtype: bool
{'pattern': 'wiring|media|elect', 'type_work': 'nrg'}
0 True
1 False
2 False
dtype: bool
{'pattern': 'water', 'type_work': 'plumb'}
0 False
1 False
2 False
dtype: bool
{'pattern': 'shelving|table|door', 'type_work': 'carp'}
0 False
1 False
2 True
dtype: bool
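The workaround above changes the data itself, so the word "door" is lost. A possible alternative, sketched here under the assumption that the patterns are listed in order of priority (cons before carp), leaves v untouched and lets the first matching pattern win. The names patterns, type_work and assigned, and the label 'unclassified', are hypothetical and not part of the original code.

```python
import re
import pandas as pd

v = pd.Series([
    'New wiring system for an extra room',
    'Build a wall and add a new door',
    'Fix a shelving unit',
])

# Assumption: earlier entries take precedence over later ones.
patterns = {
    'cons': 'ret|wall|ceiling|buil|holes|cons',
    'nrg': 'wiring|media|elect',
    'plumb': 'water',
    'carp': 'shelving|table|door',
}

# Start with a placeholder label and fill in the first class that matches.
type_work = pd.Series('unclassified', index=v.index)
assigned = pd.Series(False, index=v.index)

for name, pat in patterns.items():
    # Only rows that have not been claimed by an earlier pattern can match.
    hit = v.str.contains(pat, flags=re.IGNORECASE, regex=True) & ~assigned
    type_work = type_work.mask(hit, name)
    assigned = assigned | hit

print(type_work)
```

Here 'Build a wall and add a new door' is labelled cons only, because the cons pattern is checked first, and the original strings stay intact. If a different precedence is wanted, reordering the dictionary should be enough.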