Traceback (most recent call last):
File ".\api.py", line 12, in <module>
'fb570dcb58ade2614a00539e355fbbb33325e55510d47e8bc8ca10f11033b868'
File "C:\Users\EstDorisMaribelMarca\AppData\Local\Programs\Python\Python37\lib\site-packages\pythonzimbra\tools\auth.py", line 104, in authenticate
server.send_request(auth_request, response)
File "C:\Users\EstDorisMaribelMarca\AppData\Local\Programs\Python\Python37\lib\site-packages\pythonzimbra\communication.py", line 125, in send_request
self.timeout
File "C:\Users\EstDorisMaribelMarca\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\EstDorisMaribelMarca\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Users\EstDorisMaribelMarca\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 543, in _open
'_open', req)
File "C:\Users\EstDorisMaribelMarca\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\EstDorisMaribelMarca\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 1360, in https_open
context=self._context, check_hostname=self._check_hostname)
File "C:\Users\EstDorisMaribelMarca\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 1319, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)>
I am trying to use a regular expression to change every age of 90 and above. This is my setup:
import pandas as pd
dataframe = pd.DataFrame({'Data' : ['A 90-year-old or 96-year-old and 110-year-old is 90 days ',
'For all 82-year-old is the 94-year-old why 28A ',
'But the fact is 101-year-old 109-year-old cool 100',],
'ID': [1,2,3]
})
#tried this regex
# regex=True is needed on pandas 2.0+, where str.replace matches literally by default
dataframe['New'] = dataframe['Data'].str.replace(r'\d+(-year-old)', r'>90', regex=True)
dataframe
   Data                                                        ID  New
0  A 90-year-old or 96-year-old and 110-year-old is 90 days    1   A >90 or >90 and >90 is 90 days
1  For all 82-year-old is the 94-year-old why 28A              2   For all >90 is the >90 why 28A
2  But the fact is 101-year-old 109-year-old cool 100          3   But the fact is >90 >90 cool 100
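For example, 90-year-old should change to >90, but 82-year-old or any age under 90 should not. I am close to what I want, but 82-year-old still gets changed to >90, which it should not.
How can I change the regex in this line of code
dataframe['New'] = dataframe['Data'].str.replace(r'\d+(-year-old)', r'>90', regex=True)
so that only 90-year-old and above (e.g. 91-year-old, 98-year-old, 105-year-old, etc.) is changed to >90?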
Answer 0 (score: 0)
You can specify this with a regex that covers two cases, 9[1-9] and \d{3,}:
# regex=True is needed on pandas 2.0+, where str.replace matches literally by default
dataframe['New'] = dataframe['Data'].str.replace(r'(9[1-9]|\d{3,})(-year-old)', r'>90', regex=True)
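The first part, 9[1-9], matches every value from 91 to 99, and the second part matches any number with three or more digits (a value like 1234 is of course very unlikely).
For the given sample data, we obtain: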
>>> dataframe['Data'].str.replace(r'(9[1-9]|\d{3,})(-year-old)', r'>90', regex=True)
0 A 90-year-old or >90 and >90 is 90 days
1 For all 82-year-old is the >90 why 28A
2 But the fact is >90 >90 cool 100
Name: Data, dtype: object
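The two alternatives in that pattern can also be checked in isolation, without pandas. The following is a minimal sketch using only the standard `re` module; the list of sample ages and the names `OVER_90`, `ages`, and `matched` are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: 91-99, or any number with three or more digits,
# followed by the literal suffix "-year-old".
OVER_90 = re.compile(r'(9[1-9]|\d{3,})(-year-old)')

# Hypothetical sample ages, chosen to probe both alternatives and the boundary.
ages = [82, 89, 90, 91, 99, 100, 105, 110]

for age in ages:
    text = f"{age}-year-old"
    # Ages the pattern matches are rewritten; all others are printed unchanged.
    print(text, "->", OVER_90.sub(">90", text))

# Collect the ages whose "<age>-year-old" string is matched in full.
matched = [age for age in ages if OVER_90.fullmatch(f"{age}-year-old")]
```

Note that 90 itself is not matched here, because 9[1-9] starts at 91.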
If you also want to include 90, you can change the regex to:
>>> dataframe['Data'].str.replace(r'(9\d|\d{3,})(-year-old)', r'>90', regex=True)
0 A >90 or >90 and >90 is 90 days
1 For all 82-year-old is the >90 why 28A
2 But the fact is >90 >90 cool 100
Name: Data, dtype: object
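A different technique from the one in this answer is to keep the regex generic and compare the number in a replacement function, so the threshold is not encoded in the pattern. This is only a sketch under assumptions: the names `AGE`, `cap_age`, `samples`, and `results` are hypothetical, the `\b` word boundaries are an addition not present in the original pattern, and 90 is treated as included.

```python
import re

# Generic pattern: capture any run of digits directly before "-year-old".
AGE = re.compile(r'\b(\d+)-year-old\b')

def cap_age(match, threshold=90):
    """Return '>90' for ages at or above the threshold, else the matched text unchanged."""
    age = int(match.group(1))
    return f">{threshold}" if age >= threshold else match.group(0)

# Sample strings copied from the question.
samples = [
    'A 90-year-old or 96-year-old and 110-year-old is 90 days ',
    'For all 82-year-old is the 94-year-old why 28A ',
    'But the fact is 101-year-old 109-year-old cool 100',
]

# re.sub calls cap_age once per match and substitutes its return value.
results = [AGE.sub(cap_age, s) for s in samples]
for line in results:
    print(line)
```

pandas accepts a callable as the replacement as well, so the same function could be passed along the lines of dataframe['Data'].str.replace(AGE, cap_age, regex=True).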