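I have a paragraph that I have split into lines and stripped of all punctuation. Now I want to check whether any line contains a number followed by the word "degrees", so that I can output it.
For example, in the sentence
The temperature of the room was 32 degrees
I want to find the substring
32 degrees
and in the sentence
6 degrees of freedom in this rigid body
I want to find the substring
6 degrees
Is there a way to consistently find a specific word when it is preceded by a number?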
Answer 0 (score: 1)
Here is my take:
Python 3.7.4 (default, Aug 12 2019, 14:45:07)
[GCC 9.1.1 20190605 (Red Hat 9.1.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> degree=re.compile(r'\d+\s*degree[s]?')
>>> s='32 degrees should be matched as 0 degree and 0degree should be as well, but not this last "degree" here.'
>>> degree.findall(s)
['32 degrees', '0 degree', '0degree']
>>>
Answer 1 (score: 1)
Use the regular expression r'\b\d+\s*degree[s]?\b'. The \b word boundaries keep it from matching inside a longer token:
import re
s = '''32 degrees, 0 degree and 0degree should be matched
but not a56 degrees or 13 degreess'''
print(re.findall(r'\b\d+\s*degree[s]?\b', s))
Output:
['32 degrees', '0 degree', '0degree']
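To see what the \b anchors change, here is a minimal sketch comparing the pattern from Answer 0 with the one above. The sample string and the names loose and strict are made up for illustration.

```python
import re

# Hypothetical sample: two near misses followed by one clean match.
s = 'a56 degrees and 13 degreess but 32 degrees'

# Without word boundaries the pattern matches inside longer tokens.
loose = re.findall(r'\d+\s*degree[s]?', s)

# With \b on both sides only the standalone "number degrees" survives.
strict = re.findall(r'\b\d+\s*degree[s]?\b', s)

print(loose)   # ['56 degrees', '13 degrees', '32 degrees']
print(strict)  # ['32 degrees']
```

Whether the looser behaviour is a problem depends on the input. If the text can contain tokens such as a56, the anchored version is probably the safer choice.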
Answer 2 (score: 0)
Use a regular expression:
import re
FIND_DEGREES = re.compile(r'(\d+) degrees')
lines = [
    'The temperature of the room was 32 degrees'
]
for line in lines:
    match = FIND_DEGREES.search(line)
    if match:
        print(f'Temp: {match.group(1)} found in "{match.group(0)}"')
Output:
Temp: 32 found in "32 degrees"
Note that if "degrees" can appear more than once in a line, you should consider using .findall instead of .search.
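One thing to watch for when switching: with a capture group in the pattern, .findall returns only the captured text, not the whole match. The sketch below shows that behaviour next to .finditer, which yields match objects. The sample line and the names numbers and pairs are assumptions for illustration.

```python
import re

FIND_DEGREES = re.compile(r'(\d+) degrees')

# Hypothetical line with two occurrences.
line = 'It was 32 degrees inside and 6 degrees outside'

# With one capture group, findall returns just the captured numbers.
numbers = FIND_DEGREES.findall(line)

# finditer gives match objects, so both the number and the full match are available.
pairs = [(m.group(1), m.group(0)) for m in FIND_DEGREES.finditer(line)]

print(numbers)  # ['32', '6']
print(pairs)    # [('32', '32 degrees'), ('6', '6 degrees')]
```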
Answer 3 (score: 0)
As already mentioned, use a regular expression.
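import re
substring = re.compile(r"\d+\sdegrees")
# Sample lines taken from the question
lines = [
    "The temperature of the room was 32 degrees",
    "6 degrees of freedom in this rigid body",
]
for line in lines:
    print(substring.findall(line))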
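The loop above prints an empty list for every line that has no match. If only the matched substrings are wanted, one possible approach is to collect them into a single list, as in this sketch. The helper name degree_substrings and the third sample line are made up here, and the pattern borrows the word boundaries from Answer 1.

```python
import re

# \s+ allows any run of whitespace; "degrees?" also accepts the singular form.
DEGREES = re.compile(r'\b\d+\s+degrees?\b')

def degree_substrings(lines):
    """Return every 'number degrees' substring found, in input order."""
    found = []
    for line in lines:
        # findall returns [] for a line without a match, so extend adds nothing.
        found.extend(DEGREES.findall(line))
    return found

lines = [
    'The temperature of the room was 32 degrees',
    '6 degrees of freedom in this rigid body',
    'no match on this line',
]
result = degree_substrings(lines)
print(result)  # ['32 degrees', '6 degrees']
```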