Question

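I have a paragraph that I split into lines and stripped of all punctuation. Now I want to check whether any line contains a number followed by the word "degrees", so that I can output it.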

For example, in the sentence

The temperature of the room was 32 degrees

I want to find the substring

32 degrees

In the sentence

6 degrees of freedom in this rigid body

I want to find the substring

6 degrees

Is there a way to consistently find a specific word when it is preceded by a number?

Answer 1

Here is my take:

Python 3.7.4 (default, Aug 12 2019, 14:45:07) 
[GCC 9.1.1 20190605 (Red Hat 9.1.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> degree=re.compile(r'\d+\s*degree[s]?')
>>> s='32 degrees should be matched as 0 degree and 0degree should be as well, but not this last "degree" here.'
>>> degree.findall(s)
['32 degrees', '0 degree', '0degree']
>>>
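One caveat worth checking: this pattern has no word-boundary anchors, so it can also match inside longer tokens. A minimal sketch, using a made-up test string, of what that looks like:

```python
import re

# Same pattern as above, without \b anchors.
degree = re.compile(r'\d+\s*degree[s]?')

# Hypothetical input: digits glued to a letter, and a misspelled plural.
matches = degree.findall('a56 degrees or 13 degreess')
print(matches)  # ['56 degrees', '13 degrees']
```

Whether these partial matches are acceptable depends on the input. If they are not, adding `\b` on both sides of the pattern is one way to exclude them.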

Answer 2

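Use the regular expression r'\b\d+\s*degree[s]?\b'. The \b anchors require a word boundary on each side, so digits glued to a preceding letter and words with extra trailing letters are not matched.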

import re
s = '''32 degrees, 0 degree and 0degree should be matched 
       but not a56 degrees or 13 degreess'''
print(re.findall(r'\b\d+\s*degree[s]?\b', s))

Output

['32 degrees', '0 degree', '0degree']
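The question also asks about "a specific word" in general. As a rough sketch, the same pattern can be built for any word with `re.escape`. The helper name `find_number_word` is hypothetical, and the word is matched literally here, with no optional plural:

```python
import re

def find_number_word(text, word):
    """Return every 'number + word' substring in text (hypothetical helper)."""
    # re.escape keeps characters such as '.' or '+' in `word` literal.
    pattern = re.compile(rf'\b\d+\s*{re.escape(word)}\b')
    return pattern.findall(text)

print(find_number_word('The temperature of the room was 32 degrees', 'degrees'))
# ['32 degrees']
print(find_number_word('6 degrees of freedom in this rigid body', 'degrees'))
# ['6 degrees']
```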

Answer 3

Use a regular expression:

import re
FIND_DEGREES = re.compile(r'(\d+) degrees')
lines = [
  'The temperature of the room was 32 degrees'
]
for line in lines:
    match = FIND_DEGREES.search(line)
    if match:
        print(f'Temp: {match.group(1)} found in "{match.group(0)}"')

Output:

Temp: 32 found in "32 degrees"

Note that if "degrees" can occur more than once in a line, you should consider using .findall instead of .search.
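One thing to keep in mind: when the pattern contains a capture group, `.findall` returns only the captured text, not the whole match. If both are needed, `.finditer` yields match objects instead. A small sketch with a made-up line:

```python
import re

FIND_DEGREES = re.compile(r'(\d+) degrees')

# Hypothetical line containing two occurrences.
line = 'It was 32 degrees inside and 6 degrees outside'

# With one capture group, findall returns just the group contents.
numbers = FIND_DEGREES.findall(line)
print(numbers)  # ['32', '6']

# finditer gives access to both the group and the full match.
full_matches = [m.group(0) for m in FIND_DEGREES.finditer(line)]
print(full_matches)  # ['32 degrees', '6 degrees']
```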

Answer 4

As mentioned before, use a regular expression.


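import re

# Sample input: the two sentences from the question.
lines = [
    "The temperature of the room was 32 degrees",
    "6 degrees of freedom in this rigid body",
]
substring = re.compile(r"\d+\sdegrees")
for line in lines:
    print(substring.findall(line))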
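Since `findall` returns an empty list for lines without a match, the loop above prints `[]` for those lines. If only the matching lines should be reported, one possible variation is the following sketch, where the sample lines are assumptions:

```python
import re

substring = re.compile(r"\d+\sdegrees")

# Hypothetical sample lines; the middle one contains no match.
lines = [
    "The temperature of the room was 32 degrees",
    "Nothing to report on this line",
    "6 degrees of freedom in this rigid body",
]

# Collect (line, first match) pairs, skipping lines without a match.
hits = []
for line in lines:
    match = substring.search(line)
    if match:
        hits.append((line, match.group(0)))

for line, found in hits:
    print(f"{found!r} found in {line!r}")
```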

How do I search a string for any number followed by a specific word?
