Question

import re
nos="to do with your newfound skills.  338  3803"
for x in nos:
    y=re.findall("[0-9]+",nos)
print("total is :",sum(y))

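Although re.findall returns a list into y, it seems y still has to be declared explicitly as y = list(), and even then I get this error:

TypeError: unsupported operand type(s) for +: 'int' and 'str'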

Answer 1

The comments already explain why your code does not work, but in this case consider using a list comprehension:

nos="to do with your newfound skills.  338  3803"
print(sum([int(s) for s in nos.split() if s.isdigit()]))
# Output: 4141

Or even better, as @EdwardMinnix pointed out, use a generator expression so that no intermediate list is built:

print(sum(int(s) for s in nos.split() if s.isdigit()))

Answer 2

You can map int over the matched strings instead of looping. Here I use \d+ to match the integers; [0-9]+ works just as well, as in your code.

import re
nos="to do with your newfound skills.  338  3803"
y = map(int, re.findall(r'\d+', nos))
print (sum(y))
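
If this is needed in more than one place, the same idea can be wrapped in a small helper. This is only a sketch; the name sum_integers is made up for illustration and is not part of the re module.

```python
import re

def sum_integers(text):
    """Return the sum of all runs of ASCII digits found in text."""
    # re.findall returns the matches as strings, so convert each one with int.
    return sum(map(int, re.findall(r"[0-9]+", text)))

nos = "to do with your newfound skills.  338  3803"
print(sum_integers(nos))  # 4141
```

A string with no digits gives 0, because sum() of an empty iterable is 0.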

Answer 3

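You are assuming that re.findall("[0-9]+", nos) returns ints, but it does not.

Remember that regular expressions operate on str, so findall returns the str representation of each integer.

You need to convert them to int before you can do arithmetic on them.

A good way to do this is the built-in map function, which takes a callable, applies it to each item of the iterable you pass in, and returns the results (as a lazy iterator in Python 3, as a list in Python 2).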

y = map(int, re.findall('[0-9]+', nos))
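
To make the point above concrete, here is a short sketch using the string from the question. It shows what findall actually returns, reproduces the TypeError, and then sums the converted values.

```python
import re

nos = "to do with your newfound skills.  338  3803"

matches = re.findall("[0-9]+", nos)
print(matches)            # ['338', '3803'] : strings, not ints

# sum() starts from the int 0, so 0 + '338' raises the TypeError from the question.
try:
    sum(matches)
except TypeError as exc:
    print("TypeError:", exc)

# In Python 3, map() returns a lazy iterator; sum() consumes it directly.
total = sum(map(int, matches))
print(total)              # 4141
```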

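Unrelated notes

Here are a few other things I noticed:

Don't loop over the string

Since the loop variable is never used, there is no need to iterate over the string. The loop just recomputes the same findall result once per character.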

import re
nos="to do with your newfound skills.  338  3803"
y=map(int, re.findall("[0-9]+",nos))
print("total is :",sum(y))

You may get away without a regex

If you know the numbers will be separated by whitespace, you can change the logic to the following:

text = "to do with your newfound skills.  338  3803"
# Keep only the whitespace-separated tokens made up entirely of digits.
numbers = map(int, filter(str.isdigit, text.split()))
print(sum(numbers))
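
The whitespace assumption matters. As a rough illustration, the sketch below uses a made-up input in which the numbers are followed by punctuation; the split-based version then misses them, while the regex still finds them.

```python
import re

# Hypothetical input: the numbers are followed by punctuation, not whitespace.
text = "I paid 338, then 3803."

by_split = sum(int(s) for s in text.split() if s.isdigit())
by_regex = sum(int(s) for s in re.findall(r"[0-9]+", text))

print(by_split)  # 0: "338," and "3803." fail str.isdigit()
print(by_regex)  # 4141
```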

Answer 4

Try this.

import spacy
from spacy.pipeline import EntityRuler
from spacy import displacy
from spacy.matcher import Matcher

sentences = ["now she's a software engineer" , "she's got a cat", "he's a tennis player", "He thinks that she's 30 years old"]

nlp = spacy.load('en_core_web_sm')

def normalize(sentence):
    ans = []
    doc = nlp(sentence)


    #print([(t.text, t.pos_ , t.dep_) for t in doc])
    matcher = Matcher(nlp.vocab)
    pattern = [{"POS": "PRON"}, {"LOWER": "'s"}, {"LOWER": "got"}]
    matcher.add("case_has", None, pattern)
    pattern = [{"POS": "PRON"}, {"LOWER": "'s"}, {"LOWER": "been"}]
    matcher.add("case_has", None, pattern)
    pattern = [{"POS": "PRON"}, {"LOWER": "'s"}, {"POS": "DET"}]
    matcher.add("case_is", None, pattern)
    pattern = [{"POS": "PRON"}, {"LOWER": "'s"}, {"IS_DIGIT": True}]
    matcher.add("case_is", None, pattern)
    # .. add more cases

    matches = matcher(doc)
    for match_id, start, end in matches:
        string_id = nlp.vocab.strings[match_id]  
        for idx, t in enumerate(doc):
            if string_id == 'case_has' and t.text == "'s" and idx >= start and idx < end:
                ans.append("has")
                continue
            if string_id == 'case_is' and t.text == "'s" and idx >= start and idx < end:
                ans.append("is")
                continue
            else:
                ans.append(t.text)
    return(' '.join(ans))

for s in sentences:
    print(s)
    print(normalize(s))
    print()

findall() returns a list but does not add up the elements of the list

4 answers:
