spaCy: casting a LIKE_NUM token to the equivalent Python number

Time: 2019-11-28 12:53:44

Tags: python nlp spacy

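Does spaCy provide a quick conversion from a LIKE_NUM token to a Python float or Decimal? spaCy can match LIKE_NUM tokens such as "31,2", "10.9", "10", "ten", and so on. Does it also offer a quick way to get the corresponding Python number? I expected a method like .get_value() that returns the number (not a string), but I could not find anything.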

import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)
text = "this is just a text and a number 10,2 or 10.2 meaning ten point two"
doc = nlp(text)

# Match any single token that spaCy flags as number-like
pattern = [{"LIKE_NUM": True}]

# spaCy v2 signature; in spaCy v3 this is matcher.add("number_match", [pattern])
matcher.add("number_match", None, pattern)

matches = matcher(doc)
print("All matches:")
for match_id, start, end in matches:
    string_id = nlp.vocab.strings[match_id]  # Get string representation
    span = doc[start:end]  # The matched span
    print(match_id, string_id, start, end, span.text)

    print(type(span.text))

The output is:

All matches:
13316671205374851783 number_match 8 9 10,2
<class 'str'>
13316671205374851783 number_match 10 11 10.2
<class 'str'>
13316671205374851783 number_match 12 13 ten
<class 'str'>
13316671205374851783 number_match 14 15 two
<class 'str'>
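
Since LIKE_NUM is a boolean flag, the matched token only carries its text, so the conversion has to happen in user code. Below is a minimal sketch of one way to do that, not a spaCy feature: `like_num_to_float`, `WORD_VALUES` and the `decimal_comma` parameter are hypothetical names introduced here for illustration. It assumes that a comma is a decimal separator (as in "10,2") unless told otherwise, and it only knows the number words zero through twelve.

```python
# Hypothetical helper, not part of spaCy: converts the text of a
# number-like token to a Python float.

# Assumption: only a small set of English number words is covered.
WORD_VALUES = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
    "ten": 10, "eleven": 11, "twelve": 12,
}


def like_num_to_float(text, decimal_comma=True):
    """Return a float for a number-like string, or None if it cannot be parsed."""
    cleaned = text.strip().lower()

    # Spelled-out numbers such as "ten"
    if cleaned in WORD_VALUES:
        return float(WORD_VALUES[cleaned])

    if decimal_comma:
        # Assumption: "10,2" means 10.2
        cleaned = cleaned.replace(",", ".")
    else:
        # Treat the comma as a thousands separator: "1,000" means 1000
        cleaned = cleaned.replace(",", "")

    try:
        return float(cleaned)
    except ValueError:
        return None


for token_text in ["10,2", "10.2", "ten", "two"]:
    print(token_text, like_num_to_float(token_text))
```

Inside the matching loop above, the helper could be called as `like_num_to_float(span.text)`. Returning None rather than raising keeps the loop running when a token looks like a number but falls outside what this sketch handles (for example "twenty" or "1,000.5" with `decimal_comma=True`).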

0 answers:

No answers