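I want to convert numbers written out in words into digits.
For example, thirty four thousand four fifty should become its numeric value, 34450.
I also need some fuzzy conversion: given a sentence such as "Please pay thirty-four thousand four fifty dollars", the output should be 34450.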
Answer 0 (score: 2)
For numbers to words, try the "num2words" package: https://pypi.python.org/pypi/num2words
For words to numbers, I slightly adapted the code from here: Is there a way to convert number words to Integers?
from num2words import num2words

def text2int(textnum, numwords={}):
    # numwords is a mutable default on purpose: the lookup table is
    # built on the first call and reused on later calls.
    if not numwords:
        units = [
            "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
            "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
            "sixteen", "seventeen", "eighteen", "nineteen",
        ]
        tens = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]
        scales = ["hundred", "thousand", "million", "billion", "trillion"]

        # Each word maps to (scale, increment).
        numwords["and"] = (1, 0)
        for idx, word in enumerate(units): numwords[word] = (1, idx)
        for idx, word in enumerate(tens): numwords[word] = (1, idx * 10)
        for idx, word in enumerate(scales): numwords[word] = (10 ** (idx * 3 or 2), 0)

    current = result = 0
    for word in textnum.split():
        if word not in numwords:
            raise Exception("Illegal word: " + word)

        scale, increment = numwords[word]
        current = current * scale + increment
        # "thousand" and larger close the current group.
        if scale > 100:
            result += current
            current = 0

    return result + current
#### My update to incorporate decimals
num = 5000222223.28
fullText = num2words(num).replace('-', ' ').replace(',', ' ')
print(fullText)

# Split at "point": integer words before it, decimal digit words after it.
decimalSplit = fullText.split('point ')
if len(decimalSplit) > 1:
    decimalSplit2 = decimalSplit[1].split(' ')
    # The digit at position x contributes digit / 10 ** (x + 1).
    decPart = sum([float(text2int(decimalSplit2[x])) / 10 ** (x + 1) for x in range(len(decimalSplit2))])
else:
    decPart = 0
intPart = float(text2int(decimalSplit[0]))
Value = intPart + decPart
print(Value)
-> five billion two hundred and twenty two thousand two hundred and twenty three point two eight
-> 5000222223.28
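For the fuzzy case in the question, here is a rough sketch of one possible approach, not a tested solution. It lowercases the sentence, splits it on anything that is not a letter (so "thirty-four" becomes two words), keeps only the number words, and then applies the same scale/increment logic as text2int. The name extract_number and the "four fifty" rule are my own assumptions: text2int above reads "four fifty" as 4 + 50 and returns 34054 for the question's example, so the sketch treats a digit from one to nine immediately followed by a tens word as hundreds. It also assumes the sentence contains exactly one number.

```python
import re

UNITS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]
SCALES = ["hundred", "thousand", "million", "billion", "trillion"]

# Map each number word to (scale, increment), as in text2int.
NUMWORDS = {}
for idx, word in enumerate(UNITS):
    NUMWORDS[word] = (1, idx)
for idx, word in enumerate(TENS):
    if word:
        NUMWORDS[word] = (1, idx * 10)
for idx, word in enumerate(SCALES):
    NUMWORDS[word] = (10 ** (idx * 3 or 2), 0)


def extract_number(sentence):
    """Hypothetical helper: return the integer spelled out in sentence, or None."""
    # Splitting on non-letters also breaks hyphenated words such as "thirty-four".
    tokens = re.findall(r"[a-z]+", sentence.lower())
    words = [t for t in tokens if t in NUMWORDS]  # drops "please", "and", "dollars", ...
    if not words:
        return None

    current = result = 0
    prev = None
    for word in words:
        scale, increment = NUMWORDS[word]
        # Assumption: "four fifty" is colloquial for 450, so a digit 1-9
        # directly followed by a tens word is promoted to hundreds.
        if prev in UNITS[1:10] and word in TENS[2:]:
            current *= 100
        current = current * scale + increment
        if scale > 100:
            result += current
            current = 0
        prev = word
    return result + current


print(extract_number("Please pay thirty-four thousand four fifty dollars"))
```

Because every non-number word is simply discarded, a sentence with two separate numbers, or with "one" used as a pronoun, will give a wrong total. Grouping consecutive number words first would be the next thing to try.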