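I want to convert words made up of alphabetic characters into the number they represent, in Java.
For example, four hundred four should evaluate to 404.
If the letters are gibberish, like asdf, that is an error.
I know I can convert bare characters to their ASCII-equivalent integers and add those together, but I only want to extract the number behind an English word phrase.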
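Answer 0 (score: 15)
Here is some code I came up with when trying to solve the same problem. Keep in mind that I am not a professional and do not have a huge amount of experience. It is not slow, but I am sure it could be faster, cleaner, and so on. I used it to convert voice-recognized words into numbers for calculation in my own "Jarvis", à la Iron Man. It handles numbers under 1 billion, and it could easily be extended to higher magnitudes at very little cost in time.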
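// Requires: import java.util.StringTokenizer;
public static final String[] DIGITS = {"one", "two", "three", "four", "five", "six", "seven", "eight", "nine"};
// TENS[0] is null so that TENS[k] lines up with tens digit k + 1 ("twenty" at index 1 -> 2).
public static final String[] TENS = {null, "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"};
// Note: TEENS is declared but never consulted by replaceNumbers below, so "ten" through "nineteen" are not converted.
public static final String[] TEENS = {"ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};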
public static final String[] MAGNITUDES = {"hundred", "thousand", "million", "point"};
public static final String[] ZERO = {"zero", "oh"};
public static String replaceNumbers (String input) {
String result = "";
String[] decimal = input.split(MAGNITUDES[3]);
String[] millions = decimal[0].split(MAGNITUDES[2]);
for (int i = 0; i < millions.length; i++) {
String[] thousands = millions[i].split(MAGNITUDES[1]);
for (int j = 0; j < thousands.length; j++) {
int[] triplet = {0, 0, 0};
StringTokenizer set = new StringTokenizer(thousands[j]);
if (set.countTokens() == 1) { //If there is only one token given in triplet
String uno = set.nextToken();
triplet[0] = 0;
for (int k = 0; k < DIGITS.length; k++) {
if (uno.equals(DIGITS[k])) {
triplet[1] = 0;
triplet[2] = k + 1;
}
if (uno.equals(TENS[k])) {
triplet[1] = k + 1;
triplet[2] = 0;
}
}
}
else if (set.countTokens() == 2) { //If there are two tokens given in triplet
String uno = set.nextToken();
String dos = set.nextToken();
if (dos.equals(MAGNITUDES[0])) { //If one of the two tokens is "hundred"
for (int k = 0; k < DIGITS.length; k++) {
if (uno.equals(DIGITS[k])) {
triplet[0] = k + 1;
triplet[1] = 0;
triplet[2] = 0;
}
}
}
else {
triplet[0] = 0;
for (int k = 0; k < DIGITS.length; k++) {
if (uno.equals(TENS[k])) {
triplet[1] = k + 1;
}
if (dos.equals(DIGITS[k])) {
triplet[2] = k + 1;
}
}
}
}
else if (set.countTokens() == 3) { //If there are three tokens given in triplet
String uno = set.nextToken();
String dos = set.nextToken();
String tres = set.nextToken();
for (int k = 0; k < DIGITS.length; k++) {
if (uno.equals(DIGITS[k])) {
triplet[0] = k + 1;
}
if (tres.equals(DIGITS[k])) {
triplet[1] = 0;
triplet[2] = k + 1;
}
if (tres.equals(TENS[k])) {
triplet[1] = k + 1;
triplet[2] = 0;
}
}
}
else if (set.countTokens() == 4) { //If there are four tokens given in triplet
String uno = set.nextToken();
String dos = set.nextToken();
String tres = set.nextToken();
String cuatro = set.nextToken();
for (int k = 0; k < DIGITS.length; k++) {
if (uno.equals(DIGITS[k])) {
triplet[0] = k + 1;
}
if (cuatro.equals(DIGITS[k])) {
triplet[2] = k + 1;
}
if (tres.equals(TENS[k])) {
triplet[1] = k + 1;
}
}
}
else {
triplet[0] = 0;
triplet[1] = 0;
triplet[2] = 0;
}
result = result + Integer.toString(triplet[0]) + Integer.toString(triplet[1]) + Integer.toString(triplet[2]);
}
}
if (decimal.length > 1) { //The number is a decimal
StringTokenizer decimalDigits = new StringTokenizer(decimal[1]);
result = result + ".";
System.out.println(decimalDigits.countTokens() + " decimal digits");
while (decimalDigits.hasMoreTokens()) {
String w = decimalDigits.nextToken();
System.out.println(w);
if (w.equals(ZERO[0]) || w.equals(ZERO[1])) {
result = result + "0";
}
for (int j = 0; j < DIGITS.length; j++) {
if (w.equals(DIGITS[j])) {
result = result + Integer.toString(j + 1);
}
}
}
}
return result;
}
The input must be grammatically well formed or the code will have problems (simply write a function that removes "and"). The following string input returns:
two hundred two million fifty three thousand point zero eight five eight oh two
202053000.085802
It took 2 milliseconds.
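The "and"-removing function mentioned above could look something like the following. This is only a minimal sketch: the class name `AndStripper` and method name `stripAnd` are made up for illustration and are not part of the code above. It matches "and" only as a whole word, because a plain `replace("and", "")` would also damage "thousand".

```java
final class AndStripper {
    // Hypothetical helper: removes the standalone word "and" and collapses
    // the leftover whitespace, so the result can be tokenized word by word.
    static String stripAnd(String input) {
        return input
                .replaceAll("\\band\\b", " ") // whole-word match only; "thousand" is left intact
                .trim()
                .replaceAll("\\s+", " ");     // collapse runs of whitespace into one space
    }
}
```

For example, `AndStripper.stripAnd("one thousand and five")` yields `"one thousand five"`, which could then be passed on to `replaceNumbers`.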
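Answer 1 (score: 11)
The basic strategy is to keep a value variable that you work with. Each time you see a string such as "one", "two", "eleven", or "seven", you add that amount to value. When you see a string such as "hundred", "thousand", or "million", you multiply value by that amount.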
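For larger numbers you will probably need to create a few subtotals and combine them at the end. The steps to process a number like 111,374, written out as "one hundred eleven thousand three hundred seventy four", would be:
value[0] += 1 (now 1)
value[0] *= 100 (now 100)
value[0] += 11 (now 111)
value[0] *= 1000 (now 111000)
value[1] += 3 (now 3)
value[1] *= 100 (now 300)
value[1] += 70 (now 370)
value[1] += 4 (now 374)
You still need to work out how to decide when to build the number as multiple values. It seems you should start a new subtotal when you encounter a multiplier ("hundred") that is smaller than the most recently seen multiplier.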
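One possible way to turn this strategy into Java is sketched below. It is a variant rather than the exact procedure above: instead of detecting a smaller multiplier, it closes the current subtotal whenever it sees "thousand" or "million", which gives the same result for well-formed input. The class name `WordNumberParser` and method name `parse` are assumptions made for this example. The sketch rejects unknown words such as "asdf" but does not otherwise validate grammar.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class WordNumberParser {
    private static final Map<String, Integer> UNITS = new HashMap<>();
    private static final Map<String, Integer> SCALES = new HashMap<>();

    static {
        String[] small = {"zero", "one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen",
                "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < small.length; i++) {
            UNITS.put(small[i], i);
        }
        String[] tens = {"twenty", "thirty", "forty", "fifty",
                "sixty", "seventy", "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) {
            UNITS.put(tens[i], 20 + 10 * i);
        }
        SCALES.put("thousand", 1_000);
        SCALES.put("million", 1_000_000);
    }

    // Hypothetical entry point: returns the value of an English number phrase,
    // or throws IllegalArgumentException for empty input or unknown words.
    static long parse(String text) {
        long total = 0;    // sum of the subtotals that are already closed
        long current = 0;  // subtotal currently being built (the "value" above)
        boolean sawWord = false;
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty() || word.equals("and")) {
                continue;
            }
            sawWord = true;
            Integer unit = UNITS.get(word);
            if (unit != null) {
                current += unit;                      // "one", "eleven", "seventy": add
            } else if (word.equals("hundred")) {
                current *= 100;                       // "hundred": multiply in place
            } else if (SCALES.containsKey(word)) {
                total += current * SCALES.get(word);  // close this subtotal
                current = 0;                          // and start a new one
            } else {
                throw new IllegalArgumentException("Not a number word: " + word);
            }
        }
        if (!sawWord) {
            throw new IllegalArgumentException("No number words found");
        }
        return total + current;
    }
}
```

With this sketch, `WordNumberParser.parse("one hundred eleven thousand three hundred seventy four")` returns 111374, and `WordNumberParser.parse("asdf")` throws an `IllegalArgumentException`.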
Answer 2 (score: 5)
public class InNumerals5Digits {
static String testcase1 = "ninety nine thousand nine hundred ninety nine";//
public static void main(String args[]){
InNumerals5Digits testInstance = new InNumerals5Digits();
int result = testInstance.inNumerals(testcase1);
System.out.println("Result : "+result);
}
//write your code here
public int inNumerals(String inwords)
{
int wordnum = 0;
String[] arrinwords = inwords.split(" ");
int arrinwordsLength = arrinwords.length;
if(inwords.equals("zero"))
{
return 0;
}
if(inwords.contains("thousand"))
{
int indexofthousand = inwords.indexOf("thousand");
//System.out.println(indexofthousand);
String beforethousand = inwords.substring(0,indexofthousand);
//System.out.println(beforethousand);
String[] arrbeforethousand = beforethousand.split(" ");
int arrbeforethousandLength = arrbeforethousand.length;
//System.out.println(arrbeforethousandLength);
if(arrbeforethousandLength==2)
{
wordnum = wordnum + 1000*(wordtonum(arrbeforethousand[0]) + wordtonum(arrbeforethousand[1]));
//System.out.println(wordnum);
}
if(arrbeforethousandLength==1)
{
wordnum = wordnum + 1000*(wordtonum(arrbeforethousand[0]));
//System.out.println(wordnum);
}
}
if(inwords.contains("hundred"))
{
int indexofhundred = inwords.indexOf("hundred");
//System.out.println(indexofhundred);
String beforehundred = inwords.substring(0,indexofhundred);
//System.out.println(beforehundred);
String[] arrbeforehundred = beforehundred.split(" ");
int arrbeforehundredLength = arrbeforehundred.length;
wordnum = wordnum + 100*(wordtonum(arrbeforehundred[arrbeforehundredLength-1]));
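// Skip "hundred" (7 chars) plus one space; clamp to the string length so an input ending in "hundred" does not throw.
String afterhundred = inwords.substring(Math.min(inwords.length(), indexofhundred+8));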
//System.out.println(afterhundred);
String[] arrafterhundred = afterhundred.split(" ");
int arrafterhundredLength = arrafterhundred.length;
if(arrafterhundredLength==1)
{
wordnum = wordnum + (wordtonum(arrafterhundred[0]));
}
if(arrafterhundredLength==2)
{
wordnum = wordnum + (wordtonum(arrafterhundred[1]) + wordtonum(arrafterhundred[0]));
}
//System.out.println(wordnum);
}
if(!inwords.contains("thousand") && !inwords.contains("hundred"))
{
if(arrinwordsLength==1)
{
wordnum = wordnum + (wordtonum(arrinwords[0]));
}
if(arrinwordsLength==2)
{
wordnum = wordnum + (wordtonum(arrinwords[1]) + wordtonum(arrinwords[0]));
}
//System.out.println(wordnum);
}
return wordnum;
}
public int wordtonum(String word)
{
int num = 0;
switch (word) {
case "one": num = 1;
break;
case "two": num = 2;
break;
case "three": num = 3;
break;
case "four": num = 4;
break;
case "five": num = 5;
break;
case "six": num = 6;
break;
case "seven": num = 7;
break;
case "eight": num = 8;
break;
case "nine": num = 9;
break;
case "ten": num = 10;
break;
case "eleven": num = 11;
break;
case "twelve": num = 12;
break;
case "thirteen": num = 13;
break;
case "fourteen": num = 14;
break;
case "fifteen": num = 15;
break;
case "sixteen": num = 16;
break;
case "seventeen": num = 17;
break;
case "eighteen": num = 18;
break;
case "nineteen": num = 19;
break;
case "twenty": num = 20;
break;
case "thirty": num = 30;
break;
case "forty": num = 40;
break;
case "fifty": num = 50;
break;
case "sixty": num = 60;
break;
case "seventy": num = 70;
break;
case "eighty": num = 80;
break;
case "ninety": num = 90;
break;
case "hundred": num = 100;
break;
case "thousand": num = 1000;
break;
default: num = 0; // unrecognized words contribute 0
break;
}
return num;
}
}