Question

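I've been struggling with a programming assignment. Basically, we have to write a program that translates an English sentence into Pig Latin. The first method we need tokenizes the string, and we are not allowed to use the split method that is normally used in Java. I've been trying to do this for the past two days with no luck. Here is what I have so far: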

  public class PigLatin 
    { 
        public static void main(String[] args) 
        { 
              String s = "Hello there my name is John"; 
              Tokenize(s); 
        } 

        public static String[] Tokenize(String english) 
        { 
             String[] tokenized = new String[english.length()]; 
             for (int i = 0; i < english.length(); i++) 
             { 
                   int j= 0; 
                   while (english.charAt(i) != ' ') 
                   { 
                         String m = ""; 
                         m = m + english.charAt(i); 
                         if (english.charAt(i) == ' ') 
                         { 
                              j++; 
                         } 
                         else 
                         { 
                               break; 
                         } 
                    } 
                   for (int l = 0; l < tokenized.length; l++) { 
                         System.out.print(tokenized[l] + ", "); 
                   }
             }
             return tokenized;
        }
    }

All this does is print a whole bunch of "null"s. If anyone can offer any advice, I would really appreciate it!

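Thanks in advance.

Update: we are supposed to assume there is no punctuation and no extra spaces, so basically wherever there is a space, a new word begins.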

Answer 1

If I understand your question, and what your Tokenize method is meant to do, then I would start by writing a function to split the String:

// Requires: import java.util.ArrayList; import java.util.List;
static String[] splitOnWhiteSpace(String str) {
    List<String> al = new ArrayList<>();
    StringBuilder sb = new StringBuilder();
    for (char ch : str.toCharArray()) {
        if (Character.isWhitespace(ch)) {
            if (sb.length() > 0) {
                al.add(sb.toString());
                sb.setLength(0);
            }
        } else {
            sb.append(ch);
        }
    }
    if (sb.length() > 0) {
        al.add(sb.toString());
    }
    String[] ret = new String[al.size()];
    return al.toArray(ret);
}

Then print the result with Arrays.toString(Object[]):

// Requires: import java.util.Arrays;
public static void main(String[] args) {
    String s = "Hello there my name is John";
    String[] words = splitOnWhiteSpace(s);
    System.out.println(Arrays.toString(words));
}

Answer 2

If you are allowed to use a StringTokenizer object (which I think is what the assignment is asking for), it would look like this:

// Requires: import java.util.StringTokenizer;
StringTokenizer st = new StringTokenizer("this is a test");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}

This produces the output:

 this
 is
 a
 test

This example is taken from the Java documentation for StringTokenizer.

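The tokenizer splits the string into tokens and returns them one at a time. The while loop iterates over the tokens, and that is where you can apply your Pig Latin logic.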
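
Since the question's Tokenize method returns a String[], the loop above can be adapted to fill an array. The following is a minimal sketch, assuming StringTokenizer is permitted by the assignment. The class name StringTokenizerSketch and the helper tokenize are hypothetical names chosen for illustration; countTokens() is the standard StringTokenizer method that reports how many tokens remain.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

// Hypothetical class name, used only for this illustration.
class StringTokenizerSketch {

    // Collects all whitespace-separated tokens of the input into an array.
    static String[] tokenize(String english) {
        // With no delimiter argument, StringTokenizer splits on whitespace.
        StringTokenizer st = new StringTokenizer(english);
        // countTokens() tells us how many times nextToken() can still be called.
        String[] tokens = new String[st.countTokens()];
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = st.nextToken();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [Hello, there, my, name, is, John]
        System.out.println(Arrays.toString(tokenize("Hello there my name is John")));
    }
}
```

Sizing the array with countTokens() avoids the null entries that appear when the array is allocated with english.length() slots, as in the question.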

Answer 3

Here are some hints to get your "manual split" working.

The method String#indexOf(int ch, int fromIndex) can help you find the next occurrence of a character.
The method String#substring(int beginIndex, int endIndex) extracts a part of the string.

Here is some pseudocode showing how to do the split (you will need more safety handling, which I leave to you):

List<String> results = ...;
int startIndex = 0;
int endIndex = 0;

while (startIndex < inputString.length) {
    endIndex = get next index of space after startIndex
    if no space found {
        endIndex = inputString.length
    }
    String result = get substring of inputString from startIndex (inclusive) to endIndex (exclusive)
    results.add(result)
    startIndex = endIndex + 1  // move startIndex to next position after space
}

// here, results contains all the split words
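
The pseudocode above could be turned into Java roughly as follows. This is a sketch rather than a definitive solution: the class name IndexOfSplitSketch and the method splitOnSpaces are hypothetical, and the check that skips empty pieces is one possible form of the "safety handling" mentioned above. Note that the end index passed to substring is exclusive, so endIndex is passed as is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class IndexOfSplitSketch {

    // Splits the input on single space characters using indexOf and substring.
    static String[] splitOnSpaces(String input) {
        List<String> results = new ArrayList<>();
        int startIndex = 0;
        while (startIndex < input.length()) {
            // Find the next space at or after startIndex; -1 means none is left.
            int endIndex = input.indexOf(' ', startIndex);
            if (endIndex == -1) {
                endIndex = input.length();
            }
            // Skip empty pieces caused by leading or repeated spaces.
            if (endIndex > startIndex) {
                // substring takes an exclusive end index.
                results.add(input.substring(startIndex, endIndex));
            }
            // Move to the position just after the space.
            startIndex = endIndex + 1;
        }
        return results.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints [Hello, there, my, name, is, John]
        System.out.println(Arrays.toString(splitOnSpaces("Hello there my name is John")));
    }
}
```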

Answer 4

// Requires: import java.util.ArrayList; import java.util.List;
String english = "hello my fellow friend";
List<String> tokenized = new ArrayList<>();
String m = "";
for (int i = 0; i < english.length(); i++)
{
    // The order of the two conditions matters here: if you swap them,
    // english.charAt(i) throws an index-out-of-bounds exception
    // once i reaches the end of the string.
    while (i < english.length() && english.charAt(i) != ' ')
    {
        m = m + english.charAt(i);
        i++;
    }
    // Add to the list only if a word was collected;
    // a lone ' ' leaves m empty, so nothing is added.
    if (m.length() > 0)
    {
        tokenized.add(m);
        m = "";
    }
}
// Print the list.
for (int l = 0; l < tokenized.size(); l++) {
    System.out.print(tokenized.get(l) + ", ");
}

This prints "hello, my, fellow, friend, ". I used an ArrayList because the length the array needs is not obvious at first glance.
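
If a plain String[] is required and lists are off limits, one way to get the array length up front is to count the spaces first. The sketch below relies on the assumption stated in the question's update (no punctuation and no extra spaces), so the word count is the number of spaces plus one. The class name CountingTokenizeSketch is hypothetical.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
class CountingTokenizeSketch {

    // Assumes words are separated by exactly one space, with no leading
    // or trailing spaces (the assumption given in the question's update).
    static String[] tokenize(String english) {
        if (english.isEmpty()) {
            return new String[0];
        }
        // First pass: the number of words is the number of spaces plus one.
        int words = 1;
        for (int i = 0; i < english.length(); i++) {
            if (english.charAt(i) == ' ') {
                words++;
            }
        }
        // Second pass: build each word and store it when a space is reached.
        String[] tokenized = new String[words];
        int j = 0; // index of the next free slot in tokenized
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < english.length(); i++) {
            char c = english.charAt(i);
            if (c == ' ') {
                tokenized[j++] = current.toString();
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The last word is not followed by a space, so store it here.
        tokenized[j] = current.toString();
        return tokenized;
    }

    public static void main(String[] args) {
        // Prints [hello, my, fellow, friend]
        System.out.println(Arrays.toString(tokenize("hello my fellow friend")));
    }
}
```

With input that breaks the single-space assumption, this version would produce empty strings for the extra spaces, so the list-based approach is the safer choice for untrusted input.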

Tokenize method: splitting a string into an array
