Question

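I am trying to compare 2 indexes of a string. Basically, if the first index of the string is not equal to the second index, add a new searchIndex() call in the void main method.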

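This means that if a user enters a 2-word query in the search engine, the results should show the matching text files for the first word and the matching text files for the second word separately, instead of showing the total number of matches with the text files mixed together, with no way to tell which word relates to which text file.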

If I enter Japan in the text box

Output:

Searching for 'Japan '
3
Files\whalehunt.txt
Files\japan.txt
Files\innovation.txt

But if I enter two words:

Output:

Searching for 'Japan amazon '
5
Files\whalehunt.txt
Files\japan.txt
Files\peru.txt
Files\correspondent.txt
Files\innovation.txt

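With 2 words, the user cannot tell which word matched which file. It is all jumbled together. What I want to do is compare the indexes of the query string to check whether the two words are the same. If they are not, a new searchIndex() call should be added in the void main method, with the second word assigned to it.

So instead of this: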

public static void main(String[] args) throws Exception {

        createIndex();

        searchIndex("Japan amazon ");

    }

Do this:

public static void main(String[] args) throws Exception {

        createIndex();

        searchIndex("Japan ");
        searchIndex("amazon");

    }

What I have tried:

public static void searchIndex(String searchString) throws IOException, ParseException {

        for(int n=0;n<=1000;n++)
        {
            if (searchString.substring(0) != searchString.substring(1))
            {

                void main.searchIndex(searchString.); //**Error**
            }
        }

Any help would be greatly appreciated!!

Regards.

Answer 1

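Your searchIndex() method is completely wrong, IMHO. You are comparing indexes 0 and 1 of the string 1000 times. Why not use a tokenizer or String.split() to pull the individual words out of the string? Like this: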

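public static void searchIndex(String searchString) throws IOException, ParseException {
    searchString = searchString.trim();
    // length() is a method on String, not a field
    if (searchString.length() < 1)
        return;
    // Split on runs of whitespace so repeated spaces do not produce empty words
    String[] words = searchString.split("\\s+");
    if (words.length > 1) {
        // Search each word on its own so its results are reported separately
        for (String word : words)
            searchIndex(word);
    } else {
        // Do normal stuff here
    }
}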
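
To see the per-word grouping end to end, here is a minimal self-contained sketch. The class PerWordSearch, the methods searchOneWord and searchPerWord, and the in-memory FILES map with its made-up sample text are all assumptions for illustration; in the real program the single-word search would go against the index built by createIndex().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerWordSearch {

    // Hypothetical stand-in for the real index: file name -> made-up sample text.
    static final Map<String, String> FILES = new LinkedHashMap<>();
    static {
        FILES.put("Files\\japan.txt", "Japan is an island country");
        FILES.put("Files\\peru.txt", "The Amazon river rises in Peru");
        FILES.put("Files\\whalehunt.txt", "Japan resumes its whale hunt");
    }

    // Hypothetical single-word search: files whose text contains the word,
    // ignoring case.
    static List<String> searchOneWord(String word) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> file : FILES.entrySet()) {
            if (file.getValue().toLowerCase().contains(word.toLowerCase())) {
                hits.add(file.getKey());
            }
        }
        return hits;
    }

    // Splits the query on whitespace and keeps each word's hits separate.
    static Map<String, List<String>> searchPerWord(String query) {
        Map<String, List<String>> results = new LinkedHashMap<>();
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return results;
        }
        for (String word : trimmed.split("\\s+")) {
            results.put(word, searchOneWord(word));
        }
        return results;
    }

    public static void main(String[] args) {
        // One header, count, and file list per word instead of one mixed list.
        for (Map.Entry<String, List<String>> entry : searchPerWord("Japan amazon ").entrySet()) {
            System.out.println("Searching for '" + entry.getKey() + "'");
            System.out.println(entry.getValue().size());
            for (String file : entry.getValue()) {
                System.out.println(file);
            }
        }
    }
}
```

Returning a map keyed by word, rather than printing inside the search, leaves it up to the caller how to present each word's files.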

BTW, I assume you are aware of tools such as Apache Lucene and algorithms such as MapReduce.
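
On the comparison in the question: substring(0) and substring(1) do not return single characters, and != on String objects compares references rather than content. A small sketch of the difference follows; the class name SubstringVsCharAt and the helper firstTwoCharsDiffer are made up for illustration.

```java
public class SubstringVsCharAt {

    // True when the first two characters of s differ; uses charAt, not substring.
    static boolean firstTwoCharsDiffer(String s) {
        return s.length() >= 2 && s.charAt(0) != s.charAt(1);
    }

    public static void main(String[] args) {
        String query = "Japan amazon";
        // substring(n) returns everything from position n to the end of the string.
        System.out.println(query.substring(0)); // Japan amazon
        System.out.println(query.substring(1)); // apan amazon
        // Content comparison needs equals(); != on Strings compares references.
        System.out.println(query.substring(0).equals(query.substring(1))); // false
        System.out.println(firstTwoCharsDiffer(query)); // true
    }
}
```

Either way, comparing characters does not separate the words of a query, which is why splitting the string is the better fit here.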

Answer 2

You can use a BreakIterator, or the simpler StringTokenizer.

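 import java.text.BreakIterator;

 public static void breakSentenceIntoWords(String source) {
         BreakIterator boundary = BreakIterator.getWordInstance();
         boundary.setText(source);
         int start = boundary.first();
         for (int end = boundary.next();
              end != BreakIterator.DONE;
              start = end, end = boundary.next()) {
              // substring(start, end) covers the whole segment between two boundaries
              String newWordToSearch = source.substring(start, end);
              // the word iterator also yields whitespace and punctuation segments; skip them
              if (!Character.isLetterOrDigit(newWordToSearch.codePointAt(0))) {
                  continue;
              }
              // perform search and other ops here
         }
     }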
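
The StringTokenizer route mentioned above might look like the sketch below. The class name TokenizerSketch and the helper wordsOf are made up for illustration, and the println stands in for the real searchIndex(word) call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerSketch {

    // Splits the query on whitespace (StringTokenizer's default delimiters).
    static List<String> wordsOf(String query) {
        List<String> words = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(query);
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        for (String word : wordsOf("Japan amazon ")) {
            // Placeholder for the real call, e.g. searchIndex(word)
            System.out.println("Searching for '" + word + "'");
        }
    }
}
```

With the default delimiters the tokenizer never returns empty tokens, so leading, trailing, or repeated spaces in the query need no extra handling.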

Comparing the indexes of a string in Java