Question

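I have a list of strings that I want to sort not lexicographically but by weight (the number of times a word appears in a given URL divided by the number of words in that URL).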

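The problem is in the method "searchPrefix": when I create a new Comparator, it apparently does not recognize the fields of the class that I use to compute the weight.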

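Things I have tried:
1. Using a SortedMap, which would not need a Comparator at all, except that the instructions specifically say to implement a Comparator.
2. Using getters (did not work either, since I am working inside the class and the method).
3. Declaring the list as List<...> urlList = new ArrayList ... which did not work either.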

(Implementing the Comparator is what I want to do.) How do I change it?

package il.ac.tau.cs.sw1.searchengine;

import java.util.*;

public class MyWordIndex implements WordIndex {

    public SortedMap<String, HashMap<String, Integer>> words;
    public HashMap<String, Integer> urls;

    public MyWordIndex() {
        this.words = new TreeMap<String, HashMap<String, Integer>>();
        this.urls = new HashMap<String, Integer>();
    }

    @Override
    public void index(Collection<String> words, String strURL) {
        this.urls.put(strURL, words.size()); // to every page- how many words in it.
        String subPrefix = "";
        HashMap<String, Integer> help1; // how many times a word appears on that page
        for (String word : words) {
            if (word == null || word == "") // not a valid word
                continue;
            word.toLowerCase();
            help1 = new HashMap<String, Integer>();
            for (int i = 0; i < word.length(); i++) {
                subPrefix = word.substring(0, i);
                if (this.words.get(subPrefix) == null) { // new prefix
                    help1.put(strURL, 1);
                    this.words.put(subPrefix, help1);
                }
                else {  // prefix exists
                    if (this.words.get(subPrefix).get(strURL) == null)//new URL with old prefix
                        this.words.get(subPrefix).put(strURL, 1);
                    else                           // both url and prefix exists   
                        this.words.get(subPrefix).put(strURL, help1.get(strURL) + 1);
                }
            }
        }
    }

    @Override
    public List<String> searchPrefix(String prefix) {
        prefix.toLowerCase();
        List<String> urlList = new ArrayList<String>();
        for (String word : this.words.keySet()) {
            if (word.startsWith(prefix)) {
                for (String strUrl : this.words.get(word).keySet()) {
                    urlList.add(strUrl);
                }
            }
        }
        Collections.sort(urlList, new Comparator<String>() {
            @Override
            public int compare(String strUrl1, String strUrl2) {
                Double d1 =  this.words.get(word).get(strUrl1) / this.urls.get(strUrl1);
                Double d2 =  this.words.get(word).get(strUrl2) / this.urls.get(strUrl2);
                return Double.compare(d1, d2);
            }
        });

        ........
    }

Answer 1

These changes get you closer to a solution.

Double d1 =  MyWordIndex.this.words.get(word).get(strUrl1) / (double) MyWordIndex.this.urls.get(strUrl1);
Double d2 =  MyWordIndex.this.words.get(word).get(strUrl2) / (double) MyWordIndex.this.urls.get(strUrl2);

I don't know what word is supposed to be, since there is no variable with that name in scope.
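To illustrate why the qualified form matters: inside an anonymous Comparator, a plain "this" refers to the Comparator object itself, so the outer fields have to be reached through OuterClass.this (or simply by bare name). The sketch below is a minimal, self-contained stand-in; WeightSortDemo, hits, sizes and the example URLs are hypothetical names, not part of the original class. It also shows the cast to double, which avoids integer division.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for MyWordIndex, reduced to what the comparator needs.
class WeightSortDemo {
    private final Map<String, Integer> hits = new HashMap<String, Integer>();  // URL -> occurrences of the word
    private final Map<String, Integer> sizes = new HashMap<String, Integer>(); // URL -> total words on the page

    void add(String url, int hitCount, int pageSize) {
        hits.put(url, hitCount);
        sizes.put(url, pageSize);
    }

    List<String> sortedByWeight() {
        List<String> urls = new ArrayList<String>(hits.keySet());
        Collections.sort(urls, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                // Plain "this" would be the Comparator; WeightSortDemo.this is the enclosing instance.
                double wa = WeightSortDemo.this.hits.get(a) / (double) WeightSortDemo.this.sizes.get(a);
                double wb = WeightSortDemo.this.hits.get(b) / (double) WeightSortDemo.this.sizes.get(b);
                return Double.compare(wa, wb); // ascending weight
            }
        });
        return urls;
    }

    public static void main(String[] args) {
        WeightSortDemo demo = new WeightSortDemo();
        demo.add("a.example", 1, 10); // weight 0.1
        demo.add("b.example", 3, 4);  // weight 0.75
        demo.add("c.example", 2, 8);  // weight 0.25
        System.out.println(demo.sortedByWeight()); // [a.example, c.example, b.example]
    }
}
```

Swapping the two arguments of Double.compare would give descending weight instead, if the heaviest URL should come first.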

Answer 2

A suggestion for the for loop in your index method:

for (int i = 1; i <= word.length(); i++) { // no point starting at 0 - empty string; <= so the whole word is indexed too
    subPrefix = word.substring(0, i);
    HashMap<String, Integer> counts = this.words.get(subPrefix);
    if (counts == null) { // new prefix: give it its own map so prefixes never share counts
        counts = new HashMap<String, Integer>();
        this.words.put(subPrefix, counts);
    }
    Integer count = counts.get(strURL);
    if (count == null) // new URL with old prefix
        count = 0;
    counts.put(strURL, count + 1);
}

While we're at it, may I suggest a Guava Multiset, which does the counting for you automatically:

import com.google.common.collect.Multiset;
import com.google.common.collect.HashMultiset;

public class MultiTest{

    public final Multiset<String> words;

    public MultiTest() {
        words = HashMultiset.create();
    }

    public static void main(String []args) {
        MultiTest test = new MultiTest();
        test.words.add("Mandible");
        test.words.add("Incredible");
        test.words.add("Commendable");
        test.words.add("Mandible");
        System.out.println(test.words.count("Mandible")); // 2
    }
}
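If adding Guava is not an option, a plain HashMap can do the same counting. This is only a sketch and assumes Java 8 or later, since it relies on Map.merge; MergeCountDemo and count are hypothetical names used for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class MergeCountDemo {
    // Counts occurrences of each word; Map.merge requires Java 8 or later.
    static Map<String, Integer> count(String... words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("Mandible", "Incredible", "Commendable", "Mandible");
        System.out.println(counts.get("Mandible")); // 2
    }
}
```

Unlike Multiset.count, Map.get returns null for a word that was never added, so callers may want getOrDefault(word, 0).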

Finally, to solve your problem, this should work, though I haven't tested it:

@Override
public List<String> searchPrefix(String prefix) {
    prefix = prefix.toLowerCase(); // Strings are immutable so this returns a new String
    final Map<String, Double> urlList = new HashMap<String, Double>(); // final so the anonymous Comparator can use it before Java 8
    for (String word : this.words.keySet()) {
        if (word.startsWith(prefix)) {
            for (String strUrl : this.words.get(word).keySet()) {
                Double v = urlList.get(strUrl);
                if (v == null) v = 0.0; // a bare 0 is an int and cannot be assigned to a Double
                urlList.put(strUrl, v + this.words.get(word).get(strUrl));
            }
        }
    }
    List<String> myUrls = new ArrayList<String>(urlList.keySet());
    Collections.sort(myUrls, new Comparator<String>() {
        @Override
        public int compare(String strUrl1, String strUrl2) {
            return Double.compare(urlList.get(strUrl1) / MyWordIndex.this.urls.get(strUrl1),
                                  urlList.get(strUrl2) / MyWordIndex.this.urls.get(strUrl2));
        }
    });

    return myUrls;
}
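On Java 8 or later the sorting step can be written more compactly. The sketch below is one possible variant, not a tested drop-in: WeightRanking, rank and the sample data are hypothetical, and the lexicographic tie-break is an assumption rather than something the question asks for. It computes each weight once up front instead of dividing on every comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WeightRanking {
    // hits: URL -> summed prefix matches; sizes: URL -> total words on that page.
    static List<String> rank(Map<String, Double> hits, Map<String, Integer> sizes) {
        // Compute each weight once instead of on every comparison.
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Double> e : hits.entrySet()) {
            weights.put(e.getKey(), e.getValue() / sizes.get(e.getKey()));
        }
        List<String> urls = new ArrayList<>(weights.keySet());
        // Ascending weight; equal weights fall back to lexicographic URL order (an assumption).
        urls.sort(Comparator.comparingDouble((String u) -> weights.get(u))
                            .thenComparing(Comparator.naturalOrder()));
        return urls;
    }

    public static void main(String[] args) {
        Map<String, Double> hits = new HashMap<>();
        Map<String, Integer> sizes = new HashMap<>();
        hits.put("x.example", 2.0); sizes.put("x.example", 4);  // weight 0.5
        hits.put("y.example", 1.0); sizes.put("y.example", 10); // weight 0.1
        hits.put("w.example", 5.0); sizes.put("w.example", 10); // weight 0.5
        System.out.println(rank(hits, sizes)); // [y.example, w.example, x.example]
    }
}
```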

Implementing a Comparator to sort a list of strings
