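Tokenizing and counting tokens

Time: 2014-01-12 14:06:10

Tags: java eclipse count token

The text of my TextView changes dynamically. I want to tokenize this text using a space (" ") as the delimiter, and then send the token count to another TextView.

Here is my code: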

public void onClick(View v) {
    // TODO Auto-generated method stub
    if (v.getId() == R.id.button5) {
        Intent i = new Intent(this, Tokenizing.class);

        String test = ((TextView) findViewById(R.id.textView6)).getText().toString();
        test = test.toLowerCase();
        test = test.replaceAll("\\W", " ");
        StringBuilder result = new StringBuilder();
        StringTokenizer st2 = new StringTokenizer(test);
        while (st2.hasMoreTokens()) {
            String st3 = st2.nextToken();
            System.out.println(st3 + st2.countTokens());
            // System.out.println("Count Token" + st2.countTokens());
            result.append(st3 + '\n');
        }

        i.putExtra("result", result.toString());
        startActivity(i);
        // Log.i("Test Klik Next", result);
    }
}

Result:

       stopwords 
       are 
       commonly 
       occurring 
       words

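The tokenizing works fine, but I don't get the token count in the result. Is there something wrong with my code?

I want the expected output to look like this: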

       (number of tokens)
        stopwords 
        are 
        commonly 
        occurring 
        words

2 answers:

Answer 0 (score: 1)

You may want to add one line to your code just before the loop over the generated tokens:

result.append(st2.countTokens() + "\n");
while (st2.hasMoreTokens()) {

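You may also want to comment out the System.out.println calls inside the while loop to avoid confusion.

Alternatively, you can get the count without iterating at all, from the tokens created by a regular String split: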

    String test = "This is a test String proving the concept";
    StringBuilder result = new StringBuilder();

    String[] tokens = test.split("\\s");
    result.append(tokens.length + "\n");

    for (String str : tokens) {
        result.append(str + '\n');
    }
    System.out.println(result);

Output:

8
This
is
a
test
String
proving
the
concept
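One caveat if the split approach is applied to the text from the question: replaceAll("\\W", " ") can leave runs of spaces and a trailing space, and a split on a single whitespace character then produces empty tokens that inflate the count. The sketch below is one possible way to handle that, not part of the original answer; the class name SplitDemo and the helper tokenize are made up for illustration.

```java
// Hypothetical demo class; SplitDemo and tokenize are illustrative names only.
final class SplitDemo {

    // Trims the text, then splits on runs of whitespace.
    // Returns an empty array for blank input instead of a single empty token.
    static String[] tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Punctuation becomes spaces, so "are, " turns into "are" plus two spaces.
        String cleaned = "stopwords are, commonly occurring words.".replaceAll("\\W", " ");

        // Splitting on a single whitespace character keeps the empty token
        // between the two adjacent spaces.
        System.out.println(cleaned.split("\\s").length); // prints 6

        // Splitting on whitespace runs yields only the real words.
        System.out.println(tokenize(cleaned).length);    // prints 5
    }
}
```

StringTokenizer does not have this problem, because it treats consecutive delimiters as one separator and never returns empty tokens.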

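Answer 1 (score: 0)

StringTokenizer has the method countTokens(), which calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception.

So, for the desired output, it has to be called before the loop: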
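To make that behaviour concrete, here is a small sketch that records what countTokens() returns just before each nextToken() call. The class name CountTokensDemo and the method remainingCounts are invented for this illustration and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names are illustrative only.
final class CountTokensDemo {

    // Collects the value of countTokens() observed before each nextToken() call.
    static List<Integer> remainingCounts(String text) {
        StringTokenizer st = new StringTokenizer(text);
        List<Integer> counts = new ArrayList<>();
        while (st.hasMoreTokens()) {
            counts.add(st.countTokens()); // tokens still available, including the next one
            st.nextToken();               // consuming a token lowers the remaining count by one
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints [5, 4, 3, 2, 1]
        System.out.println(remainingCounts("stopwords are commonly occurring words"));
    }
}
```

The count is a "remaining" count rather than a total, which is why reading it inside the loop gives a different number on every pass.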

public void onClick(View v) {
    if (v.getId() == R.id.button5) {
        Intent i = new Intent(this, Tokenizing.class);

        String test = ((TextView) findViewById(R.id.textView6)).getText().toString();
        test = test.toLowerCase();
        test = test.replaceAll("\\W", " ");
        StringBuilder result = new StringBuilder();
        StringTokenizer st2 = new StringTokenizer(test);

        // Read the total before consuming any token; afterwards the count shrinks.
        int len = st2.countTokens();
        System.out.println(len);
        result.append(len + "\n");

        while (st2.hasMoreTokens()) {
            String st3 = st2.nextToken();
            System.out.println(st3);
            result.append(st3 + '\n');
        }

        i.putExtra("result", result.toString());
        startActivity(i);
        // Log.i("Test Klik Next", result);
    }
}
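
If it helps to check the logic without an Android device, the string handling can be pulled out of the click handler into a plain Java method. This is only a sketch under that assumption; the class TokenReport and the method build are hypothetical names, and the Intent and TextView parts are deliberately left out.

```java
import java.util.StringTokenizer;

// Hypothetical helper class; not part of the original code.
final class TokenReport {

    // Lower-cases the text, replaces non-word characters with spaces,
    // and returns the token count followed by one token per line.
    static String build(String text) {
        String cleaned = text.toLowerCase().replaceAll("\\W", " ");
        StringTokenizer st = new StringTokenizer(cleaned);
        StringBuilder result = new StringBuilder();

        // The total must be read before the loop starts consuming tokens.
        result.append(st.countTokens()).append('\n');
        while (st.hasMoreTokens()) {
            result.append(st.nextToken()).append('\n');
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Prints 5, then stopwords, are, commonly, occurring, words on separate lines.
        System.out.print(build("Stopwords are commonly-occurring words!"));
    }
}
```

Inside onClick, the returned string could then be passed to i.putExtra("result", ...) exactly as in the code above.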