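I am developing a system that gives the exact sense of a given word (obviously a polysemous one) according to its context. This research area is called Word Sense Disambiguation. For this, I need the root (stem) of the given word and the text that contains it. I will parse the text and use the root to find all occurrences of the given word.
For example, if the given word is "love", the system will parse the text and return all occurrences derived from "love", such as "lovely, loved, beloved, ...".
Here is what I tried, but unfortunately it does not give me what I want: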
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class Partenn1 {
        public static void main(String[] args) {
            int c = 0;
            String w = "tissue";
            try (BufferedReader br = new BufferedReader(new FileReader("D:/Sc46.txt")))
            {
                String line;
                while ((line = br.readLine()) != null)
                {
                    String[] WrdsLine = line.split(" ");
                    boolean findwrd = false;
                    for (String WrdLine : WrdsLine)
                    {
                        for (int a = 0; a < WrdsLine.length; a++)
                        {
                            if (WrdsLine[a].indexOf(w) != 0)
                            {
                                c++; // Just a counter to verify the number of occurrences.
                                findwrd = true;
                            }
                        }
                    }
                }
                System.out.println(c);
            }
            catch (IOException e) {}
        }
    }
Answer 0 (score: 3)
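The root of a word can be treated as a prefix of the word. Matching on it can then be done by calling the startsWith method, with that prefix, on each string.
The following code correctly prints '2', because 'tissue2' and 'tissue3' start with 'tissue'.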
    int count = 0;
    final String prefix = "tissue";
    try (BufferedReader br = new BufferedReader(new StringReader("tissue2 tiss tiss3 tissue3"))) {
        String line;
        while ((line = br.readLine()) != null) {
            // Get all the words on this line
            final String[] wordsInLine = line.split(" ");
            for (final String s : wordsInLine) {
                // Check that the word starts with the prefix.
                if (s.startsWith(prefix)) {
                    count++;
                }
            }
        }
        System.out.println(count);
    } catch (final IOException ignored) {
    }
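Real text usually carries capital letters and punctuation, which a plain split on spaces followed by startsWith will not handle. The sketch below is one possible way to normalize each token before the prefix check. The class name PrefixMatcher and the method wordsWithPrefix are made up for illustration, and the choice to lowercase and strip surrounding punctuation is an assumption about what counts as a word, not something the question specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PrefixMatcher {

    // Hypothetical helper: returns the words in `text` that start with `prefix`,
    // ignoring case and any punctuation attached to the start or end of a token.
    public static List<String> wordsWithPrefix(String text, String prefix) {
        List<String> matches = new ArrayList<>();
        String wanted = prefix.toLowerCase(Locale.ROOT);
        // Split on runs of whitespace rather than on a single space.
        for (String token : text.split("\\s+")) {
            // Strip leading and trailing characters that are neither letters nor digits.
            String word = token
                    .replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "")
                    .toLowerCase(Locale.ROOT);
            if (!word.isEmpty() && word.startsWith(wanted)) {
                matches.add(word);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints [tissue2, tissue3]
        System.out.println(wordsWithPrefix("Tissue2, tiss tiss3 (tissue3).", "tissue"));
    }
}
```

Returning the matched words instead of only a count makes it easy to inspect what was found; the count is just the size of the returned list.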
Answer 1 (score: 1)
The extra for loop is not needed, and each word should be checked against the w string:
    while ((line = br.readLine()) != null) {
        String[] WrdsLine = line.split(" "); // split
        for (String WrdLine : WrdsLine) {
            if (WrdLine.contains(w)) { // if match - print
                System.out.println(WrdLine);
            }
        }
    }
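To see why the check in the question, indexOf(w) != 0, counts the wrong words, it can help to print what indexOf, startsWith and contains return for a few sample words. The small program below is only an illustration: the class name MatchModes, the helper describe and the sample words are made up for this sketch.

```java
public class MatchModes {

    // Hypothetical helper: formats the result of the three checks for one word.
    public static String describe(String word, String root) {
        return word + ": indexOf=" + word.indexOf(root)
                + ", startsWith=" + word.startsWith(root)
                + ", contains=" + word.contains(root);
    }

    public static void main(String[] args) {
        String root = "love";
        for (String word : new String[] {"lovely", "beloved", "glove", "live"}) {
            System.out.println(describe(word, root));
        }
        // Prints:
        // lovely: indexOf=0, startsWith=true, contains=true
        // beloved: indexOf=2, startsWith=false, contains=true
        // glove: indexOf=1, startsWith=false, contains=true
        // live: indexOf=-1, startsWith=false, contains=false
    }
}
```

indexOf returns -1 when the root is absent and 0 when the word starts with it, so indexOf(w) != 0 is true for "live" and false for "lovely", the opposite of what is wanted. contains finds "beloved" but also accepts the unrelated "glove", while startsWith rejects both, so which check fits depends on how strictly the root has to match.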