我正在尝试使用Apache OpenNLP从文件中获取城市名称,但我首先尝试在单个String上使用该库。我使用Eclipse Kepler安装了Maven和OpenNLP,但是当我尝试运行该应用程序时,它给了我一堆错误:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import org.xml.sax.SAXException;
public class CityNames {
public String Tokens[];
public static void main(String[] args) throws IOException, SAXException{
CityNames toi = new CityNames();
String cnt;
cnt="John is planning to specialize in Electrical Engineering in UC Berkley and pursue a career with IBM.";
toi.tokenization(cnt);
String cities = toi.namefind(toi.Tokens);
String org = toi.orgfind(toi.Tokens);
System.out.println("City name is : "+cities);
System.out.println("organization name is: "+org);
}
public String namefind(String cnt[]) {
InputStream is;
TokenNameFinderModel tnf;
NameFinderME nf;
String sd = "";
try {
is = new FileInputStream("en-ner-location.bin");
tnf = new TokenNameFinderModel(is);
nf = new NameFinderME(tnf);
Span sp[] = nf.find(cnt);
String a[] = Span.spansToStrings(sp, cnt);
StringBuilder fd = new StringBuilder();
int l = a.length;
for (int j = 0; j < l; j++) {
fd = fd.append(a[j] + "\n");
}
sd = fd.toString();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (InvalidFormatException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return sd;
}
public String orgfind(String cnt[]) {
InputStream is;
TokenNameFinderModel tnf;
NameFinderME nf;
String sd = "";
try {
is = new FileInputStream("en-ner-organization.bin");
tnf = new TokenNameFinderModel(is);
nf = new NameFinderME(tnf);
Span sp[] = nf.find(cnt);
String a[] = Span.spansToStrings(sp, cnt);
StringBuilder fd = new StringBuilder();
int l = a.length;
for (int j = 0; j < l; j++) {
fd = fd.append(a[j] + "\n");
}
sd = fd.toString();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (InvalidFormatException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return sd;
}
public void tokenization(String tokens) {
InputStream is;
TokenizerModel tm;
try {
is = new FileInputStream("en-token.bin");
tm = new TokenizerModel(is);
Tokenizer tz = new TokenizerME(tm);
Tokens = tz.tokenize(tokens);
// System.out.println(Tokens[1]);
} catch (IOException e) {
e.printStackTrace();
}
}
}
例如,TokenNameFinderModel
无法解析为类型。
在我看来,我不认为Eclipse会识别OpenNLP包。
有什么建议吗?
这是我的pom.xml文件:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>CityNames</groupId>
<artifactId>CityNames</artifactId>
<version>0.0.1-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>org.apache.opennlp</groupId>
<artifactId>opennlp-tools</artifactId>
<version>1.5.3</version>
</dependency>
</dependencies>
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
<projectDescription>
<name>CityNames</name>
<comment></comment>
<projects></projects>
<buildSpec>
<buildCommand>
<name>org.eclipse.jdt.core.javabuilder</name>
<arguments>
</arguments>
</buildCommand>
</buildSpec>
<natures>
<nature>org.eclipse.jdt.core.javanature</nature>
</natures>
</projectDescription>
</project>
答案 0 :(得分:0)
基于您的 pom.xml 文件,它看起来就是您拥有所需的一切。
如果您正在使用Eclipse,那么您应该可以按Ctrl + Shift + O
来组织您的import语句并引入您当前缺少的语句(即TokenNameFinderModel
类和其他OpenNLP课程。)
完成后,这些导入语句应出现在课程顶部:
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.tokenize.Tokenizer;
import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;
import opennlp.tools.util.InvalidFormatException;
import opennlp.tools.util.Span;