Question

I get an error when I try to POS-tag Portuguese sentences with openNLPmodels.pt. The English model, openNLPmodels.en, works fine on English sentences.

Any help is appreciated.

R code

# R Code #
install.packages("openNLPmodels.pt", repos = "http://datacube.wu.ac.at/", type="source") 

library(openNLP)
library(NLP)
library(openNLPmodels.pt)

s <- paste("Um esquilo preto raro se tornou um visitante regular de um jardim suburbano.")

# For reference here is the English version of sentence #
# s <- paste("A rare black squirrel has become a regular visitor to a suburban garden.")
###

## Sentence token annotations.
sent_token_annotator <- Maxent_Sent_Token_Annotator(language = "pt", probs = FALSE, model ='openNLPmodels.pt')

# Code End #

Error

# Error #
Error in .jnew("java.io.FileInputStream", model) : 
  java.io.FileNotFoundException: openNLPmodels.pt (The system cannot find the file specified)

Answer 1

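I found a silly workaround, but it works:

1) Download the Portuguese model from here (I took pt-pos-maxent.bin): http://opennlp.sourceforge.net/models-1.5/

2) Then replace the English model with it at this path (the leading part depends on your computer): \R\win-library\3.4\openNLPdata\models\en-pos-maxent.bin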

After these steps, run the commands shown in: How to use OpenNLP to get POS tags in R?

The tagger then performs poorly on English and well on Portuguese ;)
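
A less invasive variant of this workaround, as a rough sketch: the error message in the question shows that the `model` argument is handed to `java.io.FileInputStream`, so it is read as a file path. That suggests the downloaded file can be passed directly instead of overwriting the English model. The helper name `pos_tag_with_model` and the file location `pt-pos-maxent.bin` in the working directory are hypothetical, and the sketch assumes openNLPmodels.pt is installed so that the Portuguese sentence and word tokenizer models resolve by default.

```r
# Hypothetical helper: POS-tag `text` using the POS model file at `pos_model_path`.
# Assumes the NLP and openNLP packages are installed, plus openNLPmodels.pt
# for the default "pt" sentence and word tokenizer models.
pos_tag_with_model <- function(text, pos_model_path, language = "pt") {
  # Fail early with a clear message instead of a Java FileNotFoundException.
  stopifnot(file.exists(pos_model_path))

  s <- NLP::as.String(text)

  # Sentence and word annotations must exist before POS tagging.
  a <- NLP::annotate(s, list(
    openNLP::Maxent_Sent_Token_Annotator(language = language),
    openNLP::Maxent_Word_Token_Annotator(language = language)
  ))

  # `model` is a path to a model file, not a package name.
  pos_annotator <- openNLP::Maxent_POS_Tag_Annotator(
    language = language,
    model = pos_model_path
  )
  a <- NLP::annotate(s, pos_annotator, a)

  # Keep only word annotations and pull out their POS feature.
  words <- subset(a, type == "word")
  tags <- sapply(words$features, `[[`, "POS")
  data.frame(token = s[words], tag = tags, stringsAsFactors = FALSE)
}

# Usage: runs only if the model file was downloaded to this (hypothetical) location.
model_file <- "pt-pos-maxent.bin"
if (file.exists(model_file)) {
  print(pos_tag_with_model(
    "Um esquilo preto raro se tornou um visitante regular de um jardim suburbano.",
    model_file
  ))
}
```

Because nothing inside openNLPdata is overwritten, English tagging should keep working alongside this.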

Answer 2

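I tried modifying the code you posted, and the problem seems to be the argument "model = 'openNLPmodels.pt'". As the error message shows, the value of model is opened as a file, and 'openNLPmodels.pt' is a package name, not a path to a model file.

If you set it to 'model = NULL', it should work, as in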

sent_token_annotator <- Maxent_Sent_Token_Annotator(language = "pt", probs = FALSE, model =NULL)

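When you use "NULL" for model, the default model for the language you selected is used, so it should be fine.

Note that the values you pass for the second and third arguments are the defaults, so you can omit them.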
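
To make the default lookup concrete, here is a minimal sketch of the full pipeline that relies only on the defaults. The helper `default_model_dir` is hypothetical, and the assumption that non-English models live in the `models` directory of a package named `openNLPmodels.<language>` is inferred from the package name and from the openNLPdata path mentioned in the other answer, not confirmed here.

```r
# Hypothetical helper: guess the directory in which openNLP looks for its
# default models. Assumption: English models ship in openNLPdata, other
# languages in a package named openNLPmodels.<language>.
default_model_dir <- function(language) {
  pkg <- if (language == "en") "openNLPdata" else sprintf("openNLPmodels.%s", language)
  system.file("models", package = pkg)  # "" when the package is not installed
}

pt_dir <- default_model_dir("pt")

if (nzchar(pt_dir)) {
  # Show which model files the installed package actually provides.
  print(list.files(pt_dir))

  s <- NLP::as.String(
    "Um esquilo preto raro se tornou um visitante regular de um jardim suburbano."
  )

  # model = NULL and probs = FALSE are the defaults, so only language is given.
  pipeline <- list(
    openNLP::Maxent_Sent_Token_Annotator(language = "pt"),
    openNLP::Maxent_Word_Token_Annotator(language = "pt"),
    openNLP::Maxent_POS_Tag_Annotator(language = "pt")
  )
  a <- NLP::annotate(s, pipeline)

  # Pair each word with its POS tag.
  words <- subset(a, type == "word")
  print(data.frame(
    token = s[words],
    tag = sapply(words$features, `[[`, "POS"),
    stringsAsFactors = FALSE
  ))
} else {
  message("openNLPmodels.pt does not appear to be installed.")
}
```

If the directory listing is empty or the package is missing, reinstalling openNLPmodels.pt as in the question is the first thing to check.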

The problem I ran into is with the Parse_Annotator command, but that is a separate question that I will post here soon.

R: openNLPmodels.pt error for Portuguese
