Could not find function "create_vocab_corpus" in the text2vec package in R

Time: 2016-05-01 12:27:51

Tags: r text2vec

I am trying to understand the text2vec package by following http://dsnotes.com/articles/text2vec, but I get stuck at this step of the tutorial:

Now we can construct the DTM. Again, since all functions related to corpus construction have a streaming API, we have to create an iterator and provide it to the create_vocab_corpus function:

text2vec

This code throws an error:

  

Error: could not find function "create_vocab_corpus"

1 answer:

Answer 0: (score: 1)

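See the tutorial for the latest version (0.3): https://cran.r-project.org/web/packages/text2vec/vignettes/text-vectorization.html. There were some breaking API changes in v0.3, so code written for the older tutorial may call functions that no longer exist.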
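
As a rough illustration of what the newer workflow looks like, here is a minimal sketch. It assumes a recent text2vec release (0.4 or later) in which, as far as I know, create_vocabulary, vocab_vectorizer and create_dtm are the functions used to build a DTM; argument names have changed between releases, so check the vignette for the version you have installed. The three-document corpus is made up for the example.

```r
library(text2vec)

# Tiny made-up corpus, purely for illustration (not from the tutorial).
docs <- c("The movie was great",
          "The plot was weak",
          "Great acting, weak plot")

# Lowercase and tokenize up front, then wrap the token lists in an iterator.
# Assumption: itoken() accepts a list of token vectors and a progressbar
# argument, as in text2vec 0.4 and later.
tokens <- word_tokenizer(tolower(docs))
it <- itoken(tokens, progressbar = FALSE)

# Build the vocabulary, turn it into a vectorizer, then build the DTM.
vocab <- create_vocabulary(it)
vectorizer <- vocab_vectorizer(vocab)
dtm <- create_dtm(it, vectorizer)

# One row per document, one column per vocabulary term.
dim(dtm)
```

If this runs but the old tutorial code does not, the installed package is newer than the tutorial, and the CRAN vignette linked above is the one to follow.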