Question

Recently I built a list of keywords from a set of documents:

txtdir<-"E:/2015Ddtrial"
txtnames<-list.files(txtdir, pattern = ".txt", all.files = FALSE,
                     recursive = TRUE, include.dirs = FALSE, full.names=TRUE)
# prepared for data input
pkw<-vector()
#extract the keywords of the papers in the text files
for(i in 1:length(txtnames)) {
  fc<-file(txtnames[i], open = "r", encoding = "UTF-8")
  #alternative method of reading files: txtls<-readLines(con = fc, n=6, encoding = "unknown", skipNul = TRUE)
  txtls<-scan(file = fc, what=character(), nmax = 6, sep = "\n", blank.lines.skip = TRUE, skipNul = TRUE, fileEncoding = "UTF-8")
  rawvec<-grep("关键词", txtls, value = TRUE)
  first<-regexpr("关键词", rawvec, ignore.case = FALSE)
  last<-regexpr("分类号", rawvec, ignore.case = FALSE)
  pkw<-append(pkw, substring(rawvec, first+4, last-2))
  close(fc)
}
#strip leading and trailing blanks from each extracted keyword string
kwdict<-gsub("^[[:blank:]+]|[[:blank:]+]$", "", pkw)

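Below is a sample of the multilingual list (kwdict) that I use to prepare a list of search terms for a matrix. I did not realize the list had never been checked until a scholar pointed the problem out to me. The character "list" looks like this: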

 [1] "  本体 构建   语义 协同   知识库   可视化   系统 架构    "
 [2] ""
 [3] "  知识 服务   知识 组织 体系   本体   语义 网 技术    "
 [4] ""
 [5] "    服务 接口   知识 组织   开放 查询   语义 推理   Web   服务 "
 [6] "    Solr   分面 搜索   标准 信息管理 "
 [7] "  语义   Ｗ ｉ ｋ ｉ   标注   导航   检索   Ｓ ｅ ｍ ａ ｎ ｔ ｉ ｃ Ｍ ｅ ｄ ｉ ａ Ｗ ｉ ｋ ｉ   Ｐ Ａ Ｕ Ｘ   Ｉ ｋ ｅ Ｗ ｉ ｋ ｉ    "
 [8] "  Liferay   主从 模式   集成 知识 平台    "
 [9] "    数据 摄取   SKE   本体   属性 映射   三元组 存储    "
[10] "本体   实体 检索   查询 问答   关联 检索   可视化"

I tried using "unlist" to break it up into individual words:

expansion <- unlist(kwdict, recursive = FALSE)

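But the result is identical to my input (listed above), with no change at all. I now suspect that "unlist" alone is not enough, but I don't know what else to do. Could you recommend another approach?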
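To make the behaviour concrete, here is a minimal sketch of what seems to be happening. The sample strings are copied from elements 3, 4 and 6 of the output above. The split patterns are assumptions about the data, not a confirmed fix: the keywords appear to be separated by runs of two or more spaces, with single spaces between the segmented words inside one keyword.

```r
# Sample strings copied from the output above (elements 3, 4 and 6 of kwdict)
kwdict <- c("  知识 服务   知识 组织 体系   本体   语义 网 技术    ",
            "",
            "    Solr   分面 搜索   标准 信息管理 ")

# kwdict is an atomic character vector, not a list,
# so unlist() hands it back unchanged
identical(unlist(kwdict), kwdict)  # TRUE

# Assumption: keyword phrases are separated by two or more spaces.
# strsplit() returns a list with one element per input string;
# unlist() then flattens that list into a single character vector.
phrases <- unlist(strsplit(trimws(kwdict), " {2,}"))
phrases <- phrases[nzchar(phrases)]  # drop any empty pieces

# Splitting on any run of whitespace gives the individual words instead
words <- unlist(strsplit(trimws(kwdict), "[[:space:]]+"))
words <- words[nzchar(words)]
```

Here "unlist" only does something useful after "strsplit" has produced a real list. Whether phrases or single words are the right unit depends on how the matrix is supposed to be built.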

Unlist fails on a multilingual "list"

0 answers: