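I am reading some data from a text file and then extracting certain sentences from it. Finally, I am trying to populate a data frame with the objects I create, using a for loop.
I tried to write a loop for this: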
us_doc1 <- "This is a sample text which has lots of bla bla bla text data. Employee Handbook 2019. docx Page 1."
us_doc2 <- "This is another sample text which has lots of bla bla bla text data. Employee Handbook 2019. docx Page 2."
us_doc <- c(us_doc1, us_doc2)
usDoc <- data.frame()
for (i in seq_along(us_doc)) {
  # read every page
  x <- us_doc[i]
  # collapse runs of whitespace into a single space
  x <- gsub("\\s+", " ", x)
  # the file name contains a period, so match it on the whole page before splitting
  handbook <- regmatches(x, regexpr("Employee Handbook 2019\\. ?docx", x, ignore.case = TRUE))
  handbook <- sub(". ", ".", handbook, fixed = TRUE)
  # split into sentences and trim the leading space left by the split
  x <- trimws(unlist(strsplit(x, "\\.")))
  PageReference <- x[grep("docx Page", x, ignore.case = TRUE)]
  # append one row per page; x[1] is the first sentence of the page
  usDoc <- rbind(usDoc, data.frame(v1 = i, v2 = handbook, v3 = PageReference, v4 = x[1]))
}
The expected result is a data frame like this:
v1  v2                           v3           v4
1   Employee Handbook 2019.docx  docx Page 1  This is a.....
2   Employee Handbook 2019.docx  docx Page 2  This is another...
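
As a rough sketch of an alternative to growing the data frame inside the loop, each page could be turned into a one-row data frame and all rows bound once at the end. The names pages and extract_page are hypothetical, and the sketch assumes every page contains both the handbook file name and a "docx Page N" reference; a page missing either would make data.frame() fail.

```r
# Sample pages copied from the question.
pages <- c(
  "This is a sample text which has lots of bla bla bla text data. Employee Handbook 2019. docx Page 1.",
  "This is another sample text which has lots of bla bla bla text data. Employee Handbook 2019. docx Page 2."
)

# extract_page() is a hypothetical helper, not part of any package.
# It returns a one-row data frame for page number i.
extract_page <- function(i, page) {
  # Collapse runs of whitespace into a single space.
  page <- gsub("\\s+", " ", page)
  # File name: matched on the whole page because it contains a period.
  handbook <- regmatches(
    page,
    regexpr("Employee Handbook 2019\\. ?docx", page, ignore.case = TRUE)
  )
  # Page reference, e.g. "docx Page 1".
  page_ref <- regmatches(
    page,
    regexpr("docx Page [0-9]+", page, ignore.case = TRUE)
  )
  # First sentence: everything before the first period.
  first_sentence <- sub("\\..*$", "", page)
  data.frame(
    v1 = i,
    v2 = sub(". ", ".", handbook, fixed = TRUE),  # drop the space after the period
    v3 = page_ref,
    v4 = first_sentence,
    stringsAsFactors = FALSE
  )
}

# Build one row per page, then bind all rows in a single call.
rows <- lapply(seq_along(pages), function(i) extract_page(i, pages[i]))
usDoc <- do.call(rbind, rows)
print(usDoc)
```

Binding once with do.call(rbind, rows) avoids copying the data frame on every iteration, which tends to matter more as the number of pages grows.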