Question

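How would I write a summary method for the following function, so that it reports the total number of words in a block of text and the most frequently used words in that text?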

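Edit - In other words, I would like to return a summary object for the result of the following function and have it formatted like a summary.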

findwords = function(tf) {
  txt = unlist(strsplit(tf,' '))
  wl = list()
  for(i in 1:length(txt)) {
    wrd = txt[i]
    wl[[wrd]] = c(wl[[wrd]],i)
  }
  return(wl)
}

I tried:

summary.findwords = function(obj) {
 txt = unlist(strsplit(obj,' '))
 cat("the total number of words\n")
 print(length(txt))
 cat("the frequency of words\n")
 print(rev(sort(table(txt))))
}

Answer 1

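I think this will get you started. Here is a slightly modified version of your function; the only change that matters is that it adds the class myClass to the result.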

findwords = function(tf) {
    txt = unlist(strsplit(tf,' '))
    wl = list()
    for(i in seq_along(txt)) {
        wrd = txt[i]
        wl[[wrd]] = c(wl[[wrd]], i)
    }
    class(wl) <- "myClass"
    return(wl)
}

And here are print and summary methods (a really simplified example).

print.myClass <- function(x, ...){
    cl <- oldClass(x)
    oldClass(x) <- cl[cl != "myClass"]
    NextMethod("print")
    invisible(x)
}

summary.myClass <- function(object, ...) {
    # argument names follow the summary() generic: function(object, ...)
    stopifnot(inherits(object, "myClass"))
    cat("\t\n",
        sprintf("Unique Words (Length): %s\n", length(object)),
        sprintf("Total Words: %s", sum(sapply(object, length))))
}

Then test it with a random sample of common words:

library(qdapDictionaries)
data(Top25Words)
samp <- paste(sample(Top25Words, 200, TRUE), collapse = " ")
fw <- findwords(samp)
class(fw)
# [1] "myClass"
head(fw, 3)
# $that
# [1]   1  36  54  63  76 165 182 191
# 
# $the
# [1]   2  68  70  92  97 132 151 168 186
# 
# $they
# [1]   3  75 199

summary(fw)

# Unique Words (Length): 25
# Total Words: 200
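
The summary method above prints directly and does not yet report the most frequent words. If you want an actual summary object, as the edit to the question asks, one common S3 pattern is to have the summary method return a small classed list and give that class its own print method. The following is only a minimal sketch of that idea, not a drop-in replacement: the names word_index, wordIndex and n_top are hypothetical and do not appear in the code above, and split() is swapped in for the loop, so the words come out in sorted order rather than in order of first appearance.

```r
# Hypothetical names for illustration: word_index(), class "wordIndex", n_top.
word_index <- function(tf) {
    txt <- unlist(strsplit(tf, " ", fixed = TRUE))
    txt <- txt[nzchar(txt)]                     # drop empty strings from repeated spaces
    # split() groups the positions 1..n by the word found at each position
    structure(split(seq_along(txt), txt), class = "wordIndex")
}

# The summary method computes the numbers and returns them as an object.
summary.wordIndex <- function(object, n_top = 5, ...) {
    counts <- sort(lengths(unclass(object)), decreasing = TRUE)
    structure(
        list(total  = sum(counts),              # total number of words
             unique = length(counts),           # number of distinct words
             top    = head(counts, n_top)),     # the n_top most frequent words
        class = "summary.wordIndex"
    )
}

# The print method for the summary object handles the formatting.
print.summary.wordIndex <- function(x, ...) {
    cat("Total words: ", x$total, "\n", sep = "")
    cat("Unique words: ", x$unique, "\n", sep = "")
    cat("Most frequent words:\n")
    print(x$top)
    invisible(x)
}

s <- summary(word_index("the cat saw the dog and the bird"))
s          # auto-printing dispatches to print.summary.wordIndex
s$total    # the individual components stay available for further use
```

Keeping the computation in the summary method and the formatting in the print method means the result can be stored and reused, which is how summary() behaves for lm objects, for example.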

Answer 2

I'm not sure whether this is what you want:

str = "How would I attempt to write a Summary Method for the following function, using it to find the total number of words in a block of text and the most frequent words used in the text"

ll = unlist(strsplit(str, ' '))
length(ll)
[1] 36

rev(sort(table(ll)))
ll
      the     words        to      text        of        in         a     write     would     using      used     total   Summary 
        4         2         2         2         2         2         2         1         1         1         1         1         1 
   number      most    Method        it         I       How function,  frequent       for following      find     block   attempt 
        1         1         1         1         1         1         1         1         1         1         1         1         1 
      and 
        1

Or, if you want a data frame:

data.frame(rev(sort(table(ll))))
      rev.sort.table.ll...
the                          4
words                        2
to                           2
text                         2
of                           2
in                           2
a                            2
write                        1
would                        1
using                        1
used                         1
total                        1
Summary                      1
number                       1
most                         1
Method                       1
it                           1
I                            1
How                          1
function,                    1
frequent                     1
for                          1
following                    1
find                         1
block                        1
attempt                      1
and                          1
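
The single-column layout above probably depends on the R version; as far as I know, newer versions may return a two-column frame from the same call. If you want a layout that does not depend on that, you can build the data frame explicitly from the table. This is just a sketch: the variable names words, tab and freq and the column names word and count are my own choices for illustration.

```r
# Illustrative names only: words, tab, freq, and the columns word/count.
words <- unlist(strsplit("the cat saw the dog and the bird", " ", fixed = TRUE))
tab <- table(words)

# Build the columns explicitly instead of relying on how data.frame()
# converts a table object.
freq <- data.frame(word  = names(tab),
                   count = as.vector(tab),
                   stringsAsFactors = FALSE)

# Most frequent first; ties are broken alphabetically.
freq <- freq[order(-freq$count, freq$word), ]
rownames(freq) <- NULL

head(freq, 3)
```

Note that splitting on a single space keeps punctuation attached to words, which is why "function," shows up with its comma in the output above.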
