Question

I wrote a function that cleans text and then computes unigrams, bigrams, and trigrams, but I get this error:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 30, 19, 10

The function works for some columns, but for others it fails with this error, and I don't know why.

Here is my code:

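# Assumed packages (not stated in the original post):
library(tm)        # removePunctuation()
library(quanteda)  # tokens(), tokens_ngrams(), dfm(), topfeatures()
library(magrittr)  # %>%

why <- function(L) {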

  L <- removePunctuation(L)
  L <- gsub("^[[:space:]]*","",L)
  unigram <- L %>% 
       tokens() %>% 
       tokens_ngrams(n = 1, concatenator = " ") %>% 
       dfm() %>% 
       topfeatures(30)
  df1 <- data.frame(word_unigram = names(unigram), count_unigram = unigram)
  rownames(df1) <- NULL 
  bigram <- L %>% 
       tokens() %>%     
       tokens_ngrams(n = 2, concatenator = " ") %>% 
       dfm() %>% 
       topfeatures(30)
  df2 <- data.frame(word_bigram = names(bigram), count_bigram = bigram)
  rownames(df2) <- NULL

  trigram <- L %>% 
       tokens() %>% 
       tokens_ngrams(n = 3, concatenator = " ") %>% 
       dfm() %>% 
       topfeatures(30)
  df3 <- data.frame(word_trigram = names(trigram), count_trigram = trigram)
  rownames(df3) <- NULL 

  return(list(df1, df2, df3))
}

datafinal <- data.frame(lapply(data[16:21], function (L) why(L)))

It does not work even when I pass a single column from data[16:21]. Can anyone help?
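A possible explanation, sketched in base R with made-up data: `topfeatures(30)` returns at most 30 features, so a column with few distinct bigrams or trigrams gives shorter data frames. `data.frame()` then tries to bind the frames returned by `why()` side by side and fails when their row counts differ. The frames below are hypothetical stand-ins sized to match the error message (30, 19, 10), and `pad_rows` is an illustrative helper, not part of any package.

```r
# Hypothetical stand-ins for df1, df2, df3 as returned by why()
df1 <- data.frame(word_unigram = paste0("w", 1:30), count_unigram = 30:1)
df2 <- data.frame(word_bigram  = paste0("b", 1:19), count_bigram  = 19:1)
df3 <- data.frame(word_trigram = paste0("t", 1:10), count_trigram = 10:1)
res <- list(df1, df2, df3)

# data.frame() on a list of frames binds them column-wise,
# which fails because the row counts differ
err <- tryCatch(data.frame(res), error = function(e) conditionMessage(e))
print(err)

# Illustrative helper: pad a data frame with NA rows up to n rows
pad_rows <- function(df, n) {
  out <- df[seq_len(n), , drop = FALSE]  # indices past nrow(df) give NA rows
  rownames(out) <- NULL
  out
}

# Pad every frame to the longest one, then bind the columns
n_max <- max(vapply(res, nrow, integer(1)))
combined <- do.call(cbind, lapply(res, pad_rows, n = n_max))
print(dim(combined))
```

If a single rectangular table is not required, keeping the result as a nested list (calling `lapply(data[16:21], why)` without the outer `data.frame()`) avoids the padding altogether.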

Here is some sample data:

  N             O       P                  Q            R                 S  
 yes            no     no                  no          happy birthday   
I am happy             hello friends       I am student                   yes

Error in function arguments: arguments imply differing number of rows

0 answers: