Question

I have a large data frame. For each column, I need to count how many values end in _xxsgt. Any suggestions?

  Col1             Col2  
  a               54_xxsgt   
  123_xxsgt       e     
  d               f  
  429_s_xxsgt     g

Expected output:

Col1: 2 (123_xxsgt and 429_s_xxsgt occur)
Col2: 1 (54_xxsgt occurs) ....

Best,

B

Answer 1

# First, a reproducible example
set.seed(42)
dd <- sample(letters[1:20], 100, replace = TRUE)
ix <- as.character(sample(c("", "_xxsgt"), 100, replace = TRUE))
dd <- paste(dd, ix, sep="")
df <- as.data.frame(matrix(dd, ncol=10))

# solution
sapply(df, function(x) length(grep("_xxsgt", x)))
V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 
 6   7   6   9   4   5   4   5   6   4
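Note that the pattern above matches `_xxsgt` anywhere in a value. If only values that actually end in `_xxsgt` should be counted, as the question describes, one option is to anchor the pattern with `$`. A minimal sketch, using a small hypothetical data frame `toy` (not from the original post) in which one value contains the pattern without ending in it:

```r
# Hypothetical toy data: "_xxsgt_old" contains the pattern but does not end with it
toy <- data.frame(
  Col1 = c("a", "123_xxsgt", "d", "429_s_xxsgt"),
  Col2 = c("54_xxsgt", "e", "_xxsgt_old", "g"),
  stringsAsFactors = FALSE
)

# Unanchored: counts any value that contains "_xxsgt"
unanchored <- sapply(toy, function(x) sum(grepl("_xxsgt", x)))

# Anchored with "$": counts only values that end in "_xxsgt"
anchored <- sapply(toy, function(x) sum(grepl("_xxsgt$", x)))

print(unanchored)
print(anchored)
```

For data like the reproducible example above, where the suffix is always appended at the end, both patterns should give the same counts; the anchor only matters when the pattern can also appear in the middle of a value.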

Answer 2

Try this:

> DF <- read.table(text=" Col1             Col2  
    a               54_xxsgt   
    123_xxsgt       e     
    d               f  
    429_s_xxsgt     g ", header=TRUE)
> 
> apply(DF, 2, function(x) sum(grepl('_xxsgt', x)))
Col1 Col2 
   2    1
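`apply()` converts the data frame to a matrix before looping over columns, which may be wasteful on a very large data frame. A possible alternative, sketched here under the assumption that base R 3.3.0 or later is available (for `endsWith()`), iterates over the columns directly and avoids regular expressions. The helper name `count_suffix` is hypothetical:

```r
# Rebuild the question's data; stringsAsFactors = FALSE keeps columns as character
DF <- read.table(text = "Col1 Col2
a 54_xxsgt
123_xxsgt e
d f
429_s_xxsgt g", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: per column, count the values that end with `suffix`
count_suffix <- function(data, suffix) {
  vapply(
    data,
    # as.character() guards against factor columns; na.rm skips missing values
    function(x) sum(endsWith(as.character(x), suffix), na.rm = TRUE),
    integer(1)
  )
}

counts <- count_suffix(DF, "_xxsgt")
print(counts)
```

`vapply()` is used instead of `sapply()` so that the result is guaranteed to be an integer vector with one count per column.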

Search and count based on a specific pattern
