I have a very large data frame. I need to count, column by column, how many entries end with _xxsgt. Any suggestions?
Col1         Col2
a            54_xxsgt
123_xxsgt    e
d            f
429_s_xxsgt  g
Expected output:
Col1: 2 (123_xxsgt and 429_s_xxsgt occur)
Col2: 1 (54_xxsgt occurs)
....
Best,
B
Answer 0 (score: 4)
# First, a reproducible example
set.seed(42)
dd <- sample(letters[1:20], 100, replace = TRUE)
ix <- as.character(sample(c("", "_xxsgt"), 100, replace = TRUE))
dd <- paste(dd, ix, sep="")
df <- as.data.frame(matrix(dd, ncol=10))
# Solution: for each column, count the entries that match "_xxsgt"
sapply(df, function(x) length(grep("_xxsgt", x)))
#  V1  V2  V3  V4  V5  V6  V7  V8  V9 V10
#   6   7   6   9   4   5   4   5   6   4
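One caveat on the pattern: "_xxsgt" without an anchor matches the text anywhere in an entry, while the question asks for entries that end with it. A minimal sketch of the difference, assuming a small made-up data frame (the "_xxsgt_old" value is hypothetical and only there to show the mismatch):

```r
# Hypothetical sample data: "_xxsgt_old" contains the suffix but does not end with it
df <- data.frame(
  Col1 = c("a", "123_xxsgt", "d", "429_s_xxsgt"),
  Col2 = c("54_xxsgt", "e", "_xxsgt_old", "g"),
  stringsAsFactors = FALSE
)

# Unanchored: matches "_xxsgt" anywhere in the string
unanchored <- sapply(df, function(x) length(grep("_xxsgt", x)))

# Anchored with "$": matches only strings that end in "_xxsgt"
anchored <- sapply(df, function(x) length(grep("_xxsgt$", x)))

print(unanchored)
print(anchored)
```

If every entry either ends with the suffix or does not contain it at all, both patterns give the same counts; the anchor only matters when the text can also appear in the middle of an entry.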
Answer 1 (score: 3)
Try this:
DF <- read.table(text=" Col1 Col2
a 54_xxsgt
123_xxsgt e
d f
429_s_xxsgt g ", header=TRUE)

# For each column (MARGIN = 2), count the entries that match "_xxsgt"
apply(DF, 2, function(x) sum(grepl('_xxsgt', x)))
# Col1 Col2
#    2    1
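A possible variant, in case a regular expression is not needed: base R's endsWith() (available since R 3.3.0) tests for a literal suffix, and vapply() works column by column without first converting the data frame to a matrix the way apply() does. This is only a sketch; the helper name count_suffix is made up for illustration.

```r
# Hypothetical helper: counts, per column, the entries that end with `suffix`
count_suffix <- function(data, suffix = "_xxsgt") {
  vapply(
    data,
    # as.character() lets this work on factor columns too;
    # na.rm = TRUE skips missing values instead of returning NA
    function(x) sum(endsWith(as.character(x), suffix), na.rm = TRUE),
    integer(1)
  )
}

DF <- read.table(text = "Col1 Col2
a 54_xxsgt
123_xxsgt e
d f
429_s_xxsgt g", header = TRUE, stringsAsFactors = FALSE)

counts <- count_suffix(DF)
print(counts)
# Col1 Col2
#    2    1
```

Because endsWith() compares literal text, characters such as "." or "_" in the suffix need no escaping.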