I need to subset my data like this:
a1 <- data[,grep("a_cool_[1-3]*", names(data))]
a2 <- data[,grep("word_an[1-3]*", names(data))]
a3 <- data[,grep("word_ne[1-3]*", names(data))]
a4 <- data[,grep("word_an[1-3]*", names(data))]
a5 <- data[,grep("word_sam[1-3]*", names(data))]
a6 <- data[,grep("word_snap[1-3]*", names(data))]
a7 <- data[,grep("word_app[1-3]*", names(data))]
I think other functions (for example, the `*apply()` family) could simplify this, but I'm not sure how.
Answer 0 (score: 0)
Try this:
#Create a dataset
thedata <- data.frame(matrix(rnorm(220),nrow = 20,ncol = 11))
varnames <- c('a_cool_3', 'word_an1', 'word_an2', 'word_ne3', 'word_an', 'word_sam3', 'word_snap1', 'word_app3', 'randomcol', 'anotherone','yetanother')
names(thedata) <- varnames
#Create the patterns you wish your column names to have (please note that | means 'OR' in regex)
patterns <- "a_cool_[1-3]*|word_an[1-3]*|word_ne[1-3]*|word_sam[1-3]*|word_snap[1-3]*|word_app[1-3]*"
#Use the grep function to grab the columns with those patterns
output_df <- thedata[,grep(pattern=patterns,names(thedata),perl=TRUE)]
#Only the columns matching one of the patterns are kept (word_an matches too, because [1-3]* allows zero digits)
head(output_df)
    a_cool_3   word_an1   word_an2    word_ne3    word_an  word_sam3    word_snap1  word_app3
1  1.8225436  0.7570277 -0.4114735 -0.87751389  0.2845020  1.2813361  0.5506499685 -1.3622255
2  0.0178158  0.5977225  2.5022158 -0.80579000 -0.2524916  1.0446857 -0.5382501876  0.8778370
3 -0.4222182  0.1785882 -0.9802086  0.71497031  0.2719002 -0.4319695  0.8670455296 -0.8917643
4 -0.1642998  1.7782387  0.6997389  0.06620839 -0.9951579 -0.1363725 -0.5289680333 -0.1564115
5  0.6785524  0.7319884  0.2843869 -2.25325312 -0.4032888  0.3661970  1.4291588013 -0.2203280
6 -0.7548342 -2.1009707 -2.0157028 -0.34596984 -0.6964674  0.2260157 -0.0001932224 -0.2866768
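If you want one data frame per pattern, as in the question, rather than a single combined data frame, here is a minimal sketch using `lapply()`. The `set.seed()` call and the list names `a1`, `a2`, ... are assumptions for illustration. `a4` is left out because its pattern in the question is identical to the one for `a2`.

```r
# Example data, mirroring the column names used above
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
thedata <- data.frame(matrix(rnorm(220), nrow = 20, ncol = 11))
names(thedata) <- c("a_cool_3", "word_an1", "word_an2", "word_ne3", "word_an",
                    "word_sam3", "word_snap1", "word_app3",
                    "randomcol", "anotherone", "yetanother")

# One pattern per desired subset; the names a1, a2, ... are illustrative
patterns <- c(a1 = "a_cool_[1-3]*",
              a2 = "word_an[1-3]*",
              a3 = "word_ne[1-3]*",
              a5 = "word_sam[1-3]*",
              a6 = "word_snap[1-3]*",
              a7 = "word_app[1-3]*")

# lapply() returns a named list with one data frame per pattern.
# drop = FALSE keeps a single matching column as a data frame, not a vector.
subsets <- lapply(patterns, function(p) {
  thedata[, grep(p, names(thedata)), drop = FALSE]
})

# Column names captured by the second pattern
names(subsets$a2)
```

Each subset is then available as `subsets$a1`, `subsets[["a2"]]`, and so on, which avoids creating a separate variable for every pattern.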
I hope this helps.