I have a data frame:
df = read.table(text="X1 X2 X3 X4 X5 X6 X7
C U C D B C C
D C B A C D U
D C B A C D D
C D U U B C D
C D B D C U C
D C C A B C D
U D C U U C C", header=T, stringsAsFactors=F)
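For each row, I want to concatenate the column values and, separately, the matching column names, leaving out every column whose value is "U". To find which rows and columns contain "U", I can use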
which(df == "U", arr.ind=TRUE)
The expected result is:
output = read.table(text="'X1 X3 X4 X5 X6 X7' 'C C D B C C'
'X1 X2 X3 X4 X5 X6' 'D C B A C D'
'X1 X2 X3 X4 X5 X6 X7' 'D C B A C D D'
'X1 X2 X5 X6 X7' 'C D B C D'
'X1 X2 X3 X4 X5 X7' 'C D B D C C'
'X1 X2 X3 X4 X5 X6 X7' 'D C C A B C D'
'X2 X3 X6 X7' 'D C C C'", header=F, stringsAsFactors=F)
I don't know how to get the expected result without using a loop. Thanks.
Answer 0 (score: 2)
A simpler option is apply with MARGIN = 1:
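t(apply(df, 1, function(x) {
  i1 <- x != "U"                               # TRUE for the entries to keep
  c(V1 = paste(names(x)[i1], collapse = " "),  # column names of the kept entries
    V2 = paste(x[i1], collapse = " "))         # the kept values themselves
}))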
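The apply call above returns a character matrix rather than a data frame. As a minimal sketch, assuming the df from the question, the result can be converted and spot-checked against the expected output. The names res and res_df are only illustrative.

```r
# Data from the question
df <- read.table(text = "X1 X2 X3 X4 X5 X6 X7
C U C D B C C
D C B A C D U
D C B A C D D
C D U U B C D
C D B D C U C
D C C A B C D
U D C U U C C", header = TRUE, stringsAsFactors = FALSE)

# Same logic as the answer: one row at a time, drop the "U" entries
res <- t(apply(df, 1, function(x) {
  i1 <- x != "U"
  c(V1 = paste(names(x)[i1], collapse = " "),
    V2 = paste(x[i1], collapse = " "))
}))

# apply() gives a character matrix; turn it into a two-column data frame
res_df <- as.data.frame(res, stringsAsFactors = FALSE)
rownames(res_df) <- NULL

res_df$V1[1]  # "X1 X3 X4 X5 X6 X7"
res_df$V2[7]  # "D C C C"
```

Note that apply converts the data frame to a matrix first, so this works cleanly here because every column is already character.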
To get only the values, another option is to paste the columns together row by row and then remove the "U" entries with gsub:
trimws(gsub("\\s*U", "", do.call(paste, df)))
Or, as @RHertel mentioned:
gsub("\\sU|U\\s","",do.call(paste,df))
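The gsub calls above return only the values, not the column names. If both are needed without apply, one possible sketch builds on the which(..., arr.ind = TRUE) idea from the question and groups the kept cells by row with tapply. This is an alternative technique, not part of the original answer, and the names keep, idx and result are only illustrative.

```r
# Data from the question
df <- read.table(text = "X1 X2 X3 X4 X5 X6 X7
C U C D B C C
D C B A C D U
D C B A C D D
C D U U B C D
C D B D C U C
D C C A B C D
U D C U U C C", header = TRUE, stringsAsFactors = FALSE)

m    <- as.matrix(df)
keep <- m != "U"                     # logical matrix: TRUE for cells to keep
idx  <- which(keep, arr.ind = TRUE)  # two columns, "row" and "col"

# which() walks the matrix column by column, so within each row the
# kept cells already appear in left-to-right column order.
V1 <- tapply(names(df)[idx[, "col"]], idx[, "row"], paste, collapse = " ")
V2 <- tapply(m[idx],                  idx[, "row"], paste, collapse = " ")

result <- data.frame(V1 = as.vector(V1), V2 = as.vector(V2),
                     stringsAsFactors = FALSE)
```

One caveat: a row consisting entirely of "U" has no kept cells, so it would be missing from result rather than appearing as an empty string.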
Answer 1 (score: 0)
Here is another approach that uses grepl to find the positions of the matching entries.
res <- t(apply(df, 1, function(x) {
  keep <- !grepl("U", x)   # TRUE for entries that do not contain "U"
  c(v1 = paste(names(x)[keep], collapse = " "),
    v2 = paste(x[keep], collapse = " "))
}))
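Note that grepl("U", x) matches any value that contains "U", not only the exact value "U". With the single-letter data in the question the two are equivalent, but they differ for longer strings. The small sketch below uses a made-up vector (the value "UP" is hypothetical and not in the original data) to show the difference.

```r
# Hypothetical values, not from the question's data
x <- c(X1 = "U", X2 = "UP", X3 = "C")

by_pattern  <- x[!grepl("U", x)]    # substring match: drops "U" and "UP"
by_equality <- x[x != "U"]          # exact comparison: drops only "U"
by_anchored <- x[!grepl("^U$", x)]  # anchored regex: same as exact comparison here

names(by_pattern)   # "X3"
names(by_equality)  # "X2" "X3"
```

If the data may contain longer values, the exact comparison or the anchored pattern is probably the safer choice.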