I have the following dataset:
path value
1 b,b,a,c 3
2 c,b 2
3 a 10
4 b,c,a,b 0
5 e,f 0
6 a,f 1
df <- data.frame (path= c("b,b,a,c", "c,b", "a", "b,c,a,b" ,"e,f" ,"a,f"), value = c(3,2,10,0,0,1))
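I want to remove the duplicated entries within each value of the path column. When I use this code, the format of the data changes: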
df$path <- sapply(strsplit(as.character(df$path), split=","),
function(x) unique(x))
It gives me a data frame like this:
path value
1 c("b", "a", "c") 3
2 c("c", "b") 2
...
However, I would like the data to look like this:
path value
1 b, a, c 3
2 c, b 2
3 a 10
4 b, c, a 0
5 e, f 0
6 a, f 1
Answer 0 (score: 1)
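Replace unique(x) with paste(unique(x), collapse = ', ') or toString(unique(x)), as Frank suggested. The unique vectors have different lengths, so sapply cannot simplify them and returns a list; collapsing each vector into a single string gives a plain character column instead.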
df <- data.frame (
path= c("b,b,a,c", "c,b", "a", "b,c,a,b" ,"e,f" ,"a,f"),
value = c(3,2,10,0,0,1))
df$path <- sapply(strsplit(as.character(df$path), split=","),
function(x) paste(unique(x), collapse = ', '))
# or, as an alternative to the call above (run one or the other, not both in sequence)
df$path <- sapply(strsplit(as.character(df$path), split=","),
function(x) toString(unique(x)))
df
# path value
# 1 b, a, c 3
# 2 c, b 2
# 3 a 10
# 4 b, c, a 0
# 5 e, f 0
# 6 a, f 1
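For illustration, here is a minimal sketch of the same idea that also shows why the original attempt produced a list column. It assumes the data from the question; dedupe_path is a hypothetical helper name, and the use of trimws and vapply is an addition that is not part of the answer above.

```r
# Same data as in the question.
df <- data.frame(
  path  = c("b,b,a,c", "c,b", "a", "b,c,a,b", "e,f", "a,f"),
  value = c(3, 2, 10, 0, 0, 1),
  stringsAsFactors = FALSE
)

parts <- strsplit(as.character(df$path), split = ",")

# With unique() alone the results have different lengths, so sapply()
# cannot simplify them into a vector and returns a list.
as_list <- sapply(parts, unique)
is.list(as_list)

# dedupe_path is a hypothetical helper: trim whitespace around each element,
# drop repeats, then collapse back into one comma-separated string.
dedupe_path <- function(x) toString(unique(trimws(x)))

# vapply() checks that every result is a single character string.
df$path <- vapply(parts, dedupe_path, character(1))
df
```

Trimming with trimws should make the step safe to rerun on its own output, since splitting "b, a, c" on "," would otherwise leave leading spaces on the later elements.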