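I have a matrix with 500 rows and 1000 columns. Each cell holds several comma-separated elements (three per cell in the sample below), and I need to remove the commas, that is, split each cell into its separate values.
The data looks like this: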
1 2 3 4 ... 1000
1 12,1,20 14,15,12 10,10,20 1,0,10 ... 1,5,3
2 12,1,20 14,15,12 10,10,20 1,0,10 ... 1,5,3
3 12,1,20 14,15,12 10,10,20 1,0,10 ... 1,5,3
.
.
500 12,1,20 14,15,12 10,10,20 1,0,10 ... 1,5,3
My code is:
mat=matrix(data=NA, nrow=257, ncol=3)
n=1000
k=500
for(i in 1:n){
mat[i]<-colsplit(as.character(data[,i]), "," , c("a","b","c"))
}
It does not work; something is missing in my loop. Can anyone help me fix it? Thanks.
Answer 0 (score: 2)
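If you want to split every column using "," as the delimiter, you can use cSplit from the splitstackshape package and then convert the resulting columns to numeric: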
library(data.table)
library(splitstackshape)
df1 <- cSplit(df, 1:ncol(df), sep=",")[,lapply(.SD, as.numeric)]
df1
# X1_1 X1_2 X1_3 X2_1 X2_2 X2_3 X3_1 X3_2 X3_3 X4_1 X4_2 X4_3
#1: 12 1 20 14 15 12 10 10 20 1 0 10
#2: 12 1 20 14 15 12 10 10 20 1 0 10
#3: 12 1 20 14 15 12 10 10 20 1 0 10
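Or use cSplit_f, which is faster for rectangular data, where every cell splits into the same number of pieces (based on a comment from the author of the splitstackshape package, @Ananda Mahto):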
cSplit_f(df, 1:ncol(df), sep=",")[,lapply(.SD, as.numeric)]
str(df1)
# Classes ‘data.table’ and 'data.frame': 3 obs. of 12 variables:
# $ X1_1: num 12 12 12
# $ X1_2: num 1 1 1
# $ X1_3: num 20 20 20
# $ X2_1: num 14 14 14
# $ X2_2: num 15 15 15
# $ X2_3: num 12 12 12
# $ X3_1: num 10 10 10
# $ X3_2: num 10 10 10
# $ X3_3: num 20 20 20
# $ X4_1: num 1 1 1
# $ X4_2: num 0 0 0
# $ X4_3: num 10 10 10
Data used in the examples above:
df <- structure(list(X1 = c("12,1,20", "12,1,20", "12,1,20"), X2 = c("14,15,12",
"14,15,12", "14,15,12"), X3 = c("10,10,20", "10,10,20", "10,10,20"
), X4 = c("1,0,10", "1,0,10", "1,0,10")), .Names = c("X1", "X2",
"X3", "X4"), class = "data.frame", row.names = c("1", "2", "3"
))
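
For comparison, here is a minimal base R sketch of a loop in the spirit of the one in the question. The original loop most likely fails because mat is preallocated as 257 x 3 while the loop runs over 1000 columns, and because mat[i] addresses a single matrix element whereas colsplit returns a whole data frame. The sketch below assumes every cell contains exactly three comma-separated numbers; n_parts and the output column names are illustrative choices, not part of the original answer.

```r
# Same sample data as in the answer (3 rows, 4 columns of "a,b,c" strings).
df <- data.frame(
  X1 = c("12,1,20", "12,1,20", "12,1,20"),
  X2 = c("14,15,12", "14,15,12", "14,15,12"),
  X3 = c("10,10,20", "10,10,20", "10,10,20"),
  X4 = c("1,0,10", "1,0,10", "1,0,10"),
  stringsAsFactors = FALSE
)

n_parts <- 3  # assumption: every cell has exactly 3 comma-separated values

# Preallocate the result with the right shape: one row per input row,
# n_parts output columns per input column.
mat <- matrix(NA_real_, nrow = nrow(df), ncol = ncol(df) * n_parts)

for (i in seq_len(ncol(df))) {
  # strsplit() returns one character vector per row; rbind them into an
  # nrow(df) x n_parts character matrix.
  parts <- do.call(rbind, strsplit(as.character(df[[i]]), ",", fixed = TRUE))
  # Fill the block of output columns that belongs to input column i.
  cols <- ((i - 1) * n_parts + 1):(i * n_parts)
  mat[, cols] <- as.numeric(parts)
}

# Name the columns in the same style as the cSplit output (X1_1, X1_2, ...).
colnames(mat) <- paste0(rep(names(df), each = n_parts), "_", seq_len(n_parts))
mat
```

The result is a plain numeric matrix rather than a data.table. For the full 500 x 1000 input the same loop should apply unchanged, as long as the assumption of three values per cell holds; if the number of values varies between cells, the rbind step no longer gives a clean matrix and cSplit is the safer choice.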