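I want to encode a set of categorical values as binary values. First, I used intToBin(x$y)
Now I want to split the resulting binary strings into separate columns, from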
0101
0100
0110
0101
0101
0100
to
0 1 0 1
0 1 0 0
0 1 1 0
0 1 0 1
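and so on, and I also want the result converted to numeric values. It should scale to more strings.
I used separate(x$y, sep = l)
to do the conversion, but I got an error. Please help me correct the code or suggest an alternative. The purpose of converting the values to binary is to build a model with XGBoost.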
Answer 0: (score: 1)
Here is one way:
d=c("0101","0111","0011","1101")
# Split into columns
d2=do.call(rbind, strsplit(as.character(d), split="")) #see elmo's comments
# Make numeric and transform to dataframe (instead of matrix)
d2=as.data.frame(apply(d2,2, function(x) as.numeric(as.character(x))))
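To reuse the approach above on more strings, it can be wrapped in a small function. This is only a sketch: the name bits_to_matrix and the bit1, bit2, ... column names are made up for illustration, and it assumes every string has the same number of digits. It returns a numeric matrix rather than a data frame, which may be more convenient if the result is later passed to a modelling function that expects a matrix.

```r
# Hypothetical helper (not from the original answer): split equal-length
# bit strings into a numeric matrix with one column per digit.
bits_to_matrix <- function(bits) {
  bits <- as.character(bits)
  # Assumption: all strings have the same width
  stopifnot(length(unique(nchar(bits))) == 1)
  m <- do.call(rbind, strsplit(bits, split = ""))
  storage.mode(m) <- "numeric"  # turn "0"/"1" characters into numbers
  colnames(m) <- paste0("bit", seq_len(ncol(m)))
  m
}

d <- c("0101", "0100", "0110", "0101", "0101", "0100")
m <- bits_to_matrix(d)
```

If a data frame is preferred, as.data.frame(m) keeps the columns numeric.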
Answer 1: (score: 1)
a = c("0101","0100","0110","0101","0101","0100")
data.frame(t(matrix(unlist(strsplit(a,"")),nrow = 4)))
OR
data.frame(t(sapply(a, function(x) unlist(strsplit(x,"")))))
#You may get a warning about identical row names
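Note that both variants above leave the digits as character (or factor) columns, whereas the question asks for numeric values. A minimal sketch of one way to convert every column afterwards, assuming the cells contain only "0" and "1":

```r
a <- c("0101", "0100", "0110", "0101", "0101", "0100")
df <- data.frame(t(matrix(unlist(strsplit(a, "")), nrow = 4)),
                 stringsAsFactors = FALSE)

# as.character() first, so the conversion is also safe if the columns
# happen to be factors (the default in older versions of R)
df[] <- lapply(df, function(col) as.numeric(as.character(col)))
```

Assigning to df[] replaces the contents of the columns while keeping the data frame structure and the column names.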
Or, if you want something that still works when the elements of a
do not all have the same number of digits,
a = c("01101","0100","0110","0101","0101","0100") #Note 1st element has 5 digits
b = sapply(a, function(x) unlist(strsplit(x,"")))
data.frame(t(sapply(b, '[', seq(max(sapply(b,length))))))
#You may get a warning about identical row names
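The version above fills the shorter strings with NA on the right. If the strings are binary numbers, an alternative is to left-pad them with zeros first, which keeps the bit positions aligned and avoids NA values. This is a sketch under that assumption, not part of the original answer:

```r
a <- c("01101", "0100", "0110", "0101", "0101", "0100")

# Assumption: leading zeros do not change the meaning of a code
width <- max(nchar(a))
padded <- paste0(strrep("0", width - nchar(a)), a)

# All strings now have the same width, so they can be split as before
m <- do.call(rbind, strsplit(padded, ""))
storage.mode(m) <- "numeric"
df <- as.data.frame(m)
```

Whether NA-padding or zero-padding is appropriate depends on how the codes were generated, so check which one matches your data.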