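I have a dataset with 100 columns (named Col_1, Col_2 ... Col_100) whose values look like "A", "B", "C", ... I don't know how many distinct characters occur across the whole dataset. I'm trying to turn each distinct value into a column, so that I end up with a matrix like this: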
A B C D
0 1 0 1
1 1 0 1
This is what I'm trying:
library(reshape2)
train <- read.csv("train.csv", header=TRUE, sep=",")
train
recast(train, id ~ value, id.var = 1, fun.aggregate = function(x) (length(x) > 0) + 0L)
But I get the following error:
Error in eval(substitute(expr), envir, enclos) :
n must be a positive integer
In addition: Warning messages:
1: attributes are not identical across measure variables; they will be dropped
2: In split_indices(.group, .n) :
NAs introduced by coercion to integer range
What can I do to get the table I want?
Answer 0: (score: 0)
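Maybe this is what you're looking for. The first step collects all the possible values. The second step makes every variable aware of those potential values by converting it to a factor with the full set of levels. This lets table produce a 0 count when a particular value is missing from a column, so that rbind can build the correct output.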
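# collect all possible values; as.character() keeps this working
# whether the columns are factors or plain character vectors
allLevels <- sort(unique(unlist(lapply(df, as.character))))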
# provide all levels to each variable in the data.frame
dfNew <- data.frame(lapply(df, function(i) factor(i, levels=allLevels)))
# produce the count for each variable
do.call(rbind, lapply(dfNew, table))
  a b c d e g i j
x 3 2 8 2 0 0 0 0
y 0 0 2 4 4 1 3 1
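The result above holds counts, while the question asks for 0/1 entries. A minimal sketch of that last step follows, using made-up vectors `x` and `y` (illustration data only, not the `df` from this answer) so the zero counts are easy to see:

```r
# Hypothetical toy data (not the df used in the answer)
x <- c("a", "b", "a")
y <- c("b", "c", "c")

# Shared set of levels across both vectors
allLevels <- sort(unique(c(x, y)))

# table() on a factor reports every level, so absent values show up as 0
counts <- rbind(x = table(factor(x, levels = allLevels)),
                y = table(factor(y, levels = allLevels)))

# Convert counts into presence/absence indicators (integer 0/1)
indicator <- (counts > 0) + 0L
indicator
```

The same `(counts > 0) + 0L` step should apply unchanged to the matrix returned by `do.call(rbind, ...)`.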
Data
set.seed(1234)
df <- data.frame(x=sample(letters[1:4], 15, replace=TRUE),
y=sample(letters[3:10], 15, replace=TRUE))
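If what is wanted is instead one row per observation (which the `id ~ value` formula in the question seems to suggest, though that is an assumption), a rough sketch along the same lines could look like this. The `train` data frame here is a small hypothetical stand-in for the contents of train.csv, which are not shown in the question:

```r
# Hypothetical stand-in for train.csv (3 rows, 2 columns)
train <- data.frame(Col_1 = c("A", "B", "D"),
                    Col_2 = c("B", "B", "A"),
                    stringsAsFactors = FALSE)

# All distinct values found anywhere in the data
allLevels <- sort(unique(unlist(lapply(train, as.character))))

# For each row, flag which of the distinct values occur in it;
# apply() returns one column per row, hence the transpose
rowIndicator <- t(apply(train, 1, function(r) as.integer(allLevels %in% r)))
colnames(rowIndicator) <- allLevels
rowIndicator
```

Note that `apply()` converts the data frame to a character matrix first, so this sketch assumes all columns hold the same kind of values.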