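I have the following for loop, which creates a dummy variable for each level of the Code variable (Letters in the example below). I would like to write it as a function so that I can use it with an apply function: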
for(level in data$Letters){
data[paste(level, sep="")] <- ifelse(data$Letters == level, 1, 0)
}
Here is an example of what my data looks like (the original data frame is much larger):
Letters <- c("A","B","C")
Numbers <- c(1,0,1)
Numbers <- as.integer(Numbers)
data <- data.frame(Letters,Numbers)
This is what I want:
Result <- matrix(c(1,0,0,
0,1,0,
0,0,1),3,3)
Final <- cbind(data,Result)
Is there a way to rewrite the for loop as a function?
Answer 0 (score: 2)
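You can use outer. Note that this relies on Letters being a factor, which data.frame() produced by default before R 4.0.0; in later versions, pass stringsAsFactors = TRUE or wrap the column in factor() first: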
with(data, outer(Letters, levels(Letters), "=="))*1
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 0 1 0
# [3,] 0 0 1
... and to cbind it nicely with the original data frame, you could do the following:
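# stringsAsFactors = TRUE keeps Letters a factor in R >= 4.0.0, so levels() works
df <- data.frame(Letters, Numbers, stringsAsFactors = TRUE)
# better to avoid using `data` as a name for a data frame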
df2 <- with(df, outer(Letters, levels(Letters), "=="))*1
cbind(df, setNames(as.data.frame(df2), levels(df$Letters)))
# Letters Numbers A B C
# 1 A 1 1 0 0
# 2 B 0 0 1 0
# 3 C 1 0 0 1
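Since the question asks for a function, the outer approach above could be wrapped up roughly as follows. This is only a minimal sketch: the name add_dummies and its two arguments are hypothetical, not part of base R, and the column is passed through factor() so that the helper also works when the column is a plain character vector.

```r
# Hypothetical helper (not part of base R): append one 0/1 column per level
# of the column named `column` to the data frame `df`, using outer().
add_dummies <- function(df, column) {
  x  <- as.character(df[[column]])     # compare as plain strings
  lv <- levels(factor(x))              # sorted unique values, as levels() would give
  m  <- outer(x, lv, "==") * 1         # logical matrix -> 0/1 matrix
  cbind(df, setNames(as.data.frame(m), lv))
}

df  <- data.frame(Letters = c("A", "B", "C"), Numbers = c(1L, 0L, 1L))
res <- add_dummies(df, "Letters")
```

Taking the column name as a string keeps the helper independent of any particular data frame, so it can be reused on other columns.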
Alternatively, you can use sapply:
sapply(levels(df$Letters), function(x) df$Letters==x)*1
# notice that the result is a matrix rather than a data frame
# but it is still safe to cbind it to a data frame:
cbind(df, sapply(levels(df$Letters), function(x) df$Letters==x)*1)
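lapply can also be used. In this case, however, sapply labels the columns automatically (its USE.NAMES = TRUE default names the result after a character input), whereas lapply does not, so you have to do it manually with setNames, for example: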
as.data.frame(lapply((function(.) setNames(.,.)) (levels(df$Letters)), function(x) (df$Letters==x)*1))
... or step by step:
N <- levels(df$Letters)
N <- setNames(N,N)
out <- lapply(N, "==", df$Letters)
out <- as.data.frame(out)*1
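The difference in labelling between sapply and lapply can be checked directly. The snippet below is a small sketch on stand-in vectors (lv and x are illustrative names, assumed here in place of levels(df$Letters) and df$Letters):

```r
lv <- c("A", "B", "C")   # stands in for levels(df$Letters)
x  <- c("A", "B", "C")   # stands in for df$Letters

# sapply() defaults to USE.NAMES = TRUE: for an unnamed character input,
# the input values themselves become the names, hence the column labels.
s <- sapply(lv, function(l) (x == l) * 1)
colnames(s)

# lapply() only carries over names that the input already has.
l_unnamed <- lapply(lv, function(l) (x == l) * 1)
l_named   <- lapply(setNames(lv, lv), function(l) (x == l) * 1)
names(l_unnamed)   # NULL, so as.data.frame() would have to invent column names
names(l_named)
```

This is why the lapply variants above first run the levels through setNames.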