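Replacing a for loop with a function

Time: 2018-10-03 09:15:07

Tags: r function for-loop apply

I have the following for loop, which creates a dummy variable for each level of the Code variable (Letters in the example below). I would like to write it as a function so that I can use it with an apply function: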

for(level in data$Letters){
  data[paste(level, sep="")] <- ifelse(data$Letters == level, 1, 0)
}

Here is an example of what my data looks like (the original data frame is much larger):

Letters <- c("A","B","C")
Numbers <- c(1,0,1)
Numbers <- as.integer(Numbers)

data <- data.frame(Letters,Numbers)

This is what I want:

Result <- matrix(c(1,0,0,
                   0,1,0,
                   0,0,1),3,3)
Final <- cbind(data,Result)

Is there a way to rewrite the for loop as a function?

1 answer:

Answer 0 (score: 2)

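You can use outer. This relies on Letters being a factor, which was the data.frame default before R 4.0.0; on newer versions, create the data frame with stringsAsFactors = TRUE, otherwise levels(Letters) is NULL: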

with(data, outer(Letters, levels(Letters), "=="))*1
#        [,1] [,2] [,3]
#  [1,]    1    0    0
#  [2,]    0    1    0
#  [3,]    0    0    1

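...and then, to cbind it neatly with the original data frame, you can do the following:

# stringsAsFactors = TRUE keeps Letters a factor on R >= 4.0.0, so levels() works
df <- data.frame(Letters, Numbers, stringsAsFactors = TRUE)
# better to avoid using `data` as a name for a data frame
df2 <- with(df, outer(Letters, levels(Letters), "=="))*1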
cbind(df, setNames(as.data.frame(df2), levels(df$Letters)))
#   Letters Numbers A B C
# 1       A       1 1 0 0
# 2       B       0 0 1 0
# 3       C       1 0 0 1
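
As a sanity check, the following minimal sketch compares the loop from the question with the outer approach on the three-row example. The names loop_df, outer_df and same are illustrative only, and the loop here iterates over the levels rather than over every row value, which produces the same columns.

```r
# Self-contained check; loop_df, outer_df and same are illustrative names.
df <- data.frame(Letters = c("A", "B", "C"),
                 Numbers = c(1L, 0L, 1L),
                 stringsAsFactors = TRUE)  # factor column on any R version
lv <- levels(df$Letters)

# Loop version: one 0/1 column per level
loop_df <- df
for (level in lv) {
  loop_df[[level]] <- ifelse(loop_df$Letters == level, 1, 0)
}

# outer version: compare every row with every level in a single call
m <- outer(df$Letters, lv, "==") * 1
outer_df <- cbind(df, setNames(as.data.frame(m), lv))

# TRUE when both approaches produce the same dummy columns
same <- all(as.matrix(loop_df[lv]) == as.matrix(outer_df[lv]))
same
```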

Alternatively, you can use sapply:

sapply(levels(df$Letters), function(x) df$Letters==x)*1
# notice that the result is a matrix rather than a data frame
# but it is still safe to cbind it to a data frame:
cbind(df, sapply(levels(df$Letters), function(x) df$Letters==x)*1)

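lapply can be used as well, but sapply names the columns automatically here (its USE.NAMES default turns a character input into the names of the result), whereas lapply does not. With lapply you have to set the names yourself using setNames, for example: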

as.data.frame(lapply((function(.) setNames(.,.)) (levels(df$Letters)), function(x) (df$Letters==x)*1))
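
The naming difference can be seen in isolation with a small sketch. It uses nchar as a stand-in function and a plain character vector lv; both are chosen here purely for illustration.

```r
lv <- c("A", "B", "C")

# sapply: USE.NAMES = TRUE uses a character input as the names of the result
s_names <- names(sapply(lv, nchar))

# lapply: only keeps names the input already has, so here there are none
l_names <- names(lapply(lv, nchar))

# naming the input first carries the names through lapply
ln_names <- names(lapply(setNames(lv, lv), nchar))

s_names
l_names
ln_names
```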

The same lapply approach, step by step:

N <- levels(df$Letters)
N <- setNames(N,N)
out <- lapply(N, "==", df$Letters)
out <- as.data.frame(out)*1
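
Since the question asks for a function, any of the approaches above could be wrapped in one. The sketch below is one possible way to do it, not part of the original answer; make_dummies is a hypothetical name, and it converts its input with as.factor so that it also accepts character columns.

```r
# Hypothetical helper (not from the original answer):
# returns a matrix with one 0/1 column per level of x.
make_dummies <- function(x) {
  x <- as.factor(x)                 # also accepts character vectors
  lv <- levels(x)
  out <- sapply(lv, function(l) as.integer(x == l))
  # sapply simplifies to a plain vector when x has length 1,
  # so restore the matrix shape and column names explicitly
  matrix(out, nrow = length(x), dimnames = list(NULL, lv))
}

df <- data.frame(Letters = c("A", "B", "C"), Numbers = c(1L, 0L, 1L))
res <- cbind(df, make_dummies(df$Letters))
res
```

Under these assumptions, res holds the original columns followed by one indicator column per level, and the helper can be reused on any column.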