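In R, a vector cannot contain different types. Everything must be, e.g., an integer, or everything must be character, and so on. This sometimes gives me headaches, for example when I want to add a margin row to a data.frame and need some columns to be numeric and others to be character.
Below is a reproducible example: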
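As a minimal illustration of that coercion rule (the values here are made up for the sketch and are not part of the data below): `c()` silently converts everything to the most general type present, while a list keeps each element's own type.

```r
# Atomic vectors hold a single type; c() coerces to the most general one
class(c(TRUE, 2L))      # "integer"
class(c(2L, 2.5))       # "numeric"
class(c(2.5, "All"))    # "character"

# A list keeps each element's own type
mixed <- list("All", 37.7, TRUE)
sapply(mixed, class)    # "character" "numeric" "logical"
```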
# dummy data.frame
set.seed(42)
test <- data.frame("name"=sample(letters[1:4], 10, replace=TRUE),
"val1" = runif(10,2,5),
"val2"=rnorm(10,10,5),
"Status"=sample(c("In progress", "Done"), 10, replace=TRUE),
stringsAsFactors = FALSE)
# check that e.g. "val1" is indeed numeric
is.numeric(test$val1)
# TRUE
# create column sums for my margin.
tmpSums <- colSums(test[,c(2:3)])
# Are the sums numeric?
is.numeric(tmpSums[1])
#TRUE
# So add the margin
test2 <- rbind(test, c("All", tmpSums, "Mixed"))
# is it numeric
is.numeric(test2$val1)
#FALSE
# DAMN. Because the vector `c("All", tmpSums, "Mixed")` contains strings,
# the whole vector is coerced to character. And when doing the rbind,
# the columns of the original data.frame are coerced to character as well.
# My current workaround is to convert back to numeric,
# but this seems convoluted, going back and forth.
valColoumns <- grepl("val", names(test2))
test2[,valColoumns] <- apply(test2[,valColoumns],2, function(x) as.numeric(x))
is.numeric(test2$val1)
# finally. It works.
Surely there must be a simpler/better way?
Answer 0 (score: 4)
Use a list object in rbind, for example:
test2 <- rbind(test, c("All", unname(as.list(tmpSums)), "Mixed"))
The second argument to rbind is a list, with the conflicting names that would cause rbind to fail removed:
c("All", unname(as.list(tmpSums)), "Mixed")
#[[1]]
#[1] "All"
#
#[[2]]
#[1] 37.70092
#
#[[3]]
#[1] 91.82716
#
#[[4]]
#[1] "Mixed"
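To see the difference in isolation, here is a small sketch on a made-up two-row data.frame (`df`, `row_vec`, `row_list` and `row_df` are hypothetical names, not from the question). It compares appending an atomic vector, an unnamed list, and a one-row data.frame; the last one is an alternative that is not part of the answer above.

```r
df <- data.frame(name = c("a", "b"),
                 val1 = c(1.5, 2.5),
                 Status = c("Done", "Done"),
                 stringsAsFactors = FALSE)

# Atomic vector: the sum is coerced to character before rbind sees it
row_vec <- c("All", sum(df$val1), "Mixed")
bad <- rbind(df, row_vec)
is.numeric(bad$val1)     # FALSE

# Unnamed list: each element keeps its type and is matched by position
row_list <- list("All", sum(df$val1), "Mixed")
good <- rbind(df, row_list)
is.numeric(good$val1)    # TRUE

# One-row data.frame: columns are matched by name
row_df <- data.frame(name = "All", val1 = sum(df$val1), Status = "Mixed",
                     stringsAsFactors = FALSE)
good2 <- rbind(df, row_df)
is.numeric(good2$val1)   # TRUE
```

The list form is the shortest; the one-row data.frame is more verbose but does not depend on column order.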
Answer 1 (score: 1)
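Here is an option using data.table. We convert the 'data.frame' to a 'data.table' (setDT(test)), get the sum of the numeric columns with lapply, concatenate (c) the result with the values that should represent the other columns, place it in a list, and use rbindlist.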
library(data.table)
rAll <- setDT(test)[, c(name="All", lapply(.SD, sum),
Status="Mixed"), .SDcols= val1:val2]
rbindlist(list(test, rAll))
If we need to make it a bit more automatic,
i1 <- sapply(test, is.numeric)
v1 <- setNames(list("All", "Mixed"), setdiff(names(test),
names(test)[i1]))
rAll <- setDT(test)[, c(v1, lapply(.SD, sum)),
.SDcols=i1][, names(test), with=FALSE]
rbindlist(list(test, rAll))
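A quick way to confirm that the column types survive is to inspect the classes of the result. The sketch below uses a small made-up table (`dt`, `num_cols`, `totals` and `res` are hypothetical names) and a slightly different formulation from the one above: it adds the label columns with `:=` and lets rbindlist match columns by name through use.names = TRUE.

```r
library(data.table)

dt <- data.table(name = c("a", "b"),
                 val1 = c(1.5, 2.5),
                 val2 = c(10, 20),
                 Status = c("Done", "Done"))

# Names of the numeric columns to be summed
num_cols <- names(dt)[sapply(dt, is.numeric)]

# One-row table of sums, then add the label columns by reference
totals <- dt[, lapply(.SD, sum), .SDcols = num_cols]
totals[, `:=`(name = "All", Status = "Mixed")]

# use.names = TRUE binds by column name, so column order does not matter
res <- rbindlist(list(dt, totals), use.names = TRUE)
sapply(res, class)
```

The numeric columns should still be reported as numeric and the label columns as character.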