How can I repeat rows in R according to the counts given in the corresponding columns (considering multiple columns)?
data <- data.frame(
city=c("A","B","C","D","E","F","G"),
score=c(83,94,1,21,2,3,0),
J=c(2,0,1,0,3,0,0),
K=c(0,2,0,3,0,1,0),
L=c(1,1,0,4,0,0,0))
data
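Original data frame:
Required data frame:
All of the count columns should be taken into account. P.S. City D is repeated 4 times: 3 of those rows have a 1 in column K, and all 4 rows have a 1 in column L.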
Answer 0 (score: 4)
Another data.table solution:
library(data.table)
setDT(data)
data[, lapply(.SD, function(x){
g <- pmax(max(unlist(.SD)), 1)
rep(1:0, c(x, g - x)) }), by = .(city, score)]
# city score number number2 number3
# 1: A 83 1 0 1
# 2: A 83 1 0 0
# 3: B 94 0 1 1
# 4: B 94 0 1 0
# 5: C 1 1 0 0
# 6: D 21 0 1 1
# 7: D 21 0 1 1
# 8: D 21 0 1 1
# 9: D 21 0 0 1
# 10: E 2 1 0 0
# 11: E 2 1 0 0
# 12: E 2 1 0 0
# 13: F 3 0 1 0
# 14: G 0 0 0 0
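Rows in which all counts are zero are handled correctly: they are kept as a single all-zero row. If you do not want such rows, replace g <- pmax(max(unlist(.SD)), 1) with g <- max(unlist(.SD)):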
data[, lapply(.SD, function(x){
g <- max(unlist(.SD))
rep(1:0, c(x, g - x)) }), by = .(city, score)]
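To see what the anonymous function does inside a single group, here is a minimal base R sketch for city D from the sample data. The names counts and expanded are only illustrative and do not appear in the answer above.

```r
# Counts for city D in the sample data: J = 0, K = 3, L = 4
counts <- c(J = 0, K = 3, L = 4)

# Number of output rows for this city: the largest count, but at least 1
g <- pmax(max(counts), 1)

# For each column: x ones followed by (g - x) zeros
expanded <- lapply(counts, function(x) rep(1:0, c(x, g - x)))
as.data.frame(expanded)

# With an all-zero row, pmax(..., 1) keeps one row of zeros,
# while a plain max() gives g = 0 and therefore no rows at all
rep(1:0, c(0, 1))  # one zero
rep(1:0, c(0, 0))  # empty vector
```

Every column ends up with length g, which is what lets data.table bind the per-column results into g rows for the group.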
Answer 1 (score: 2)
A data.table solution:
Data: (make sure there are no factors, i.e. use stringsAsFactors = F)
data <- data.frame(
city=c("A","B","C","D","E","F","G"),
score=c(83,94,1,21,2,3,0),
number=c(2,0,1,0,3,0,0),
number2=c(0,2,0,3,0,1,0),
number3=c(1,1,0,4,0,0,0),stringsAsFactors = F)
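Code: (let's have a function fun1 do the work for us)
library(data.table) # attach the package so that transpose() is found inside fun1
setDT(data)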
fun1 <- function(x) {
transpose(
transpose(
lapply(x, function(u) if(u != 0) rep(1,u) else 0), fill = 0
)
)
}
data[, structure(fun1(.SD), .Names = names(.SD)), by = c("city","score")]
Result:
# city score number number2 number3
#1: A 83 1 0 1
#2: A 83 1 0 0
#3: B 94 0 1 1
#4: B 94 0 1 0
#5: C 1 1 0 0
#6: D 21 0 1 1
#7: D 21 0 1 1
#8: D 21 0 1 1
#9: D 21 0 0 1
#10: E 2 1 0 0
#11: E 2 1 0 0
#12: E 2 1 0 0
#13: F 3 0 1 0
#14: G 0 0 0 0
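The double transpose() in fun1 is what pads the shorter vectors with zeros. A small sketch of that step on its own, using hypothetical per-column vectors for city D (the names cols, rows and padded are illustrative, not part of the answer):

```r
library(data.table)

# Per-column vectors for city D (counts 0, 3 and 4), in the shape the
# inner lapply() in fun1 would produce them
cols <- list(0, rep(1, 3), rep(1, 4))

# First transpose: one list element per output row; shorter vectors are
# padded with the fill value 0
rows <- transpose(cols, fill = 0)
length(rows)

# Second transpose: back to one element per column, now all equally long
padded <- transpose(rows)
lengths(padded)
padded[[2]]
```

After the second transpose each column vector has the length of the longest one, so the group can be returned as a rectangular block of rows.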
Answer 2 (score: 1)
Note that, given the sample data you provided, the expected output contains some errors (see @markus's comment).
Here is a tidyverse option using splitstackshape::cSplit:
library(splitstackshape)
library(tidyverse)
data %>%
rowwise() %>%
mutate_at(vars(starts_with("number")), funs(toString(rep(1, .)))) %>%
group_by(city) %>%
cSplit(grep("^number", names(data), value = T), direction = "long") %>%
filter_at(vars(starts_with("number")), any_vars(!is.na(.))) %>%
replace(., is.na(.), 0)
# city score number number2 number3
#1 A 83 1 0 1
#2 A 83 1 0 0
#3 B 94 0 1 1
#4 B 94 0 1 0
#5 C 1 1 0 0
#6 D 21 0 1 1
#7 D 21 0 1 1
#8 D 21 0 1 1
#9 D 21 0 0 1
#10 E 2 1 0 0
#11 E 2 1 0 0
#12 E 2 1 0 0
#13 F 3 0 1 0
Explanation: the idea is to replace each number entry with a vector of 1s whose length equals the entry's value, and then to convert that vector into a comma-separated character string with toString. We then use splitstackshape::cSplit to split these entries into multiple rows, remove all rows in which every number column is NA, and replace the remaining NAs with 0s.
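The toString step can be tried on its own in base R. In this sketch strsplit() is only a stand-in to show the pieces that a long-direction split would put on separate rows; it is not a claim about how cSplit works internally, and the names three, none and pieces are illustrative.

```r
# A count of 3 becomes a comma-separated string of three 1s
three <- toString(rep(1, 3))
three  # "1, 1, 1"

# A count of 0 becomes an empty string, which has nothing to split
none <- toString(rep(1, 0))
none   # ""

# Splitting the string recovers one piece per future row
pieces <- strsplit(three, ", ")[[1]]
length(pieces)  # 3
```

A city whose counts are all zero has only empty strings, which fits the result above, where city G does not appear.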