I have row-based migration data.
param <- c("A", "B", "C")
df <- data.frame(Case1 = c("A", "A", "B", "B"),
Case2 = c("A", "C", "A", "B"),
Val = c(0.5,0.4,0.3,0.7))
So the data frame looks like this:
Case1 Case2 Val
1 A A 0.5
2 A C 0.4
3 B A 0.3
4 B B 0.7
This row-based data frame should be converted into a kind of "migration matrix":
dd <- data.frame(cA = c(0.5, 0.3, 0),
cB = c(0, 0.7, 0),
cC = c(0.4,0,0))
rownames(dd) <- paste0("Case1","_", param)
colnames(dd) <- paste0("Case2","_", param)
So the migration matrix looks like this:
Case2_A Case2_B Case2_C
Case1_A 0.5 0.0 0.4
Case1_B 0.3 0.7 0.0
Case1_C 0.0 0.0 0.0
Does anyone know a good way to do this in R? Thank you very much!
Answer 0 (score: 1)
You can use dplyr and tidyr:
library(dplyr); library(tidyr)
df %>%
complete(Case1 = LETTERS[1:3], Case2 = LETTERS[1:3]) %>%
# across() replaces the deprecated mutate_at()/funs() pair (dplyr >= 1.0.0)
mutate(across(starts_with("Case"), ~ paste("Case", .x, sep = "_"))) %>%
spread(Case2, Val, fill = 0.0)
# Source: local data frame [3 x 4]
# Case1 Case_A Case_B Case_C
# <chr> <dbl> <dbl> <dbl>
#1 Case_A 0.5 0.0 0.4
#2 Case_B 0.3 0.7 0.0
#3 Case_C 0.0 0.0 0.0
Or, if you specifically want to keep the column numbers (Case1, Case2) in the names:
df %>%
complete(Case1 = LETTERS[1:3], Case2 = LETTERS[1:3]) %>%
mutate(Case1 = paste('Case1', Case1, sep = "_"),
Case2 = paste('Case2', Case2, sep = "_")) %>%
spread(Case2, Val, fill = 0.0)
# Source: local data frame [3 x 4]
# Case1 Case2_A Case2_B Case2_C
# <chr> <dbl> <dbl> <dbl>
# 1 Case1_A 0.5 0.0 0.4
# 2 Case1_B 0.3 0.7 0.0
# 3 Case1_C 0.0 0.0 0.0
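For readers on a recent tidyr, the same reshaping can probably be written with pivot_wider(), which supersedes spread(). This is only a sketch, assuming tidyr >= 1.0.0 and dplyr are installed; the name wide is made up for illustration. Note that values_fill in pivot_wider() only fills combinations that are absent from the data, so the zeros are supplied in complete() instead.

```r
library(dplyr)
library(tidyr)

# Same input as in the question
param <- c("A", "B", "C")
df <- data.frame(Case1 = c("A", "A", "B", "B"),
                 Case2 = c("A", "C", "A", "B"),
                 Val   = c(0.5, 0.4, 0.3, 0.7),
                 stringsAsFactors = FALSE)

wide <- df %>%
  # add the missing Case1/Case2 combinations, with Val = 0 instead of NA
  complete(Case1 = param, Case2 = param, fill = list(Val = 0)) %>%
  # prefix the labels so they match the desired row and column names
  mutate(Case1 = paste0("Case1_", Case1),
         Case2 = paste0("Case2_", Case2)) %>%
  # one column per Case2 value
  pivot_wider(names_from = Case2, values_from = Val)

wide
```

The result is a tibble with a Case1 column rather than row names; if row names are needed, tibble::column_to_rownames("Case1") is one option.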
Answer 1 (score: 0)
With reshape2, plus base R for the renaming:
df
Case1 Case2 Val
1 A A 0.5
2 A C 0.4
3 B A 0.3
4 B B 0.7
library(reshape2)
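# Make Case1 a factor that includes the unused level 'C'
# (works whether Case1 was read as character or as factor)
df$Case1 <- factor(df$Case1, levels = c('A', 'B', 'C'))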
df <- dcast(df, Case1~Case2, value.var='Val', drop=FALSE)
rownames(df) <- paste('Case1', df[,1], sep='_')
df <- df[-1]
names(df) <- paste('Case2', names(df), sep='_')
df[is.na(df)] <- 0.0
df
Case2_A Case2_B Case2_C
Case1_A 0.5 0.0 0.4
Case1_B 0.3 0.7 0.0
Case1_C 0.0 0.0 0.0
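If no extra package is wanted at all, one possible base R sketch is to pre-allocate a zero matrix and fill it by matrix indexing. It assumes the param vector from the question lists every category and that each (Case1, Case2) pair occurs at most once; a repeated pair would overwrite the earlier value rather than add to it. The names m and idx are made up for illustration.

```r
# Same input as in the question
param <- c("A", "B", "C")
df <- data.frame(Case1 = c("A", "A", "B", "B"),
                 Case2 = c("A", "C", "A", "B"),
                 Val   = c(0.5, 0.4, 0.3, 0.7),
                 stringsAsFactors = FALSE)

# Zero-filled matrix with the desired row and column names
m <- matrix(0, nrow = length(param), ncol = length(param),
            dimnames = list(paste0("Case1_", param),
                            paste0("Case2_", param)))

# Two-column index matrix: (row position, column position) for each data row
idx <- cbind(match(df$Case1, param), match(df$Case2, param))

# Assign each Val to its cell; all other cells stay 0
m[idx] <- df$Val
m
```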
Answer 2 (score: 0)
A base R option is xtabs, after converting the first two columns to factor with levels set to the sorted unique values of the unlisted columns, so that no combination is dropped.
Un1 <- sort(unique(unlist(df[1:2])))
df[1:2] <- lapply(df[1:2], factor, levels = Un1)
res <- xtabs(Val~Case1+Case2, df)
If we need the dimnames in the Case1_/Case2_ form:
dimnames(res) <- Map(paste, names(dimnames(res)), dimnames(res), MoreArgs = list(sep="_"))
names(dimnames(res)) <- NULL
res
# Case2_A Case2_B Case2_C
#Case1_A 0.5 0.0 0.4
#Case1_B 0.3 0.7 0.0
#Case1_C 0.0 0.0 0.0
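The result of xtabs is a table, not a data.frame like the dd in the question. If a data.frame is required, one way that should work is as.data.frame.matrix(), which keeps the two-dimensional layout (plain as.data.frame() on a table would return the long format again). The sketch below repeats the steps so it runs on its own; lev and out are illustrative names.

```r
# Same input as in the question
df <- data.frame(Case1 = c("A", "A", "B", "B"),
                 Case2 = c("A", "C", "A", "B"),
                 Val   = c(0.5, 0.4, 0.3, 0.7),
                 stringsAsFactors = FALSE)

# Shared factor levels so that unused combinations are kept
lev <- sort(unique(unlist(df[1:2])))
df[1:2] <- lapply(df[1:2], factor, levels = lev)

# Cross-tabulate, summing Val within each Case1/Case2 cell
res <- xtabs(Val ~ Case1 + Case2, df)
dimnames(res) <- list(paste0("Case1_", lev), paste0("Case2_", lev))

# Keep the wide layout when converting to a data.frame
out <- as.data.frame.matrix(res)
out
```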