I have a table in R that looks like this:
id col1 col2 col3 col4 col5 col6 col7 col8 col9
101 1 1111 202 2 1120 5512 3 1221 900
102 1 2999 1110 2 2000 5000 3 80 200
103 1 1121 333 2 111 222 3 101 1000
.
.
I am trying to split each subject's long row into multiple rows, like this:
id trial col1 col2
101 1 1111 202
101 2 1120 5512
101 3 1221 900
102 1 2999 1110
102 2 2000 5000
102 3 80 200
103 1 1121 333
103 2 111 222
103 3 101 1000
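I would appreciate any help, as I am new here. I think I want to read the columns as triples and then stack them, but I don't know how to go about it.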
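Answer 0 (score: 3)
The problem with your data is that it is stored in an unconventional way. Usually, when data is converted from wide to long format, the variable names in the wide data become data points in the long data, which is why the operation is called pivoting. Here the trial number is stored in its own columns (col1, col4, col7) rather than in the column names. To work around this, I suggest dropping those columns and renaming the rest so that each name carries the trial number followed by the new column name: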
d <- d[, !grepl("col[147]", names(d))]
names(d)[-1] <- paste(sort(rep(1:3, 2)), paste0("col", 1:2))
Once this is done, reshaping the data with the tidyr package is relatively straightforward.
library(tidyr)
d %>%
gather(key, value, -id) %>%
separate(key, c("trial", "new"), sep = "\\s") %>%
spread(new, value)
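If tidyr is not available, the same reshape can probably be done in base R. The following is only a sketch, not part of the answer above: `reshape()` accepts the column triples directly, so the trial columns do not have to be dropped first. The data frame name `df1` and the temporary `block` column are assumptions made for illustration.

```r
# Example data, typed in from the question
df1 <- data.frame(
  id   = 101:103,
  col1 = c(1L, 1L, 1L), col2 = c(1111L, 2999L, 1121L), col3 = c(202L, 1110L, 333L),
  col4 = c(2L, 2L, 2L), col5 = c(1120L, 2000L, 111L),  col6 = c(5512L, 5000L, 222L),
  col7 = c(3L, 3L, 3L), col8 = c(1221L, 80L, 101L),    col9 = c(900L, 200L, 1000L)
)

# Each list element names the wide columns that feed one long column
long <- reshape(
  df1,
  direction = "long",
  idvar     = "id",
  varying   = list(c("col1", "col4", "col7"),   # trial number
                   c("col2", "col5", "col8"),   # first measurement
                   c("col3", "col6", "col9")),  # second measurement
  v.names   = c("trial", "col1", "col2"),
  timevar   = "block"                           # hypothetical helper column, dropped below
)

# Sort by id and trial, keep only the wanted columns, reset row names
long <- long[order(long$id, long$trial), c("id", "trial", "col1", "col2")]
rownames(long) <- NULL
long
```

The `block` column duplicates `trial` for this data, which is why it is dropped at the end.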
Answer 1 (score: 2)
(A bit manual) How about this?
# repeat each id once per trial: (ncol(df) - 1) / 3 blocks of three columns
res <- cbind(rep(df[, 1], each = (ncol(df) - 1) / 3), matrix(c(t(df[-1])), ncol = 3, byrow = TRUE))
colnames(res) <- c("id", "trial", "col1", "col2")
res
id trial col1 col2
[1,] 101 1 1111 202
[2,] 101 2 1120 5512
[3,] 101 3 1221 900
[4,] 102 1 2999 1110
[5,] 102 2 2000 5000
[6,] 102 3 80 200
[7,] 103 1 1121 333
[8,] 103 2 111 222
[9,] 103 3 101 1000
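The number of repeats per id has to match the number of trial blocks, not the number of rows; the two only coincide in the 3 x 3 example. As a rough sketch, here are the same steps spelled out on a data frame with two ids and three trials. The names `block_size`, `n_trials`, `flat` and `blocks` are introduced here for illustration only.

```r
# Two ids, three trials of three columns each
df <- data.frame(
  id   = c(101L, 102L),
  col1 = c(1L, 1L), col2 = c(1111L, 2999L), col3 = c(202L, 1110L),
  col4 = c(2L, 2L), col5 = c(1120L, 2000L), col6 = c(5512L, 5000L),
  col7 = c(3L, 3L), col8 = c(1221L, 80L),   col9 = c(900L, 200L)
)

block_size <- 3L                              # columns per trial: trial, col1, col2
n_trials   <- (ncol(df) - 1L) %/% block_size  # number of blocks per row

# Transpose so that c() walks through the values row by row
flat <- c(t(as.matrix(df[-1])))

# Cut the flat vector into rows of block_size values
blocks <- matrix(flat, ncol = block_size, byrow = TRUE)

# Prepend the id, repeated once per trial
res <- cbind(rep(df$id, each = n_trials), blocks)
colnames(res) <- c("id", "trial", "col1", "col2")
res
```

With two ids this gives six rows, whereas repeating each id `nrow(df)` times would produce a vector of the wrong length.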
Answer 2 (score: 2)
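Here is an option using array:
# dim = c(number of ids, columns per trial, number of trials); each = trials per id
cbind(rep(df1$id, each = 3),
      apply(aperm(array(unlist(df1[-1]), dim = c(nrow(df1), 3, 3)), c(3, 2, 1)), 2, c))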
# [,1] [,2] [,3] [,4]
# [1,] 101 1 1111 202
# [2,] 101 2 1120 5512
# [3,] 101 3 1221 900
# [4,] 102 1 2999 1110
# [5,] 102 2 2000 5000
# [6,] 102 3 80 200
# [7,] 103 1 1121 333
# [8,] 103 2 111 222
# [9,] 103 3 101 1000
# Data used above
df1 <- structure(list(id = 101:103, col1 = c(1L, 1L, 1L),
    col2 = c(1111L, 2999L, 1121L), col3 = c(202L, 1110L, 333L),
    col4 = c(2L, 2L, 2L), col5 = c(1120L, 2000L, 111L),
    col6 = c(5512L, 5000L, 222L), col7 = c(3L, 3L, 3L),
    col8 = c(1221L, 80L, 101L), col9 = c(900L, 200L, 1000L)),
    .Names = c("id", "col1", "col2", "col3", "col4", "col5", "col6",
    "col7", "col8", "col9"), class = "data.frame",
    row.names = c(NA, -3L))
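The one-liner is dense, so here is a step-by-step sketch of what the array approach appears to do. The intermediate names (`n_id`, `n_field`, `n_trial`, `a`, `b`, `body`, `out`) are made up for this illustration, and the sizes are derived from the data instead of being hardcoded.

```r
df1 <- data.frame(
  id   = 101:103,
  col1 = c(1L, 1L, 1L), col2 = c(1111L, 2999L, 1121L), col3 = c(202L, 1110L, 333L),
  col4 = c(2L, 2L, 2L), col5 = c(1120L, 2000L, 111L),  col6 = c(5512L, 5000L, 222L),
  col7 = c(3L, 3L, 3L), col8 = c(1221L, 80L, 101L),    col9 = c(900L, 200L, 1000L)
)

n_id    <- nrow(df1)                       # number of subjects
n_field <- 3L                              # columns per trial: trial, col1, col2
n_trial <- (ncol(df1) - 1L) %/% n_field    # number of trial blocks

# unlist() goes column by column, so the array is indexed a[id, field, trial]
a <- array(unlist(df1[-1]), dim = c(n_id, n_field, n_trial))

# Reverse the dimensions: b[trial, field, id]
b <- aperm(a, c(3, 2, 1))

# For each field, flatten the trial-by-id slice; trial varies fastest
body <- apply(b, 2, c)

out <- cbind(rep(df1$id, each = n_trial), body)
out
```

Because trial varies fastest after `aperm()`, the rows come out grouped by id, which is what lets the repeated id column line up.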
Answer 3 (score: 1)
text1 = "
id col1 col2 col3 col4 col5 col6 col7 col8 col9
101 1 1111 202 2 1120 5512 3 1221 900
102 1 2999 1110 2 2000 5000 3 80 200
103 1 1121 333 2 111 222 3 101 1000
"
df1 <- read.table(text = text1, header = TRUE, as.is = TRUE)
library(plyr)
ddply(df1, .(id), function(df){
df1 <- df[, 2:4]
df2 <- df[, 5:7]
df3 <- df[, 8:10]
names(df1) <- c("trial", "col1", "col2")
names(df2) <- c("trial", "col1", "col2")
names(df3) <- c("trial", "col1", "col2")
df.n <- do.call(rbind, list(df1, df2, df3))
return(df.n)
})
# id trial col1 col2
# 1 101 1 1111 202
# 2 101 2 1120 5512
# 3 101 3 1221 900
# 4 102 1 2999 1110
# 5 102 2 2000 5000
# 6 102 3 80 200
# 7 103 1 1121 333
# 8 103 2 111 222
# 9 103 3 101 1000
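The same stack-the-triples idea can be sketched without plyr. This is an illustrative base R variant, not the answer's own code; `blocks`, `pieces` and `long` are names chosen here, and it assumes the columns after `id` come in consecutive groups of three.

```r
df1 <- data.frame(
  id   = 101:103,
  col1 = c(1L, 1L, 1L), col2 = c(1111L, 2999L, 1121L), col3 = c(202L, 1110L, 333L),
  col4 = c(2L, 2L, 2L), col5 = c(1120L, 2000L, 111L),  col6 = c(5512L, 5000L, 222L),
  col7 = c(3L, 3L, 3L), col8 = c(1221L, 80L, 101L),    col9 = c(900L, 200L, 1000L)
)

# Group the non-id column names into consecutive triples
value_cols <- names(df1)[-1]
blocks <- split(value_cols, rep(seq_len(length(value_cols) / 3), each = 3))

# For each triple, take id plus those three columns and give them common names
pieces <- lapply(blocks, function(cols) {
  setNames(df1[c("id", cols)], c("id", "trial", "col1", "col2"))
})

# Stack the pieces, then order by subject and trial
long <- do.call(rbind, pieces)
long <- long[order(long$id, long$trial), ]
rownames(long) <- NULL
long
```

Unlike the `ddply()` version, the number of triples is not written out by hand, so the sketch should extend to more trials as long as the column layout stays the same.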