Following up on my previous post, I now also need to use an additional ID column when spreading rows into columns.
NUM <- c(1,2,3,1,2,3,1,2,3,1)
ID <- c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48")
Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
Points <- c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4)
df1 <- data.frame(ID,NUM,Type,Points)
df1:
+------+-----+------+--------+
| ID | Num | Type | Points |
+------+-----+------+--------+
| DJ45 | 1 | A | 9.2 |
| DJ45 | 2 | F | 60.8 |
| DJ45 | 3 | C | 22.9 |
| DJ46 | 1 | B | 1012.7 |
| DJ46 | 2 | D | 18.7 |
| DJ46 | 3 | A | 11.1 |
| DJ47 | 1 | E | 67.2 |
| DJ47 | 2 | C | 63.1 |
| DJ47 | 3 | F | 16.7 |
| DJ48 | 1 | D | 58.4 |
+------+-----+------+--------+
My desired output is:
+------+-----+------+--------+------+------+------+------+
| ID | Num | A | B | C | D | E | F |
+------+-----+------+--------+------+------+------+------+
| DJ45 | 1 | 9.2 | N/A | N/A | N/A | N/A | N/A |
| DJ45 | 2 | N/A | N/A | N/A | N/A | N/A | 60.8 |
| DJ45 | 3 | N/A | N/A | 22.9 | N/A | N/A | N/A |
| DJ46 | 1 | N/A | 1012.7 | N/A | N/A | N/A | N/A |
| DJ46 | 2 | N/A | N/A | N/A | 18.7 | N/A | N/A |
| DJ46 | 3 | 11.1 | N/A | N/A | N/A | N/A | N/A |
| DJ47 | 1 | N/A | N/A | N/A | N/A | 67.2 | N/A |
| DJ47 | 2 | N/A | N/A | 63.1 | N/A | N/A | N/A |
| DJ47 | 3 | N/A | N/A | N/A | N/A | N/A | 16.7 |
| DJ48 | 1 | N/A | N/A | N/A | 58.4 | N/A | N/A |
+------+-----+------+--------+------+------+------+------+
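I am using the spread function in R, but I get a duplicate identifiers error. This is because I now have two columns (ID & NUM) instead of the single column (NUM) I had before. Please let me know how to do this.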
Answer 0 (score: 4)
Without knowing what you tried, I would suggest:
library(tidyr) # provides spread()
spread(df1, Type, Points)
# ID NUM A B C D E F
# 1 DJ45 1 9.2 NA NA NA NA NA
# 2 DJ45 2 NA NA NA NA NA 60.8
# 3 DJ45 3 NA NA 22.9 NA NA NA
# 4 DJ46 1 NA 1012.7 NA NA NA NA
# 5 DJ46 2 NA NA NA 18.7 NA NA
# 6 DJ46 3 11.1 NA NA NA NA NA
# 7 DJ47 1 NA NA NA NA 67.2 NA
# 8 DJ47 2 NA NA 63.1 NA NA NA
# 9 DJ47 3 NA NA NA NA NA 16.7
# 10 DJ48 1 NA NA NA 58.4 NA NA
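If you get an error about duplicate identifiers, it is because the combination of "ID" and "NUM" has one or more duplicated entries (for the same Type) in your actual data. In your sample data, it does not.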
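One way to check whether this applies to your data is to count how often each key combination occurs. The following is only a sketch: it assumes dplyr is installed, rebuilds the sample data from the question, and uses the made-up names `find_dupes`, `dupes_sample` and `dupes_dup`.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  ID     = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM    = c(1,2,3,1,2,3,1,2,3,1),
  Type   = c("A","F","C","B","D","A","E","C","F","D"),
  Points = c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4)
)

# Hypothetical helper: return every ID/NUM/Type combination that occurs more than once
find_dupes <- function(d) {
  d %>%
    count(ID, NUM, Type) %>%
    filter(n > 1)
}

dupes_sample <- find_dupes(df1)                        # sample data as posted
dupes_dup    <- find_dupes(rbind(df1, df1[1:3, ]))     # first three rows repeated
```

An empty result (as with `dupes_sample`) means every key combination is unique. Any rows returned (as with `dupes_dup`) are combinations that would trigger the error.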
If that is the case, you need to add another column to make the rows unique.
Adding dplyr to the chain, it might look something like this:
df1 %>%
group_by(ID, NUM) %>%
mutate(id2 = sequence(n())) %>%
spread(Type, Points)
Demonstrating the hypothesized error:
df2 <- rbind(df1, df1[1:3, ]) ## Duplicate the first three rows
spread(df2, Type, Points)
# Error: Duplicate identifiers for rows (1, 11), (3, 13), (2, 12)
library(dplyr)
df2 %>%
group_by(ID, NUM) %>%
mutate(id2 = sequence(n())) %>%
spread(Type, Points)
# Source: local data frame [13 x 9]
#
# ID NUM id2 A B C D E F
# 1 DJ45 1 1 9.2 NA NA NA NA NA
# 2 DJ45 1 2 9.2 NA NA NA NA NA
# 3 DJ45 2 1 NA NA NA NA NA 60.8
# 4 DJ45 2 2 NA NA NA NA NA 60.8
# 5 DJ45 3 1 NA NA 22.9 NA NA NA
# 6 DJ45 3 2 NA NA 22.9 NA NA NA
# 7 DJ46 1 1 NA 1012.7 NA NA NA NA
# 8 DJ46 2 1 NA NA NA 18.7 NA NA
# 9 DJ46 3 1 11.1 NA NA NA NA NA
# 10 DJ47 1 1 NA NA NA NA 67.2 NA
# 11 DJ47 2 1 NA NA 63.1 NA NA NA
# 12 DJ47 3 1 NA NA NA NA NA 16.7
# 13 DJ48 1 1 NA NA NA 58.4 NA NA
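In newer tidyr releases (1.0.0 and later), `spread()` is superseded by `pivot_wider()`. The same disambiguation idea carries over. The following is a sketch under that assumption, not part of the original answer; the name `wide` is made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, with the first three rows duplicated
df1 <- data.frame(
  ID     = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM    = c(1,2,3,1,2,3,1,2,3,1),
  Type   = c("A","F","C","B","D","A","E","C","F","D"),
  Points = c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4)
)
df2 <- rbind(df1, df1[1:3, ])

wide <- df2 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%   # running counter within each ID/NUM pair
  ungroup() %>%
  pivot_wider(names_from = Type, values_from = Points)  # one column per Type
```

The Type columns may come out in a different order than with `spread()`, so reorder them afterwards if the alphabetical layout matters.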