I have a table like this:
Ab1 Ab2 Ab3
Meronem Eus Biclar
Aug Tazocin
Aug Pc
Aug Eth Amukin
Aug
Tazocin
Kefzol Avelox Meronem
Aug Amukin Tazocin
Kefzol Tazocin
I want to classify the rows of this table according to the Ab class that each Ab belongs to:
Ab Ab class
Aug bl
Pentrexyl Pcl
Zitromax Mld
Azactam Mb
Kefzol Cp
Biclar Mld
Eth Mld
Pc Pcl
Meronem Cb
Tazocin bl
Amukin Am
Eus Ts
I would like to end up with a final table like this:
Ab1 Ab2 Ab3 bl Pcl Mld Mb Cp Cb Am Ts
Meronem Eus Biclar 0 0 1 0 0 1 0 1
Aug Tazocin 2 0 0 0 0 0 0 0
Aug Pc 1 1 0 0 0 0 0 0
Aug Eth Amukin 1 0 1 0 0 0 1 0
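I tried assigning values to keys and matching them, but I could not find a solution. Any help would be greatly appreciated.
Answer 0 (score: 2)
Here is a solution that uses some data reshaping to get the second part (the Ab classes) into the right format, and then simply binds it to the first part.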
library(reshape2)
# add a line ID so each row can be identified after reshaping
ab$line_ID <- seq_len(nrow(ab))
# turn to long format: one row per (line_ID, Ab) pair
ab_long <- melt(ab, id.vars = "line_ID", value.name = "Ab")
# merge with the Ab class data (dropping NAs for convenience);
# all.x = TRUE keeps Abs that have no known class, e.g. Avelox
ab_long_merge <- merge(ab_long[!is.na(ab_long$Ab), ], ab_classes, by = "Ab", all.x = TRUE)
# count classes per line and spread to wide format using dcast
ab_wide_merge <- dcast(ab_long_merge, line_ID ~ Abclass, fun.aggregate = length, value.var = "Abclass")[, -1] # -1 drops the line_ID column
# create our desired output
output <- cbind(ab[, 1:3], ab_wide_merge)
> output
Ab1 Ab2 Ab3 Am bl Cb Cp Mld Pcl Ts NA
1 Meronem Eus Biclar 0 0 1 0 1 0 1 0
2 Aug Tazocin <NA> 0 2 0 0 0 0 0 0
3 Aug Pc <NA> 0 1 0 0 0 1 0 0
4 Aug Eth Amukin 1 1 0 0 1 0 0 0
5 Aug <NA> <NA> 0 1 0 0 0 0 0 0
6 Tazocin <NA> <NA> 0 1 0 0 0 0 0 0
7 Kefzol Avelox Meronem 0 0 1 1 0 0 0 1
8 Aug Amukin Tazocin 1 2 0 0 0 0 0 0
9 Kefzol Tazocin <NA> 0 1 0 1 0 0 0 0
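Note that the output above has no Mb column (no row contains an Mb drug), orders the classes alphabetically, and gains an NA column because Avelox has no entry in the class table. If the exact column layout from the question is wanted, one possible alternative is a base-R sketch like the one below. It is not part of the original answer; the names `class_order`, `lookup`, `count_classes` and `output2` are made up for illustration, and only the first four rows of the data are used to keep it short.

```r
# First four rows of the question's data (illustrative subset)
ab <- data.frame(
  Ab1 = c("Meronem", "Aug", "Aug", "Aug"),
  Ab2 = c("Eus", "Tazocin", "Pc", "Eth"),
  Ab3 = c("Biclar", NA, NA, "Amukin"),
  stringsAsFactors = FALSE
)
ab_classes <- data.frame(
  Ab = c("Aug", "Pentrexyl", "Zitromax", "Azactam", "Kefzol", "Biclar",
         "Eth", "Pc", "Meronem", "Tazocin", "Amukin", "Eus"),
  Abclass = c("bl", "Pcl", "Mld", "Mb", "Cp", "Mld",
              "Mld", "Pcl", "Cb", "bl", "Am", "Ts"),
  stringsAsFactors = FALSE
)

# Column order requested in the question (hypothetical helper name)
class_order <- c("bl", "Pcl", "Mld", "Mb", "Cp", "Cb", "Am", "Ts")

# Named lookup vector: drug name -> class
lookup <- setNames(ab_classes$Abclass, ab_classes$Ab)

# Count the classes in one row. Fixing the factor levels keeps classes
# that never occur (such as Mb) as zero columns. Drugs without a known
# class map to NA and are silently left out of the counts.
count_classes <- function(drugs) {
  cls <- lookup[drugs[!is.na(drugs)]]
  as.vector(table(factor(cls, levels = class_order)))
}

counts <- t(apply(ab[, c("Ab1", "Ab2", "Ab3")], 1, count_classes))
colnames(counts) <- class_order

output2 <- cbind(ab, counts)
print(output2)
```

The design choice here is to drive the columns from a fixed vector of classes rather than from whatever classes happen to appear in the data, so the layout stays stable across inputs. Whether dropping unclassified drugs is acceptable depends on the use case; the reshape2 version keeps them visible in the NA column.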
Data used:
ab <- structure(list(Ab1 = c("Meronem", "Aug", "Aug", "Aug", "Aug",
"Tazocin", "Kefzol", "Aug", "Kefzol"), Ab2 = c("Eus", "Tazocin",
"Pc", "Eth", NA, NA, "Avelox", "Amukin", "Tazocin"), Ab3 = c("Biclar",
NA, NA, "Amukin", NA, NA, "Meronem", "Tazocin", NA)), .Names = c("Ab1",
"Ab2", "Ab3"), row.names = c(NA, -9L), class = "data.frame")
ab_classes <- structure(list(Ab = structure(c(2L, 10L, 12L, 3L, 7L, 4L, 5L,
9L, 8L, 11L, 1L, 6L), .Label = c("Amukin", "Aug", "Azactam",
"Biclar", "Eth", "Eus", "Kefzol", "Meronem", "Pc", "Pentrexyl",
"Tazocin", "Zitromax"), class = "factor"), Abclass = structure(c(2L,
7L, 6L, 5L, 4L, 6L, 6L, 7L, 3L, 2L, 1L, 8L), .Label = c("Am",
"bl", "Cb", "Cp", "Mb", "Mld", "Pcl", "Ts"), class = "factor")), .Names = c("Ab",
"Abclass"), class = "data.frame", row.names = c(NA, -12L))
# (the data were originally read in using read.table)