I have data in this wide format, and I want to convert it to long format:
Cond Construct Line Plant Tube_shoot weight_shoot Tube_root weight_root
1 Standard NA NA 2 199.95 - -
2 Cd0 IIF 43.1 1 3 51.87 4 10.39
3 Cd0 IIF 43.1 2 5 81.80 6 15.05
4 Cd0 IIF 43.1 3 7 101.56 8 16.70
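What I basically want is to keep Tube_shoot and weight_shoot together, i.e. to melt these two columns as a pair. But since I can only use
id.vars=c("Cond","Construct","Line","Plant")
the result is not what I want.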
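So far I have two (ugly) solutions:
I melt twice, first with measure.vars = c("Tube_shoot","Tube_root") and then by weight, and then delete the half of the rows where the results are completely wrong. This is not feasible for me, because my data sets have different lengths and I would always have to check that I picked the right rows.
I paste "Tube" and "weight" together into a new column, drop the other columns, melt, and then separate them again.
Copy them one by one in Excel. But with hundreds of rows, I would rather learn how to do this in R.
I am sure there is a better way.
What I want in the end: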
Cond Construct Line Plant Tube weight
1 Standard NA NA 2 199.95
2 Cd0 IIF 43.1 1 3 51.87
3 Cd0 IIF 43.1 2 5 81.80
4 Cd0 IIF 43.1 3 7 101.56
2 Cd0 IIF 43.1 1 4 10.39
3 Cd0 IIF 43.1 2 6 15.05
4 Cd0 IIF 43.1 3 8 16.70
Answer 0 (score: 2)
You could try
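# Stack the Tube_*/weight_* pairs; the names are split at "_" into value name and time
res <- reshape(df1, idvar=c('Cond', 'Construct', 'Line', 'Plant'),
               varying=5:8, direction='long', sep="_")
# Drop the rows holding the "-" placeholder and the 5th column (the time variable)
res1 <- res[res$weight!='-', -5]
row.names(res1) <- NULL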
res1
# Cond Construct Line Plant Tube weight_shoot
#1 Standard NA NA 2 199.95
#2 Cd0 IIF 43.1 1 3 51.87
#3 Cd0 IIF 43.1 2 5 81.8
#4 Cd0 IIF 43.1 3 7 101.56
#5 Cd0 IIF 43.1 1 4 10.39
#6 Cd0 IIF 43.1 2 6 15.05
#7 Cd0 IIF 43.1 3 8 16.70
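The call above lets reshape() guess the pairing from the column names. As a minimal sketch of the same idea, the pairs can also be spelled out by hand, and the stacked columns converted back to numbers afterwards. The names `part` and `long` are assumptions chosen for illustration, and the data frame is re-typed here so the block runs on its own.

```r
# Same example data as in this answer, re-typed so the block is self-contained.
df1 <- data.frame(
  Cond = c("Standard", "Cd0", "Cd0", "Cd0"),
  Construct = c("", "IIF", "IIF", "IIF"),
  Line = c(NA, 43.1, 43.1, 43.1),
  Plant = c(NA, 1L, 2L, 3L),
  Tube_shoot = c(2L, 3L, 5L, 7L),
  weight_shoot = c(199.95, 51.87, 81.8, 101.56),
  Tube_root = c("-", "4", "6", "8"),
  weight_root = c("-", "10.39", "15.05", "16.70"),
  stringsAsFactors = FALSE
)

# Name each wide pair explicitly instead of relying on sep = "_".
# idvar is left at its default, so reshape() adds a running "id" column.
long <- reshape(
  df1,
  direction = "long",
  varying = list(c("Tube_shoot", "Tube_root"),
                 c("weight_shoot", "weight_root")),
  v.names = c("Tube", "weight"),
  timevar = "part",              # hypothetical name for the shoot/root label
  times = c("shoot", "root")
)

# Drop the "-" placeholder rows and the helper columns (part, id).
long <- long[long$weight != "-",
             c("Cond", "Construct", "Line", "Plant", "Tube", "weight")]

# Stacking mixed the numeric shoot columns with the character root columns,
# so both value columns are character here; convert them back.
long$Tube <- as.integer(long$Tube)
long$weight <- as.numeric(long$weight)
row.names(long) <- NULL
long
```

Spelling out `varying` is more verbose, but it does not depend on the column names following a strict `value_part` pattern.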
Data used above:
df1 <- structure(list(Cond = c("Standard", "Cd0", "Cd0", "Cd0"),
Construct = c("", "IIF", "IIF", "IIF"), Line = c(NA, 43.1, 43.1, 43.1),
Plant = c(NA, 1L, 2L, 3L), Tube_shoot = c(2L, 3L, 5L, 7L), weight_shoot =
c(199.95,51.87, 81.8, 101.56), Tube_root = c("-", "4", "6", "8"),
weight_root = c("-", "10.39", "15.05", "16.70")), .Names = c("Cond",
"Construct", "Line", "Plant", "Tube_shoot", "weight_shoot", "Tube_root",
"weight_root"), class = "data.frame", row.names = c("1", "2", "3", "4"))
Answer 1 (score: 1)
You might want to consider merged.stack from my "splitstackshape" package, with which you can do:
library(splitstackshape)
merged.stack(as.data.table(df1, keep.rownames = TRUE),
var.stubs = c("Tube", "weight"), sep = "_")
# rn Cond Construct Line Plant .time_1 Tube weight
# 1: 1 Standard NA NA root - -
# 2: 1 Standard NA NA shoot 2 199.95
# 3: 2 Cd0 IIF 43.1 1 root 4 10.39
# 4: 2 Cd0 IIF 43.1 1 shoot 3 51.87
# 5: 3 Cd0 IIF 43.1 2 root 6 15.05
# 6: 3 Cd0 IIF 43.1 2 shoot 5 81.8
# 7: 4 Cd0 IIF 43.1 3 root 8 16.70
# 8: 4 Cd0 IIF 43.1 3 shoot 7 101.56
Of course, you can also add [Tube != "-" | weight != "-"] at the end to drop the rows where "Tube" or "weight" is "-" ... but note that doing so won't magically convert those columns to numeric :-)
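For the numeric conversion mentioned here, one possible base R sketch is to read "-" as NA with type.convert() and then drop the incomplete rows. The data frame `stacked` is typed in by hand from a few rows of the output above and is only an illustration. It is a plain data.frame rather than the data.table that merged.stack returns, and `.time_1` is renamed to `part` for readability.

```r
# A few rows copied by hand from the stacked output above (illustrative only);
# every value is still a character string at this point.
stacked <- data.frame(
  part   = c("root", "shoot", "root", "shoot"),
  Tube   = c("-", "2", "4", "3"),
  weight = c("-", "199.95", "10.39", "51.87"),
  stringsAsFactors = FALSE
)

# Treat "-" as missing and let type.convert() choose integer or double.
cols <- c("Tube", "weight")
stacked[cols] <- lapply(stacked[cols], type.convert,
                        na.strings = "-", as.is = TRUE)

# Keep only the rows where both measurements are present.
cleaned <- stacked[complete.cases(stacked[cols]), ]
cleaned
```

Converting first and filtering on NA afterwards avoids comparing against the "-" string in more than one place.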
Answer 2 (score: 1)
Another option, using dplyr and tidyr:
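library(dplyr)
library(tidyr)
# Stack the two Tube columns; x records which column each value came from
gather(df1, x, Tube, c(Tube_shoot, Tube_root)) %>%
  # Pick the weight that belongs to the same plant part as the Tube value
  mutate(weight = ifelse(grepl("root$", x), weight_root, weight_shoot)) %>%
  select(-c(weight_shoot, weight_root, x))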
# Cond Construct Line Plant Tube weight
#1 Standard NA NA 2 199.95
#2 Cd0 IIF 43.1 1 3 51.87
#3 Cd0 IIF 43.1 2 5 81.8
#4 Cd0 IIF 43.1 3 7 101.56
#5 Standard NA NA - -
#6 Cd0 IIF 43.1 1 4 10.39
#7 Cd0 IIF 43.1 2 6 15.05
#8 Cd0 IIF 43.1 3 8 16.70
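In more recent tidyr versions, gather() is superseded by pivot_longer(), whose ".value" placeholder can keep Tube and weight paired in one step. The following is a sketch under that assumption, not part of the original answer. It needs dplyr 1.0 or later for across(), and `part` and `long` are names chosen here for illustration. The measurement columns are cast to character first because the shoot columns are numeric while the root columns are character.

```r
library(dplyr)
library(tidyr)

# Same example data as above, re-typed so the block is self-contained.
df1 <- data.frame(
  Cond = c("Standard", "Cd0", "Cd0", "Cd0"),
  Construct = c("", "IIF", "IIF", "IIF"),
  Line = c(NA, 43.1, 43.1, 43.1),
  Plant = c(NA, 1L, 2L, 3L),
  Tube_shoot = c(2L, 3L, 5L, 7L),
  weight_shoot = c(199.95, 51.87, 81.8, 101.56),
  Tube_root = c("-", "4", "6", "8"),
  weight_root = c("-", "10.39", "15.05", "16.70"),
  stringsAsFactors = FALSE
)

measure_cols <- c("Tube_shoot", "weight_shoot", "Tube_root", "weight_root")

long <- df1 %>%
  # Give all measurement columns one common type before stacking them.
  mutate(across(all_of(measure_cols), as.character)) %>%
  # ".value" turns the part before "_" into output columns (Tube, weight);
  # the part after "_" goes into the hypothetical column `part`.
  pivot_longer(
    cols = all_of(measure_cols),
    names_to = c(".value", "part"),
    names_sep = "_"
  ) %>%
  # Drop the "-" placeholder rows, then restore numeric types.
  filter(weight != "-") %>%
  mutate(Tube = as.integer(Tube), weight = as.numeric(weight)) %>%
  select(-part)

long
```

Unlike the gather() version, this keeps one row per plant and part, so the rows come out grouped by plant rather than with all shoot rows first.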