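I have two data frames with different numbers of observations (one long, with 2220 obs, and one wide, with 37 obs). The data frames share the variable "SID", although in the long data frame there are 60 rows for each SID value, while in the wide one there is only 1. The wide data frame has an additional variable, "Experimenter", which gives the corresponding experimenter number for each SID. I want to create an "Experimenter" column in the long data frame so that the corresponding Experimenter value is added and repeated every time an SID value appears (so 60 times per SID).
Nested if-else statements for each subject seem very tedious, so I'm hoping there is an alternative.
I've added the dput output for an excerpt of each data frame; unfortunately, I'm not sure how to embed them properly. Note that in the long data frame "SID" is currently named "Subject", but they are the same variable.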
Wide:
structure(list(SID = 7301:7302, Experimenter = c(2L, 1L)), .Names = c("SID",
"Experimenter"), class = "data.frame", row.names = c(NA, -2L))
Long:
structure(list(Subject = c(7301L, 7301L, 7301L), Session = c(1L,
1L, 1L), Stimtype = structure(c(1L, 1L, 1L), .Label = "Control", class =
"factor"),
Valence = structure(c(1L, 1L, 1L), .Label = "Neutral", class = "factor"),
Block = c(1L, 1L, 1L), Image = c(12L, 17L, 22L), Group = structure(c(1L,
3L, 2L), .Label = c("Neutral_1660", "Neutral_5300", "Neutral_7233"
), class = "factor"), Response = c(1L, 1L, 1L), Stimulus = c(1660L,
7233L, 5300L)), .Names = c("Subject", "Session", "Stimtype",
"Valence", "Block", "Image", "Group", "Response", "Stimulus"), class =
"data.frame", row.names = c(NA,
-3L))
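Going by these excerpts, what I want is an "Experimenter" variable in the long data frame whose value is 2 whenever "Subject" is 7301 (taken from the wide data), and so on for the other subjects.
Thanks in advance.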
Answer 0 (score: 0)
Unless I've misunderstood, this looks like a job for merge / left_join.
In base R:
merge(df2, df1, by.x = "Subject", by.y = "SID")
# Subject Session Stimtype Valence Block Image Group Response Stimulus
#1 7301 1 Control Neutral 1 12 Neutral_1660 1 1660
#2 7301 1 Control Neutral 1 17 Neutral_7233 1 7233
#3 7301 1 Control Neutral 1 22 Neutral_5300 1 5300
# Experimenter
#1 2
#2 2
#3 2
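One detail worth checking before relying on the call above: by default merge() performs an inner join, so any row of the long data frame whose Subject has no match in the wide one would be dropped. The sketch below illustrates this with small made-up data frames (the names wide and long and the unmatched subject 7303 are assumptions for illustration, not part of the question's data); all.x = TRUE keeps every row of the long table.

```r
# Hypothetical toy data: subject 7303 has no row in the wide table
wide <- data.frame(SID = c(7301L, 7302L), Experimenter = c(2L, 1L))
long <- data.frame(Subject = c(7303L, 7301L, 7301L, 7302L),
                   Image   = c(5L, 12L, 17L, 22L))

# Default merge() is an inner join: the unmatched 7303 row is dropped
inner <- merge(long, wide, by.x = "Subject", by.y = "SID")

# all.x = TRUE keeps every row of the long table (a left join);
# unmatched subjects get NA for Experimenter
left <- merge(long, wide, by.x = "Subject", by.y = "SID", all.x = TRUE)

nrow(inner)  # 3
nrow(left)   # 4
```

If every Subject has a matching SID, as the question describes, both calls return the same rows.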
Or using dplyr:
library(dplyr)
left_join(df2, df1, by = c("Subject" = "SID"))
which gives the same result. Sample data used above:
df1 <- structure(list(SID = 7301:7302, Experimenter = c(2L, 1L)), .Names = c("SID",
"Experimenter"), class = "data.frame", row.names = c(NA, -2L))
df2 <- structure(list(Subject = c(7301L, 7301L, 7301L), Session = c(1L,
1L, 1L), Stimtype = structure(c(1L, 1L, 1L), .Label = "Control", class =
"factor"),
Valence = structure(c(1L, 1L, 1L), .Label = "Neutral", class = "factor"),
Block = c(1L, 1L, 1L), Image = c(12L, 17L, 22L), Group = structure(c(1L,
3L, 2L), .Label = c("Neutral_1660", "Neutral_5300", "Neutral_7233"
), class = "factor"), Response = c(1L, 1L, 1L), Stimulus = c(1660L,
7233L, 5300L)), .Names = c("Subject", "Session", "Stimtype",
"Valence", "Block", "Image", "Group", "Response", "Stimulus"), class =
"data.frame", row.names = c(NA,
-3L))
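As a further option, the same lookup can be sketched in base R with match(), which adds the column in place and leaves the row order of the long data frame untouched. The toy data frames below (wide, long) are made up for illustration and only mirror the column names in the question.

```r
# Hypothetical toy data mirroring the question's column names
wide <- data.frame(SID = c(7301L, 7302L), Experimenter = c(2L, 1L))
long <- data.frame(Subject = c(7302L, 7301L, 7301L),
                   Image   = c(22L, 12L, 17L))

# match() returns, for each Subject, the row position of that SID in `wide`
idx <- match(long$Subject, wide$SID)

# Index the Experimenter column by those positions; the value is repeated
# for every row that shares a Subject
long$Experimenter <- wide$Experimenter[idx]

long$Experimenter  # 1 2 2
```

A Subject with no matching SID would get NA here rather than being dropped.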