I have a data frame (alter.hh2) that looks like this:
   wk   hh  brd count flavor mean_multi h_size
1 W52 1213  546     1  PEACH       2.11      2
2 W52 4493  546     1    BBA       1.63      2
5 W53 2093 5367     4    PEA       2.12      2
6 W53 2043 5366     5   RBYA       1.93      1
9 W53 2093  546     8   VANI       1.78      2
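For each row, I want to append the values of flavor (if it is a different value), mean_multi and brd from the same week, as shown below, while keeping the rest of the values in each row unchanged: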
 wk   hh  brd count flavor mean_multi h_size flavor2 brd2 mean_multi2
W52 1213  546     1  PEACH       2.11      2     BBA  546        1.63
W52 4493  546     1    BBA       1.63      2   PEACH  546        2.11
W53 2093 5367     4    PEA       2.12      2    RBYA 5367        1.93
W53 2043 5366     5   RBYA       1.93      1     PEA 5366        2.12
If there are more than 2 values per week, I would like the result to look like this (iteratively):
 wk   hh  brd count flavor mean_multi h_size flavor2 brd2 mean_multi2 flavor3 brd3 mean_multi3
W53 2093 5367     4    PEA       2.12      2    RBYA 5366        1.93    VANI  546        1.78
W53 2043 5366     5   RBYA       1.93      1     PEA 5367        2.12    VANI  546        1.78
W53 2093  546     8   VANI       1.78      2     PEA 5367        2.12    RBYA 5366        1.93
I tried using reshape with the following code, but it does not give me the desired result:
w <- reshape(alter.hh2,
timevar = c("flavor","wk"),
idvar = c("count", "hh"),
direction = "wide")
Any insight is much appreciated!
Answer 0 (score: 0)
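We can use data.table. Get the names of the columns that need to be appended ("brd", "flavor", "mean_multi") and store them in 'nm1'. Convert the 'data.frame' to a 'data.table' (setDT(alter.hh2)), group by 'hh' and select the first row of each group (head(.SD, 1)). Then, grouping by 'wk' and specifying the columns in .SDcols, loop over those columns, reverse each one (rev), and assign (:=) the output to new columns. Note that rev pairs each row with its counterpart only when a week has exactly two rows after this filtering.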
library(data.table)
nm1 <- names(alter.hh2)[c(3, 5, 6)]
setDT(alter.hh2)[,head(.SD, 1) , hh][, paste0(nm1, 2) := lapply(.SD, rev),
by = wk, .SDcols = nm1][]
#     hh  wk  brd count flavor mean_multi h_size brd2 flavor2 mean_multi2
#1: 1213 W52  546     1  PEACH       2.11      2  546     BBA        1.63
#2: 4493 W52  546     1    BBA       1.63      2  546   PEACH        2.11
#3: 2093 W53 5367     4    PEA       2.12      2 5366    RBYA        1.93
#4: 2043 W53 5366     5   RBYA       1.93      1 5367     PEA        2.12
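The code above keeps one row per 'hh' and uses rev, so it covers only weeks with two rows. For the case in the question where a week has more than two rows, one possible approach is a self-join within each week followed by dcast. This is a minimal sketch and not part of the original answer: the helper names (dt, other, pairs, wide, result, id, other_id, k) are made up for illustration, and the data is rebuilt from the values shown in the question with assumed column types.

```r
library(data.table)

# Rebuild the example data from the question (column types are an assumption).
alter.hh2 <- data.table(
  wk         = c("W52", "W52", "W53", "W53", "W53"),
  hh         = c(1213, 4493, 2093, 2043, 2093),
  brd        = c(546, 546, 5367, 5366, 546),
  count      = c(1, 1, 4, 5, 8),
  flavor     = c("PEACH", "BBA", "PEA", "RBYA", "VANI"),
  mean_multi = c(2.11, 1.63, 2.12, 1.93, 1.78),
  h_size     = c(2, 2, 2, 1, 2)
)

# Columns whose values from the other rows of the same week get appended.
nm1 <- c("flavor", "brd", "mean_multi")

# Work on a copy and tag every row with an id (hypothetical helper column).
dt <- copy(alter.hh2)[, id := .I]

# Pair every row with every row of the same week (self-join on wk).
other <- dt[, c("wk", "id", nm1), with = FALSE]
setnames(other, "id", "other_id")
pairs <- merge(dt[, .(wk, id)], other, by = "wk", allow.cartesian = TRUE)

# Drop self-pairs, keep the original row order, and number the
# remaining partners 2, 3, ... for each row.
pairs <- pairs[id != other_id][order(id, other_id)]
pairs[, k := seq_len(.N) + 1L, by = id]

# One set of columns per partner: flavor2, brd2, mean_multi2, flavor3, ...
wide <- dcast(pairs, id ~ k, value.var = nm1, sep = "")

# Attach the new columns to the original rows and drop the helper id.
result <- merge(dt, wide, by = "id")[, id := NULL]

# Interleave the appended columns as in the desired output.
ks <- sort(unique(pairs$k))
new_cols <- paste0(rep(nm1, times = length(ks)), rep(ks, each = length(nm1)))
setcolorder(result, c(names(alter.hh2), new_cols))

print(result)
```

Unlike the answer's code, this sketch keeps all five rows; weeks with fewer partners than the maximum get NA in the unused columns (for example flavor3 for the W52 rows).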