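I am manipulating my data with dplyr. After grouping, I would like to subtract the first or second value in each group from all values in that group (i.e. subtract a baseline). Is it possible to do this in a single pipe step?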
MWE:
test <- tibble(one=c("c","d","e","c","d","e"), two=c("a","a","a","b","b","b"), three=1:6)
test %>% group_by(`two`) %>% mutate(new=three-three[.$`one`=="d"])
My desired output is:
# A tibble: 6 x 4
# Groups:   two [2]
  one   two   three   new
  <chr> <chr> <int> <int>
1 c     a         1    -1
2 d     a         2     0
3 e     a         3     1
4 c     b         4    -1
5 d     b         5     0
6 e     b         6     1
But this is the output I get:
# A tibble: 6 x 4
# Groups:   two [2]
  one   two   three   new
  <chr> <chr> <int> <int>
1 c     a         1    -1
2 d     a         2    NA
3 e     a         3     1
4 c     b         4    -1
5 d     b         5    NA
6 e     b         6     1
Answer 1 (score: 1)
We can use first from dplyr:
test %>%
group_by(two) %>%
mutate(new=three- first(three))
# A tibble: 6 x 4
# Groups:   two [2]
#   one   two   three   new
#   <chr> <chr> <int> <int>
# 1 c     a         1     0
# 2 d     a         2     1
# 3 e     a         3     2
# 4 c     b         4     0
# 5 d     b         5     1
# 6 e     b         6     2
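The question also mentions subtracting the second value of each group. A minimal sketch of that case, assuming the baseline is always at position 2 within each group and using dplyr's nth(); the name by_second is only illustrative:

```r
library(dplyr)

# Same example data as in the question
test <- tibble(one   = c("c", "d", "e", "c", "d", "e"),
               two   = c("a", "a", "a", "b", "b", "b"),
               three = 1:6)

# Subtract the second value of each group (positional baseline)
by_second <- test %>%
  group_by(two) %>%
  mutate(new = three - nth(three, 2)) %>%
  ungroup()

print(by_second)
```

This relies on row order within each group, so it only gives the intended baseline if the rows are sorted the way you expect.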
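If we are subsetting the values of 'three' based on the string "c" in 'one', we don't need .$, because .$one returns the whole ungrouped column rather than only the values within the current group. That full-length logical index is then applied to the three values of each group, which is where the NA in the question's output comes from.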
test %>%
group_by(`two`) %>%
mutate(new=three-three[one=="c"])
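The question's own baseline was the row where one is "d" rather than "c". A minimal sketch of that variant, assuming every group contains exactly one "d" row; the names baseline_d and new_safe are only illustrative, and the match() form is a swapped-in alternative that returns NA for a group with no "d" row instead of failing:

```r
library(dplyr)

# Same example data as in the question
test <- tibble(one   = c("c", "d", "e", "c", "d", "e"),
               two   = c("a", "a", "a", "b", "b", "b"),
               three = 1:6)

baseline_d <- test %>%
  group_by(two) %>%
  mutate(
    # Within mutate(), bare column names refer to the current group only
    new      = three - three[one == "d"],
    # match() picks the first "d" row, or NA when the group has none
    new_safe = three - three[match("d", one)]
  ) %>%
  ungroup()

print(baseline_d)
# new is -1, 0, 1 within each group, matching the desired output
```

If a group could contain several "d" rows, the logical-index form would return more than one baseline value, so the match() form (or an explicit check) is the safer choice in that situation.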