Question

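I am using dplyr to manipulate my data. After grouping the data, I would like to subtract the first or second value in each group from all values in that group (i.e. subtract a baseline). Is it possible to do this in a single pipe step?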

MWE:

library(dplyr)
test <- tibble(one=c("c","d","e","c","d","e"), two=c("a","a","a","b","b","b"), three=1:6)
test %>% group_by(`two`) %>% mutate(new=three-three[.$`one`=="d"])

My desired output is:

# A tibble: 6 x 4
# Groups:   two [2]
  one   two   three   new
  <chr> <chr> <int> <int>
1 c     a         1    -1
2 d     a         2     0
3 e     a         3     1
4 c     b         4    -1
5 d     b         5     0
6 e     b         6     1

But this is the output I get instead:

# A tibble: 6 x 4
# Groups:   two [2]
  one   two   three   new
  <chr> <chr> <int> <int>
1 c     a         1    -1
2 d     a         2    NA
3 e     a         3     1
4 c     b         4    -1
5 d     b         5    NA
6 e     b         6     1

Answer 1

    MasterFile

            A     B      C     D           E            F        
        Yahoo    009    899   777   Spoke to client    INV# 123      


    WorkBook2 --Expected Results Column Q

    SHEET1


         A       B         C         D       E       F       Q
          ID123   Google   INV# 345    89      XX     333   

    SHEET2 --The result was found therefore column Q is populated with Column E from the master file.

         A          B          C     D         E      F      Q
        ID009     Yahoo   INV#123   777      444      223    **Spoke to client**


    SHEET3

        A             B            C          D         E      F    Q
       ID456       MICROSOFT     INV#000      676      989    123

Answer 2

We can use first from dplyr:

test %>%
   group_by(two) %>%
   mutate(new = three - first(three))
# A tibble: 6 x 4
# Groups: two [2]
#  one   two   three   new
#  <chr> <chr> <int> <int>
#1 c     a         1     0
#2 d     a         2     1
#3 e     a         3     2
#4 c     b         4     0
#5 d     b         5     1
#6 e     b         6     2
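
For the second value in each group rather than the first, dplyr also provides nth(). The following is a minimal sketch reusing the test data from the question; it assumes every group has at least two rows, since nth() returns NA for a position that does not exist. The name res_nth is only illustrative.

```r
library(dplyr)

# Same example data as in the question
test <- tibble(one   = c("c", "d", "e", "c", "d", "e"),
               two   = c("a", "a", "a", "b", "b", "b"),
               three = 1:6)

# Subtract the second value of `three` within each group of `two`
res_nth <- test %>%
  group_by(two) %>%
  mutate(new = three - nth(three, 2)) %>%
  ungroup()

print(res_nth)
```

With this data the second row of each group carries the label "d", so the result should match the output requested in the question.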

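If we subset the 'three' values based on the string "c" in 'one', then we don't need .$, because .$one returns the whole 'one' column of the ungrouped data instead of only the values of 'one' within the current group: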

test %>% 
   group_by(`two`) %>%
   mutate(new=three-three[one=="c"])
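
The question uses "d" as the baseline label, so the same pattern with one == "d" should give the desired output. The sketch below shows that, plus a more defensive variant based on match(). The variant is an assumption on my part rather than something from the question: it picks the first matching row and yields NA for a group with no match, whereas the plain logical subset relies on there being exactly one "d" row per group. The names res_d and res_match are only illustrative.

```r
library(dplyr)

# Same example data as in the question
test <- tibble(one   = c("c", "d", "e", "c", "d", "e"),
               two   = c("a", "a", "a", "b", "b", "b"),
               three = 1:6)

# Baseline = the row labelled "d" within each group (no `.$` needed)
res_d <- test %>%
  group_by(two) %>%
  mutate(new = three - three[one == "d"]) %>%
  ungroup()

# Defensive variant: match() returns the index of the first "d" in the
# group, or NA if the group has none, so the baseline is always length 1
res_match <- test %>%
  group_by(two) %>%
  mutate(new = three - three[match("d", one)]) %>%
  ungroup()

print(res_d)
```

Inside mutate() on a grouped tibble, bare column names refer to the current group's slice, which is why one == "d" has the same length as three here.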

Subtract the first or second value from each row
