min and max give NA values in R dplyr

Date: 2018-12-04 04:31:01

Tags: r dplyr max min na

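I want to find the minimum and maximum rate we charge for each task, and I am evaluating more than 250,000 rows. I don't know why it gives me NA values.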

    Data Sample:
    # A tibble: 279,360 x 7
       Job.ID  Task.ID   Task.Name               Task.Minutes Task.BillableRa~ Task.Billable Task.Amount
       <chr>   <chr>     <chr>                   <chr>        <chr>            <chr>         <chr>      
     1 W210238 248323800 E.2 Engineer - Progres~ 1080         137.00           Yes           2466.00    
     2 W210196 249251898 E.2 Engineer            450          137.00           Yes           1027.50    
     3 W210188 249251899 E.2 Engineer            120          137.00           Yes           274.00     
     4 W210229 249251900 E.0 Junior Engineer     90           78.00            Yes           117.00     
     5 W210179 249251901 D.3 Snr Designer        1620         127.00           Yes           3429.00    
     6 W210180 249991653 A.1 Contract Administr~ 60           108.00           Yes           108.00     
     7 W210212 249991654 D.2 Snr Drafter         60           119.00           Yes           119.00     
     8 W210198 250055633 A.1 Contract Administr~ 1500         108.00           Yes           2700.00    
     9 W210223 250055634 D.2 Snr Drafter         5940         119.00           Yes           11781.00   
    10 W210220 250057691 A.1 Contract Administr~ 270          108.00           Yes           486.00     
    # ... with 279,350 more rows

    Code:
    x2 <- x2 %>%
        group_by(Task.Name) %>%
        mutate(Task.Ratemax = max(Task.BillableRate)) %>%
        mutate(Task.RateMin = min(Task.BillableRate)) %>%
        select(Task.Name, Task.Ratemax, Task.RateMin) %>%
        unique()

Actual outcome:
# A tibble: 39 x 3
# Groups:   Task.Name [39]
   Task.Name                                    Task.Ratemax Task.RateMin
   <chr>                                        <chr>        <chr>       
 1 E.2 Engineer                                 168.00       127.00      
 2 E.0 Junior Engineer                          98.00        ""          
 3 D.3 Snr Designer                             140.00       119.00      
 4 A.1 Contract Administration                  75.00        102.50      
 5 D.2 Snr Drafter                              135.00       ""          
 6 E.3 Senior Engineer                          168.00       130.00      
 7 X.5 HA Design and Audit                      178.00       161.00      
 8 P.7 Contract Project Manager                 135.00       135.00      
 9 A.3 Client Meetings/Reporting and Site Visit 143.00       140.00      
10 D.1 Draftsperson                             95.00        110.00      
# ... with 29 more rows
> 

2 Answers:

Answer 0 (score: 1)

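Thanks to Ronak Shah for sharing this code with me; it works well. The rate column is stored as character (<chr>), so max() and min() were comparing strings rather than numbers, and blank entries came out as the minimum. Converting with as.numeric() and passing na.rm = TRUE fixes both problems.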

x2 %>% 
    group_by(Task.Name) %>% 
    mutate(Task.Ratemax= max(as.numeric(Task.BillableRate), na.rm = TRUE),
           Task.RateMin = min(as.numeric(Task.BillableRate), na.rm = TRUE)) %>%
    select(Task.Name, Task.Ratemax,Task.RateMin) %>%
    unique()

# A tibble: 39 x 3
# Groups:   Task.Name [39]
   Task.Name                                    Task.Ratemax Task.RateMin
   <chr>                                               <dbl>        <dbl>
 1 E.2 Engineer                                         168           127
 2 E.0 Junior Engineer                                  103.           78
 3 D.3 Snr Designer                                     140           119
 4 A.1 Contract Administration                          119            75
 5 D.2 Snr Drafter                                      135           109
 6 E.3 Senior Engineer                                  168           130
 7 X.5 HA Design and Audit                              178           161
 8 P.7 Contract Project Manager                         135           135
 9 A.3 Client Meetings/Reporting and Site Visit         143           140
10 D.1 Draftsperson                                     119            95
# ... with 29 more rows
> 
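The effect of comparing text instead of numbers can be reproduced on a few rows. The following is a minimal sketch, assuming dplyr is installed; the `toy` tibble and the helper column `rate` are made up for illustration and are not part of the original data. It also uses `summarise()`, which returns one row per group directly, as a possible alternative to `mutate()` followed by `select()` and `unique()`.

```r
library(dplyr)

# Hypothetical toy data: rates stored as text, with one blank entry
toy <- tibble(
  Task.Name = c("E.2 Engineer", "E.2 Engineer", "E.0 Junior Engineer",
                "E.0 Junior Engineer", "E.0 Junior Engineer"),
  Task.BillableRate = c("137.00", "168.00", "78.00", "103.50", "")
)

# On character vectors, max() and min() compare strings, not numbers
max(toy$Task.BillableRate)  # "78.00": as text, "7" sorts after "1"
min(toy$Task.BillableRate)  # "": the blank string sorts first

rate_range <- toy %>%
  mutate(rate = as.numeric(Task.BillableRate)) %>%  # "" becomes NA
  group_by(Task.Name) %>%
  summarise(
    Task.Ratemax = max(rate, na.rm = TRUE),  # na.rm drops the NA from ""
    Task.RateMin = min(rate, na.rm = TRUE)
  )

rate_range
```

If the column should be numeric everywhere, it may be simpler to convert it once right after import, so that every later step works on numbers.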

Answer 1 (score: 0)

I used dummy data to work through the task.

charges <- structure(list(Task.name = c("a", "b", "c", "d", "a", "b", "r", 
"e", "t", "c", "d", "a", "e", "t", "y", "c", "b", "r", "w", "e", 
"a", "c", "a"), Task.rate = c(291L, 299L, 142L, 145L, 143L, 251L, 
465L, 61L, 326L, 412L, 257L, 330L, 185L, 342L, 346L, 497L, 143L, 
315L, 206L, 167L, 492L, 397L, 288L)), class = "data.frame", row.names = c(NA, 
-23L))

Answer:

f1 <- function(x) c(Max = max(x), Min = min(x))

f2 <- do.call(data.frame, aggregate(Task.rate ~ Task.name, charges, f1))

This uses base R.
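The dummy data above already holds integer rates, whereas the question's column is character with blank entries. As a rough sketch of how the same base R approach might be adapted, the example below converts the column first; the `charges_chr` data frame is hypothetical and exists only for illustration.

```r
# Hypothetical data: rates stored as text, with one blank entry
charges_chr <- data.frame(
  Task.name = c("a", "b", "a", "b", "c"),
  Task.rate = c("291", "299", "143", "", "142"),
  stringsAsFactors = FALSE
)

# Convert once; the blank string becomes NA
charges_chr$Task.rate <- as.numeric(charges_chr$Task.rate)

f1 <- function(x) c(Max = max(x), Min = min(x))

# The formula method of aggregate() drops NA rows by default
# (na.action = na.omit), so f1 needs no na.rm argument here.
# aggregate() returns Max and Min as one matrix column;
# do.call(data.frame, ...) splits it into two ordinary columns.
rate_range <- do.call(
  data.frame,
  aggregate(Task.rate ~ Task.name, charges_chr, f1)
)

rate_range
```

The resulting columns are named `Task.rate.Max` and `Task.rate.Min`, which come from the names returned by `f1`.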