Question

I have this df:

set.seed(20)
df <- data.frame(X1 = sample(c(1:10,NA), 10, replace=TRUE),
                X2 = sample(c(1:10,NA), 10, replace=TRUE),
                X3 = sample(c(1:10,NA), 10, replace=TRUE),
                stringsAsFactors = FALSE)

> df
   X1 X2 X3
1  10  8  6
2   9  9  1
3   4  1  5
4   6  9  1
5  NA  3  3
6  NA  5  1
7   2  4 10
8   1  2 NA
9   4  4  1
10  5 10  8

on which I can easily apply functions like these:

lapply(df, sum)
df %>% lapply(., sum)
df %>% lapply(., as.numeric)

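However, I can't see how to pass na.rm=TRUE to sum() this way. I have been looking for an answer, and the only solution seems to be defining a wrapper function inside lapply(), for example: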

lapply(df, function(x) sum(x, na.rm = TRUE))

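Is it really impossible to pass the arguments of FUN through lapply? Also, I run into a problem when I want to use the pipe operator with a function that needs the data as an argument (e.g. sum(data, na.rm=TRUE)): I can't supply the data to that function: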

df %>% lapply(., sum(, na.rm=TRUE)) # It needs the sum argument.
df %>% lapply(., sum(., na.rm=TRUE)) # but I'm not looking to sum the whole df

Answer 1

I guess you want to sum the columns of df here. You can do it as follows:

set.seed(seed = 20)

df <- data.frame(X1 = sample(c(1:10, NA), 10, replace = TRUE),
                 X2 = sample(c(1:10, NA), 10, replace = TRUE),
                 X3 = sample(c(1:10, NA), 10, replace = TRUE))

df
#>    X1 X2 X3
#> 1  10  8  6
#> 2   9  9  1
#> 3   4  1  5
#> 4   6  9  1
#> 5  NA  3  3
#> 6  NA  5  1
#> 7   2  4 10
#> 8   1  2 NA
#> 9   4  4  1
#> 10  5 10  8

lapply(df, sum, na.rm = TRUE)
#> $X1
#> [1] 41
#> 
#> $X2
#> [1] 55
#> 
#> $X3
#> [1] 36

Created on 2019-04-02 by the reprex package (v0.2.1)
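The call above works because lapply() is declared as lapply(X, FUN, ...), and whatever is supplied in ... is forwarded to each call of FUN. Here is a minimal sketch comparing it with the wrapper-function form from the question. The data frame is typed in from the printed output rather than regenerated with sample(), on the assumption that the random draws may differ between R versions; the names via_dots and via_anon are only illustrative.

```r
# Data frame typed in from the printed output above (assumption: values
# entered by hand so the example does not depend on the RNG).
df <- data.frame(X1 = c(10, 9, 4, 6, NA, NA, 2, 1, 4, 5),
                 X2 = c(8, 9, 1, 9, 3, 5, 4, 2, 4, 10),
                 X3 = c(6, 1, 5, 1, 3, 1, 10, NA, 1, 8))

# Arguments given after FUN are passed on to every call of sum().
via_dots <- lapply(df, sum, na.rm = TRUE)

# Equivalent form with an explicit anonymous function.
via_anon <- lapply(df, function(x) sum(x, na.rm = TRUE))

# Both approaches return the same named list.
identical(via_dots, via_anon)
#> [1] TRUE
```

The anonymous-function form is still useful when the data is not the first argument of the function being applied.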

An alternative is to use colSums(df, na.rm = TRUE).
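To tie this back to the pipe form asked about in the question, here is a short sketch. It assumes R 4.1 or later for the native |> pipe; with magrittr loaded, df %>% lapply(sum, na.rm = TRUE) should behave the same way, since the pipe supplies df as the first argument of lapply(). It also shows that colSums() returns a named numeric vector rather than a list. The names piped and col_totals are only illustrative.

```r
# Same data as in the printed output above, entered by hand.
df <- data.frame(X1 = c(10, 9, 4, 6, NA, NA, 2, 1, 4, 5),
                 X2 = c(8, 9, 1, 9, 3, 5, 4, 2, 4, 10),
                 X3 = c(6, 1, 5, 1, 3, 1, 10, NA, 1, 8))

# The pipe passes df as X; sum and na.rm = TRUE are written as usual.
piped <- df |> lapply(sum, na.rm = TRUE)

# colSums() gives the same totals as a named numeric vector.
col_totals <- colSums(df, na.rm = TRUE)

# Flattening the list result allows a direct comparison.
all.equal(unlist(piped), col_totals)
#> [1] TRUE
```

No placeholder dot is needed inside sum(), because lapply() itself hands each column to sum() as its first argument.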
