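I have this function for logical (Venn diagram) counting, but I can't make it generic for a data frame of any size.
As written, it only works for the data frame provided (four columns only):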
library(data.table)  # data.table(), inrange()
library(dplyr)       # %>%, as_tibble(), mutate(), count()
how_much = 5000000
A <- sample(how_much, replace = TRUE, x = 1:5)
B <- sample(how_much, replace = TRUE, x = 1:5)
C <- sample(how_much, replace = TRUE, x = 1:5)
D <- sample(how_much, replace = TRUE, x = 1:5)
VennData = data.table(A, B, C, D)
Venn_Counts <- function(dataset, unique_number, operator) {
  message("Operator argument can be: `==`, `<`, `<=`, `>` or `>=`")
  # The data only holds values 1 to 5, so validate against that range
  if (inrange(unique_number, 1, 5)) {
    dataset %>%
      as_tibble() %>%
      mutate(A = operator(A, unique_number),
             B = operator(B, unique_number),
             C = operator(C, unique_number),
             D = operator(D, unique_number)) %>%
      count(A, B, C, D)
  } else {
    print("Unique number must be in range from 1 to 5")
  }
}
Venn_Counts(VennData, 2, operator = `<=`)
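How can we make the function above generic for data frames with more columns?
For a smaller object, we would get something like this,
with the arguments set to unique_number = 3 and operator = `==`: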
count  A      B
   24  TRUE   TRUE
   20  TRUE   FALSE
   13  FALSE  TRUE
   43  FALSE  FALSE
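Here we see 24 observations where A and B both equal 3, 20 observations where A equals 3 and B does not, 13 observations where A does not equal 3 and B does, and so on.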
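To make the expected shape of the result concrete, here is a minimal two-column sketch. The `small` data below is made up for illustration (it is not the data behind the counts above), so the numbers differ, but each row of the result is one TRUE/FALSE combination together with its count.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
small <- tibble(
  A = c(3, 3, 1, 2, 3),
  B = c(3, 1, 3, 2, 3)
)

# Turn each column into a logical flag, then count every flag combination
counts <- small %>%
  mutate(A = A == 3, B = B == 3) %>%
  count(A, B)

counts
```

Rows 1 and 5 of `small` have both values equal to 3, so the TRUE/TRUE combination is counted twice, and every other combination once.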
Answer 0 (score: 1)
How about using the scoped verbs from dplyr:
library(data.table)
library(dplyr)
how_much = 5000000
A <- sample(how_much, replace = TRUE, x = 1:5)
B <- sample(how_much, replace = TRUE, x = 1:5)
C <- sample(how_much, replace = TRUE, x = 1:5)
D <- sample(how_much, replace = TRUE, x = 1:5)
VennData = data.table(A, B, C, D)
Venn_Counts <- function(dataset, unique_number, operator) {
  message("Operator argument can be: `==`, `<`, `<=`, `>` or `>=`")
  if (inrange(unique_number, 1, 5)) {
    dataset %>%
      as_tibble() %>%
      # Apply the comparison to every column, whatever their number or names
      mutate_all(~ operator(.x, unique_number)) %>%
      group_by_all() %>%
      count()
  } else {
    print("Unique number must be in range from 1 to 5")
  }
}
Venn_Counts(VennData, 2, operator = `<=`)
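In dplyr 1.0.0 and later the scoped verbs (`mutate_all()`, `group_by_all()`) are superseded by `across()`. A rough equivalent could look like the sketch below. The function name `venn_counts_across` is my own, and the range check is left out to keep it short, so treat it as an illustration rather than a drop-in replacement.

```r
library(dplyr)

# Hypothetical variant of Venn_Counts using across() instead of scoped verbs
venn_counts_across <- function(dataset, unique_number, operator) {
  dataset %>%
    as_tibble() %>%
    # Compare every column against unique_number
    mutate(across(everything(), ~ operator(.x, unique_number))) %>%
    # Count each combination of TRUE/FALSE flags across all columns
    count(across(everything()))
}

# Small made-up example with three columns
toy <- data.frame(A = c(1, 2, 3), B = c(1, 2, 3), C = c(3, 3, 3))
res <- venn_counts_across(toy, 2, `<=`)
res
```

Because nothing in the body refers to a column by name, the same call should work for any number of columns.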
Answer 1 (score: 0)
We can compare dataset with unique_number directly using operator, then group by all columns and calculate the count.
Venn_Counts <- function(dataset, unique_number, operator) {
  message("Operator argument can be: `==`, `<`, `<=`, `>` or `>=`")
  if (inrange(unique_number, 1, 5)) {
    # Comparing the whole dataset at once flags every cell
    (operator(dataset, unique_number)) %>%
      as_tibble() %>%
      group_by_all() %>%
      summarise(n = n())
  } else {
    print("Unique number must be in range from 1 to 5")
  }
}
Venn_Counts(VennData, 2, operator = `<=`)
#   A     B     C     D         n
#   <lgl> <lgl> <lgl> <lgl> <int>
# 1 FALSE FALSE FALSE FALSE     2
# 2 FALSE FALSE FALSE TRUE      3
# 3 FALSE TRUE  TRUE  FALSE     1
# 4 TRUE  FALSE FALSE TRUE      2
# 5 TRUE  FALSE TRUE  FALSE     1
# 6 TRUE  TRUE  TRUE  TRUE      1
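The direct comparison works because a comparison operator applied to a data frame and a single number is evaluated cell by cell. A small sketch with a plain `data.frame` (the `toy` data is made up for illustration) shows what `as_tibble()` receives in that case:

```r
# Hypothetical toy data frame, invented for this illustration
toy <- data.frame(A = c(1, 3, 5), B = c(2, 2, 4))

# Calling the operator as a function, the way Venn_Counts does
flags <- `<=`(toy, 2)

flags             # one TRUE/FALSE per cell, column names preserved
is.matrix(flags)  # for a plain data.frame the result is a logical matrix
```

The column names survive the comparison, which is what lets the later grouping step refer to A, B and so on without naming them.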
Data
library(data.table)
library(tidyverse)
set.seed(1234)
how_much = 10
A <- sample(how_much, replace = TRUE, x = 1:5)
B <- sample(how_much, replace = TRUE, x = 1:5)
C <- sample(how_much, replace = TRUE, x = 1:5)
D <- sample(how_much, replace = TRUE, x = 1:5)
VennData = data.table(A, B, C, D)
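Since the data is already a data.table, the same idea can probably be written without dplyr at all. The sketch below is an assumption on my part rather than something from the answers above: the name `venn_counts_dt` is hypothetical, and the range check is omitted for brevity.

```r
library(data.table)

# Hypothetical data.table-only variant; not part of the original answers
venn_counts_dt <- function(dataset, unique_number, operator) {
  dt <- as.data.table(dataset)
  group_cols <- names(dt)
  # Flag every column against unique_number
  flags <- dt[, lapply(.SD, function(col) operator(col, unique_number))]
  # Count rows for each combination of flags; .N is the group size
  flags[, .N, by = group_cols]
}

# Small made-up example
toy <- data.table(A = c(1, 2, 3), B = c(1, 2, 3))
res <- venn_counts_dt(toy, 2, `<=`)
res
```

Note that the count column is called `N` here rather than `n`, and groups come back in order of first appearance instead of sorted.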