Question

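I am trying to categorize a data frame with cascading conditions using tidyverse logic (I am trying to learn it). I can do this with base R but not with the tidyverse. I have found some examples that mix tidyverse and base R (using subset), but I could not find or understand how to do it with dplyr/tidyverse syntax alone (filter, mutate).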

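The problem is that after subsetting on the first condition (with filter), the data frame contains only the filtered rows, so I cannot subset and categorize on the remaining criteria. I could probably use a temporary df and rbind(), but I think there may be a more elegant way that uses only tidyverse syntax. In short, I want to update only the rows that match my conditions and leave all other rows in the original DF unchanged, using dplyr syntax. Is that possible?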

# with base R
    mydata$mytype = "NA"
    mydata$mytype[which(mydata$field1 > 300)] = "type1"
    mydata$mytype[which(mydata$field1 <= 300 & mydata$field1 > 200)] = "type2"

# with dplyr/tidyverse?
    library(tidyverse)
    mydata<-mydata%>% mutate(mytype = "NA")
    mydata<-mydata%>%filter(field1>300) %>% mutate(mytype="type1") 
    mydata<-mydata%>%filter(field1 >200, field1<=300) %>% mutate(mytype="type2")  #0 rows now

Answer 1

One option is to use cut:

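df$mytype  <- cut(df$field1, breaks = c(-Inf, 200, 300, +Inf),
                        labels = c("NA", "Type2", "Type1"))

cut uses right-closed intervals by default, so breaks at 200 and 300 reproduce the conditions field1 > 200 & field1 <= 300 and field1 > 300. Since the OP did not provide any data, here is the same call applied to a vector:

cut(c(100, 190, 250, 260, 310), breaks = c(-Inf, 200, 300, +Inf),
                labels = c("NA", "Type2", "Type1"))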
#[1] NA    NA    Type2 Type2 Type1
#Levels: NA Type2 Type1
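
To see how cut treats values that sit exactly on a break, here is a minimal sketch. The vector `x` is made up purely for illustration, and the breaks at 200 and 300 are chosen to mirror the question's conditions (`field1 > 200 & field1 <= 300` and `field1 > 300`). It relies on cut's default `right = TRUE`, which makes each interval closed on the right.

```r
# Hypothetical boundary values, chosen only to illustrate interval closure.
x <- c(200, 201, 300, 301)

# Default right = TRUE: intervals are (-Inf,200], (200,300], (300,Inf].
res <- cut(x, breaks = c(-Inf, 200, 300, Inf),
           labels = c("NA", "Type2", "Type1"))

as.character(res)
# "NA" "Type2" "Type2" "Type1"
```

Note that the result is a factor, and "NA" here is an ordinary label (a string), not a missing value.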

Answer 2

With dplyr, you can:

1- Set the `breaks` for `field1` and their `labels`.

# (-Inf,200] -> "NA", (200,300] -> "type2", (300,Inf] -> "type1"
breaks <- c(-Inf, 200, 300, Inf)

labels <- c("NA", "type2", "type1")

2- Then run:

library(dplyr)
df <- df %>% mutate(category = cut(field1, breaks = breaks, labels = labels))
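
If you would rather spell the conditions out than use cut, one possible alternative is `dplyr::case_when` inside `mutate`. This is only a sketch, not part of the answer above: the sample data frame is hypothetical (the question did not include data), and the `"NA"` placeholder string is copied from the question's base R version.

```r
library(dplyr)

# Hypothetical sample data; the question did not include any.
df <- data.frame(field1 = c(100, 190, 250, 260, 310))

df <- df %>%
  mutate(mytype = case_when(
    field1 > 300 ~ "type1",
    field1 > 200 ~ "type2",   # reached only when field1 <= 300
    TRUE         ~ "NA"       # same placeholder string as the base R version
  ))

df$mytype
# "NA" "NA" "type2" "type2" "type1"
```

The conditions are checked in order and the first match wins, so the second line does not need to repeat `field1 <= 300`. Because `mutate` works on every row, no rows are dropped, unlike the chained `filter` calls in the question.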

Categorize a data frame using filter/mutate and dplyr/tidyverse logic
