Question

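I have a data frame named cst with the columns country, ID, and age. I want to create bins of age separately for each country (splitting all IDs into deciles or quartiles). I used cut like this: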

cut(cst[!is.na(cst$age), "age"], quantile(cst["age"], probs = seq(0,1,0.1), na.rm = T))

However, this creates the bins over the whole data frame, and I need them computed separately for each country.
Could you help me?

Answer 1

I'd try a dplyr solution, which would look something like this:

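library(dplyr)
cst2 <- cst %>%
  group_by(country) %>%
  mutate(
    # decile breaks are computed within each country;
    # include.lowest = TRUE keeps each country's minimum age in the first bin
    bin = cut(age, quantile(age, probs = seq(0, 1, 0.1), na.rm = TRUE),
              include.lowest = TRUE)
  ) %>%
  ungroup()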
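
One detail worth checking when cut is given quantile breaks: by default its intervals are open on the left, so the smallest value falls outside the first bin and becomes NA unless include.lowest = TRUE is passed. A minimal base R check on a made-up vector (the ages below are hypothetical, not taken from the question):

```r
# Made-up ages, for illustration only
age <- c(21, 25, 30, 34, 40, 47, 52, 60)
breaks <- quantile(age, probs = seq(0, 1, 0.25))  # quartile breaks

default_bins <- cut(age, breaks)                         # intervals are (a, b]
closed_bins  <- cut(age, breaks, include.lowest = TRUE)  # first interval is [a, b]

sum(is.na(default_bins))  # counts values left without a bin (the minimum age)
sum(is.na(closed_bins))   # counts values left without a bin once the minimum is included
```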

Answer 2

All you need to do is subset the data before using cut. This approach also doesn't use the dplyr library.

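cst$bin <- NA_character_  # bin labels; stays NA where age is missing
for (c in unique(cst$country)) {
  idx <- which(cst$country == c & !is.na(cst$age))  # rows of this country with a known age
  breaks <- quantile(cst$age[idx], probs = seq(0, 1, 0.1))  # per-country decile breaks
  # store the result; include.lowest = TRUE keeps the minimum age in the first bin
  cst$bin[idx] <- as.character(cut(cst$age[idx], breaks, include.lowest = TRUE))
}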
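
Another base R option, sketched here on the assumption that integer bin numbers are enough, is ave(), which applies a function within each group and returns the results in the original row order. The toy data and the helper name quartile_bin are made up for illustration; quartiles are used to keep the example small, and seq(0, 1, 0.1) would give deciles instead.

```r
# Hypothetical toy data standing in for cst
cst <- data.frame(
  country = rep(c("A", "B"), each = 8),
  ID = 1:16,
  age = c(21, 25, 30, 34, 40, 47, 52, 60, 18, 19, 23, 28, 31, 35, 44, NA)
)

# Helper (hypothetical name): quartile number 1..4 within one group; NA ages stay NA
quartile_bin <- function(x) {
  breaks <- quantile(x, probs = seq(0, 1, 0.25), na.rm = TRUE)
  as.integer(cut(x, breaks, include.lowest = TRUE))
}

# ave() runs the helper separately for each country
cst$bin <- ave(cst$age, cst$country, FUN = quartile_bin)
table(cst$country, cst$bin, useNA = "ifany")
```

This breaks if a country has tied quantile values, since cut requires unique breaks; wrapping the breaks in unique() is one possible workaround, at the cost of fewer bins.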

How to create bins in R
