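I need to set a label for each ID in column a based on the values that ID has in column b. For example, if id 1 has only "F", the result should be "Female"; if it has only "M", the result should be "Male"; and if it has both, the result should be "Mixed".
Here is the base data: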
df=data.frame(
a=c(1,1,1,2,2,3,3,3,3,3),
b=c("F","M","F","M","M","F","F","F","F","F"))
Here is the expected result:
df$Result=c("Mixed", "Mixed", "Mixed", "Male", "Male", "Female", "Female", "Female", "Female", "Female")
a b Result
1 1 F Mixed
2 1 M Mixed
3 1 F Mixed
4 2 M Male
5 2 M Male
6 3 F Female
7 3 F Female
8 3 F Female
9 3 F Female
10 3 F Female
Can someone help me compute this df$Result column? Thanks in advance!
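Answer 0 (score: 2)
After grouping by 'a', check the number of distinct elements in 'b'. If it is greater than 1, return "Mixed"; otherwise return the relabeled value of 'b' ("Male" or "Female").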
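library(dplyr)
df %>%
  # map each row's code to its full label: "M" -> "Male", "F" -> "Female"
  mutate(b1 = c("Male", "Female")[(b == "F") + 1]) %>%
  group_by(a) %>%
  # more than one distinct code within an id means "Mixed"; otherwise keep the label
  mutate(Result = case_when(n_distinct(b) > 1 ~ "Mixed", TRUE ~ b1)) %>%
  select(-b1)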
# A tibble: 10 x 3
# Groups: a [3]
# a b Result
# <dbl> <chr> <chr>
# 1 1 F Mixed
# 2 1 M Mixed
# 3 1 F Mixed
# 4 2 M Male
# 5 2 M Male
# 6 3 F Female
# 7 3 F Female
# 8 3 F Female
# 9 3 F Female
#10 3 F Female
Data:
df <- data.frame(
  a = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
  b = c("F", "M", "F", "M", "M", "F", "F", "F", "F", "F"),
  stringsAsFactors = FALSE)
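For comparison, the same idea (count the distinct values of 'b' within each 'a', then label) can be sketched in base R without any packages. This is only a minimal sketch, not part of the answer above; the helper names n_vals and labels are made up for illustration, and it assumes 'b' contains only "F" and "M".

```r
df <- data.frame(
  a = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
  b = c("F", "M", "F", "M", "M", "F", "F", "F", "F", "F"),
  stringsAsFactors = FALSE
)

# For every row, the number of distinct values of b within that row's id.
# ave() is given row indices so the per-group counts come back as integers.
n_vals <- ave(seq_along(df$b), df$a,
              FUN = function(i) length(unique(df$b[i])))

# Lookup from code to full label (assumes b holds only "F" or "M").
labels <- c(F = "Female", M = "Male")

# More than one distinct value means "Mixed"; otherwise use the full label.
df$Result <- ifelse(n_vals > 1, "Mixed", unname(labels[as.character(df$b)]))

df
```

The as.character() call guards against 'b' being a factor, in which case indexing labels by it would use the integer codes rather than the level names.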
Answer 1 (score: 2)
A solution with data.table:
library(data.table)
a = c(1,1,1,2,2,3,3,3,3,3)
b = c("F","M","F","M","M","F","F","F","F","F")
df = data.table(a, b)
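# number of distinct values of b within each a, stored as a string
df[, result := as.character(uniqueN(b)), by = a]
# a single distinct value gets its full label; anything else is "Mixed"
df[, result := ifelse(result == "1", ifelse(b == "M", "Male", "Female"), "Mixed")]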
df
# a b result
# 1: 1 F Mixed
# 2: 1 M Mixed
# 3: 1 F Mixed
# 4: 2 M Male
# 5: 2 M Male
# 6: 3 F Female
# 7: 3 F Female
# 8: 3 F Female
# 9: 3 F Female
# 10: 3 F Female