I'll try to illustrate my problem with an example.
Sample data frame:
myData <- data.frame(Country = c("Germany","UK","Mexico","Spain"),
MyCount = c(300,800,950,125),
Continent = c("Europe","Europe","America","Europe"))
Country MyCount Continent
Germany 300 Europe
UK 800 Europe
Mexico 950 America
Spain 125 Europe
Expected result:
Country MyCount Continent
Other 425 Europe
UK 800 Europe
This is what I tried:
myData %>%
filter(Continent == "Europe" & MyCount < 800)%>%
add_row(Country = "Other", MyCount = sum(MyCount), Continent = "Europe")
Answer 0 (score: 1)
@Mandy I'm not entirely clear on the specific requirements of your use case, but based on your comments this should work. It uses group_by and summarise from dplyr.
myData %>%
filter(Continent == 'Europe') %>%
mutate(grp = ifelse(MyCount < 800, 'Other', Country)) %>%
group_by(grp) %>%
summarise(MyCount = sum(MyCount))
# A tibble: 2 × 2
grp MyCount
<chr> <dbl>
1 Other 425
2 UK 800
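The expected result in the question also carries the Continent column and names the first column Country. A minimal sketch along the same lines, assuming the threshold and the continent should be adjustable, might look like the following. The helper name lump_small and its arguments keep_continent and threshold are hypothetical and not part of dplyr.

```r
library(dplyr)

myData <- data.frame(Country = c("Germany", "UK", "Mexico", "Spain"),
                     MyCount = c(300, 800, 950, 125),
                     Continent = c("Europe", "Europe", "America", "Europe"))

# Hypothetical helper (not part of dplyr): within one continent, lump every
# country whose MyCount is below `threshold` into a single "Other" row.
lump_small <- function(data, keep_continent, threshold) {
  data %>%
    filter(Continent == keep_continent) %>%
    # as.character() guards against Country being a factor
    mutate(Country = ifelse(MyCount < threshold, "Other", as.character(Country))) %>%
    group_by(Country, Continent) %>%
    summarise(MyCount = sum(MyCount)) %>%
    ungroup() %>%
    select(Country, MyCount, Continent)
}

result <- lump_small(myData, "Europe", 800)
print(result)
```

Grouping by both Country and Continent keeps Continent in the summarised output, and the final select only restores the column order used in the question.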
Answer 1 (score: 1)
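If I am reading your sample correctly, the following would be one way. It seems that you want to take the European data and then summarise it into the countries whose MyCount is 800 or more and all other European countries. If so, you can replace the levels of the European countries whose MyCount is below 800 with "Other" and then summarise the data.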
library(dplyr)
library(forcats)  # provides fct_other()
filter(myData, Continent == "Europe") %>%
group_by(Country = fct_other(Country, keep = Country[MyCount >= 800])) %>%
summarise(MyCount = sum(MyCount))
# Country MyCount
# <fctr> <dbl>
#1 UK 800
#2 Other 425
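To see what the relabelling step does on its own, here is a small sketch of fct_other applied to plain vectors. The vectors country and count are illustrative stand-ins for the European rows of myData and are not part of the original answer.

```r
library(forcats)

# Illustrative stand-ins for the European rows of myData
country <- c("Germany", "UK", "Spain")
count   <- c(300, 800, 125)

# Keep the levels whose count is at least 800; every other level becomes "Other"
lumped <- fct_other(country, keep = country[count >= 800])

print(as.character(lumped))
print(levels(lumped))
```

fct_other moves the "Other" level to the end of the level order, which would explain why UK is listed before Other in the summarised output above.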
Answer 2 (score: 0)
It is not entirely clear what you are looking for, but this gives you the result you posted in your question.
library(dplyr)
myData <- data.frame(Country = c("Germany","UK","Mexico","Spain"),
MyCount = c(300,800,950,125),
Continent = c("Europe","Europe","America","Europe"))
myData %>%
filter(Continent == 'Europe') %>%
mutate(Country = as.character(Country),
Country = ifelse(Country %in% c('UK'), Country, 'Other')) %>%
group_by(Country, Continent) %>%
summarize(MyCount = sum(MyCount)) %>%
select(Country, MyCount, Continent)
# A tibble: 2 x 3
# Groups: Country [2]
Country MyCount Continent
<chr> <dbl> <fctr>
1 Other 425 Europe
2 UK 800 Europe
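For comparison, the same relabel-then-sum idea can be sketched without dplyr, assuming base R's aggregate is acceptable. The intermediate names europe and result are only illustrative.

```r
myData <- data.frame(Country = c("Germany", "UK", "Mexico", "Spain"),
                     MyCount = c(300, 800, 950, 125),
                     Continent = c("Europe", "Europe", "America", "Europe"))

# Keep only the European rows
europe <- myData[myData$Continent == "Europe", ]

# Relabel every country except UK as "Other"
europe$Country <- ifelse(europe$Country %in% c("UK"),
                         as.character(europe$Country), "Other")

# Sum MyCount within each Country/Continent combination
result <- aggregate(MyCount ~ Country + Continent, data = europe, FUN = sum)

# Restore the column order used in the question
result <- result[, c("Country", "MyCount", "Continent")]
print(result)
```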