My data frame looks like this (the real one is similar but has many more rows and columns):
  Gender Energetic Weekly_Apple Weekly_Banana
1 Female         3           No           Yes
2 Female         3           No           Yes
3   Male         5           No           Yes
4   Male         2           No            No
5 Female         1           No            No
I would like some short code that tallies the "Yes" responses and produces the following output:
        Male Female
Apples     0      0
Bananas    1      2
The number of apple eaters is 0 for each gender; 1 male and 2 females eat bananas.
I tried the following:
count(original_data, c("Gender","Weekly_Apple"))
count(original_data, c("Gender","Weekly_Banana"))
count(original_data, c("Gender","Weekly_Grape"))
count(original_data, c("Gender","Weekly_PineApple"))
aggregate(x = original_data[c("Weekly_Apple",
"Weekly_Banana",
"Weekly_Grape")],
by = original_data[c("Gender")],
FUN = n())
Answer 0 (score: 2)
As suggested by NelsonGon, I have replaced df1 <- t(df1) with tidyr::crossing(df1).
library(dplyr)
df <- data.frame(
  Gender = c("Female", "Female", "Male", "Male", "Female"),
  Energetic = c(3, 3, 5, 2, 1),
  Weekly_Apple = c("No", "No", "No", "No", "No"),
  Weekly_Banana = c("Yes", "Yes", "Yes", "No", "No"))
df1 <- df %>%
  group_by(Gender) %>%
  summarise(
    Apples = sum(Weekly_Apple == "Yes"),
    Bananas = sum(Weekly_Banana == "Yes")
  )
df1 <- tidyr::crossing(df1)
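The grouped summary has one row per gender, whereas the question asks for fruits as rows and genders as columns. A minimal base R sketch of that last reshaping step, assuming df1 holds the columns Gender, Apples and Bananas with the counts from the example data (the name wide is only illustrative, and the summary is rebuilt by hand here so the snippet runs on its own):

```r
# Rebuild the grouped summary as a plain data frame
# (counts copied from the example data: bananas 2 for Female, 1 for Male)
df1 <- data.frame(Gender = c("Female", "Male"),
                  Apples = c(0L, 0L),
                  Bananas = c(2L, 1L))

# Transpose only the count columns, then label the columns by gender
wide <- t(df1[, c("Apples", "Bananas")])
colnames(wide) <- df1$Gender
wide
```

Transposing only the numeric columns keeps the result an integer matrix; calling t() on the whole data frame would coerce every value to character because of the Gender column.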
Answer 1 (score: 1)
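A data.table possibility could be:
library(data.table)
# Convert first so that melt() and dcast() dispatch to the data.table methods
dt <- as.data.table(df)
dcast(melt(dt[, !"Energetic"], id.vars = "Gender"),
      variable ~ Gender,
      value.var = "value",
      fun.aggregate = function(x) sum(x == "Yes"))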
variable Female Male
1 Weekly_Apple 0 0
2 Weekly_Banana 2 1
Answer 2 (score: 1)
You can use base R:
table(reshape(cbind(df,id=1:nrow(df)),3:4,idvar = "id",dir="long",sep="_")[-(2:3)])[,,'Yes']
        time
Gender   Apple Banana
  Female     0      2
  Male       0      1
Or even:
xtabs(Weekly~time+Gender,transform(reshape(cbind(df,id=1:nrow(df)),3:4,idvar = "id",dir="long",sep="_"),Weekly=Weekly=="Yes"))
        Gender
time     Female Male
  Apple       0    0
  Banana      2    1
Answer 3 (score: 1)
A dplyr-tidyr alternative:
df %>%
group_by(Gender) %>%
summarise_at(vars(contains("Weekly")), function(x) sum(x=="Yes")) %>%
tidyr::gather(key, val , -Gender) %>%
tidyr::spread(Gender, val)
# A tibble: 2 x 3
  key           Female  Male
  <chr>          <int> <int>
1 Weekly_Apple       0     0
2 Weekly_Banana      2     1
Answer 4 (score: 0)
Another base R version, with tapply:
t(sapply(names(df)[3:4], function(nm) with(df, tapply(df[[nm]]=="Yes", Gender,sum))))
# Female Male
#Weekly_Apple 0 0
#Weekly_Banana 2 1
Or with split:
sapply(split(df[3:4], df$Gender), function(x) colSums(x == "Yes"))
Or a variation of it:
sapply(split(as.data.frame(df[3:4] == "Yes"), df$Gender), colSums)
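For comparison, the aggregate() call attempted in the question can also be made to work. The likely problem there is FUN = n(): aggregate() expects a function object, while n() is a dplyr helper that is evaluated on the spot and only works inside dplyr verbs. A minimal sketch, assuming the five-row example data from the question (the name counts is only illustrative):

```r
# Example data from the question
df <- data.frame(
  Gender = c("Female", "Female", "Male", "Male", "Female"),
  Energetic = c(3, 3, 5, 2, 1),
  Weekly_Apple = c("No", "No", "No", "No", "No"),
  Weekly_Banana = c("Yes", "Yes", "Yes", "No", "No"))

# Pass a function to FUN; it is applied to each column within each gender
counts <- aggregate(x = df[c("Weekly_Apple", "Weekly_Banana")],
                    by = df["Gender"],
                    FUN = function(x) sum(x == "Yes"))
counts
```

This gives one row per gender with a count column per fruit, so it still needs a transpose if fruits should be the rows.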