dplyr - comparing a grouped variable to a subset of the grouping variables

Date: 2019-10-23 17:18:15

Tags: r group-by dplyr

Suppose I have a table of purchases in long format. It looks like this:

purchases = data.frame(
    Item = c("Bike", "Bike", "Bike", "Bike", "Car", "Car", "Car", "Car"),
    Variable = c("Age", "Age", "Price", "Price", "Age", "Age", "Price", "Price"),
    Value = c("New", "Used", "Full", "Discount", "New", "Used", "Discount", "Discount")
)

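I want to see the distribution of values grouped by Item and Variable, so that I can say "50% of all bikes sold were used" or "all cars were sold at a discount."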

The ideal output would be a table like this:

[image: ideal output table]

I can get the counts in dplyr like this:

library(dplyr)
# Count the rows for each Item / Variable / Value combination
purchases %>% group_by(Item, Variable, Value) %>%
    summarise(Total = n())

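Next, I want to divide each count by the total for its Item and Variable group. I can think of some long-winded answers in which I conditionally add the corresponding counts to another variable, but I am hoping to find a simple way to do this with dplyr. Another way to describe it might be performing a calculation one level up in the grouping.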
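For illustration only, the "one level up" calculation described above might be sketched as follows. This is not taken from the original post; it assumes dplyr 0.8.0 or later (for the `name` argument of `count()`), and the names `shares` and `Share` are hypothetical. The idea is to count at the finest level, regroup by Item and Variable only, and divide each count by its group total.

```r
library(dplyr)

# Same data as in the question
purchases <- data.frame(
    Item = c("Bike", "Bike", "Bike", "Bike", "Car", "Car", "Car", "Car"),
    Variable = c("Age", "Age", "Price", "Price", "Age", "Age", "Price", "Price"),
    Value = c("New", "Used", "Full", "Discount", "New", "Used", "Discount", "Discount")
)

shares <- purchases %>%
    # One row per Item / Variable / Value combination, with its count
    count(Item, Variable, Value, name = "Total") %>%
    # Regroup one level up, dropping Value from the grouping
    group_by(Item, Variable) %>%
    # Share of each Value within its Item / Variable group
    mutate(Share = Total / sum(Total)) %>%
    ungroup()

print(shares)
```

Within each Item and Variable pair the `Share` column sums to 1, so a row such as Bike / Age / Used can be read directly as the fraction of bikes sold that were used.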

1 answer:

Answer 0: (score: 2)

version: '3.4'

services:
  webapplication2:
    environment:
      - ASPNETCORE_ENVIRONMENT=Development
      - ASPNETCORE_URLS=https://+:443;http://+:80
      - ASPNETCORE_HTTPS_PORT=44361
    ports:
      - "61225:80"
      - "44361:443"
    volumes:
      - ${APPDATA}/Microsoft/UserSecrets:/root/.microsoft/usersecrets:ro
      - ${APPDATA}/ASP.NET/Https:/root/.aspnet/https:ro
  pgdev01:
    ports:
      - "5434:5432" 
    environment:
      POSTGRES_PASSWORD: "redacted"