Creating a frequency analysis results table in R

Time: 2018-07-05 09:03:16

Tags: r dplyr crosstab tabular

I need to create a table of a particular type (a template).

My data:
df=structure(list(group = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 
1L), degree = structure(c(1L, 1L, 1L, 1L, 1L, 3L, 2L, 1L, 1L, 
1L), .Label = c("Mild severity", "Moderate severity", "Severe severity"
), class = "factor")), .Names = c("group", "degree"), class = "data.frame", row.names = c(NA, 
-10L))

I build a cross-tabulation:

table(df$degree,df$group)

                    1 2 3
  Mild severity     3 3 2
  Moderate severity 0 0 1
  Severe severity   0 0 1

However, I need the results laid out as in this template: [![enter image description here][1]][1]

How can I create a table with this structure?

Very important edit

Full dput() (42 observations):

df=structure(list(Study.Subject.ID = structure(c(1L, 2L, 3L, 4L, 
5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 5L, 7L, 8L, 9L, 1L, 2L, 3L, 5L, 
8L, 2L, 3L, 5L, 8L, 2L, 3L, 5L, 8L, 2L, 3L, 5L, 8L, 3L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L), .Label = c("01-06-104", "01-09-108", 
"01-15-201", "01-16-202", "01-18-204", "01-27-301", "01-28-302", 
"01-33-305", "01-42-310"), class = "factor"), group = c(1L, 1L, 
2L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 
2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 2L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), Degree.of.severity = structure(c(2L, 
2L, 2L, 2L, 2L, 4L, 3L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("Life-threatening or disabling", 
"Mild severity", "Moderate severity", "Severe severity"), class = "factor")), .Names = c("Study.Subject.ID", 
"group", "Degree.of.severity"), class = "data.frame", row.names = c(NA, 
-42L))

There are subjects, and there are side effects. One person can have several side effects. The severity of a side effect can be

Mild
Moderate
Severe

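For each group, I need to count how many people have a side effect of a given severity, and how many side effects of that severity occur in the group.

I.e., in the first group we have 9 observations but only two unique people: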

01-06-104
01-09-108

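However, the total number of Mild severity events in that group is 7. So only two people have a Mild severity side effect (X), while the total number of Mild severity events is 7 (Y). The total number of observations is 42, so to calculate the percentage we divide by 42: 2/42 = 4,7%.

So the desired output is: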

    degree             group1          group2          group3
                       X (%) Y         X (%) Y         X (%) Y

    Mild severity      2 (4,7%) 7      3 (7,1%) 13     3 (7,1%) 12
    Moderate severity  1 (2,3%) 1      0 (0,0%) 0      2 (4,7%) 6
    Severe severity    0 (0,0%) 0      0 (0,0%) 0      1 (2,3%) 1

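3 Answers:

Answer 0: (score: 2)

I must admit that it is not clear to me what you are trying to do. Unfortunately, your expected-output image does not help.

Assuming you are asking how to compute a two-way contingency table that shows both counts and percentages (of the total), here is one tidyverse possibility: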

library(tidyverse)
df %>%
    group_by(group, degree) %>%
    summarise(n = n(), perc = n() / nrow(.)) %>%
    mutate(entry = sprintf("%i (%3.2f%%)", n, perc * 100)) %>%
    select(-n, -perc) %>%
    spread(group, entry, fill = "0 (0.00%)")  # fill is a literal string, so use a single %
## A tibble: 3 x 4
#  degree            `1`        `2`        `3`
#  <fct>             <chr>      <chr>      <chr>
#1 Mild severity     3 (30.00%) 3 (30.00%) 2 (20.00%)
#2 Moderate severity 0 (0.00%)  0 (0.00%)  1 (10.00%)
#3 Severe severity   0 (0.00%)  0 (0.00%)  1 (10.00%)

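Answer 1: (score: 1)

You want percentages as well as counts? Try:

n = table(df$degree, df$group)
# prop.table(n, 2) gives within-group (column) percentages;
# n / colSums(n) would recycle the sums along rows and give wrong values
res = as.data.frame(cbind(prop.table(n, 2) * 100, n))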

Answer 2: (score: 0)

Using base R:

a = transform(data.frame(table(df)),Freq = sprintf("%d (%3.2f%%)",Freq,prop.table(Freq)*100))
data.frame(t(unstack(a,Freq~degree)))
                          X1         X2         X3
Mild.severity     3 (30.00%) 3 (30.00%) 2 (20.00%)
Moderate.severity  0 (0.00%)  0 (0.00%) 1 (10.00%)
Severe.severity    0 (0.00%)  0 (0.00%) 1 (10.00%)
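
For the 42-row data in the question's edit, the requested "X (%) Y" layout (X = distinct subjects, % = X divided by all 42 rows, Y = number of events) could be sketched in base R roughly as follows. This is only a sketch under the assumption that the denominator is the total row count, as the question states; the names `ids`, `sev`, `events`, `uniq`, `people`, `pct` and `res` are illustrative and not from the original post, and the data frame is rebuilt compactly from the dput() values.

```r
# Rebuild the 42-row data from the question's edit (same values as the dput()).
ids <- c("01-06-104", "01-09-108", "01-15-201", "01-16-202", "01-18-204",
         "01-27-301", "01-28-302", "01-33-305", "01-42-310")
sev <- c("Life-threatening or disabling", "Mild severity",
         "Moderate severity", "Severe severity")
df <- data.frame(
  Study.Subject.ID = factor(ids[c(1:9, 1, 2, 3, 5, 7, 8, 9, 1, 2, 3, 5, 8,
                                  2, 3, 5, 8, 2, 3, 5, 8, 2, 3, 5, 8, 3,
                                  rep(8, 8))], levels = ids),
  group = c(1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 1, 2, 2, 3, 3, 3, 1, 1, 2, 2, 3,
            1, 2, 2, 3, 1, 2, 2, 3, 1, 2, 2, 3, 2, rep(3, 8)),
  Degree.of.severity = factor(sev[c(2, 2, 2, 2, 2, 4, 3, 2, 2, 2, 3, 2, 2, 2,
                                    2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2,
                                    2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 3, 3, 3, 3)],
                              levels = sev)
)

# Y: number of events (rows) per severity and group.
events <- table(df$Degree.of.severity, df$group)

# X: number of distinct subjects per severity and group.
# Dropping repeated subject/group/severity rows first means each
# person is counted once per cell.
uniq <- unique(df[c("Study.Subject.ID", "group", "Degree.of.severity")])
people <- table(uniq$Degree.of.severity, uniq$group)

# Percentage of X relative to all rows (assumption: denominator is 42).
pct <- 100 * people / nrow(df)

# Paste the three numbers into one "X (%) Y" string per cell.
res <- matrix(
  sprintf("%d (%.1f%%) %d",
          as.vector(people), as.vector(pct), as.vector(events)),
  nrow = nrow(events),
  dimnames = list(degree = rownames(events),
                  group = paste0("group", colnames(events)))
)
print(res, quote = FALSE)
```

Because "Life-threatening or disabling" is a factor level, it appears as a fourth row. Note also that `sprintf` rounds, so 2/42 is printed as 4.8% rather than the truncated 4,7% written in the question.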