Question

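I originally had a dataset of ungrouped (individual-level) data, which I converted to grouped data on the "Occupation" column. I now want to fit a logistic regression model to the grouped data, with "success" defined as the number of people who died in each occupation category.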

Data used and how it was grouped

Before any grouping, my initial dataset looked like this:

 Occupation Education Age Died
1  household Secondary  39   no
2    farming   primary  83  yes
3    farming   primary  60  yes
4    farming   primary  73  yes
5    farming Secondary  51   no
6    farming iliterate  62  yes

I then grouped the data using:

# Count the suicide victims in each Occupation/Died combination
occu %>% group_by(Occupation, Died) %>% count()

This gives the following output:

Occupation       Died      n
   <fct>            <fct> <int>
 1 business/service no       12
 2 business/service yes       9
 3 farming          no      939
 4 farming          yes    1093
 5 household        no      154
 6 household        yes      94
 7 others           yes       3
 8 others/unknown   no      146
 9 others/unknown   yes      10
10 professional     no       11
11 professional     yes      26
12 retiree          no        3
13 student          no       27
14 student          yes       8
15 unemployed       no       23
16 unemployed       yes       7
17 worker           yes       6

I then added the per-occupation totals and proportions to this table, using:

dt %>% group_by(Occupation) %>% 
  mutate(total=sum(n), prop=n/total)

Which gives the output:

   Occupation       Died      n total   prop
   <fct>            <fct> <int> <int>  <dbl>
 1 business/service no       12    21 0.571 
 2 business/service yes       9    21 0.429 
 3 farming          no      939  2032 0.462 
 4 farming          yes    1093  2032 0.538 
 5 household        no      154   248 0.621 
 6 household        yes      94   248 0.379 
 7 others           yes       3     3 1     
 8 others/unknown   no      146   156 0.936 
 9 others/unknown   yes      10   156 0.0641
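
For a binomial model on counts, it is usually easier to have one row per occupation with the deaths and survivors in separate columns. The following is only a sketch, assuming `tidyr` is available; the small `dt` here retypes a few of the rows shown above by hand, and the column names `died`, `survived`, and `prop_died` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# A few rows of the counts shown above, retyped by hand for illustration
dt <- data.frame(
  Occupation = c("business/service", "business/service", "farming",
                 "farming", "household", "household", "others"),
  Died       = c("no", "yes", "no", "yes", "no", "yes", "yes"),
  n          = c(12, 9, 939, 1093, 154, 94, 3)
)

# One row per occupation, with the "no" and "yes" counts side by side;
# Occupation/Died combinations that never occur are filled with 0
wide <- dt %>%
  pivot_wider(names_from = Died, values_from = n, values_fill = 0) %>%
  rename(survived = no, died = yes) %>%
  mutate(total = died + survived, prop_died = died / total)

print(wide)
```

Filling the missing combinations with 0 matters for categories such as "others", which has deaths but no survivors in the output above.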

Problem

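My question is: how do I run a logistic regression model on this grouped data, using all three predictors from the original dataset (Education, Age, and the grouped Occupation), with Died = yes counted as a success and Died = no as a failure?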
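
To make the setup concrete, here is a hedged sketch of the two usual ways to fit such a model with `glm()`: on the individual rows, or on counts of successes and failures. Counts grouped by Occupation alone no longer carry Education or Age, so the sketch groups by every predictor in the model. The data frame `occu_sim` is simulated purely so the example runs on its own; it is not the real data, and the names `died01`, `fit_rows`, and `fit_grouped` are hypothetical.

```r
library(dplyr)

set.seed(1)

# Hypothetical individual-level data with the same columns as `occu`
# (simulated only so that the sketch is self-contained)
n_obs <- 500
occu_sim <- data.frame(
  Occupation = factor(sample(c("farming", "household", "student"),
                             n_obs, replace = TRUE)),
  Education  = factor(sample(c("primary", "Secondary"),
                             n_obs, replace = TRUE)),
  Age        = sample(20:80, n_obs, replace = TRUE)
)
p_died <- plogis(-2 + 0.03 * occu_sim$Age)
occu_sim$Died <- factor(ifelse(runif(n_obs) < p_died, "yes", "no"))

# Option 1: fit on the individual rows, coding Died == "yes" as 1 (success)
occu_sim$died01 <- as.integer(occu_sim$Died == "yes")
fit_rows <- glm(died01 ~ Occupation + Education + Age,
                family = binomial, data = occu_sim)

# Option 2: group by every predictor, then model (successes, failures)
grouped <- occu_sim %>%
  group_by(Occupation, Education, Age) %>%
  summarise(died = sum(Died == "yes"),
            survived = sum(Died == "no"),
            .groups = "drop")
fit_grouped <- glm(cbind(died, survived) ~ Occupation + Education + Age,
                   family = binomial, data = grouped)

# Compare the coefficient estimates from the two fits side by side
print(cbind(rows = coef(fit_rows), grouped = coef(fit_grouped)))
```

The two fits maximise the same likelihood, so their coefficient estimates should agree up to numerical tolerance. With a continuous predictor such as Age, grouping saves little, and fitting on the ungrouped rows is often the simpler route.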

Fitting a logistic regression model to self-grouped data

0 answers: