Question

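I have a categorical dataset that I am trying to summarize, and the questions it covers differ in kind. The data below come from a questionnaire that had standard closed-ended questions, but also questions where several answers could be selected from a list. "village" and "income" are closed-ended questions. "responsible.1" ... etc. ... is a list where the respondent said yes or no to each item.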

VILLAGE  INCOME         responsible.1   responsible.2   responsible.3   responsible.4   responsible.5
   j     both           DLNR             NA              DEQ              NA           Public
   k     regular.income DLNR             NA              NA               NA           NA
   k     regular.income DLNR             CRM             DEQ              Mayor        NA
   l     both           DLNR             NA              NA               Mayor        NA
   j     both           DLNR             CRM             NA               Mayor        NA
   m     regular.income DLNR             NA              NA               NA           Public

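What I would like is a three-way table of "village" and "income" against the "responsible" variables, wrapped in an ftable. That way I can use the table with the many R packages for graphing and analysis.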

        RESPONSIBLE             
VILLAGE INCOME          responsible.1   responsible.2   responsible.3   responsible.4   responsible.5
j       both            2               1               1               1               1
k       regular income  2               1               1               1               0
l       both            1               0               0               1               0
m       regular income  1               0               0               0               1

as.data.frame(table(village, responsible.1)) gets me the first one, but I can't figure out how to wrap the whole thing up in a nice ftable.

Answer 1

> aggregate(dat[-(1:2)], dat[1:2], function(x) sum(!is.na(x)) )
  VILLAGE         INCOME responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
1       j           both             2             1             1             1             1
2       l           both             1             0             0             1             0
3       k regular.income             2             1             1             1             0
4       m regular.income             1             0             0             0             1
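For anyone who wants to reproduce the call above, here is a minimal sketch that rebuilds the six rows from the question. The name `dat` is the one the answer uses; the exact column types (plain character columns, with NA for unselected items) are an assumption, since the question does not show how the data were read in.

```r
# Rebuild the six example rows from the question.
# Assumption: unselected items are real NA values, not the string "NA".
dat <- data.frame(
  VILLAGE = c("j", "k", "k", "l", "j", "m"),
  INCOME  = c("both", "regular.income", "regular.income",
              "both", "both", "regular.income"),
  responsible.1 = c("DLNR", "DLNR", "DLNR", "DLNR", "DLNR", "DLNR"),
  responsible.2 = c(NA, NA, "CRM", NA, "CRM", NA),
  responsible.3 = c("DEQ", NA, "DEQ", NA, NA, NA),
  responsible.4 = c(NA, NA, "Mayor", "Mayor", "Mayor", NA),
  responsible.5 = c("Public", NA, NA, NA, NA, "Public"),
  stringsAsFactors = FALSE
)

# dat[1:2] holds the grouping columns, dat[-(1:2)] the responsible.* columns.
# For each group, count how many respondents selected each item (non-NA cells).
counts <- aggregate(dat[-(1:2)], dat[1:2], function(x) sum(!is.na(x)))
print(counts)
```

The data-frame method of aggregate passes NA values through to the function, which is why counting with `!is.na(x)` works here.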

I'm guessing you actually have another grouping vector, perhaps the first "responsible" column?

I don't really understand the sorting rules, but reversing the order of the grouping columns may get closer to what you posted:

> aggregate(dat[-(1:2)], dat[2:1], function(x) sum(!is.na(x)) )
          INCOME VILLAGE responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
1           both       j             2             1             1             1             1
2 regular.income       k             2             1             1             1             0
3           both       l             1             0             0             1             0
4 regular.income       m             1             0             0             0             1
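The row order can also be set explicitly with order() rather than by swapping the grouping columns, and the same counts can be turned into the ftable the question asks for. The following is a rough sketch of one way to do that, not part of the original answer: the names `resp_cols`, `long`, `tab` and `ft` are made up for illustration, and the data are re-read here only so the block runs on its own.

```r
# Re-create the example data (assumption: NA marks an unselected item).
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
VILLAGE INCOME responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
j both DLNR NA DEQ NA Public
k regular.income DLNR NA NA NA NA
k regular.income DLNR CRM DEQ Mayor NA
l both DLNR NA NA Mayor NA
j both DLNR CRM NA Mayor NA
m regular.income DLNR NA NA NA Public
")

# Same aggregate call as above, then an explicit sort by VILLAGE and INCOME.
counts <- aggregate(dat[-(1:2)], dat[1:2], function(x) sum(!is.na(x)))
counts <- counts[order(counts$VILLAGE, counts$INCOME), ]

# Stack the responsible.* columns into long form: one row per respondent/item,
# with chosen = 1 if the item was selected and 0 otherwise.
resp_cols <- grep("^responsible\\.", names(dat), value = TRUE)
long <- data.frame(
  VILLAGE     = rep(dat$VILLAGE, times = length(resp_cols)),
  INCOME      = rep(dat$INCOME, times = length(resp_cols)),
  RESPONSIBLE = rep(resp_cols, each = nrow(dat)),
  chosen      = as.integer(!is.na(unlist(dat[resp_cols], use.names = FALSE)))
)

# Sum the 0/1 indicator into a three-way table, then flatten it.
tab <- xtabs(chosen ~ VILLAGE + INCOME + RESPONSIBLE, data = long)
ft  <- ftable(tab, row.vars = c("VILLAGE", "INCOME"))
print(ft)
```

Unlike the aggregate result, the ftable also contains rows for VILLAGE/INCOME combinations that never occur in the data (for example j with regular.income); those rows are all zeros.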

Reformatting categorical data in R
