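I am currently working with crime data: offenses committed within one year (see below).
In criminal law, different types of offenses are grouped under different titles (e.g., offenses against property, violent offenses, offenses against state security). Each title is in turn divided into several articles (110 = theft, etc.).
I now have to summarize these data in a single crosstab. Unfortunately, I cannot simply sum the articles to get the number of suspects per title, because a suspect who committed several property offenses should be counted only once under that title. Likewise, for the overall number of suspects I cannot simply add up the counts of the different titles, because a suspect who committed offenses under different titles should be counted only once overall. So basically I have to compute three different crosstabs, but I do not know how to "pull" these three crosstabs into one table.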
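DATA suspects;
  /* INFILE comes before INPUT so that DSD (comma-delimited input) is in effect when the records are read */
  INFILE DATALINES DSD;
  INPUT personId        :4.
        article         :$3.
        title           :$20.
        age             :3.
        sex             :$1.
        residenceStatus :$1.
        dateOfCrime     :yymmdd10.;
  FORMAT dateOfCrime yymmdd10.;
  DATALINES;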
10,110,Property,18,m,A,2019-01-01
10,111,Property,19,m,B,2019-02-03
10,112,Property,19,m,B,2019-02-04
10,110,Property,19,m,A,2019-01-01
10,57,Violence,18,m,A,2019-01-01
10,57,Violence,18,m,A,2019-01-02
10,57,Violence,18,m,A,2019-01-03
10,57,Violence,18,m,A,2019-01-04
10,57,Violence,18,m,A,2019-02-04
10,57,Violence,18,m,A,2019-03-04
10,57,Violence,18,m,A,2019-04-04
10,57,Violence,18,m,A,2019-05-04
11,38,State Security,42,w,B,2019-10-01
13,114,Property,19,m,A,2019-04-09
14,53,Violence,24,m,E,2019-06-06
15,50,Violence,21,w,A,2019-10-08
17,10,Forgery,37,m,B,2019-02-19
19,115,Property,18,m,A,2019-09-10
19,115,Property,18,m,A,2019-10-10
19,115,Property,18,m,E,2019-11-12
99,112,Property,41,m,A,2019-02-23
98,113,Property,55,m,A,2019-07-11
;
RUN;
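To make the counting rule concrete, here is a minimal PROC SQL sketch (an illustration only, not part of the tabulation code) that counts distinct personId values at each level. It assumes that a person should also count only once per article. With the sample data above, the Property articles add up to 7 suspects, while only 5 distinct persons committed a property offense; the four titles add up to 10, while there are only 9 distinct suspects overall.

```sas
/* Illustration only: distinct suspects per article, per title, and overall. */
/* nSuspects is a name chosen for this sketch.                              */
PROC SQL;
  /* A person counts once per article (assumption of this sketch) */
  SELECT title, article, COUNT(DISTINCT personId) AS nSuspects
    FROM suspects
    GROUP BY title, article;

  /* A person counts once per title, however many articles are involved */
  SELECT title, COUNT(DISTINCT personId) AS nSuspects
    FROM suspects
    GROUP BY title;

  /* A person counts once overall */
  SELECT COUNT(DISTINCT personId) AS nSuspects
    FROM suspects;
QUIT;
```

Because these queries only print results and do not create a data set, they leave the most recently created data set unchanged.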
PROC FORMAT;
VALUE agegrp (NOTSORTED)
1-20 = '<=20'
21-HIGH = '>20';
RUN;
Table 1: Number of offenders per article (of the criminal code)
PROC TABULATE DATA=suspects;
CLASS age article title;
CLASS sex residenceStatus / PRELOADFMT ORDER=DATA;
TABLE (article="Total article"),
(ALL="Total residence status" residenceStatus="") * (ALL="Total age" age="") * (ALL="Total sex" sex="") / PRINTMISS MISSTEXT="0";
FORMAT age agegrp.;
RUN;
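Table 2: Number of suspects per title of the criminal code (repeated offenses by the same person under the same title are not counted again)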
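DATA titles;
  SET suspects;
RUN;
/* Sort by date so that the earliest offense per person and title is the record kept */
PROC SORT DATA=titles;
  BY dateOfCrime;
RUN;
/* Keep one record per person within each title */
PROC SORT DATA=titles NODUPKEY;
  BY personId title;
RUN;
PROC TABULATE DATA=titles;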
CLASS age title;
CLASS sex residenceStatus / PRELOADFMT ORDER=DATA;
TABLE (title="Total title"),
(ALL="Total residence status" residenceStatus="") * (ALL="Total age" age="") * (ALL="Total sex" sex="") / PRINTMISS MISSTEXT="0";
FORMAT age agegrp.;
RUN;
Table 3: Number of suspects across the entire criminal code
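DATA totals;
  SET suspects;
  Total = 'Total';
RUN;
/* Sort by date so that the earliest offense per person is the record kept */
PROC SORT DATA=totals;
  BY dateOfCrime;
RUN;
/* Keep one record per person */
PROC SORT DATA=totals NODUPKEY;
  BY personId;
RUN;
PROC TABULATE DATA=totals;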
CLASS age Total;
CLASS sex residenceStatus / PRELOADFMT ORDER=DATA;
TABLE total='Total Criminal Law',
(ALL="Total residence status" residenceStatus="") * (ALL="Total age" age="") * (ALL="Total sex" sex="") / PRINTMISS MISSTEXT="0";
FORMAT age agegrp.;
RUN;
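Is there a way to merge these three tables into one? They have the same structure; only the row labels and the frequency counts differ. The final result should look like this: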
                         TOTAL    Res.status A         Res.status B
                                  <=20      >20        <=20      >20
                                  M   F     M   F      M   F     M   F   ...
------------------------------------------------------------------
110                    |
111                    |
112                    |
113                    |
114                    |
115                    |
Total Property         |
50                     |
53                     |
57                     |
Total Violence         |
38                     |
Total State Security   |
TOTAL CRIMINAL CODE    |
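One direction that might work, shown here as an untested sketch rather than a definitive solution: deduplicate the data at each of the three levels, stack the results into one data set with a common row-label variable, and run a single PROC TABULATE on it. The names sorted, byArticle, byTitle, byPerson, stacked, rowLabel, level and grandTotal are made up for this sketch. It assumes that a person counts once per article, that the earliest record supplies the age and residence status, and that the agegrp format defined above exists.

```sas
/* Sort once so that the earliest offense of each person comes first */
PROC SORT DATA=suspects OUT=sorted;
  BY personId dateOfCrime;
RUN;

/* One record per person at each level; NODUPKEY keeps the first record of each BY group */
PROC SORT DATA=sorted OUT=byArticle NODUPKEY;
  BY personId title article;
RUN;
PROC SORT DATA=sorted OUT=byTitle NODUPKEY;
  BY personId title;
RUN;
PROC SORT DATA=sorted OUT=byPerson NODUPKEY;
  BY personId;
RUN;

/* Stack the three levels and give every record the label of the row it belongs to */
DATA stacked;
  LENGTH rowLabel $30;
  SET byArticle (IN=inArticle)
      byTitle   (IN=inTitle)
      byPerson  (IN=inPerson);
  grandTotal = inPerson;              /* 1 only for the overall-total records */
  IF inArticle THEN DO;
    level = 1;
    rowLabel = article;
  END;
  ELSE IF inTitle THEN DO;
    level = 2;
    rowLabel = CATX(' ', 'Total', title);
  END;
  ELSE DO;
    level = 3;
    rowLabel = 'TOTAL CRIMINAL CODE';
  END;
RUN;

/* Articles first, then the title total, title by title; the overall total last */
PROC SORT DATA=stacked;
  BY grandTotal title level rowLabel;
RUN;

/* ORDER=DATA keeps the row labels in the order in which they appear in stacked */
PROC TABULATE DATA=stacked;
  CLASS rowLabel / ORDER=DATA;
  CLASS age sex residenceStatus;
  TABLE rowLabel="",
        (ALL="Total residence status" residenceStatus="") * (ALL="Total age" age="") * (ALL="Total sex" sex="") / PRINTMISS MISSTEXT="0";
  FORMAT age agegrp.;
RUN;
```

Since every record in stacked already represents one person at one level, the default N statistic gives the suspect count for each row. The sketch relies on article numbers being unique across titles; if the same article number could occur under two titles, the row label would need to include the title as well.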