Question

County    AgeGrp    Population
A         1         200
A         2         100
A         3         100
A         All       400
B         1         200

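So I have a list of counties, and I want to find the under-18 population as a percentage of each county's total population. Taking the table above as an example, I only want to add the populations for AgeGrp 1 and 2 and divide by the 'All' population. In this case that would be 300/400. I'd like to know whether this can be done for every county.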

Answer 1

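Let's call your SAS dataset "HAVE" and say it has two character variables (County and AgeGrp) and one numeric variable (Population). Let's also assume your dataset always has exactly one observation per county where AgeGrp='All', and that its Population value is the total for that county.

To be safe, let's sort the data by County and process it in another data step, creating a new dataset named "WANT" with new variables for the county population (TOT_POP), the sum of the two age group values you want (TOT_GRP), and the calculated proportion (AgeGrpPct):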

proc sort data=HAVE;
   by County;
run;

data WANT;
   retain TOT_POP TOT_GRP 0;
   set HAVE;
   by County;
   if first.County then do;
      TOT_POP = 0;
      TOT_GRP = 0;
   end;
   if AgeGrp in ('1','2') then TOT_GRP + Population;
   else if AgeGrp = 'All' then TOT_POP = Population;
   if last.County;
   AgeGrpPct = TOT_GRP / TOT_POP;
   keep County TOT_POP TOT_GRP AgeGrpPct;
   output;
run;

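Note that the observation with AgeGrp='All' isn't actually required; you could instead create another variable that accumulates a running total across all age groups.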
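
A minimal sketch of that variant, assuming the same HAVE dataset and variable names as above. The WHERE statement is an added assumption: it drops any 'All' rows that happen to be present so they are not double-counted in the running total.

```sas
proc sort data=HAVE;
   by County;
run;

data WANT;
   set HAVE;
   by County;
   where AgeGrp ne 'All';          /* assumption: ignore summary rows if present */
   if first.County then do;        /* reset both totals at the start of each county */
      TOT_POP = 0;
      TOT_GRP = 0;
   end;
   TOT_POP + Population;           /* running total over all age groups */
   if AgeGrp in ('1','2') then TOT_GRP + Population;   /* running total, under 18 */
   if last.County;                 /* keep one row per county */
   AgeGrpPct = TOT_GRP / TOT_POP;
   keep County TOT_POP TOT_GRP AgeGrpPct;
run;
```

Because both totals are built with sum statements, they are retained across observations automatically, so no separate RETAIN statement is needed here.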

Answer 2

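If you want a procedure-based approach, create a format for the under-18 group and then use PROC FREQ to calculate the percentage. With this method you need to exclude the 'All' values from the dataset (including summary rows in source data is generally bad practice). PROC TABULATE could also be used for this.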

data have;
input County $ AgeGrp $ Population;
datalines;
A 1 200
A 2 100
A 3 100
A All 400
B 1 200
B 2 300
B 3 500
B All 1000
;
run;

proc format;
value $age_fmt '1','2' = '<18'
                other   = '18+';
run;

proc sort data=have;
by county;
run;

proc freq data=have (where=(agegrp ne 'All')) noprint;
by county;
table agegrp / out=want (drop=COUNT where=(agegrp in ('1','2')));
format agegrp $age_fmt.;
weight population;
run;

Summing vertically across rows under a condition (SAS)
