Question

I have a dataset that looks like this:

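Outlet name (string variable): the name of the media outlet (up to 12 outlets); the last three outlets in the file are The Guardian, The Telegraph and The Independent.
Score1: scale
Score2: scale

...

Score7: scale.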

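What I want to do is compute a set of 21 new variables that hold, for each case (media outlet) and each of the seven variables (scores), the difference between that outlet's score and the score of each of the three outlets of interest: The Guardian, The Telegraph and The Independent (7 variables x 3 benchmark outlets = 21). Essentially, I want to compare each outlet's scores against my three benchmark outlets.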

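So, for example, I should have a new variable called score1_Guardian, computed for outlet 1 as follows: the score outlet 1 obtained on that variable minus the score The Guardian obtained on that variable. The variable score2_Guardian would show, for each outlet, the difference between that outlet's score on that variable and The Guardian's score on it, and so on. In this example, the outlet The Guardian would therefore score 0 on all of the variables score1_Guardian through score7_Guardian.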
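
Stated as a formula, the requested quantity could be written as below. The symbols are illustrative notation, not anything from the dataset: s_k(i) stands for outlet i's value on score k, and b is one of the three benchmark outlets.

```latex
d_{k,b}(i) = s_k(i) - s_k(b), \qquad k = 1, \dots, 7, \quad
b \in \{\text{Guardian}, \text{Telegraph}, \text{Independent}\}
```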

Answer 1

This is simpler than the approach I suggested below, and I prefer it this way: less code and fewer temporary variables.

First, I create a dummy dataset based on your parameters:

data list list/outlet (a12) score1 to score7 (7f6).
begin data
'outlet1' 1 2 3 4 5 6 7
'outlet2' 2 3 4 5 6 7 8
'outlet3' 5 6 7 8 9 1 2 
'Guardian' 7 8 9 1 2 5 6
'Telegraph' 5 12 12 3 4 4 2 
'Independent' 2 2 2 2 2 2 2 
end data.

Now we can get to work:

*going from wide to long form - just to avoid creating too many variables on the way.

varstocases /make score from score1 to score7/index scorenum(score).
if outlet='Guardian' Guardian=score.
if outlet='Telegraph' Telegraph=score.
if outlet='Independent' Independent=score.
AGGREGATE  /OUTFILE=* MODE=ADDVARIABLES OVERWRITEVARS=YES
  /BREAK=scorenum   /Guardian=MAX(Guardian)   /Telegraph=MAX(Telegraph)   /Independent=MAX(Independent).

*now we have three new variables ready to compare.

compute Guardian=score - Guardian.
compute Telegraph=score - Telegraph.
compute Independent=score - Independent.

* last step - going back to wide format.

compute scorenum=substr(scorenum,6,1).
* CASESTOVARS expects the file to be sorted by the ID variable.
sort cases by outlet scorenum.
CASESTOVARS  /id=outlet /index=scorenum/sep="_".
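
To sanity-check the SPSS result, the same differences can be recomputed outside SPSS. The following is a minimal plain-Python sketch, not part of the SPSS solution. It copies the dummy data from above; the function name `benchmark_differences` is made up for illustration, and the `Guardian_1`-style keys assume the names that CASESTOVARS builds from the root name, the "_" separator and the index value.

```python
# Dummy scores copied from the test data above (score1..score7 per outlet).
scores = {
    "outlet1":     [1, 2, 3, 4, 5, 6, 7],
    "outlet2":     [2, 3, 4, 5, 6, 7, 8],
    "outlet3":     [5, 6, 7, 8, 9, 1, 2],
    "Guardian":    [7, 8, 9, 1, 2, 5, 6],
    "Telegraph":   [5, 12, 12, 3, 4, 4, 2],
    "Independent": [2, 2, 2, 2, 2, 2, 2],
}
benchmarks = ["Guardian", "Telegraph", "Independent"]


def benchmark_differences(scores, benchmarks):
    """Return {outlet: {"<benchmark>_<k>": outlet's score k minus benchmark's score k}}."""
    result = {}
    for outlet, values in scores.items():
        row = {}
        for bench in benchmarks:
            # Pair each of the outlet's scores with the benchmark's score on the same variable.
            for k, (own, ref) in enumerate(zip(values, scores[bench]), start=1):
                row[f"{bench}_{k}"] = own - ref
        result[outlet] = row
    return result


diffs = benchmark_differences(scores, benchmarks)
print(diffs["outlet1"]["Guardian_1"])   # outlet1's score1 (1) minus Guardian's score1 (7)
print(len(diffs["outlet1"]))            # number of difference variables per outlet
```

Each outlet ends up with 7 x 3 = 21 differences, and each benchmark outlet scores 0 on its own seven variables, which should match what the SPSS syntax produces.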
