Question

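I have survey data in which respondents placed 20 items on a scale according to how important each item is to them. The lower end of the scale contains a "bin" into which respondents can throw any of the 20 items they consider completely unimportant. The result is a data set with 20 variables (one per item). Each variable holds a number between 1 and 100 (or 0 if the item was thrown in the bin).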

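I would like to recode the entries into a ranking of the variables for each respondent. Every variable would then hold a number between 1 and 20, according to where the respondent placed it relative to the other items.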

Example:

Currently:

               item1 item2 item3 item4 item5 item6 item7 item8 etc.
respondent1    67    44    29    7     0     99    35    22
respondent2    0     42    69    50    12    0     67    100
etc.

What I would like:

               item1 item2 item3 item4 item5 item6 item7 item8 etc.
respondent1    7     6     4     2     1     8     5     3
respondent2    1     4     7     5     3     1     6     8
etc.

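As you can see for respondent2, I would like items that received the same value to get the same rank, with the ranking then skipping a number.

I have found a lot of information on how to rank observations, but nothing yet on how to rank variables. Does anyone know how to do this?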

Answer 1

Here is one solution using reshape:

/* Create sample data */

clear *
set obs 2
gen respondant = "respondant1"
replace respondant = "respondant2" in 2
set seed 123456789
forvalues i = 1/10 {
    gen item`i' = ceil(runiform()*100)
}
replace item2 = item1 if respondant == "respondant2"
list


     +----------------------------------------------------------------------------------------------+
     |  respondant   item1   item2   item3   item4   item5   item6   item7   item8   item9   item10 |
     |----------------------------------------------------------------------------------------------|
  1. | respondant1      14      56      69      62      56      26      43      53      22       27 |
  2. | respondant2      65      65      11       7      88       5      90      85      57       95 |
     +----------------------------------------------------------------------------------------------+


/* reshape long first */
reshape long item, i(respondant) j(itemNum)

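/* Rank items within each respondent; tied items share the lowest rank */
by respondant (item), sort : gen rank = _n
/* Run the tie fix under by: so a tie is never carried across respondents */
by respondant (item) : replace rank = rank[_n-1] if item == item[_n-1] & _n > 1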

/* reshape back to wide format */
drop item // optional, you can keep and just include in reshape wide
reshape wide rank, i(respondant) j(itemNum)
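
As an alternative to handling ties by hand, the same reshape approach can probably be shortened with egen's rank() function. The sketch below is not part of the original answer: it enters the question's eight-item example with input (the variable name respondent and the str11 width are choices made for this illustration) and relies on the track option, which ranks the lowest value 1 and gives tied values the same rank without correction, so the next rank is skipped.

```stata
* Sketch: the question's example data, ranked per respondent with egen
clear
input str11 respondent item1 item2 item3 item4 item5 item6 item7 item8
"respondent1" 67 44 29  7  0 99 35  22
"respondent2"  0 42 69 50 12  0 67 100
end

* Long format: one row per respondent-item pair
reshape long item, i(respondent) j(itemNum)

* track: lowest value gets rank 1; ties share that rank and the next rank is skipped
bysort respondent: egen rank = rank(item), track

* Back to wide format, keeping both the original scores and the ranks
reshape wide item rank, i(respondent) j(itemNum)
list respondent rank*
```

For respondent2 the two binned items (item1 and item6, both 0) should each get rank 1 and item5 (12) should get rank 3, matching the desired output in the question. Keeping item in the final reshape wide leaves the original scores next to the new rank variables.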
