Question

I have a dataset that shows how people spent 30 minutes, recorded in 10-minute intervals.

Person     cumulative_time   Activity
A              10             Game
A              30             Eat
B              10             Sleep
B              20             Game
B              30             Sleep

This means that person A was gaming for the first 10 minutes
and eating for the next 20 minutes,
while person B was sleeping for the first 10 minutes,
gaming for the next 10 minutes, and sleeping for the last 10 minutes.

I want to restructure the dataset so that each row is a unique person.

Each column would then be one of the time intervals, like this:

Person          time10    time20         time30
 A             Game         Eat           Eat
 B             Sleep        Game          Sleep

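I know I can use "collapse" to make each person unique, but I don't know whether it can be used for my purpose. The "reshape" command does something similar, but again I can't figure out how to use it to do what I want.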

Answer 1

reshape is the way to solve this problem. Something like the following should do what you need.

clear
input str1 Person int cumulative_time str8 Activity
A              10             Game
A              30             Eat
B              10             Sleep
B              20             Game
B              30             Sleep
end
rename Activity time
reshape wide time, i(Person) j(cumulative_time)
* An activity covers the time up to its cumulative_time, so fill each gap from the next interval
replace time20 = time30 if missing(time20)
replace time10 = time20 if missing(time10)
list, clean

If your problem had many values of cumulative_time, rather than just three, I would handle the missing values differently.
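
One possible way to handle many intervals, sketched here under assumptions rather than as the approach the answer had in mind, is to loop backward over the wide variables and fill each gap from the following interval. The locals step and last are hypothetical names; the sketch assumes every interval from step to last exists as a variable after reshape, which holds only if at least one person has a record at each interval.

```stata
clear
input str1 Person int cumulative_time str8 Activity
A              10             Game
A              30             Eat
B              10             Sleep
B              20             Game
B              30             Sleep
end
rename Activity time
reshape wide time, i(Person) j(cumulative_time)

* Assumed interval width and final interval (hypothetical locals)
local step 10
local last 30
local start = `last' - `step'

* Walk backward so a run of consecutive gaps is filled from the right
forvalues t = `start'(-`step')`step' {
    local next = `t' + `step'
    replace time`t' = time`next' if missing(time`t')
}
list, clean
```

Going backward matters: an activity recorded at a given cumulative_time covers every interval since the previous record, so each gap takes its value from the interval after it.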

Answer 2

In addition to William Lisowski's answer, here is an approach that uses the tsset and tsfill commands:

clear
input str1 Person int cumulative_time str8 Activity
A              10             Game
A              30             Eat
B              10             Sleep
B              20             Game
B              30             Sleep
end
rename Activity time

egen id = group(Person)
tsset id cumulative_time, delta(10)
tsfill, full

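* Sort by time within id so that _n-1 and _n+1 refer to adjacent intervals
bysort id (cumulative_time) : replace Person = Person[_n-1] if Person==""
* Fill each gap from the next interval (handles single gaps only)
bysort id (cumulative_time) : replace time = time[_n+1] if time==""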
drop id

reshape wide time, i(Person) j(cumulative_time)
list, clean

This produces:

       Person   time10   time20   time30  
  1.        A     Game      Eat      Eat  
  2.        B    Sleep     Game    Sleep
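
The fill with time[_n+1] above handles a single gap per person, because replace works through the observations in order and the next interval may itself still be empty when it is read. A minimal sketch of one way to cope with consecutive gaps follows; the 40-minute data and the variable negtime are made up for illustration. It reverses the time order within each person so that the fill can cascade through time[_n-1].

```stata
clear
* Hypothetical data: A has no records at 20 or 30, B has none at 30
input str1 Person int cumulative_time str8 Activity
A              10             Game
A              40             Eat
B              10             Sleep
B              20             Game
B              40             Sleep
end
rename Activity time

egen id = group(Person)
tsset id cumulative_time, delta(10)
tsfill, full

* Person is constant within id; "" sorts first, so the last value is nonmissing
bysort id (Person) : replace Person = Person[_N]

* Reverse time within id so the next interval becomes the previous observation
generate negtime = -cumulative_time
bysort id (negtime) : replace time = time[_n-1] if time == ""

drop id negtime
reshape wide time, i(Person) j(cumulative_time)
list, clean
```

Because replace evaluates observations sequentially, a value filled into one observation is already available when the following observation is processed, so a run of empty intervals is filled in a single pass.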
