I have data in the following format:
firm month_year sales competitor competitor_location competitor_branch_1 competitor_branch_2
1 1_2014 25 XYZ US EEE RRR
1 2_2014 21 XYZ US FFF
1 2_2014 21 ABC UK GGG
...
21 1_2009 11 LKS UK AAA
21 1_2009 11 AIS UK BBB
21 1_2009 11 AJS US CCC
21 2_2009 12 LKS UK AAA
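I still want one entry per firm at the month_year level, but I don't want separate rows for the other variables; they should become columns instead. I am trying to get the data into this format: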
firm month_year sales competitor_1 competitor_2 competitor_3 competitor_1_location competitor_2_location competitor_3_location competitor_1_branch_1 competitor_2_branch_1 competitor_3_branch_1 competitor_1_branch_2 competitor_2_branch_2 competitor_3_branch_2 competitor_1_branch_3 competitor_2_branch_3 competitor_3_branch_3
I was thinking of something like: reshape wide sales competitor competitor_location competitor_branch_1 competitor_branch_2, i(firm) j(month_year)
Answer 0 (score: 2)
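Most of the code below just sets up the example data (and probably does so inefficiently). I don't think encode is strictly necessary, but it is recommended.
Note that the code produces only one observation per firm (as mentioned in my comment).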
clear all
set more off
*----- example data -----
input ///
firm str7 month_year sales str3 competitor str3 competitor_location str3 competitor_branch_1 str3 competitor_branch_2
1 "1_2014" 25 "XYZ" "US" "EEE" "RRR"
1 "2_2014" 21 "XYZ" "US" "FFF"
1 "2_2014" 21 "ABC" "UK" "GGG"
21 "1_2009" 11 "LKS" "UK" "AAA"
21 "1_2009" 11 "AIS" "UK" "BBB"
21 "1_2009" 11 "AJS" "US" "CCC"
21 "2_2009" 12 "LKS" "UK" "AAA"
end
encode competitor, gen(comp)
encode competitor_location, gen(comploc)
encode competitor_branch_1, gen(compbr1)
encode competitor_branch_2, gen(compbr2)
gen date = ym( real(substr(month_year, strpos(month_year,"_")+1, .)), real(substr(month_year, 1, strpos(month_year,"_")-1)) ) // split at "_" so two-digit months also parse
format date %tm
drop competitor* month*
list
*----- what you want ?? -----
bysort firm (date comp): gen j = _n // sort within firm by date and competitor so the numbering is reproducible
reshape wide date sales comp comploc compbr1 compbr2, i(firm) j(j)
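
If one row per firm and month is what you are after (as the question asks), a sketch along the following lines might work. It assumes that sales is constant within each firm-month, which holds in the example data; reshape will stop with an error if it is not. For brevity it skips encode and works on the original string variables. The counter j is just an illustrative name, and the resulting columns are named the way reshape builds them from the stubs (competitor1, competitor_location1, competitor_branch_11, ...), not exactly as in the question.

```stata
clear all
set more off

*----- example data (missing second branch entered as "") -----
input ///
firm str7 month_year sales str3 competitor str3 competitor_location str3 competitor_branch_1 str3 competitor_branch_2
1 "1_2014" 25 "XYZ" "US" "EEE" "RRR"
1 "2_2014" 21 "XYZ" "US" "FFF" ""
1 "2_2014" 21 "ABC" "UK" "GGG" ""
21 "1_2009" 11 "LKS" "UK" "AAA" ""
21 "1_2009" 11 "AIS" "UK" "BBB" ""
21 "1_2009" 11 "AJS" "US" "CCC" ""
21 "2_2009" 12 "LKS" "UK" "AAA" ""
end

*----- number the competitors within each firm-month -----
* sorting on competitor inside each group makes the numbering reproducible
bysort firm month_year (competitor): gen j = _n

*----- one row per firm-month, competitors spread across columns -----
* sales is not listed as a stub, so it stays a single column
reshape wide competitor competitor_location competitor_branch_1 competitor_branch_2, ///
    i(firm month_year) j(j)

list
```

The only real difference from the code above is that i() now holds both firm and month_year, so the competitor counter restarts within every firm-month instead of running across the whole firm.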