Creating a new variable using a summation

Date: 2019-04-25 14:57:48

Tags: stata

I have data like the following:

| Country | Year | Firm | Profit |
|---------|------|------|--------|
| A       | 1    | 1    | 10     |
| A       | 1    | 2    | 20     |
| A       | 1    | 3    | 30     |
| A       | 1    | 4    | 40     |

For each firm i, I want to create a new variable that computes the following:

[Image: formula, not preserved in this copy]

For example, the value of the variable for firm 1 would be:

max(20 - 10, 0) + max(30 - 10, 0) + max(40 - 10, 0) 
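
The formula image is not available here, but the worked example suggests the definition below. This is a reconstruction, not the poster's own notation: $P_j$ stands for the Profit of firm $j$, and $g(i)$ for the set of firms in the same Country and Year as firm $i$ (both symbols are assumptions introduced here).

```latex
W_i = \sum_{j \in g(i)} \max\bigl(P_j - P_i,\; 0\bigr)
% firm 1 in the example:
% W_1 = \max(20-10, 0) + \max(30-10, 0) + \max(40-10, 0) = 60
```

Including the term $j = i$ is harmless, since it contributes zero.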

How can I do this in Stata, by country and year?

2 answers:

Answer 0: (score: 3)

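Note: This was the first answer posted. It does not avoid the trap of taking the OP's algebra literally and implementing the calculation with maxima within groups. After posting, I realised that there had to be a simpler way, and @Romalpa Akzo got there, which is great. I undeleted this answer by request, as it does show some mechanics for looping over groups and for using a custom Mata function to carry out the calculation for each group.

Here I write a Mata function that returns the wanted result for one group, and then loop over the groups to populate a predefined variable.

To test the code on a dataset with more than one group, I use mpg from Stata's auto toy dataset.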

mata : 

void wanted (string scalar varname, string scalar usename, string scalar resultname) { 
    real scalar i 
    real colvector x, result, zero  
    result = x = st_data(., varname, usename) 
    zero = J(rows(x), 1, 0)     
    for(i = 1; i <= rows(x); i++) { 
        result[i] = sum(rowmax((x :- x[i], zero))) 
    } 
    st_store(., resultname, usename, result) 
} 

end         

sysuse auto, clear  

sort foreign rep78 mpg 
egen group = group(foreign rep78), label  
summarize group, meanonly 
local G = r(max) 

generate wanted = . 
generate touse = 0 

quietly forvalues g = 1 / `G' { 
    replace touse = group == `g' 
    mata : wanted("mpg", "touse", "wanted")  
} 

How did that work out? Here are some results:

. list mpg wanted group if foreign, sepby(group) 

     +--------------------------+
     | mpg   wanted       group |
     |--------------------------|
 53. |  21        7   Foreign 3 |
 54. |  23        3   Foreign 3 |
 55. |  26        0   Foreign 3 |
     |--------------------------|
 56. |  21       35   Foreign 4 |
 57. |  23       19   Foreign 4 |
 58. |  23       19   Foreign 4 |
 59. |  24       13   Foreign 4 |
 60. |  25        8   Foreign 4 |
 61. |  25        8   Foreign 4 |
 62. |  25        8   Foreign 4 |
 63. |  28        2   Foreign 4 |
 64. |  30        0   Foreign 4 |
     |--------------------------|
 65. |  17       84   Foreign 5 |
 66. |  17       84   Foreign 5 |
 67. |  18       77   Foreign 5 |
 68. |  18       77   Foreign 5 |
 69. |  25       42   Foreign 5 |
 70. |  31       18   Foreign 5 |
 71. |  35        6   Foreign 5 |
 72. |  35        6   Foreign 5 |
 73. |  41        0   Foreign 5 |
     |--------------------------|
 74. |  14        .           . |
     +--------------------------+

So, how does it apply to your data?

clear 
input str1 Country  Year  Firm  Profit 
     A        1     1     10     
     A        1     2     20     
     A        1     3     30     
     A        1     4     40     
end 

egen group = group(Country Year), label  
summarize group, meanonly 
local G = r(max) 
generate wanted = . 
generate touse = 0 

quietly forvalues g = 1/`G' { 
    replace touse = group == `g' 
    mata: wanted("Profit", "touse", "wanted")  
} 

Results:

. list Firm Profit wanted, sepby(group)  

     +------------------------+
     | Firm   Profit   wanted |
     |------------------------|
  1. |    1       10       60 |
  2. |    2       20       30 |
  3. |    3       30       10 |
  4. |    4       40        0 |
     +------------------------+
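
To see what the loop body of the Mata function computes for a single observation, the core expression can be tried interactively. This is a minimal sketch using the four Profit values from the question; the vector x is typed in by hand here rather than read with st_data().

```stata
* differences from firm 1's profit, floored at zero row by row, then summed
mata:
x = (10 \ 20 \ 30 \ 40)
sum(rowmax((x :- x[1], J(rows(x), 1, 0))))
end
```

This displays 60, the value of wanted for firm 1 in the listing above.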

Answer 1: (score: 3)

The following is a direct solution to your problem (note the use of dataex to provide the example data):

* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 Country float(Year Firm Profit)
"A" 1 1 10
"A" 1 2 20
"A" 1 3 30
"A" 1 4 40
end

generate Wanted = -Profit
bysort Country Year (Wanted): replace Wanted = sum(Profit) - _n * Profit 

list 

     +-----------------------------------------+
     | Country   Year   Firm   Profit   Wanted |
     |-----------------------------------------|
  1. |       A      1      4       40        0 |
  2. |       A      1      3       30       10 |
  3. |       A      1      2       20       30 |
  4. |       A      1      1       10       60 |
     +-----------------------------------------+
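
As a cross-check on data with several groups, the same two lines can be run on the auto dataset used in the first answer. This is only a sketch: the variable name Wanted2 is introduced here for illustration, and observations with missing rep78 are dropped first because bysort would otherwise treat them as a group of their own, whereas egen's group() leaves them out by default.

```stata
sysuse auto, clear
drop if missing(rep78)

* a negated copy of mpg lets bysort order each group by descending mpg
generate Wanted2 = -mpg
bysort foreign rep78 (Wanted2): replace Wanted2 = sum(mpg) - _n * mpg

* restore ascending order within groups for easier comparison
sort foreign rep78 mpg
list mpg Wanted2 rep78 if foreign, sepby(rep78)
```

For the foreign cars, the Wanted2 column should agree with the wanted column listed in the first answer.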

The logic behind this is as follows:

[Image: explanation of the logic, not preserved in this copy]
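
A sketch of that logic, reconstructed from the code rather than from the original image: within each Country-Year group the observations end up sorted by descending Profit, so at position $n$ the running sum covers exactly the $n$ largest profits. The notation $P_{(k)}$ for the $k$-th largest Profit in the group and $N$ for the group size is an assumption introduced here.

```latex
\begin{aligned}
\sum_{j=1}^{N} \max\bigl(P_{(j)} - P_{(n)},\, 0\bigr)
  &= \sum_{k=1}^{n} \bigl(P_{(k)} - P_{(n)}\bigr)
  && \text{since } P_{(j)} \le P_{(n)} \text{ for } j > n \\
  &= \sum_{k=1}^{n} P_{(k)} \;-\; n \, P_{(n)}
  && \text{i.e. } \texttt{sum(Profit) - \_n * Profit}
\end{aligned}
```

In the example, the last sorted observation (Profit 10, $n = 4$) gives $100 - 4 \times 10 = 60$. Ties cause no trouble, because tied observations contribute a difference of zero wherever they fall in the sort order.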