Keeping only distinct combinations (pairs) of observations

Time: 2019-10-16 03:27:16

Tags: stata

I have a dataset from which I need to drop duplicated combinations.

The combinations are pairs of places, held in two columns:

ID    Place1         Place2

1     Ann Arbor      Toledo
2     LA             San Francisco
3     Chicago        Peoria
4     Pittsburgh     Cleveland
5     Richmond       New Port
6     Ann Arbor      Cincinnati
7     LA             San Francisco
8     LA             San Jose
9     Springfield    Chicago
10    Richmond       New Port
11    Atlanta        Greenville

How can I get the output below?

ID    Place1         Place2

1     Ann Arbor      Toledo
2     LA             San Francisco
3     Chicago        Peoria
4     Pittsburgh     Cleveland
5     Richmond       New Port
6     Ann Arbor      Cincinnati
7     LA             San Jose
8     Springfield    Chicago
9     Atlanta        Greenville

1 Answer:

Answer 0 (score: 1)

The following works for me:

clear

input ID  str20 Place1 str20 Place2
1 "Ann Arbor" "Toledo"
2 "LA" "San Francisco"
3 "Chicago" "Peoria"
4 "Pittsburgh" "Cleveland"
5 "Richmond" "New Port"
6 "Ann Arbor" "Cincinnati"
7 "LA" "San Francisco"
8 "LA" "San Jose"
9 "Springfield" "Chicago"
10 "Richmond" "New Port"
11 "Atlanta" "Greenville"
end

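* Keep the first observation of each Place1/Place2 pair.
* The force option is required whenever a varlist is given.
duplicates drop Place1 Place2, force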

list, separator(0)

     +----------------------------------+
     | ID        Place1          Place2 |
     |----------------------------------|
  1. |  1     Ann Arbor          Toledo |
  2. |  2            LA   San Francisco |
  3. |  3       Chicago          Peoria |
  4. |  4    Pittsburgh       Cleveland |
  5. |  5      Richmond        New Port |
  6. |  6     Ann Arbor      Cincinnati |
  7. |  8            LA        San Jose |
  8. |  9   Springfield         Chicago |
  9. | 11       Atlanta      Greenville |
     +----------------------------------+

Type help duplicates at Stata's command prompt for details and the full syntax.
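Before dropping anything, it can help to see which observations would be treated as duplicates. The following is a minimal sketch that uses only a subset of the example rows; the variable name ndup is an arbitrary choice for illustration.

```stata
clear

* A subset of the example data, with two duplicated pairs
input ID str20 Place1 str20 Place2
2 "LA" "San Francisco"
5 "Richmond" "New Port"
7 "LA" "San Francisco"
8 "LA" "San Jose"
10 "Richmond" "New Port"
end

* Summarize how many copies of each Place1/Place2 pair exist
duplicates report Place1 Place2

* ndup (hypothetical name) = number of other observations with the same pair
duplicates tag Place1 Place2, generate(ndup)

* Show only the observations that belong to a duplicated pair
list if ndup > 0, separator(0)
```

Observations with ndup equal to 0 are unique on Place1 and Place2 and would be left untouched by duplicates drop.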

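Note that this approach will not work if your data contain the same pair in reversed order, such as the following, because duplicates compares Place1 and Place2 column by column: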

LA San Francisco 
San Francisco LA

See this post by @NickCox for how to handle that case.
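One possible way to handle reversed pairs, offered here as a sketch and not necessarily the method described in the post referred to above, is to build an order-independent key by placing the alphabetically smaller place first and then dropping duplicates on that key. The variable names first and second are assumptions made for this illustration.

```stata
clear

* Small illustrative dataset containing one reversed pair
input ID str20 Place1 str20 Place2
1 "LA" "San Francisco"
2 "San Francisco" "LA"
3 "LA" "San Jose"
end

* first holds the alphabetically smaller place, second the larger one,
* so "LA"/"San Francisco" and "San Francisco"/"LA" get the same key
generate first  = cond(Place1 <= Place2, Place1, Place2)
generate second = cond(Place1 <= Place2, Place2, Place1)

* Drop duplicates on the order-independent key, then remove the helpers
duplicates drop first second, force
drop first second

list, separator(0)
```

Because the key variables are dropped afterwards, Place1 and Place2 keep their original values in the observations that remain.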