假设我有以下数据:
表1(我的主数据集):
clear
input str1 Name str9 (Day Time)
A 24-Mar-08 "10:30:01"
A 24-Mar-08 "10:30:01"
B 24-Mar-08 "10:30:01"
B 24-Mar-08 "11:03:12"
B 24-Mar-08 "11:03:12"
C 25-Mar-08 "10:30:01"
D 25-Mar-08 "10:30:01"
D 25-Mar-08 "11:13:59"
E 25-Mar-08 "11:13:59"
F 25-Mar-08 "11:13:59"
F 25-Mar-08 "11:13:59"
F 25-Mar-08 "11:59:01"
end
表2:
clear
input str1 Insrument str9 (Day Time) float price
A 30-Mar-10 "09:29:34" 10
E 03-Mar-09 "08:23:19" 14
E 20-Mar-12 "12:15:11" 29
F 01-Mar-11 "10:30:01" 12
end
我想根据表1中的name
变量删除重复的观察结果。我在Stata中打开表1,需要将表2引入Stata,然后删除两个表中具有相同名称的相同观察,来自表1。
最后,结果是:
+------+------------+----------+
| name | date | time |
+------+------------+----------+
| B | 24-Mar-08 | 10:30:01 |
| B | 24-Mar-08 | 11:03:12 |
| B | 24-Mar-08 | 11:03:12 |
| C | 25-Mar-08 | 10:30:01 |
| D | 25-Mar-08 | 10:30:01 |
| D | 25-Mar-08 | 11:13:59 |
+------+------------+----------+
SAS中的SQL代码是:
proc SQL;
create table table3 as
select * from table1
where name not in (select Instrument from table2);
quit;
如何使用Stata使用此程序?
答案 0 :(得分:1)
最简单的方法是对名称进行合并,只保留那些没有匹配的名称:
clear
input str1 Name str9 (Day Time)
A 24-Mar-08 "10:30:01"
E 24-Mar-08 "10:30:01"
E 24-Mar-08 "10:30:01"
F 24-Mar-08 "11:03:12"
end
keep Name
duplicates drop
tempfile names
save "`names'"
clear
input str1 Name str9 (Day Time)
A 24-Mar-08 "10:30:01"
A 24-Mar-08 "10:30:01"
B 24-Mar-08 "10:30:01"
B 24-Mar-08 "11:03:12"
B 24-Mar-08 "11:03:12"
C 25-Mar-08 "10:30:01"
D 25-Mar-08 "10:30:01"
D 25-Mar-08 "11:13:59"
E 25-Mar-08 "11:13:59"
F 25-Mar-08 "11:13:59"
F 25-Mar-08 "11:13:59"
F 25-Mar-08 "11:59:01"
end
merge m:1 Name using "`names'", keep(master) nogen
list, clean noobs
这会产生:
Name Day Time
B 24-Mar-08 10:30:01
B 24-Mar-08 11:03:12
B 24-Mar-08 11:03:12
C 25-Mar-08 10:30:01
D 25-Mar-08 10:30:01
D 25-Mar-08 11:13:59
答案 1 :(得分:0)
// Keeping equities
rename name Instrument
merge m:1 Instrument using "file name path"
keep if _merge == 3
keep Instrument Day Time