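I have table 1 and table 2, each containing the variables name and date.
I want to drop the observations in table 1 that have the same name and date in table 2. In addition, for each name and date shared by table 1 and table 2, I want to drop the next date in table 1.
Table 1: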
* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 name long date
"A" 17659
"A" 17724
"A" 17900
"A" 17901
"A" 18086
"A" 18102
"A" 18239
"B" 17659
"B" 17662
"B" 17669
"B" 17676
"B" 17684
"B" 17701
"B" 18026
"C" 18177
"C" 18187
"C" 18195
"C" 18219
"C" 18235
"C" 18250
"C" 18391
"C" 18391
"C" 18392
end
format %d date
Table 2:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 name long date
"A" 17724
"A" 17900
"A" 18102
"B" 17659
"B" 17669
"B" 17701
"B" 18087
"C" 18187
"C" 18235
"C" 18250
end
format %d date
The expected result is as follows:
+------+-----------+
| name | date |
+------+-----------+
| A | 7-May-08 |
| A | 8-Jul-09 |
| B | 1-Jun-08 |
| C | 7-Oct-09 |
| C | 18-Nov-09 |
| C | 10-May-10 |
+------+-----------+
How can I do this?
Answer 0 (score: 0)
I don't think I follow this, because I can't reproduce your result. However, the technique here may help.
clear
input str4 name long date
"A" 17659
"A" 17724
"A" 17900
"A" 17901
"A" 18086
"A" 18102
"A" 18239
"B" 17659
"B" 17662
"B" 17669
"B" 17676
"B" 17684
"B" 17701
"B" 18026
"C" 18177
"C" 18187
"C" 18195
"C" 18219
"C" 18235
"C" 18250
"C" 18391
"C" 18391
"C" 18392
end
format %d date
gen table = 1
save table1 , replace
clear
input str4 name long date
"A" 17724
"A" 17900
"A" 18102
"B" 17659
"B" 17669
"B" 17701
"B" 18087
"C" 18187
"C" 18235
"C" 18250
end
format %d date
gen table = 2
append using table1
bysort name date (table) : gen todrop = table == 1 & table[1] != table[_N]
bysort table name date : replace todrop = 1 if todrop[_n-1] == 1
by table name date : replace todrop = 1 if todrop[_n-1] == 1 & date == date[_n-1]
drop if todrop
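For comparison, here is a minimal sketch of one reading of the question, using merge rather than the append approach above. It is not the answerer's method. It assumes that "the next date" means the row directly after a matched row within the same name, and that repeated copies of such a date should go too. Only the "C" rows from the two tables are used to keep it short; the names keys, matched and follows are made up for illustration.

```stata
* Sketch only: the "C" rows of table 2 serve as the lookup keys
clear
input str4 name long date
"C" 18187
"C" 18235
"C" 18250
end
* keys is a hypothetical temporary file holding the table-2 pairs
tempfile keys
save `keys'

* The "C" rows of table 1
clear
input str4 name long date
"C" 18177
"C" 18187
"C" 18195
"C" 18219
"C" 18235
"C" 18250
"C" 18391
"C" 18391
"C" 18392
end
format %td date

* _merge == 3 marks table-1 rows whose name and date also occur in table 2
merge m:1 name date using `keys', keep(master match)
generate byte matched = _merge == 3

* Flag the row that directly follows a matched row within each name
bysort name (date): generate byte follows = _n > 1 & matched[_n-1]
* Extend the flag to repeated copies of a flagged date
by name: replace follows = 1 if date == date[_n-1] & follows[_n-1]

drop if matched | follows
list name date, noobs
```

On these rows the listing should show 07oct2009, 18nov2009 and 10may2010, the three C dates in the expected result.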
Answer 1 (score: 0)
As long as there are no duplicate entries, the code below gives you the desired output:
clear
input str4 name1 long date1
"A" 17659
"A" 17724
"A" 17900
"A" 17901
"A" 18086
"A" 18102
"A" 18239
"B" 17659
"B" 17662
"B" 17669
"B" 17676
"B" 17684
"B" 17701
"B" 18026
"C" 18177
"C" 18187
"C" 18195
"C" 18219
"C" 18235
"C" 18250
"C" 18391
"C" 18391
"C" 18392
end
input str4 name2 long date2
"A" 17724
"A" 17900
"A" 18102
"B" 17659
"B" 17669
"B" 17701
"B" 18087
"C" 18187
"C" 18235
"C" 18250
end
format %d date1
format %d date2
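* Number of observations in memory (the length of the longer table)
local obs = _N
* Flag rows of table 1 whose name and date also appear in table 2
generate todrop1 = 0
forvalues i = 1 / `obs' {
    forvalues j = 1 / `obs' {
        replace todrop1 = 1 in `i' if name1[`i'] == name2[`j'] & ///
            date1[`i'] == date2[`j']
    }
}
* Flag the row that follows each matched row
generate todrop2 = 0
forvalues i = 1 / `obs' {
    if todrop1[`i'] == 1 {
        replace todrop2 = 1 in `=`i'+1'
    }
}
list name1 date1 if todrop1 == 0 & todrop2 == 0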
In this particular case, C 09may2010 appears in the output because it exists twice in name1:
+-------------------+
| name1 date1 |
|-------------------|
1. | A 07may2008 |
5. | A 08jul2009 |
12. | B 01jun2008 |
15. | C 07oct2009 |
18. | C 18nov2009 |
|-------------------|
22. | C 09may2010 |
23. | C 10may2010 |
+-------------------+
Indeed, deleting the duplicate entry "C" 18391 from name1 and re-running the code, we obtain:
+-------------------+
| name1 date1 |
|-------------------|
1. | A 07may2008 |
5. | A 08jul2009 |
12. | B 01jun2008 |
15. | C 07oct2009 |
18. | C 18nov2009 |
|-------------------|
22. | C 10may2010 |
+-------------------+
If you have duplicate entries in your data, you can first remove them with the duplicates command, assuming that is what you want in your use case.
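As a rough illustration of that last step, the sketch below removes the repeated "C" 18391 row before any matching is done. It assumes the table-1 variables are called name and date, as in the question, and uses only three rows of the data.

```stata
* Sketch only: three "C" rows from table 1, including the repeated date
clear
input str4 name long date
"C" 18391
"C" 18391
"C" 18392
end
format %td date

* Report how many name-date pairs occur more than once
duplicates report name date
* Keep the first copy of each pair; force is required when a varlist is given
duplicates drop name date, force
list name date, noobs
```

This should leave two rows, one for each distinct date.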