I have the following text:
Two important events took place on 19/11/1923 and 30/02/1934 respectively.
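I want to extract both dates, but I would like them saved in different variables.
I have already tried the regex solution described in my previous question, but it does not work as expected in this case.
Is it possible to save both dates?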
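Answer 0 (score: 2)
Whenever you ask a question, it is important to provide the code you have tried along with a reproducible example. For tips on how to ask a good question, please read this page.
Consider your current and previous examples: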
clear
input str80 string
"This sentence contains a certain date which is 06-08-2003."
"Two important events took place on 19-11-1923 and 30-02-1934 respectively."
"On this date, 29-12-1945 my grandmother was born."
"12-04-1997 was an important year for celebrations."
end
list string
     +----------------------------------------------------------------------------+
     | string                                                                     |
     |----------------------------------------------------------------------------|
  1. | This sentence contains a certain date which is 06-08-2003.                 |
  2. | Two important events took place on 19-11-1923 and 30-02-1934 respectively. |
  3. | On this date, 29-12-1945 my grandmother was born.                          |
  4. | 12-04-1997 was an important year for celebrations.                         |
     +----------------------------------------------------------------------------+
Yes, it is possible to extract both dates by combining the regular-expression functions with assert inside a loop:
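clonevar temp_string = string    // work on a copy so the original stays intact
generate date1 = ""
generate date2 = ""

// dd-mm-yyyy with a hyphen, space, slash, or dot as separator; years 1900-2099
local reg_ex "(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)([0-9][0-9])"

forvalues i = 1/4 {
    local dates
    local j = 0
    while `j' == 0 {
        // regexm() performs the match; regexs() then returns its subexpressions
        capture assert regexm(temp_string[`i'], "`reg_ex'")
        if _rc == 0 {
            local dates = "`dates' " + regexs(1) + "-" + regexs(2) + "-" + regexs(3) + regexs(4)
            // overwrite the date just found so the next pass sees the next one
            replace temp_string = regexr(temp_string[`i'], "`reg_ex'", "null") in `i'
        }
        else {
            // no match left: store what was collected and leave the while loop
            local dates_n : word count `dates'
            if `dates_n' == 1 {
                replace date1 = trim("`dates'") in `i'
            }
            else {
                tokenize `dates'
                replace date1 = "`1'" in `i'
                replace date2 = "`2'" in `i'
            }
            local j = 1
        }
    }
}
drop temp_string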
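For each observation, this code repeatedly matches a date, stores it, and replaces it with "null" in a temporary copy of the string until no match remains. It then checks whether more than one date was found. If not, the single date is saved in the variable date1. If so, the first date is saved in date1 and the second in a separate variable date2. In this case: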
list date1 date2
     +-------------------------+
     |      date1        date2 |
     |-------------------------|
  1. | 06-08-2003              |
  2. | 19-11-1923   30-02-1934 |
  3. | 29-12-1945              |
  4. | 12-04-1997              |
     +-------------------------+
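Note that the pattern only checks the shape of a date, not whether the date exists: 30-02-1934 in the second row is matched even though February never has 30 days. A minimal sketch of one way to flag such values, assuming date1 and date2 from the listing above are in memory and always follow day-month-year order; the variable names date1_td and date2_td are illustrative:

```stata
// date() returns a missing value when the string is not a real calendar date
generate date1_td = date(date1, "DMY")
generate date2_td = date(date2, "DMY")
format date1_td date2_td %td

// show extracted strings that matched the pattern but could not be converted
list date1 date2 if (date1 != "" & missing(date1_td)) | (date2 != "" & missing(date2_td))
```

The string variables are left untouched, so the original text of each match remains available for inspection.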
You can easily adapt this example to extract more dates.
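One possible adaptation is sketched below. It is not the code from the answer above but a variation on the same idea (match, store, overwrite the match, repeat), applied to all observations at once. It assumes the variable string from the example is still in memory. The names found1, found2, ... and the upper bound max_dates are illustrative choices.

```stata
local reg_ex "(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)([0-9][0-9])"
local max_dates = 3    // assumed upper bound on dates per string

clonevar temp_string = string
forvalues k = 1/`max_dates' {
    generate found`k' = ""
    // regexs(0) is the full text matched by regexm() in the same observation
    replace found`k' = regexs(0) if regexm(temp_string, "`reg_ex'")
    // regexr() replaces only the first match, exposing the next date (if any)
    replace temp_string = regexr(temp_string, "`reg_ex'", "null")
}
drop temp_string
list found*
```

Unlike the loop in the answer, regexs(0) keeps each date exactly as written, including its original separator, and any variable beyond the number of dates actually present simply stays empty.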