I have the following text:
This sentence contains a certain date which is 06-08-2003.
On this date, 29-12-1945 my grandmother was born.
12-04-1997 was an important year for celebrations.
I want to extract the date from each line into a variable, but the substr() function does not seem to work?
Answer 0 (score: 2)
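You have not shown us your code, so we cannot tell you what is wrong with your use of substr().
That said, the substr() function will work as expected if you know the position of the desired item in the string.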
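In this case, the dates appear at a different position within each string. One way to get the desired output is to use the strpos() function to find where the first hyphen is. You can then use that position as a reference point to calculate where the date starts in each string: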
clear
set obs 3
input str60 string
"This sentence contains a certain date which is 06-08-2003."
"On this date, 29-12-1945 my grandmother was born."
"12-04-1997 was an important year for celebrations."
end
generate new_string = ""
forvalues i = 1 / 3 {
local pos = strpos(string[`i'], "-") - 2
replace new_string = substr(string, `pos', 10) in `i'
}
list string new_string
+-------------------------------------------------------------------------+
| string new_string |
|-------------------------------------------------------------------------|
1. | This sentence contains a certain date which is 06-08-2003. 06-08-2003 |
2. | On this date, 29-12-1945 my grandmother was born. 29-12-1945 |
3. | 12-04-1997 was an important year for celebrations. 12-04-1997 |
+-------------------------------------------------------------------------+
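The loop above is not strictly necessary, because strpos() and substr() are evaluated observation by observation. A minimal sketch of the same idea without a loop follows; the helper variable pos and the guard pos > 2 are additions for illustration and are not part of the original answer. The guard leaves new_string empty when no hyphen is found or when the hyphen sits too close to the start of the string.

```stata
clear
input str60 string
"This sentence contains a certain date which is 06-08-2003."
"On this date, 29-12-1945 my grandmother was born."
"12-04-1997 was an important year for celebrations."
end

* position of the first hyphen in each observation (0 if there is none)
generate pos = strpos(string, "-")

* the date starts two characters before the first hyphen and is 10 characters long
generate new_string = substr(string, pos - 2, 10) if pos > 2

list string new_string
```

Like the loop, this still assumes that the first hyphen in each string belongs to a date in DD-MM-YYYY form.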
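This approach assumes that the dates in the strings are consistent. That is, they all have the same format and contain no errors. In practice, however, this is often not the case.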
A better way to get the desired output is to use a regular expression, via the regexm() and regexs() functions:
generate new_string = regexs(1) + "-" + regexs(2) + "-" + regexs(3) + regexs(4) if ///
regexm(string, "(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)([0-9][0-9])")
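The regular expression above not only finds the date in each string, but also applies some logical conditions to check whether that date is plausible: the day must be 01 to 31, the month 01 to 12, and the year must begin with 19 or 20. For example: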
replace string = "On this date, 29-131945 my grandmother was born." in 2
drop new_string
generate new_string = regexs(1) + "-" + regexs(2) + "-" + regexs(3) + regexs(4) if ///
regexm(string, "(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)([0-9][0-9])")
list string new_string
+-------------------------------------------------------------------------+
| string new_string |
|-------------------------------------------------------------------------|
1. | This sentence contains a certain date which is 06-08-2003. 06-08-2003 |
2. | On this date, 29-131945 my grandmother was born. |
3. | 12-04-1997 was an important year for celebrations. 12-04-1997 |
+-------------------------------------------------------------------------+
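As you can see, if the date in the second string is 29-13-1945 or 29-131945, the corresponding observation of new_string is left empty. This approach therefore generally prevents you from obtaining nonsensical results, while also identifying the problematic cases.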
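Note, however, that even this approach is not bulletproof. If you want to handle more complex cases, you will have to introduce additional flexibility by changing the regular expression.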
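One gap worth illustrating: the pattern checks the day and the month separately, so it would accept an impossible combination such as 31-02-2003. A possible extra check, sketched here under the assumption that the extracted dates are in day-month-year order, is to pass the result through the date() function, which returns missing when the string is not a real calendar date. The sample values and the variable name date_num are made up for this example.

```stata
clear
input str10 new_string
"06-08-2003"
"31-02-2003"
"12-04-1997"
end

* date() returns missing (.) for strings that are not valid calendar dates
generate date_num = date(new_string, "DMY")
format date_num %td

list new_string date_num

* show the extracted strings that passed the pattern but are not real dates
list new_string if missing(date_num) & new_string != ""
```

Storing the result as a numeric Stata date also makes it easier to sort, compare, and do arithmetic on the dates later.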