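I receive strings that each describe an event, and I need to convert every string into a start DateTime and an end DateTime. At first I tried walking through the string character by character, but that became too complicated once a string contained multiple dates. I also tried parsing against many date formats, but that fails when the day and month come first and the times follow later. I'm using C#, and I also tried searching the string with Regex, but I ran into trouble because I couldn't match each day to the correct time.
Here are a few examples of the strings I'm given: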
September 12-13, 2015, Saturday 10:30 a.m-6 p.m. Sunday 10 a.m noon
Should have 2 dates:
StartDate: 2015/09/12 10:30 EndDate: 2015/09/12 18:00
StartDate: 2015/09/13 10:00 EndDate: 2015/09/13 12:00
June 3 - September 9, 2015, Tuesday-Thursday 6-7 p.m., Sunday 10-11 a.m.
Multiple dates on Tuesday/Thursday/Sunday within the date range:
StartDate: 2015/06/04 18:00 EndDate: 2015/06/04 19:00
StartDate: 2015/06/07 10:00 EndDate: 2015/06/07 11:00
StartDate: 2015/06/09 18:00 EndDate: 2015/06/09 19:00
StartDate: 2015/06/11 18:00 EndDate: 2015/06/11 19:00
...and so on, following the same pattern
Thanks.
Answer 0 (score: 1)
Here is one approach that could work:
1) Scanning/Lexing -> Scan for basic tokens.
Names: September, Saturday, AM, etc.
Numbers: 12, 2015, 9, etc.
Operators serving as Separators: '-', ',', space, etc.
'-' acts as a range operator as in FromDate - ToDate.
',' and space separate components of a date
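As a rough illustration of step 1, here is a minimal lexer sketch. The names `Token`, `TokenKind` and `EventLexer` are made up for this example, and it simplifies a little: whitespace is skipped rather than emitted as a separator token, and dots are stripped from names so that "a.m" and "p.m." come out as "am" and "pm".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public enum TokenKind { Name, Number, Separator }

// Hypothetical token type for this sketch.
public sealed class Token
{
    public TokenKind Kind { get; }
    public string Text { get; }

    public Token(TokenKind kind, string text)
    {
        Kind = kind;
        Text = text;
    }
}

public static class EventLexer
{
    // name:   letters with optional inner/trailing dots ("September", "a.m", "p.m.")
    // number: a run of digits ("12", "2015")
    // sep:    one of '-', ',', ':' or '/'
    private static readonly Regex TokenPattern = new Regex(
        @"(?<name>[A-Za-z]+(?:\.[A-Za-z]+)*\.?)|(?<number>\d+)|(?<sep>[-,:/])");

    public static List<Token> Scan(string input)
    {
        var tokens = new List<Token>();
        foreach (Match m in TokenPattern.Matches(input))
        {
            if (m.Groups["name"].Success)
                // Drop dots so "a.m" and "a.m." both normalize to "am".
                tokens.Add(new Token(TokenKind.Name, m.Value.Replace(".", "")));
            else if (m.Groups["number"].Success)
                tokens.Add(new Token(TokenKind.Number, m.Value));
            else
                tokens.Add(new Token(TokenKind.Separator, m.Value));
        }
        return tokens;
    }
}
```

Calling `EventLexer.Scan("September 12-13, 2015")` should give the name "September", the number "12", the separator "-", and so on, which is the input the parsing step works from.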
2) Parsing -> Build a parse tree out of the tokens.
Names are classified into Months, days of week, etc.
Numbers are identified as year, month or day if it
can be done unambiguously. Otherwise, their identification
is left to later steps.
We can use some heuristics, like day of month almost always follows month.
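The classification in step 2 could look something like the sketch below. `PartKind` and `TokenClassifier` are hypothetical names, and the number rules (four digits means a year, 1 to 31 right after a month means a day of month) are just one possible reading of the heuristics above. It assumes English month and weekday names.

```csharp
using System;
using System.Globalization;

public enum PartKind { Month, DayOfWeek, Year, DayOfMonth, Unknown }

public static class TokenClassifier
{
    // Maps a name such as "September" or "Saturday" to its kind and numeric value.
    public static PartKind ClassifyName(string name, out int value)
    {
        string text = name.Trim().TrimEnd('.');

        // MonthNames has 13 entries for the invariant culture; the last one is empty.
        string[] months = CultureInfo.InvariantCulture.DateTimeFormat.MonthNames;
        for (int i = 0; i < 12; i++)
        {
            if (string.Equals(months[i], text, StringComparison.OrdinalIgnoreCase))
            {
                value = i + 1; // 1 = January
                return PartKind.Month;
            }
        }

        foreach (DayOfWeek day in Enum.GetValues(typeof(DayOfWeek)))
        {
            if (string.Equals(day.ToString(), text, StringComparison.OrdinalIgnoreCase))
            {
                value = (int)day; // 0 = Sunday
                return PartKind.DayOfWeek;
            }
        }

        value = 0;
        return PartKind.Unknown;
    }

    // Classifies a number only when that is unambiguous; otherwise returns
    // Unknown so a later step can decide from context.
    public static PartKind ClassifyNumber(int number, bool followsMonth)
    {
        if (number >= 1000 && number <= 9999) return PartKind.Year;
        if (followsMonth && number >= 1 && number <= 31) return PartKind.DayOfMonth;
        return PartKind.Unknown;
    }
}
```

A bare "6" in "6-7 p.m" stays `Unknown` here on purpose, since only the surrounding tokens can tell whether it is an hour or a day.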
3) Now, the parse tree represents date-time entries separated by '-'.
At this point, a date in the tree can be partial or complete.
Introduce separator when it is missing between adjacent dates or times.
"Sunday 10a.m noon" is missing separator between '10am' and 'noon'
4) Identify complete and partial dates from the parse tree.
For example, "September 9, 2015" is a complete date, while "June 3"
is incomplete. After extracting at least one complete date, infer
the missing elements in incomplete dates from surrounding context.
"June 3" is incomplete because of missing year, so we grab the
year from the nearest complete date as 2015.
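To make step 4 concrete, here is a small sketch of borrowing the year from the nearest complete date. `PartialDate` and `DateCompleter` are hypothetical names, and "nearest" is taken to mean closest by position in the list, checking the right-hand neighbour first; that tie-break is my assumption, not something the strings dictate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for a date whose parts may still be unknown.
public sealed class PartialDate
{
    public int? Year { get; set; }
    public int? Month { get; set; }
    public int? Day { get; set; }

    public bool IsComplete => Year.HasValue && Month.HasValue && Day.HasValue;
}

public static class DateCompleter
{
    // Copies the year from the nearest complete date into each date that lacks one.
    public static void FillMissingYears(IList<PartialDate> dates)
    {
        for (int i = 0; i < dates.Count; i++)
        {
            if (dates[i].Year.HasValue) continue;

            // Widen the search one position at a time, right side first.
            for (int distance = 1; distance < dates.Count; distance++)
            {
                PartialDate donor = CompleteAt(dates, i + distance)
                                    ?? CompleteAt(dates, i - distance);
                if (donor != null)
                {
                    dates[i].Year = donor.Year;
                    break;
                }
            }
        }
    }

    // Returns the date at index if it exists and is complete, otherwise null.
    private static PartialDate CompleteAt(IList<PartialDate> dates, int index)
    {
        return index >= 0 && index < dates.Count && dates[index].IsComplete
            ? dates[index]
            : null;
    }
}
```

For "June 3 - September 9, 2015" the list would hold June 3 (no year) and September 9, 2015, and the first entry picks up 2015 from the second.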
5) If no complete date can be found in the step above,
use two adjacent dates and let them fill in missing parts
from each other to arrive at a complete one. "September 12 - 13, 2015"
is one such example: the left side of the separator is missing the
year and can get it from the right side, while the right side is missing
the month and can get it from the left. Finally, work out the date for
a day of the week, such as Thursday, from the complete dates in the string.
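Once the start and end dates are complete, turning a day of the week into actual dates is plain DateTime arithmetic. A minimal sketch, with `WeekdayExpander` and `DatesOn` as made-up names:

```csharp
using System;
using System.Collections.Generic;

public static class WeekdayExpander
{
    // Yields every date in [first, last] that falls on the given day of the week.
    public static IEnumerable<DateTime> DatesOn(DateTime first, DateTime last, DayOfWeek day)
    {
        // Days to move forward from 'first' to reach the requested weekday (0 to 6).
        int offset = ((int)day - (int)first.DayOfWeek + 7) % 7;

        for (DateTime d = first.Date.AddDays(offset); d <= last.Date; d = d.AddDays(7))
            yield return d;
    }
}
```

For the range June 3 to September 9, 2015, asking for Thursday starts at 2015/06/04, Sunday at 2015/06/07 and Tuesday at 2015/06/09, which lines up with the expected output in the question. Adding the parsed times to each date (for example `d.AddHours(18)` and `d.AddHours(19)`) then gives the start and end DateTime pair.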