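I need help with a regex. I want to match all lines between BEGIN:VEVENT and END:VEVENT, but only when the string PARTSTAT=DECLINED appears somewhere between them. Below is sample text with three events (two contain PARTSTAT=DECLINED, one contains PARTSTAT=ACCEPTED). I want to delete the events I declined.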
BEGIN:VEVENT
UID:040000008200E00074C5B7101A82E0080000000090E9AB1DA717D4010000000000000000
10000000FF519C52170B604C82055C2922E0EA43
RRULE:FREQ=WEEKLY;BYDAY=MO
X-ALT-DESC;FMTTYPE=text/html:<html xmlns:v="urn:schemas-microsoft-com:vml" x
mlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-micros
oft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/om
ml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-T
ype content="text/html\; charset=iso-8859-2"><meta name=Generator content="M
icrosoft Word 15 (filtered medium)"><style><!--\n/* Font Definitions */\n@fo
nt-face\n{font-family:"Cambria Math"\;\npanose-1:2 4 5 3 5 4 6 3 2 4\;}\n@fo
nt-face\n{font-family:Calibri\;\npanose-1:2 15 5 2 2 2 4 3 2 4\;}\n/* Style
Definitions */\np.MsoNormal\, li.MsoNormal\, div.MsoNormal\n{margin:0cm\;\nm
argin-bottom:.0001pt\;\nfont-size:11.0pt\;\nfont-family:"Calibri"\,sans-seri
f\;\nmso-fareast-language:EN-US\;}\na:link\, span.MsoHyperlink\n{mso-style-p
riority:99\;\ncolor:#0563C1\;\ntext-decoration:underline\;}\na:visited\, spa
n.MsoHyperlinkFollowed\n{mso-style-priority:99\;\ncolor:#954F72\;\ntext-deco
ration:underline\;}\np.msonormal0\, li.msonormal0\, div.msonormal0\n{mso-sty
le-name:msonormal\;\nmso-margin-top-alt:auto\;\nmargin-right:0cm\;\nmso-marg
in-bottom-alt:auto\;\nmargin-left:0cm\;\nfont-size:12.0pt\;\nfont-family:"Ti
mes New Roman"\,serif\;}\nspan.Stylwiadomocie-mail18\n{mso-style-type:person
al-compose\;\nfont-family:"Calibri"\,sans-serif\;\ncolor:windowtext\;}\n.Mso
ChpDefault\n{mso-style-type:export-only\;\nfont-size:10.0pt\;}\n@page WordSe
<o:p></o:p></p></div></body></html>
LOCATION:sala_3.11@test.com
ATTENDEE;CN=sala_3.11@test.com;PARTSTAT=DECLINED:mailto:sala_3.11@test.com
ATTENDEE;CN=Name Surname
PRIORITY:5
X-MICROSOFT-CDO-BUSYSTATUS:TENTATIVE
X-MICROSOFT-CDO-IMPORTANCE:1
X-MS-OLK-AUTOSTARTCHECK:FALSE
X-MS-OLK-CONFTYPE:0
SUMMARY:None
DTSTART;TZID="Europe/UK":19980615T110000
DTEND;TZID="Europe/UK":19980615T113000
STATUS:CONFIRMED
CLASS:PUBLIC
X-MICROSOFT-CDO-INTENDEDSTATUS:BUSY
TRANSP:OPAQUE
LAST-MODIFIED:20180709T150603Z
DTSTAMP:20180709T150602Z
SEQUENCE:0
BEGIN:VALARM
ACTION:DISPLAY
TRIGGER;RELATED=START:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
UID:040000008200E00074C5B7101A82E0080000000090D3C2088E0DD4010000000000000000
1000000079086417F9C0F9478C1916D1A1E58267
X-ALT-DESC;FMTTYPE=text/html:<html xmlns:v="urn:schemas-microsoft-com:vml" x
mlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-micros
oft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/om
ml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-T
ype content="text/html\; charset=us-ascii"><meta name=Generator content="Mic
rosoft Word 15 (filtered medium)"><style><!--\n/* Font Definitions */\n@font
-face\n{font-family:"Cambria Math"\;\npanose-1:2 4 5 3 5 4 6 3 2 4\;}\n@font
-face\n{font-family:Calibri\;\npanose-1:2 15 5 2 2 2 4 3 2 4\;}\n/* Style De
finitions */\np.MsoNormal\, li.MsoNormal\, div.MsoNormal\n{margin:0cm\;\nmar
gin-bottom:.0001pt\;\nfont-size:11.0pt\;\nfont-family:"Calibri"\,sans-serif\
;\nmso-fareast-language:EN-US\;}\na:link\, span.MsoHyperlink\n{mso-style-pri
ority:99\;\ncolor:#0563C1\;\ntext-decoration:underline\;}\na:visited\, span.
MsoHyperlinkFollowed\n{mso-style-priority:99\;\ncolor:#954F72\;\ntext-decora
tion:underline\;}\np.msonormal0\, li.msonormal0\, div.msonormal0\n{mso-style
nk="#954F72"><div class=WordSection1><p class=MsoNormal><o:p> \;</o:p></
p></div></body></html>
LOCATION:sala_3.11@test.com
ATTENDEE;CN=sala_3.11@test.com;PARTSTAT=ACCEPTED:mailto:sala_3.11@test.com
PRIORITY:5
X-MICROSOFT-CDO-BUSYSTATUS:TENTATIVE
X-MICROSOFT-CDO-IMPORTANCE:1
X-MS-OLK-AUTOSTARTCHECK:FALSE
X-MS-OLK-CONFTYPE:0
SUMMARY:None
DTSTART;TZID="Europe/UK":20180628T103000
DTEND;TZID="Europe/UK":20180628T140000
STATUS:CONFIRMED
CLASS:PUBLIC
X-MICROSOFT-CDO-INTENDEDSTATUS:BUSY
TRANSP:OPAQUE
LAST-MODIFIED:20180626T184118Z
DTSTAMP:20180626T184118Z
SEQUENCE:0
BEGIN:VALARM
ACTION:DISPLAY
TRIGGER;RELATED=START:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
UID:040000008200E00074C5B7101A82E008000000008030BEAE1C0ED4010000000000000000
100000008AEEB06CBD136945961F46812BD0D171
X-ALT-DESC;FMTTYPE=text/html:<html xmlns:v="urn:schemas-microsoft-com:vml" x
mlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-micros
oft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/om
ml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-T
ype content="text/html\; charset=windows-1250"><meta name=Generator content=
"Microsoft Word 15 (filtered medium)"><style><!--\n/* Font Definitions */\n@
font-face\n{font-family:"Cambria Math"\;\npanose-1:2 4 5 3 5 4 6 3 2 4\;}\n@
font-face\n{font-family:Calibri\;\npanose-1:2 15 5 2 2 2 4 3 2 4\;}\n/* Styl
e Definitions */\np.MsoNormal\, li.MsoNormal\, div.MsoNormal\n{margin:0cm\;\
nmargin-bottom:.0001pt\;\nfont-size:11.0pt\;\nfont-family:"Calibri"\,sans-se
rif\;\nmso-fareast-language:EN-US\;}\na:link\, span.MsoHyperlink\n{mso-style
-priority:99\;\ncolor:#0563C1\;\ntext-decoration:underline\;}\na:visited\, s
pan.MsoHyperlinkFollowed\n{mso-style-priority:99\;\ncolor:#954F72\;\ntext-de
<o:p></o:p></p></div></body></html>
LOCATION:Sala 3.11
ATTENDEE;CN=Sala kon 3.11;PARTSTAT=DECLINED:mailto:sala_3.1
1@test.com
PRIORITY:5
X-MICROSOFT-CDO-BUSYSTATUS:TENTATIVE
X-MICROSOFT-CDO-IMPORTANCE:1
X-MS-OLK-AUTOSTARTCHECK:FALSE
X-MS-OLK-CONFTYPE:0
SUMMARY:None
DTSTART;TZID="Europe/UK":19980615T110000
DTEND;TZID="Europe/UK":19980615T113000
STATUS:CONFIRMED
CLASS:PUBLIC
X-MICROSOFT-CDO-INTENDEDSTATUS:BUSY
TRANSP:OPAQUE
LAST-MODIFIED:20180627T114346Z
DTSTAMP:20180627T114346Z
SEQUENCE:0
BEGIN:VALARM
ACTION:DISPLAY
TRIGGER;RELATED=START:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
UID:040000008200E00074C5B7101A82E008000000008077B51A0819D4010000000000000000
1000000027DB863B9FBE90468D3B3F888327EF15
Answer 0 (score: 0)
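Since the goal is to remove the entries that contain PARTSTAT=DECLINED, the following does so by keeping only the entries with PARTSTAT=ACCEPTED (the result is a list of the kept event strings):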
import re
print([m for m, s in re.findall(r'\b(BEGIN:VEVENT\b.*?\bPARTSTAT=(ACCEPTED|DECLINED)\b.*?\bEND:VEVENT)\b', data, re.DOTALL) if s == 'ACCEPTED'])
For example, given:
data = '''BEGIN:VEVENT SOME TEXT PARTSTAT=DECLINED END:VEVENT BEGIN:VEVENT SOME TEXT PARTSTAT=ACCEPTED END:VEVENT BEGIN:VEVENT SOME TEXT PARTSTAT=DECLINED END:VEVENT'''
the code above outputs:
['BEGIN:VEVENT SOME TEXT PARTSTAT=ACCEPTED END:VEVENT']
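If the aim is to delete the declined events from the text rather than collect the accepted ones, a substitution may be closer to what the question asks for. The following is a minimal sketch, not part of the original answer: the name `remove_declined` and the sample data are made up for illustration. It assumes that `PARTSTAT=DECLINED` is not split across a folded line. The tempered dot `(?:(?!END:VEVENT).)` keeps a match from starting in one event and finishing in another.

```python
import re

# Assumed pattern: match one whole VEVENT block that contains PARTSTAT=DECLINED.
# The tempered dot refuses to step past an END:VEVENT, so the search for
# PARTSTAT=DECLINED stays inside the event that was opened.
DECLINED_EVENT = re.compile(
    r'BEGIN:VEVENT\r?\n'
    r'(?:(?!END:VEVENT).)*?'
    r'PARTSTAT=DECLINED'
    r'.*?'
    r'END:VEVENT(?:\r?\n)?',
    re.DOTALL,
)


def remove_declined(text):
    """Return text with every declined VEVENT block removed (hypothetical helper)."""
    return DECLINED_EVENT.sub('', text)


# Made-up sample: events A and C are declined, event B is accepted.
sample = (
    'BEGIN:VEVENT\nSUMMARY:A\nATTENDEE;PARTSTAT=DECLINED:mailto:a@test.com\nEND:VEVENT\n'
    'BEGIN:VEVENT\nSUMMARY:B\nATTENDEE;PARTSTAT=ACCEPTED:mailto:b@test.com\nEND:VEVENT\n'
    'BEGIN:VEVENT\nSUMMARY:C\nATTENDEE;PARTSTAT=DECLINED:mailto:c@test.com\nEND:VEVENT\n'
)
cleaned = remove_declined(sample)
print(cleaned)
```

Unlike the `findall` version, this leaves everything outside the removed events untouched, including events that carry no PARTSTAT at all.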
Answer 1 (score: 0)
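If the only strings you are looking for are BEGIN:VEVENT, END:VEVENT and PARTSTAT=DECLINED (that is, as long as they are constants), you may not even need a regex.
The code to parse this by hand is more verbose, but for anyone not familiar with regular expressions it is more explicit than a pattern that relies on re.DOTALL.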
For example, in Python you could do something like this:
def _next_event(lines, start=0):
    """
    Find the body of the next event as a string.

    Returns (body, end_index) if found, or (None, -1) if not.
    body is defined to be every line after BEGIN:VEVENT and before END:VEVENT.
    end_index is the index of END:VEVENT.
    """
    # Slice so the search really begins at `start`; enumerate's second
    # argument only offsets the counter and does not skip any lines.
    for i, line in enumerate(lines[start:], start):
        if line.strip() == 'BEGIN:VEVENT':
            start = i
            break
    else:
        # Return None if there is no BEGIN:VEVENT, -1 for "not found"
        return None, -1

    for i, line in enumerate(lines[start+1:], start+1):
        if line.strip() == 'END:VEVENT':
            end = i
            break
    else:
        # Return None if there is no END:VEVENT, -1 for "not found"
        return None, -1

    # readlines() keeps trailing newlines, so strip them before joining
    return '\n'.join(l.rstrip('\r\n') for l in lines[start+1:end]), end


def get_events(lines):
    """
    Get the bodies of all events.
    """
    events = []
    body, i = _next_event(lines)
    while i != -1:
        events.append(body)
        body, i = _next_event(lines, i)
    return events


if __name__ == '__main__':
    calendar_file = 'calendar.ics'  # placeholder path, adjust to your file
    with open(calendar_file, 'r') as f:
        lines = f.readlines()
    events = get_events(lines)
    for event in events:
        if 'PARTSTAT=DECLINED' in event:
            # You'll need to define "delete event"
            delete_event(event)
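One way to fill in the "delete event" step is to skip deletion altogether and instead build the list of lines to keep, which can then be written to a new file. The sketch below is an assumption about how that might look, not the answer's own code; `drop_declined_events` is a hypothetical name. It also unfolds each event before checking it, because iCalendar allows a long line to continue on the next line after a single leading space or tab, so `PARTSTAT=DECLINED` may be split in a real export.

```python
def drop_declined_events(lines):
    """Return the calendar lines with declined VEVENT blocks left out.

    Hypothetical helper: `lines` is a list of strings that still carry
    their line endings, as returned by f.readlines().
    """
    kept = []
    buffer = None  # lines of the event being read, or None outside an event
    for line in lines:
        marker = line.strip()
        if buffer is None:
            if marker == 'BEGIN:VEVENT':
                buffer = [line]
            else:
                kept.append(line)  # header, footer, or other components
            continue
        buffer.append(line)
        if marker == 'END:VEVENT':
            # Unfold: drop each line break that is followed by a space or tab
            text = ''.join(buffer).replace('\r\n', '\n')
            unfolded = text.replace('\n ', '').replace('\n\t', '')
            if 'PARTSTAT=DECLINED' not in unfolded:
                kept.extend(buffer)
            buffer = None
    if buffer is not None:
        # Unterminated event at the end of the input: keep it unchanged
        kept.extend(buffer)
    return kept


# Made-up sample; the first event has PARTSTAT=DECLINED folded across two lines.
sample_lines = [
    'BEGIN:VCALENDAR\n',
    'BEGIN:VEVENT\n',
    'ATTENDEE;PARTSTAT=DECL\n',
    ' INED:mailto:a@test.com\n',
    'END:VEVENT\n',
    'BEGIN:VEVENT\n',
    'ATTENDEE;PARTSTAT=ACCEPTED:mailto:b@test.com\n',
    'END:VEVENT\n',
    'END:VCALENDAR\n',
]
result = drop_declined_events(sample_lines)
print(''.join(result))
```

Writing `''.join(result)` to a separate output file, rather than overwriting the input, keeps the original calendar available if the filter drops more than intended.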
If you want to extend the logic to who declined, or to whether anyone declined, you could extend it to something like this:
import re

def anyone_declined(event):
    # Collect every attendee's response found in the event body
    rsvps = re.findall(r'PARTSTAT=(ACCEPTED|DECLINED)', event)
    return any(rsvp == 'DECLINED' for rsvp in rsvps)
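To answer the "who declined" part, the same idea can be narrowed to ATTENDEE lines. This is only a sketch under assumptions: `who_declined` is a hypothetical name, the event body is assumed to be already unfolded, and each attendee is assumed to be written as `ATTENDEE;...;PARTSTAT=...:mailto:address` on a single line, as in the sample data.

```python
import re

# Assumed shape of an attendee line: ATTENDEE;...;PARTSTAT=DECLINED...:mailto:address
DECLINED_ATTENDEE = re.compile(
    r'^ATTENDEE[^\r\n]*?PARTSTAT=DECLINED[^\r\n]*?:mailto:([^\r\n]+)',
    re.MULTILINE,
)


def who_declined(event):
    """Return the mailto addresses of attendees who declined (hypothetical helper)."""
    return DECLINED_ATTENDEE.findall(event)


# Made-up event body with one declined and one accepted attendee.
event_body = (
    'ATTENDEE;CN=Room;PARTSTAT=DECLINED:mailto:room@test.com\n'
    'ATTENDEE;CN=Ann;PARTSTAT=ACCEPTED:mailto:ann@test.com\n'
    'ATTENDEE;CN=Name Surname\n'
)
declined = who_declined(event_body)
print(declined)
```

An empty list means nobody declined, so `bool(who_declined(event))` gives the same answer as `anyone_declined(event)` for events that follow the assumed line shape.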