Say I have a list of movie titles, each ending in its release year:
The.Sword.in.the.Stone.1963
The.War.of.The.Worlds.1953
Waynes.World.2.1993
Flora.2017
Candyman.1992
Charming.2018
Candyman.1992
Sollers.Point.2017
Luis.And.The.Aliens.2018
Edie.2017
Daisies.1966
Distant.Voices.Still.Lives.1988
The.Scorpion.King.Book.of.Souls.2018
The.Great.Scout.and.Cathouse.Thursday.1976
Valley.Girl.1983
Psycho.1960
North.By.Northwest.1959
Michael.Jacksons.Moonwalker.1988
How can I use a regular expression to remove the titles of movies released before 2000 from this list?
Answer 0 (score: 2)
If the year is always the last 4 characters of each line, you don't need a regular expression at all; you can do this:
from io import StringIO
txt = '''The.Sword.in.the.Stone.1963
The.War.of.The.Worlds.1953
Waynes.World.2.1993
Flora.2017
Candyman.1992
Charming.2018
Candyman.1992
Sollers.Point.2017
Luis.And.The.Aliens.2018
Edie.2017
Daisies.1966
Distant.Voices.Still.Lives.1988
The.Scorpion.King.Book.of.Souls.2018
The.Great.Scout.and.Cathouse.Thursday.1976
Valley.Girl.1983
Psycho.1960
North.By.Northwest.1959
Michael.Jacksons.Moonwalker.1988'''
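with StringIO(txt) as file:
    for line in file:
        # int() ignores the trailing newline left on each line
        year = int(line.split('.')[-1])
        # or:
        # year = int(line.rstrip()[-4:])
        if year >= 2000:  # keep only titles released in 2000 or later
            print(line.rstrip())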
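Answer 1 (score: 1)
Given that the first film wasn't made until 1888, and you only want to drop films from before 2000, it is safe to check whether the fourth-from-last character of each title is 1. Assuming your titles are stored in a list of strings l: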
[t for t in l if t[-4] != '1']
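As a quick check, the comprehension can be run on a few of the titles from the question. This is only a sketch: the name `titles` is a hypothetical stand-in for `l`, and it assumes every string ends in exactly four year digits with no trailing newline or whitespace, and that no year is 3000 or later.

```python
# Hypothetical sample data; `titles` stands in for the list `l` above.
titles = ['Psycho.1960', 'Flora.2017', 'Candyman.1992', 'Edie.2017']

# Keep a title only if the first digit of its trailing year is not '1'.
# Assumption: each string ends in exactly four year digits, with no
# trailing newline, so t[-4] is the first digit of the year.
kept = [t for t in titles if t[-4] != '1']
print(kept)  # ['Flora.2017', 'Edie.2017']
```

If the titles come straight from a file, each line would probably need an `rstrip()` first, otherwise `t[-4]` lands on the second digit of the year.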
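Answer 2 (score: 1)
If you must use a regular expression, you can use a lookahead to match any year that starts with 2 and apply it to the last 4 characters of the string (assuming the last 4 characters of a line are always a year).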
import re
# assuming the titles are stored in d:/a.txt
with open("d:/a.txt") as file:
    for line in file:
        # the lookahead (?=2) requires the 4-digit year to start with 2
        if re.match(r'(?=2)\d{4}', line.rstrip()[-4:]):
            print(line.rstrip())
# output:
# Flora.2017
# Charming.2018
# Sollers.Point.2017
# Luis.And.The.Aliens.2018
# Edie.2017
# The.Scorpion.King.Book.of.Souls.2018
Using a list comprehension:
with open("d:/a.txt") as file:
    print([line for line in file if re.match(r'(?=2)\d{4}', line.rstrip()[-4:])])
# output
# ['Flora.2017\n', 'Charming.2018\n', 'Sollers.Point.2017\n', 'Luis.And.The.Aliens.2018\n', 'Edie.2017\n', 'The.Scorpion.King.Book.of.Souls.2018\n']
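The lookahead only inspects the leading digit of the year. As an alternative sketch, the regex can capture the trailing year and the comparison can be done numerically, which also allows a cutoff other than 2000. The helper name `released_since` and its `cutoff` parameter are hypothetical, and the pattern assumes each title ends in a dot followed by a four-digit year.

```python
import re

# Matches a dot followed by a four-digit year at the very end of the title.
YEAR_AT_END = re.compile(r'\.(\d{4})$')

def released_since(titles, cutoff=2000):
    """Return the titles whose trailing year is >= cutoff (hypothetical helper)."""
    kept = []
    for title in titles:
        title = title.strip()  # drop any newline left over from file reading
        match = YEAR_AT_END.search(title)
        # Assumption: titles without a recognizable trailing year are dropped.
        if match and int(match.group(1)) >= cutoff:
            kept.append(title)
    return kept

sample = ['Psycho.1960', 'Flora.2017\n', 'Waynes.World.2.1993', 'Charming.2018']
print(released_since(sample))  # ['Flora.2017', 'Charming.2018']
```

Because the pattern is anchored with `$`, digits earlier in a title (such as the `2` in `Waynes.World.2.1993`) do not affect the result.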