Find all years in a specific section of a text file

Date: 2019-03-11 04:13:10

Tags: python regex

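I'm trying to extract all the years from a specific section of a text file using a regex pattern, but I can't get the search to continue from the line where I first find the section heading.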

This is what I have so far:

for line in infile_1:
    if '[edit]Crime' in line:
    print(re.findall(r'\d{4}',line[1:]))
    for i in infile_1:
        if '[edit]Education' in infile:
              break

This only prints empty lists, which is not what I want.

Text:

[edit]Crime
Main articles: Crime in Chicago and Organized crime in Chicago
Murders in the city peaked first in 1974, with 970 murders when the city's population was over 3 million people (resulting in a murder rate of around 29 per 100,000), and again in 1992 with 943 murders, resulting in a murder rate of 34 per 100,000.[114] Chicago, along with other major US cities, experienced a significant reduction in violent crime rates through the 1990s, eventually recording 448 homicides in 2004, the lowest total since 1965 (15.65 per 100,000.) Chicago's homicide tally remained steady throughout 2005, 2006, and 2007 with 449, 452, and 435 respectively.

In 2008, murders rebounded to 510, 2nd highest in the country (though not in per capita rate), breaking 500 for the first time since 2003.[115][116] For 2009 the murder count was down about 10% for the year, to 458.[117]

2010 saw Chicago's murder rate at its lowest levels since 1965. Overall, 435 homicides were recorded for the year, a 5% decrease from 2009.[118]

1 answer:

Answer 0 (score: 0)

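You can isolate the [edit]Crime section by splitting the text: take everything after [edit]Crime, then cut it off at the next [edit] heading (for example [edit]Education):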

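import re

with open('file.txt') as f:
    # Keep what follows '[edit]Crime', up to the next '[edit]' heading.
    data = f.read().split('[edit]Crime')[1].split('[edit]')[0]
# Raw string so the \d escape reaches the regex engine unchanged.
years = re.findall(r'\d{4}', data)
print(years)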

Output:

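['1974', '1992', '1990', '2004', '1965', '2005', '2006', '2007', '2008', '2003', '2009', '2010', '1965', '2009']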
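If you would rather keep the line-by-line loop from the question instead of reading the whole file at once, a flag that records whether you are inside the section is one way to do it. The attempt in the question runs the regex against `line`, which at that point is only the heading line, and the inner loop tests `infile` rather than the line it just read. The sketch below is only an illustration: `years_in_section`, `start` and `stop` are names made up here, and it assumes each section begins with a line containing an `[edit]` heading, as in the sample text.

```python
import re

YEAR = re.compile(r'\d{4}')  # any run of four digits, same idea as the question's pattern


def years_in_section(lines, start='[edit]Crime', stop='[edit]'):
    """Collect 4-digit matches from the lines after `start`, up to the next `stop` heading.

    `lines` can be any iterable of strings, for example an open file object.
    """
    years = []
    in_section = False
    for line in lines:
        if not in_section:
            # Still looking for the heading; skip the heading line itself.
            if start in line:
                in_section = True
            continue
        if stop in line:
            # The next heading ends the section.
            break
        years.extend(YEAR.findall(line))
    return years


# In-memory demo; with a real file, pass the file object instead:
#     with open('file.txt') as infile_1:
#         print(years_in_section(infile_1))
demo = [
    "[edit]Crime\n",
    "Murders peaked in 1974 and again in 1992.[114]\n",
    "In 2008, murders rebounded to 510.\n",
    "[edit]Education\n",
    "Founded in 1890.\n",
]
print(years_in_section(demo))  # ['1974', '1992', '2008']
```

Note that `\d{4}` also matches the digits in "1990s". If that is unwanted, a pattern with word boundaries such as `r'\b\d{4}\b'` would skip it.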