Extract the month name from a raw string?

Time: 2017-05-26 09:07:19

Tags: python monthcalendar

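I want to extract the month name from a raw string. My current approach is to create a master tuple of month names and check each one against the string. Example inputs: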

'January 2045 Robots'
'2065 March Mars Colony'
'2089 December Alien'

Is there a more elegant or Pythonic way to achieve this?

Note: for now, the requirement is that the input string contains only a single month (not multiple).

s = 'January 2045 Robots'
months_master = ('january','feb','march','april','may','june','july','august','september','october','november','december')
month = [i for i in months_master if i in s.casefold()]
print(month[0])
# output: january

4 answers:

Answer 0: (score: 1)

You can import the month names from calendar, and use a generator expression instead of a list comprehension:

>>> from calendar import month_name
>>> s = 'January 2045 Robots'
>>> months = set(m.lower() for m in month_name[1:])
>>> next((x for x in s.lower().split() if x in months), None)
'january'

Alternatively, you can use a regular expression:

>>> import re
>>> pattern = "|".join(month_name[1:])
>>> re.search(pattern, s, re.IGNORECASE).group(0)
'January'
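
One caveat with the plain alternation above: it also matches inside longer words (the "May" in "Maybe", for instance), and re.search returns None when nothing matches, so calling .group(0) directly would raise an AttributeError. Below is a minimal sketch of a stricter variant, assuming whole-word matches are wanted and that the default English locale is in effect. The helper name find_month is hypothetical and not part of the answer.

```python
import re
from calendar import month_name

# One alternation of the full month names, anchored on word boundaries.
# month_name[0] is an empty string, so it is skipped.
MONTH_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, month_name[1:])) + r")\b",
    re.IGNORECASE,
)

def find_month(text):
    """Return the first month name found in text, or None (hypothetical helper)."""
    match = MONTH_RE.search(text)
    return match.group(0) if match else None

print(find_month("2065 March Mars Colony"))  # March
print(find_month("Maybe 2045 Robots"))       # None
```

The match is returned exactly as it appears in the input, so call .lower() or .title() on the result if a normalised form is needed.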

Answer 1: (score: 0)

Split the text into words (or tokenize it) and check whether each word is in the list of months:

text = 'January 2045 Robots'
# Full month names in lowercase, so that whole-word comparison can match every month
month_master = ('january','february','march','april','may','june','july','august','september','october','november','december')
month_found = [word for word in text.split() if word.lower() in month_master]
print(month_found)
# output: ['January']

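Answer 2: (score: 0)

You can store the months in a set instead of a tuple and check whether each word is in that set. This reduces the time complexity from O(N * M), where N is the length of the string and M is the length of the months_master tuple, to O(N). Something like this: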

    # Set literal: membership tests take O(1) on average
    months_master = {"january", "february", ...}
    month = [word for word in s.casefold().split() if word in months_master]
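
For a runnable version of this idea, the set can be built from calendar.month_name rather than typed out by hand. This is only a sketch, assuming the default English locale. The helper name first_month and the punctuation stripping are illustrative additions, not part of the answer.

```python
import string
from calendar import month_name

# Lowercased full month names; month_name[0] is '' and is skipped.
MONTHS = {m.casefold() for m in month_name[1:]}

def first_month(s):
    """Return the first whole word of s that is a month name, else None."""
    for word in s.casefold().split():
        # Strip surrounding punctuation so 'March,' still matches.
        word = word.strip(string.punctuation)
        if word in MONTHS:  # average O(1) set membership test
            return word
    return None

print(first_month("2089 December Alien"))  # december
print(first_month("No month here"))        # None
```

Returning on the first hit suits the stated requirement of a single month per string and avoids scanning the remaining words.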

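Answer 3: (score: 0)

The calendar module provides the localized month names as an array-like sequence called month_name. This sequence contains an empty string at index 0, so you need to filter that out, and the months appear in title case ("January" etc.), so you need to account for that as well. We do both with if x and x in s.title(), which is false whenever x is the empty string.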

from calendar import month_name
s = 'January 2045 Robots'
month = [x for x in month_name if x and x in s.title()]
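
To see why the empty-string check matters, it helps to look at what month_name actually contains. The sketch below assumes the default English locale. Mapping the name back to its month number with list.index is an illustrative extra, not something the answer does.

```python
from calendar import month_name

names = list(month_name)
print(names[0] == "")  # True: index 0 is an empty placeholder
print(len(names))      # 13

s = '2089 december Alien'
# s.title() turns 'december' into 'December', matching the title-case names.
# Without the "x and" guard, '' would pass the test, since '' is in every string.
month = [x for x in month_name if x and x in s.title()]
print(month)                  # ['December']
print(names.index(month[0]))  # 12
```

Because index 0 is the placeholder, the position of a name in the list is also its calendar month number.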