Python re.findall returning only first character

Time: 2018-06-15 15:22:16

Tags: python regex python-3.x regex-lookarounds

Working in Python 3.6, I have a list of html files with date prefixes. I'd like to return all dates, so I join the list and use some regex, like so:

import re
snapshots =  ['20180614_SII.html', '20180615_SII.html']
p = re.compile("(\d|^)\d*(?=_)")
snapshot_dates = p.findall(' '.join(snapshots))

snapshot_dates is a list, ['2', '2'], but I'm expecting ['20180614', '20180615']. What am I missing? Demonstration here: https://regexr.com/3r44o

1 answer:

Answer 0 (score: 3)

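You can simplify the pattern to use \d+ instead of (\d|^)\d*. Because the original pattern contains a capturing group, re.findall returns only what that group matched (the first digit) rather than the whole match.

p = re.compile(r"\d+(?=_)")
print(p.findall(' '.join(snapshots)))
# ['20180614', '20180615']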
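If the alternation from the question is worth keeping, here is a minimal sketch of two other ways to get whole matches. It reuses the question's data and pattern; the variable names (joined, with_group, non_capturing, full_matches) are only illustrative.

```python
import re

snapshots = ['20180614_SII.html', '20180615_SII.html']
joined = ' '.join(snapshots)

# With exactly one capturing group, findall returns only that group's text.
with_group = re.findall(r"(\d|^)\d*(?=_)", joined)

# A non-capturing group (?:...) keeps the alternation but yields whole matches.
non_capturing = re.findall(r"(?:\d|^)\d*(?=_)", joined)

# finditer gives match objects; group(0) is always the entire match.
full_matches = [m.group(0) for m in re.finditer(r"(\d|^)\d*(?=_)", joined)]

print(with_group)     # ['2', '2']
print(non_capturing)  # ['20180614', '20180615']
print(full_matches)   # ['20180614', '20180615']
```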
  




However, in this case you may not need regex at all to get the desired result. You can simply split each string on _

print([x.split("_")[0] for x in snapshots])
# ['20180614', '20180615']
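One caveat with the split approach: x.split("_")[0] returns the whole string when a name contains no underscore. The following is a rough sketch of a stricter variant using str.partition. The extra 'README.html' entry is a hypothetical file name added only to show the filtering, and the digits-only check is an assumption about what counts as a date prefix.

```python
# 'README.html' is a hypothetical entry, not part of the original list.
snapshots = ['20180614_SII.html', '20180615_SII.html', 'README.html']

dates = []
for name in snapshots:
    # partition returns (head, sep, tail); sep is '' when '_' is absent.
    head, sep, _tail = name.partition('_')
    # Keep the prefix only if an underscore was found and it is all digits.
    if sep and head.isdigit():
        dates.append(head)

print(dates)  # ['20180614', '20180615']
```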