How can I extract IDs from a txt file with a regular expression?

Time: 2018-07-09 10:36:48

Tags: python regex

I want to use a regular expression to extract the IDs from a text file that looks like this:

Id:   1
ASIN: 0827229534
  title: Patterns of Preaching: A Sermon Sampler
  group: Book
  salesrank: 396585
  similar: 5  0804215715  156101074X  0687023955  0687074231  082721619X
  categories: 2
   |Books[283155]|Subjects[1000]|Religion & Spirituality[22]|Christianity[12290]|Clergy[12360]|Preaching[12368]
   |Books[283155]|Subjects[1000]|Religion & Spirituality[22]|Christianity[12290]|Clergy[12360]|Sermons[12370]
  reviews: total: 2  downloaded: 2  avg rating: 5
    2000-7-28  cutomer: A2JW67OY8U6HHK  rating: 5  votes:  10  helpful:   9
    2003-12-14  cutomer: A2VE83MZF98ITY  rating: 5  votes:   6  helpful:   5  

This is my code so far, but it returns an empty list. Can anyone help me?

import pandas as pd
import re
regex=r'^Id:(\s*\d*)'
textfile = open("amazon-meta.txt", 'r')
filetext = textfile.read()
matches = re.findall(regex, filetext)
matches

1 answer:

Answer 0 (score: 0)

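Try passing flags=re.MULTILINE. Without it, ^ matches only at the very start of the whole string, so the pattern finds nothing unless the file itself begins with Id:. With the flag, ^ also matches at the start of every line.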

For example:

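import re
filename = "amazon-meta.txt"  # the file name used in the question
with open(filename, "r") as infile:
    # re.MULTILINE lets ^ match at the start of every line; \d+ requires at least one digit
    print(re.findall(r'^Id:\s*(\d+)', infile.read(), flags=re.MULTILINE))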
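
To see the effect of the flag without needing the real file, here is a minimal sketch that runs on an inline sample. The sample text, the header line, the second record, and the helper name iter_ids are assumptions made for illustration; only the first record's ASIN comes from the question. The helper also shows one possible way to read a large file line by line instead of loading it all with read().

```python
import re

# Hypothetical sample: a header line comes first, so the text does not
# begin with "Id:". The second record is made up for illustration.
sample = (
    "# header line\n"
    "\n"
    "Id:   1\n"
    "ASIN: 0827229534\n"
    "  title: Patterns of Preaching: A Sermon Sampler\n"
    "\n"
    "Id:   2\n"
    "ASIN: 0000000000\n"
)

pattern = r'^Id:\s*(\d+)'

# Without re.MULTILINE, ^ anchors only at the very start of the string.
without_flag = re.findall(pattern, sample)

# With re.MULTILINE, ^ also anchors right after each newline.
with_flag = re.findall(pattern, sample, flags=re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # ['1', '2']


def iter_ids(lines):
    """Yield IDs as ints from an iterable of lines (e.g. an open file)."""
    line_pattern = re.compile(r'Id:\s*(\d+)')
    for line in lines:
        # match() anchors at the start of each line, so no flag is needed.
        m = line_pattern.match(line)
        if m:
            yield int(m.group(1))


print(list(iter_ids(sample.splitlines())))  # [1, 2]
```

With a real file, something like list(iter_ids(open("amazon-meta.txt"))) should give the same kind of result while holding only one line in memory at a time.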