Regex does not match the pattern between quotes?

Asked: 2015-05-15 08:39:57

Tags: python regex

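I have a simple regular expression that parses source code files and, for each line, extracts the content enclosed in double quotes for use in a gettext .po file.

Here is my regular expression: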

gettext_subject = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall

Here is a sample input file:

exports.onAppointment = (appt, user, lang, isNew) ->
  if not user then return Promise.reject "Appointment has no user."
  moment.locale(lang)
  start = moment(appt.when)
  cal = new ICal()
  console.log appt.when
  cal.addEvent
    start: start.toDate()
    end: moment(start).add(2,"hours").toDate()
    summary: "Continental showroom visit"
  mail =
    to: user.emailId
    subject: if isNew then "New appointment" else "Appointment updated"
    alternatives: [
        contentType: "text/calendar",
        contents: new Buffer(cal.toString()),
        contentEncoding: "7bit"
      ]
  template =
    name: "booking"
    lang: lang
    locals:
      name: "#{user.firstName} #{user.lastName}"
      datetime: moment(appt.when).format("dddd Do MMMM [at] HH:mm A")
      cancelurl: config.server.baseUrl + "/appointment/cancel/#{appt._id}"
  emailClient.send2 mail, template

This version works correctly:

gettext_subject = re.compile(r"""subject: \"(.*?)\"""").findall

Testing the original pattern from the command line also returns the correct answer:

$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> gettext = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall
>>> pattern = """subject: \"blah blah blah\"\nsummary: \"summary text\"\nsubject: \"second subject line\"\nsummary: if isNew then \"New appointment\" else \"Appointment updated\"\n"""
>>> print gettext(pattern)
['blah blah blah', 'summary text', 'second subject line', 'New appointment', 'Appointment updated']
>>> 

But when I run it through my script, it does not work. Here is the code:

import os
import sys
import re
from operator import itemgetter

walk_dir = ["app", "email", "views"]
#t=(" ")
gettext_messages = re.compile(r"""\"(.*)\"""", re.MULTILINE).findall
gettext_re = re.compile(r"""[=|#|{]t\(\"(.*?)\"""").findall
gettext_subject = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall

gettext = []
for x in walk_dir:
    curr_dir = "../node-blade-boiler-template/" + x
    for root, dirs, files in os.walk(curr_dir, topdown=False):
        if ".git" in dirs:
            dirs.remove(".git")
        if "node-modules" in dirs:
            dirs.remove("node-modules")
        if "models" in dirs:
            dirs.remove("models")

        for filename in files:
            file_path = os.path.join(root, filename)
            #print('\n- file %s (full path: %s)' % (filename, file_path))
            with open(file_path, 'rb') as f:
                f_content = f.read()
                if 'messages.coffee' == filename:
                    #pass
                    msgids = gettext_messages(f_content)
                elif 'map.coffee' == filename:
                    pass
                elif 'emailtrigger.coffee' == filename:
                    #print f_content
                    if 'subject: ' in f_content:
                        print gettext_subject(f_content)
                        msgids = gettext_subject(f_content)

                else:
                    msgids = gettext_re(f_content)
                for msgid in msgids:
                    msgid = '"' + msgid + '"'
                    #print msgid
                    dic = {
                    'path' : file_path,
                    'msgid' : "%s" % msgid
                    }
                    gettext.append(dic)

Any suggestions are much appreciated.

1 Answer:

Answer 0: (score: 1)

(?:subject:|summary:)[^"]*"(.*?)"

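You can try this. [] does not do what you think: it is a character class. [subject: |summary: ] matches a single character that is any one of the characters listed inside the brackets (s, u, b, j, e, c, t, a colon, a space, a pipe, and so on), not the word "subject" or "summary". That is why any quoted string preceded by a space or by one of those letters gets picked up. To match one of several literal words, use a group with alternation, (?:...|...), instead.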
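
To make the difference concrete, here is a minimal sketch comparing the character class from the question with a non-capturing group. The sample string is three lines lifted from the question's input file; the variable names are made up for illustration, and the snippet is written so that it runs on Python 3 as well as the Python 2.7 shown in the question.

```python
import re

# Three lines taken from the question's sample input.
sample = (
    'if not user then return Promise.reject "Appointment has no user."\n'
    'summary: "Continental showroom visit"\n'
    'contentType: "text/calendar",\n'
)

# Character class: ONE character out of the listed set, then a quoted string.
char_class = re.compile(r'[subject: |summary: ]"(.*?)"')

# Non-capturing group: the literal text "subject: " or "summary: ",
# then a quoted string.
alternation = re.compile(r'(?:subject: |summary: )"(.*?)"')

class_hits = char_class.findall(sample)
group_hits = alternation.findall(sample)

print(class_hits)
# ['Appointment has no user.', 'Continental showroom visit', 'text/calendar']
print(group_hits)
# ['Continental showroom visit']
```

The character class version also picks up the Promise.reject message and the contentType value, because each of those opening quotes happens to be preceded by a space, and a space is one of the characters in the class.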

See the demo:

https://regex101.com/r/mT0iE7/33#python
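
One thing to watch for: a pattern that captures a single quoted string after the keyword returns only "New appointment" from the subject: line in the question, which holds two strings. If both are wanted, one possible approach (a sketch, not the only way to do it) is to first pick out the lines that begin with subject: or summary:, then collect every quoted string on each of those lines. The helper name extract_msgids and the two pattern names are hypothetical.

```python
import re

# Lines whose key is "subject" or "summary"; group 1 is the rest of the line.
KEY_LINE = re.compile(r'^[ \t]*(?:subject|summary):(.*)$', re.MULTILINE)

# Any double-quoted string (non-greedy, stays on one line).
QUOTED = re.compile(r'"(.*?)"')


def extract_msgids(source):
    """Return every quoted string found on subject:/summary: lines."""
    msgids = []
    for rest_of_line in KEY_LINE.findall(source):
        msgids.extend(QUOTED.findall(rest_of_line))
    return msgids


# A few lines from the question's sample input.
sample = '''\
    end: moment(start).add(2,"hours").toDate()
    summary: "Continental showroom visit"
  mail =
    to: user.emailId
    subject: if isNew then "New appointment" else "Appointment updated"
    alternatives: [
        contentType: "text/calendar",
'''

print(extract_msgids(sample))
# ['Continental showroom visit', 'New appointment', 'Appointment updated']
```

Splitting the work into two patterns keeps each one simple: the first decides which lines count, the second only has to find quoted strings.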