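I have a simple regular expression that parses source code files and, on each line, extracts the content enclosed in double quotes, for use in a gettext .po file.
This is my regular expression: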
gettext_subject = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall
Here is a sample input file:
exports.onAppointment = (appt, user, lang, isNew) ->
  if not user then return Promise.reject "Appointment has no user."
  moment.locale(lang)
  start = moment(appt.when)
  cal = new ICal()
  console.log appt.when
  cal.addEvent
    start: start.toDate()
    end: moment(start).add(2,"hours").toDate()
    summary: "Continental showroom visit"
  mail =
    to: user.emailId
    subject: if isNew then "New appointment" else "Appointment updated"
    alternatives: [
      contentType: "text/calendar",
      contents: new Buffer(cal.toString()),
      contentEncoding: "7bit"
    ]
  template =
    name: "booking"
    lang: lang
    locals:
      name: "#{user.firstName} #{user.lastName}"
      datetime: moment(appt.when).format("dddd Do MMMM [at] HH:mm A")
      cancelurl: config.server.baseUrl + "/appointment/cancel/#{appt._id}"
  emailClient.send2 mail, template
This code works correctly:
gettext_subject = re.compile(r"""subject: \"(.*?)\"""").findall
and testing from the command line also returns the correct answer:
$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> gettext = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall
>>> pattern = """subject: \"blah blah blah\"\nsummary: \"summary text\"\nsubject: \"second subject line\"\nsummary: if isNew then \"New appointment\" else \"Appointment updated\"\n"""
>>> print gettext(pattern)
['blah blah blah', 'summary text', 'second subject line', 'New appointment', 'Appointment updated']
>>>
But when I run it through my code, it does not work. Here is the code:
import os
import sys
import re
from operator import itemgetter

walk_dir = ["app", "email", "views"]
#t=(" ")
gettext_messages = re.compile(r"""\"(.*)\"""", re.MULTILINE).findall
gettext_re = re.compile(r"""[=|#|{]t\(\"(.*?)\"""").findall
gettext_subject = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall

gettext = []
for x in walk_dir:
    curr_dir = "../node-blade-boiler-template/" + x
    for root, dirs, files in os.walk(curr_dir, topdown=False):
        if ".git" in dirs:
            dirs.remove(".git")
        if "node-modules" in dirs:
            dirs.remove("node-modules")
        if "models" in dirs:
            dirs.remove("models")
        for filename in files:
            file_path = os.path.join(root, filename)
            #print('\n- file %s (full path: %s)' % (filename, file_path))
            with open(file_path, 'rb') as f:
                f_content = f.read()
                if 'messages.coffee' == filename:
                    #pass
                    msgids = gettext_messages(f_content)
                elif 'map.coffee' == filename:
                    pass
                elif 'emailtrigger.coffee' == filename:
                    #print f_content
                    if 'subject: ' in f_content:
                        print gettext_subject(f_content)
                        msgids = gettext_subject(f_content)
                else:
                    msgids = gettext_re(f_content)
                for msgid in msgids:
                    msgid = '"' + msgid + '"'
                    #print msgid
                    dic = {
                        'path' : file_path,
                        'msgid' : "%s" % msgid
                    }
                    gettext.append(dic)
Any suggestions are greatly appreciated.
Answer 0: (score: 1)
(?:Subject:|Summary:)[^"]*"(.*?)"
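You can try this; see the demo. Note that the keys in the sample file are lowercase, so either write subject: and summary: in lowercase or compile the pattern with re.IGNORECASE.
[] is not what you think it is. It is a character class: it matches a single character from the listed set. [subject] matches any one of s, u, b, j, e, c, t, and a repeated class such as [subject]+ would match subject, tcejubs, or those characters in any order. So [subject: |summary: ] accepts any one of its characters (including a space) immediately before the opening quote, not the words subject: or summary:.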
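The difference between the character class and a group can be checked with a minimal sketch like the one below. It assumes Python's standard re module and a shortened version of the sample file. The names sample, char_class, grouped, key_line, quoted and all_quoted_after_keys are illustrative and do not come from the question or the answer. Two details are my own adjustments, not part of the answer's pattern: the keys are written in lowercase to match the sample, and \n is added to the negated class so a match cannot run onto a later line. The two-step variant is one possible way to also pick up the second string on the conditional subject: line.

```python
import re

# A few lines in the spirit of the question's CoffeeScript sample.
sample = '''\
summary: "Continental showroom visit"
subject: if isNew then "New appointment" else "Appointment updated"
name: "booking"
contentEncoding: "7bit"
'''

# Character class: ONE character out of the set, then a quoted string.
# A space is in the set, so any quote preceded by a space qualifies.
char_class = re.compile(r'[subject: |summary: ]"(.*?)"')

# Non-capturing group: the literal text "subject:" or "summary:",
# then anything up to the first quote on the same line.
# (Lowercase keys and the \n in the negated class are assumptions.)
grouped = re.compile(r'(?:subject:|summary:)[^"\n]*"(.*?)"')

# Two-step variant (illustrative, not from the answer): find the lines
# that start with one of the keys, then collect every quoted string
# found on the rest of that line.
key_line = re.compile(r'^\s*(?:subject|summary):(.*)$', re.MULTILINE)
quoted = re.compile(r'"(.*?)"')


def all_quoted_after_keys(text):
    """Return every double-quoted string on subject:/summary: lines."""
    found = []
    for rest in key_line.findall(text):
        found.extend(quoted.findall(rest))
    return found


print(char_class.findall(sample))
print(grouped.findall(sample))
print(all_quoted_after_keys(sample))
```

On this sample the character class returns five strings, including booking and 7bit from unrelated keys. The grouped pattern returns only Continental showroom visit and New appointment, because it takes one string per key. The two-step variant returns those two plus Appointment updated.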