I want to rewrite a small Perl program in Python. I use it to process a text file as follows:
Input:
00000001;Root;;
00000002; Documents;;
00000003; oracle-advanced_plsql.zip;file;
00000004; Public;;
00000005; backup;;
00000006; 20110323-JM-F.7z.001;file;
00000007; 20110426-JM-F.7z.001;file;
00000008; 20110603-JM-F.7z.001;file;
00000009; 20110701-JM-F-via-summer_school;;
00000010; 20110701-JM-F-yyy.7z.001;file;
Expected output:
00000001;;Root;;
00000002; ;Documents;;
00000003; ;oracle-advanced_plsql.zip;file;
00000004; ;Public;;
00000005; ;backup;;
00000006; ;20110323-JM-F.7z.001;file;
00000007; ;20110426-JM-F.7z.001;file;
00000008; ;20110603-JM-F.7z.001;file;
00000009; ;20110701-JM-F-via-summer_school;;
00000010; ;20110701-JM-F-yyy.7z.001;file;
Here is the working Perl code:
#!/usr/bin/perl -w
# filename: perl_regex.pl
while(<>) {
s/^(.*?;.*?)(\w)/$1;$2/;
print $_;
}
It is invoked from the command line as: perl_regex.pl input.txt
Explanation of the Perl-style regular expression:
s/ # start search-and-replace regexp
^ # start at the beginning of this line
( # save the matched characters until ')' in $1
.*?; # go forward until finding the first semicolon
.*? # go forward until finding... (to be continued below)
)
( # save the matched characters until ')' in $2
\w # ... the next alphanumeric character.
)
/ # continue with the replace part
$1;$2 # write all characters found above, but insert a ; before $2
/ # finish the search-and-replace regexp.
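Can anyone tell me how to get the same result in Python? In particular, I could not find anything equivalent to the $1 and $2 variables.
Answer 0 (score: 2)
The Python counterpart of Perl's s/pattern/replace/ substitution is the re.sub(pattern, replace, string) function, or re.compile(pattern).sub(replace, string). In your case, you would do this: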
import re

_re_pattern = re.compile(r"^(.*?;.*?)(\w)")
result = _re_pattern.sub(r"\1;\2", line)
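Note that $1 becomes \1, and $2 becomes \2. Unlike Perl's while(<>), Python has no implicit input loop, so you need to iterate over your lines yourself in whatever way suits you (open, inputfile, splitlines, ...).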
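A minimal sketch of that iteration step, assuming the input is any iterable of lines (an open file object works as well). The helper name convert_lines and the two sample lines are only illustrative and are not part of the answer above:

```python
import re
import sys

# Same pattern as above: group 1 runs through the first semicolon and any
# following non-word characters, group 2 is the first word character after it.
_re_pattern = re.compile(r"^(.*?;.*?)(\w)")


def convert_lines(lines):
    """Yield each line with ';' inserted before group 2 (hypothetical helper)."""
    for line in lines:
        # Lines that do not match the pattern are passed through unchanged.
        yield _re_pattern.sub(r"\1;\2", line)


# Illustrative input taken from the question; each line keeps its newline.
sample = ["00000001;Root;;\n", "00000002; Documents;;\n"]
sys.stdout.writelines(convert_lines(sample))
# prints:
# 00000001;;Root;;
# 00000002; ;Documents;;
```

To process a real file, the same generator can be fed an open file, for example `with open("input.txt") as infile: sys.stdout.writelines(convert_lines(infile))`.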
Answer 1 (score: 1)
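Python regular expressions are very similar to Perl's, except:
- Python has no regular expression literal; the pattern is written as a string, typically a raw string such as r'raw string literal'.
- Backreferences in the replacement are written \1, \2, ... or \g<1>, \g<2>, ...
Use re.sub to do the replacement.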
import re
import sys

for line in sys.stdin:  # Explicitly iterate over standard input line by line
    # `line` still contains its trailing newline
    line = re.sub(r'^(.*?;.*?)(\w)', r'\1;\2', line)
    # print(line) would add a second newline, so write the line as-is
    sys.stdout.write(line)  # Write the replaced string back out
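The loop above reads only standard input, whereas the Perl script is called with a file name. One possible way to mirror Perl's while(<>) is the standard fileinput module, which reads the named files and falls back to standard input when none are given. The sketch below also uses the \g<1> backreference form; the names add_separator and convert are hypothetical and chosen for illustration only:

```python
import fileinput
import re

PATTERN = re.compile(r'^(.*?;.*?)(\w)')


def add_separator(line):
    """Insert ';' before the first word character after the first semicolon."""
    # \g<1> and \g<2> mean the same as \1 and \2; the longer form avoids
    # ambiguity when a digit directly follows the backreference.
    return PATTERN.sub(r'\g<1>;\g<2>', line)


def convert(files):
    """Yield converted lines from the given files (stdin if the list is empty)."""
    with fileinput.input(files=files) as stream:
        for line in stream:
            yield add_separator(line)


# In a script this could be driven by the command-line arguments, e.g.:
#     import sys
#     sys.stdout.writelines(convert(sys.argv[1:]))
```

With that driver in place, both `python script.py input.txt` and `python script.py < input.txt` should behave like the Perl version.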