I have two files:
efile = c:\myexternal.txt
cfile = c:\mycurrent.txt
myexternal.txt:
Paris
London
Amsterdam
New York
mycurrent.txt (but it can be any text):
Paris is a city in France
A city in the UK is London
In the USA there is no city named Manchester
Amsterdam is in the Netherlands
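What I want to do is search the current file for every line of the external file (raw text), but wrapped in regex boundaries.
For example:
I want to find all cities from externalfile in currentfile, but not cities that are preceded by "is", and every city must be followed by whitespace or sit at the end of the line: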
boundO = "(?<!is\s)"
boundC = "(?=\s|$)"
#boundO + line in externalfile + boundC
#(regex rawtext regex)
#put every line of external file (c:\myexternal.txt) in list:
externalfile=[]
with open(efile, 'r+', encoding="utf8") as file:
    for line in file:
        if line.strip(): #if line != empty
            line=line.rstrip("\n") #remove linebreaks
            line=boundO + line + boundC #add regex boundaries
            externalfile.append(line)
results = []
#check every line in c:\mycurrent.txt
with open(cfile, 'r+', encoding="utf8") as file:
    for line in file:
        if any(ext in line for ext in externalfile):
            results.append(line)
This does not work:
the boundaries are not treated as regular expressions.
What am I doing wrong?
Answer 0 (score: 1)
You need re.search. Use:
import re

with open("check.pl", 'r+') as file:
    for line in file:
        if any(re.search(ext, line) for ext in externalfile): # <---here
            print(line)
            results.append(line)
Output:
Paris is a city in France
Amsterdam is in the Netherlands
[Finished in 0.0s]
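To make the difference concrete, here is a minimal sketch that runs both tests on the sample data from the question. It keeps the data in memory instead of reading the two files, and the names cities, current_lines, substring_hits and regex_hits are illustrative, not from the original code.

```python
import re

# Sample data from the question, held in memory instead of read from files.
cities = ["Paris", "London", "Amsterdam", "New York"]
current_lines = [
    "Paris is a city in France",
    "A city in the UK is London",
    "In the USA there is no city named Manchester",
    "Amsterdam is in the Netherlands",
]

bound_open = r"(?<!is\s)"   # not preceded by "is" plus one whitespace character
bound_close = r"(?=\s|$)"   # followed by whitespace or end of line
patterns = [bound_open + city + bound_close for city in cities]

# `in` looks for the pattern text itself, lookarounds included,
# so no line can ever contain it.
substring_hits = [line for line in current_lines
                  if any(p in line for p in patterns)]

# re.search interprets each string as a regular expression.
regex_hits = [line for line in current_lines
              if any(re.search(p, line) for p in patterns)]

print(substring_hits)
print(regex_hits)
```

The first list comes out empty, while the second holds the same two lines as the output above.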
Edit:
I'm not sure, but take a look at this:
import re

boundO = r"(?<!is\s)\b"
boundC = r"(?=\s|$)"
#boundO + line in externalfile + boundC
#(regex rawtext regex)
#put every line of external file (c:\myexternal.txt) in list:
externalfile=[]
with open("check", 'r+') as file:
    for line in file:
        if line.strip(): #if line != empty
            line=line.rstrip("\n") #remove linebreaks
            #line=boundO + line + boundC #add regex boundaries
            externalfile.append(line)
results = []
print(externalfile)
#check every line in c:\mycurrent.txt
with open("check.pl", 'r+') as file:
    for line in file:
        if any(re.search(boundO + ext + boundC, line) for ext in externalfile):
            print(line)
            results.append(line)
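As a rough illustration of what the extra \b changes, the sketch below compares the opening boundary with and without it. The city "York" and the sample line are made up for this example; they are not part of the original files.

```python
import re

bound_open_plain = r"(?<!is\s)"     # original opening boundary
bound_open_word = r"(?<!is\s)\b"    # opening boundary with a word boundary added
bound_close = r"(?=\s|$)"

city = "York"                            # hypothetical search term
line = "He moved to NewYork last year"   # hypothetical line, city glued to "New"

# Without \b the city is found inside the longer word "NewYork".
plain_match = re.search(bound_open_plain + city + bound_close, line)

# With \b there must be a word boundary right before "York", and there is
# none between "w" and "Y", so nothing matches.
word_match = re.search(bound_open_word + city + bound_close, line)

print(plain_match is not None)
print(word_match is None)
```

So the word boundary mainly guards against a city name that forms the tail of a longer word.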
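Answer 1 (score: 1)
A regular expression has to be compiled (or passed to a function of the re module) before it acts as a pattern.
ext in line
only tests whether the string ext occurs literally in line. You should use something like this instead: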
import re
regc=re.compile(ext)
regc.search(line)
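Applied to the question, one way to use this is to compile each pattern once while building the list and then reuse the compiled objects for every line. This is only a sketch on in-memory sample data; the names compiled, lines and matches are assumptions made for the example. Note that re.search also compiles and caches patterns internally, so compiling up front is optional and mostly a matter of tidiness.

```python
import re

cities = ["Paris", "London"]   # stand-in for the lines of the external file

# Compile each bounded pattern once, up front.
compiled = [re.compile(r"(?<!is\s)" + city + r"(?=\s|$)") for city in cities]

lines = [
    "Paris is a city in France",
    "A city in the UK is London",
]

# Reuse the compiled pattern objects for every line of the current file.
matches = [line for line in lines if any(rx.search(line) for rx in compiled)]
print(matches)
```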
Answer 2 (score: 1)
You have to use re.search instead of the in operator:
if any(re.search(ext, line) for ext in externalfile):
And, to prevent the text from the file from being interpreted as a regular expression, wrap it in re.escape before adding the boundaries.
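A small sketch of the re.escape step follows. The helper build_pattern, the city "St. Louis" and both sample lines are hypothetical and only serve to show the effect of a regex metacharacter in the raw text.

```python
import re

bound_open = r"(?<!is\s)"
bound_close = r"(?=\s|$)"

def build_pattern(raw_text):
    """Wrap raw text in the boundaries, escaping regex metacharacters first."""
    return bound_open + re.escape(raw_text) + bound_close

city = "St. Louis"   # hypothetical entry containing a regex metacharacter
line_literal = "They flew to St. Louis yesterday"
line_lookalike = "They flew to StX Louis yesterday"

# Unescaped, the "." matches any character, so the look-alike line matches too.
unescaped_hit = re.search(bound_open + city + bound_close, line_lookalike)

# Escaped, only the literal text matches.
escaped_lookalike = re.search(build_pattern(city), line_lookalike)
escaped_literal = re.search(build_pattern(city), line_literal)

print(unescaped_hit is not None)
print(escaped_lookalike is None)
print(escaped_literal is not None)
```

Only the text read from the file is escaped; the boundaries stay unescaped so that they keep working as regex lookarounds.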