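I want to print every line of an input file in which a string occurs, together with its line number. So far I have written the code shown below. It works, but not quite the way I want: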
def index(filepath, keyword):
    with open(filepath) as f:
        for lineno, line in enumerate(f, start=1):
            matches = [k for k in keyword if k in line]
            if matches:
                result = "{:<15} {}".format(','.join(matches), lineno)
                print(result)
                print(line)
index('deneme.txt', ['elma'])
The output is:
elma 15
Sogan+Noun ,+Punc domates+Noun ,+Punc patates+Noun ,+Punc elma+Noun ve+Conj turunçgil+Noun+A3pl ihracat+Noun+P3sg+Dat devlet+Noun destek+Noun+P3sg ver+Verb+Pass+Prog2+Cop .+Punc
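So far so good, but when I enter a keyword such as "Sog", it also finds Sogan. I don't want that; I only want to match whole tokens delimited by whitespace. I think I need a regular expression for this. I came up with the one below, but I can't work out how to add it to this code: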
r'[\w+]+'
Answer 0 (score: 1)
You can use the following regular expression:
import re
lines = [
    'Sogan+Noun ,+Punc domates+Noun ,+Punc patates+Noun ,+Punc elma+Noun ve+Conj turunçgil+Noun+A3pl ihracat+Noun+P3sg+Dat devlet+Noun destek+Noun+P3sg ver+Verb+Pass+Prog2+Cop .+Punc',
    'Sog+Noun ,+Punc domates+Noun ,+Punc patates+Noun ,+Punc elma+Noun ve+Conj turunçgil+Noun+A3pl ihracat+Noun+P3sg+Dat devlet+Noun destek+Noun+P3sg ver+Verb+Pass+Prog2+Cop .+Punc',
]
keywords = ['Sog']
# Raw string so that \w and \+ reach the regex engine unchanged
pattern = re.compile(r'(\w+)\+')
for lineno, line in enumerate(lines):  # lineno is zero-based here
    words = set(m.group(1) for m in pattern.finditer(line))  # convert to set for efficiency
    matches = [keyword for keyword in keywords if keyword in words]
    if matches:
        result = "{:<15} {}".format(','.join(matches), lineno)
        print(result)
        print(line)
Output
Sog 1
Sog+Noun ,+Punc domates+Noun ,+Punc patates+Noun ,+Punc elma+Noun ve+Conj turunçgil+Noun+A3pl ihracat+Noun+P3sg+Dat devlet+Noun destek+Noun+P3sg ver+Verb+Pass+Prog2+Cop .+Punc
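Explanation
The pattern '(\w+)\+' matches any run of word characters (letters, digits, or underscores) followed by a + character. Because + is a special character in regular expressions, you must escape it to match it literally. group(1) then extracts the captured group, i.e. the run of word characters before the +.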
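To connect this back to the original function, the same token-set idea can be applied to any iterable of lines, including an open file. This is only a sketch: the helper name find_tokens, the returned tuples, and the shortened sample lines are assumptions made for illustration and are not part of the answer above.

```python
import re

# Captures the run of word characters that precedes each '+' in a line.
TOKEN = re.compile(r'(\w+)\+')

def find_tokens(lines, keywords):
    """Return (matches, lineno, line) for lines holding a keyword as a whole token.

    `lines` is any iterable of strings, e.g. an open file object.
    Hypothetical helper, not from the original answer.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        words = set(TOKEN.findall(line))  # all tokens that are followed by '+'
        matches = [k for k in keywords if k in words]
        if matches:
            hits.append((matches, lineno, line.rstrip('\n')))
    return hits

sample = ['Sogan+Noun ,+Punc elma+Noun', 'Sog+Noun ,+Punc elma+Noun']
for matches, lineno, line in find_tokens(sample, ['Sog']):
    print("{:<15} {}".format(','.join(matches), lineno))
    print(line)
```

With a real file, the call would be wrapped in `with open(filepath) as f:` and `f` passed as `lines`. Only the second sample line is reported here, because Sogan is a different token from Sog.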
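Answer 1 (score: 1)
You may want to use the word boundary marker \b. It is an empty match at the transition between \w and \W. If you want the keywords to be treated as literal strings, you must escape them first with re.escape. You can use | to combine everything into a single regular expression: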
pattern = re.compile(r'\b(' + '|'.join(map(re.escape, keyword)) + r')\b')
OR
pattern = re.compile(r'\b(?:' + '|'.join(re.escape(k) for k in keyword) + r')\b')
Computing the matches is now a little easier, because you can use finditer instead of writing the comprehension yourself:
matches = pattern.finditer(line)
Since each match is wrapped in a group, printing is not difficult:
result = "{:<15} {}".format(','.join(m.group() for m in matches), lineno)
OR
result = "{:<15} {}".format(','.join(map(re.Match.group, matches)), lineno)
And of course, don't forget
import re
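Putting the pieces of this answer together might look like the sketch below. The helper name find_keywords and the shortened in-memory sample lines are assumptions for illustration. One detail to watch: finditer returns an iterator, which is always truthy, so the sketch collects the matched strings into a list before testing whether anything matched.

```python
import re

def find_keywords(lines, keyword):
    # Hypothetical helper: join the escaped keywords into one alternation
    # and wrap it in word boundaries.
    pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, keyword)) + r')\b')
    results = []
    for lineno, line in enumerate(lines, start=1):
        # finditer() returns an iterator, which is always truthy, so build
        # a list of the matched strings before checking for emptiness.
        matches = [m.group() for m in pattern.finditer(line)]
        if matches:
            results.append((matches, lineno, line))
    return results

sample = ['Sogan+Noun ,+Punc elma+Noun', 'Sog+Noun ,+Punc elma+Noun']
for matches, lineno, line in find_keywords(sample, ['Sog', 'elma']):
    print("{:<15} {}".format(','.join(matches), lineno))
    print(line)
```

Note that \b treats + as a non-word character, so in data like this a keyword such as Noun would also match the tag part of a token, not only the part before the first +.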
Corner cases
If some of your keywords are prefixes of others, make sure the longer keywords come first. For example, if you have
keyword = ['foo', 'foobar']
the regular expression will be
\b(foo|foobar)\b
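When you encounter a line that contains foobar, foo matches first and the \b after it then fails, so the engine has to backtrack before it can try foobar. Trying the alternatives from left to right is documented behavior of |. To make the longest keyword win directly, pre-sort all keywords by decreasing length before constructing the expression: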
keyword.sort(key=len, reverse=True)
Or, if the input may not be a list:
keyword = sorted(keyword, key=len, reverse=True)
If you don't like this order, you can always print the matches in a different order after matching.
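A small experiment, assuming the two-keyword list from this section, shows how the order of alternatives affects what gets matched. The variable names are illustrative only.

```python
import re

text = 'foobar'

# Without a trailing anchor, the first alternative that matches wins.
unsorted_hit = re.search(r'foo|foobar', text).group()

# Sorting by decreasing length puts the longer keyword first.
keyword = sorted(['foo', 'foobar'], key=len, reverse=True)
sorted_hit = re.search('|'.join(map(re.escape, keyword)), text).group()

# With \b on both sides, foo is rejected at the trailing \b and the
# engine backtracks to the second alternative.
anchored_hit = re.search(r'\b(foo|foobar)\b', text).group()

print(unsorted_hit)   # foo
print(sorted_hit)     # foobar
print(anchored_hit)   # foobar
```

Sorting by length costs little and keeps the result independent of whether the pattern ends in an anchor.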
Answer 2 (score: 1)
Question: a keyword like "Sog" also finds Sogan ... I only want the tokens between whitespace. ... How can I add that regex to this code?
Build the regex with format, and use the | separator to give several keywords as alternatives.
For example:
import re

def index(lines, keyword):
    # Raw string so that \+ and \s reach the regex engine unchanged
    rc = re.compile(r".*?(({})\+.+?\s)".format(keyword))
    for i, line in enumerate(lines):
        match = rc.match(line)
        if match:
            print("lines[{}] match:{}\n{}".format(i, match.groups(), line))

if __name__ == "__main__":
    lines = [
        'Sogan+Noun ,+Punc domates+Noun ,+Punc patates+Noun ,+Punc elmaro+Noun ve+Conj ... (omitted for brevity)',
        'Sog+Noun ,+Punc domates+Noun ,+Punc patates+Noun ,+Punc elma+Noun ve+Conj ... (omitted for brevity)',
    ]
    index(lines, 'elma')
    index(lines, 'Sog|elma')
Tested with Python: 3.5