In Python

Time: 2018-12-21 21:39:42

Tags: python string

I have a long list of strings that contain substrings of interest in a given order. Here is a small example using sentences in a text file:

This is a long drawn out sentence needed to emphasize a topic I am trying to learn.
It is new idea for me and I need your help with it please!
Thank you so much in advance, I really appreciate it.

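I want to find all sentences in this text file that contain both "I" and "need", but they must appear in that order.

In this example, 'I' and 'need' both appear in sentence 1 and sentence 2, but in sentence 1 they are in the wrong order, so I don't want it returned. I only want the second sentence returned, because it contains 'I need' in that order.

I have used the following code to identify the substrings, but I can't figure out how to match them only when they appear in order: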

id1 = "I"
id2 = "need"

with open('fun.txt') as f:
    for line in f:
        if id1 and id2 in line:
            print(line[:-1])

This returns:

This is a long drawn out sentence needed to emphasize a topic I am trying to learn.
It is new idea for me and I need your help with it please!

But I only want:

It is new idea for me and I need your help with it please!

Thanks!

4 answers:

Answer 0: (score: 1)

You need to look for id2 in the part of the line that comes after id1:

infile = [
    "This is a long drawn out sentence needed to emphasize a topic I am trying to learn.",
    "It is new idea for me and I need your help with it please!",
    "Thank you so much in advance, I really appreciate it.",
]

id1 = "I"
id2 = "need"

for line in infile:
    if id1 in line:
        # Position of the first occurrence of id1
        pos1 = line.index(id1)
        # Only search for id2 in the text that follows id1
        if id2 in line[pos1 + len(id1):]:
            print(line)

Output:

It is new idea for me and I need your help with it please!
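
The same idea extends to any number of substrings. The following is a minimal sketch, not part of the original answer: the helper name contains_in_order is hypothetical, and it relies on str.find accepting a start position so that each search begins where the previous match ended.

```python
def contains_in_order(line, substrings):
    """Return True if every substring occurs in line, each after the previous one."""
    pos = 0
    for sub in substrings:
        found = line.find(sub, pos)  # search only from the end of the last match
        if found == -1:
            return False
        pos = found + len(sub)
    return True


sentences = [
    "This is a long drawn out sentence needed to emphasize a topic I am trying to learn.",
    "It is new idea for me and I need your help with it please!",
    "Thank you so much in advance, I really appreciate it.",
]

matches = [s for s in sentences if contains_in_order(s, ["I", "need"])]
print(matches)
```

Like the loop above, this matches substrings rather than whole words, so "I" is also found inside "It".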

Answer 1: (score: 1)

You can check this with a regular expression. One possible solution is:

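import re

id1 = "I"
id2 = "need"
# re.escape guards against regex metacharacters in the search terms
regex = re.compile(r'^.*{}.*{}.*$'.format(re.escape(id1), re.escape(id2)))

with open('fun.txt') as f:
    for line in f:
        if regex.search(line):
            print(line.rstrip('\n'))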
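
This pattern matches substrings, so "I" also matches inside "It" and "need" inside "needed". If whole words are what you want, one option is to wrap each term in \b word boundaries. A minimal sketch under that assumption; the helper name build_ordered_word_pattern is hypothetical and the sentences are copied from the question:

```python
import re


def build_ordered_word_pattern(words):
    """Compile a regex that matches the given whole words in order."""
    escaped = (r"\b" + re.escape(w) + r"\b" for w in words)
    return re.compile(".*".join(escaped))


pattern = build_ordered_word_pattern(["I", "need"])

sentences = [
    "This is a long drawn out sentence needed to emphasize a topic I am trying to learn.",
    "It is new idea for me and I need your help with it please!",
    "Thank you so much in advance, I really appreciate it.",
]

matching = [s for s in sentences if pattern.search(s)]
for s in matching:
    print(s)
```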

Answer 2: (score: 0)

Simply use:

import re
# re.search scans the whole string; re.match only matches at the start
match = re.search('pattern', 'yourString')

https://developers.google.com/edu/python/regular-expressions

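So the pattern you are looking for is 'I(.*)need' (see "Regex Match all characters between two strings"). You may have to build the pattern differently, because I don't know whether there are exceptions. If there are, you can run the regex twice: once to get a subset of the original string, and again to get the exact match you need.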
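
One detail worth keeping in mind with this pattern: re.match only matches at the beginning of the string, whereas re.search looks for a match anywhere in it. A small sketch of the difference, using a made-up example line that is not from the question:

```python
import re

pattern = re.compile(r"I(.*)need")

# Hypothetical example line, chosen so that "I" is not the first character
line = "Honestly, I need help"

at_start = pattern.match(line)   # anchored at the start of the string
anywhere = pattern.search(line)  # scans the whole string

print(at_start)
print(anywhere.group(1))
```

Here match returns None because the line starts with "H", while search finds "I need" and captures the single space between the two words.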

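Answer 3: (score: 0)

You can define a function that computes the intersection of two sets (the words of each sentence and the words of I need), then use sorted with a key so that the result is ordered the same way as in the sentence. You can then check whether the order of the resulting list matches the order in I need: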

a = ['I','need']
l = ['This is a long drawn out sentence needed to emphasize a topic I am trying to learn.',
'It is new idea for me and I need your help with it please!',
'Thank you so much in advance, I really appreciate it.']

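The custom function. It returns True if the string contains the words in the same order:

def same_order(l1, l2):
    words = l2.split(' ')
    # Words common to both, ordered by first appearance in the sentence
    inters = sorted(set(l1) & set(words), key=words.index)
    return inters == l1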

Return the strings in list l for which the function returns True:

[s for s in l if same_order(a, s)]
#['It is new idea for me and I need your help with it please!']
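
Because the sort key uses the first occurrence of each word, a sentence such as 'need I need help' is rejected even though I need does appear in order. A possible alternative, sketched here with a hypothetical helper name words_in_order, is a subsequence check over the tokens: testing membership against an iterator consumes it, so each word is searched for only after the previous one was found.

```python
def words_in_order(words, sentence):
    """Return True if words appear as whole tokens in sentence, in the given order."""
    tokens = iter(sentence.split())
    # "w in tokens" advances the iterator past the match, preserving order
    return all(w in tokens for w in words)


a = ['I', 'need']
l = ['This is a long drawn out sentence needed to emphasize a topic I am trying to learn.',
     'It is new idea for me and I need your help with it please!',
     'Thank you so much in advance, I really appreciate it.']

result = [s for s in l if words_in_order(a, s)]
print(result)
```

Like the set-based version, this splits on whitespace only, so a token with attached punctuation such as "need," would not count as "need".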