Question

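What regex-based function f, given an input text and a string, returns all the words in the text that contain that string? For example:

f("This is just a simple text to test some basic things", "si")

would return:

["simple", "basic"]

(because these two words contain the substring "si").

How can I do this?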

Answer 1

I can't believe there isn't a better way to do this than mine, but something like:

import re

def f(s, pat):
    pat = r'(\w*%s\w*)' % pat       # Not thrilled about this line
    return re.findall(pat, s)


print(f("This is just a simple text to test some basic things", "si"))

Outputs:

['simple', 'basic']
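
One caveat with interpolating the search string straight into the pattern: any regex metacharacters in it (such as `.` or `+`) are interpreted as regex syntax rather than literal text. The following is a minimal sketch, assuming the search string is meant to be matched literally; the function name `words_containing` is hypothetical and not part of the original answer.

```python
import re

def words_containing(text, substring):
    # re.escape makes characters such as '.' or '+' match literally
    pattern = r'\w*%s\w*' % re.escape(substring)
    return re.findall(pattern, text)

print(words_containing("This is just a simple text to test some basic things", "si"))
print(words_containing("a.b axb", "."))
```

Without the `re.escape` call, the `.` in the second call would match any character, so the pattern could also pick up text that contains no literal dot.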

Answer 2

For something like this I wouldn't use a regular expression; I would use something like this:

def f(string, match):
    string_list = string.split()
    match_list = []
    for word in string_list:
        if match in word:
            match_list.append(word)
    return match_list

print(f("This is just a simple text to test some basic things", "si"))
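
The loop above can also be written as a list comprehension. This is only a sketch of an equivalent form, assuming words are separated by whitespace; the name `words_with_substring` is hypothetical. Note that `split()` leaves punctuation attached to the words it returns.

```python
def words_with_substring(text, substring):
    # Keep every whitespace-separated word that contains the substring
    return [word for word in text.split() if substring in word]

print(words_with_substring("This is just a simple text to test some basic things", "si"))
```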

Answer 3

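Here is my attempt at a solution. I split the input string on " " and then try to match each word against the pattern. If a match is found, the word is added to the result set.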

import re

def f(text, pat):
    matches = []
    words = text.split(' ')
    # Escape the search string (not the word) so it is matched literally
    regex = re.escape(pat)

    for word in words:
        match = re.search(regex, word)
        if match:
            matches.append(word)
    return matches

print(f("This is just a simple text to test some basic things", "si"))

Answer 4

import re

def f(s, pat):
    pat = r'\b\S*%s\S*\b' % re.escape(pat)
    return re.findall(pat, s)


print(f("This is just a simple text to test some basic things", "si"))

This is what you need. \b picks out whole words by anchoring the match at word boundaries, and \S will not match any whitespace.
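
To illustrate how the choice between \w and \S changes what counts as a "word", here is a small sketch comparing the two patterns on a hyphenated input. The function names `find_word_chars` and `find_non_space` and the sample text are hypothetical, chosen only for this comparison.

```python
import re

def find_word_chars(text, substring):
    # \w* stops at any non-word character, such as a hyphen
    return re.findall(r'\w*%s\w*' % re.escape(substring), text)

def find_non_space(text, substring):
    # \S* runs through anything that is not whitespace, hyphens included
    return re.findall(r'\b\S*%s\S*\b' % re.escape(substring), text)

sample = "a semi-simple test"
print(find_word_chars(sample, "si"))
print(find_non_space(sample, "si"))
```

Which behaviour is preferable depends on whether hyphenated terms should be treated as one word or several.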

Python regex: return a list of words containing a given substring

4 answers