Question

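I want to group similar strings, but I would like to be smart about it and detect when a convention such as '/' or '-' is the difference, rather than an actual difference in letters.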

Given the following input:

moose
mouse
mo/os/e
m.ouse

alpha = ['/','.']

I want to group the strings based on the restricted alphabet, so that the output is:

moose
mo/os/e

mouse
m.ouse

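I know I can use difflib to get similar strings, but it does not provide an option for restricting the alphabet. Is there another way to do this? Thanks.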

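Update

It turned out that alpha is easier to implement by checking for occurrences than as a restricted alphabet, so I have changed the title.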

Answer 1

Maybe something like this:

$sql  = "select stoks,srate,prate,taxp,";
$sql .= "iname,suplier,icod from stock where iname='".$item_name."'";
$sql .= " AND  suplier ='".$suplier."'";

Answer 2

Here is an idea that takes a few (simple) steps:

import re
example_strings = ['m/oose', 'moose', 'mouse', 'm.ouse', 'ca...t', 'ca..//t', 'cat']

1. Index all the strings so that they can be looked up by index later:

indexed_strings = list(enumerate(example_strings))

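2. Store every string that contains a restricted character in a dictionary, with the index as the key and the string as the value. Then temporarily strip the restricted characters for sorting: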

# regex to match restricted alphabet
restricted = re.compile(r'[/.]')
# dictionary to store strings with restricted char
restricted_dict = {}
for (idx, string) in indexed_strings:
    if restricted.search(string):
        # storing the string with a restricted char by its index
        restricted_dict[idx] = string
        # stripping the restricted char temporarily and returning to the list
        indexed_strings[idx] = (idx, restricted.sub('', string))

3. Sort the cleaned list by string value, then iterate over it again and replace each stripped string with its original value:

indexed_strings.sort(key=lambda x: x[1])
# make a new list for the final set of strings
final_strings = []
for (idx, string) in indexed_strings:
    if idx in restricted_dict:
        final_strings.append(restricted_dict[idx])
    else:
        final_strings.append(string)

Result: ['ca...t', 'ca..//t', 'cat', 'm/oose', 'moose', 'mouse', 'm.ouse']
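
The three steps above can probably be condensed, because Python's sort is stable and accepts a key function. The following is only a sketch under that assumption, not the answerer's code; `strip_restricted` is a helper name made up for this illustration.

```python
import re

# Characters to ignore when comparing strings (same set as in the steps above).
restricted = re.compile(r'[/.]')

def strip_restricted(string):
    """Hypothetical helper: return the string without restricted characters."""
    return restricted.sub('', string)

example_strings = ['m/oose', 'moose', 'mouse', 'm.ouse', 'ca...t', 'ca..//t', 'cat']

# sorted() is stable, so strings whose cleaned forms are equal keep their
# original relative order, and the original strings are never modified.
final_strings = sorted(example_strings, key=strip_restricted)
print(final_strings)
# ['ca...t', 'ca..//t', 'cat', 'm/oose', 'moose', 'mouse', 'm.ouse']
```

Because the key is only used for comparison, there is no need to store the originals in a dictionary and restore them afterwards.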

Answer 3

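Since you want to group words, you should use groupby.

You only need to define a function that removes the alpha characters (for example, with str.translate); then you can apply sort and groupby to your data: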

from itertools import groupby

words = ['moose', 'mouse', 'mo/os/e', 'm.ouse']
alpha = ['/','.']

alpha_table = str.maketrans('', '', ''.join(alpha))

def remove_alphas(word):
    return word.lower().translate(alpha_table)

words.sort(key=remove_alphas)
print(words)
# ['moose', 'mo/os/e', 'mouse', 'm.ouse'] # <- Words are sorted correctly.

for common_word, same_words in groupby(words, remove_alphas):
    print(common_word)
    print(list(same_words))
# moose
# ['moose', 'mo/os/e']
# mouse
# ['mouse', 'm.ouse']
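
If the sorted order is not needed, the same grouping can also be sketched with a dictionary, which avoids the sort that groupby relies on (groupby only merges adjacent items with equal keys). This is an alternative illustration rather than part of the answer above; `group_words` is a hypothetical name.

```python
from collections import defaultdict

words = ['moose', 'mouse', 'mo/os/e', 'm.ouse']
alpha = ['/', '.']

# Translation table that deletes every character listed in alpha.
alpha_table = str.maketrans('', '', ''.join(alpha))

def remove_alphas(word):
    return word.lower().translate(alpha_table)

def group_words(words):
    """Hypothetical helper: map each cleaned word to its original spellings."""
    groups = defaultdict(list)
    for word in words:
        groups[remove_alphas(word)].append(word)
    return dict(groups)

groups = group_words(words)
print(groups)
# {'moose': ['moose', 'mo/os/e'], 'mouse': ['mouse', 'm.ouse']}
```

The groups appear in the order in which each cleaned word is first seen, and the input list is left unsorted.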

Finding similar strings with restricted alphabet characters in Python
