Finding whether an English word exists in a string of random characters

Time: 2019-03-11 17:13:39

Tags: python python-3.x string

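I am trying to calculate the likelihood that an English word exists in a string of variable length; say, 10 characters. I have code that prints a variable number of random characters, but I don't know how to check whether an English word is present.

I don't need to check for a specific word; I need to check whether any English word exists in this variable-length string.

I have two questions: how to do this for a 10-character string, and how to do it for a string of any length. An answer to either would be helpful.

The code for the random characters is: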

web: gunicorn main:app.server

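where switcher is a dictionary that pairs the numbers 1-26 with the letters A-Z, respectively.

If my input is 10, the string might be something like "BFGEHDUEND", and the output should be the string "BFGEHDUEND" and True, because the string contains an English word ("END").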

1 Answer:

Answer 0: (score: 0)

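I think I can offer you a solution that works not only for English but also for other languages (if NLTK supports them).

We will use NLTK to get a set of all English words (documented in the NLTK book, section 4.1) and assign it to english.

Then we loop over the variable out, slice it at every possible position (with a minimum length of 2 letters), and append the results to a new list called all_variants.

Finally, we loop through the "words" in all_variants, check whether each one is in the variable english, and print an appropriate response.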

# imports 
import nltk
import string
import random

# creating your dictionary: the numbers 1-26 mapped to the letters a-z
switcher = dict(enumerate(string.ascii_lowercase, start=1))
# using nltk we are going to get a set of all English words
# (the corpus has to be downloaded once first: nltk.download('words'))
english = set(w.lower() for w in nltk.corpus.words.words())


def infmonktyp(english_dict = english, letter_dictionary = switcher): 
    out = "" 
    count = 0 
    length = int(input("How many characters do you want to print?"))
    if length < 2:
        raise ValueError("Length must be greater than 1")
    for _ in range(length):
        num = random.randint(1, 26)
        out += letter_dictionary.get(num, "0")
    # the random word has been created
    print(out)
    all_variants = []
    # getting all variants of the word, minimum of 2 letters
    for i in range(len(out)-1):
        for j in range(i+2, len(out)+1):
            all_variants.append(out[i:j])
    # to know how many words we found; I'm guessing that's what you have in the second line?
    words_found = 0
    # looping through all the slices: if one is an English word, print it; if not, keep going
    for word in all_variants:
        if word in english_dict:
            print(word, 'found in', out)
            words_found += 1
    # if we didn't find any words, say so
    if words_found == 0:
        print("Couldn't find a word")

# calling the function
infmonktyp(english, switcher)
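
The question asks for a True/False result and, ultimately, for the likelihood that a random string contains a word, while the code above only prints what it finds. One possible way to get both is sketched below. The helper names (substrings, contains_word, estimate_probability) and the tiny sample_words set are assumptions made for illustration, not part of the original answer; for real use the NLTK-based english set would be passed in instead. The probability is a simple Monte Carlo estimate, so it is approximate and depends heavily on the word list and the minimum word length.

```python
import random
import string


def substrings(text, min_len=2):
    """Yield every contiguous slice of text with at least min_len characters."""
    for i in range(len(text) - min_len + 1):
        for j in range(i + min_len, len(text) + 1):
            yield text[i:j]


def contains_word(text, words, min_len=2):
    """Return True if any slice of text (compared in lowercase) is in words."""
    return any(piece in words for piece in substrings(text.lower(), min_len))


def estimate_probability(length, words, trials=10_000, seed=None):
    """Estimate by simulation how often a random string contains a word."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        text = "".join(rng.choice(string.ascii_uppercase) for _ in range(length))
        hits += contains_word(text, words)  # True counts as 1
    return hits / trials


if __name__ == "__main__":
    # Tiny stand-in word list (an assumption, for illustration only);
    # swap in the NLTK-based `english` set from the answer for real use.
    sample_words = {"end", "cat", "dog", "the"}
    print("BFGEHDUEND", contains_word("BFGEHDUEND", sample_words))
    print(estimate_probability(10, sample_words, trials=2_000, seed=0))
```

With this sample set the first line reports True for "BFGEHDUEND", because the slice "END" matches. The same functions work for any length, which covers both the 10-character case and the general one.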