Replacing duplicate words in Python 3

Time: 2017-10-24 21:15:41

Tags: python string python-3.x duplicates words

I want to take a piece of text that looks like this:

Engineering will save the world from inefficiency. Inefficiency is a blight on the world and its humanity.

and return:

Engineering will save the world from inefficiency..is a blight on the . and its humanity.

That is, I want to remove repeated words and replace them with ".". This is how I started writing my code:

lines= ["Engineering will save the world from inefficiency.",
        "Inefficiency is a blight on the world and its humanity."]

def solve(lines):    
    clean_paragraph = []    
    for line in lines:    
        if line not in str(lines):
            clean_paragraph.append(line)
        print (clean_paragraph)    
        if word == word in line in clean_paragraph:
            word = "."              
     return clean_paragraph

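My logic was to build a list of all the words in the strings and add each one to a new list, and then, if a word is already in the list, replace it with ".". My code returns []. Any suggestions would be greatly appreciated!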

3 Answers:

Answer 0: (score: 0)

PROBLEM

if word == word in line in clean_paragraph:

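I'm not sure what you expected from this, but it will not do what you want. Python chains comparison operators, and in counts as one of them, so the line is read as:

if (word == word) and (word in line) and (line in clean_paragraph):

The first test is trivially True, and the last one asks whether the whole line is an element of clean_paragraph. Since clean_paragraph is still empty when this runs, the expression cannot be True. As posted, word is also never assigned before this line, so Python would raise a NameError on reaching it.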
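As a quick check of how Python reads a chain like this, here is a small sketch. The sample values for word, line, and clean_paragraph are made up for illustration; the point is only that membership tests chain like any other comparison, so a == b in c in d behaves as (a == b) and (b in c) and (c in d).

```python
# Sample values chosen only for illustration (not from the question).
word = "world"
line = "Engineering will save the world from inefficiency."
clean_paragraph = [line]

# Chained form, as written in the question.
chained = word == word in line in clean_paragraph

# The equivalent expanded form of the comparison chain.
expanded = (word == word) and (word in line) and (line in clean_paragraph)

print(chained, expanded)  # True True

# With an empty list, the last link of the chain fails.
empty_case = word == word in line in []
print(empty_case)  # False
```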

REPAIR

Write the loops you actually want to code:

for clean_line in clean_paragraph:
    for word in clean_line:

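At this point, you have to sort out your variable names. You tried to make some variables stand for two different things at once (line and word; I fixed the first one).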
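One detail worth checking when fixing the second name: looping directly over a string yields characters, not words. A minimal sketch, assuming a made-up one-line clean_paragraph and using str.split() to get whitespace-separated words:

```python
# Sample data, assumed for illustration.
clean_paragraph = ["Engineering will save the world"]

for clean_line in clean_paragraph:
    # Iterating over the string itself gives single characters.
    chars = [item for item in clean_line]
    # Splitting on whitespace gives the words.
    words = clean_line.split()

print(chars[:3])  # ['E', 'n', 'g']
print(words)      # ['Engineering', 'will', 'save', 'the', 'world']
```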

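You'll also have to learn to handle loops and their indices properly. Part of the problem is that you wrote more code at once than you can handle yet. Back up, write one loop at a time, and print the results so you know what you're working with. For example, start with: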
for line in lines:

    if line not in str(lines):
        print("line", line, "is new: append")
        clean_paragraph.append(line)
    else:
        print("line", line, "is already in *lines*")

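I think you'll find another problem, one that shows up earlier than the one I found. Fix that, then add only a line or two at a time, building up the program (and your programming knowledge) step by step. When something doesn't work, you know the cause is almost certainly in the new lines.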
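Running that first check on the question's data gives a hint about the earlier problem. The sketch below only inspects the condition; the names as_text and is_new are introduced here for illustration.

```python
lines = ["Engineering will save the world from inefficiency.",
         "Inefficiency is a blight on the world and its humanity."]

# str(lines) is the printed form of the whole list, as one string.
as_text = str(lines)
print(as_text)

# Every line is a substring of that text, so "not in" is False each time
# and nothing would ever be appended to clean_paragraph.
is_new = [line not in as_text for line in lines]
print(is_new)  # [False, False]
```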

Answer 1: (score: 0)

Here is one way to do it. It replaces every repeated word with a dot.

lines_test = (["Engineering will save the world from inefficiency.",
               "Inefficiency is a blight on the world and its humanity."])


def solve(lines):
    clean_paragraph = ""
    str_lines = " ".join(lines)
    words_lines = str_lines.replace('.', ' .').split()
    for word in words_lines:
        if word != "." and word.lower() in clean_paragraph.lower():
            word = " ."
        elif word != ".":
            word = " " + word
        clean_paragraph += word
    return clean_paragraph


print(solve(lines_test))

Output:

Engineering will save the world from inefficiency. . is . blight on . . and its humanity.

It is important to convert the words or strings to a consistent form (all lowercase or all uppercase) before comparing them.
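Note that in on two strings is a substring test, which is why "a" is replaced in the output above: it occurs inside "save". If only whole-word repeats should count, one possible variant keeps the words seen so far in a set. This is a sketch, not the answer's method; replace_repeats, marker, and the word pattern are assumptions made for illustration.

```python
import re


def replace_repeats(text, marker="."):
    """Replace each repeated word (case-insensitive) with marker."""
    seen = set()  # lowercase forms of the words met so far

    def swap(match):
        key = match.group(0).lower()
        if key in seen:
            return marker
        seen.add(key)
        return match.group(0)

    # Assumed definition of a word: a run of letters or apostrophes.
    return re.sub(r"[A-Za-z']+", swap, text)


sample = ("Engineering will save the world from inefficiency. "
          "Inefficiency is a blight on the world and its humanity.")
result = replace_repeats(sample)
print(result)
# Engineering will save the world from inefficiency. . is a blight on . . and its humanity.
```

Because punctuation is left where it is, the sentence-ending periods survive unchanged and "a" is kept.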

Answer 2: (score: 0)

Another way to do it:

lines_test = 'Engineering will save the world from inefficiency. Inefficiency is a blight on the world and its humanity.'

text_array = lines_test.split(" ")
formatted_text = ''
for word in text_array:
    if word.lower() not in formatted_text:   
        formatted_text = formatted_text +' '+word
    else:
        formatted_text = formatted_text +' '+'.'

print(formatted_text)  

Output:

Engineering will save the world from inefficiency. . is . blight on . . and its humanity.
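One thing to watch in this version: the word is lowercased but formatted_text is not, so a repeat is only caught when the earlier occurrence was already lowercase. A small sketch with a made-up formatted_text value shows the difference:

```python
# Made-up intermediate value, for illustration only.
formatted_text = " Engineering will save"

# Lowercased word against text that still has its original case.
one_sided = "engineering" in formatted_text
# Both sides lowercased before the comparison.
both_sides = "engineering" in formatted_text.lower()

print(one_sided, both_sides)  # False True
```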