Remove words from a string if a condition is met?

Time: 2014-08-12 03:48:31

Tags: python string

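I need to do some "cleaning" on a list of strings:
1. Remove special characters (such as !@#$%^ and so on).
2. Convert every word in the string to lowercase.
3. Remove any word that is 2 characters or shorter ("a", "it", "me", "us", etc.).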

trainset = [('It is too bad that our jane is just a pigeon. It would be great if it could speak. It would be able to prove my innocence.'), ('I have no other choice. Is death the only way to prove it? Loving you is really hard!'), ('These are my last words.')]

def cleanedthings(trainset):
cleanedtrain = []
specialch = "!@#$%^&*-=_+:;\".,/?`~][}{|)("
for line in trainset:
    for word in line.split():
        lowword = word.lower()
        for ch in specialch:
            if ch in lowword:
                lowword = lowword.replace(ch,"")
        if len(lowword) >= 3:
            cleanedtrain.append(lowword)
return cleanedtrain

The function above doesn't seem to work. Can you help me? Also, I need the final output as a string, not a list.

1 Answer:

Answer 0 (score: 0)

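Check your indentation and syntax; the logic itself is fine. As posted, the body of cleanedthings is not indented under the def line, so Python rejects it with an IndentationError before anything runs. With the indentation fixed, joining the returned list with spaces gives you a string instead of a list: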

trainset = [('It is too bad that our jane is just a pigeon. It would be great if it could speak. It would be able to prove my innocence.'), ('I have no other choice. Is death the only way to prove it? Loving you is really hard!'), ('These are my last words.')]

def cleanedthings(trainset):
    cleanedtrain = []
    specialch = "!@#$%^&*-=_+:;\".,/?`~][}{|)("
    for line in trainset:
        for word in line.split():
            lowword = word.lower()
            for ch in specialch:
                if ch in lowword:
                    lowword = lowword.replace(ch,"")
            if len(lowword) >= 3:
                cleanedtrain.append(lowword)
    return cleanedtrain

print(" ".join(cleanedthings(trainset)))
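
As an alternative, here is a minimal sketch that does the same three steps but returns the string directly. It assumes Python 3 (the three-argument form of str.maketrans) and uses string.punctuation in place of the hand-written specialch, which means it also strips characters such as the apostrophe that specialch leaves alone. The names clean_text and min_len are made up for this illustration and are not from the question.

```python
import string


def clean_text(lines, min_len=3):
    """Lowercase each word, strip ASCII punctuation, drop short words.

    Returns one space-separated string rather than a list.
    """
    # Translation table that deletes every ASCII punctuation character.
    strip_punct = str.maketrans("", "", string.punctuation)
    kept = []
    for line in lines:
        for word in line.split():
            cleaned = word.lower().translate(strip_punct)
            # Measure the length after stripping, so "it?" counts as 2.
            if len(cleaned) >= min_len:
                kept.append(cleaned)
    return " ".join(kept)


print(clean_text(["These are my last words.", "It is too bad!"]))
# prints: these are last words too bad
```

The length check runs after the punctuation is removed, which matches the order in the original function, so a token like "it?" is still dropped.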