Remove words with fewer than 4 characters (Python)

Time: 2018-10-06 18:45:07

Tags: python

with open('text.txt','r') as f:
    for i in f:
        trantab = str.maketrans({key: None for key in string.punctuation})
        j = i.translate(trantab)
        result1.append(j)
shortword = re.compile(r'\W*\b\w{1,4}\b')
shortword.sub('', result1)
f = result1

The error is:

  line 13, in shortword.sub('', result1)
TypeError: expected string or bytes-like object

How can I fix this?

2 answers:

Answer 0 (score: 0)

You get this error because you are passing a list ([]) to .sub(), which expects a string or bytes-like object.
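A minimal sketch that reproduces the error, assuming a made-up one-element list named `lines`: calling sub() with the list raises TypeError, while calling it with a single string from that list works.

```python
import re

shortword = re.compile(r'\W*\b\w{1,4}\b')
lines = ["THIS IS A SIMPLE DUMMY TEXT"]  # hypothetical sample data

try:
    shortword.sub('', lines)  # a list, not a string
    raised = False
except TypeError:
    raised = True  # same kind of error as in the question

# Passing one string at a time is accepted.
cleaned = shortword.sub('', lines[0])
print(raised, repr(cleaned))
```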

I solved your problem with this script:

import re

t = []
t.append("THIS IS A SIMPLE DUMMY TEXT")
t.append("ANOTHER INDEX BLA BLA")

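# Compile the pattern once; sub() needs a string, so apply it to each item.
# Note: {1,4} also removes 4-letter words; use {1,3} for strictly fewer than 4.
shortword = re.compile(r'\W*\b\w{1,4}\b')
t = [shortword.sub('', i) for i in t]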

print(t)

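In your code, apply shortword.sub() to each string in result1 and assign the result back to result1, since sub() returns a new string instead of changing its argument:

result1 = [shortword.sub('', line) for line in result1]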
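Put together with the code from the question, the whole thing might look like the sketch below. Several things here are assumptions: the helper name `remove_short_words` is made up, `io.StringIO` stands in for `open('text.txt')` so the example runs on its own, and the pattern uses `{1,3}` because the title asks for words with fewer than 4 characters (the question's `{1,4}` would also remove four-letter words).

```python
import io
import re
import string

# Build the translation table and the pattern once, outside the loop.
trantab = str.maketrans('', '', string.punctuation)
shortword = re.compile(r'\b\w{1,3}\b')  # words of 1 to 3 characters

def remove_short_words(lines):
    """Return each line with punctuation and short words removed."""
    result = []
    for line in lines:
        no_punct = line.translate(trantab)
        no_short = shortword.sub('', no_punct)
        # Collapse the whitespace left behind by the removed words.
        result.append(' '.join(no_short.split()))
    return result

# io.StringIO stands in for the real file (an assumption for this sketch).
sample = io.StringIO("This is a simple, dummy text!\nAnother index: bla bla.\n")
result1 = remove_short_words(sample)
print(result1)
```

With a real file, the same function could be called as `remove_short_words(f)` inside the `with open(...)` block.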

I hope this helps!

Answer 1 (score: 0)

This assumes each word is on its own line; otherwise you will have to break up content with .split().

with open('something.txt') as f:
    content = [line.strip() for line in f]

res = list(filter(lambda x: len(x) >= 4, content))
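If a line can hold several words, a rough variant with .split() could look like this. The helper name `keep_long_words` and the sample text are assumptions, and `io.StringIO` again stands in for the opened file.

```python
import io

def keep_long_words(lines, min_len=4):
    """Split each line on whitespace and keep words of at least min_len characters."""
    return [word for line in lines for word in line.split() if len(word) >= min_len]

# Stand-in for open('something.txt'); the text is made up for illustration.
sample = io.StringIO("this is a simple dummy text\nanother index bla bla\n")
res = keep_long_words(sample)
print(res)
```

This returns a flat list of words rather than one entry per line, so the line structure of the file is lost.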