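Excluding lines from a text file (Python)

Date: 2018-05-24 13:29:11

Tags: python

I recently posted about trying to get my code to exclude "RT" and "DM" when counting lines. What I am trying to achieve with this program is to read a number x from the user and then output that many users as the top x "tweeters". Tweets that contain DM or RT should not be counted, but with this code they still are. I have changed the first lines of the code, but I believe the reason it does not give the correct output is something in the rest of the full code: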

x = int(input("Enter a number: "))  # convert to int so x can be used as a slice bound
with open('stream.txt', 'r') as file:
    fileread = file.readlines()

tweets = [string.split() for string in fileread
          if not "DM" in string and not "RT" in string]
numofwords = [len(word)-1 for word in tweets]
with open('stream.txt',"r") as f:
    wordlist = [r.split()[0] for r in f]
maximum = max(numofwords)
users = [a for a, b in enumerate(wordlist) if b == maximum]
tweetuser = [word[0] for word in [tweets[a] for a in users]]
tweetuser.sort()


word_counter = {}
for word in wordlist:
    if word in word_counter:
        word_counter[word] += 1
    else:
        word_counter[word] = 1

popular_words = sorted(word_counter, key=word_counter.get, reverse=True)
top = popular_words[:x]
top.sort()
for user in top:
    print(user)

This is the text file I have been using:

 andrew I hate mondays.
 fred Python is cool.
 fred Ko Ko Bop Ko Ko Bop Ko Ko Bop for ever
 andrew @fred no it isn't, what do you think @john???
 judy @fred enough with the k-pop
 judy RT @fred Python is cool.
 andrew RT @judy @fred enough with the k pop
 george RT @fred Python is cool.
 andrew DM @john Oops
 john DM @andrew Who are you go away! Do you know him, @judy?
 sam DM
 sam DM
 sam DM
 sam DM

The output when I enter 3 (the top 3 users) is:

andrew
fred
sam

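This is incorrect, because sam is a decoy user I included who should not appear in any list, since all of his tweets contain the word DM. Any help is greatly appreciated, thanks :)

2 answers:

Answer 0: (score: 0)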

Change

wordlist = [r.split()[0] for r in f]

to

wordlist = [r.split()[0] for r in f if "DM" not in r and "RT" not in r]
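The reason this helps: in the question's code, `tweets` is filtered, but `wordlist`, which feeds `word_counter`, is built from every line of the file, so sam's DM lines are still counted. Here is a minimal sketch of the difference, using a few of the question's lines held in a list instead of read from stream.txt. The names `lines`, `unfiltered` and `filtered` are illustrative only, and `collections.Counter` stands in for the hand-written `word_counter` loop.

```python
from collections import Counter

# A few lines from the question's stream.txt, held in memory for illustration.
lines = [
    "andrew I hate mondays.",
    "fred Python is cool.",
    "judy RT @fred Python is cool.",
    "sam DM",
    "sam DM",
    "sam DM",
]

# Unfiltered: every line contributes its first word, so sam is counted.
unfiltered = Counter(r.split()[0] for r in lines)

# Filtered, as in this answer: lines containing "DM" or "RT" are skipped.
filtered = Counter(r.split()[0] for r in lines
                   if "DM" not in r and "RT" not in r)

print(unfiltered["sam"])  # 3
print(filtered["sam"])    # 0
print(sorted(filtered))   # ['andrew', 'fred']
```

Note that `"DM" not in r` is a substring test, so it would also drop a tweet containing a word such as "ADMIN".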

Answer 1: (score: 0)

I think you are overcomplicating this. If you just want the top x users, you could do:

[code snippet missing from the source page]

For me this outputs ['andrew', 'fred', 'judy'].
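One way to get that result, though not necessarily the code this answer used, is to count the first word of each kept line with `collections.Counter`. This is a sketch that assumes the file layout from the question (username first, then the tweet text). `top_tweeters`, `excluded` and `sample` are hypothetical names, and here the markers are matched as whole words rather than as substrings.

```python
from collections import Counter

def top_tweeters(lines, x, excluded=("RT", "DM")):
    """Return the x users with the most counted tweets, sorted by name."""
    counts = Counter()
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip blank lines
        user, text = words[0], words[1:]
        # Skip tweets that contain an excluded marker as a whole word.
        if any(word in excluded for word in text):
            continue
        counts[user] += 1
    # most_common(x) yields the x users with the highest counts.
    return sorted(user for user, _ in counts.most_common(x))

# The question's stream.txt contents, held in memory for illustration.
sample = [
    "andrew I hate mondays.",
    "fred Python is cool.",
    "fred Ko Ko Bop Ko Ko Bop Ko Ko Bop for ever",
    "andrew @fred no it isn't, what do you think @john???",
    "judy @fred enough with the k-pop",
    "judy RT @fred Python is cool.",
    "andrew RT @judy @fred enough with the k pop",
    "george RT @fred Python is cool.",
    "andrew DM @john Oops",
    "john DM @andrew Who are you go away! Do you know him, @judy?",
    "sam DM",
    "sam DM",
    "sam DM",
    "sam DM",
]

print(top_tweeters(sample, 3))  # ['andrew', 'fred', 'judy']
```

To run it against the real file, pass the open file object in place of `sample`, since iterating over a file yields its lines.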