Question

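I have a file wordlist.txt containing 100 random words, one per line. I currently use the code below to pick 12 random words from that file. To avoid picking exactly the same 12 words twice, I want to build in an extra check. The 12 random words are written to output.txt. How can I make my script compare the 12 newly picked words (in the same order) with the 12 words already in output.txt (a single line)?

I currently use the following to read 12 random words from wordlist.txt and write them to output.txt: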

import random

teller = 0

while True:
    teller += 1

    # Choose 12 random words and write them to the text file
    print("\nRound", teller)
    f1 = open('output.txt', 'w+')
    count = 0
    while count < 12:
        # Note: this re-reads wordlist.txt for every single word
        f1.write(random.choice([x.rstrip() for x in open('wordlist.txt')]) + " ")
        count += 1
    f1.close()

Answer 1

Instead of random.choice(), read all the words into a list and use random.sample():

import random

# Read the whole word list once
with open('wordlist.txt') as wlist:
    words = [w.strip() for w in wlist]
# Write 12 distinct words, one per line
with open('output.txt', 'w') as output:
    for word in random.sample(words, 12):
        output.write(word + '\n')

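random.sample() picks 12 distinct entries from your input list, so as long as wordlist.txt contains no duplicate lines you are guaranteed 12 different words.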
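To make the difference concrete, here is a minimal sketch comparing the two calls. The short `words` list and both helper names are made up for illustration and stand in for the contents of wordlist.txt.

```python
import random

# Illustrative stand-in for the contents of wordlist.txt (hypothetical data)
words = ['apple', 'banana', 'cherry', 'date', 'elder', 'fig']

def pick_with_choice(words, k):
    # Each draw is independent, so the same word can come up more than once
    return [random.choice(words) for _ in range(k)]

def pick_with_sample(words, k):
    # Draws k distinct positions from the list, so no entry is reused
    return random.sample(words, k)

print(pick_with_choice(words, 4))  # may contain repeats
print(pick_with_sample(words, 4))  # never repeats an entry
```

Note that random.sample() raises ValueError if you ask for more items than the list contains.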

Because your word list is small (only 100 words), reading it all into an in-memory list is fine.

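If your input file is large (megabytes to gigabytes), you may want to switch to an algorithm that can pick a uniform sample out of any iterable, regardless of its size, using only as much memory as the output sample needs.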
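One well-known algorithm of that kind is reservoir sampling (Algorithm R). The answer does not say which algorithm it had in mind, so treat the following as a sketch under that assumption; the function name `reservoir_sample` is made up for illustration.

```python
import random

def reservoir_sample(iterable, k):
    """Return up to k items chosen uniformly at random from iterable.

    Only k items are held in memory at any time (Algorithm R).
    """
    reservoir = []
    for i, item in enumerate(iterable):
        if i < k:
            # Fill the reservoir with the first k items
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1)
            j = random.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

You could call it as `reservoir_sample((line.strip() for line in open('wordlist.txt')), 12)`. If the input has fewer than k items, all of them are returned. The result is a uniform sample, but its order is not shuffled, so apply random.shuffle() to it if the order matters.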

If you need to find 12 random words that do not already appear in output.txt from a previous run, you first need to read those words into a set:

import random

with open('wordlist.txt') as wlist:
    words = [w.strip() for w in wlist]
# Words written by previous runs
with open('output.txt', 'r') as output:
    seen = {w.strip() for w in output}
with open('output.txt', 'a') as output:
    count = 0
    while count < 12:
        new_word = random.choice(words)
        if new_word in seen:
            # Already used: drop it so it is not picked again
            words.remove(new_word)
            continue
        seen.add(new_word)
        output.write(new_word + '\n')
        count += 1

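Here I open the output.txt file with 'a' for appending, adding 12 new words that we have not seen before. Note that random.choice() raises IndexError once fewer than 12 unseen words remain and the list runs empty.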
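If what you want is the literal check from the question, that is, rejecting a draw only when it is the same 12 words in the same order as the single line already in output.txt, a minimal sketch could look like this. It assumes the words are written space-separated on one line, as in the question's code; the function name `draw_new_line` is made up for illustration.

```python
import os
import random

def draw_new_line(words, output_path, k=12):
    """Pick k words and write them to output_path as one line.

    The draw is repeated if it matches the line already stored there
    (same words in the same order).
    """
    previous = None
    if os.path.exists(output_path):
        with open(output_path) as f:
            # The first line of the previous run, split back into words
            previous = f.readline().split()
    while True:
        picked = random.sample(words, k)
        if picked != previous:  # list comparison is order-sensitive
            break
    with open(output_path, 'w') as f:
        f.write(' '.join(picked) + '\n')
    return picked
```

With 100 words and k=12 an exact repeat is extremely unlikely, so the loop almost always runs once. The sketch assumes more than one distinct draw is possible; otherwise the loop would never end.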

Python: comparing two text files for the same random line
