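I've been trying to write a program that reads lines and outputs a single letter for each one. Unfortunately, the output has more lines than the input (about 20 more for 160k input lines).
I'd be very glad if someone could tell me what I'm doing wrong.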
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from collections import Counter
from re import split
import re
from itertools import islice
import random

def checkisin(femalewords, malewords, testin):
    with open(femalewords) as filein:
        femalewordslist = filein.readlines()
    with open(malewords) as filein:
        malewordslist = filein.readlines()
    letters = "FM"
    with open(testin, "rU") as filein:
        for line in filein:
            malecounter = 0
            femalecounter = 0
            linia = line.rstrip()
            if any(word in linia for word in femalewordslist):
                femalecounter = femalecounter+1
            if any(word in linia for word in malewordslist):
                malecounter = malecounter+1
            if malecounter > femalecounter:
                print "M"
            elif malecounter < femalecounter:
                print "F"
            elif malecounter == femalecounter:
                print random.choice(letters)

checkisin("femaletopwords.txt", "maletopwords.txt", "in2.tsv")
Answer 0 (score: 0)
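User tdelaney said: "You are using Python's universal newlines mode "rU", which can produce a different line count than other programs, especially if there is an unattached "\r" in the file. After the for loop is done, run print repr(filein.newlines). If it has got \r\n, that's your problem."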
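The effect described in that comment can be reproduced with a small sketch. It is written for Python 3, where universal newlines is the default text mode (newline=None) rather than the "rU" flag used in the question. The helper names write_sample and count_lines and the sample data are made up for illustration.

```python
import os
import tempfile


def write_sample(data):
    """Write raw bytes to a temporary file and return its path (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".tsv")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


def count_lines(path, newline):
    """Return (line count, f.newlines) for the given newline mode."""
    with open(path, "r", newline=newline) as f:
        count = sum(1 for _ in f)
        # In universal newlines mode, f.newlines reports the endings seen so far.
        return count, f.newlines


if __name__ == "__main__":
    # Two "\n"-terminated records; the first has a stray "\r" inside a field.
    path = write_sample(b"a\tb\rc\n" b"d\te\n")
    try:
        print(count_lines(path, None))  # universal newlines: "\r" also ends a line
        print(count_lines(path, "\n"))  # only "\n" ends a line
    finally:
        os.remove(path)
```

On this sample the universal mode counts 3 lines and the explicit mode counts 2, although the file holds only two records. That is the same kind of mismatch as the extra output lines in the question.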
This solved the problem.
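One way to act on that advice is sketched below for Python 3, assuming every record in the input ends with "\n". Opening the file with newline="\n" means a stray "\r" no longer starts a new line. The names load_words and classify_lines are hypothetical and not part of the original script. The word lists are stripped on loading because readlines() keeps each word's trailing newline, and such a word would not match a line that has been passed through rstrip().

```python
import random


def load_words(path):
    """Read one word per line, dropping trailing newlines and blank lines."""
    with open(path) as f:
        return [w.strip() for w in f if w.strip()]


def classify_lines(path, femalewords, malewords, rng=random):
    """Return one letter ("F" or "M") per LF-terminated line of the file."""
    letters = "FM"
    result = []
    # newline="\n" disables universal newlines: only "\n" ends a line,
    # so a stray "\r" inside a record stays part of that record.
    with open(path, "r", newline="\n") as filein:
        for line in filein:
            linia = line.rstrip()
            female = any(word in linia for word in femalewords)
            male = any(word in linia for word in malewords)
            if male and not female:
                result.append("M")
            elif female and not male:
                result.append("F")
            else:
                # Tie (both or neither matched): pick a letter at random.
                result.append(rng.choice(letters))
    return result
```

With the file names from the question, a call might look like classify_lines("in2.tsv", load_words("femaletopwords.txt"), load_words("maletopwords.txt")). Returning a list instead of printing makes it easy to check that the number of letters equals the number of input records.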