Finding words in a file that are the reverse of each other

Asked: 2011-02-23 21:54:12

Tags: python file loops iterator for-loop

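Sorry for the newbie question, I'm just starting out. I wanted a simple program that finds reversed words in a file, so I wrote the code below, but it doesn't work. After entering the second "for" loop it never returns to the first loop; the program just ends. Any clues?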

def is_reverse(word1, word2):
   if len(word1) == len(word2):
     if word1 == word2[::-1]:
       return True
   return False

fin = open('List.txt') 
for word1 in fin:
    word1 = word1.strip()
    word1 = word1.lower()
    for word2 in fin:
      word2 = word2.strip()
      word2 = word2.lower()
      print word1 + word2
      if is_reverse(word1, word2) is True:
             print word1 + ' is the opposite of ' + word2 

Edit: I tried looping over a file vs. a list and got a result that is curious (to me). If I use this code, everything works fine:

def is_reverse(word1, word2):
  if len(word1) == len(word2):
      if word1 == word2[::-1]:
        return True
  return False

fin = open('List.txt')
fin2 = ['test1','test2','test3','test4','test5']
for word1 in fin:
    word1 = word1.strip()
    word1 = word1.lower()
    for word2 in fin2:
      word2 = word2.strip()
      word2 = word2.lower()
      print word1 + word2
      if is_reverse(word1, word2) is True:
             print word1 + ' is the opposite of ' + word2

If I swap fin and fin2, the first loop only makes a single pass. Can anyone explain why?

4 answers:

Answer 0 (score: 3)

for word1 in fin iterates line by line, so word1 is actually a line, not a word. Is that what you intended?

for word2 in fin uses the same iterator, so I think it will consume all of the remaining input, and for word1 in fin will execute only once.
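The effect can be reproduced without a real file. The following is a minimal sketch, assuming io.StringIO as a stand-in for the open file (like a file, it acts as its own iterator); the helper name count_outer_passes is made up for this illustration:

```python
import io

def count_outer_passes(stream):
    """Count how often the outer loop body runs when both loops share `stream`."""
    passes = 0
    for _outer_line in stream:
        passes += 1
        for _inner_line in stream:
            pass  # the inner loop drains every remaining line
    return passes

fake_file = io.StringIO("stressed\ndesserts\nabc\n")  # hypothetical file contents
print(count_outer_passes(fake_file))                   # prints 1
print(count_outer_passes(["stressed", "desserts", "abc"]))  # prints 3
```

With the file-like object the outer loop runs once, because the inner loop has already used up the shared iterator; with a list it runs once per element.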

So the simplest change is to use two file objects, file1 and file2, and to reopen file2 on every pass of the outer loop.

def is_reverse(word1, word2):
   if len(word1) == len(word2):
     if word1 == word2[::-1]:
       return True
   return False

file1 = open('List.txt') 
for word1 in file1:
    word1 = word1.strip()
    word1 = word1.lower()
    file2 = open('List.txt')
    for word2 in file2:
      word2 = word2.strip()
      word2 = word2.lower()
      print word1 + word2
      if is_reverse(word1, word2):
             print word1 + ' is the opposite of ' + word2 

But a better approach is probably to read the file into a list once and then iterate over the list instead of the file, e.g.

def is_reverse(word1, word2):
    if len(word1) == len(word2):
        if word1 == word2[::-1]:
            return True
    return False

file = open('List.txt')
words = list(file)
for word1 in words:
    word1 = word1.strip()
    word1 = word1.lower()
    for word2 in words:
        word2 = word2.strip()
        word2 = word2.lower()
        print word1 + word2
        if is_reverse(word1, word2):
            print word1 + ' is the opposite of ' + word2 
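As a variation on the list-based version, here is a sketch written for Python 3 that closes the file with a with block and normalizes each line once while reading, rather than on every inner-loop pass. The names load_words and reverse_pairs are hypothetical, and the temporary file only exists so the example runs anywhere:

```python
import os
import tempfile

def load_words(path):
    """Read the file once; strip and lowercase each line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip().lower() for line in handle if line.strip()]

def reverse_pairs(words):
    """Yield every (word1, word2) where word1 is word2 spelled backwards."""
    for word1 in words:
        for word2 in words:
            if word1 == word2[::-1]:
                yield word1, word2

# Demo on a throwaway file (assumed contents, not the asker's List.txt).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Stressed\ndesserts\nhello\n")
demo_words = load_words(tmp.name)
os.remove(tmp.name)

for word1, word2 in reverse_pairs(demo_words):
    print(word1 + ' is the opposite of ' + word2)
```

Note that this still compares every word with every other word, and a palindrome would be reported as the opposite of itself.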

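To answer your other question, here is why you can iterate over the same list twice but not over the same file:

A for element in iterable loop asks the iterable for its iterator by calling iterable.__iter__().

When Python asks a file for its iterator, the file returns itself. This means that every iterator over the file shares the same state/position.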

>>> file = open('testfile.txt')
>>> it1 = iter(file)
>>> it2 = iter(file)
>>> id(it1)
3078689064L
>>> id(it2)
3078689064L
>>> id(file)
3078689064L

When you ask a list for its iterator, you get a different iterator each time, and each one keeps its own independent position.

>>> list = [1,2,3]
>>> it3 = iter(list)
>>> it4 = iter(list)
>>> id(it3)
3078746156L
>>> id(it4)
3078746188L
>>> id(list)
3078731244L
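The same comparison can be made with identity checks instead of reading id values by eye. This is a small sketch, assuming io.StringIO as a stand-in for an open file; the variable names are invented for the example:

```python
import io

stream = io.StringIO("one\ntwo\n")  # stand-in for an open file
numbers = [1, 2, 3]

# A file-like object hands itself back as its iterator.
file_iter_is_file = iter(stream) is stream

# A list builds a fresh iterator object on every request.
first = iter(numbers)
second = iter(numbers)
list_iters_differ = first is not second

next(first)                      # advance only the first iterator
second_still_at_start = next(second) == 1

print(file_iter_is_file)         # True
print(list_iters_differ)         # True
print(second_still_at_start)     # True
```

Advancing one list iterator leaves the other untouched, which is why nested loops over the same list work.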

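Postscript

As Hugh pointed out, iterating over the whole word list for every word is very inefficient.

Here is a faster way. Change List.txt to a really big file, e.g. /usr/share/dict/words on a Linux system, to see what I mean.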

words = []
wordset = set()

file = open('List.txt')
for line in file:
    word = line.strip('\n')
    words.append(word)
    wordset.add(word)

for word in words:
    # avoid shadowing the built-in reversed()
    reversed_word = word[::-1]
    if reversed_word in wordset:
        print word + ' is the opposite of ' + reversed_word

Answer 1 (score: 1)

If you really do want to compare the list against itself, you can avoid the nested iteration by testing for membership in a set:

def getWords(fname):
    with open(fname) as inf:
        words = list(w.strip().lower() for w in inf)
    ws = set(words)
    words = list(ws)
    words.sort()
    return words, ws

def wordsInReverse(words, wordset):
    for w in words:
        rw = w[::-1]  # reverse the string
        if rw in wordset:
            yield w,rw

def main():
    words, wordSet = getWords('List.txt')

    for w,rw in wordsInReverse(words, wordSet):
        if rw >= w:  # don't print duplicates
            print('{0} is the opposite of {1}'.format(w, rw))        

if __name__=="__main__":
    main()

And to cross-compare two files:

from itertools import chain

def main():
    words1, wordSet1 = getWords('List1.txt')
    words2, wordSet2 = getWords('List2.txt')

    for w,rw in chain(wordsInReverse(words1, wordSet2), wordsInReverse(words2, wordSet1)):
        print('{0} is the opposite of {1}'.format(w, rw))        

Answer 2 (score: 0)

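My guess is that you are iterating over "fin" in both loops (although your sample code has a mysterious variable "x" in the first loop). Instead, try using a separate handle on the file for each loop, like this: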

fin1 = open("list.txt")
for word1 in fin1:
    fin2 = open("list.txt")
    for word2 in fin2:
        ...etc...

Answer 3 (score: 0)

  

"There is no need to read the file more than once."

- Klaus Byskov Hoffmann

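In other words, iterating over the words twice wastes a lot of time: if a file contains 1000 words, the reverse of each word may be compared against all 1000 words, i.e. 1,000,000 comparisons in total.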
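To make the arithmetic concrete, here is a rough sketch that only counts operations: the nested loop performs one comparison per pair of words, while a set-based check performs one lookup per word. Both function names and the generated sample words are assumptions made for this illustration:

```python
def count_nested_comparisons(words):
    """Number of word-to-word comparisons a double loop performs."""
    count = 0
    for _word1 in words:
        for _word2 in words:
            count += 1
    return count

def count_set_lookups(words):
    """Number of membership tests needed when reversed words are looked up in a set."""
    wordset = set(words)
    lookups = 0
    for word in words:
        _found = word[::-1] in wordset  # one hash lookup per word
        lookups += 1
    return lookups

sample = ['w%d' % i for i in range(1000)]  # 1000 made-up words
print(count_nested_comparisons(sample))    # 1000000
print(count_set_lookups(sample))           # 1000
```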

Here is code that makes only a single pass, using a dictionary to remember what it has already seen:

with open('palindromic.txt') as f:
    ch = f.read()
    li = [ w for w in ch.split() if len(w)>1 ]

dic ={}
pals = set([])

for line in li:
    word = line.strip().lower()
    if len(word)>1:
        if word not in dic:
            dic[word] = 1
            if word[::-1] in dic and word[::-1]!=word:
                pals.add(word)
        else:
            dic[word] += 1


for w in pals:
    print w,dic[w],'  ',w[::-1],dic[w[::-1]]

The expression [ w for w in ch.split() if len(w)>1 ] would have to be improved so that parentheses, apostrophes, etc. are stripped from each word.
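One possible way to do that cleanup, sketched under the assumption that only punctuation at the start and end of each token needs to go; clean_words is a hypothetical helper, not part of the code above:

```python
import string

def clean_words(text):
    """Split on whitespace, strip surrounding punctuation, lowercase, drop short leftovers."""
    cleaned = []
    for token in text.split():
        word = token.strip(string.punctuation).lower()
        if len(word) > 1:
            cleaned.append(word)
    return cleaned

print(clean_words("(Stressed) 'desserts', a pot -- top!"))
# ['stressed', 'desserts', 'pot', 'top']
```

str.strip only trims the ends of a token, so an apostrophe inside a word such as don't is kept; whether that is desirable depends on the word list.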