Separating words into alphabetical files in Python

Time: 2013-12-04 18:21:06

Tags: python file alphabetical

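I'm working on a project in Python and trying to separate a list of words into alphabetical files, so any word starting with 'a' or 'A' goes into the 'A.html' file. I can create that file and collect all the words starting with that letter, but I need to do it recursively so that it goes through all the letters and puts the words into different files. Here is some of the code:

class LetterIndexPage(object):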

   def __init__(self, wordPage):
       self.alphaList = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','Numbers','Special Characters']

   def createLetterPages(self):
       if not os.path.exists('A.html'):
           file('A.html', 'w')
       letterFileName = 'A.html'
       letterItemList = []
       for item in wordItems():
           if item[:1] == 'a' or item[:1] == 'A':
               letterItemList.append(item)
       letterItems = reduce(lambda letterItem1, letterItem2: letterItem1 + letterItem2, letterItemList)
       return letterItems

The wordItems() method returns all the text from a web page. I'm not sure where to go from here. Can anyone help?

2 answers:

Answer 0: (score: 0)

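from itertools import groupby
import requests

page = requests.get('http://www.somepage.com/some.txt')
all_words = page.text.split()

def first_letter(word):
    return word[0].upper()

# groupby only merges adjacent items, so sort with the same key it groups by
for letter, words in groupby(sorted(all_words, key=first_letter), key=first_letter):
    # Append mode: one file per starting letter, e.g. 'A.html'
    with open("%s.html" % letter, "a") as f:
        f.write("\n".join(words) + "\n")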

I think this should work (untested).
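
One thing worth checking with this approach is that groupby only merges adjacent items, so the list has to be sorted by the same case-insensitive key that is used for grouping. The following is a minimal sketch of that idea on a made-up word list; the helper name group_by_first_letter is hypothetical and not part of the original answer.

```python
from itertools import groupby


def group_by_first_letter(words):
    """Return {uppercase first letter: [words]} using sort + groupby.

    Hypothetical helper for illustration; assumes every word is non-empty.
    """
    def key(word):
        return word[0].upper()

    # Sorting by the grouping key makes same-letter words adjacent,
    # regardless of whether they start with upper or lower case.
    ordered = sorted(words, key=key)
    return {letter: list(group) for letter, group in groupby(ordered, key=key)}


sample = ["apple", "Banana", "Avocado", "berry", "cherry"]  # made-up data
grouped = group_by_first_letter(sample)
print(grouped)
```

With a plain sorted(words), "Avocado" and "Banana" would sort before "apple", and the 'a' words would be split into two separate groups.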

Answer 1: (score: 0)

Open the files first, do the work, then close them:

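from string import ascii_uppercase

# One open file handle per letter: 'A.html' ... 'Z.html'
output_files = {letter: open(letter + '.html', 'w') for letter in ascii_uppercase}
for word in list_of_words:
    # Note: a word that does not start with A-Z raises KeyError here
    output_files[word[0].upper()].write(word + '\n')

# Iterate over the file objects (the dict values), not the letter keys
for of in output_files.values():
    of.close()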
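
The question's alphaList also contains 'Numbers' and 'Special Characters', which neither answer covers. Below is a rough sketch of one way to route words into those extra buckets. The function names bucket_name and write_letter_pages, the out_dir parameter, and the sample words are all assumptions made for illustration, and the rule "digit goes to Numbers, anything else non-alphabetic goes to Special Characters" is a guess at what the asker intended.

```python
import os
import tempfile
from collections import defaultdict
from string import ascii_uppercase


def bucket_name(word):
    """Map a non-empty word to 'A'..'Z', 'Numbers', or 'Special Characters'."""
    first = word[0].upper()
    if first in ascii_uppercase:
        return first
    if first.isdigit():
        return 'Numbers'
    return 'Special Characters'


def write_letter_pages(words, out_dir='.'):
    """Group words by bucket and write each bucket to '<bucket>.html'."""
    buckets = defaultdict(list)
    for word in words:
        if word:  # skip empty strings, which have no first character
            buckets[bucket_name(word)].append(word)
    for name, items in buckets.items():
        # 'w' mode overwrites any existing file for that bucket
        with open(os.path.join(out_dir, name + '.html'), 'w') as f:
            f.write('\n'.join(items) + '\n')
    return dict(buckets)


# Demo in a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    result = write_letter_pages(['apple', 'Avocado', '42nd', '_private', 'berry'], tmp)
    written = sorted(os.listdir(tmp))
```

Collecting the words in memory first means each file is opened only once and only buckets that actually have words produce a file, at the cost of holding the whole word list in memory.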