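Extract elements from a list and sort them by length

Time: 2015-04-17 18:22:11

Tags: python

I have a large file containing a list of words. I want to sort the words by length and put them in different files, for example: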

List1=['example','example1','example12']

Output: File1: words with 7 letters (example) File2: words with 8 letters (example1) File3: words with 9 letters (example12)

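4 answers:

Answer 0: (score: 0)

I don't think you really need any sorting here. You just want to partition the words into different files by length. You can do that on the fly: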

with open('file1', 'w') as f1, open('file2', 'w') as f2, open('file3', 'w') as f3:
    for entry in List1:
        if len(entry) == 7:
            f1.write(entry)
        elif len(entry) == 8:
            f2.write(entry)
        elif len(entry) == 9:
            f3.write(entry)

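If you are going to have a large number of files (really, 3 is already borderline), I would consider using a dict instead of an elif chain to make this more dynamic. For example, this sends entries of length 0, 1, 2, ..., 9 to files named file0, file1, ..., file9: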

import contextlib

with contextlib.ExitStack() as stack:
    lenmap = {i: stack.enter_context(open('file{}'.format(i), 'w')) 
             for i in range(10)}
    for entry in List1:
        f = lenmap.get(len(entry))
        if f:
            f.write(entry)

On Python 2.7 you don't have ExitStack, so there is no safe way to use an arbitrary number of files in a with statement, and we have to use finally instead:

lenmap = {i: open('file{}'.format(i), 'w') for i in range(10)}
try:
    for entry in List1:
        f = lenmap.get(len(entry))
        if f:
            f.write(entry)
finally:
    for f in lenmap.values():
        f.close()

I'm guessing you actually want some kind of separator between the words, such as '\n' or ' ', but it should be obvious how to add whatever you want.
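As a minimal sketch of the separator point, the ExitStack version can be wrapped in a function that appends a separator after every word. The function name split_by_length and the parameters directory, max_len and sep are assumptions made for illustration, not part of the original answer. Like the code above, it creates a file for every length up to max_len, even if no word has that length.

```python
import contextlib
import os
import tempfile


def split_by_length(words, directory, max_len=9, sep="\n"):
    """Write each word, followed by sep, to directory/file<len>.

    Words longer than max_len are silently skipped, as in the code above.
    """
    with contextlib.ExitStack() as stack:
        # One open file per length; ExitStack closes them all on exit.
        lenmap = {
            i: stack.enter_context(
                open(os.path.join(directory, "file{}".format(i)), "w")
            )
            for i in range(max_len + 1)
        }
        for entry in words:
            f = lenmap.get(len(entry))
            if f:
                f.write(entry + sep)


# Usage: write into a throwaway directory and read one file back.
with tempfile.TemporaryDirectory() as tmp:
    split_by_length(["example", "example1", "example12", "sample7"], tmp)
    with open(os.path.join(tmp, "file7")) as f:
        print(f.read().split())  # ['example', 'sample7']
```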


Alternatively, you can sort, then group, then write one file at a time:

import itertools

for key, group in itertools.groupby(sorted(List1, key=len), key=len):
    with open('file{}'.format(key), 'w') as f:
        for entry in group:
            f.write(entry)
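The sort-then-group approach can also be written as a small reusable function. This is only a sketch: the name write_groups, the directory and sep parameters, and the returned dict are assumptions added for illustration. Because only one file is open at a time, it does not depend on how many distinct lengths there are, and because sorted is stable, words of equal length keep their original relative order.

```python
import itertools
import os
import tempfile


def write_groups(words, directory, sep="\n"):
    """Sort words by length, then write each length group to its own file.

    Returns a dict mapping each length to the path that was written.
    """
    written = {}
    # groupby only merges adjacent items, so the input must be sorted
    # by the same key first.
    for key, group in itertools.groupby(sorted(words, key=len), key=len):
        path = os.path.join(directory, "file{}".format(key))
        with open(path, "w") as f:
            for entry in group:
                f.write(entry + sep)
        written[key] = path
    return written


# Usage with the list from the question.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_groups(["example", "example1", "example12"], tmp)
    print(sorted(paths))  # [7, 8, 9]
```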

Answer 1: (score: 0)

List1=['example','example1','example12']

for item in List1:
    fileToWrite = "example{0}".format(len(item))
    with open(fileToWrite, 'a') as fileID:
        fileID.write(item + "\n")
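One thing to be aware of with the code above: because the files are opened in append mode, running it twice leaves every word in its file twice. A possible workaround, sketched here under the assumption that each output file should start empty on every run, is to open a file with 'w' the first time it is used and with 'a' afterwards. The function name write_by_length and the directory and prefix parameters are hypothetical.

```python
import os
import tempfile


def write_by_length(words, directory, prefix="example"):
    """Write each word to <prefix><len>, truncating each file on first use."""
    seen = set()
    for word in words:
        path = os.path.join(directory, "{0}{1}".format(prefix, len(word)))
        # Truncate the first time this run touches the file, append after that.
        mode = "a" if path in seen else "w"
        seen.add(path)
        with open(path, mode) as handle:
            handle.write(word + "\n")
    return sorted(seen)


# Usage: running twice does not duplicate the contents.
with tempfile.TemporaryDirectory() as tmp:
    write_by_length(["example", "example1", "example12"], tmp)
    write_by_length(["example", "example1", "example12"], tmp)
    with open(os.path.join(tmp, "example7")) as handle:
        print(repr(handle.read()))  # 'example\n'
```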

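Answer 2: (score: 0)

Here is a short way to do it. It uses collections.defaultdict (https://docs.python.org/2/library/collections.html#collections.defaultdict) so that a file is automatically created and opened for each word length.

See the comments in the code for further explanation.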

from collections import defaultdict
from functools import reduce  # built in on Python 2, must be imported on Python 3

# default dict that will automatically open/create file
# if it didn't have one open for it yet
class newfile(defaultdict):
    def __missing__(self, key):
        self[key] = open(str(key)+".txt", 'w')
        return self[key]

# helper to transform a line of text into a list of words
words = lambda line: line.strip().split()

with open("words.txt", 'r') as inputfile:
    # process a word: write it in the correct file
    def procword(filedict, word):
        # write() returns None on Python 2 but a character count on Python 3,
        # so return the dict explicitly instead of relying on "or filedict"
        filedict[len(word)].write(word+"\n")
        return filedict
    # process a line in the file: get the words and process them
    def procline(filedict, line):
        return reduce(procword, words(line), filedict)
    # process all lines in the inputfile, starting with an empty length -> file dict
    files = reduce(procline, inputfile, newfile())
    # maybe superfluous, but close all files (it's polite)
    for fd in files.values():
        fd.close()

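Answer 3: (score: -1)

The script below reads the source file, splits it into sets of words by word length, and then writes each set (if it is not empty) to a separate file.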

words=[set() for _ in range(40)]
with open('source.file') as sfile:
    for line in sfile:
        for word in line.split(" "):
            word=word.strip('''\n!"',.:*?;-''')
            if word != '':
                words[len(word)].add(word)
for i in range(len(words)):
    if len(words[i]) != 0:
        fname='test/file' + str(i)
        with open(fname, 'w') as tfile:
            tfile.write('\n'.join(words[i]))
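The script above assumes no word is 40 characters or longer; a longer word would raise an IndexError. A variation that avoids the fixed limit, sketched here with a defaultdict of sets, is shown below. The function names collect_words and write_word_sets, the directory parameter, and the sorted output order are assumptions for illustration. It also splits on any whitespace rather than only on a single space, which is a small change from the script above.

```python
import os
import tempfile
from collections import defaultdict

# Same characters the script above strips from both ends of each word.
PUNCTUATION = '\n!"\',.:*?;-'


def collect_words(lines):
    """Group the unique words found in lines by their length."""
    by_length = defaultdict(set)
    for line in lines:
        for word in line.split():
            word = word.strip(PUNCTUATION)
            if word:
                by_length[len(word)].add(word)
    return by_length


def write_word_sets(by_length, directory):
    """Write one file per length, one word per line, in sorted order."""
    for length, words in by_length.items():
        path = os.path.join(directory, "file{}".format(length))
        with open(path, "w") as tfile:
            tfile.write("\n".join(sorted(words)))


# Usage: any iterable of lines works, including an open file object.
with tempfile.TemporaryDirectory() as tmp:
    groups = collect_words(["Hello, world! hello", "a-b -x-"])
    write_word_sets(groups, tmp)
    print(sorted(os.listdir(tmp)))  # ['file1', 'file3', 'file5']
```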