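Split a string into N equal parts?

Asked: 2014-03-21 23:25:01

Tags: python

I have a string that I would like to split into N equal parts.

For example, suppose I have a string of length 128 and I want to split it into 4 chunks of length 32 each; that is, the first 32 characters, then the second 32, and so on.

How can I do this?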

6 answers:

Answer 0 (score: 38)

import textwrap
print(textwrap.wrap("123456789", 2))
# prints ['12', '34', '56', '78', '9']

Note: be careful with whitespace and the like; this may or may not be what you want.
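To make the whitespace caveat concrete, here is a small sketch comparing textwrap.wrap with plain slicing. The sample string and the variable names are made up for illustration.

```python
import textwrap

s = "ab cd ef"  # illustrative input containing spaces

# textwrap.wrap treats the text as words: it breaks on whitespace
# and drops the whitespace that falls at a line boundary.
wrapped = textwrap.wrap(s, 2)

# Plain slicing keeps every character, spaces included.
sliced = [s[i:i + 2] for i in range(0, len(s), 2)]

print(wrapped)
print(sliced)  # ['ab', ' c', 'd ', 'ef']
```

Joining the sliced pieces gives back the original string, while joining the wrapped pieces does not. The docstring of textwrap.wrap, quoted below, describes this whitespace handling.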

"""Wrap a single paragraph of text, returning a list of wrapped lines.

    Reformat the single paragraph in 'text' so it fits in lines of no
    more than 'width' columns, and return a list of wrapped lines.  By
    default, tabs in 'text' are expanded with string.expandtabs(), and
    all other whitespace characters (including newline) are converted to
    space.  See TextWrapper class for available keyword args to customize
    wrapping behaviour.
    """

Answer 1 (score: 19)

You can use a simple loop (written here as a list comprehension), where n is the length of each chunk:

parts = [your_string[i:i+n] for i in range(0, len(your_string), n)]
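As a minimal sketch, the same expression can be wrapped in a function and applied to the 128-character example from the question. The function name split_fixed and the sample string are assumptions made for illustration.

```python
def split_fixed(text, n):
    """Return consecutive slices of text, each n characters long.

    The last slice is shorter when len(text) is not a multiple of n.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [text[i:i + n] for i in range(0, len(text), n)]


# A 128-character string, as in the question
your_string = "a" * 32 + "b" * 32 + "c" * 32 + "d" * 32
parts = split_fixed(your_string, 32)

print(len(parts))               # 4
print([len(p) for p in parts])  # [32, 32, 32, 32]
```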

Answer 2 (score: 12)

Another common way of grouping elements into groups of length n:

>>> s = '1234567890'
>>> list(map(''.join, zip(*[iter(s)]*2)))
['12', '34', '56', '78', '90']

This method comes straight from the documentation for zip().
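One thing to watch for: zip() stops at the shortest input, so when the length is not a multiple of the group size the leftover characters are silently dropped. A possible workaround, sketched here with itertools.zip_longest and an empty-string fill value, keeps the tail:

```python
from itertools import zip_longest

s = '123456789'  # odd length, so the last group is incomplete

# zip() discards the incomplete final group: the trailing '9' is lost
truncated = list(map(''.join, zip(*[iter(s)] * 2)))

# zip_longest() pads the final group with fillvalue instead
padded = list(map(''.join, zip_longest(*[iter(s)] * 2, fillvalue='')))

print(truncated)  # ['12', '34', '56', '78']
print(padded)     # ['12', '34', '56', '78', '9']
```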

Answer 3 (score: 4)

A recursive approach:

def split_str(seq, chunk, skip_tail=False):
    lst = []
    if chunk <= len(seq):
        lst.extend([seq[:chunk]])
        lst.extend(split_str(seq[chunk:], chunk, skip_tail))
    elif not skip_tail and seq:
        lst.extend([seq])
    return lst

Demo:

seq = "123456789abcdefghij"

print(split_str(seq, 3))
print(split_str(seq, 3, skip_tail=True))

# ['123', '456', '789', 'abc', 'def', 'ghi', 'j']
# ['123', '456', '789', 'abc', 'def', 'ghi']
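The recursive version makes one call per chunk, so a very long string with a small chunk size can run into Python's recursion limit. Below is a sketch of an iterative variant with the same skip_tail behaviour; the name split_str_iter is made up here and chunk is assumed to be a positive integer.

```python
def split_str_iter(seq, chunk, skip_tail=False):
    # Slice at every multiple of chunk; the last piece may be shorter
    parts = [seq[i:i + chunk] for i in range(0, len(seq), chunk)]
    # Optionally drop an incomplete final piece
    if skip_tail and parts and len(parts[-1]) < chunk:
        parts.pop()
    return parts


seq = "123456789abcdefghij"
print(split_str_iter(seq, 3))
print(split_str_iter(seq, 3, skip_tail=True))

# ['123', '456', '789', 'abc', 'def', 'ghi', 'j']
# ['123', '456', '789', 'abc', 'def', 'ghi']
```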

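Answer 4 (score: 3)

In many cases you can treat a string much like a list. There are many answers here: Splitting a list into N parts of approximately equal length

For example, you can compute chunk_size = len(my_string) // N (integer division, so that the result can be used as a slice index).

Then, to access a chunk, take my_string[i: i + chunk_size] and increment i by chunk_size each time, either in a for loop or in a list comprehension.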
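When the length is not a multiple of N, a fixed chunk_size leaves some characters over. One possible way to get exactly N parts of approximately equal length is to hand the remainder out one character at a time to the first parts. This is only a sketch, and the function name split_into_n is hypothetical.

```python
def split_into_n(my_string, N):
    # Base size of each part, plus how many parts need one extra character
    chunk_size, remainder = divmod(len(my_string), N)
    parts = []
    i = 0
    for k in range(N):
        # The first `remainder` parts are one character longer
        size = chunk_size + (1 if k < remainder else 0)
        parts.append(my_string[i:i + size])
        i += size
    return parts


print(split_into_n("abcdefghij", 4))  # ['abc', 'def', 'gh', 'ij']
```

For a string whose length is an exact multiple of N, such as the 128-character example with N = 4, every part has the same length.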

Answer 5 (score: 2)

I like iterators!

def chunk(in_string,num_chunks):
    chunk_size = len(in_string)//num_chunks
    if len(in_string) % num_chunks: chunk_size += 1
    iterator = iter(in_string)
    for _ in range(num_chunks):
        accumulator = list()
        for _ in range(chunk_size):
            try: accumulator.append(next(iterator))
            except StopIteration: break
        yield ''.join(accumulator)

## DEMO
>>> string = "a"*32+"b"*32+"c"*32+"d"*32
>>> list(chunk(string,4))
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccccccccccccccccccccccc', 'dddddddddddddddddddddddddddddddd']
>>> string += "e" # so it's not evenly divisible
>>> list(chunk(string,4))
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcc', 'ccccccccccccccccccccccccccccccddd', 'ddddddddddddddddddddddddddddde']

It is also noticeably faster than textwrap.wrap, though almost certainly less "good":

>>> import timeit, textwrap
>>> timeit.timeit(lambda: list(chunk(string,4)),number=500)
0.047726927170444355
>>> timeit.timeit(lambda: textwrap.wrap(string,len(string)//4),number=500)
0.20812756575945457

It is easy to hack this to work on any iterable (just drop the str.join and yield the accumulator, unless isinstance(in_string, str)).

# after a petty hack
>>> list(chunk([1,2,3,4,5,6,7,8,9,10,11,12],4))
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
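One way that hack might look, as a rough sketch: itertools.islice replaces the inner loop, and the isinstance check decides whether to join. The name chunk_any is made up for illustration, and the input still has to support len(), so this works for sequences rather than arbitrary iterators.

```python
from itertools import islice


def chunk_any(items, num_chunks):
    # Ceiling division: the same chunk size that chunk() above computes
    chunk_size = -(-len(items) // num_chunks)
    iterator = iter(items)
    for _ in range(num_chunks):
        # Take up to chunk_size items; fewer if the iterator runs out
        piece = list(islice(iterator, chunk_size))
        # Strings are joined back into strings, anything else stays a list
        yield ''.join(piece) if isinstance(items, str) else piece


print(list(chunk_any([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 4)))
print(list(chunk_any("abcdefgh", 4)))  # ['ab', 'cd', 'ef', 'gh']
```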