Question

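I have a file that looks like this:

I want to generate a pool of numbers in which the first number on each line appears x times (where x is the second number on that line).

This is the output I want: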

[100, 100, 300, 300, 300, 50, 500, 500, 500, 500, 500]

I wrote a function like this:

def Pool(pos, count):
    pool = pos*int(count)
    return pool

And for each line, I append all the numbers to a variable called bigpool:

bigpool = []
for line in weightposfile:
    line = line.rstrip()
    f = line.split('\t')
    pos = f[0]
    count = int(f[1])
    pool = Pool(pos, count)
    bigpool.append(pool)

But this returns a list like this:

[100100, 300300300, 50, 500500500500500]

How can I separate the numbers and get the output I want (shown above)?

Answer 1

This should work:

def Pool(pos, count):
    return [pos] * count

bigpool = []
for line in weightposfile:
    line = line.rstrip()
    f = line.split('\t')
    pos = f[0]
    count = int(f[1])
    pool = Pool(pos, count)
    bigpool += pool

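I changed two lines. return [pos] * count creates a list containing pos repeated count times.

bigpool += pool adds the elements of pool to bigpool one by one, instead of appending pool as a single item. Note that pos is still a string here; use int(f[0]) if you need integers.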
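To see why those two changes matter, here is a minimal sketch. The values of pos and count are made up for illustration and are not read from the file.

```python
pos, count = "100", 2  # illustrative values; pos is a string, as read from the file

# Multiplying a string repeats its characters: this is the original bug.
joined = pos * count          # "100100"

# Multiplying a one-element list repeats the element instead.
repeated = [pos] * count      # ["100", "100"]

# append adds the whole list as a single nested element...
nested = []
nested.append(repeated)       # [["100", "100"]]

# ...while += (like extend) adds each element individually.
flat = []
flat += repeated              # ["100", "100"]
```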

Answer 2

You can do this with a generator expression together with itertools.repeat() and itertools.chain.from_iterable().

from itertools import repeat, chain
with open("file.dat", "r") as f:
    output = list(chain.from_iterable(repeat(int(number), int(count)) for (number, count) in (line.split() for line in f)))
print(output)

This gives us:

[100, 100, 300, 300, 300, 50, 500, 500, 500, 500, 500]

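Now, this is a fairly complex list comprehension (technically, a generator expression), so let's break it down. We first open the file (using a with statement, which is best practice). The first thing we do is split each line on whitespace, giving us a sequence of (number, count) pairs.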

(line.split() for line in f)

Then we take these pairs and repeat each number the given number of times:

repeat(int(number), int(count)) for (number, count) in ...

We now have a generator of repeat iterators (basically a list of lists), so we flatten them into a single list:

list(chain.from_iterable(...))

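If you follow it through step by step, this is actually a nice way to do the whole thing in one line of code. It is dense in meaning, yet quite readable.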
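The same breakdown can be sketched with one named intermediate per step. The lines list below is a hypothetical in-memory stand-in for the file, chosen to match the desired output; list comprehensions are used instead of generator expressions so each stage can be inspected.

```python
from itertools import chain, repeat

# Hypothetical stand-in for the lines of the file (an assumption for illustration).
lines = ["100\t2", "300\t3", "50\t1", "500\t5"]

# Step 1: split each line on whitespace into [number, count] pairs.
pairs = [line.split() for line in lines]

# Step 2: build one repeat iterator per pair.
repeats = [repeat(int(number), int(count)) for number, count in pairs]

# Step 3: flatten the iterators into a single list.
output = list(chain.from_iterable(repeats))
print(output)  # [100, 100, 300, 300, 300, 50, 500, 500, 500, 500, 500]
```

The one-liner does the same work lazily, without building the intermediate lists.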

Answer 3

def Pool(pos, count):
    pool = [int(pos) for x in range(int(count))]
    return pool

Answer 4

You were so close! Just do:

bigpool = []
for line in weightposfile:
    line = line.rstrip()
    f = line.split('\t')
    pos = []
    pos.append(f[0])
    count = int(f[1])
    pool = Pool(pos, count)
    bigpool.extend(pool)

Multiplying a list by an integer a produces a new list in which each element of the original appears a times.
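A small sketch of list multiplication, with made-up values. When the list has more than one element, the whole sequence is repeated in order rather than each element being grouped together.

```python
a = 3

# A one-element list: the element appears a times.
single = ["100"] * a          # ["100", "100", "100"]

# A longer list: the sequence as a whole is repeated.
several = ["1", "2"] * 2      # ["1", "2", "1", "2"]
```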

Answer 5

How about this?

fromfile = "100 2\n300 3\n50 1\n500 5"
result = []
for entry in fromfile.split("\n"):
    num, count = entry.split()
    # Append the number once per count, as an int to match the desired output.
    for i in range(int(count)):
        result.append(int(num))
print(result)

Answer 6

Try this implementation; it works as expected and is simpler:

def pool(pos, count):
    return [pos] * int(count)

bigpool = []
for line in weightposfile:
    pos, count = line.strip().split()
    bigpool.extend(pool(pos, count))

Answer 7

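If you have variable spacing, this should suit your needs:

import re
results = []
# Raw string for the pattern; group 1 is the number, group 2 is the count.
pre = re.compile(r'^(\d+)\s+(\d+)', re.M)

# Assumes weightposfile holds the file's contents as a single string.
for line in weightposfile.split("\n"):
    matchline = pre.match(line)
    if matchline is None:
        continue  # skip blank or malformed lines
    # Repeat the number (group 1) count (group 2) times.
    for i in range(int(matchline.group(2))):
        results.append(int(matchline.group(1)))
print(results)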
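As an alternative to a regular expression, str.split() called with no arguments also copes with variable spacing, since it splits on any run of whitespace and ignores leading and trailing whitespace. This is a sketch of a different technique from the regex above; parse_line is a hypothetical helper and the sample lines are made up.

```python
def parse_line(line):
    """Hypothetical helper: return (number, count) from one whitespace-separated line."""
    number, count = line.split()  # splits on any run of spaces or tabs
    return int(number), int(count)

# Made-up sample lines with different kinds of spacing.
samples = ["100 2", "300\t3", "  50     1  "]
parsed = [parse_line(s) for s in samples]
```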

Generate a list of numbers using information from another list
