Question

I created the following program in Python:

import csv
import os
from random import randint

with open('number.csv',"w") as f_out:
    f_write = csv.writer(f_out,delimiter=',')
    statinfo = os.stat('number.csv')
    file_size = statinfo.st_size
    x=[randint(0,99) for p in range(0,99)]

    for i in x:
        f_write.writerow(x)

The program above does not give the expected result; it writes the same row over and over, and I don't know how to reach 1 GB.

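Duplicate numbers are allowed.
The file must stop growing once its size reaches 1 GB.
So I want random numbers (0 to 99), with commas separating the columns in each row and the same number of values in every row, written until the file size reaches 1 GB.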

Answer 1

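You need to generate a new row each time, and check the file size after each row is written. You can use the file.tell() method to read the current file position; once it passes 1 GB, the file is large enough: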

import csv
from random import randint

per_row = 100  # number of columns per row to generate
target_size = 1024 ** 3  # 1 GiB, see https://en.wikipedia.org/wiki/Gibibyte

with open('number.csv', 'w', newline='') as f_out:
    f_write = csv.writer(f_out)

    while f_out.tell() < target_size:
        row = [randint(0, 99) for _ in range(per_row)]
        f_write.writerow(row)

I'm assuming that 1 GB means 1 gibibyte. If you need SI units instead, replace the 1024 ** 3 value with 1000 ** 3.
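To try the approach without waiting for a full gigabyte, the same loop can be wrapped in a function that takes the target size as a parameter. This is only a sketch: the name write_random_csv, the temporary demo path, and the 10 KiB demo size are assumptions made for illustration. It also shows why the os.stat() call in the question was not useful: the size is only reliable on disk after the data has been flushed, whereas tell() reports the current position while writing.

```python
import csv
import os
import tempfile
from random import randint


def write_random_csv(path, target_size, per_row=100):
    """Write rows of random integers (0-99) until the file reaches target_size bytes.

    Hypothetical helper for illustration; returns the final size on disk.
    """
    with open(path, 'w', newline='') as f_out:
        writer = csv.writer(f_out)
        # tell() gives the current file position, so no os.stat() call is needed
        while f_out.tell() < target_size:
            # a fresh row is generated on every iteration
            writer.writerow([randint(0, 99) for _ in range(per_row)])
    # the file is closed (and flushed) here, so the on-disk size is reliable
    return os.path.getsize(path)


# Demo with 10 KiB instead of 1 GiB (assumed small size, for a quick check)
demo_target = 10 * 1024
demo_path = os.path.join(tempfile.mkdtemp(), 'number_demo.csv')
demo_size = write_random_csv(demo_path, demo_target)
print(demo_size)
```

The result is at least the target size and overshoots by less than one row, because the size check only happens between rows.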

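If all you want is one continuous stream of numbers (no row separators), write a single number, then keep writing a comma followed by the next number until you reach the target size. There is no need for the csv module in that case: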

from random import choice

target_size = 1024 ** 3  # 1 GiB, see https://en.wikipedia.org/wiki/Gibibyte
# convert to string just the once
numbers = [str(i) for i in range(100)]

with open('number.csv', 'w') as f_out:
    f_out.write(choice(numbers))
    while f_out.tell() < target_size:
        f_out.write(',{}'.format(choice(numbers)))
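As a variation on the loop above (not part of the original answer), the size can also be tracked from the return value of write(), which in text mode is the number of characters written. Since the output here is plain ASCII digits and commas with no newlines, that count should match the byte size of the file. The function name write_number_stream and the small demo size are assumptions for illustration.

```python
import os
import tempfile
from random import choice

# convert to string just the once
numbers = [str(i) for i in range(100)]


def write_number_stream(path, target_size):
    """Write comma-separated random numbers until target_size characters are written.

    Hypothetical helper; counts characters via write() instead of calling tell().
    """
    written = 0
    with open(path, 'w') as f_out:
        # first number has no leading comma
        written += f_out.write(choice(numbers))
        while written < target_size:
            written += f_out.write(',' + choice(numbers))
    return written


# Demo with 4 KiB instead of 1 GiB (assumed small size, for a quick check)
stream_target = 4 * 1024
stream_path = os.path.join(tempfile.mkdtemp(), 'number_stream.csv')
stream_written = write_number_stream(stream_path, stream_target)
print(stream_written, os.path.getsize(stream_path))
```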

You may get better performance by joining a batch of numbers together and writing them in larger chunks:

from random import choice

target_size = 1024 ** 3  # 1 GiB, see https://en.wikipedia.org/wiki/Gibibyte
# convert to string just the once
numbers = [str(i) for i in range(100)]

with open('number.csv', 'w') as f_out:
    f_out.write(choice(numbers))
    while f_out.tell() < target_size:
        f_out.write(',')
        chunk = ','.join([choice(numbers) for _ in range(353)])
        f_out.write(chunk)

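This gets you to 1 GiB faster, but may overshoot slightly. Given the distribution of the input, 353 numbers give about 1 KiB of text (~35 one-character strings and ~318 two-character strings, plus 353 commas == 1024 characters).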
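The figure of 353 can be checked with a few lines of arithmetic. The sketch below assumes, as the answer does, that each of the 100 strings is equally likely and that every number comes with one comma; the variable names are chosen here for illustration.

```python
numbers = [str(i) for i in range(100)]

# Average text length of one number: 10 one-digit and 90 two-digit strings
avg_digits = sum(len(n) for n in numbers) / len(numbers)  # 190 / 100 = 1.9

# Each number in the stream is accompanied by one comma
avg_chars_per_number = avg_digits + 1

# How many numbers fit in roughly 1 KiB of text
numbers_per_kib = round(1024 / avg_chars_per_number)
print(numbers_per_kib)  # 353
```

Because the actual mix of one-digit and two-digit numbers varies from chunk to chunk, each chunk is only about 1 KiB on average, not exactly 1024 characters.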

Trying to create a 1 GB file using randomly generated numbers
