Generating a list of numbers

Date: 2017-10-29 08:32:59

Tags: python csv range xrange

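Hello, I want to generate a list of numbers from 1000000 to 2000000, but I get a MemoryError. When I used random everything worked, except that I got duplicated numbers. I can't have duplicates, so I switched to xrange: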

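from csv import reader as csv_reader, writer as csv_writer  # assumed import; not shown in the original post

# work_dir is assumed to be defined elsewhere in the original script
data = []
total = 2000000
def resource_file(info):
    with open(info, "r") as data_file:
        reader = csv_reader(data_file, delimiter=",")
        for row in reader:
            try:
                for i in xrange(1000000, total):
                    new_row = [row[0], row[1], i]
                    # every generated row is kept in memory until the whole file has been read
                    data.append(new_row)
            except IndexError as error:
                print(error)
    with open(work_dir + "new_data.csv", "w") as new_data:
        writer = csv_writer(new_data, delimiter=",")
        for new_row in data:
            writer.writerow(new_row)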

1 answer:

Answer 0 (score: 3)

Repeating each row with an extra column of 1M..2M

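The problem is that you first store all of these combinations in memory. Python does not have a very compact memory model to begin with, and one million entries for every input row adds up to a very large list.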
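To make the memory argument concrete, here is a minimal sketch that compares a materialised list of rows with a lazy range object. It assumes CPython 3, where `range` is lazy in the same way Python 2's `xrange` is. The value `n = 1000` is a small stand-in for the one million numbers in the question, and the exact byte counts vary by Python version and platform.

```python
import sys

# Small stand-in for the 1,000,000 numbers generated per input row.
n = 1000

# Materialised: every generated row is a separate list object held in memory.
rows = [["a", "b", i] for i in range(n)]

# Lazy: a range object stores only start, stop and step.
lazy = range(n)

# Shallow size of the outer list plus the shallow size of each inner row.
list_bytes = sys.getsizeof(rows) + sum(sys.getsizeof(r) for r in rows)
range_bytes = sys.getsizeof(lazy)

print(list_bytes, range_bytes)
```

The list cost grows linearly with `n` and is paid again for every input row, while the range object stays the same size however many numbers it covers.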

I would suggest not storing the data in a list at all, and instead writing each row to the file immediately:

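from csv import reader as csv_reader, writer as csv_writer  # assumed import; not shown in the original post

total = 2000000
def resource_file(info):
    # work_dir is assumed to be defined elsewhere, as in the question
    with open(info, "r") as data_file:
        reader = csv_reader(data_file, delimiter=",")
        with open(work_dir + "new_data.csv", "w") as new_data:
            writer = csv_writer(new_data, delimiter=",")
            for row in reader:
                rowa, rowb = row[0:2]
                # write each generated row straight to disk instead of collecting it in a list
                for number in xrange(1000000, total):
                    writer.writerow([rowa, rowb, number])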
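For readers on Python 3, the same streaming idea might look like the sketch below. It is not the answerer's code: the function name `expand_rows`, its parameters, and the temporary file names are hypothetical, `range` replaces `xrange`, and short rows are skipped as an assumption about the desired behaviour. The demo uses the tiny range 10..12 so that it runs instantly.

```python
import csv
import os
import tempfile


def expand_rows(src_path, dst_path, start, stop):
    """Write each input row once per number in [start, stop), streaming to disk."""
    written = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for row in reader:
            if len(row) < 2:
                continue  # assumption: skip rows that lack two columns
            first, second = row[:2]
            for number in range(start, stop):
                # Only one output row exists in memory at any time.
                writer.writerow([first, second, number])
                written += 1
    return written


# Tiny demonstration with hypothetical file names in a temporary directory.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "input.csv")
dst_path = os.path.join(tmp_dir, "new_data.csv")
with open(src_path, "w", newline="") as f:
    csv.writer(f).writerows([["x", "1"], ["y", "2"]])

count = expand_rows(src_path, dst_path, 10, 13)
with open(dst_path, newline="") as f:
    result = list(csv.reader(f))
print(count, result[0], result[-1])
```

Calling `expand_rows(src_path, dst_path, 1000000, 2000000)` would reproduce the range from the question. The output file then holds one million lines per input row, so disk space becomes the limit rather than memory.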

Taking rows 1M-2M of the file

If you instead want to take rows 1M to 2M of the original file, you can write it as:

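from csv import reader as csv_reader, writer as csv_writer  # assumed import; not shown in the original post
from itertools import islice

total = 2000000
def resource_file(info):
    # work_dir is assumed to be defined elsewhere, as in the question
    with open(info, "r") as data_file:
        reader = csv_reader(data_file, delimiter=",")
        with open(work_dir + "new_data.csv", "w") as new_data:
            writer = csv_writer(new_data, delimiter=",")
            # islice skips the first 1000000 rows lazily and stops before row index `total`
            for row in islice(reader, 1000000, total):
                writer.writerow(row)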

Or you can simplify it, as @JonClemens says:

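from csv import reader as csv_reader, writer as csv_writer  # assumed import; not shown in the original post
from itertools import islice

total = 2000000
def resource_file(info):
    # work_dir is assumed to be defined elsewhere, as in the question
    with open(info, "r") as data_file:
        reader = csv_reader(data_file, delimiter=",")
        with open(work_dir + "new_data.csv", "w") as new_data:
            writer = csv_writer(new_data, delimiter=",")
            # writerows consumes the islice iterator one row at a time
            writer.writerows(islice(reader, 1000000, total))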
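The indexing used by `islice` is easy to get wrong, so here is a small sketch of the same pattern on six in-memory rows. It assumes Python 3 and uses `io.StringIO` objects in place of real files; the row contents `r0`..`r5` are made up for illustration.

```python
import csv
import io
from itertools import islice

# Six one-column rows standing in for the large input file.
source = io.StringIO("r0\nr1\nr2\nr3\nr4\nr5\n")
target = io.StringIO()

reader = csv.reader(source)
writer = csv.writer(target, lineterminator="\n")

# islice(reader, 2, 4) skips the rows at index 0 and 1,
# yields the rows at index 2 and 3, and stops before index 4.
writer.writerows(islice(reader, 2, 4))

copied = target.getvalue()
print(copied)
```

By the same rule, `islice(reader, 1000000, 2000000)` yields the rows at zero-based indices 1000000 through 1999999. A header line, if the file has one, counts as the row at index 0.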