Changing code to avoid multiple loops when reading a large dataset

Time: 2018-10-26 01:10:31

Tags: python dictionary

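Is there a simple way to build a dictionary from many different JSON files? I want to create the dictionary, but I also need to keep processing time in mind, for example when I have 1000 JSON files.

My current code: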

# My current approach loops over each file and builds a dictionary for that file.
# Because of the nested loops, it takes a long time when there are many large files,
# so I am looking for an alternative to the code below.

file = [[[json],[json],[json],[json],[json]]]  # placeholder for 5 JSON files, for explanation only; the real goal is 1000+ JSON files

full_dic = {}
for i, f in enumerate(file):  # loops over each file; slow with thousands of files
    dictionary = {}
    for ii, fs in enumerate(f):  # loops over each sentence in the file; even worse with thousands of files
        # Here I build my dictionary from the JSON file contents. This part
        # depends on my JSON files, so it can be ignored.
        # In the end, the dictionary is filled like this:
        dictionary[ii] = something
    full_dic[i] = dictionary

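Is there a better or faster way to do this when the dataset is large?

1 Answer:

Answer 0: (score: 1)

This could be a one-liner (assuming the file list holds the paths of your JSON files):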

import json

my_dict = {i: json.load(open(file[i])) for i in range(len(file))}
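The one-liner above never closes the files it opens. A minimal sketch of the same idea with explicit cleanup could look like the following; the function name load_json_files and the UTF-8 encoding are assumptions made for illustration, not part of the original answer.

```python
import json


def load_json_files(paths):
    """Return {index: parsed JSON content} for every path in `paths`.

    `load_json_files` is a hypothetical helper name; UTF-8 is assumed.
    """
    result = {}
    for i, path in enumerate(paths):
        # The with-block closes each file as soon as it has been parsed.
        with open(path, encoding="utf-8") as fh:
            result[i] = json.load(fh)
    return result
```

Calling load_json_files(json_paths) on a list of paths should give the same kind of dictionary as the comprehension, keyed by position in the list. It still reads every file once, so it mainly tidies up resource handling rather than removing the per-file work.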

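However, you should first make sure that the contents of these 1000 JSON files fit in memory. It is also best not to name the variable file, because that shadows one of the default __builtin__ names in Python 2.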
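If holding all of the parsed files in memory at once is a concern, one possible alternative is to process them one at a time with a generator. This is only a sketch: the name iter_json_files and the UTF-8 encoding are assumptions, and it helps only when each file can be handled independently of the others.

```python
import json


def iter_json_files(paths):
    """Yield (index, parsed JSON content) for one file at a time.

    `iter_json_files` is a hypothetical helper name; UTF-8 is assumed.
    """
    for i, path in enumerate(paths):
        with open(path, encoding="utf-8") as fh:
            # Only the file currently being yielded is kept by this function.
            yield i, json.load(fh)
```

A loop such as for i, data in iter_json_files(json_paths) can then handle each file's content and discard it before the next one is read, while dict(iter_json_files(json_paths)) rebuilds the full dictionary when it does fit in memory.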