Reading JSON files from all subdirectories in Python

Time: 2019-07-03 09:12:19

Tags: json python-3.x pandas

I have the following folder structure:

Directory    
    - Subdirectory 1:
       file.json
    - Subdirectory 2:
       file.json
    - Subdirectory 3:
       file.json
    - Subdirectory 4:
       file.json

How can I read these JSON files with Pandas?

3 answers:

Answer 0: (score: 1)

You can do the following:

import glob
import os

import pandas as pd

working_directory = os.getcwd()

# Full paths of the immediate subdirectories of the working directory
sub_directories = [
    os.path.join(working_directory, x)
    for x in os.listdir(working_directory)
    if os.path.isdir(os.path.join(working_directory, x))
]
all_json_files = []

# Collect every *.json file in each subdirectory; globbing on the full
# path avoids having to chdir into each subdirectory and back
for sub_dir in sub_directories:
    all_json_files.extend(glob.glob(os.path.join(sub_dir, "*.json")))

# One DataFrame per JSON file
list_of_dfs = [pd.read_json(x) for x in all_json_files]

From there, if all the JSON files have the same structure, you can concatenate them into a single DataFrame:

final_df = pd.concat(list_of_dfs)
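One detail of the concatenation step is worth illustrating: by default pd.concat keeps each frame's own row index, so the combined index can contain duplicates. The sketch below uses two small made-up frames (df_a and df_b are hypothetical stand-ins for the per-file DataFrames) to show the difference that ignore_index=True makes.

```python
import pandas as pd

# Two hypothetical frames standing in for the per-file DataFrames.
df_a = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})
df_b = pd.DataFrame({"id": [3], "value": ["c"]})

# Default behaviour: each frame keeps its own 0-based index,
# so the combined index contains duplicate labels.
stacked = pd.concat([df_a, df_b])

# ignore_index=True renumbers the rows consecutively from 0.
final_df = pd.concat([df_a, df_b], ignore_index=True)

print(list(stacked.index))
print(list(final_df.index))
```

Whether to renumber depends on whether the original row positions within each file matter for later processing.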

Answer 1: (score: 0)

Try the following code:

import pandas as pd
from pathlib import Path

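# "**/*.json" matches JSON files at any depth under Directory
files = Path("Directory").glob("**/*.json")

# Keep one DataFrame per file instead of overwriting df on every pass
dfs = [pd.read_json(file) for file in files]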

To learn more about converting a JSON string to a pandas object:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html
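As a rough sketch of how the pathlib approach could be wrapped into a reusable helper, the function below reads every JSON file under a root folder and records which subdirectory each row came from. The names load_json_tree and source_dir are made up for illustration, and the code assumes every file holds data that pd.read_json can parse into a table with compatible columns.

```python
import pandas as pd
from pathlib import Path


def load_json_tree(root):
    """Read every *.json file under root into a single DataFrame.

    Hypothetical helper: assumes each file is tabular JSON that
    pd.read_json can parse and that the files share a structure.
    """
    frames = []
    # rglob("*.json") is equivalent to glob("**/*.json");
    # sorting makes the row order reproducible.
    for path in sorted(Path(root).rglob("*.json")):
        df = pd.read_json(path)
        # Record the name of the folder the file was found in.
        df["source_dir"] = path.parent.name
        frames.append(df)
    if not frames:
        # pd.concat raises on an empty list, so return an empty frame.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

Calling load_json_tree("Directory") on the layout from the question would then give one DataFrame whose source_dir column distinguishes the rows from each subdirectory.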

Answer 2: (score: -2)

from pathlib import Path

files = Path("Directory").glob("**/*.json")

for file in files:
    print(file)

This recursively prints every JSON file found under the given directory.
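Note that the "**/*.json" pattern also matches JSON files sitting directly in the top-level folder and in deeper nested folders. If only the immediate subdirectories should be searched, as in the layout from the question, the pattern "*/*.json" is one option. The sketch below builds a throwaway tree (the file names are made up for illustration) to compare the two patterns.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout: one file at the top, one a level down, one two levels down.
    (root / "sub1" / "deep").mkdir(parents=True)
    for rel in ["top.json", "sub1/file.json", "sub1/deep/file.json"]:
        (root / rel).write_text("[]")

    # "**" matches zero or more directory levels.
    recursive = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.json"))
    # "*" matches exactly one directory level.
    one_level = sorted(p.relative_to(root).as_posix() for p in root.glob("*/*.json"))

print(recursive)  # all three files
print(one_level)  # only sub1/file.json
```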