Question

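For one of our environments, we have an automated crontab tool that writes its results to a file in the format shown below.

As you can see, directories have a size of "0", while files carry their actual sizes. I need a script that reads this file and produces JSON output. However, each directory's size has to be computed from all the files in its hierarchy.

File output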

drwxrwxr-x  -    user1       root 1398089926561          0 /user
drwxrwxr-x  -    user1       root 1398089926561          0 /user/user1
drwxr-xr-x  -    user1       root 1398089926586          0 /user/user1/dir1
-rwxr-xr-x  1    user1       root 1398089926972        200 /user/user1/dir1/file1.csv
-rwxr-xr-x  1    user1       root 1398089927009        300 /user/user1/dir1/file2.tsv
drwxr-xr-x  -    user1       root 1398089929786          0 /user/user1/temp/lib
-rwxr-xr-x  1    user1       root 1398089927077         50 /user/user1/temp/lib/file5.txt
-rwxr-xr-x  1    user1       root 1398089927139        600 /user/user1/temp/lib/file.jar
drwxr-xr-x  -     root       root 1398089829218          0 /app
drwxr-xr-x  -     root       root 1398089829218          0 /app/panther
drwxrwxrwx  -    N1234       root 1398176496064          0 /app/panther/warehouse/warehouse
drwxr-xr-x  -      E56       root 1398176493177          0 /app/panther/warehouse/warehouse/sample_07
-rw-r--r--  1      E56       root 1398176493340         50 /app/panther/warehouse/warehouse/sample_07/sample_07.csv
drwxr-xr-x  -      E56       root 1398176495945          0 /app/panther/warehouse/warehouse/sample_08
-rw-r--r--  1      E56       root 1398176495981        250 /app/panther/warehouse/warehouse/sample_08/sample_08.csv

The output should look like this:

{

"name":"Total",
"size":1450,
"children":[
    {
        "name":"user",
        "size":1150,
        "children":[
            {
                "name":"user1",
                "size":1150,
                "children":[
                    {
                        "name":"dir1",
                        "size":500,
                        "children":[
                            {
                                "name":"file1.csv",
                                "size":200
                            },
                            {
                                "name":"file2.tsv",
                                "size":300
                            }
                        ]
                    },
                    {
                        "name":"temp",
                        "size":650,
                        "children":[
                            {
                                "name":"lib",
                                "size":650,
                                "children":[
                                    {
                                        "name":"file5.txt",
                                        "size":50
                                    },
                                    {
                                        "name":"file.jar",
                                        "size":600
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    },
    {
        "name":"app",
        "size":300,
        "children":[
            {
                "name":"panther",
                "size":300,
                "children":[
                    {
                        "name":"warehouse",
                        "size":300,
                        "children":[
                            {
                                "name":"sample_07",
                                "size":50,
                                "children":[
                                    {
                                        "name":"sample_07.csv",
                                        "size":50
                                    }
                                ]
                            },
                            {
                                "name":"sample_08",
                                "size":250,
                                "children":[
                                    {
                                        "name":"sample_08.csv",
                                        "size":250
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
]}

As you can see, the directories are laid out hierarchically and show their aggregated sizes.

Any help is greatly appreciated.

Thanks!

Answer 1

You can use the tree command. Note that it is a shell command; to get JSON output you have to add the -J flag.

To do this from Python, use the subprocess module:

from subprocess import call
call(['tree', '-J'])
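
If the JSON is needed as a Python object rather than printed to the terminal, the output can be captured and parsed. This is a minimal sketch, assuming an installed tree build that supports -J; tree_as_json is a hypothetical helper name, and it returns None when tree is missing or fails. Keep in mind that tree walks a real directory on the local filesystem, so it does not read a listing file like the one in the question.

```python
import json
import shutil
import subprocess

def tree_as_json(path="."):
    """Return the parsed output of `tree -J path`, or None if it is unavailable."""
    # Assumption: a `tree` executable with JSON support (-J) is on PATH.
    if shutil.which("tree") is None:
        return None
    try:
        result = subprocess.run(
            ["tree", "-J", path], capture_output=True, text=True
        )
        if result.returncode != 0:
            return None
        return json.loads(result.stdout)
    except (OSError, ValueError):
        # Covers launch failures, undecodable output, and invalid JSON.
        return None

data = tree_as_json(".")
print("tree output unavailable" if data is None else type(data).__name__)
```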

Answer 2

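With your structure it is hard to tell whether a particular directory has already been seen. It gets easier if children is instead a dictionary keyed by name, with each node holding (children, size). You can then build the tree with a pseudo-recursive construction and total it with a recursive tree sum.

Code: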

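root = {}
for line in s:  # s: any iterable of listing lines, e.g. an open file object
    # Columns: permissions, links, owner, group, timestamp, size, path.
    # Splitting on whitespace avoids relying on a fixed column offset and
    # drops the trailing newline; maxsplit keeps spaces inside the path.
    fields = line.strip().split(None, 6)
    if len(fields) < 7:
        continue  # skip blank or malformed lines
    size, p = fields[5], fields[6]
    path = p[1:].split('/')  # drop the leading '/' and split into components
    o = root
    for i in path:
        # Descend one level, creating missing intermediate nodes on the way.
        o = o.setdefault('children', {})
        o = o.setdefault(i, {})
    o['size'] = int(size)

def sumtree(d):
    # A node without children is a file (or empty directory): size is final.
    if 'children' not in d:
        return d['size']
    # Directories start at 0 (or are implicit) and accumulate their subtree.
    d.setdefault('size', 0)
    for c in d['children'].values():
        d['size'] += sumtree(c)
    return d['size']

# Call sumtree only once per tree: it adds to the stored sizes in place.
print("Total: {}".format(sumtree(root)))
print(root)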

Output:

Total: 1450
{
    'children': {
        'app': {
            'children': {
                'panther': {
                    'children': {
                        'warehouse': {
                            'children': {
                                'warehouse': {
                                    'children': {
                                        'sample_07': {
                                            'children': {
                                                'sample_07.csv': {'size': 50}
                                            },
                                            'size': 50},
                                        'sample_08': {
                                            'children': {
                                                'sample_08.csv': {'size': 250}
                                            },
                                            'size': 250}},
                                    'size': 300}},
                            'size': 300}},
                    'size': 300}},
            'size': 300},
        'user': {
            'children': {
                'user1': {
                    'children': {
                        'dir1': {
                            'children': {
                                'file1.csv': {'size': 200},
                                'file2.tsv': {'size': 300}
                            },
                            'size': 500
                        },
                        'temp': {
                            'children': {
                                'lib': {
                                    'children': {
                                        'file.jar': {'size': 600},
                                        'file5.txt': {'size': 50}
                                    },
                                    'size': 650}},
                            'size': 650}},
                    'size': 1150}},
            'size': 1150}},
    'size': 1450
}
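
The dictionary above is keyed by name, while the question asks for nodes with "name", "size" and a "children" list. One way to bridge the two is a small recursive conversion followed by json.dumps. This is only a sketch: to_named_tree is a hypothetical helper, and the sample tree is hand-built in the same shape as the output above, with sizes already summed.

```python
import json

def to_named_tree(name, node):
    """Convert a {'size', 'children': {name: node}} dict to the list-based form."""
    out = {"name": name, "size": node.get("size", 0)}
    children = node.get("children")
    if children:
        # Each key of the children dict becomes the "name" of a list entry.
        out["children"] = [to_named_tree(k, v) for k, v in children.items()]
    return out

# Hand-built sample in the dict-of-children shape (sizes already aggregated).
sample = {
    "size": 500,
    "children": {
        "user": {
            "size": 500,
            "children": {
                "file1.csv": {"size": 200},
                "file2.tsv": {"size": 300},
            },
        },
    },
}

result = to_named_tree("Total", sample)
print(json.dumps(result, indent=4))
```

Nodes without children are emitted without a "children" key, so an empty directory looks the same as a file here; telling them apart would need the permission column from the listing.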

Create a JSON file from paths provided in a file
