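For one of our environments, we have an automated crontab tool that writes its results to a file in the format shown below.
As you can see, directories are listed with a size of "0", while files report their actual size. I need a script that reads this file and produces JSON output. However, each directory's size must be computed from all the files in its hierarchy.
File output: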
drwxrwxr-x - user1 root 1398089926561 0 /user
drwxrwxr-x - user1 root 1398089926561 0 /user/user1
drwxr-xr-x - user1 root 1398089926586 0 /user/user1/dir1
-rwxr-xr-x 1 user1 root 1398089926972 200 /user/user1/dir1/file1.csv
-rwxr-xr-x 1 user1 root 1398089927009 300 /user/user1/dir1/file2.tsv
drwxr-xr-x - user1 root 1398089929786 0 /user/user1/temp/lib
-rwxr-xr-x 1 user1 root 1398089927077 50 /user/user1/temp/lib/file5.txt
-rwxr-xr-x 1 user1 root 1398089927139 600 /user/user1/temp/lib/file.jar
drwxr-xr-x - root root 1398089829218 0 /app
drwxr-xr-x - root root 1398089829218 0 /app/panther
drwxrwxrwx - N1234 root 1398176496064 0 /app/panther/warehouse/warehouse
drwxr-xr-x - E56 root 1398176493177 0 /app/panther/warehouse/warehouse/sample_07
-rw-r--r-- 1 E56 root 1398176493340 50 /app/panther/warehouse/warehouse/sample_07/sample_07.csv
drwxr-xr-x - E56 root 1398176495945 0 /app/panther/warehouse/warehouse/sample_08
-rw-r--r-- 1 E56 root 1398176495981 250 /app/panther/warehouse/warehouse/sample_08/sample_08.csv
The output should look like this:
{
"name":"Total",
"size":1450,
"children":[
{
"name":"user",
"size":1150,
"children":[
{
"name":"user1",
"size":1150,
"children":[
{
"name":"dir1",
"size":500,
"children":[
{
"name":"file1.csv",
"size":200
},
{
"name":"file2.tsv",
"size":300
}
]
},
{
"name":"temp",
"size":650,
"children":[
{
"name":"lib",
"size":650,
"children":[
{
"name":"file5.txt",
"size":50
},
{
"name":"file.jar",
"size":600
}
]
}
]
}
]
}
]
},
{
"name":"app",
"size":300,
"children":[
{
"name":"panther",
"size":300,
"children":[
{
"name":"warehouse",
"size":300,
"children":[
{
"name":"sample_07",
"size":50,
"children":[
{
"name":"sample_07.csv",
"size":50
}
]
},
{
"name":"sample_08",
"size":250,
"children":[
{
"name":"sample_08.csv",
"size":250
}
]
}
]
}
]
}
]
}
]}
As you can see, the directories are arranged hierarchically and show their aggregated sizes.
Any help is much appreciated.
Thanks!
Answer 0 (score: 0)
You can use the tree command. Note that it is a shell command; to get JSON output you have to add the -J flag.
To do this from Python, use the subprocess module:
from subprocess import call
call(['tree', '-J'])
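Note that `call` only lets tree print to the terminal and returns the exit status. To work with the JSON inside Python, the output has to be captured. The following is a minimal sketch, assuming a tree version that supports `-J` is installed; the helper name `tree_json` is hypothetical, and `--du` is added on the assumption that accumulated directory sizes are wanted. Keep in mind that tree walks a real filesystem, so this does not read the listing file from the question.

```python
import json
import shutil
import subprocess


def tree_json(path="."):
    """Return tree's JSON report for `path` as Python objects, or None if unavailable."""
    if shutil.which("tree") is None:
        return None  # tree is not installed on this machine
    try:
        result = subprocess.run(
            ["tree", "-J", "--du", path],  # -J: JSON output, --du: accumulate directory sizes
            capture_output=True,
            text=True,
            check=True,
        )
        return json.loads(result.stdout)
    except (subprocess.CalledProcessError, ValueError):
        # tree exited with an error (e.g. a version without -J) or printed invalid JSON
        return None


if __name__ == "__main__":
    print(tree_json("."))
```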
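Answer 1 (score: 0)
With your structure it is hard to tell whether you have already seen a particular directory. If you instead make children a dictionary that maps each name to its attributes (children, size), you can build the tree with a simple loop (a pseudo-recursive construction) and then total the sizes with a recursive tree sum.
Code: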
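root = {}
# s is any iterable of listing lines, e.g. an open file object
for line in s:
    # fields: permissions, links, owner, group, timestamp, size, path
    size, p = line.split(None, 6)[5:]
    path = p.strip()[1:].split('/')
    o = root
    for i in path:
        # walk down, creating missing intermediate directories on the way
        o = o.setdefault('children', {})
        o = o.setdefault(i, {})
    o['size'] = int(size)

def sumtree(d):
    # a node without children is a file (or an empty directory)
    if 'children' not in d:
        return d['size']
    d.setdefault('size', 0)
    for c in d['children'].values():
        d['size'] += sumtree(c)
    return d['size']

print("Total: {}".format(sumtree(root)))
print(root)
Output: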
Total: 1450
{
'children': {
'app': {
'children': {
'panther': {
'children': {
'warehouse': {
'children': {
'warehouse': {
'children': {
'sample_07': {
'children': {
'sample_07.csv': {'size': 50}
},
'size': 50},
'sample_08': {
'children': {
'sample_08.csv': {'size': 250}
},
'size': 250}},
'size': 300}},
'size': 300}},
'size': 300}},
'size': 300},
'user': {
'children': {
'user1': {
'children': {
'dir1': {
'children': {
'file1.csv': {'size': 200},
'file2.tsv': {'size': 300}
},
'size': 500
},
'temp': {
'children': {
'lib': {
'children': {
'file.jar': {'size': 600},
'file5.txt': {'size': 50}
},
'size': 650}},
'size': 650}},
'size': 1150}},
'size': 1150}},
'size': 1450
}
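The dictionary above is keyed by name, whereas the question asks for a list of `{"name", "size", "children"}` objects. The following is a sketch of one way to bridge the two, not part of the original answer: the function names `build_tree`, `sum_sizes` and `to_named` are hypothetical, and it assumes every listing line has seven whitespace-separated fields with the size in the sixth and the path in the seventh.

```python
import json


def build_tree(lines):
    """Build nested dicts of the form {'children': {name: node}, 'size': n}."""
    root = {}
    for line in lines:
        fields = line.split(None, 6)
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        node = root
        for part in fields[6].strip().strip('/').split('/'):
            node = node.setdefault('children', {}).setdefault(part, {})
        node['size'] = int(fields[5])
    return root


def sum_sizes(node):
    """Add the sizes of all descendants to each directory; return the node's total."""
    if 'children' in node:
        node['size'] = node.get('size', 0) + sum(
            sum_sizes(child) for child in node['children'].values()
        )
    return node.get('size', 0)


def to_named(name, node):
    """Convert a node into the {'name', 'size', 'children': [...]} shape."""
    out = {'name': name, 'size': node.get('size', 0)}
    if 'children' in node:
        out['children'] = [to_named(k, v) for k, v in node['children'].items()]
    return out


if __name__ == "__main__":
    sample = [
        "drwxr-xr-x - user1 root 1398089926586 0 /user/user1/dir1",
        "-rwxr-xr-x 1 user1 root 1398089926972 200 /user/user1/dir1/file1.csv",
        "-rwxr-xr-x 1 user1 root 1398089927009 300 /user/user1/dir1/file2.tsv",
    ]
    tree = build_tree(sample)
    sum_sizes(tree)  # call once only: it updates the sizes in place
    print(json.dumps(to_named("Total", tree), indent=2))
```

Children appear in the order they occur in the listing, since Python dictionaries preserve insertion order. An empty directory ends up without a "children" key in this sketch. Also note that the listing contains /app/panther/warehouse/warehouse, so this approach emits two nested "warehouse" levels, as in the output above, while the expected output in the question shows only one.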