I am receiving input from an external JSON source that contains paths like the following:
datalake-dev/facial_recognition/
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic0.jpg
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic1.jpg
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic10.png
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic11.jpg
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic12.png
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic13.jpg
datalake-dev/facial_recognition/landing/input-images/
datalake-dev/facial_recognition/landing/input-images/this_is_a_dir.png
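From this, I need to pass the data on in API / JSON / dictionary format for further processing. So far I have gone through threads one, two, three and four. None of them was enough to solve the problem.
From the paths, I need to get a dictionary / JSON in the following form: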
{
    "curation": {
        "google-search-images": [
            {
                "name": "pic0"
            },
            {
                "name": "pic1"
            }
        ]
    },
    "derived": {
        "recognition-matches": [
            {
                "name": "img2"
            }
        ],
        "errors": [
            {
                "name": "foo"
            }
        ]
    }
}
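In the dictionary / JSON above, the names curation, google-search-images and this_is_a_dir.png are all directories. I need to place their contents into the dictionary recursively, according to the depth of each path.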
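One possible way to get this kind of nesting, as a minimal sketch rather than a finished solution: walk each path component by component and create a dictionary per directory with `dict.setdefault`. The function name `build_tree`, the `skip` parameter and the `'_files'` key are hypothetical names chosen for illustration. The sketch assumes that a key ending in `/` is a directory and that every other key is a file, which is not necessarily true for a key such as `.../input-images/this_is_a_dir.png`.

```python
from pathlib import PurePosixPath


def build_tree(keys, skip=2):
    """Nest '/'-separated keys into dictionaries.

    Assumptions (not from the original post): a key ending in '/' is a
    directory, any other key is a file, and the first `skip` components
    (here 'datalake-dev/facial_recognition') are dropped.
    """
    tree = {}
    for key in keys:
        # Drop empty components (trailing '/') and the leading prefix.
        parts = [p for p in key.split('/') if p][skip:]
        if not parts:
            continue
        is_dir = key.endswith('/')
        dirs = parts if is_dir else parts[:-1]
        node = tree
        for name in dirs:
            # Descend, creating the directory level if it is missing.
            node = node.setdefault(name, {})
        if not is_dir:
            # '_files' is a hypothetical key holding the file entries.
            stem = PurePosixPath(parts[-1]).stem
            node.setdefault('_files', []).append({'name': stem})
    return tree
```

Calling `build_tree` on the listed keys would nest `pic0`, `pic1`, ... under `curation` → `google-search-images` → `this_is_a_dir.png`. Collapsing that last level into a plain list, as in the desired output, would need an extra rule that I have not worked out.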
# `result` is the listing received from the external source.
all_paths = []
api = {}  # The object 'api' holds the dictionary expected.
for contents in result['Contents']:
    # To identify whether the path points to a file or a directory
    directory_or_file_list = contents['Key'].split('/')
    path = contents['Key']
    splitted_path = path.split('/')
    # ['datalake-dev', 'facial_recognition', 'landing', 'input-images', 'this_is_a_dir.png', 'pic0.jpg']
    if splitted_path[-1] == '':
        # A trailing '/' leaves an empty last element; drop it.
        splitted_path.pop()
    all_paths.append(splitted_path)
    api[splitted_path[0]] = splitted_path[1]
    # api[splitted_path[0]] = {splitted_path[1] : {splitted_path[2] : [append_all_elements_under_this]} }
    if directory_or_file_list[-1].split('.')[-1] in ['jpg', 'jpeg', 'png', 'tiff']:
        print(path)  # looks like a file
    else:
        print(path)  # looks like a directory
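The extension check above misclassifies a directory named this_is_a_dir.png. A possible alternative, sketched under the assumption that the key listing is complete: treat a path as a directory if it ends with `/` or if it is a parent of some other key. The function name `find_directories` is hypothetical.

```python
def find_directories(keys):
    """Return the set of paths that must be directories.

    Assumption (not from the original post): a key ending in '/' is a
    directory, and every parent of any key is a directory.
    """
    dirs = set()
    for key in keys:
        parts = [p for p in key.split('/') if p]
        # With a trailing '/', the key itself is a directory;
        # otherwise only its parent components are.
        depth = len(parts) if key.endswith('/') else len(parts) - 1
        for i in range(1, depth + 1):
            dirs.add('/'.join(parts[:i]))
    return dirs
```

This recognises `.../google-search-images/this_is_a_dir.png` as a directory because other keys live beneath it. It cannot do so for `.../input-images/this_is_a_dir.png`, which has no children and no trailing slash in the listing.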
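Note: maybe there is a way to hard-code this, but I would not be posting if that were the case. Also, there is no chance of using os.walk(). Been there, done that. This is not an OS file system.
Any help, alongside my code, is welcome!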