Question

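The code below scans a folder and lists all files older than a threshold date. For example, if the threshold is currently 12 hours, it lists every file older than that and saves the results in a variable.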

string_output = ""
my_paths = []
for p in ['\\\location\\folderpath\\subfolder\\']:
    for dirpath, dirnames, filenames in os.walk(p):
        my_paths.append(filenames)
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            if last_modified(full_path) < threshold:
                string_output+= '\n{}\n{}\n{}\nFile Name: {}\nFile Size: {}\nFile Created: {}\nFile Last Modified: {}\n{}'.format(
                separator, timestamp, app_name, full_path,
                file_size(full_path),
                created(full_path),
                last_modified(full_path),separator)
print string_output;

The string-formatted output looks fine, with no extra backslashes:

----------------------------------------
09/06/2017 13:10:07
File Name: \\location\folderpath\subfolder\1.txt
File Size: 153.0 bytes
File Created: 2017-01-26 14:29:59
File Last Modified: 2017-01-26 14:28:39
----------------------------------------
09/06/2017 13:10:07
File Name: \\location\folderpath\subfolder\2.txt
File Size: 153.0 bytes
File Created: 2017-01-26 14:29:59
File Last Modified: 2017-01-26 14:28:39

But when I replace the string.format call with json.dumps, the loop no longer works:

string_output = ""
my_paths = []
for p in ['\\\location\\folderpath\\subfolder\\']:
    for dirpath, dirnames, filenames in os.walk(p):
        my_paths.append(filenames)
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            if last_modified(full_path) < threshold:
                string_output=json.dumps({'Timestamp: ':timestamp, 'FileName : ':full_path, 'File Size : ':file_size(full_path), 'File Created : ':created(full_path),'File Last Modified : ':last_modified(full_path)  }, sort_keys=False, indent=8)    
print string_output;

The JSON output has extra backslashes (the same as in the input) and contains only one record:

{
        "FileName : ": "\\\location\\folderpath\\subfolder\2.txt", 
        "Timestamp: ": "Timestamp: 09/06/2017 13:10:07", 
        "File Created : ": "2017-01-26 14:29:59",  
        "File Last Modified : ": "2017-01-26 13:14:11", 
        "File Size : ": "48.0 bytes"
}

How can I get json.dumps to list all the files instead of a single result, and produce output without the extra backslashes?

Answer 1

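In the json version of your code you keep reassigning string_output, whereas the original code concatenated onto it. This means the last file that passes the condition last_modified(full_path) < threshold ends up being the only output.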
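The difference between the two operators can be seen in a minimal sketch. It is written for Python 3 (so print is a function, unlike in the question's code), and the two records are made-up sample data rather than anything from the asker's folder:

```python
import json

# Hypothetical sample data standing in for the per-file dictionaries.
records = [{"FileName": "1.txt"}, {"FileName": "2.txt"}]

reassigned = ""
accumulated = ""
for record in records:
    reassigned = json.dumps(record)     # replaces the previous value on every pass
    accumulated += json.dumps(record)   # appends to what was already collected

print(reassigned)    # only the last record survives
print(accumulated)   # both records, one after the other
```
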

A modified version of the snippet that uses string_output += should give you what you want.

string_output = ""
my_paths = []
for p in ['\\\location\\folderpath\\subfolder\\']:
    for dirpath, dirnames, filenames in os.walk(p):
        my_paths.append(filenames)
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            if last_modified(full_path) < threshold:
                string_output += json.dumps({'Timestamp: ':timestamp, 'FileName : ':full_path, 'File Size : ':file_size(full_path), 'File Created : ':created(full_path),'File Last Modified : ':last_modified(full_path)  }, sort_keys=False, indent=8)    

print string_output
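One caveat, offered as a side note rather than as part of the fix above: several JSON objects concatenated back to back are fine for printing, but they do not form a single JSON document that a parser can load. If the output is meant to be parsed later, one possible alternative is to collect the dictionaries in a list and call json.dumps once. The sketch below assumes Python 3 and uses made-up records:

```python
import json

# Hypothetical records standing in for the dictionaries built inside the loop.
records = [
    {"FileName": "1.txt", "File Size": "153.0 bytes"},
    {"FileName": "2.txt", "File Size": "48.0 bytes"},
]

# Concatenating individual dumps gives text that json.loads rejects.
concatenated = "".join(json.dumps(r) for r in records)
try:
    json.loads(concatenated)
    concatenated_is_valid = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    concatenated_is_valid = False

# Dumping the whole list once produces a JSON array that loads back cleanly.
as_array = json.dumps(records, indent=8)
round_tripped = json.loads(as_array)

print(concatenated_is_valid)
print(round_tripped == records)
```
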

If you want to normalize the paths so that they do not have the extra \, you can do a replace on full_path, as shown below.

if last_modified(full_path) < threshold:
    string_output += json.dumps(
        {'Timestamp': timestamp, 
         'FileName': full_path.replace('\\\\', '\\'),
         'File Size': file_size(full_path),
         'File Created': created(full_path),
         'File Last Modified': last_modified(full_path)
        }, sort_keys=False, indent=8)
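It may help to check where the doubled backslashes come from before replacing anything. JSON requires a backslash inside a string to be written as \\, so json.dumps doubles each one in its output and json.loads turns them back into single backslashes. The sketch below (Python 3, using the sample path from the question as a raw string) shows that round trip, and also what the replace call does to the path text itself:

```python
import json

# Raw string, so this is the path text exactly as it appears on disk.
full_path = r"\\location\folderpath\subfolder\2.txt"

encoded = json.dumps({"FileName": full_path})
print(encoded)               # every backslash appears doubled in the JSON text

decoded = json.loads(encoded)
print(decoded["FileName"])   # parsing restores the original path

# replace() works on the real path text, not on the JSON escaping.
replaced = full_path.replace('\\\\', '\\')
print(replaced)
```

On a UNC path like this one, the replace shortens the leading \\ to a single \, so whether it is appropriate depends on how the output will be consumed.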

Answer 2

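The obvious reason that only one file is listed instead of all of them is that you are reassigning with = rather than appending with +=.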

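However, += may not be the best approach either. Strings are immutable, so each += on a string produces a new string: a copy of the original plus the added piece. For a large string built up from many additions, this can hurt performance.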

I would suggest creating a list to hold all the string pieces, then joining the results once you are done.

line_separator = '----------------------------------------\n'
results = []
for p in [...]:
    ...
    if last_modified(full_path) < threshold:
        results.append(json.dumps({'Timestamp: ':timestamp, 'FileName : ':full_path, 'File Size : ':file_size(full_path), 'File Created : ':created(full_path),'File Last Modified : ':last_modified(full_path)  }, sort_keys=False, indent=8) )

string_output = line_separator.join(results)
print string_output
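For completeness, here is a self-contained sketch of the list-and-join approach that can be run without the asker's network share. It assumes Python 3; last_modified and collect_old_files are hypothetical stand-ins (the question does not show how its helpers are defined), the files are created in a temporary directory, and the separator is given a leading newline so that it lands on its own line between records:

```python
import json
import os
import tempfile
import time


def last_modified(path):
    """Hypothetical stand-in for the asker's helper: mtime in epoch seconds."""
    return os.path.getmtime(path)


def collect_old_files(root, threshold):
    """Return one JSON string per file under root modified before threshold."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            if last_modified(full_path) < threshold:
                results.append(json.dumps(
                    {"FileName": full_path,
                     "File Size": os.path.getsize(full_path)},
                    indent=8))
    return results


line_separator = "\n" + "-" * 40 + "\n"

with tempfile.TemporaryDirectory() as root:
    now = time.time()
    # Create one file older than the threshold and one newer than it.
    for name, age_hours in [("old.txt", 24), ("new.txt", 1)]:
        path = os.path.join(root, name)
        with open(path, "w") as handle:
            handle.write("example")
        stamp = now - age_hours * 3600
        os.utime(path, (stamp, stamp))  # set access and modification times

    threshold = now - 12 * 3600  # files last modified over 12 hours ago qualify
    results = collect_old_files(root, threshold)
    string_output = line_separator.join(results)
    print(string_output)
```

The list is joined exactly once, after the walk has finished, so no intermediate strings are built while the loop runs.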

Python json.dumps does not work in a for loop
