How to change the order in which files are read/stored

Asked: 2017-12-12 15:03:06

Tags: python python-3.x

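I have a program that reads all the text files in a directory and puts their contents into an Excel file. All the text files are named GetData_# + timestamp, for example: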

GetData_0.2017-12-04_160809

GetData_1.2017-12-04_160824

GetData_2.2017-12-04_160843

My code is:

import os
import pandas as pd
import config

# Path of directory containing the text files
directory = config.Local_Recieved

# Initialize empty dataframe collector
dframe_collector = []

# For each file in the directory ...
for file_name in sorted(os.listdir(directory)):
    if file_name.startswith('GetData'):
        # Construct full path of file
        file_path = os.path.join(directory, file_name)

        # Read out file and store into a pandas dataframe
        file_dframe = pd.read_csv(file_path, sep=';', header=None)
        dframe_collector.append(file_dframe)

# Concatenate individual dataframes into one single dataframe
master_dframe = pd.concat(dframe_collector)

# With newly created excel file ...
with pd.ExcelWriter('DataLog.xlsx') as writer:
    # For each unique parameter that occurs in the first column of the dataframe ...
    for num, (name, group) in enumerate(master_dframe.groupby(0)):
        # Write corresponding data rows to individual excel sheet
        sheet_name = f"Sheet_{num}"
        group.to_excel(writer, sheet_name=sheet_name, header=None, index=None)

It works well, but I have a problem with this part:

# For each file in the directory ...
for file_name in sorted(os.listdir(directory)):
    if file_name.startswith('GetData'):
        # Construct full path of file
        file_path = os.path.join(directory, file_name)

        # Read out file and store into a pandas dataframe
        file_dframe = pd.read_csv(file_path, sep=';', header=None)
        dframe_collector.append(file_dframe)

It reads the files in the order GetData_0, 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, ... The order I want is GetData_0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ..., 501, 502.
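The ordering can be reproduced with a handful of names. This is only a sketch: the four names below are made up to follow the same pattern, and `text_order` is a name introduced here. `sorted()` compares strings character by character, so "GetData_10" comes before "GetData_2".

```python
# Hypothetical file names following the GetData_<number>.<timestamp> pattern
names = [
    "GetData_2.2017-12-04_160843",
    "GetData_10.2017-12-04_161002",
    "GetData_1.2017-12-04_160824",
    "GetData_0.2017-12-04_160809",
]

# Plain sorted() orders the names as text, not by the number inside them
text_order = sorted(names)
for name in text_order:
    print(name)
```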

What do I need to change?

1 Answer:

Answer 0: (score: 0)

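You can rename the files so that the numbers include leading zeros; the names then sort correctly as plain text. Instead of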

GetData_0.2017-12-04_160809

GetData_1.2017-12-04_160824

...

GetData_10.2017-12-04_160843

...

GetData_500.2017-12-04_160843

you should have

GetData_000.2017-12-04_160809

GetData_001.2017-12-04_160824

...

GetData_010.2017-12-04_160843

...

GetData_500.2017-12-04_160843

You can do this with:
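import os

path = '/path/to/files/'
for filename in os.listdir(path):
    # Skip anything that is not one of the GetData files
    if not filename.startswith('GetData_'):
        continue
    # 'GetData_7.2017-12-04_160809' -> 'GetData', '7', '2017-12-04_160809'
    prefix, rest = filename.split('_', 1)
    num, postfix = rest.split('.', 1)
    # Pad the number to three digits (enough for 0..502) so it sorts as text
    new_filename = prefix + '_' + num.zfill(3) + '.' + postfix
    os.rename(os.path.join(path, filename), os.path.join(path, new_filename))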
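
If renaming the files is not an option, another possibility is to leave the names alone and sort by the number as an integer. The following is a rough sketch, not tested against the real directory: `PATTERN` and `numeric_order` are names invented here, the sample names are hypothetical, and it assumes every wanted file looks like GetData_<digits> followed by a dot.

```python
import re

# Matches names such as 'GetData_12.2017-12-04_160843' and captures the 12
PATTERN = re.compile(r"GetData_(\d+)\.")


def numeric_order(file_names):
    """Return the GetData files sorted by their number, skipping other names."""
    numbered = []
    for name in file_names:
        match = PATTERN.match(name)
        if match:
            # Store the number as an int so 2 sorts before 10
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


# Hypothetical directory listing, standing in for os.listdir(directory)
listing = [
    "GetData_10.2017-12-04_161002",
    "notes.txt",
    "GetData_2.2017-12-04_160843",
    "GetData_0.2017-12-04_160809",
]
ordered = numeric_order(listing)
for name in ordered:
    print(name)
```

In the question's loop this would replace `sorted(os.listdir(directory))` with `numeric_order(os.listdir(directory))`; the `startswith('GetData')` check then becomes redundant because non-matching names are already skipped.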