pandas read_csv(): parsing multiple datetime formats

Date: 2020-02-26 15:28:12

Tags: python pandas datetime parsing

I want to read multiple data files whose DateTime columns use different formats. How can I parse all of these formats in a single pass?

from datetime import datetime
import pandas as pd

dateparse_1 = lambda x: datetime.strptime(x, "%d/%m/%Y %H:%M:%S.%f")
dateparse_2 = lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S.%f")
dateparse_3 = lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S.")

for f in all_filenames:
    df = pd.read_csv(f,encoding='latin-1',low_memory=False, index_col='TimeStamp', parse_dates=True, date_parser = dateparse_1 or dateparse_2 or dateparse_3)

1 answer:

Answer 0 (score: 1)

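In the question's call, dateparse_1 or dateparse_2 or dateparse_3 evaluates to just dateparse_1, because a function object is always truthy, so the other two parsers are never tried. Instead, you can combine dateparse_{1,2,3} into a single parser that tries each one in turn until one succeeds. For example: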

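from datetime import datetime

import pandas as pd


def combine_date_parsers(date_parsers):
    """Return a parser that tries each of date_parsers in order."""
    def combined_date_parser(value):
        for date_parser in date_parsers:
            try:
                return date_parser(value)
            except ValueError:
                # This format did not match; fall through to the next parser.
                pass
        # No parser accepted the value.
        raise ValueError(value)
    return combined_date_parser


# pd.datetime was removed in pandas 2.0, so use the standard-library datetime.
dateparse_1 = lambda x: datetime.strptime(x, "%d/%m/%Y %H:%M:%S.%f")
dateparse_2 = lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S.%f")
dateparse_3 = lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S.")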

date_parser = combine_date_parsers([
    dateparse_1,
    dateparse_2,
    dateparse_3,
])

for f in all_filenames:  # all_filenames: the list of CSV paths from the question
    df = pd.read_csv(
        f,
        encoding='latin-1',
        low_memory=False,
        index_col='TimeStamp',
        parse_dates=True,
        date_parser=date_parser,
    )
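
Note that date_parser has been deprecated since pandas 2.0. As an alternative sketch (not part of the original answer), one could read the column as plain strings and then apply pd.to_datetime once per format, letting each later format fill in only the rows the earlier ones left unparsed. The names FORMATS and parse_mixed_datetimes, and the in-memory CSV, are hypothetical and used only for illustration; the format strings are the ones from the question.

```python
import io

import pandas as pd

# Formats from the question, tried in this order.
FORMATS = [
    "%d/%m/%Y %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S.",
]


def parse_mixed_datetimes(values, formats=FORMATS):
    """Parse strings column-wise; rows a format cannot parse are left to the next format."""
    values = pd.Series(values, dtype="object")
    parsed = None
    for fmt in formats:
        # errors="coerce" turns non-matching rows into NaT instead of raising.
        attempt = pd.to_datetime(values, format=fmt, errors="coerce")
        parsed = attempt if parsed is None else parsed.fillna(attempt)
    return parsed


# Hypothetical in-memory file standing in for one of the CSVs in all_filenames.
csv_text = (
    "TimeStamp,value\n"
    "26/02/2020 15:28:12.500000,1\n"
    "2020-02-26 15:28:13.250000,2\n"
    "2020-02-26 15:28:14.,3\n"
)

# Read the timestamp column as strings, parse it, then make it the index.
df = pd.read_csv(io.StringIO(csv_text), dtype={"TimeStamp": str})
df["TimeStamp"] = parse_mixed_datetimes(df["TimeStamp"])
df = df.set_index("TimeStamp")
```

Unlike the combined parser, this version does not raise on a value that matches none of the formats; such rows end up as NaT, so it may be worth checking df.index.isna() afterwards. It also parses whole columns at once rather than calling a Python function per value, which tends to be faster on large files.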