Convert a pandas column from days to days, hours, minutes

Asked: 2020-05-14 19:57:22

Tags: pandas timedelta

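I am trying to convert the column df["time_ro_reply"], which contains only decimal days, to a timedelta format that shows days, hours and minutes. That would make it easier to read.

I have been reading about pd.to_timedelta, but I am struggling to apply it: pd.to_timedelta(df["time_to_reply"]) just returns 0.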

Sample input:

df["time_ro_reply"]
1.881551
0.903264
2.931560
2.931560

Expected output:

df["time_ro_reply"]
1 days 19 hours 4 minutes
0 days 23 hours 2 minutes
2 days 2 hours 23 minutes
2 days 2 hours 23 minutes

2 answers:

Answer 0 (score: 1)

I suggest using a custom function, as follows:

import numpy as np
import pandas as pd

# creating the provided dataframe
df = pd.DataFrame([1.881551, 0.903264, 2.931560, 2.931560],
                   columns = ["time_ro_reply"])

# this function converts a time as a decimal of days into the desired format
def convert_time(time):

    # calculate the days and remaining time
    days, remaining = divmod(time, 1)

    # calculate the hours and remaining time
    hours, remaining = divmod(remaining * 24, 1)

    # calculate the minutes
    minutes = divmod(remaining * 60, 1)[0]

    # a list of the strings; int() turns the whole-number floats into integers
    strings = [str(int(days)), 'days',
               str(int(hours)), 'hours',
               str(int(minutes)), 'minutes']

    # return the strings concatenated to a single string
    return ' '.join(strings)

# add a new column to the dataframe by applying the function
# to all values of the column 'time_ro_reply' using .apply()
df["desired_output"] = df["time_ro_reply"].apply(convert_time)

This produces the following DataFrame:

    time_ro_reply   desired_output
0   1.881551        1 days 21 hours 9 minutes
1   0.903264        0 days 21 hours 40 minutes
2   2.931560        2 days 22 hours 21 minutes
3   2.931560        2 days 22 hours 21 minutes

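However, this differs from the output you describe. If the "time_ro_reply" values really are plain decimal days, I do not see how you arrive at your expected result. Would you mind sharing how you got it?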
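As a quick independent check of the arithmetic, the sketch below feeds the same sample values to the standard library's datetime.timedelta, which accepts fractional days. The helper name split_days is made up for this illustration and is not part of the question or of pandas.

```python
from datetime import timedelta

def split_days(value):
    """Split a decimal number of days into whole days, hours and minutes.

    Hypothetical helper, for illustration only.
    """
    td = timedelta(days=value)
    # timedelta keeps whole days separately from the leftover seconds
    hours, leftover = divmod(td.seconds, 3600)
    minutes = leftover // 60
    return td.days, hours, minutes

for value in [1.881551, 0.903264, 2.931560]:
    print(value, split_days(value))
# 1.881551 (1, 21, 9)
# 0.903264 (0, 21, 40)
# 2.93156 (2, 22, 21)
```

These values agree with the table above, not with the expected output given in the question.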

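I hope the comments explain the code well enough. If not, and you are unfamiliar with constructs such as divmod() or apply(), I suggest looking them up in the Python / pandas documentation.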
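For readers new to divmod(), a minimal illustration: it returns the quotient and the remainder in one call, which is what lets a function peel off days first, then hours, then minutes. The numbers below were picked only because they are exactly representable as floats.

```python
# divmod(a, b) returns the pair (a // b, a % b)
whole, fraction = divmod(2.75, 1)
print(whole, fraction)   # 2.0 0.75

# the fractional part of a day, scaled to hours, can be split the same way
hours, rest = divmod(fraction * 24, 1)
print(hours, rest)       # 18.0 0.0
```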

Let me know if this helps.

Answer 1 (score: 0)

Using a modified version of the nice function posted by MrB:

def display_time(seconds, granularity=2):
    intervals = (('days', 86400),
                 ('hours', 3600),
                 ('minutes', 60),
                 ('seconds', 1),
                 ('microseconds', 1e-6))
    result = []
    for name, count in intervals:
        value = seconds // count
        if value:
            seconds -= value * count
            name = name.rstrip('s') if value == 1 else name
            result.append(f"{int(value)} {name}")
        else:
            result.append(f"{0} {name}")
    return ', '.join(result[:granularity])

If you convert the "time_to_reply" column to seconds and apply the function, you also get the desired output:

import pandas as pd

df = pd.DataFrame({"time_to_reply": [1.881551, 0.903264, 2.931560, 2.931560]})
df['td_str'] = df['time_to_reply'].apply(lambda t: display_time(t*24*60*60, 3))
# df['td_str']
# 0      1 day, 21 hours, 9 minutes
# 1    0 days, 21 hours, 40 minutes
# 2    2 days, 22 hours, 21 minutes
# 3    2 days, 22 hours, 21 minutes
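
The pd.to_timedelta route mentioned in the question can probably be made to work as well. A sketch, assuming the column really holds decimal days: numeric input is interpreted as nanoseconds unless a unit is passed, which would explain results that look like zero. The column name readable is an illustrative choice, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({"time_to_reply": [1.881551, 0.903264, 2.931560, 2.931560]})

# tell pandas the numbers are days; the default unit for numeric input is nanoseconds
td = pd.to_timedelta(df["time_to_reply"], unit="D")

# .dt.components splits each timedelta into integer days, hours, minutes, ...
parts = td.dt.components
df["readable"] = (parts["days"].astype(str) + " days "
                  + parts["hours"].astype(str) + " hours "
                  + parts["minutes"].astype(str) + " minutes")
print(df)
```

This keeps a real timedelta column (td) available for further arithmetic, while the string column is only for display.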