I have the following dataset:
Duration1 Duration2
05:13:45 01:09:58
18:53:38 01:53:18
NaT 01:03:38
07:19:38 01:23:26
I would like to plot Duration1 against Duration2.
df['duration1'] = ["05:13:45", "18:53:38", "NaT", "07:19:38"]
df['duration2'] = ["01:09:58", "01:53:18", "01:03:38", "01:23:26"]
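The dtype of Duration1 and Duration2 is timedelta64[ns].
Bonus: is it possible to obtain a function from the trend of the plotted graph?
Answer 0 (score: 2)
Use dt.total_seconds to convert the timedeltas to plain numbers of seconds, which can then be plotted: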
df.stack().dt.total_seconds().unstack().plot.scatter(
'Duration1', 'Duration2')
The easiest way to get a trend line is to use seaborn.regplot
import seaborn as sns
d = df.stack().dt.total_seconds().unstack()
# Recent seaborn releases require x and y to be passed as keyword arguments
sns.regplot(x='Duration1', y='Duration2', data=d, ci=None)
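regplot draws the fitted line but does not hand back its coefficients. For the bonus question, one possible approach is to fit the line separately. The following is a minimal sketch, assuming a straight-line (degree 1) trend is what is wanted and using numpy.polyfit, which is not part of the original answer. The sample values are copied from the question, and the names slope, intercept, and trend are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question; pd.NaT marks the missing entry.
df = pd.DataFrame({
    'Duration1': pd.to_timedelta(['05:13:45', '18:53:38', pd.NaT, '07:19:38']),
    'Duration2': pd.to_timedelta(['01:09:58', '01:53:18', '01:03:38', '01:23:26']),
})

# Convert each timedelta column to float seconds and drop rows with a missing value,
# because a least-squares fit cannot handle NaN.
d = df.apply(lambda s: s.dt.total_seconds()).dropna()

# Degree-1 least-squares fit: Duration2 ~ slope * Duration1 + intercept (both in seconds).
slope, intercept = np.polyfit(d['Duration1'].to_numpy(), d['Duration2'].to_numpy(), deg=1)

# np.poly1d wraps the coefficients in a callable, i.e. the trend function itself.
trend = np.poly1d([slope, intercept])

# Example: predicted Duration2 (as a timedelta) for a Duration1 of 10 hours.
predicted = pd.to_timedelta(trend(10 * 3600), unit='s')
print(slope, intercept, predicted)
```

With only three complete rows the fit is rough, so the resulting function should be treated as an illustration of the approach rather than a reliable model.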
Full code from start to finish
You should be able to copy/paste this
from io import StringIO
import pandas as pd
import seaborn as sns
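txt = """Duration1  Duration2
-1 days +05:13:45  0 days 01:09:58
-6 days +18:53:38  0 days 01:53:18
NaT  0 days 01:03:38
10 days +07:19:38  0 days 01:23:26
"""
# Columns are separated by two or more spaces, so values such as "0 days 01:09:58" stay in one field
df = pd.read_csv(StringIO(txt), sep=r'\s{2,}', engine='python').apply(pd.to_timedelta)
# Convert timedeltas to seconds; stack/unstack keeps the NaT row as NaN
d = df.stack().dt.total_seconds().unstack()
# Recent seaborn releases require x and y to be passed as keyword arguments
sns.regplot(x='Duration1', y='Duration2', data=d, ci=None)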