Parsing a Python list of dates into a pandas DataFrame

Time: 2018-07-16 06:58:59

Tags: pandas datetime python-3.6

I need some help/advice on getting dates into a pandas DataFrame. My Python list looks like the sample list `L` shown in the answer below.

Is there a simple way to convert it into a pandas DataFrame with two columns (start time and end time)?

1 answer:

Answer 0: (score: 3)

Sample data:

L = ['',
 '20180715:1700-20180716:1600',
 '20180716:1700-20180717:1600',
 '20180717:1700-20180718:1600',
 '20180718:1700-20180719:1600',
 '20180719:1700-20180720:1600',
 '20180721:CLOSED',
 '20180722:1700-20180723:1600',
 '20180723:1700-20180724:1600',
 '20180724:1700-20180725:1600',
 '20180725:1700-20180726:1600',
 '20180726:1700-20180727:1600',
 '20180728:CLOSED']

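I think the best approach is a list comprehension that splits each value on the `-` separator and filters out the values that do not contain it: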

import pandas as pd

# Keep only entries that contain '-', then split each into [start, end]
df = pd.DataFrame([x.split('-') for x in L if '-' in x], columns=['start','end'])
print(df)
           start            end
0  20180715:1700  20180716:1600
1  20180716:1700  20180717:1600
2  20180717:1700  20180718:1600
3  20180718:1700  20180719:1600
4  20180719:1700  20180720:1600
5  20180722:1700  20180723:1600
6  20180723:1700  20180724:1600
7  20180724:1700  20180725:1600
8  20180725:1700  20180726:1600
9  20180726:1700  20180727:1600
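
The columns above still hold strings. If real timestamps are needed, they can be converted with `pd.to_datetime`. This is a minimal sketch, not part of the original answer: it assumes every value follows the `YYYYMMDD:HHMM` layout seen in the sample, and the format string `'%Y%m%d:%H%M'` and the shortened list are assumptions made for illustration.

```python
import pandas as pd

# Shortened copy of the sample list (assumption: same layout as the full list)
L = ['',
     '20180715:1700-20180716:1600',
     '20180721:CLOSED',
     '20180722:1700-20180723:1600']

# Same list comprehension as above: keep only 'start-end' entries
df = pd.DataFrame([x.split('-') for x in L if '-' in x],
                  columns=['start', 'end'])

# Assumed format: date as YYYYMMDD, a colon, then time as HHMM
fmt = '%Y%m%d:%H%M'
df['start'] = pd.to_datetime(df['start'], format=fmt)
df['end'] = pd.to_datetime(df['end'], format=fmt)

print(df.dtypes)
print(df)
```

Passing an explicit `format` avoids guessing by the parser; a value that does not match the assumed layout would raise an error instead of being parsed silently.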

A pure pandas solution is also possible, especially if you need to process a Series. Here `Series.str.split` is combined with `dropna`:

s = pd.Series(L)

df = s.str.split('-', expand=True).dropna(subset=[1])
df.columns = ['start','end']
print (df)
            start            end
1   20180715:1700  20180716:1600
2   20180716:1700  20180717:1600
3   20180717:1700  20180718:1600
4   20180718:1700  20180719:1600
5   20180719:1700  20180720:1600
7   20180722:1700  20180723:1600
8   20180723:1700  20180724:1600
9   20180724:1700  20180725:1600
10  20180725:1700  20180726:1600
11  20180726:1700  20180727:1600
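
Note that this result keeps the index labels of the original Series, so there are gaps where the empty string and the `CLOSED` entries were dropped. If a contiguous index like the one from the list comprehension is preferred, `reset_index(drop=True)` can be chained on. A small sketch on a shortened list (the shortened data is an assumption for illustration only):

```python
import pandas as pd

# Shortened copy of the sample list (assumption: same layout as the full list)
L = ['',
     '20180715:1700-20180716:1600',
     '20180721:CLOSED',
     '20180722:1700-20180723:1600']

s = pd.Series(L)

# Entries without '-' have no second piece, so column 1 is missing and the row is dropped
df = (s.str.split('-', expand=True)
        .dropna(subset=[1])
        .reset_index(drop=True))  # renumber rows from 0
df.columns = ['start', 'end']

print(df)
```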