Dataset
import pandas as pd

sample = {'operator': ['op_a',
'op_a',
'op_a',
'op_a',
'op_b',
'op_b',
'op_b',
'op_b',
'op_c',
'op_c',
'op_c',
'op_c'],
'from': ['a', 'a', 'a', 'a', 'c', 'c', 'c', 'c', 'a', 'a', 'a', 'a'],
'to': ['b', 'b', 'b', 'b', 'd', 'd', 'd', 'd', 'b', 'b', 'b', 'b'],
'valid_from': ['13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'15/02/2019',
'15/02/2019',
'15/02/2019',
'15/02/2019'],
'valid_to': ['19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'21/02/2019',
'21/02/2019',
'21/02/2019',
'21/02/2019']}
df_test = pd.DataFrame(sample)
df_test
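I would like to expand the range between valid_from and valid_to into its individual dates and add each date to the DataFrame as its own column.
Desired output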
df3 = pd.DataFrame({'operator': ['op_a',
'op_a',
'op_a',
'op_a',
'op_b',
'op_b',
'op_b',
'op_b',
'op_c',
'op_c',
'op_c',
'op_c'],
'from': ['a', 'a', 'a', 'a', 'c', 'c', 'c', 'c', 'a', 'a', 'a', 'a'],
'to': ['b', 'b', 'b', 'b', 'd', 'd', 'd', 'd', 'b', 'b', 'b', 'b'],
'valid_from': ['13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'13/11/2018',
'15/02/2019',
'15/02/2019',
'15/02/2019',
'15/02/2019'],
'valid_1': ['14/11/2018',
'14/11/2018',
'14/11/2018',
'14/11/2018',
'14/11/2018',
'14/11/2018',
'14/11/2018',
'14/11/2018',
'16/02/2019',
'16/02/2019',
'16/02/2019',
'16/02/2019'],
'valid_2': ['15/11/2018',
'15/11/2018',
'15/11/2018',
'15/11/2018',
'15/11/2018',
'15/11/2018',
'15/11/2018',
'15/11/2018',
'17/02/2019',
'17/02/2019',
'17/02/2019',
'17/02/2019'],
'valid_3': ['16/11/2018',
'16/11/2018',
'16/11/2018',
'16/11/2018',
'16/11/2018',
'16/11/2018',
'16/11/2018',
'16/11/2018',
'18/02/2019',
'18/02/2019',
'18/02/2019',
'18/02/2019'],
'valid_4': ['17/11/2018',
'17/11/2018',
'17/11/2018',
'17/11/2018',
'17/11/2018',
'17/11/2018',
'17/11/2018',
'17/11/2018',
'19/02/2019',
'19/02/2019',
'19/02/2019',
'19/02/2019'],
'valid_5': ['18/11/2018',
'18/11/2018',
'18/11/2018',
'18/11/2018',
'18/11/2018',
'18/11/2018',
'18/11/2018',
'18/11/2018',
'20/02/2019',
'20/02/2019',
'20/02/2019',
'20/02/2019'],
'valid_to': ['19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'19/11/2018',
'21/02/2019',
'21/02/2019',
'21/02/2019',
'21/02/2019']})
df3
Answer 0 (score: 1)
You can try:
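# Parse the day-first date strings explicitly (dd/mm/yyyy)
df_test['valid_from'] = pd.to_datetime(df_test['valid_from'], format='%d/%m/%Y')
df_test['valid_to'] = pd.to_datetime(df_test['valid_to'], format='%d/%m/%Y')
# Number of days between start and end, taken from the first row only
diff_days = int((df_test.loc[0, 'valid_to'] - df_test.loc[0, 'valid_from']).days)
# Add one column per intermediate day: valid_1 ... valid_{diff_days - 1}
for i in range(diff_days - 1):
    df_test['valid_{}'.format(i + 1)] = df_test['valid_from'] + pd.Timedelta(days=i + 1)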
This solution assumes that every row spans the same number of days, since the question does not specify otherwise.
Output:
    from  operator  to  valid_from    valid_to     valid_1     valid_2     valid_3  \
0      a      op_a   b  2018-11-13  2018-11-19  2018-11-14  2018-11-15  2018-11-16
1      a      op_a   b  2018-11-13  2018-11-19  2018-11-14  2018-11-15  2018-11-16
2      a      op_a   b  2018-11-13  2018-11-19  2018-11-14  2018-11-15  2018-11-16
3      a      op_a   b  2018-11-13  2018-11-19  2018-11-14  2018-11-15  2018-11-16
4      c      op_b   d  2018-11-13  2018-11-19  2018-11-14  2018-11-15  2018-11-16
5      c      op_b   d  2018-11-13  2018-11-19  2018-11-14  2018-11-15  2018-11-16
6      c      op_b   d  2018-11-13  2018-11-19  2018-11-14  2018-11-15  2018-11-16
7      c      op_b   d  2018-11-13  2018-11-19  2018-11-14  2018-11-15  2018-11-16
8      a      op_c   b  2019-02-15  2019-02-21  2019-02-16  2019-02-17  2019-02-18
9      a      op_c   b  2019-02-15  2019-02-21  2019-02-16  2019-02-17  2019-02-18
10     a      op_c   b  2019-02-15  2019-02-21  2019-02-16  2019-02-17  2019-02-18
11     a      op_c   b  2019-02-15  2019-02-21  2019-02-16  2019-02-17  2019-02-18

       valid_4     valid_5
0   2018-11-17  2018-11-18
1   2018-11-17  2018-11-18
2   2018-11-17  2018-11-18
3   2018-11-17  2018-11-18
4   2018-11-17  2018-11-18
5   2018-11-17  2018-11-18
6   2018-11-17  2018-11-18
7   2018-11-17  2018-11-18
8   2019-02-19  2019-02-20
9   2019-02-19  2019-02-20
10  2019-02-19  2019-02-20
11  2019-02-19  2019-02-20
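
The answer above reads the number of days from the first row only. If the rows could span different numbers of days, one possible variation is sketched below. It is not part of the original answer: the helper name expand_ranges and the two-row sample frame are hypothetical, made up for illustration. The idea is to size the columns by the longest range and leave NaT wherever a shorter range has already reached its valid_to.

```python
import pandas as pd

# Hypothetical sample: two rows whose ranges have different lengths
df = pd.DataFrame({
    "operator": ["op_a", "op_b"],
    "valid_from": ["13/11/2018", "15/02/2019"],
    "valid_to": ["16/11/2018", "21/02/2019"],
})
df["valid_from"] = pd.to_datetime(df["valid_from"], format="%d/%m/%Y")
df["valid_to"] = pd.to_datetime(df["valid_to"], format="%d/%m/%Y")


def expand_ranges(frame):
    """Add valid_1, valid_2, ... columns sized by the longest range.

    Assumes valid_from and valid_to are already datetime columns
    with no missing values.
    """
    # Longest span in days across all rows
    max_gap = int((frame["valid_to"] - frame["valid_from"]).dt.days.max())
    out = frame.copy()
    for i in range(1, max_gap):
        candidate = frame["valid_from"] + pd.Timedelta(days=i)
        # Keep only dates strictly before valid_to; the rest become NaT
        out["valid_{}".format(i)] = candidate.where(candidate < frame["valid_to"])
    return out


wide = expand_ranges(df)
print(wide)
```

Rows with shorter ranges end up with NaT in the trailing columns, so the frame stays rectangular. If a long layout (one row per date) suits the downstream work better, that would be a different design and is not covered here.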