How can I split the following df? Existing DataFrame:
t t1
Test1 [October 22nd, 2019, February 8th, 2020, Augus...
Test2 [July 31st, 2020, September 21st, 2020, March ...
Desired Dataframe
t t1
Test1 October 22nd, 2019
Test1 February 8th, 2020
Test2 July 31st, 2020
Test2 September 21st, 2020
new_df.head().to_dict()
{'t': {0: 'Test1', 1: 'Test2'},
't1': {0: [Date(22,10,2019),
Date(8,2,2020),
Date(8,8,2020),
Date(8,2,2021),
Date(11,6,2021)],
1: [Date(31,7,2020), Date(21,9,2020), Date(21,3,2021), Date(11,6,2021)]}}
I tried the following code:
new_df["t1"]=new_df["t1"].float64.split(",")
print(new_df.explode("t1").reset_index(drop=True))
It fails with this error:
AttributeError: 'Series' object has no attribute 'float64'
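Answer 0 (score: 0)
I'm not sure how new_df was constructed in your case, but @Henry is on the right track, and the following worked for me.
First I construct the DataFrame: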
import pandas as pd
from datetime import date

# Each cell of t1 holds a list of dates
data = {
    't': ['Test1', 'Test2'],
    't1': [
        [date(2019,10,22), date(2020,2,8), date(2020,8,8), date(2021,2,8), date(2021,6,11)],
        [date(2020,7,31), date(2020,9,21)]
    ]}
new_df = pd.DataFrame(data)
Then use explode to get what you want:
new_df.explode("t1").reset_index(drop=True)
Out:
t t1
0 Test1 2019-10-22
1 Test1 2020-02-08
2 Test1 2020-08-08
3 Test1 2021-02-08
4 Test1 2021-06-11
5 Test2 2020-07-31
6 Test2 2020-09-21
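The output above shows ISO-style dates, while the desired DataFrame in the question uses text such as "October 22nd, 2019". If that exact format matters, one possible approach is to format each value after exploding. This is only a sketch: ordinal_date is a hypothetical helper, not part of pandas, and it assumes the default English month names from strftime (the C locale) and a shortened version of the sample data.

```python
from datetime import date

import pandas as pd


def ordinal_date(d):
    # Hypothetical helper: renders a date as e.g. "October 22nd, 2019".
    # Days 11, 12 and 13 take "th"; otherwise the suffix follows the last digit.
    if 11 <= d.day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(d.day % 10, "th")
    # %B is locale-dependent; English month names are assumed here.
    return f"{d.strftime('%B')} {d.day}{suffix}, {d.year}"


# Shortened sample data, assumed for illustration
new_df = pd.DataFrame({
    't': ['Test1', 'Test2'],
    't1': [
        [date(2019, 10, 22), date(2020, 2, 8)],
        [date(2020, 7, 31), date(2020, 9, 21)],
    ],
})

# One row per date, then convert each date to its text form
flat = new_df.explode("t1").reset_index(drop=True)
flat["t1"] = flat["t1"].map(ordinal_date)
print(flat)
```

After this step t1 holds plain strings, so any date arithmetic or sorting by date is better done before the formatting.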
As long as every row of t1 holds an array of dates, the explode call above should work.
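The AttributeError in the question comes from new_df["t1"].float64: a Series has no float64 attribute. String methods live under the .str accessor, and no split is needed at all when the cells are already lists. A quick way to confirm that condition before exploding might look like the sketch below; the sample data is shortened and the name cells_are_lists is made up for illustration.

```python
from datetime import date

import pandas as pd

# Shortened sample data, assumed for illustration
new_df = pd.DataFrame({
    't': ['Test1', 'Test2'],
    't1': [
        [date(2019, 10, 22), date(2020, 2, 8)],
        [date(2020, 7, 31), date(2020, 9, 21)],
    ],
})

# True only when every cell in t1 is a list or tuple
cells_are_lists = new_df["t1"].map(lambda v: isinstance(v, (list, tuple))).all()

if cells_are_lists:
    # Cells are already list-like, so explode works directly without any split
    flat = new_df.explode("t1").reset_index(drop=True)
    print(flat)
else:
    # Cells are something else (for example strings) and would need parsing first
    print(new_df["t1"].map(type).value_counts())
```

If the check fails, the printed type counts show what t1 actually contains, which is the first thing to look at before choosing a parsing step.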