I create my lists of dates like this:
import datetime as dt
import pandas as pd
begin = dt.date(2010, 1, 1)
back = pd.date_range(end=begin, periods=11).to_pydatetime().tolist()
forward = pd.date_range(start=begin, periods=11).to_pydatetime().tolist()
Then I append these lists of dates to a new variable:
date_centered=[]
date_centered.append(back)
date_centered.append(forward)
date_centered
Output:
[[datetime.datetime(2009, 12, 22, 0, 0),
datetime.datetime(2009, 12, 23, 0, 0),
datetime.datetime(2009, 12, 24, 0, 0),
datetime.datetime(2009, 12, 25, 0, 0),
datetime.datetime(2009, 12, 26, 0, 0),
datetime.datetime(2009, 12, 27, 0, 0),
datetime.datetime(2009, 12, 28, 0, 0),
datetime.datetime(2009, 12, 29, 0, 0),
datetime.datetime(2009, 12, 30, 0, 0),
datetime.datetime(2009, 12, 31, 0, 0),
datetime.datetime(2010, 1, 1, 0, 0)],
[datetime.datetime(2010, 1, 1, 0, 0),
datetime.datetime(2010, 1, 2, 0, 0),
datetime.datetime(2010, 1, 3, 0, 0),
datetime.datetime(2010, 1, 4, 0, 0),
datetime.datetime(2010, 1, 5, 0, 0),
datetime.datetime(2010, 1, 6, 0, 0),
datetime.datetime(2010, 1, 7, 0, 0),
datetime.datetime(2010, 1, 8, 0, 0),
datetime.datetime(2010, 1, 9, 0, 0),
datetime.datetime(2010, 1, 10, 0, 0),
datetime.datetime(2010, 1, 11, 0, 0)]]
I want to remove the duplicate date, in this case 2010-01-01. How can I do that?
Answer 0 (score: 2)
You can use a Python set to get rid of the duplicate items:
set_back = set(back)
set_forward = set(forward)
# updates `set_back` in place by removing every element that is also in `set_forward`
set_back.difference_update(set_forward)
# convert `set_back` and `set_forward` back to lists
# and add them to your `date_centered` result
date_centered = [list(set_back), list(set_forward)]
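One caveat with the set approach: sets do not preserve order, so `list(set_back)` may come back shuffled. A minimal sketch of an order-preserving variant follows. It assumes plain `datetime` lists built with `timedelta` as stand-ins for the `pd.date_range(...)` lists in the question, and sorts after removing the overlap.

```python
import datetime as dt

begin = dt.datetime(2010, 1, 1)
# Stand-ins for the question's lists (assumption: 11 consecutive days each,
# all at midnight, with `begin` as the shared date).
back = [begin - dt.timedelta(days=n) for n in range(10, -1, -1)]
forward = [begin + dt.timedelta(days=n) for n in range(11)]

# Remove from `back` whatever also appears in `forward`, then restore order.
back_only = sorted(set(back) - set(forward))
date_centered = [back_only, sorted(set(forward))]

print(len(date_centered[0]), len(date_centered[1]))
```

Sorting works here because the dates have a natural chronological order; for data without one, a list comprehension that filters `back` against `set(forward)` would keep the original order instead.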
However, given your first lines of code, perhaps you just want to get rid of the extra date by removing it from the back list:
begin = dt.date(2010, 1, 1)
back = pd.date_range(end=begin, periods=11).to_pydatetime().tolist()
forward = pd.date_range(start=begin, periods=11).to_pydatetime().tolist()
back.pop()  # removes the last item
date_centered = [back, forward]
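Because `back.pop()` mutates the list, running that line twice removes a second date. A small sketch of a non-mutating alternative uses a slice instead. As before, the lists are built with plain `datetime` and `timedelta` as an assumed stand-in for the `pd.date_range(...)` calls, and the name `flat` is only illustrative.

```python
import datetime as dt

begin = dt.datetime(2010, 1, 1)
# Stand-ins for the question's lists (assumption: 11 consecutive days each).
back = [begin - dt.timedelta(days=n) for n in range(10, -1, -1)]
forward = [begin + dt.timedelta(days=n) for n in range(11)]

# back[:-1] copies everything except the last element; `back` is unchanged.
date_centered = [back[:-1], forward]

# Optional: one flat, ordered list of dates centered on `begin`.
flat = back[:-1] + forward
print(len(flat))
```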
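Answer 1 (score: 1)
You can convert both lists to sets (which removes any duplicates within each list) and subtract the elements common to the two sets (A and B) from their union (A or B). This returns only the elements that appear in exactly one of the lists, so the shared date 2010-01-01 is dropped entirely rather than kept once, and the result is unordered.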
list((set(back) | set(forward)) - (set(back) & set(forward)))
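The difference between this expression and a plain union can be checked directly. The sketch below, again using assumed `datetime` stand-ins for the question's lists, compares the two: the expression above is the symmetric difference (`set(back) ^ set(forward)`), while the union keeps one copy of the shared date.

```python
import datetime as dt

begin = dt.datetime(2010, 1, 1)
# Stand-ins for the question's lists (assumption: 11 consecutive days each).
back = [begin - dt.timedelta(days=n) for n in range(10, -1, -1)]
forward = [begin + dt.timedelta(days=n) for n in range(11)]

# Same expression as in the answer; equivalent to set(back) ^ set(forward).
sym_diff = (set(back) | set(forward)) - (set(back) & set(forward))

# The union keeps every date once, including the shared one; sort to
# restore chronological order.
union = sorted(set(back) | set(forward))

print(begin in sym_diff, len(sym_diff))  # False 20
print(begin in union, len(union))        # True 21
```

If the goal is a single list with 2010-01-01 appearing once, the sorted union is probably the closer fit.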
Answer 2 (score: 0)
Since you are already using pandas, you can also use drop_duplicates():
import pandas as pd
back = pd.date_range(end='2010-01-01', periods = 11)
forward = pd.date_range(start='2010-01-01', periods = 11)
merged = pd.concat([back.to_series(), forward.to_series()]).drop_duplicates()
# merged.values
# array(['2009-12-22T00:00:00.000000000', '2009-12-23T00:00:00.000000000',
# '2009-12-24T00:00:00.000000000', '2009-12-25T00:00:00.000000000',
# '2009-12-26T00:00:00.000000000', '2009-12-27T00:00:00.000000000',
# '2009-12-28T00:00:00.000000000', '2009-12-29T00:00:00.000000000',
# '2009-12-30T00:00:00.000000000', '2009-12-31T00:00:00.000000000',
# '2010-01-01T00:00:00.000000000', '2010-01-02T00:00:00.000000000',
# '2010-01-03T00:00:00.000000000', '2010-01-04T00:00:00.000000000',
# '2010-01-05T00:00:00.000000000', '2010-01-06T00:00:00.000000000',
# '2010-01-07T00:00:00.000000000', '2010-01-08T00:00:00.000000000',
# '2010-01-09T00:00:00.000000000', '2010-01-10T00:00:00.000000000',
# '2010-01-11T00:00:00.000000000'], dtype='datetime64[ns]')
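As a possible alternative that stays on the index objects, `Index.union` merges two `DatetimeIndex` objects and keeps each date once. This is a sketch of a different technique from the `drop_duplicates()` call above, not a replacement for it; the variable name `dates` is only illustrative.

```python
import pandas as pd

back = pd.date_range(end="2010-01-01", periods=11)
forward = pd.date_range(start="2010-01-01", periods=11)

# Index.union returns the sorted set union, so the shared date appears once.
merged = back.union(forward)

# Convert to plain datetime objects, as in the question.
dates = merged.to_pydatetime().tolist()
print(len(dates))
```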