I have a dataframe, D1:
Date Symbol ICO_to
6/12/2017 18:00 MYST 5/30/2017
6/13/2017 18:00 MYST 5/30/2017
6/14/2017 18:00 MYST 5/30/2017
6/15/2017 18:00 MYST 5/30/2017
6/16/2017 18:00 MYST 5/30/2017
6/17/2017 18:00 MYST 5/30/2017
6/18/2017 18:00 MYST 5/30/2017
6/19/2017 18:00 MYST 5/30/2017
6/20/2017 18:00 MYST 5/30/2017
Below, I try to check whether the Date column is less than (ICO_to - 5 days). If it is, I want to drop all rows in this specific dataframe:
D1.Date = pd.to_datetime(D1.Date)
D1['Date'] = D1['Date'].dt.strftime('%m-%d-%Y')
D1.rename(columns={'ICO to': 'ICO_to'}, inplace=True)
D1.ICO_to = pd.to_datetime(D1.ICO_to)
for index, row in D1.iterrows():
    if D1.loc[index, 'Date'] < (D1.loc[index, 'ICO_to']-pd.Timedelta(5, unit='d')):
        D1.drop
But I get this error, which points to the if statement in the loop above:
TypeError: Cannot compare type 'Timestamp' with type 'unicode'
I think it's because you can't subtract the Timedelta value from the Datetime value, but I am not sure. How can I make this for loop logic work?
Answer 0 (score: 1)
Vectorise your calculation. Here is one way:
df['Date'] = pd.to_datetime(df['Date'])
df['ICO_to'] = pd.to_datetime(df['ICO_to'])
df = df.loc[~(df['Date'] < (df['ICO_to']-pd.Timedelta(5, unit='d'))), :]
Explanation
~(df['Date'] < (df['ICO_to']-pd.Timedelta(5, unit='d')))
produces a Boolean array. The ~ operator negates it element-wise, so rows where the array is True are kept and rows where it is False are removed.
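The TypeError in the question most likely comes from the dt.strftime call, which turns Date back into strings before the comparison with the Timestamp in ICO_to. The sketch below skips that step and applies the vectorised filter to a small frame. The two May rows are hypothetical additions (not in the original data) so that the filter has something to remove, and the names cutoff, keep and filtered are illustrative only.

```python
import pandas as pd

# Sample rows modeled on the question's data. The 5/20 and 5/24 rows are
# hypothetical, added only so that some rows fall before the cutoff.
df = pd.DataFrame({
    "Date": ["5/20/2017 18:00", "5/24/2017 18:00",
             "6/12/2017 18:00", "6/13/2017 18:00"],
    "Symbol": ["MYST"] * 4,
    "ICO_to": ["5/30/2017"] * 4,
})

# Keep both columns as datetimes; do not convert Date back to strings
# with strftime before comparing.
df["Date"] = pd.to_datetime(df["Date"])
df["ICO_to"] = pd.to_datetime(df["ICO_to"])

# Per-row cutoff: five days before ICO_to (here 2017-05-25 00:00).
cutoff = df["ICO_to"] - pd.Timedelta(5, unit="d")

# True marks rows to keep, i.e. rows whose Date is NOT before the cutoff.
keep = ~(df["Date"] < cutoff)
filtered = df.loc[keep].reset_index(drop=True)

print(filtered)
```

With this data, only the two June rows should remain in filtered. If a formatted date string is needed for display, it is probably safer to apply strftime after filtering, or to store it in a separate column.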