Pandas Datetime error: Cannot compare type 'Timestamp' with type 'unicode'

Time: 2018-03-23 00:01:34

Tags: python pandas datetime for-loop

I have a dataframe, D1:

Date	Symbol	ICO_to
6/12/2017 18:00	MYST	5/30/2017
6/13/2017 18:00	MYST	5/30/2017
6/14/2017 18:00	MYST	5/30/2017
6/15/2017 18:00	MYST	5/30/2017
6/16/2017 18:00	MYST	5/30/2017
6/17/2017 18:00	MYST	5/30/2017
6/18/2017 18:00	MYST	5/30/2017
6/19/2017 18:00	MYST	5/30/2017
6/20/2017 18:00	MYST	5/30/2017

The code below checks whether the Date column is earlier than (ICO_to - 5 days). Where it is, I want to drop those rows from this dataframe:

D1.Date = pd.to_datetime(D1.Date) 
D1['Date'] = D1['Date'].dt.strftime('%m-%d-%Y')

D1.rename(columns={'ICO to': 'ICO_to'}, inplace=True)
D1.ICO_to = pd.to_datetime(D1.ICO_to)

for index, row in D1.iterrows():
    if D1.loc[index, 'Date'] < (D1.loc[index, 'ICO_to']-pd.Timedelta(5, unit='d')):
        D1.drop

But I get the following error, which points to the if statement in the loop above:

TypeError: Cannot compare type 'Timestamp' with type 'unicode' 

I think it's because you can't subtract the Timedelta value from the Datetime value, but I am not sure. How can I make this for loop logic work?

1 answer:

Answer 0: (score: 1)

Vectorise your calculation. Here is one way:

df['Date'] = pd.to_datetime(df['Date'])
df['ICO_to'] = pd.to_datetime(df['ICO_to'])

df = df.loc[~(df['Date'] < (df['ICO_to']-pd.Timedelta(5, unit='d'))), :]
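For anyone who wants to run this end to end, here is a minimal sketch on a few rows from the question. The 5/20/2017 row is hypothetical and was added only so the filter has something to remove, and the names too_early and kept are just for illustration. pd.Timedelta(days=5) is used as an equivalent spelling of the five-day offset.

```python
import pandas as pd

# Three rows from the question plus one hypothetical early row (5/20/2017),
# added only so that the filter has something to remove.
df = pd.DataFrame({
    "Date": ["5/20/2017 18:00", "6/12/2017 18:00",
             "6/13/2017 18:00", "6/14/2017 18:00"],
    "Symbol": ["MYST"] * 4,
    "ICO_to": ["5/30/2017"] * 4,
})

# Convert both columns to datetime64 and leave them that way.
df["Date"] = pd.to_datetime(df["Date"])
df["ICO_to"] = pd.to_datetime(df["ICO_to"])

# True where Date is earlier than ICO_to minus 5 days.
too_early = df["Date"] < (df["ICO_to"] - pd.Timedelta(days=5))

# ~ inverts the mask, so only rows that are not too early are kept.
kept = df.loc[~too_early, :]

print(too_early.tolist())
print(len(kept))
```

Naming the mask first makes it easy to inspect which rows would be removed before actually filtering.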

Explanation

  • The condition ~(df['Date'] < (df['ICO_to']-pd.Timedelta(5, unit='d'))) produces a Boolean array. The ~ operator negates that array element-wise. Rows where the result is True are kept; rows where it is False are removed.
  • The pandas documentation is a good place to learn about vectorising your calculations.
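As for the error itself, the likely cause is not the Timedelta subtraction but the strftime line in the question: it turns the Date column back into plain strings after to_datetime has parsed it, so the loop ends up comparing a string with a Timestamp (on Python 2 the string type is reported as 'unicode'). Here is a small sketch of that, assuming one sample value from the question; the variable names are made up for illustration.

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(["6/12/2017 18:00"]))

# strftime formats each Timestamp as text, so the result holds strings again.
as_text = dates.dt.strftime("%m-%d-%Y")

# Subtracting a Timedelta from a Timestamp works without any problem.
cutoff = pd.Timestamp("2017-05-30") - pd.Timedelta(days=5)

print(type(dates.iloc[0]).__name__)    # Timestamp
print(type(as_text.iloc[0]).__name__)  # str

# Comparing the string with the Timestamp is what fails.
try:
    as_text.iloc[0] < cutoff
    comparison_failed = False
except TypeError:
    comparison_failed = True

print(comparison_failed)
```

So dropping the strftime line (or applying it only when displaying the result) should be enough to make the comparison work. Note also that D1.drop on its own, without parentheses and an index, does not remove anything.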