I am trying to solve the problem I have run into here.
#import libraries
from __future__ import division
from datetime import datetime, timedelta,date
import pandas as pd
%matplotlib inline
from sklearn.metrics import classification_report,confusion_matrix
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn.cluster import KMeans
import plotly.offline as pyoff
import plotly.graph_objs as go
from sklearn.model_selection import KFold, cross_val_score, train_test_split
#initiate plotly
pyoff.init_notebook_mode()
#read data from csv and redo the data work we did before
tx_data = pd.read_csv(r'C:\Users\aayus\OneDrive\Desktop\Aayu\College Project\OnlineRetail.csv', encoding='latin1')
tx_data['InvoiceDate'] = pd.to_datetime(tx_data['InvoiceDate'])
tx_data
tx_uk = tx_data.query("Country=='United Kingdom'").reset_index(drop=True)
tx_uk
Everything works up to this point. However, as soon as I add the following code, it raises an error.
#create 3m and 6m dataframes
tx_3m = tx_uk[(tx_uk.InvoiceDate < date(2011,6,1)) & (tx_uk.InvoiceDate >= date(2011,3,1))].reset_index(drop=True)
tx_6m = tx_uk[(tx_uk.InvoiceDate >= date(2011,6,1)) & (tx_uk.InvoiceDate < date(2011,12,1))].reset_index(drop=True)
The error is "Invalid comparison between dtype=datetime64[ns] and date". I am still very new to numpy and pandas, so any help would be much appreciated.
Thank you all.
Answer 0 (score: 0)
.dt.date
converts the values of a datetime64 Series into datetime.date objects,
which can then be compared with date(2011,6,1).
For example, tx_uk.InvoiceDate.dt.date < date(2011,6,1)
will work.
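
To make this concrete, here is a minimal sketch of the two filters from the question rewritten with .dt.date. The small tx_uk frame is a made-up stand-in for the real OnlineRetail.csv data, and the CustomerID values are only illustrative. The second option, comparing against pd.Timestamp instead of converting the column, is not part of the answer above; it is an alternative that should give the same rows while keeping the column as datetime64.

```python
from datetime import date

import pandas as pd

# Hypothetical stand-in for tx_uk; the real frame is loaded from OnlineRetail.csv.
tx_uk = pd.DataFrame({
    "InvoiceDate": pd.to_datetime([
        "2011-02-28 10:00", "2011-03-01 09:30", "2011-05-31 23:59",
        "2011-06-01 00:00", "2011-11-30 12:00", "2011-12-01 08:00",
    ]),
    "CustomerID": [1, 2, 3, 4, 5, 6],
})

# Option 1 (the answer's approach): convert each timestamp to datetime.date,
# so both sides of the comparison are plain date objects.
invoice_day = tx_uk["InvoiceDate"].dt.date
tx_3m = tx_uk[(invoice_day >= date(2011, 3, 1)) & (invoice_day < date(2011, 6, 1))].reset_index(drop=True)
tx_6m = tx_uk[(invoice_day >= date(2011, 6, 1)) & (invoice_day < date(2011, 12, 1))].reset_index(drop=True)

# Option 2 (an assumption, not from the answer): leave the column as datetime64
# and compare against pd.Timestamp values instead of datetime.date.
start_3m, end_3m = pd.Timestamp("2011-03-01"), pd.Timestamp("2011-06-01")
in_3m = (tx_uk["InvoiceDate"] >= start_3m) & (tx_uk["InvoiceDate"] < end_3m)
tx_3m_ts = tx_uk[in_3m].reset_index(drop=True)

print(tx_3m)
print(tx_6m)
```

Note that .dt.date produces an object-dtype Series, so on a large frame the pd.Timestamp comparison is likely to be faster; either way the InvoiceDate column itself is left unchanged.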