I am trying to run this code, and it works fine, but as soon as I include the Time column I get an error. How can I solve this? I tried this:
df['Time'] = pd.to_datetime(df['Time'])
but it did not help:
ValueError: could not convert string to float: '12:00:00 PM'
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt
import scipy.stats as stats
df = pd.read_csv('C:/Users/mypc/data/mydata.csv')
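# Fill gaps with the median of the numeric columns only; the string Time column has no median
df = df.fillna(df.median(numeric_only=True))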
model = IsolationForest(n_estimators=50, max_samples='auto', contamination=0.1, max_features=1.0)
model.fit(df[['Year', 'Month', 'Day', 'Time', 'A', 'B']])
df['scores'] = model.decision_function(df[['Year', 'Month', 'Day', 'Time', 'A', 'B']])
df['anomaly'] = model.predict(df[['Year', 'Month', 'Day', 'Time', 'A', 'B']])
print(df.head(20))
df.to_csv(r'C:/Users/dataanom.csv', index=False, header=True)
anomaly=df.loc[df['anomaly']==-1]
anomaly_index=list(anomaly.index)
print(anomaly_index)
anomaly_index.sort()
print(anomaly_index)
df = pd.DataFrame(anomaly_index)
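My data looks like this:
Answer 0 (score: 0)
Is this what you are after? It converts each time string to seconds since midnight (the column is named 'Hours' in this example):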
>>> pd.to_datetime(df['Hours'], format="%I:%M:%S %p") \
...     .apply(lambda t: t.hour*3600 + t.minute*60 + t.second)
0    43200
1    10800
2    50400
Name: Hours, dtype: int64
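To tie this back to the question, here is a minimal sketch of how the conversion could be applied to the 'Time' column before fitting the model. The sample rows are made up (the real data is not shown), `time_to_seconds` is a hypothetical helper name, and `random_state=0` is only there to make the run repeatable. It also assumes every value follows the `HH:MM:SS AM/PM` pattern from the error message.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest


def time_to_seconds(times: pd.Series) -> pd.Series:
    """Convert strings such as '12:00:00 PM' to seconds since midnight."""
    parsed = pd.to_datetime(times, format="%I:%M:%S %p")
    return parsed.dt.hour * 3600 + parsed.dt.minute * 60 + parsed.dt.second


# Made-up rows standing in for mydata.csv (assumption: same column names).
df = pd.DataFrame({
    "Year": [2020] * 6,
    "Month": [1] * 6,
    "Day": [1, 1, 2, 2, 3, 3],
    "Time": ["12:00:00 PM", "03:00:00 AM", "02:00:00 PM",
             "12:00:00 PM", "03:00:00 AM", "11:59:59 PM"],
    "A": [1.0, 1.1, 0.9, 1.0, 1.2, 9.0],
    "B": [5.0, 5.1, 4.9, 5.0, 5.2, 0.5],
})

# Replace the string column with a plain number the model can use.
df["Time"] = time_to_seconds(df["Time"])

features = ["Year", "Month", "Day", "Time", "A", "B"]
model = IsolationForest(n_estimators=50, contamination=0.1, random_state=0)
model.fit(df[features])
df["scores"] = model.decision_function(df[features])
df["anomaly"] = model.predict(df[features])
print(df[["Time", "scores", "anomaly"]])
```

The point is that IsolationForest needs numeric features, so the time has to become a number (here, seconds since midnight) rather than a string or a datetime object.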