Question

These are the first rows of my file.

Year    Revenue
0   Jan-07  1757000
1   Feb-07  2052000
2   Mar-07  2747000
3   Apr-07  2308000
4   May-07  2289000
5   Jun-07  2322000
6   Jul-07  2310000
7   Aug-07  2049000
8   Sep-07  1862000
9   Oct-07  2006000
10  Nov-07  2061000

I start my code as follows:

```
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline
from pandas.plotting import register_matplotlib_converters
from pandas_datareader import data as pdr
from pandas.plotting import autocorrelation_plot
import seaborn as sns

from datetime import datetime
from datetime import timedelta
```

I then imported my data set:
```
df = pd.read_csv('pathway.csv', sep=',')
```

I wanted to see which data types I was working with, so I ran ```df.info()```:

RangeIndex: 144 entries, 0 to 143
Data columns (total 2 columns):
Year                               144 non-null object
Recycled material sales revenue    144 non-null int64
dtypes: int64(1), object(1)
memory usage: 2.3+ KB


Then I tried to convert the dates to yyyy-mm-dd format with the code below, but it fails with OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-07 00:00:00

df.month = pd.to_datetime(df.month)
df.set_index('month', inplace=True)


I expect my data set to change to:
0   2007-01-01  1757000
1   2007-02-01  2052000
2   2007-03-01  2747000
3   2007-04-01  2308000
4   2007-05-01  2289000
5   2007-06-01  2322000
...

Once this works, I will plot a time series graph with revenue ($) on the y-axis and the date on the x-axis.

Answer 1

Fix the dates by adding the part that is missing? I think we can set the day to the first of each month.

Then convert them to datetime:

df["year"] = "01-" + df["year"]

df["year"] = pd.to_datetime(df["year"], format="%d-%b-%y")

df = df.set_index("year")
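As a variation on the same idea, here is a minimal self-contained sketch that skips the "01-" prefix and passes the format straight to `pd.to_datetime`. It assumes the columns are named `Year` and `Revenue`, as in the header printed in the question, and that the month abbreviations are English (`%b` depends on the locale). When the format contains no day, the day defaults to the first of the month. The original error looks like the parser read `07` as a day instead of a year, which an explicit format avoids.

```python
import pandas as pd

# Sample rows copied from the question. The column names "Year" and
# "Revenue" are an assumption based on the printed header.
df = pd.DataFrame(
    {
        "Year": ["Jan-07", "Feb-07", "Mar-07", "Apr-07"],
        "Revenue": [1757000, 2052000, 2747000, 2308000],
    }
)

# %b = abbreviated month name, %y = two-digit year.
# No day appears in the format, so each date lands on the 1st of the month.
df["Year"] = pd.to_datetime(df["Year"], format="%b-%y")

# Use the parsed dates as the index, ready for time series plotting.
df = df.set_index("Year")

print(df)
```

With the dates in the index, `df["Revenue"].plot()` should put the dates on the x-axis and the revenue on the y-axis.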
