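I am trying to define these two variables and then use them again on line 6. However, I get the error below at runtime. It only seems to happen with pandas.date_range. My end goal is to run this as a .py file to generate a chart.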
start_date = raw_input('enter start date: ')
end_date = raw_input('enter end date: ')
dataPR['date'] = pd.DatetimeIndex(dataPR['intake_date']).date
grouped_dataPR = dataPR.groupby(['date']).sum()
idx = pd.date_range(start='%s', end='%s') % (start_date, end_date)
grouped_dataPR.index = pd.DatetimeIndex(grouped_dataPR.index)
grouped_dataPR = grouped_dataPR.reindex(idx, fill_value=0)
grouped_dataPR['date'] = grouped_dataPR.index
dataPR_df = pd.DataFrame([grouped_dataPR])
ts = pd.Series(grouped_dataPR['count'], index=grouped_dataPR.index)
ts.plot()
pd.rolling_mean(ts,30).plot(style='k')
Error:
ValueError Traceback (most recent call last)
<ipython-input-33-2ac5fe9d8951> in <module>()
2 grouped_dataPR = dataPR.groupby(['date']).sum()
3 #idx = pd.date_range('%s', '%s' % (start_date, end_date))
----> 4 idx = pd.date_range(start='%s', end='%s') % (start_date, end_date)
5 grouped_dataPR.index = pd.DatetimeIndex(grouped_dataPR.index)
6 grouped_dataPR = grouped_dataPR.reindex(idx, fill_value=0)
/Users/abc/anaconda/lib/python2.7/site-packages/pandas/tseries/index.pyc in date_range(start, end, periods, freq, tz, normalize, name, closed, **kwargs)
1921 return DatetimeIndex(start=start, end=end, periods=periods,
1922 freq=freq, tz=tz, normalize=normalize, name=name,
-> 1923 closed=closed, **kwargs)
1924
1925
/Users/abc/anaconda/lib/python2.7/site-packages/pandas/util/decorators.pyc in wrapper(*args, **kwargs)
87 else:
88 kwargs[new_arg_name] = new_arg_value
---> 89 return func(*args, **kwargs)
90 return wrapper
91 return _deprecate_kwarg
/Users/abc/anaconda/lib/python2.7/site-packages/pandas/tseries/index.pyc in __new__(cls, data, freq, start, end, periods, copy, name, tz, verify_integrity, normalize, closed, ambiguous, dtype, **kwargs)
235 return cls._generate(start, end, periods, name, freq,
236 tz=tz, normalize=normalize, closed=closed,
--> 237 ambiguous=ambiguous)
238
239 if not isinstance(data, (np.ndarray, Index, ABCSeries)):
/Users/abc/anaconda/lib/python2.7/site-packages/pandas/tseries/index.pyc in _generate(cls, start, end, periods, name, offset, tz, normalize, ambiguous, closed)
377
378 if start is not None:
--> 379 start = Timestamp(start)
380
381 if end is not None:
pandas/tslib.pyx in pandas.tslib.Timestamp.__new__ (pandas/tslib.c:8973)()
pandas/tslib.pyx in pandas.tslib.convert_to_tsobject (pandas/tslib.c:22522)()
pandas/tslib.pyx in pandas.tslib.convert_str_to_tsobject (pandas/tslib.c:24520)()
ValueError:
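Answer 0 (score: 4)
You should pass the variables directly, without wrapping placeholders in quotes. You are trying to do string substitution in an un-Pythonic way: in your line the % operator is applied to the return value of pd.date_range(...), not to the '%s' strings, so pandas receives the literal text '%s' and cannot parse it as a date.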
idx = pd.date_range(start=start_date, end=end_date)
If for some reason you still want to use string substitution, you have to do it like this, substituting into each string separately:
idx = pd.date_range(start='%s' % (start_date, ), end='%s' % (end_date, ))
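To make the difference concrete, here is a minimal sketch comparing the two calls. The sample dates are made up for illustration (they stand in for whatever the user types), and the exact wording of the error message may vary between pandas versions.

```python
import pandas as pd

# Hypothetical sample input standing in for the raw_input() values.
start_date = "2015-01-01"
end_date = "2015-01-05"

# Passing the strings directly works: pandas parses them as dates.
idx = pd.date_range(start=start_date, end=end_date)
print(len(idx))

# The original line calls pd.date_range(start='%s', end='%s') first,
# so pandas tries to parse the literal text '%s' as a date and fails
# before the % operator is ever applied.
raised = False
try:
    pd.date_range(start='%s', end='%s') % (start_date, end_date)
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

Because the failure happens inside the call, the tuple on the right-hand side of % is never used at all.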
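Answer 1 (score: 3)
I think you just need to do
idx = pd.date_range(start=start_date, end=end_date)
The reason is that pandas.date_range expects the start and end parameters to both be string or datetime-like objects, and '%s' is not datetime-like.
Even if it were a valid option, the code you wrote tries to perform a modulo operation between the result of pandas date_range and a tuple of strings, which would most likely raise a different error.
If you really do need string formatting for these values, I would suggest the newer string formatting style, like
pd.date_range(start='{}'.format(start_date), end='{}'.format(end_date))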
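Since the goal is to run this as a .py file, the following is a rough sketch of how the reindexing step might look on a current pandas install. It rests on several assumptions: the helper name daily_counts and the sample rows are invented for illustration, the 'intake_date' and 'count' columns are taken from the question, and pd.rolling_mean is replaced with Series.rolling(...).mean() because the old function has been removed from newer pandas releases. In a real script the two date strings would come from input() on Python 3 (raw_input on Python 2).

```python
import pandas as pd


def daily_counts(data, start_date, end_date):
    """Sum 'count' per calendar day and fill missing days with 0.

    Hypothetical helper; column names follow the question's code.
    """
    # Parse the user-supplied strings up front so bad input fails early.
    start = pd.to_datetime(start_date)
    end = pd.to_datetime(end_date)
    idx = pd.date_range(start=start, end=end)

    # Drop the time of day so rows from the same date group together.
    days = pd.to_datetime(data["intake_date"]).dt.normalize()
    grouped = data.groupby(days)["count"].sum()
    return grouped.reindex(idx, fill_value=0)


# Made-up sample data standing in for dataPR.
dataPR = pd.DataFrame({
    "intake_date": ["2015-01-01 08:00", "2015-01-01 17:30", "2015-01-03 09:15"],
    "count": [1, 2, 4],
})

ts = daily_counts(dataPR, "2015-01-01", "2015-01-04")
rolling = ts.rolling(2).mean()  # window of 2 only because the sample is tiny
print(ts)
print(rolling)
```

Calling ts.plot() and rolling.plot(style='k') afterwards would draw the chart, provided matplotlib is installed; the question's 30-day window would be ts.rolling(30).mean().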