AssertionError when using statsmodels with pandas

Time: 2019-01-24 04:14:25

Tags: python pandas statsmodels

I am trying to run the ordinary least squares example below.

import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
dat = sm.datasets.get_rdataset("Guerry", "HistData").data
results = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()
print(results.summary())

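However, I get the error shown below. The example comes from http://www.statsmodels.org/stable/index.html, and my statsmodels version is 0.9. Following the suggestion in the first comment, I removed the np. prefix from the formula, but I still get the same error, so that did not help. That is everything I could think of trying, so please help. The traceback is very long, so I have split it into parts.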

The first part is as follows:

AssertionError                            Traceback (most recent call last)
<ipython-input-6-1d91087b5e15> in <module>()
      3 import statsmodels.formula.api as smf
      4 dat = sm.datasets.get_rdataset("Guerry", "HistData").data
----> 5 results = smf.ols('Lottery ~ Literacy + log(Pop1831)', data=dat).fit()
      6 print(results.summary())

~\Anaconda3\lib\site-packages\statsmodels\base\model.py in from_formula(cls, formula, data, subset, drop_cols, *args, **kwargs)
    153 
    154         tmp = handle_formula_data(data, None, formula, depth=eval_env,
--> 155                                   missing=missing)
    156         ((endog, exog), missing_idx, design_info) = tmp
    157 

~\Anaconda3\lib\site-packages\statsmodels\formula\formulatools.py in handle_formula_data(Y, X, formula, depth, missing)
     63         if data_util._is_using_pandas(Y, None):
     64             result = dmatrices(formula, Y, depth, return_type='dataframe',
---> 65                                NA_action=na_action)
     66         else:
     67             result = dmatrices(formula, Y, depth, return_type='dataframe',

The second part is as follows:

~\Anaconda3\lib\site-packages\patsy\highlevel.py in dmatrices(formula_like, data, eval_env, NA_action, return_type)
    308     eval_env = EvalEnvironment.capture(eval_env, reference=1)
    309     (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
--> 310                                       NA_action, return_type)
    311     if lhs.shape[1] == 0:
    312         raise PatsyError("model is missing required outcome variables")

~\Anaconda3\lib\site-packages\patsy\highlevel.py in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
    163         return iter([data])
    164     design_infos = _try_incr_builders(formula_like, data_iter_maker, eval_env,
--> 165                                       NA_action)
    166     if design_infos is not None:
    167         return build_design_matrices(design_infos, data,

~\Anaconda3\lib\site-packages\patsy\highlevel.py in _try_incr_builders(formula_like, data_iter_maker, eval_env, NA_action)
     60                 "ascii-only, or else upgrade to Python 3.")
     61     if isinstance(formula_like, str):
---> 62         formula_like = ModelDesc.from_formula(formula_like)
     63         # fallthrough
     64     if isinstance(formula_like, ModelDesc):

The third part is as follows:

~\Anaconda3\lib\site-packages\patsy\desc.py in from_formula(cls, tree_or_string)
    162             tree = tree_or_string
    163         else:
--> 164             tree = parse_formula(tree_or_string)
    165         value = Evaluator().eval(tree, require_evalexpr=False)
    166         assert isinstance(value, cls)

~\Anaconda3\lib\site-packages\patsy\parse_formula.py in parse_formula(code, extra_operators)
    146     tree = infix_parse(_tokenize_formula(code, operator_strings),
    147                        operators,
--> 148                        _atomic_token_types)
    149     if not isinstance(tree, ParseNode) or tree.type != "~":
    150         tree = ParseNode("~", None, [tree], tree.origin)

~\Anaconda3\lib\site-packages\patsy\infix_parser.py in infix_parse(tokens, operators, atomic_types, trace)
    208 
    209     want_noun = True
--> 210     for token in token_source:
    211         if c.trace:
    212             print("Reading next token (want_noun=%r)" % (want_noun,))  

The fourth part is as follows:

~\Anaconda3\lib\site-packages\patsy\parse_formula.py in _tokenize_formula(code, operator_strings)
     92         else:
     93             it.push_back((pytype, token_string, origin))
---> 94             yield _read_python_expr(it, end_tokens)
     95 
     96 def test__tokenize_formula():

~\Anaconda3\lib\site-packages\patsy\parse_formula.py in _read_python_expr(it, end_tokens)
     42     origins = []
     43     bracket_level = 0
---> 44     for pytype, token_string, origin in it:
     45         assert bracket_level >= 0
     46         if bracket_level == 0 and token_string in end_tokens:

~\Anaconda3\lib\site-packages\patsy\util.py in next(self)
    330         else:
    331             # May raise StopIteration
--> 332             return six.advance_iterator(self._it)
    333     __next__ = next

The fifth part is as follows:

~\Anaconda3\lib\site-packages\patsy\tokens.py in python_tokenize(code)
     33                 break
     34             origin = Origin(code, start, end)
---> 35             assert pytype not in (tokenize.NL, tokenize.NEWLINE)
     36             if pytype == tokenize.ERRORTOKEN:
     37                 raise PatsyError("error tokenizing input "

AssertionError: 

1 Answer:

Answer 0: (score: 2)

This is a bug in patsy, the formula parser that statsmodels uses. I ran into the same problem, and upgrading to patsy 0.5.1 fixed it. See: https://github.com/statsmodels/statsmodels/issues/5343
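
To see whether the installed patsy is older than the 0.5.1 release mentioned above, a quick check along these lines may help. This is only a sketch: `version_tuple` and `needs_upgrade` are helper names made up for this illustration, not part of patsy or statsmodels, and the comparison only looks at the numeric parts of the version string.

```python
import re


def version_tuple(version):
    """Turn a version string such as '0.5.1' into (0, 5, 1).

    Hypothetical helper for this sketch: only the first three numeric
    components are kept, so suffixes like 'rc1' or '+dev' are ignored.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def needs_upgrade(installed, minimum="0.5.1"):
    """Return True if `installed` is older than `minimum` (hypothetical helper)."""
    return version_tuple(installed) < version_tuple(minimum)


try:
    import patsy
except ImportError:
    print("patsy is not installed")
else:
    # patsy exposes its version string as patsy.__version__
    print("patsy", patsy.__version__, "| upgrade needed:", needs_upgrade(patsy.__version__))
```

If the check reports that an upgrade is needed, `pip install --upgrade patsy` (or `conda update patsy` in an Anaconda environment like the one in the traceback) should bring in a newer release. Restart the notebook kernel afterwards so the new version is actually imported.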