patsy
Function "C" in current environment overrides categorical operator
Sorry in advance if this bug is actually a feature - I've only used patsy under the hood, so I may be missing something.
This said, to reproduce:
In [1]: from patsy import dmatrices, demo_data
In [2]: def C(arg):
   ...:     return None
   ...:
In [3]: data = demo_data("a", "b", "x1", "x2", "y", "z column")
In [4]: dmatrices("y ~ x1 + x2 + C(a)", data)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-3f624a0ee227> in <module>()
----> 1 dmatrices("y ~ x1 + x2 + C(a)", data)
~/.local/lib/python3.5/site-packages/patsy/highlevel.py in dmatrices(formula_like, data, eval_env, NA_action, return_type)
308 eval_env = EvalEnvironment.capture(eval_env, reference=1)
309 (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
--> 310 NA_action, return_type)
311 if lhs.shape[1] == 0:
312 raise PatsyError("model is missing required outcome variables")
~/.local/lib/python3.5/site-packages/patsy/highlevel.py in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
163 return iter([data])
164 design_infos = _try_incr_builders(formula_like, data_iter_maker, eval_env,
--> 165 NA_action)
166 if design_infos is not None:
167 return build_design_matrices(design_infos, data,
~/.local/lib/python3.5/site-packages/patsy/highlevel.py in _try_incr_builders(formula_like, data_iter_maker, eval_env, NA_action)
68 data_iter_maker,
69 eval_env,
---> 70 NA_action)
71 else:
72 return None
~/.local/lib/python3.5/site-packages/patsy/build.py in design_matrix_builders(termlists, data_iter_maker, eval_env, NA_action)
719 term_to_subterm_infos = _make_subterm_infos(termlist,
720 num_column_counts,
--> 721 cat_levels_contrasts)
722 assert isinstance(term_to_subterm_infos, OrderedDict)
723 assert frozenset(term_to_subterm_infos) == frozenset(termlist)
~/.local/lib/python3.5/site-packages/patsy/build.py in _make_subterm_infos(terms, num_column_counts, cat_levels_contrasts)
626 coded = code_contrast_matrix(factor_coding[factor],
627 levels, contrast,
--> 628 default=Treatment)
629 contrast_matrices[factor] = coded
630 subterm_columns *= coded.matrix.shape[1]
~/.local/lib/python3.5/site-packages/patsy/contrasts.py in code_contrast_matrix(intercept, levels, contrast, default)
600 return contrast.code_with_intercept(levels)
601 else:
--> 602 return contrast.code_without_intercept(levels)
603
~/.local/lib/python3.5/site-packages/patsy/contrasts.py in code_without_intercept(self, levels)
181 else:
182 reference = _get_level(levels, self.reference)
--> 183 eye = np.eye(len(levels) - 1)
184 contrasts = np.vstack((eye[:reference, :],
185 np.zeros((1, len(levels) - 1)),
~/.local/lib/python3.5/site-packages/numpy/lib/twodim_base.py in eye(N, M, k, dtype, order)
184 if M is None:
185 M = N
--> 186 m = zeros((N, M), dtype=dtype, order=order)
187 if k >= M:
188 return m
ValueError: negative dimensions are not allowed
Now sure, I could just change the name of my C() function. But:
- having this problem in the middle of a long notebook, it was pretty hard for me to understand what was going on
- other operators might be exhibiting the same problem (I have no idea)
- from an (admittedly quick) look at the docs, I found no reference that patsy is looking for operators in the current scope
> from an (admittedly quick) look at the docs, I found no reference that patsy is looking for operators in the current scope
OK, sorry, I could have searched a bit better. So the question just becomes: is it OK that functions defined in the environment override patsy operator(s)?
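To make the question concrete, here is a toy model of the name lookup that would explain what I'm seeing. It does not use patsy at all: `builtin_C`, `FORMULA_BUILTINS` and `evaluate_factor` are hypothetical names I made up, and the lookup order (data columns, then the caller's namespace, then the library's built-ins last) is my assumption based on the behaviour above, not something I've checked against patsy's source.

```python
from collections import ChainMap


def builtin_C(values):
    """Hypothetical stand-in for a formula library's categorical marker."""
    return ("categorical", list(values))


# Hypothetical table of names the library provides to every formula.
FORMULA_BUILTINS = {"C": builtin_C}


def evaluate_factor(code, data, caller_namespace):
    """Evaluate one formula factor such as "C(a)".

    Assumed lookup order: data columns, then the caller's namespace,
    then the library built-ins. ChainMap returns the first match, so a
    name defined by the caller hides a built-in of the same name.
    """
    scope = ChainMap(dict(data), dict(caller_namespace), FORMULA_BUILTINS)
    # Empty __builtins__ so only names from `scope` can be resolved.
    return eval(code, {"__builtins__": {}}, scope)


data = {"a": ["a1", "a2", "a1"]}

# Nothing named C in the caller's scope: the built-in marker is used.
print(evaluate_factor("C(a)", data, {}))


def C(arg):
    return None


# With my own C in scope, the built-in is silently shadowed.
print(evaluate_factor("C(a)", data, {"C": C}))
```

If patsy's lookup works anything like this, my `C` returning `None` would replace the categorical marker without any warning, and the failure would only show up much later, as in the traceback above.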