Function "C" in current environment overrides categorical operator

Open toobaz opened this issue 7 years ago • 1 comment

Sorry in advance if this bug is actually a feature - I've only used patsy under the hood, so I may be missing something.

That said, to reproduce:

In [1]: from patsy import dmatrices, demo_data

In [2]: def C(arg):
   ...:     return None
   ...: 

In [3]: data = demo_data("a", "b", "x1", "x2", "y", "z column")

In [4]: dmatrices("y ~ x1 + x2 + C(a)", data)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-3f624a0ee227> in <module>()
----> 1 dmatrices("y ~ x1 + x2 + C(a)", data)

~/.local/lib/python3.5/site-packages/patsy/highlevel.py in dmatrices(formula_like, data, eval_env, NA_action, return_type)
    308     eval_env = EvalEnvironment.capture(eval_env, reference=1)
    309     (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
--> 310                                       NA_action, return_type)
    311     if lhs.shape[1] == 0:
    312         raise PatsyError("model is missing required outcome variables")

~/.local/lib/python3.5/site-packages/patsy/highlevel.py in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
    163         return iter([data])
    164     design_infos = _try_incr_builders(formula_like, data_iter_maker, eval_env,
--> 165                                       NA_action)
    166     if design_infos is not None:
    167         return build_design_matrices(design_infos, data,

~/.local/lib/python3.5/site-packages/patsy/highlevel.py in _try_incr_builders(formula_like, data_iter_maker, eval_env, NA_action)
     68                                       data_iter_maker,
     69                                       eval_env,
---> 70                                       NA_action)
     71     else:
     72         return None

~/.local/lib/python3.5/site-packages/patsy/build.py in design_matrix_builders(termlists, data_iter_maker, eval_env, NA_action)
    719         term_to_subterm_infos = _make_subterm_infos(termlist,
    720                                                     num_column_counts,
--> 721                                                     cat_levels_contrasts)
    722         assert isinstance(term_to_subterm_infos, OrderedDict)
    723         assert frozenset(term_to_subterm_infos) == frozenset(termlist)

~/.local/lib/python3.5/site-packages/patsy/build.py in _make_subterm_infos(terms, num_column_counts, cat_levels_contrasts)
    626                         coded = code_contrast_matrix(factor_coding[factor],
    627                                                      levels, contrast,
--> 628                                                      default=Treatment)
    629                         contrast_matrices[factor] = coded
    630                         subterm_columns *= coded.matrix.shape[1]

~/.local/lib/python3.5/site-packages/patsy/contrasts.py in code_contrast_matrix(intercept, levels, contrast, default)
    600         return contrast.code_with_intercept(levels)
    601     else:
--> 602         return contrast.code_without_intercept(levels)
    603 

~/.local/lib/python3.5/site-packages/patsy/contrasts.py in code_without_intercept(self, levels)
    181         else:
    182             reference = _get_level(levels, self.reference)
--> 183         eye = np.eye(len(levels) - 1)
    184         contrasts = np.vstack((eye[:reference, :],
    185                                 np.zeros((1, len(levels) - 1)),

~/.local/lib/python3.5/site-packages/numpy/lib/twodim_base.py in eye(N, M, k, dtype, order)
    184     if M is None:
    185         M = N
--> 186     m = zeros((N, M), dtype=dtype, order=order)
    187     if k >= M:
    188         return m

ValueError: negative dimensions are not allowed

Now sure, I could just change the name of my C() function. But:

  • when I hit this problem in the middle of a long notebook, it was pretty hard for me to understand what was going on
  • other operators might be exhibiting the same problem (I have no idea)
  • from an (admittedly quick) look at the docs, I found no reference that patsy is looking for operators in the current scope

toobaz avatar Feb 25 '18 16:02 toobaz

from an (admittedly quick) look at the docs, I found no reference that patsy is looking for operators in the current scope

OK, sorry, I could have searched a bit better. So the question just becomes: is it OK that functions defined in the environment override patsy operator(s)?

toobaz avatar Feb 26 '18 07:02 toobaz
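
For anyone hitting the same error, here is a simplified sketch of how the shadowing can arise. It assumes that names in a formula are resolved by checking the data first, then the calling environment, and only then the library's own built-ins such as C. The `evaluate` helper, `builtin_C`, and `BUILTINS` are hypothetical stand-ins written for illustration. They are not patsy's actual code.

```python
from collections import ChainMap


def builtin_C(values):
    """Hypothetical stand-in for a library-provided categorical marker."""
    return ("categorical", list(values))


# Assumed, simplified model: names the formula machinery supplies by default.
BUILTINS = {"C": builtin_C}


def evaluate(expr, data, user_namespace):
    """Evaluate expr, looking names up in data, then user code, then built-ins."""
    scope = ChainMap(data, user_namespace, BUILTINS)
    return eval(expr, {}, scope)


data = {"a": ["a1", "a2", "a1"]}

# No user-defined C: the built-in stand-in is found last and used.
without_override = evaluate("C(a)", data, {})


def C(arg):
    return None


# A user-defined C sits earlier in the lookup chain and shadows the built-in.
with_override = evaluate("C(a)", data, {"C": C})

print(without_override)  # ('categorical', ['a1', 'a2', 'a1'])
print(with_override)     # None
```

In this model the override is silent: nothing warns that C no longer refers to the categorical marker, which would be consistent with the error only surfacing much later, deep in the traceback above. Renaming the user function, as mentioned earlier in the thread, avoids the clash.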