Angela Lin

30 issues by Angela Lin

#2670 introduced `pydocstyle` and `darglint` packages. The `darglint` package specifically increases the runtime of our lint job by a few minutes. While we were okay with this addition, I suspect...

enhancement
testing
performance
spike

Right now, `component_graph.get_component` expects a string: the unique name used to find a component in the graph (e.g., "My Label Encoder", not "Label Encoder"). This makes it...

new feature
good first issue

Rather than relying on the CV scores to rank the pipelines on the leaderboard, perhaps we should have a model selection split where we hold out some data and rank...

new feature
needs design
spike

`test_components.py::test_describe_component` is a test that checks whether `component.describe()` returns the appropriate result. However, if a dev adds a new component, there is nothing requiring the dev to add that...

refactor
testing

I noticed a weird error in https://github.com/alteryx/evalml/pull/2546, a small PR which moved `get_hyperparameter_ranges` to `PipelineBase`. The failed Read the Docs build is here: https://readthedocs.com/projects/feature-labs-inc-evalml/builds/683782/, with the following error: ``` Traceback (most recent...

bug
documentation
testing

We currently use graphviz to generate our graphical representation of component graphs / pipelines. https://github.com/alteryx/evalml/pull/2654 updated this representation to include X and y nodes and edges, but seems a little...

enhancement
spike

Separating out work from https://github.com/alteryx/evalml/issues/2058: https://github.com/alteryx/evalml/pull/2968 tackled the first half of creating a preprocessing pipeline that will encompass all of the components created from data check actions. This issue will...

enhancement
new feature
tech debt

Follow up on https://github.com/alteryx/evalml/pull/3182 based on @freddyaboulton's comment: I think we can improve this implementation. Right now we do two scans of the data to determine the highly null columns...

refactor
good first issue
performance
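
The highly-null-columns snippet above is cut off before the proposed fix, so the following is only a rough sketch of the general idea of avoiding a second scan, not the approach from the PR or the comment. It assumes the data is a pandas DataFrame; the function name `highly_null_columns`, the `threshold` parameter, and the `>=` comparison are hypothetical and are not taken from evalml.

```python
import pandas as pd


def highly_null_columns(df: pd.DataFrame, threshold: float = 0.95) -> dict:
    """Hypothetical helper (not evalml API): map each highly null column
    to its fraction of missing values, scanning the data only once."""
    # Single pass over the DataFrame: fraction of missing values per column.
    null_fraction = df.isna().mean()
    # Filtering this small per-column Series does not rescan the data.
    flagged = null_fraction[null_fraction >= threshold]
    return flagged.to_dict()


df = pd.DataFrame(
    {
        "a": [None, None, None, None],
        "b": [1.0, None, 2.0, None],
        "c": [1, 2, 3, 4],
    }
)
result = highly_null_columns(df, threshold=0.5)
```

Computing the per-column fraction once and deriving both the column names and their percentages from it is what keeps this to one scan.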

It could be useful to add feature distribution (via histogram?) to our partial dependence plots so users can determine whether there is sufficient data to interpret the relationship between the...

enhancement
good first issue

If I initialize a Woodwork DataTable using a pandas DataFrame and then initialize another Woodwork DataTable using the numpy array underneath, it creates a Woodwork DataTable with category types. However,...

bug