
Cardea class `predict` functionality

Open sarahmish opened this issue 4 years ago • 0 comments

In reference to issue #85, we discovered a new requirement for the `predict` function.

current support

The current version of `predict` only supports an intermediate input: a feature matrix passed as `cardea.predict(X)`, where `X` is a numpy array. However, there are many scenarios in which the user does not have such a matrix at hand.

new support

Assume the user is given new data. How can they use the current API to get predictions for it? We need to transform the new data into the intermediate representation used by the modeler. The sequence of transformations includes:

  1. load the new data into an entityset
  2. use the same labeling function to generate a label per instance. Cutoff times will be determined based on the current real time.
  3. calculate the feature matrix for the new data, using the previously computed features as seed features
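
The three steps above can be sketched as one function that chains them. This is only an illustration under stated assumptions: `featurize_new_data`, `load_entityset`, `label_instances`, `compute_features`, and `seed_features` are hypothetical names, not part of the Cardea API, and the three step functions are passed in as callables so the sketch does not depend on how Cardea implements them.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Optional, Sequence


def featurize_new_data(
    path: str,
    load_entityset: Callable[[str], Any],
    label_instances: Callable[[Any, datetime], Any],
    compute_features: Callable[[Any, Any, Sequence[Any]], Any],
    seed_features: Sequence[Any],
    now: Optional[datetime] = None,
) -> Any:
    """Turn raw data at `path` into the feature matrix the modeler expects.

    All callables are hypothetical stand-ins for the real pipeline steps.
    """
    # Cutoff times come from the current real time unless one is supplied.
    cutoff = now if now is not None else datetime.now(timezone.utc)
    # Step 1: load the new data into an entityset.
    entityset = load_entityset(path)
    # Step 2: apply the same labeling function that was used for training.
    label_times = label_instances(entityset, cutoff)
    # Step 3: compute the feature matrix, seeded with the training features.
    return compute_features(entityset, label_times, seed_features)
```

Making `now` an optional argument keeps the "current real time" behavior as the default while still allowing a fixed cutoff for reproducible runs.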

proposed changes

from typing import Union

import numpy as np
import pandas as pd


def predict(self, X: Union[str, np.ndarray, pd.DataFrame]) -> Union[np.ndarray, list]:
    """Get predictions from the cardea pipeline.

    Args:
        X (str, pandas.DataFrame or ndarray):
            Inputs to the pipeline. If string, it points to the data path.

    Returns:
        numpy.ndarray or list:
            Predictions to the input data.
    """
    if isinstance(X, str):
        # X is a data path: load it into an entityset, label it, and
        # compute the feature matrix (the three steps above) before predicting.
        pass
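
To illustrate how the proposed type check might fit into a complete call, here is a minimal, dependency-free sketch. It is not the actual Cardea class: `PredictSketch`, `model_predict`, and `featurize` are hypothetical names, and plain nested lists stand in for the numpy array or DataFrame so the example runs without extra packages.

```python
from typing import Callable, List, Union

# Stand-in for a feature matrix; the real input would be an ndarray or DataFrame.
Matrix = List[List[float]]


class PredictSketch:
    """Hypothetical wrapper showing the dispatch on the type of X."""

    def __init__(
        self,
        model_predict: Callable[[Matrix], list],
        featurize: Callable[[str], Matrix],
    ) -> None:
        self._model_predict = model_predict  # fitted model's predict step
        self._featurize = featurize          # path -> feature matrix

    def predict(self, X: Union[str, Matrix]) -> list:
        # A string is treated as a data path and converted to a feature matrix.
        if isinstance(X, str):
            X = self._featurize(X)
        # Anything else is assumed to already be a feature matrix.
        return self._model_predict(X)
```

With this shape, existing callers that pass a feature matrix are unaffected, and only string inputs take the extra transformation path.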

sarahmish, Apr 14 '21 07:04