Model Objects

Overview

Skater contains an abstraction for predictive models. Model APIs vary by implementation. The skater Model object manages variations in how models are called, the inputs they expect, and the outputs they generate, so that inputs, outputs, and calls are standardized both for the user and for the rest of the code base. Currently the Model object acts as the base class for the InMemoryModel and DeployedModel classes, though this API may change in later versions.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from skater.model import InMemoryModel

# Load a small binary classification dataset
breast_cancer = load_breast_cancer()
X = breast_cancer.data
y = breast_cancer.target

# Fit a classifier to wrap
gb = GradientBoostingClassifier()
gb.fit(X, y)

# Wrap the prediction function; examples help skater infer the output type
model = InMemoryModel(gb.predict_proba, examples=X)

InMemoryModel

Models that are callable functions are exposed via the InMemoryModel object.

InMemoryModel.__init__(prediction_fn, input_formatter=None, output_formatter=None, target_names=None, feature_names=None, unique_values=None, examples=None, model_type=None, probability=None, log_level=30)

This model can be called directly from memory.

Parameters:
prediction_fn: callable

Function that returns predictions.

input_formatter: callable

This function will run on input data before passing to the prediction_fn. This usually should take your data type and convert them to numpy arrays or dataframes.

output_formatter: callable

This function will run on the outputs of prediction_fn before results are returned to interpretation objects. This usually should take the model's raw outputs and convert them to numpy arrays or dataframes.

target_names: array type

(optional) names of classes that describe model outputs.

feature_names: array type

(optional) Names of features the model consumes.

unique_values: array type

The set of possible output values. Only use on classifier models that return “best guess” predictions, not probability scores, e.g.

model.predict(fruit1) -> 'apple'
model.predict(fruit2) -> 'banana'

['apple', 'banana'] are the unique_values of the classifier.

examples: numpy.array or pandas.DataFrame

(optional) Examples used to make inferences about the function.

model_type: None, “classifier”, “regressor”

Indicates which type of model is being used. If left as None, the model type is inferred from the signature of the output type.

probability: None, True, False

If using a classifier, indicates whether probabilities are provided (as opposed to indicators/labels).

log_level: int

Logging level for model logs. 10 is a good value for seeing debug messages; 30 (the default) shows warnings only.
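The difference between "best guess" predictions and probability scores, which decides whether unique_values and probability apply, can be sketched with two toy prediction functions. Everything below is illustrative: the fruit rule, the function names, and the data are hypothetical, and plain lists stand in for numpy arrays so that the sketch runs without skater or numpy installed. The keyword names come from the signature above; the particular combinations are an assumption about typical use, not a statement of what skater requires.

```python
# Toy stand-ins for a fitted classifier; neither function is part of skater.
UNIQUE_VALUES = ["apple", "banana"]  # hypothetical class labels


def predict_proba(rows):
    """Return [P(apple), P(banana)] for each row of [length_cm]."""
    scores = []
    for (length_cm,) in rows:
        # Hypothetical rule: longer fruit is more likely to be a banana.
        p_banana = min(max((length_cm - 5.0) / 10.0, 0.0), 1.0)
        scores.append([1.0 - p_banana, p_banana])
    return scores


def predict_label(rows):
    """Return only the most likely label for each row (a "best guess")."""
    labels = []
    for probs in predict_proba(rows):
        best = probs.index(max(probs))
        labels.append(UNIQUE_VALUES[best])
    return labels


rows = [[7.0], [18.0]]
print(predict_proba(rows))  # one score per class, per row
print(predict_label(rows))  # one label per row

# Keyword arguments one might choose for each kind of prediction function.
label_kwargs = dict(unique_values=UNIQUE_VALUES,
                    model_type="classifier", probability=False)
proba_kwargs = dict(target_names=UNIQUE_VALUES,
                    model_type="classifier", probability=True)

# With skater installed, and rows converted to a numpy array, these could be
# passed as (not executed here so the sketch stays self-contained):
#   InMemoryModel(predict_label, examples=rows, **label_kwargs)
#   InMemoryModel(predict_proba, examples=rows, **proba_kwargs)
```

The label-returning function is the case where unique_values is meant to be supplied, since the labels alone do not reveal the full set of classes.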

DeployedModel

Models that are deployed, and therefore callable via HTTP POST requests, are exposed via the DeployedModel object.

DeployedModel.__init__(uri, input_formatter, output_formatter, request_kwargs={}, target_names=None, feature_names=None, unique_values=None, examples=None, model_type=None, probability=None, log_level=30)

This model can be called by making HTTP requests to the passed-in uri.

Parameters:
uri: string

The address to which requests are posted.

input_formatter: callable

This function will run on input data before it is passed to the requests library. This usually should take array types and convert them to JSON.

output_formatter: callable

This function will run on outputs before returning results to interpretation objects. This usually should take request objects and convert them to array types.

request_kwargs: dict

Any additional request headers that need to be passed, such as API keys, content types, etc.

target_names: array type

(optional) The names of the target variable/classes. There should be as many names as there are outputs per prediction. Defaults to Predicted Value for regression and Class 1…n for classification.

feature_names: array type

(optional) Names of features the model consumes.

unique_values: array type

The set of possible output values. Only use on classifier models that return “best guess” predictions, not probability scores, e.g.

model.predict(fruit1) -> 'apple'
model.predict(fruit2) -> 'banana'

['apple', 'banana'] are the unique_values of the classifier.

examples:

(optional) Examples used to make inferences about the function.

model_type: None, “classifier”, “regressor”

Indicates which type of model is being used. If left as None, the model type is inferred from the signature of the output type.

probability: None, True, False

If using a classifier, indicates whether probabilities are provided (as opposed to indicators/labels).

log_level: int

See skater.model.Model for details.
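As a rough illustration of the two formatters described above, the following sketch converts rows of features into a JSON-ready payload and converts a response back into a list of predictions. The payload layout ("instances", "predictions", "probabilities") is a made-up schema for a hypothetical service, and FakeResponse is a stand-in for a requests response object so that the sketch runs offline. Whether a real deployment expects a dict or an already-serialized string from input_formatter depends on how the request is sent, so treat that choice as an assumption.

```python
import json


def input_formatter(rows):
    """Turn rows of features into a JSON-serializable payload."""
    # "instances" is a hypothetical key; use whatever the service expects.
    return {"instances": [[float(value) for value in row] for row in rows]}


def output_formatter(response):
    """Extract per-row class scores from an object exposing .json()."""
    # "predictions" and "probabilities" are hypothetical keys as well.
    return [item["probabilities"] for item in response.json()["predictions"]]


class FakeResponse:
    """Offline stand-in for a requests response; only .json() is mimicked."""

    def __init__(self, text):
        self.text = text

    def json(self):
        return json.loads(self.text)


payload = input_formatter([(5.1, 3.5), (6.2, 2.9)])
body = json.dumps(payload)  # what would travel over the wire

fake = FakeResponse(
    '{"predictions": [{"probabilities": [0.9, 0.1]},'
    ' {"probabilities": [0.2, 0.8]}]}'
)
predictions = output_formatter(fake)

print(body)
print(predictions)

# With skater installed and a live endpoint, the formatters could be passed
# positionally as in the signature above (hypothetical uri, not executed):
#   DeployedModel("https://example.invalid/predict",
#                 input_formatter, output_formatter)
```

Keeping both formatters as small pure functions makes them easy to test against a recorded response before pointing the model at a live service.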