menelaus.ensemble

menelaus.ensemble.election

class menelaus.ensemble.election.ConfirmedElection(sensitivity: int, wait_time: int)[source]

Bases: Election

Election scheme for ensembles of detectors, typically in a streaming setting. When a single detector alarms, the Election waits for up to a set number of samples for one or more other detectors to alarm as well, which confirms the drift.

Derived from the ensemble scheme described in Maciel et al. [2015].

__init__(sensitivity: int, wait_time: int)[source]

Parameters

sensitivity (int) – number of alarms, counting both new alarms and alarms still in their waiting period, that must be reached for the ensemble to alarm
wait_time (int) – number of steps a detector may spend waiting after its drift alarm before its waiting time is reset
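The interplay of sensitivity and wait_time can be sketched without the library. The class below is a minimal illustration, not the menelaus implementation: ConfirmedElectionSketch, its waiting list, and the stand-in detectors (plain objects with a drift_state attribute) are all hypothetical, and the exact expiry rule is an assumption.

```python
from types import SimpleNamespace


class ConfirmedElectionSketch:
    """Illustrative only; not the menelaus implementation."""

    def __init__(self, sensitivity: int, wait_time: int):
        self.sensitivity = sensitivity
        self.wait_time = wait_time
        self.waiting = None  # steps each detector has spent waiting

    def __call__(self, detectors: list):
        if self.waiting is None:
            self.waiting = [0] * len(detectors)
        votes = 0
        for i, detector in enumerate(detectors):
            if detector.drift_state == "drift":
                votes += 1           # new alarm
                self.waiting[i] = 1  # start (or restart) its waiting period
            elif self.waiting[i] > 0:
                votes += 1           # earlier alarm, still waiting
                self.waiting[i] += 1
            if self.waiting[i] > self.wait_time:
                self.waiting[i] = 0  # waited too long: forget the alarm
        if votes >= self.sensitivity:
            self.waiting = [0] * len(detectors)
            return "drift"
        return None


# Hypothetical stand-ins for detectors: only drift_state is needed here.
a, b = SimpleNamespace(drift_state=None), SimpleNamespace(drift_state=None)
election = ConfirmedElectionSketch(sensitivity=2, wait_time=3)

a.drift_state = "drift"
first = election([a, b])   # one alarm only: no ensemble alarm yet
a.drift_state = None
second = election([a, b])  # a is still waiting, b is silent
b.drift_state = "drift"
third = election([a, b])   # b confirms while a is still waiting
```

In this sketch the ensemble alarms only on the third call, when the second detector alarms within the first detector's waiting period.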

class menelaus.ensemble.election.Election[source]

Bases: ABC

Abstract base class for implementations of election schemes used to evaluate drift state of ensembles, by operating on the drift states of constituent detectors.

Constructors for sub-classes may differ, but all Election classes are callable classes, where the call takes only the list of detectors to evaluate.

The surrounding Ensemble class will update its drift state within its update function, by calling the Election it is given at initialization-time, upon its detectors.
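The callable-class pattern described here might look roughly like the following. This is a sketch under assumptions: ElectionSketch, AnyAlarmElection, and EnsembleSketch are hypothetical names, and the detectors are plain objects carrying only a drift_state attribute.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace


class ElectionSketch(ABC):
    """Hypothetical base class: subclasses must be callable on a list of detectors."""

    @abstractmethod
    def __call__(self, detectors: list):
        """Return the ensemble's drift state given its detectors."""


class AnyAlarmElection(ElectionSketch):
    """Hypothetical scheme: alarm as soon as any detector alarms."""

    def __call__(self, detectors: list):
        if any(d.drift_state == "drift" for d in detectors):
            return "drift"
        return None


class EnsembleSketch:
    """Hypothetical ensemble that delegates its drift state to an election."""

    def __init__(self, detectors: dict, election):
        self.detectors = detectors
        self.election = election  # initialized election object
        self.drift_state = None

    def evaluate(self):
        # The election sees only the list of detectors.
        self.drift_state = self.election(list(self.detectors.values()))


ensemble = EnsembleSketch(
    {"a": SimpleNamespace(drift_state=None), "b": SimpleNamespace(drift_state="drift")},
    AnyAlarmElection(),
)
ensemble.evaluate()
```

Because the constructor holds any configuration, the ensemble can treat every election scheme the same way at call time.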

class menelaus.ensemble.election.MinimumApprovalElection(approvals_needed: int = 1)[source]

Bases: Election

Election that determines drift based on whether a minimum number of the provided detectors have alarmed. This threshold can range from 1 to the total number of detectors.

__init__(approvals_needed: int = 1)[source]

Parameters: approvals_needed (int) – minimum approvals to alarm
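As a rough illustration of the threshold rule, the function below counts alarms and compares the count with approvals_needed. It is a sketch, not the library code; minimum_approval and the stand-in detectors are hypothetical.

```python
from types import SimpleNamespace


def minimum_approval(detectors: list, approvals_needed: int = 1):
    """Hypothetical helper: alarm when enough detectors report drift."""
    alarms = sum(1 for d in detectors if d.drift_state == "drift")
    return "drift" if alarms >= approvals_needed else None


# Stand-in detectors: two of three have alarmed.
detectors = [
    SimpleNamespace(drift_state="drift"),
    SimpleNamespace(drift_state=None),
    SimpleNamespace(drift_state="drift"),
]
with_two = minimum_approval(detectors, approvals_needed=2)
with_three = minimum_approval(detectors, approvals_needed=3)
```

With two alarms out of three, a threshold of 2 yields an ensemble alarm and a threshold of 3 does not.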

class menelaus.ensemble.election.OrderedApprovalElection(approvals_needed: int = 1, confirmations_needed: int = 1)[source]

Bases: Election

Election that determines drift based on whether:

An initial a count of detectors alarmed for drift.

A subsequent c count of detectors confirmed drift.

Hypothetically, this differs from MinimumApprovalElection(a+c) only if the detectors were added to the collection in a meaningful order. The voting scheme therefore iterates over detectors in the order in which they were inserted into the user-defined list, using the first approvals_needed amount for initial detection and the next confirmations_needed amount for confirmation of drift.

__init__(approvals_needed: int = 1, confirmations_needed: int = 1)[source]

Parameters

approvals_needed (int) – Minimum number of detectors that must alarm for the ensemble to alarm.
confirmations_needed (int) – Minimum number of confirmations needed to alarm, after approvals_needed alarms have been observed.
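One way to read the ordered scheme is sketched below. It assumes that alarms are counted in list order, with the first approvals_needed alarms treated as approvals and later alarms as confirmations; that interpretation, the function name ordered_approval, and the stand-in detectors are assumptions rather than the library's code.

```python
from types import SimpleNamespace


def ordered_approval(detectors: list, approvals_needed: int = 1,
                     confirmations_needed: int = 1):
    """Hypothetical helper: approvals first, then confirmations, in list order."""
    approvals = 0
    confirmations = 0
    for detector in detectors:  # list order is the insertion order
        if detector.drift_state != "drift":
            continue
        if approvals < approvals_needed:
            approvals += 1      # still collecting initial approvals
        else:
            confirmations += 1  # approvals met: later alarms confirm
        if approvals >= approvals_needed and confirmations >= confirmations_needed:
            return "drift"
    return None


alarm = SimpleNamespace(drift_state="drift")
quiet = SimpleNamespace(drift_state=None)

confirmed = ordered_approval([alarm, quiet, alarm])    # one approval, one confirmation
unconfirmed = ordered_approval([alarm, quiet, quiet])  # approval without confirmation
```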

class menelaus.ensemble.election.SimpleMajorityElection[source]

Bases: Election

Election that determines drift for an ensemble, based on whether a simple majority of the ensemble’s detectors have voted for drift.

See Du et al. [2014].
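A simple-majority vote can be sketched as follows. The sketch assumes a strict majority (more than half of the detectors), so a tie does not alarm; that tie rule, the function name simple_majority, and the stand-in detectors are assumptions.

```python
from types import SimpleNamespace


def simple_majority(detectors: list):
    """Hypothetical helper: alarm when more than half of the detectors alarm."""
    alarms = sum(1 for d in detectors if d.drift_state == "drift")
    return "drift" if alarms > len(detectors) / 2 else None


alarm = SimpleNamespace(drift_state="drift")
quiet = SimpleNamespace(drift_state=None)

two_of_three = simple_majority([alarm, alarm, quiet])
one_of_two = simple_majority([alarm, quiet])  # a tie, not a majority
```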

menelaus.ensemble.ensemble

class menelaus.ensemble.ensemble.BatchEnsemble(detectors: dict, election, column_selectors: dict = {})[source]

Bases: BatchDetector, Ensemble

Implements Ensemble class for batch-based drift detectors. Inherits from Ensemble and BatchDetector (i.e., BatchEnsemble IS-A BatchDetector). As such, it has the functions of a regular detector: set_reference, update, and reset. These functions operate not only on the ensemble’s own attributes, but also on the set of detectors given to it.

__init__(detectors: dict, election, column_selectors: dict = {})[source]

Parameters

detectors (dict) – Dictionary of detectors in ensemble, where the key is some unique identifier for a detector, and the value is the initialized detector object. For instance, {'p': PCA_CD()}.
election (Election) – Initialized Election object for ensemble to evaluate drift among constituent detectors. See implemented election schemes in menelaus.ensemble.
column_selectors (dict, optional) – Dictionary of functions to use for each detector. Functions should take data instance X and return the columns of X that the corresponding detector should operate on. Should match the format of detectors, i.e. {'p': PCA_CD()} would need an entry {'p': function} to use this feature. By default, no column selection function is applied to any detector, and they will all use the entirety of the attributes in X.
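The shape of the selector dictionary can be illustrated with standard-library data. In this sketch X is a plain dict of columns standing in for a DataFrame, and select, data_for, and the keys "p" and "k" are hypothetical; with pandas, a selector could instead be something like lambda X: X[["a", "b"]].

```python
def select(columns):
    """Build a hypothetical selector that keeps only the named columns."""
    def selector(X):
        return {name: X[name] for name in columns}
    return selector


detector_keys = ["p", "k"]                    # same keys as the detectors dict
column_selectors = {"p": select(["a", "b"])}  # no entry for "k"

# Stand-in for a data batch: column name -> values.
X = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}


def data_for(key, X):
    """Return the view of X that the detector stored under key should see."""
    selector = column_selectors.get(key)
    return X if selector is None else selector(X)


views = {key: data_for(key, X) for key in detector_keys}
```

Here the detector keyed "p" sees only columns a and b, while "k", which has no selector, sees all of X.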

reset()[source]: Reset ensemble itself, and each constituent detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'. Calls Ensemble.reset and BatchDetector.reset to do so.

set_reference(X, y_true=None, y_pred=None)[source]

Initialize ensemble itself, and each constituent detector with a reference batch. Calls Ensemble.set_reference to do so.

Parameters

X (pandas.DataFrame or numpy.array) – baseline dataset
y_true (numpy.array) – actual labels of dataset
y_pred (numpy.array) – predicted labels of dataset

update(X, y_true=None, y_pred=None)[source]

Update ensemble itself, and each constituent detector with new data. Calls Ensemble.update and BatchDetector.update to do so.

Parameters

X (numpy.ndarray) – input data
y_true (numpy.ndarray) – if applicable, true labels of input data
y_pred (numpy.ndarray) – if applicable, predicted labels of input data

property batches_since_reset

Number of batches since last drift detection.

Returns: int

property drift_state: Set detector’s drift state to "drift", "warning", or None.

property total_batches

Total number of batches the drift detector has been updated with.

Returns: int

class menelaus.ensemble.ensemble.Ensemble(detectors: dict, election, column_selectors: dict = {})[source]

Bases: object

Parent class for Ensemble detectors. Does not inherit from any detector parent class, but has similar functions set_reference, update, reset. Can also evaluate the results from all detectors per some voting scheme.

Any class that implements ensemble functionality should inherit from this class.

__init__(detectors: dict, election, column_selectors: dict = {})[source]

reset()[source]: Initialize each detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'.

update(X, y_true=None, y_pred=None)[source]

Update each detector in ensemble with new batch of data. Calls self.evaluate() at the end, to determine voting result.

Parameters

X (numpy.ndarray) – input data
y_true (numpy.ndarray) – if applicable, true labels of input data
y_pred (numpy.ndarray) – if applicable, predicted labels of input data
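The update-then-evaluate order described for update can be sketched as below. ThresholdDetector, UpdatingEnsembleSketch, and all_alarm are hypothetical stand-ins, and X is a single number for brevity; this is an illustration of the control flow, not the library's code.

```python
class ThresholdDetector:
    """Hypothetical detector: alarms when the value exceeds a fixed limit."""

    def __init__(self, limit):
        self.limit = limit
        self.drift_state = None

    def update(self, X, y_true=None, y_pred=None):
        self.drift_state = "drift" if X > self.limit else None


def all_alarm(detectors: list):
    """Hypothetical election: alarm only if every detector alarms."""
    return "drift" if all(d.drift_state == "drift" for d in detectors) else None


class UpdatingEnsembleSketch:
    def __init__(self, detectors: dict, election):
        self.detectors = detectors
        self.election = election
        self.drift_state = None

    def update(self, X, y_true=None, y_pred=None):
        # First pass the new data to every constituent detector...
        for detector in self.detectors.values():
            detector.update(X, y_true, y_pred)
        # ...then let the election decide the ensemble's state.
        self.evaluate()

    def evaluate(self):
        self.drift_state = self.election(list(self.detectors.values()))


ensemble = UpdatingEnsembleSketch(
    {"low": ThresholdDetector(1), "high": ThresholdDetector(5)}, all_alarm
)
ensemble.update(3)
after_small = ensemble.drift_state  # only "low" alarms
ensemble.update(7)
after_large = ensemble.drift_state  # both alarm
```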

class menelaus.ensemble.ensemble.StreamingEnsemble(detectors: dict, election, column_selectors: dict = {})[source]

Bases: StreamingDetector, Ensemble

Implements Ensemble class for streaming drift detectors. Inherits from Ensemble and StreamingDetector (i.e., StreamingEnsemble IS-A StreamingDetector). As such it has the functions of a regular detector: update, reset, etc. Internally, these operate not only on the ensemble’s own attributes, but on the set of detectors given to it.

__init__(detectors: dict, election, column_selectors: dict = {})[source]

Parameters

detectors (dict) – Dictionary of detectors in ensemble, where the key is some unique identifier for a detector, and the value is the initialized detector object. For instance, {'a': ADWIN()}.
election (Election) – Initialized Election object for ensemble to evaluate drift among constituent detectors. See implemented election schemes in menelaus.ensemble.
column_selectors (dict, optional) – Functions to use for each detector. Functions should take data instance X and return the columns of X that the corresponding detector should operate on. Should match the format of detectors, i.e. {'a': ADWIN()} would need an entry {'a': function} to use this feature. By default, no column selection function is applied to any detector, and they will all use the entirety of the attributes in X.

reset()[source]: Reset ensemble itself, and each constituent detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'. Calls Ensemble.reset and StreamingDetector.reset to do so.

update(X, y_true, y_pred)[source]

Update ensemble itself, and each constituent detector with new data. Calls Ensemble.update and StreamingDetector.update to do so.

Parameters

X (numpy.ndarray) – input data
y_true (numpy.ndarray) – if applicable, true labels of input data
y_pred (numpy.ndarray) – if applicable, predicted labels of input data

property drift_state: Set detector’s drift state to "drift", "warning", or None.

property samples_since_reset

Number of samples since last drift detection.

Returns: int

property total_samples

Total number of samples the drift detector has been updated with.

Returns: int
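The difference between the two counters can be sketched as follows, assuming that each update increments both counters and that reset zeroes only samples_since_reset. CounterSketch is a hypothetical stand-in, not the library class.

```python
class CounterSketch:
    """Hypothetical stand-in that tracks only the two sample counters."""

    def __init__(self):
        self.total_samples = 0
        self.samples_since_reset = 0
        self.drift_state = None

    def update(self):
        self.total_samples += 1
        self.samples_since_reset += 1

    def reset(self):
        # Intended for use after drift_state == "drift".
        self.samples_since_reset = 0
        self.drift_state = None


counter = CounterSketch()
for _ in range(5):
    counter.update()
counter.drift_state = "drift"  # pretend drift was detected here
counter.reset()
for _ in range(2):
    counter.update()
```

After this run the sketch has seen seven samples in total, two of them since the reset.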