menelaus.ensemble
menelaus.ensemble.election
- class menelaus.ensemble.election.ConfirmedElection(sensitivity: int, wait_time: int)[source]
Bases: Election
Election for handling detectors, typically in a streaming setting. In this scheme, when a single detector alarms, the Election will wait for a certain number of samples, until one or more other detectors also alarm, confirming the drift.
Derived from the ensemble scheme described in Maciel et al. [2015].
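The waiting behavior described above can be sketched roughly as follows. This is not the library's implementation: StubDetector and ConfirmedElectionSketch are hypothetical names, and the sketch assumes that sensitivity is the number of detectors that must have alarmed within the window and that wait_time counts the subsequent calls during which an alarm stays pending. This reference does not state either meaning.

```python
class StubDetector:
    """Hypothetical stand-in for a detector; only exposes drift_state."""

    def __init__(self):
        self.drift_state = None


class ConfirmedElectionSketch:
    """Illustrative only: an alarm stays pending for wait_time further calls."""

    def __init__(self, sensitivity, wait_time):
        self.sensitivity = sensitivity  # assumed: alarms required to confirm drift
        self.wait_time = wait_time      # assumed: calls an alarm remains pending
        self.remaining = None           # per-detector countdown of pending alarms

    def __call__(self, detectors):
        if self.remaining is None:
            self.remaining = [0] * len(detectors)
        for i, det in enumerate(detectors):
            if det.drift_state == "drift":
                # A fresh alarm opens (or reopens) this detector's window.
                self.remaining[i] = self.wait_time + 1
        pending = sum(1 for r in self.remaining if r > 0)
        if pending >= self.sensitivity:
            self.remaining = [0] * len(detectors)
            return "drift"
        # No confirmation yet: age every open window by one call.
        self.remaining = [max(r - 1, 0) for r in self.remaining]
        return None


a, b = StubDetector(), StubDetector()
election = ConfirmedElectionSketch(sensitivity=2, wait_time=2)

a.drift_state = "drift"
first = election([a, b])   # only one alarm so far, so the election waits
a.drift_state = None
second = election([a, b])  # still inside the waiting window
b.drift_state = "drift"
third = election([a, b])   # a second detector confirms within the window
```

In this sketch a confirmation that arrives after the window has closed is treated as a new, unconfirmed alarm.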
- class menelaus.ensemble.election.Election[source]
Bases: ABC
Abstract base class for implementations of election schemes used to evaluate the drift state of ensembles, by operating on the drift states of constituent detectors.
Constructors for subclasses may differ, but all Election classes are callable classes, where the call takes only the list of detectors to evaluate.
The surrounding Ensemble class will update its drift state within its update function, by calling the Election it is given at initialization time upon its detectors.
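To illustrate the callable contract described above, here is a minimal sketch built on Python's abc module. SketchElection, AnyAlarmElection, and StubDetector are hypothetical names used for illustration only; the real base class may define more than this.

```python
from abc import ABC, abstractmethod


class StubDetector:
    """Hypothetical stand-in for a detector; only exposes drift_state."""

    def __init__(self, drift_state=None):
        self.drift_state = drift_state


class SketchElection(ABC):
    """Callable base class: subclasses decide drift from a list of detectors."""

    @abstractmethod
    def __call__(self, detectors):
        """Return "drift", "warning", or None for the ensemble."""


class AnyAlarmElection(SketchElection):
    """Toy scheme: the ensemble alarms as soon as any detector alarms."""

    def __call__(self, detectors):
        states = [d.drift_state for d in detectors]
        return "drift" if "drift" in states else None


election = AnyAlarmElection()
result_quiet = election([StubDetector(), StubDetector()])
result_alarmed = election([StubDetector(), StubDetector("drift")])
```

Because the call takes only the list of detectors, any configuration (thresholds, wait times) has to be supplied through the subclass constructor.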
- class menelaus.ensemble.election.MinimumApprovalElection(approvals_needed: int = 1)[source]
Bases: Election
Election that determines drift based on whether a minimum number of provided detectors have alarmed. This threshold can range from 1 to the total number of detectors.
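The threshold vote can be sketched as a plain function. This is a simplified illustration that only considers the "drift" state; minimum_approval and StubDetector are hypothetical names, not part of the library.

```python
class StubDetector:
    """Hypothetical stand-in for a detector; only exposes drift_state."""

    def __init__(self, drift_state=None):
        self.drift_state = drift_state


def minimum_approval(detectors, approvals_needed=1):
    """Return "drift" if at least approvals_needed detectors have alarmed."""
    num_alarms = sum(1 for d in detectors if d.drift_state == "drift")
    return "drift" if num_alarms >= approvals_needed else None


detectors = [StubDetector("drift"), StubDetector(), StubDetector("drift")]
two_needed = minimum_approval(detectors, approvals_needed=2)
three_needed = minimum_approval(detectors, approvals_needed=3)
```

With approvals_needed equal to the number of detectors, the vote becomes unanimous; with 1, any single alarm is enough.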
- class menelaus.ensemble.election.OrderedApprovalElection(approvals_needed: int = 1, confirmations_needed: int = 1)[source]
Bases: Election
Election that determines drift based on whether an initial count (a) of detectors alarmed for drift, and a subsequent count (c) of detectors confirmed drift.
Hypothetically, the distinction between this and MinimumApprovalElection(a+c) arises only if the detectors were added to a collection in a meaningful order. As such, this voting scheme iterates over detectors in the preserved order of insertion into the user-defined list, and uses the first approvals_needed detectors for initial detection and the next confirmations_needed detectors for confirmation of drift.
- __init__(approvals_needed: int = 1, confirmations_needed: int = 1)[source]
- Parameters
approvals_needed (int) – Minimum number of detectors that must alarm for the ensemble to alarm.
confirmations_needed (int) – Minimum number of confirmations needed to alarm, after approvals_needed alarms have been observed.
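One possible reading of the ordered scheme is sketched below: detectors are walked in list order, the first alarms encountered count as approvals, and any later alarms count as confirmations. This interpretation is an assumption, and ordered_approval and StubDetector are hypothetical names rather than library code.

```python
class StubDetector:
    """Hypothetical stand-in for a detector; only exposes drift_state."""

    def __init__(self, drift_state=None):
        self.drift_state = drift_state


def ordered_approval(detectors, approvals_needed=1, confirmations_needed=1):
    """Walk detectors in order; early alarms approve, later alarms confirm."""
    approvals = 0
    confirmations = 0
    for det in detectors:  # list order is the insertion order
        if det.drift_state != "drift":
            continue
        if approvals < approvals_needed:
            approvals += 1
        else:
            confirmations += 1
        if approvals >= approvals_needed and confirmations >= confirmations_needed:
            return "drift"
    return None


confirmed = ordered_approval(
    [StubDetector("drift"), StubDetector(), StubDetector("drift")]
)
unconfirmed = ordered_approval(
    [StubDetector("drift"), StubDetector(), StubDetector()]
)
```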
menelaus.ensemble.ensemble
- class menelaus.ensemble.ensemble.BatchEnsemble(detectors: dict, election, column_selectors: dict = {})[source]
Bases: BatchDetector, Ensemble
Implements the Ensemble class for batch-based drift detectors. Inherits from Ensemble and BatchDetector (i.e., BatchEnsemble IS-A BatchDetector). As such, it has the functions of a regular detector: set_reference, update, and reset. These functions operate not only on the ensemble’s own attributes, but also on the set of detectors given to it.
- __init__(detectors: dict, election, column_selectors: dict = {})[source]
- Parameters
detectors (dict) – Dictionary of detectors in the ensemble, where the key is a unique identifier for a detector and the value is the initialized detector object. For instance, {'p': PCA_CD()}.
election (Election) – Initialized Election object for the ensemble to evaluate drift among constituent detectors. See the implemented election schemes in menelaus.ensemble.
column_selectors (dict, optional) – Table of functions to use for each detector. Each function should take a data instance X and return the columns of X that the corresponding detector should operate on. Keys should match those of detectors, i.e., {'p': PCA_CD()} would need an entry {'p': function} to use this feature. By default, no column selection function is applied to any detector, and each detector uses the entirety of the attributes in X.
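The key matching between detectors and column selectors can be illustrated with a small sketch. It uses plain lists of rows instead of a DataFrame or array so that it runs without third-party packages; select_first_two, columns_for, and the placeholder detector values are hypothetical.

```python
def select_first_two(X):
    """Keep columns 0 and 1 of each row."""
    return [row[:2] for row in X]


# Placeholder strings stand in for initialized detector objects.
detectors = {"p": "detector-p", "k": "detector-k"}

# Only "p" has a selector; "k" falls back to the full data.
column_selectors = {"p": select_first_two}


def columns_for(det_id, X, selectors):
    """Apply the selector registered under det_id, if there is one."""
    selector = selectors.get(det_id)
    return selector(X) if selector is not None else X


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
views = {det_id: columns_for(det_id, X, column_selectors) for det_id in detectors}
```

Here the detector registered under "p" would see two columns, while "k" would see all three.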
- reset()[source]
Reset the ensemble itself, and each constituent detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'. Calls Ensemble.reset and BatchDetector.reset to do so.
- set_reference(X, y_true=None, y_pred=None)[source]
Initialize the ensemble itself, and each constituent detector, with a reference batch. Calls Ensemble.set_reference to do so.
- Parameters
X (pandas.DataFrame or numpy.array) – baseline dataset
y_true (numpy.array) – actual labels of dataset
y_pred (numpy.array) – predicted labels of dataset
- update(X, y_true=None, y_pred=None)[source]
Update the ensemble itself, and each constituent detector, with new data. Calls Ensemble.update and BatchDetector.update to do so.
- Parameters
X (numpy.ndarray) – input data
y_true (numpy.ndarray) – if applicable, true labels of input data
y_pred (numpy.ndarray) – if applicable, predicted labels of input data
- property batches_since_reset
Number of batches since last drift detection.
- Returns
int
- property drift_state
Set detector’s drift state to "drift", "warning", or None.
- property total_batches
Total number of batches the drift detector has been updated with.
- Returns
int
- class menelaus.ensemble.ensemble.Ensemble(detectors: dict, election, column_selectors: dict = {})[source]
Bases: object
Parent class for Ensemble detectors. Does not inherit from any detector parent class, but has similar functions: set_reference, update, and reset. Can also evaluate the results from all detectors per some voting scheme.
Any class that implements ensemble functionality should inherit from this class.
- reset()[source]
Initialize each detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'.
- update(X, y_true=None, y_pred=None)[source]
Update each detector in ensemble with new batch of data. Calls self.evaluate() at the end, to determine voting result.
- Parameters
X (numpy.ndarray) – input data
y_true (numpy.ndarray) – if applicable, true labels of input data
y_pred (numpy.ndarray) – if applicable, predicted labels of input data
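The update flow described above (update every detector, then evaluate the vote) might look like the following sketch. EnsembleSketch, CountingDetector, and any_alarm are hypothetical stand-ins, and the way column selectors are applied inside update is an assumption rather than the library's code.

```python
class CountingDetector:
    """Hypothetical detector that alarms once it has seen `limit` updates."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = 0
        self.drift_state = None

    def update(self, X, y_true=None, y_pred=None):
        self.seen += 1
        self.drift_state = "drift" if self.seen >= self.limit else None


class EnsembleSketch:
    """Illustrative parent class: updates detectors, then runs the election."""

    def __init__(self, detectors, election, column_selectors=None):
        self.detectors = detectors
        self.election = election
        self.column_selectors = column_selectors or {}
        self.drift_state = None

    def update(self, X, y_true=None, y_pred=None):
        for det_id, det in self.detectors.items():
            selector = self.column_selectors.get(det_id)
            X_view = selector(X) if selector is not None else X
            det.update(X_view, y_true, y_pred)
        self.evaluate()  # the vote is taken after every detector has updated

    def evaluate(self):
        self.drift_state = self.election(list(self.detectors.values()))


def any_alarm(detectors):
    """Toy election: drift if any detector is in the drift state."""
    return "drift" if any(d.drift_state == "drift" for d in detectors) else None


ensemble = EnsembleSketch(
    {"fast": CountingDetector(2), "slow": CountingDetector(5)}, any_alarm
)
states = []
for _ in range(3):
    ensemble.update([[0.0, 1.0]])
    states.append(ensemble.drift_state)
```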
- class menelaus.ensemble.ensemble.StreamingEnsemble(detectors: dict, election, column_selectors: dict = {})[source]
Bases: StreamingDetector, Ensemble
Implements the Ensemble class for streaming drift detectors. Inherits from Ensemble and StreamingDetector (i.e., StreamingEnsemble IS-A StreamingDetector). As such, it has the functions of a regular detector: update, reset, etc. Internally, these operate not only on the ensemble’s own attributes, but also on the set of detectors given to it.
- __init__(detectors: dict, election, column_selectors: dict = {})[source]
- Parameters
detectors (dict) – Dictionary of detectors in the ensemble, where the key is a unique identifier for a detector and the value is the initialized detector object. For instance, {'a': ADWIN()}.
election (Election) – Initialized Election object for the ensemble to evaluate drift among constituent detectors. See the implemented election schemes in menelaus.ensemble.
column_selectors (dict, optional) – Functions to use for each detector. Each function should take a data instance X and return the columns of X that the corresponding detector should operate on. Keys should match those of detectors, i.e., {'a': ADWIN()} would need an entry {'a': function} to use this feature. By default, no column selection function is applied to any detector, and each detector uses the entirety of the attributes in X.
- reset()[source]
Reset the ensemble itself, and each constituent detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'. Calls Ensemble.reset and StreamingDetector.reset to do so.
- update(X, y_true, y_pred)[source]
Update the ensemble itself, and each constituent detector, with new data. Calls Ensemble.update and StreamingDetector.update to do so.
- Parameters
X (numpy.ndarray) – input data
y_true (numpy.ndarray) – if applicable, true labels of input data
y_pred (numpy.ndarray) – if applicable, predicted labels of input data
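A typical streaming loop feeds one sample at a time, checks the drift state, and resets after drift. The sketch below mimics that pattern with hypothetical stand-ins (ThresholdDetector, StreamingEnsembleSketch, both_alarm) and scalar samples in place of arrays; the counter handling is an assumption about how samples_since_reset and total_samples behave.

```python
class ThresholdDetector:
    """Hypothetical streaming detector: alarms when a sample exceeds a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.drift_state = None

    def update(self, X, y_true=None, y_pred=None):
        self.drift_state = "drift" if X > self.threshold else None

    def reset(self):
        self.drift_state = None


class StreamingEnsembleSketch:
    """Illustrative ensemble that tracks sample counters alongside the vote."""

    def __init__(self, detectors, election):
        self.detectors = detectors
        self.election = election
        self.drift_state = None
        self.total_samples = 0
        self.samples_since_reset = 0

    def update(self, X, y_true=None, y_pred=None):
        self.total_samples += 1
        self.samples_since_reset += 1
        for det in self.detectors.values():
            det.update(X, y_true, y_pred)
        self.drift_state = self.election(list(self.detectors.values()))

    def reset(self):
        self.drift_state = None
        self.samples_since_reset = 0
        for det in self.detectors.values():
            det.reset()


def both_alarm(detectors):
    """Toy election: drift only when every detector is in the drift state."""
    return "drift" if all(d.drift_state == "drift" for d in detectors) else None


ensemble = StreamingEnsembleSketch(
    {"low": ThresholdDetector(1.0), "high": ThresholdDetector(2.0)}, both_alarm
)
drift_at = []
for i, x in enumerate([0.5, 1.5, 2.5, 0.2]):
    ensemble.update(x)
    if ensemble.drift_state == "drift":
        drift_at.append(i)  # record where drift was declared
        ensemble.reset()    # then clear the ensemble and its detectors
```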
- property drift_state
Set detector’s drift state to "drift", "warning", or None.
- property samples_since_reset
Number of samples since last drift detection.
- Returns
int
- property total_samples
Total number of samples the drift detector has been updated with.
- Returns
int