menelaus.concept_drift

Concept drift algorithms detect drift when true outcomes are available, in a supervised learning context. Concept drift is defined as a shift in the joint probability distribution of samples’ feature values and their labels. It can occur when the distribution of the data changes (as in the unlabeled case), when the outcomes shift, or when the data and the outcomes shift simultaneously.

Concept drift algorithms typically monitor classifier performance metrics over time and signal drift when performance decreases. These algorithms vary in whether they monitor a single performance metric, such as accuracy, or several metrics simultaneously, such as the true positive, false positive, true negative, and false negative rates.
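
The streaming detectors below share a common interface: they are updated one sample at a time with a true and a predicted label, and expose a drift_state of "drift", "warning", or None. A minimal sketch of this pattern follows; the stream of labels and predictions is simulated purely for illustration:

    import numpy as np

    from menelaus.concept_drift.adwin_accuracy import ADWINAccuracy
    from menelaus.concept_drift.ddm import DDM
    from menelaus.concept_drift.stepd import STEPD

    rng = np.random.default_rng(0)
    detectors = {"ADWINAccuracy": ADWINAccuracy(), "DDM": DDM(), "STEPD": STEPD()}

    # Simulated stream: ~95% accuracy for 500 samples, then ~60% accuracy.
    y_true = np.ones(1000, dtype=int)
    y_pred = np.concatenate([rng.binomial(1, 0.95, 500),
                             rng.binomial(1, 0.60, 500)])

    for i, (yt, yp) in enumerate(zip(y_true, y_pred)):
        for name, det in detectors.items():
            det.update(y_true=yt, y_pred=yp)
            if det.drift_state == "drift":
                print(f"{name} signaled drift at sample {i}")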

menelaus.concept_drift.adwin_accuracy

class menelaus.concept_drift.adwin_accuracy.ADWINAccuracy(delta=0.002, max_buckets=5, new_sample_thresh=32, window_size_thresh=10, subwindow_size_thresh=5, conservative_bound=False)[source]

Bases: ADWIN

ADWIN (ADaptive WINdowing) is a change detection algorithm which uses a sliding window to estimate the running mean and variance of a real-valued stream. It can be applied as a concept drift detector by monitoring a performance metric for a given classifier. ADWINAccuracy specifically expects y_true and y_pred, and uses them to monitor the running accuracy of a classifier. To use ADWIN to monitor other values, see change_detection.ADWIN.

As each sample is added, ADWIN stores a running estimate (mean and variance) for a given statistic, calculated over a sliding window which will grow to the right until drift is detected. The condition for drift is defined over pairs of subwindows at certain cutpoints within the current window. If, for any such pair, the difference between the running estimates of the statistic is over a certain threshold (controlled by delta), we identify drift, and remove the oldest elements of the window until all differences are again below the threshold.

The running estimates in each subwindow are maintained by storing summaries of the elements in “buckets,” which, in this implementation, are themselves stored in the bucket_row_list attribute, whose total size scales with the max_buckets parameter.

When drift occurs, the index of the element at the beginning of ADWIN’s new window is stored in self.retraining_recs.

Ref. Bifet and Gavalda [2007]
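
A minimal usage sketch; the stream of labels and predictions below is simulated purely for illustration:

    import numpy as np

    from menelaus.concept_drift.adwin_accuracy import ADWINAccuracy

    rng = np.random.default_rng(42)
    detector = ADWINAccuracy(delta=0.002)

    # Simulated classifier output whose accuracy drops from ~97% to ~70%.
    y_true = np.ones(2000, dtype=int)
    y_pred = np.concatenate([rng.binomial(1, 0.97, 1000),
                             rng.binomial(1, 0.70, 1000)])

    for i, (yt, yp) in enumerate(zip(y_true, y_pred)):
        detector.update(y_true=yt, y_pred=yp)
        if detector.drift_state == "drift":
            # retraining_recs holds the indices bounding ADWIN's new window.
            print(f"Drift at sample {i}; retraining recs: {detector.retraining_recs}")
            break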

__init__(delta=0.002, max_buckets=5, new_sample_thresh=32, window_size_thresh=10, subwindow_size_thresh=5, conservative_bound=False)[source]
Parameters
  • delta (float, optional) – confidence value on the range 0 to 1. ADWIN will incorrectly detect drift with at most probability delta, and correctly detect drift with at least probability 1 - delta. Defaults to 0.002.

  • max_buckets (int, optional) – the maximum number of buckets to maintain in each BucketRow. Corresponds to the “M” parameter in Bifet 2006. Defaults to 5.

  • new_sample_thresh (int, optional) – the drift detection procedure will run every new_sample_thresh samples, not in between. Defaults to 32.

  • window_size_thresh (int, optional) – the minimum number of samples in the window required to check for drift. Defaults to 10.

  • subwindow_size_thresh (int, optional) – the minimum number of samples in each subwindow required to check it for drift. Defaults to 5.

  • conservative_bound (bool, optional) – whether to assume a ‘large enough’ sample when constructing drift cutoff. Defaults to False.

Raises

ValueError – If ADWIN.delta is not on the range 0 to 1.

mean()
Returns

the estimated average of the passed stream, using the current window

Return type

float

reset()

Initialize the detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'.

update(y_true, y_pred, X=None)[source]

Update the detector with a new sample.

Parameters
  • y_true – one true label from input data.

  • y_pred – one predicted label from input data.

  • X – next sample in the stream of data. Not used for this accuracy-based ADWIN. See change_detection.ADWIN for that application.

variance()
Returns

the estimated variance of the passed stream, using the current window

Return type

float

property drift_state

Set detector’s drift state to "drift", "warning", or None.

property retraining_recs

Recommended indices for retraining. If drift is detected, set to [beginning of ADWIN's new window, end of ADWIN's new window]. If these are e.g. the 5th and 13th sample that ADWIN has been updated with, the values will be [4, 12].

Returns

the current retraining recommendations

Return type

list

property samples_since_reset

Number of samples since last drift detection.

Returns

int

property total_samples

Total number of samples the drift detector has been updated with.

Returns

int

menelaus.concept_drift.ddm

class menelaus.concept_drift.ddm.DDM(n_threshold=30, warning_scale=2, drift_scale=3)[source]

Bases: StreamingDetector

DDM is a drift detection algorithm which uses a binary classifier’s error rate, which is binomially distributed. The minimum probability of an error and its standard deviation (p_min, s_min) are found during training. If the running estimates for element i in the stream, the error probability (p_i) and its standard deviation (s_i), exceed a certain threshold, then we assume that the distribution of the error rate is no longer stationary (drift has occurred).

If p_i + s_i >= p_min + self.warning_scale * s_min, the detector’s state is set to "warning".

If p_i + s_i >= p_min + self.drift_scale * s_min, the detector’s state is set to "drift".

The index of the first sample which triggered a warning/drift state (relative to self.samples_since_reset) is stored in self.retraining_recs.

Ref. Gama et al. [2004]
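
A minimal usage sketch; the stream below simulates an error rate that jumps from roughly 5% to 50%, purely for illustration:

    import numpy as np

    from menelaus.concept_drift.ddm import DDM

    rng = np.random.default_rng(1)
    detector = DDM(n_threshold=30, warning_scale=2, drift_scale=3)

    y_true = np.zeros(1000, dtype=int)
    y_pred = np.concatenate([rng.binomial(1, 0.05, 500),   # ~5% error rate
                             rng.binomial(1, 0.50, 500)])  # ~50% error rate

    for i, (yt, yp) in enumerate(zip(y_true, y_pred)):
        detector.update(y_true=yt, y_pred=yp)
        if detector.drift_state == "drift":
            # retraining_recs holds [first warning index, drift index].
            print(f"Drift at sample {i}; retraining recs: {detector.retraining_recs}")
            break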

__init__(n_threshold=30, warning_scale=2, drift_scale=3)[source]
Parameters
  • n_threshold (int, optional) – the minimum number of samples required to test whether drift has occurred. Defaults to 30.

  • warning_scale (int, optional) – defines the threshold over which to enter the warning state. Defaults to 2.

  • drift_scale (int, optional) – defines the threshold over which to enter the drift state. Defaults to 3.

reset()[source]

Initialize the detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'.

update(y_true, y_pred, X=None)[source]

Update the detector with a new sample.

Parameters
  • y_true – one true label from input data.

  • y_pred – one predicted label from input data.

  • X – one row of features from input data. Not used in DDM.

property drift_state

Set detector’s drift state to "drift", "warning", or None.

input_type = 'stream'
property retraining_recs

Indices of the first and last recommended training samples. A list of length 2, containing [warning index, drift index]. If no warning occurs, this will instead be [drift index, drift index]. The latter warrants caution, as it indicates an abrupt change. Resets to [None, None] after drift is detected.

Returns

the current retraining recommendations

Return type

list

property samples_since_reset

Number of samples since last drift detection.

Returns

int

property total_samples

Total number of samples the drift detector has been updated with.

Returns

int

menelaus.concept_drift.eddm

class menelaus.concept_drift.eddm.EDDM(n_threshold=30, warning_thresh=0.95, drift_thresh=0.9)[source]

Bases: StreamingDetector

EDDM is a drift detection algorithm for binary classifiers which uses the distance between two classification errors. The running average distance between two errors (dist_i) and its standard deviation (s_i) are tracked for each element i in the data stream. The maximum values of these two estimates are stored and used to define the warning and drift thresholds. If the current estimates, relative to their maxima, fall below a certain threshold, we assume that the distance between errors is no longer stationary (drift has occurred).

If (dist_i + 2 * s_i)/(dist_max + 2 * s_max) < warning_thresh, the detector’s state is set to "warning".

If (dist_i + 2 * s_i)/(dist_max + 2 * s_max) < drift_thresh, the detector’s state is set to "drift".

The denominator approximates the 95th percentile of the distribution of the distance for “large” samples.

The index of the first sample which triggered a warning/drift state (relative to self.samples_since_reset) is stored in self.retraining_recs.

Ref. Baena-García et al. [2006]
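
A minimal usage sketch; the stream below simulates errors that become gradually more frequent, shrinking the distance between consecutive errors (illustrative only):

    import numpy as np

    from menelaus.concept_drift.eddm import EDDM

    rng = np.random.default_rng(7)
    detector = EDDM(n_threshold=30, warning_thresh=0.95, drift_thresh=0.9)

    # Error probability drifts gradually from ~2% to ~30%.
    n = 3000
    y_true = np.zeros(n, dtype=int)
    y_pred = rng.binomial(1, np.linspace(0.02, 0.30, n))

    for i, (yt, yp) in enumerate(zip(y_true, y_pred)):
        detector.update(y_true=yt, y_pred=yp)
        if detector.drift_state == "drift":
            print(f"Drift at sample {i}; retraining recs: {detector.retraining_recs}")
            break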

__init__(n_threshold=30, warning_thresh=0.95, drift_thresh=0.9)[source]
Parameters
  • n_threshold (int, optional) – the minimum number of samples required to test whether drift has occurred. Defaults to 30.

  • warning_thresh (float, optional) – defines the threshold over which to enter the warning state. Defaults to 0.95.

  • drift_thresh (float, optional) – defines the threshold over which to enter the drift state. Defaults to 0.9.

reset()[source]

Initialize the detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'.

update(y_true, y_pred, X=None)[source]

Update the detector with a new sample.

Parameters
  • y_true – one true label from input data.

  • y_pred – one predicted label from input data.

  • X – one row of features from input data. Not used in EDDM.

property drift_state

Set detector’s drift state to "drift", "warning", or None.

input_type = 'stream'
property retraining_recs

Recommended indices for retraining. Usually [first warning index, drift index]. If no warning state occurs, this will instead be [drift index, drift index] – this indicates an abrupt change. Resets when self.drift_state returns to None (neither drift nor warning).

Returns

the current retraining recommendations

Return type

list

property samples_since_reset

Number of samples since last drift detection.

Returns

int

property total_samples

Total number of samples the drift detector has been updated with.

Returns

int

menelaus.concept_drift.lfr

class menelaus.concept_drift.lfr.LinearFourRates(time_decay_factor=0.9, warning_level=0.05, detect_level=0.05, burn_in=50, num_mc=10000, subsample=1, rates_tracked=['tpr', 'tnr', 'ppv', 'npv'], parallelize=False, round_val=4)[source]

Bases: StreamingDetector

Linear Four Rates detects drift in a learner’s true positive rate (TPR), true negative rate (TNR), negative predictive value (NPV), and positive predictive value (PPV) over time. It relies on the assumption that a significant change in any of these rates implies a change in the joint distribution of the features and their classification.

The empirical value of each rate is recalculated with each incoming sample. The test statistic for each rate is a weighted average of all observed empirical rates, which is used to test the hypothesis that the distribution of the given rate at time t-1 is equal to the distribution of the rate at time t. More accurate estimates for the bounds of these empirical rates are obtained by Monte Carlo simulation.

This implementation incorporates a semi-offline bounds dictionary to reduce runtime. Instead of running the Monte Carlo simulations for each combination of number of time steps and estimated empirical rate, if a given combination has been simulated before, the bounds are re-used.

Ref. Wang and Abraham [2015]
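
A minimal usage sketch; the parameter values and the simulated drop in the true positive rate are illustrative only (num_mc and subsample are reduced here to keep the sketch fast):

    import numpy as np

    from menelaus.concept_drift.lfr import LinearFourRates

    rng = np.random.default_rng(3)
    detector = LinearFourRates(burn_in=50, num_mc=1000, subsample=10,
                               rates_tracked=["tpr"])

    n = 1000
    y_true = rng.binomial(1, 0.5, n)
    y_pred = y_true.copy()
    # Before sample 500, ~5% of positives are missed; afterwards, ~40%.
    miss = rng.random(n) < np.where(np.arange(n) < 500, 0.05, 0.40)
    y_pred[(y_true == 1) & miss] = 0

    for i in range(n):
        detector.update(y_true=y_true[i], y_pred=y_pred[i])
        if detector.drift_state == "drift":
            print(f"Drift at sample {i}")
            break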

__init__(time_decay_factor=0.9, warning_level=0.05, detect_level=0.05, burn_in=50, num_mc=10000, subsample=1, rates_tracked=['tpr', 'tnr', 'ppv', 'npv'], parallelize=False, round_val=4)[source]
Parameters
  • time_decay_factor (float, optional) – amount of weight given to the current timepoint; must be in [0,1]. The smaller the value, the more conservative the detector is in identifying drift, and the less weight is given to abrupt changes. Defaults to 0.9.

  • warning_level (float, optional) – Statistical significance level for warnings. Defaults to 0.05.

  • detect_level (float, optional) – Statistical significance level for detection. Defaults to 0.05.

  • burn_in (int, optional) – Number of observations to make up a burn-in period; simulations will not happen until this index has passed, initially and after reaching drift state. Defaults to 50.

  • num_mc (int, optional) – Number of Monte Carlo iterations to run. Defaults to 10000.

  • subsample (int, optional) – a value of n means drift is only tested for every nth observation. The rates are still calculated at every observation, but the Monte Carlo simulation is not run. A larger subsample value will decrease the runtime. Defaults to 1.

  • rates_tracked (list, optional) – a list of the rates that this LFR algorithm should track and alert on when they change. Fewer rates may be tracked depending on the use case, or to improve runtime performance. Defaults to all four rates, [“tpr”, “tnr”, “ppv”, “npv”].

  • parallelize (boolean, optional) – whether bound calculations for the rates tracked by this LFR algorithm are parallelized. Advantageous for large datasets, but slows runtime for smaller ones due to the overhead of threading. Defaults to False.

  • round_val – the number of decimal places the estimated rate is rounded to when stored in the bounds dictionary. The greater the round_val, the more precise the bounds dictionary will be, and the longer the runtime. (Default value = 4)

reset()[source]

Initialize the detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'.

update(y_true, y_pred, X=None)[source]

Update detector with a new observation:

  1. Updates confusion matrix (self._confusion) with new predictions

  2. Updates the four rates

  3. Tests each rate for change over time, using bounds from the Monte Carlo simulations

  4. If any of the rates exceeds its bounds, sets drift_state to either "warning" or "drift"

Parameters
  • y_true – one true label from input data.

  • y_pred – one predicted label from input data.

  • X – one row of features from input data. Not used in LFR.

property drift_state

Set detector’s drift state to "drift", "warning", or None.

input_type = 'stream'
property retraining_recs

Recommended indices, between the first warning and the detected drift, for retraining. Resets upon return to the normal state after each drift detection. If no warning fires, the recommendation spans only the current drift index (drift -> drift); in this case, caution is urged when retraining, as it indicates an abrupt change.

Returns

the current retraining recommendations

Return type

list

property samples_since_reset

Number of samples since last drift detection.

Returns

int

property total_samples

Total number of samples the drift detector has been updated with.

Returns

int

menelaus.concept_drift.md3

class menelaus.concept_drift.md3.MD3(clf, margin_calculation_function=<function MD3.calculate_margin_inclusion_signal>, sensitivity=2, k=10, oracle_data_length_required=None)[source]

Bases: DriftDetector

The Margin Density Drift Detection (MD3) method is a drift detection algorithm that alarms based on cumulative tracking of the number of samples falling in the margin, the uncertainty region of a classifier. Tracking samples that fall in the margin is an unsupervised task, as no true labels are required. However, it can lead to more frequent false alarms.

To counter this, MD3 has an initial drift warning step based on Margin Density, and then confirms or rules out drift based on accuracy of predictions on a labeled dataset that is accumulated from the “Oracle”, or directly from the data stream.

Margin Density (MD): “The expected number of data samples that fall within a robust classifier’s (one that distributes importance weights among its features) region of uncertainty, i.e. its margin.” Sethi and Kantardzic [2017]

Because the MD metric is essentially the total number of samples that fall within the margin divided by the total number of samples in the set, its value is in the range [0, 1].

Ref. Sethi and Kantardzic [2017]
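
A minimal sketch of the MD3 workflow, assuming a linear-kernel sklearn.svm.SVC (the classifier type the default margin_calculation_function is designed for). The synthetic data, column names, stand-in oracle labels, and the chosen oracle_data_length_required are illustrative only; the switch from update to give_oracle_label after a warning follows the description above:

    import numpy as np
    import pandas as pd
    from sklearn.svm import SVC

    from menelaus.concept_drift.md3 import MD3

    rng = np.random.default_rng(0)

    # Reference batch: a DataFrame whose target column is named "label".
    X_ref = pd.DataFrame(rng.normal(size=(500, 3)), columns=["a", "b", "c"])
    X_ref["label"] = (X_ref["a"] + rng.normal(scale=0.5, size=500) > 0).astype(int)

    clf = SVC(kernel="linear").fit(X_ref[["a", "b", "c"]], X_ref["label"])

    detector = MD3(clf=clf, sensitivity=2, k=10, oracle_data_length_required=20)
    detector.set_reference(X_ref, target_name="label")

    def stream_row():
        # One unlabeled sample; shifted toward the decision boundary so that
        # margin density rises relative to the reference batch.
        return pd.DataFrame(rng.normal(scale=0.3, size=(1, 3)),
                            columns=["a", "b", "c"])

    # Phase 1: stream unlabeled samples until the margin-density warning fires.
    for _ in range(500):
        detector.update(stream_row())
        if detector.drift_state == "warning":
            break

    # Phase 2: supply labeled samples from the "oracle" until drift is
    # confirmed ("drift") or ruled out.
    if detector.drift_state == "warning":
        for _ in range(20):  # oracle_data_length_required
            labeled = stream_row()
            labeled["label"] = int(labeled["a"].iloc[0] > 0)  # stand-in oracle label
            detector.give_oracle_label(labeled)
            if detector.drift_state == "drift":
                print("Drift confirmed; classifier retrained on oracle-labeled samples")
                break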

__init__(clf, margin_calculation_function=<function MD3.calculate_margin_inclusion_signal>, sensitivity=2, k=10, oracle_data_length_required=None)[source]
Parameters
  • clf (classifier) – the classifier for which we are tracking drift. If the classifier is not of type sklearn.svm.SVC, a margin_calculation_function must be passed in for appropriate margin signal tracking.

  • margin_calculation_function (function) – the appropriate margin signal function for the classifier. Takes in two arguments: (1) an incoming sample of size 1 as a numpy array and (2) the classifier for this detector. Should return 1 if the sample falls in the margin of the classifier, 0 if not. Defaults to the calculate_margin_inclusion_signal function, which is designed specifically for an sklearn.svm.SVC classifier.

  • sensitivity (float) – the sensitivity at which a change in margin density will be detected. Change is signaled when the margin density at a time t, given by MD_t, deviates by more than sensitivity standard deviations from the reference margin density value MD_Ref. A larger value can be set if frequent signaling is not desired. Alternatively, a lower value could be used for applications where small changes could be harmful, if undetected. Defaults to 2.

  • k (int) – the number of folds that will be used in k-fold cross validation when measuring the distribution statistics of the reference batch of data. Defaults to 10.

  • oracle_data_length_required (int) – the number of samples that will need to be collected by the oracle when drift is suspected, for the purpose of either confirming or ruling out drift, and then retraining the classifier if drift is confirmed. Defaults to the length of the reference distribution (this is set in the set_reference method).

calculate_distribution_statistics(data)[source]

Calculate the following five statistics for the data distribution passed in:

  1. Length/Number of Samples (len)

  2. Margin Density (md)

  3. Standard Deviation of Margin Density (md_std)

  4. Accuracy (acc)

  5. Standard Deviation of Accuracy (acc_std)

Parameters

data (DataFrame) – batch of data to calculate distribution statistics for

calculate_margin_inclusion_signal(sample, clf)[source]

Calculate the value of the margin inclusion signal for an incoming sample that the detector is being updated with. Uses the classifier passed in for this margin calculation. If the sample lies in the margin of the classifier, then a value of 1 is returned for the margin inclusion signal. Otherwise, a value of 0 is returned.

Parameters
  • sample (numpy.array) – feature values/sample data for the new incoming sample

  • clf (sklearn.svm.SVC) – the classifier for which we are calculating margin inclusion signal.
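
For a binary, linear-kernel sklearn.svm.SVC, a sample lies in the margin when the absolute value of its decision function, |w · x + b|, is at most 1. A minimal sketch of that check, illustrative rather than the library’s exact code:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=4, random_state=0)
    clf = SVC(kernel="linear").fit(X, y)

    sample = X[0]
    # 1 if the sample falls inside the classifier's margin, 0 otherwise.
    signal = int(np.abs(np.dot(clf.coef_[0], sample) + clf.intercept_[0]) <= 1)
    print(signal)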

give_oracle_label(labeled_sample)[source]

Provide the detector with a labeled sample to confirm or rule out drift. Once a certain number of samples is accumulated, drift can be confirmed or ruled out. If drift is confirmed, retraining will be initiated using these samples, and the reference distribution will be updated accordingly.

Parameters

labeled_sample (DataFrame) – labeled data sample

reset()[source]

Initialize the detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'.

set_reference(X, y_true=None, y_pred=None, target_name=None)[source]

Initialize detector with a reference batch. The reference batch must be manually set and updated by the user using this method. The reference batch is not automatically updated after drift is detected.

Parameters
  • X (pandas.DataFrame) – initial baseline dataset, including the target column

  • y_true (numpy.array) – true labels of dataset - not used in MD3

  • y_pred (numpy.array) – predicted labels of dataset - not used in MD3

  • target_name (string) – name of the column in the reference dataframe X which is the target variable

update(X, y_true=None, y_pred=None)[source]

Update the detector with a new sample.

Parameters
  • X (DataFrame) – feature values/sample data for the new incoming sample

  • y_true (numpy.array) – true label of new sample - not used in MD3

  • y_pred (numpy.array) – predicted label of new sample - not used in MD3

property drift_state

Detector’s current drift state, with values "drift", "warning", or None.

input_type = 'stream'
property total_updates

Number of samples/batches the drift detector has ever been updated with.

Returns

int

property updates_since_reset

Number of samples/batches since the last time the drift detector was reset.

Returns

int

menelaus.concept_drift.stepd

class menelaus.concept_drift.stepd.STEPD(window_size=30, alpha_warning=0.05, alpha_drift=0.003)[source]

Bases: StreamingDetector

STEPD is a drift detection algorithm based on a binary classifier’s accuracy, intended for an online classifier.

Two windows are defined – “recent” and “past” – with corresponding accuracies p_r and p_p. Roughly, their absolute difference, normalized by the accuracy over the two windows combined, gives a test statistic T that is approximately normally distributed. So, this test statistic’s p-value P(T) defines the warning and drift regions.

If p_r < p_p (the classifier’s accuracy on recent samples has decreased):

  • and P(T) < alpha_warning, the detector’s state is set to "warning".

  • and P(T) < alpha_drift, the detector’s state is set to "drift".

The index of the first sample which triggered a warning/drift state (relative to self.samples_since_reset) is stored in self._retraining_recs, for retraining the classifier when drift occurs.

STEPD is intended for use with an online classifier, which is trained on every new sample. That is, with each new sample, the question is not whether the classifier will be retrained; it’s whether some part of the previous training data should be excluded during retraining. The implementation depends on whether the classifier involved is able to incrementally retrain using only a single data point vs. being required to retrain on the entire set.

Ref. Nishida and Yamauchi [2007]
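
A minimal usage sketch; the stream below simulates an online classifier whose accuracy drops partway through (illustrative only):

    import numpy as np

    from menelaus.concept_drift.stepd import STEPD

    rng = np.random.default_rng(5)
    detector = STEPD(window_size=30, alpha_warning=0.05, alpha_drift=0.003)

    y_true = np.ones(600, dtype=int)
    y_pred = np.concatenate([rng.binomial(1, 0.95, 300),   # ~95% accuracy
                             rng.binomial(1, 0.65, 300)])  # ~65% accuracy

    for i, (yt, yp) in enumerate(zip(y_true, y_pred)):
        detector.update(y_true=yt, y_pred=yp)
        if detector.drift_state == "drift":
            print(f"Drift at sample {i}")
            print("recent window accuracy:", detector.recent_accuracy())
            print("past accuracy:", detector.past_accuracy())
            break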

__init__(window_size=30, alpha_warning=0.05, alpha_drift=0.003)[source]
Parameters
  • window_size (int, optional) – the size of the “recent” window. Defaults to 30.

  • alpha_warning (float, optional) – defines the threshold over which to enter the warning state. Defaults to 0.05.

  • alpha_drift (float, optional) – defines the threshold over which to enter the drift state. Defaults to 0.003.

overall_accuracy()[source]
Returns

the accuracy of the classifier among the samples the detector has seen since the detector was last reset

Return type

float

past_accuracy()[source]
Returns

the accuracy of the classifier among the samples the detector has seen before its current window, but after the last time the detector was reset

Return type

float

recent_accuracy()[source]
Returns

the accuracy of the classifier among the last self.window_size samples the detector has seen

Return type

float

reset()[source]

Initialize the detector’s drift state and other relevant attributes. Intended for use after drift_state == 'drift'.

update(y_true, y_pred, X=None)[source]

Update the detector with a new sample.

Parameters
  • y_true – one true label from input data

  • y_pred – one predicted label from input data

  • X – one row of features from input data. Not used by STEPD.

property drift_state

Set detector’s drift state to "drift", "warning", or None.

input_type = 'stream'
property retraining_recs

Returns

the current retraining recommendations

Return type

list

property samples_since_reset

Number of samples since last drift detection.

Returns

int

property total_samples

Total number of samples the drift detector has been updated with.

Returns

int