AiGAS-dEVL: An Adaptive Incremental Neural Gas Model for Drifting Data Streams under Extreme Verification Latency

Python implementation and results corresponding to the work:

M. Arostegi, M. N. Bilbao, J. L. Lobo, J. Del Ser, "AiGAS-dEVL: An Adaptive Incremental Neural Gas Model for Drifting Data Streams under Extreme Verification Latency", under review, 2024.

The ever-growing speed at which data are generated nowadays, together with the substantial cost of labeling processes (often reliant on human supervision), often causes Machine Learning models to face scenarios in which data are partially labeled, or delayed with respect to their query time. The extreme case where such supervision is indefinitely unavailable is referred to as extreme verification latency. On the other hand, in streaming setups data flows are affected by exogenous factors that yield non-stationarities in the patterns to be modeled (concept drift), compelling models learned incrementally from the data stream to adapt their modeled knowledge to the prevailing concepts within the stream.

In this work we address the scenario in which these two conditions occur together over the data stream, so that adaptation mechanisms for accommodating drifts within the stream are challenged by the lack of supervision, hence requiring further mechanisms to track the evolution of concepts in the absence of verification. To this end we propose a novel approach, coined AiGAS-dEVL (Adaptive Incremental neural GAS model for drifting Streams under Extreme Verification Latency), which relies on growing neural gas to characterize the shape and inner point distributions of all concepts detected within the stream over time. Our approach shows that the online analysis of the behavior of these prototypical points over time facilitates the definition of the evolution of concepts in the feature space, the detection of changes in their behavior, and the design of adaptation policies to mitigate the effect of such changes on the model.

We assess the performance of AiGAS-dEVL over several synthetic datasets, comparing it to that of state-of-the-art approaches proposed in the recent past to tackle this stream learning setup. Our reported results reveal that AiGAS-dEVL performs competitively with respect to the other baselines, exhibiting superior adaptability over several datasets in the benchmark while ensuring a simple and interpretable instance-based adaptation strategy.
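To give an intuition of the prototype-based adaptation the approach builds on, the sketch below shows the basic neural gas update rule: the prototype nearest to an incoming sample (and, more weakly, the second nearest) is nudged toward that sample. This is a minimal, hypothetical illustration with assumed learning rates (`eps_winner`, `eps_neighbor`), not the released AiGAS-dEVL implementation.

```python
import numpy as np

def adapt_prototypes(prototypes, sample, eps_winner=0.1, eps_neighbor=0.01):
    """Basic neural gas update: move the winning prototype (and, with a
    smaller step, the runner-up) toward an incoming sample.

    prototypes : (K, N) array of prototype positions, updated in place.
    sample     : (N,) array, the incoming stream sample.
    Returns the index of the winning prototype.
    """
    dists = np.linalg.norm(prototypes - sample, axis=1)
    order = np.argsort(dists)
    winner, runner_up = order[0], order[1]
    prototypes[winner] += eps_winner * (sample - prototypes[winner])
    prototypes[runner_up] += eps_neighbor * (sample - prototypes[runner_up])
    return winner
```

Tracking how these prototypes move between consecutive batches is what makes the evolution of each concept in the feature space observable without labels.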

The code implementing the AiGAS-dEVL approach described above is currently being polished. We expect to upload it soon, together with instructions to run it in a local Python environment; in the meantime, it is available upon request. Results files have been uploaded: one file per dataset, following the format:

  • Sample at time 1: Feature1, Feature2, ... FeatureN, TrueLabel
  • Sample at time 2: Feature1, Feature2, ... FeatureN, TrueLabel
  • ...
  • Sample at time Ts: Feature1, Feature2, ... FeatureN, TrueLabel, PredictedLabel
  • Sample at time Ts+1: Feature1, Feature2, ... FeatureN, TrueLabel, PredictedLabel
  • Sample at time Ts+2: Feature1, Feature2, ... FeatureN, TrueLabel, PredictedLabel
  • ...
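Given this layout, a small helper can score the predicted labels in a results file. The sketch below is a hypothetical example, not part of the released code; it assumes comma-separated rows, where rows from time Ts onward carry one extra trailing PredictedLabel column.

```python
import csv

def stream_accuracy(lines):
    """Accuracy over the rows of a results file that carry a PredictedLabel.

    Rows before time Ts hold N features plus TrueLabel; from Ts onward an
    extra PredictedLabel column is appended, so those rows are one field
    longer than the first (prediction-free) row.
    """
    rows = list(csv.reader(lines, skipinitialspace=True))
    n_plain = len(rows[0])              # features + TrueLabel
    hits = total = 0
    for row in rows:
        if len(row) == n_plain + 1:     # row with a PredictedLabel
            total += 1
            hits += row[-2] == row[-1]  # TrueLabel vs. PredictedLabel
    return hits / total if total else float("nan")
```

For example, feeding it the lines of one results file (via `open(path)`) yields the fraction of post-Ts samples whose predicted label matches the true one.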

For more information, please contact the corresponding author (Maria Arostegi, maria.arostegi@tecnalia.com).

Please cite this work as:

@misc{arostegi2024,
  title={{AiGAS-dEVL}: An Adaptive Incremental Neural Gas Model for Drifting Data Streams under Extreme Verification Latency},
  author={Maria Arostegi and Miren Nekane Bilbao and Jesus L. Lobo and Javier Del Ser},
  year={2024},
  eprint={waiting for arXiv acceptance},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org}
}