Collection of Time-Series Classification Datasets with Pretrained Deep Models and SHAP, LIME & Anchor Explanations
Description
A benchmark suite for reproducible time-series XAI research
A precomputed bundle of post-hoc explanations and black-box models for time-series classification
This dataset contains 83 univariate and 20 multivariate time series datasets from the TSC repository, each used for multiclass classification with a deep learning model. For each dataset, we provide:
• Precomputed train/test splits
• A trained TensorFlow model
• Post-hoc local explanations generated using three methods: SHAP, LIME, and Anchor
Impact of the dataset
These datasets provide a ready-to-use benchmark for Explainable AI in time series classification. Since models and explanation outputs are precomputed, researchers can immediately use them for evaluation, visualization, or developing new post-hoc XAI techniques. This reduces the overhead of retraining or re-explaining models, supports reproducibility, and enables systematic comparisons across explanation methods.
Comprehensive Coverage: 83 univariate and 20 multivariate UCR/UEA time-series datasets, each with a standardized 75/25 train–test split stored as pickled NumPy arrays.
Pretrained Models: Ready-to-use ConvLSTM1D TensorFlow models for every dataset, eliminating costly training and ensuring experimental consistency.
Precomputed Explanations: Post-hoc outputs for training and test sets from KernelSHAP, LIME, and Anchor, including attribution scores and rule sets with confidence and coverage.
Open Data and Code: All files are CC-BY-4.0, and a linked GitHub repository provides Jupyter notebooks for loading data, inspecting outputs, and applying XAI methods.
Repository content
- `train_test.zip` — contains files of the form `{univariate|multivariate}_{series_name}_train_and_test.zip`
Each includes: `trainX.pickle`, `trainy.pickle`, `testX.pickle`, `testy.pickle`
Format: `numpy.array`
- `models.zip` — trained models archived as `{univariate|multivariate}_{series_name}_model_tf.zip`
Each contains a TensorFlow SavedModel directory. Format: TensorFlow SavedModel
- `shap.zip` — SHAP values for each dataset in `{series_name}_shap_values.zip`
Files: `svtr.pickle` (train), `svts.pickle` (test). Format: `numpy.array`
- `lime.zip` — LIME values for each dataset in `{series_name}_lime_values.zip`
Files: `lvtr.pickle` (train), `lvts.pickle` (test). Format: `numpy.array`
- `anchor.zip` — rule-based Anchor explanations per dataset in `{series_name}_anchor_values.zip`
Files: `avtr.pickle` (train), `avts.pickle` (test). Format: `List[List[Dictionary]]`
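After extracting one of the train/test archives, the four pickled arrays can be read as sketched below (the directory path and helper name are illustrative, not part of the dataset):

```python
import pickle

def load_split(path):
    """Load the four pickled arrays of one extracted train/test archive."""
    parts = {}
    for name in ("trainX", "trainy", "testX", "testy"):
        with open(f"{path}/{name}.pickle", "rb") as handle:
            parts[name] = pickle.load(handle)
    return parts
```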
Technical info
Model Training
The time series classification model was trained using a deep neural architecture based on stacked ConvLSTM1D layers. Input data consisted of time series samples of shape (T, F), where T denotes the number of timesteps and F the number of features. The same model architecture was used for both univariate and multivariate input; in the univariate case, the number of features F = 1. Labels were one-hot encoded, producing output vectors of length equal to the number of classes. Prior to training, the data was normalized using StandardScaler, and any missing values were imputed with zeros.
To facilitate temporal feature learning, the input sequences were divided into smaller temporal blocks. Specifically, each sequence of length T was segmented into n_steps parts, where n_steps corresponds to the third smallest integer divisor of T (excluding 1 and 2). The segment length was then computed as n_length = T / n_steps, resulting in a reshaped input of shape (n_steps, n_length, F). This restructuring enables the model to capture both local patterns within segments and long-range dependencies across segments.
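The segmentation rule above can be sketched as follows (the function name is illustrative, and the error case for lengths with fewer than three divisors greater than 2 is an assumption):

```python
def n_steps_for(T):
    """Third smallest integer divisor of T, excluding 1 and 2."""
    divisors = [d for d in range(3, T + 1) if T % d == 0]
    if len(divisors) < 3:
        raise ValueError(f"{T} has fewer than three divisors greater than 2")
    return divisors[2]

# Example: T = 140 has divisors 4, 5, 7, ... above 2, so n_steps = 7,
# n_length = 140 // 7 = 20, and the reshaped input is (7, 20, F).
```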
The model architecture includes two ConvLSTM1D layers with 64 and 32 filters, respectively, each using a kernel size of 9 and ReLU activation. A dropout layer with a rate of 0.5 follows for regularization, and the output is flattened to create an intermediate embedding representation. This is followed by two fully connected layers: one with 100 ReLU units and another with softmax activation for classification. Class imbalance was addressed using computed class weights during training. The model was trained using the Adam optimizer and categorical cross-entropy loss for 25 epochs with a batch size of 64.
The architecture used across datasets is presented below; the number of trainable parameters depends on the input shape.
Layer (type)                 Output Shape
=================================================================
reshape (Reshape)            (None, n_steps, n_length, F)
conv_lstm1d (ConvLSTM1D)     (None, n_steps, n_length, 64)
conv_lstm1d_1 (ConvLSTM1D)   (None, n_steps, n_length, 32)
dropout (Dropout)            (None, n_steps, n_length, 32)
embedding (Flatten)          (None, n_steps × n_length × 32)
dense (Dense)                (None, 100)
dense_1 (Dense)              (None, n_classes)
=================================================================
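A sketch of this architecture in Keras, assuming the ConvLSTM1D layers use `padding="same"` and `return_sequences=True` (required to reproduce the output shapes shown above); this is illustrative, not the authors' exact training code:

```python
import tensorflow as tf

def build_model(n_steps, n_length, n_features, n_classes):
    """ConvLSTM1D classifier as described in the text (illustrative sketch)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_steps * n_length, n_features)),
        tf.keras.layers.Reshape((n_steps, n_length, n_features)),
        tf.keras.layers.ConvLSTM1D(64, kernel_size=9, padding="same",
                                   activation="relu", return_sequences=True),
        tf.keras.layers.ConvLSTM1D(32, kernel_size=9, padding="same",
                                   activation="relu", return_sequences=True),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Flatten(name="embedding"),
        tf.keras.layers.Dense(100, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training then proceeds with class weights, the Adam optimizer, 25 epochs, and batch size 64, as described above.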
The dataset was split into training and testing sets using a standard random partitioning strategy with a 75:25 ratio. This means that 75% of the samples were used for model training, while the remaining 25% were held out for evaluation. Stratified sampling was applied to preserve class distribution across both sets. Labels were one-hot encoded to support categorical cross-entropy loss during training.
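A minimal sketch of such a split (scikit-learn's `train_test_split` and a NumPy one-hot encoding are assumed here; the authors' exact preprocessing code is in the linked repository):

```python
import numpy as np
from sklearn.model_selection import train_test_split

def split_and_encode(X, y, seed=42):
    """75/25 stratified split with one-hot encoded integer labels."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    n_classes = len(np.unique(y))
    return X_tr, X_te, np.eye(n_classes)[y_tr], np.eye(n_classes)[y_te]
```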
Explanations
The explanations were computed for both training and test subsets. All indices in the explanation files are aligned with the respective train/test instances.
Explanation coverage (percentage of datasets for which explanations are available):
- Univariate: Anchor (39.75%), LIME (100.00%), SHAP (100.00%), all three (39.75%)
- Multivariate: Anchor (100.00%), LIME (95.00%), SHAP (100.00%), all three (95.00%)
Anchor:
Each explanation is a list of rule sets grouped per instance, e.g.:
[
  [
    {
      'index': 0,
      'success': True,
      'prediction': '1',
      'rule': {
        'feature_1': ['>-0.74'],
        'feature_4': ['<=0.73'],
        'feature_11': ['>-0.74'],
        'feature_42': ['<=0.73'],
        'feature_110': ['>-0.74']
      },
      'confidence': 0.9565,
      'coverage': 0.7197
    }
  ]
]
Each entry corresponds to a sample index. The `rule` defines a conjunction of feature constraints satisfied by the sample. `confidence` measures the fraction of samples fulfilling the rule for which the model gives the same prediction, while `coverage` is the fraction of the train/test set satisfying the rule.
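Given the structure above, a loaded `avtr.pickle`/`avts.pickle` object can be traversed as follows (the helper is a hypothetical sketch, not part of the dataset):

```python
def successful_rules(anchor_values):
    """Collect (index, prediction, n_constraints, confidence, coverage)
    for every successful explanation in a List[List[dict]] anchor file."""
    rows = []
    for per_instance in anchor_values:
        for exp in per_instance:
            if exp.get("success"):
                rows.append((exp["index"], exp["prediction"], len(exp["rule"]),
                             exp["confidence"], exp["coverage"]))
    return rows
```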
Accompanying GitHub repository
Code used for generation:
- Training code: https://github.com/mozo64/uci-time-series-xai-benchmark/blob/main/notebooks/UCI-workflow-train-balckbox-model.ipynb
- SHAP & LIME: https://github.com/mozo64/metro3/blob/main/notebooks/UCI-workflow-benchmark-2.ipynb
- Anchor: https://github.com/mozo64/metro3/blob/main/notebooks/UCI-workflow-benchmark-4-anchor.ipynb
- Demonstration of how to read the data: https://github.com/mozo64/uci-time-series-xai-benchmark/blob/main/notebooks/UCI-workflow-how-to.ipynb
List of all Time series
Multivariate:
| Time Series | Model Accuracy | Anchor | LIME | SHAP |
| :------------------------ | :------------- | :----- | :--- | :--- |
| ArticularyWordRecognition | 96.53 | Yes | Yes | Yes |
| AtrialFibrillation | 37.50 | Yes | Yes | Yes |
| BasicMotions | 100.00 | Yes | Yes | Yes |
| Cricket | 93.33 | Yes | Yes | Yes |
| Epilepsy | 91.30 | Yes | Yes | Yes |
| ERing | 98.67 | Yes | Yes | Yes |
| EthanolConcentration | 22.14 | Yes | Yes | Yes |
| FaceDetection | 50.25 | Yes | No | Yes |
| FingerMovements | 57.69 | Yes | Yes | Yes |
| HandMovementDirection | 23.73 | Yes | Yes | Yes |
| Handwriting | 58.40 | Yes | Yes | Yes |
| Heartbeat | 72.82 | Yes | Yes | Yes |
| Libras | 63.33 | Yes | Yes | Yes |
| LSST | 22.16 | Yes | Yes | Yes |
| NATOPS | 86.67 | Yes | Yes | Yes |
| PenDigits | 99.24 | Yes | Yes | Yes |
| RacketSports | 82.89 | Yes | Yes | Yes |
| SelfRegulationSCP1 | 89.36 | Yes | Yes | Yes |
| SelfRegulationSCP2 | 45.26 | Yes | Yes | Yes |
| UWaveGestureLibrary | 94.55 | Yes | Yes | Yes |
Univariate:
| Time Series | Model Accuracy | Anchor | LIME | SHAP |
| :------------------------------- | :------------- | :----- | :--- | :--- |
| Adiac | 29.59 | No | Yes | Yes |
| Beef | 53.33 | No | Yes | Yes |
| BeetleFly | 80.00 | Yes | Yes | Yes |
| BirdChicken | 50.00 | No | Yes | Yes |
| BME | 86.67 | No | Yes | Yes |
| CBF | 100.00 | No | Yes | Yes |
| Chinatown | 98.90 | Yes | Yes | Yes |
| Coffee | 53.33 | No | Yes | Yes |
| Computers | 53.33 | No | Yes | Yes |
| CricketX | 61.54 | Yes | Yes | Yes |
| CricketY | 60.51 | Yes | Yes | Yes |
| CricketZ | 62.56 | Yes | Yes | Yes |
| Crop | 76.78 | No | Yes | Yes |
| DiatomSizeReduction | 100.00 | Yes | Yes | Yes |
| DistalPhalanxOutlineAgeGroup | 77.78 | Yes | Yes | Yes |
| DistalPhalanxOutlineCorrect | 78.08 | Yes | Yes | Yes |
| DistalPhalanxTW | 66.67 | No | Yes | Yes |
| DodgerLoopDay | 55.00 | Yes | Yes | Yes |
| DodgerLoopGame | 87.50 | Yes | Yes | Yes |
| DodgerLoopWeekend | 97.50 | No | Yes | Yes |
| Earthquakes | 68.10 | No | Yes | Yes |
| ECG200 | 86.00 | Yes | Yes | Yes |
| ECG5000 | 91.52 | Yes | Yes | Yes |
| ECGFiveDays | 100.00 | Yes | Yes | Yes |
| ElectricDevices | 85.55 | No | Yes | Yes |
| FaceFour | 96.43 | Yes | Yes | Yes |
| FiftyWords | 63.00 | No | Yes | Yes |
| FordA | 86.84 | Yes | Yes | Yes |
| FordB | 85.61 | No | Yes | Yes |
| FreezerRegularTrain | 99.87 | No | Yes | Yes |
| FreezerSmallTrain | 99.44 | No | Yes | Yes |
| Fungi | 0.00 | Yes | Yes | Yes |
| GunPoint | 86.00 | No | Yes | Yes |
| GunPointAgeSpan | 88.50 | Yes | Yes | Yes |
| GunPointMaleVersusFemale | 100.00 | No | Yes | Yes |
| GunPointOldVersusYoung | 100.00 | No | Yes | Yes |
| Herring | 56.25 | No | Yes | Yes |
| InsectWingbeatSound | 66.91 | No | Yes | Yes |
| ItalyPowerDemand | 96.35 | No | Yes | Yes |
| LargeKitchenAppliances | 64.89 | No | Yes | Yes |
| Lightning2 | 58.06 | No | Yes | Yes |
| Lightning7 | 66.67 | No | Yes | Yes |
| Meat | 63.33 | Yes | Yes | Yes |
| MedicalImages | 68.53 | Yes | Yes | Yes |
| MiddlePhalanxOutlineAgeGroup | 78.42 | Yes | Yes | Yes |
| MiddlePhalanxOutlineCorrect | 69.51 | No | Yes | Yes |
| MiddlePhalanxTW | 64.75 | No | Yes | Yes |
| MoteStrain | 95.60 | No | Yes | Yes |
| OliveOil | 13.33 | Yes | Yes | Yes |
| OSULeaf | 56.76 | No | Yes | Yes |
| PhalangesOutlinesCorrect | 67.07 | Yes | Yes | Yes |
| Plane | 96.23 | Yes | Yes | Yes |
| PowerCons | 100.00 | Yes | Yes | Yes |
| ProximalPhalanxOutlineAgeGroup | 74.34 | No | Yes | Yes |
| ProximalPhalanxOutlineCorrect | 73.09 | No | Yes | Yes |
| ProximalPhalanxTW | 48.68 | Yes | Yes | Yes |
| RefrigerationDevices | 43.62 | No | Yes | Yes |
| ScreenType | 40.96 | No | Yes | Yes |
| ShapeletSim | 50.00 | No | Yes | Yes |
| ShapesAll | 70.00 | No | Yes | Yes |
| SmallKitchenAppliances | 60.11 | No | Yes | Yes |
| SmoothSubspace | 96.00 | Yes | Yes | Yes |
| SonyAIBORobotSurface1 | 99.36 | No | Yes | Yes |
| SonyAIBORobotSurface2 | 97.96 | No | Yes | Yes |
| Strawberry | 77.64 | Yes | Yes | Yes |
| SwedishLeaf | 82.27 | Yes | Yes | Yes |
| Symbols | 95.69 | No | Yes | Yes |
| SyntheticControl | 89.33 | Yes | Yes | Yes |
| ToeSegmentation2 | 80.95 | No | Yes | Yes |
| Trace | 68.00 | Yes | Yes | Yes |
| TwoLeadECG | 98.97 | No | Yes | Yes |
| TwoPatterns | 99.92 | No | Yes | Yes |
| UMD | 93.33 | Yes | Yes | Yes |
| UWaveGestureLibraryAll | 95.54 | No | Yes | Yes |
| UWaveGestureLibraryX | 81.61 | No | Yes | Yes |
| UWaveGestureLibraryY | 72.59 | No | Yes | Yes |
| UWaveGestureLibraryZ | 75.98 | No | Yes | Yes |
| Wafer | 99.83 | Yes | Yes | Yes |
| Wine | 60.71 | No | Yes | Yes |
| WordSynonyms | 67.84 | No | Yes | Yes |
| Worms | 50.77 | No | Yes | Yes |
| WormsTwoClass | 50.77 | Yes | Yes | Yes |
| Yoga | 91.39 | No | Yes | Yes |
Files (3.1 GB total)
| MD5 checksum | Size |
|---|---|
| feca5e739c85b9c5a582c1fc83aad65d | 1.3 MB |
| 97c4bf4952e29af5045cb57e4292fff9 | 402.8 MB |
| 1a79cbd4cba223a1e06e25044dd60883 | 1.4 GB |
| 6b5560b1c17f88d72efb5e4fe6c03b7f | 407.2 MB |
| 20573856720811582ac78becc7845a70 | 977.4 MB |