This paper was accepted to MICCAI 2025: Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG
An electrocardiogram (ECG) is a widely used, cost-effective tool for detecting cardiac abnormalities, but it lacks the spatial and functional insights provided by cardiac magnetic resonance (CMR). CMR is the gold standard for assessing cardiac structure and function but is costly and less accessible. To address this challenge, we propose PTACL (Patient and Temporal Alignment Contrastive Learning), a multimodal contrastive learning framework that enhances ECG representations by integrating spatiotemporal information from CMR. PTACL employs a global contrastive loss to align patient-level representations by pulling ECG and CMR embeddings from the same patient closer together while pushing apart embeddings from different patients. Additionally, a local contrastive loss enforces fine-grained temporal alignment within each patient by contrasting encoded ECG segments with corresponding encoded CMR frames. This approach enriches ECG representations with diagnostic information beyond electrical activity, improving interpretability and enabling more accurate functional assessments of the heart. We evaluate PTACL on paired ECG-CMR data from 27,951 subjects in the UK Biobank. Compared to baseline approaches, PTACL achieves stronger performance in two clinically relevant tasks: (1) retrieving patients with similar cardiac phenotypes and (2) predicting CMR-derived cardiac function metrics, such as ventricular volumes and ejection fraction. These results highlight PTACL's potential to enhance noninvasive cardiac diagnostics using ECG.
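For intuition, the global (patient-level) and local (temporal-level) objectives described above can both be illustrated with a symmetric InfoNCE loss. The sketch below is a minimal NumPy illustration, not the paper's exact formulation; the temperature, batch size, and embedding dimension are made up:

```python
import numpy as np

def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE between two batches of embeddings.

    Row i of `a` and row i of `b` form a positive pair (same patient, or
    same time step within a patient); all other rows act as negatives.
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (N, N) scaled cosine similarities

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))  # diagonal entries are the positives

    # cross-entropy with the diagonal as the target, in both directions
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)

# Global loss: one embedding per patient and modality (8 patients, 32-dim).
ecg = rng.normal(size=(8, 32))
cmr = ecg + 0.05 * rng.normal(size=(8, 32))  # well-aligned CMR embeddings
global_loss = info_nce(ecg, cmr)

# Local loss: within a single patient, contrast encoded ECG segments
# against the temporally corresponding encoded CMR frames (10 time steps).
ecg_segments = rng.normal(size=(10, 32))
cmr_frames = ecg_segments + 0.05 * rng.normal(size=(10, 32))
local_loss = info_nce(ecg_segments, cmr_frames)
```

Because the local loss reuses the same contrastive machinery over time steps instead of patients, it adds no new learnable weights, matching the framing in the abstract.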
Install dependencies using conda:
conda env create -f environment.yml
conda activate mmcl
Set up the repository:
git clone https://github.com/Alsalivan/ecgcmr.git
cd ecgcmr
pip install -e .
pip install -e external/transformers  # Install the custom ViT-MAE transformers fork
We provide pretrained weights for the ECG ViT-MAE encoder on the Hugging Face Hub:
Files included:
- model.safetensors: Pretrained model weights
- config.json: Model architecture
These weights are compatible with the ECG pretraining and downstream regression pipelines in this repository.
(See the Environment Setup section above)
from transformers import ViTMAEConfig, ViTMAEModel
# Load pretrained ECG ViT-MAE model
cfg = ViTMAEConfig.from_pretrained("alsalivan/vitmae_ecg")
model = ViTMAEModel.from_pretrained("alsalivan/vitmae_ecg")

Training configurations are set in the conf/ directory, with base settings defined in base.yaml. The following configurations determine which training mode is executed:
models: ecg/masked
training_mode: ecg/pretrain_masked
downstream_task: regression
dataset: ecg/masked
augmentations: without

models: imaging/masked
training_mode: imaging/pretrain_masked
downstream_task: regression
dataset: imaging/masked
augmentations: without

models: multimodal/contrastive
training_mode: multimodal/pretrain_contrastive
downstream_task: regression
dataset: multimodal/contrastive
augmentations: heavy  # Options: light, medium, heavy, without

To start training, use the following command:
python run.py

Alternatively, submit the training job using SLURM:
sbatch slurm_scripts/submit_train_dev.sh

For large-scale training on a cluster, use the provided SLURM script (slurm_scripts/submit_train_dev.sh):
#!/bin/bash
#SBATCH --job-name=XXXX
#SBATCH --partition=XXXX
#SBATCH --output=.../output.txt
#SBATCH --error=.../error.txt
#SBATCH --mem=70G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:1
#SBATCH --time=00:10:00
module load python/anaconda3
source activate mmcl
export PYTHONPATH=$(pwd)
python run.py models=<CONFIG_MODEL> training_mode=<CONFIG_TRAINING> dataset=<CONFIG_DATASET> augmentations=<AUGMENTATION_TYPE>

Replace <CONFIG_MODEL>, <CONFIG_TRAINING>, <CONFIG_DATASET>, and <AUGMENTATION_TYPE> based on the selected pretraining mode.
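For example, a multimodal contrastive pretraining run would combine the override values from the configuration blocks above (this is a configuration example only; check conf/ for the exact option names in your checkout):

```shell
python run.py models=multimodal/contrastive training_mode=multimodal/pretrain_contrastive dataset=multimodal/contrastive augmentations=heavy
```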
To reproduce retrieval results, load checkpoints and compute embeddings:
components = load_checkpoints(cfg, path_ecg_checkpoint, path_mri_checkpoint, path_model_checkpoint, model_type)
print("Checkpoints loaded.")
embeddings_dict = compute_embeddings(d, components, model_type, cfg, device=device)
print("Embeddings computed")Then, use the retrieval function:
from inference_retrieval_task import retrieve_and_evaluate
k_values = [1, 3, 5, 10, 15, 20, 30]
margin_type = "zscore"
margin_value = 0.5
for k in k_values:
_, (final_score, final_score_std), _ = retrieve_and_evaluate(
ecg_embeddings_global=embeddings_dict["ecg_embeddings_global"],
mri_embeddings_global=embeddings_dict["mri_embeddings_global"],
labels_table=labels_df,
dataloader_eids=embeddings_dict["eids"],
approach="global",
local_type="timelevel_topk",
k=k,
evaluation_method="absolute",
margin_type=margin_type,
margin_value=margin_value,
)

For ECG regression, use functions from ecgcmr/signal/inference_ecg.py:
- Linear probing: sklearn_regression
- Supervised training: train_ecg_regression_supervised
- Fine-tuning with grid search: train_fine_tuning_with_grid_search
For imaging regression, use functions from ecgcmr/imaging/inference_imaging.py:
- Supervised training: train_mri_regression_supervised
- Linear probing: train_regression_mri
Additionally, general regression training functions can be found in ecgcmr/utils/train_evaluate.py, which provides utilities for:
- Loading datasets and data loaders (DownstreamECGDataset)
- Training regression models with different configurations
- Running grid search for hyperparameter tuning
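Conceptually, linear probing fits a simple linear model on embeddings from a frozen encoder. The following is a minimal NumPy sketch with synthetic data, independent of the repository's helpers; shapes and the noise level are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for frozen-encoder ECG embeddings and a CMR-derived target
# (e.g. a standardized ventricular volume): 200 patients, 16-dim embeddings.
train_emb = rng.normal(size=(200, 16))
true_w = rng.normal(size=16)                      # hypothetical ground-truth mapping
train_y = train_emb @ true_w + 0.01 * rng.normal(size=200)

# The linear probe: ordinary least squares on the frozen embeddings,
# with the encoder itself never updated.
w, *_ = np.linalg.lstsq(train_emb, train_y, rcond=None)

# Predict on held-out embeddings.
val_emb = rng.normal(size=(50, 16))
val_pred = val_emb @ w
```

In the repository, the mean/std label files passed to sklearn_regression play the role of the target standardization omitted from this sketch.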
To run linear probing for ECG regression, use:
from ecgcmr.utils.train_evaluate import sklearn_regression
result = sklearn_regression(
ecg_model=ecg_encoder,
train_loader=d_train,
val_loader=d_val,
mean_train_labels_vol=np.load(cfg.downstream_task.paths.mean_train_labels_vol),
std_train_labels_vol=np.load(cfg.downstream_task.paths.std_train_labels_vol),
)

For fine-tuning regression with grid search, use:
from ecgcmr.utils.train_evaluate import train_fine_tuning_with_grid_search
train_fine_tuning_with_grid_search(
train_loader=d_train,
val_loader=d_val,
path_ecg_checkpoint=model_path,
device=device,
epochs=epochs,
)

@inproceedings{0b1bc77fb4e64e82b05f7c51616d8a61,
title = "Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG",
abstract = "An electrocardiogram (ECG) is a widely used, cost-effective tool for detecting electrical abnormalities in the heart. However, it cannot directly measure functional parameters, such as ventricular volumes and ejection fraction, which are crucial for assessing cardiac function. Cardiac magnetic resonance (CMR) is the gold standard for these measurements, providing detailed structural and functional insights, but is expensive and less accessible. To bridge this gap, we propose PTACL (Patient and Temporal Alignment Contrastive Learning), a multimodal contrastive learning framework that enhances ECG representations by integrating spatio-temporal information from CMR. PTACL uses global patient-level contrastive loss and local temporal-level contrastive loss. The global loss aligns patient-level representations by pulling ECG and CMR embeddings from the same patient closer together, while pushing apart embeddings from different patients. Local loss enforces fine-grained temporal alignment within each patient by contrasting encoded ECG segments with corresponding encoded CMR frames. This approach enriches ECG representations with diagnostic information beyond electrical activity and transfers more insights between modalities than global alignment alone, all without introducing new learnable weights. We evaluate PTACL on paired ECG-CMR data from 27,951 subjects in the UK Biobank. Compared to baseline approaches, PTACL achieves better performance in two clinically relevant tasks: (1) retrieving patients with similar cardiac phenotypes and (2) predicting CMR-derived cardiac function parameters, such as ventricular volumes and ejection fraction. Our results highlight the potential of PTACL to enhance non-invasive cardiac diagnostics using ECG. The code is available at: https://github.com/alsalivan/ecgcmr.",
keywords = "Contrastive Learning, ECG, MRI, Time-Alignment",
author = "Alexander Selivanov and Philip M{\"u}ller and {\"O}zg{\"u}n Turgut and Nil Stolt-Ans{\'o} and Daniel Rueckert",
note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.; 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 ; Conference date: 23-09-2025 Through 27-09-2025",
year = "2026",
doi = "10.1007/978-3-032-04927-8\_21",
language = "English",
isbn = "9783032049261",
series = "Lecture Notes in Computer Science",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "217--227",
editor = "Gee, {James C.} and Jaesung Hong and Sudre, {Carole H.} and Polina Golland and Alexander, {Daniel C.} and Iglesias, {Juan Eugenio} and Archana Venkataraman and Kim, {Jong Hyo}",
booktitle = "Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, 2025, Proceedings",
}
