Cross-Modal-Knowledge-Transfer-Project

IITB EdTech Internship 2025 with DYPCET Track 1 - Educational Data Analysis (EDA)

Project Overview:

This project addresses Problem ID-15, Cross-Modal Knowledge Transfer, from the IITB EdTech Internship 2025. The objective is to train a teacher model on EEG data and evaluate whether student models trained on eye-tracking (EYE + IVT), GSR, or facial expressions (TIVA) can approximate the teacher's performance through techniques such as domain adaptation or modality dropout. Advanced methods include adversarial domain adaptation and contrastive learning. The project uses multimodal physiological data to predict task accuracy (binary: Correct/Incorrect) or engagement levels from PSY.csv, using per-trial features synchronized across modalities.

Key Objectives:

1. Use EEG to train a teacher model and test approximation by eye-tracking or GSR-only student models.

2. Implement knowledge distillation, adversarial domain adaptation, and contrastive learning.

3. Evaluate performance improvements and interpret key features (e.g., pupil size in eye-tracking).

The dataset consists of physiological signals from 38 participants across multiple modalities, organized per-trial for synchronization. Files are stored in the data/ directory:

* EEG.csv: EEG signals (teacher modality). Features: mean and variance of frequency bands (Delta, Theta, Alpha, Beta, Gamma).

* EYE.csv + IVT.csv: Eye-tracking data (student modality). Features: average fixation duration, saccade amplitude, mean pupil size.

* GSR.csv: Galvanic Skin Response (student modality). Features: mean conductance, slopes, recovery rates.

* TIVA.csv: Facial expressions (student modality). Features: average Action Unit (AU) intensities or emotion probabilities.

* PSY.csv: Psychological data for targets (e.g., engagement levels or task accuracy).

Data Organization:

Features are extracted and paired per trial, ensuring teacher-student alignment (e.g., EEG features matched with Eye features from the same event).
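The per-trial pairing described above could be sketched with a pandas inner join. This is a minimal illustration, not the project's actual loader: the key columns `participant` and `trial`, the feature column names, and the helper `pair_trials` are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical per-trial feature tables. The key names "participant" and
# "trial" are assumptions, not columns confirmed by the dataset.
eeg = pd.DataFrame({
    "participant": [1, 1, 2],
    "trial": [1, 2, 1],
    "eeg_alpha_mean": [0.42, 0.38, 0.51],
})
eye = pd.DataFrame({
    "participant": [1, 2, 2],
    "trial": [1, 1, 2],
    "pupil_mean": [3.1, 2.9, 3.4],
})


def pair_trials(teacher: pd.DataFrame, student: pd.DataFrame) -> pd.DataFrame:
    """Keep only trials present in both modalities (inner join on the keys)."""
    return teacher.merge(
        student,
        on=["participant", "trial"],
        how="inner",
        validate="one_to_one",  # raise if a trial appears twice in either table
    )


paired = pair_trials(eeg, eye)
```

An inner join drops trials that are missing in either modality, so every remaining row has both teacher and student features for the same event.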

Note: Due to data access issues, only EEG features were fully processed in this implementation. Student modalities (Eye, GSR, Facial) are placeholders for future integration.

Installation & Setup

Clone the Repository:

git clone

cd project

Environment Setup (Google Colab recommended):

Mount Google Drive:

from google.colab import drive
drive.mount('/content/drive')

Install dependencies:

!pip install pandas numpy scikit-learn xgboost tensorflow shap matplotlib seaborn

Data Preparation:

Place dataset files in data/ or mount /content/drive/MyDrive/STData/STData/ for participant subdirectories (1-38).

Ensure PSY.csv is available for target encoding.

Usage:

Run the notebooks sequentially in Jupyter/Colab:

01_preprocessing.ipynb:

Load data, extract features (EEG frequency bands, Eye summaries, GSR slopes, TIVA AUs), normalize with z-score, apply PCA (10 components per modality), and save preprocessed_trials.csv.
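The normalization and reduction step might look roughly like the following scikit-learn pipeline. It is a sketch under assumptions: the synthetic feature matrix stands in for one modality's raw features, and `make_modality_reducer` is a hypothetical helper name, not a function from the notebook.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_modality_reducer(n_components: int = 10):
    """Z-score each feature, then project onto the top principal components."""
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=42),
    )


# Synthetic stand-in for one modality: 200 trials x 20 raw features.
rng = np.random.default_rng(42)
X_raw = rng.normal(loc=5.0, scale=2.0, size=(200, 20))

reducer = make_modality_reducer()
X_reduced = reducer.fit_transform(X_raw)  # shape: (200, 10)
```

In practice the reducer would be fitted on the training split only and then applied to held-out trials with `reducer.transform`, so that test data does not leak into the scaling statistics or the principal components.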

02_baseline_single_modality.ipynb:

Train XGBoost baselines on each modality and evaluate F1-score, Accuracy, ROC-AUC.
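As an illustration of the evaluation side of this notebook, the sketch below computes a class-imbalance weight and the three reported metrics from predicted probabilities. The toy labels and probabilities are made up, and the helpers `imbalance_weight` and `evaluate` are hypothetical. The negative-to-positive ratio is a common choice for XGBoost's `scale_pos_weight`, but whether the notebook uses exactly this ratio is an assumption.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score


def imbalance_weight(y: np.ndarray) -> float:
    """Ratio of negative to positive samples, a common scale_pos_weight value."""
    n_pos = int(np.sum(y == 1))
    n_neg = int(np.sum(y == 0))
    return n_neg / n_pos


def evaluate(y_true: np.ndarray, y_prob: np.ndarray, threshold: float = 0.5) -> dict:
    """Compute F1, accuracy and ROC-AUC from positive-class probabilities."""
    y_pred = (y_prob >= threshold).astype(int)
    return {
        "f1": f1_score(y_true, y_pred),
        "accuracy": accuracy_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_prob),  # uses scores, not hard labels
    }


# Toy labels and probabilities standing in for a trained model's output.
y_true = np.array([0, 0, 0, 1, 1, 0])
y_prob = np.array([0.1, 0.4, 0.6, 0.8, 0.3, 0.2])

weight = imbalance_weight(y_true)
metrics = evaluate(y_true, y_prob)
```

The resulting `weight` could then be passed as `scale_pos_weight` when constructing an `xgboost.XGBClassifier`, together with `random_state=42`.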

03_teacher_student_transfer.ipynb:

Train EEG teacher, then distill knowledge to student models (Eye, GSR, Facial) using soft labels.
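One common way to combine hard labels with the teacher's soft labels is the temperature-scaled distillation loss sketched below in NumPy. The exact formulation used in the notebook is not stated here, so the weighting (alpha on the hard term, 1 - alpha on the soft term, with a T^2 factor) is an assumption. The defaults mirror the Temperature=2.0 and Alpha=0.5 values listed under Key Hyperparameters, and the logits are made up.

```python
import numpy as np


def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5) -> float:
    eps = 1e-12
    # Hard-label term: cross-entropy against the true classes at T = 1.
    p_student = softmax(student_logits)
    hard = -np.mean(np.log(p_student[np.arange(len(labels)), labels] + eps))
    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = np.mean(np.sum(
        q_teacher * (np.log(q_teacher + eps) - np.log(q_student + eps)), axis=1))
    # T^2 keeps the soft term's gradient scale comparable to the hard term.
    return float(alpha * hard + (1.0 - alpha) * temperature ** 2 * soft)


# Made-up logits for two trials and two classes (Incorrect, Correct).
teacher_logits = np.array([[2.0, -1.0], [0.5, 1.5]])
labels = np.array([0, 1])

loss_matching = distillation_loss(teacher_logits, teacher_logits, labels)
loss_opposed = distillation_loss(-teacher_logits, teacher_logits, labels)
```

A student that reproduces the teacher's logits has a zero soft term, so `loss_matching` reduces to the weighted hard-label cross-entropy, while a student that disagrees with the teacher is penalized by both terms.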

04_domain_adaptation.ipynb:

Implement adversarial adaptation with gradient reversal for domain-invariant features.
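The gradient reversal idea can be shown without a deep learning framework: the layer is the identity in the forward pass and flips the sign of the gradient (scaled by a factor) in the backward pass. The NumPy class below is a conceptual sketch with manually supplied gradients. The class name, the `lam` parameter, and the example values are assumptions, and a real implementation would hook into the framework's autodiff (in TensorFlow, typically via `tf.custom_gradient`).

```python
import numpy as np


class GradientReversal:
    """Identity in the forward pass; scales the gradient by -lam going backward."""

    def __init__(self, lam: float = 1.0):
        self.lam = lam

    def forward(self, x: np.ndarray) -> np.ndarray:
        return x

    def backward(self, grad_output: np.ndarray) -> np.ndarray:
        return -self.lam * grad_output


# Made-up encoder features for two trials.
features = np.array([[0.2, -1.0], [1.5, 0.3]])

grl = GradientReversal(lam=0.5)
out = grl.forward(features)

# Pretend the domain classifier sent back this gradient w.r.t. its input.
grad_from_domain_head = np.array([[1.0, -2.0], [0.0, 4.0]])
grad_to_encoder = grl.backward(grad_from_domain_head)
```

Because the encoder receives the negated domain-classifier gradient, minimizing the domain loss in the classifier head pushes the encoder toward features from which the modality is harder to tell apart.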

05_contrastive_learning.ipynb:

Use NT-Xent loss for shared embedding space across modalities.
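For reference, NT-Xent over a batch of paired embeddings can be written in a few lines of NumPy. This sketch assumes that row i of the two inputs comes from the same trial in two modalities. The temperature of 0.5 and the random embeddings are illustrative assumptions, not values taken from the notebook.

```python
import numpy as np


def nt_xent(z_a: np.ndarray, z_b: np.ndarray, temperature: float = 0.5) -> float:
    """NT-Xent loss where row i of z_a and row i of z_b form a positive pair."""
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit norm -> cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own negative or positive
    # Index of each row's positive partner: i <-> i + n.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    sim = sim - sim.max(axis=1, keepdims=True)  # stabilise the log-sum-exp
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())


rng = np.random.default_rng(0)
z_eeg = rng.normal(size=(8, 16))                      # 8 trials, 16-dim embeddings
z_eye_aligned = z_eeg + 0.01 * rng.normal(size=(8, 16))
z_eye_shuffled = z_eye_aligned[::-1]                  # break the trial pairing

loss_aligned = nt_xent(z_eeg, z_eye_aligned)
loss_shuffled = nt_xent(z_eeg, z_eye_shuffled)
```

When the two modalities embed the same trial close together, the loss is low; shuffling the pairing raises it, which is the signal that pulls matching trials toward a shared embedding space.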

Example Command (Colab):

Open each notebook and run all cells. Outputs include saved models, metrics CSVs, and plots.

Key Hyperparameters:

1. XGBoost: random_state=42, scale_pos_weight for imbalance.

2. Distillation: Temperature=2.0, Alpha=0.5.

3. Adaptation: Embedding dim=16, Loss weights=[1.0, 0.5, 0.5].

Results

Baseline EEG Performance

| Metric   | Value  |
|----------|--------|
| F1-score | 0.5263 |
| Accuracy | 0.6213 |
| ROC-AUC  | 0.7595 |

Future Work

1. Resolve Step 2 data loading for full modalities.

2. Implement hyperparameter tuning (GridSearchCV) and 5-fold cross-validation.

3. Explore modality dropout and pretraining for robustness.

4. Ensemble models across modalities.

5. Full SHAP analysis for student models post-transfer.
