The Switchboard Affect dataset contains perceptual emotion annotations for 10,000 publicly available audio segments, totaling 25 hours of speech. The source audio files can be acquired from LDC (https://catalog.ldc.upenn.edu/LDC97S62), and we include a script to extract the audio segments from the corpus.
Each segment was annotated independently by 6 graders who passed a training and certification process. The set of labels includes both categorical and dimensional emotions, and we provide a detailed format (annotations from each grader) as well as a consensus format (annotations aggregated for each segment).
- Categorical emotions. Selections of primary and secondary emotions from the following set:
  Anger, Contempt, Disgust, Sadness, Fear, Surprise, Happiness, Tenderness, Calmness, Neutral, Other
- Dimensional emotions. Ratings for valence, activation, and dominance ranging from 1 to 5:
  Valence:     1 = negative, 5 = positive
  Activation:  1 = drained,  5 = energetic
  Dominance:   1 = weak,     5 = strong
labels_detailed.csv includes the unaggregated annotations from each grader. labels_consensus.csv includes consensus annotations aggregated for each segment. For consensus on categorical emotions, 50%+ of the graders need to agree on a primary or secondary emotion. For consensus on dimensional emotions, we take the mean of the ratings from all graders.
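The consensus rules above can be sketched in a few lines of Python. The column names (`segment_id`, `primary_emotion`, `valence`, `activation`, `dominance`) and the exact "50%+" threshold interpretation are assumptions for illustration, not the released file schema:

```python
from collections import Counter, defaultdict

def aggregate(rows):
    """Aggregate per-grader rows (dicts) into per-segment consensus labels.

    Assumed columns: segment_id, primary_emotion, valence, activation,
    dominance. These names are hypothetical; check the released CSVs.
    """
    by_segment = defaultdict(list)
    for row in rows:
        by_segment[row["segment_id"]].append(row)

    consensus = {}
    for seg_id, graders in by_segment.items():
        # Categorical: keep the top emotion only if at least half of the
        # graders chose it (one reading of "50%+"; this is an assumption).
        counts = Counter(r["primary_emotion"] for r in graders)
        emotion, n = counts.most_common(1)[0]
        primary = emotion if n * 2 >= len(graders) else None
        # Dimensional: mean of all graders' 1-5 ratings.
        dims = {
            d: sum(float(r[d]) for r in graders) / len(graders)
            for d in ("valence", "activation", "dominance")
        }
        consensus[seg_id] = {"primary_emotion": primary, **dims}
    return consensus
```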
The script extract_segments.py reads in the raw audio and metadata from the LDC corpus and saves .wav files for each segment. [LDC_DIR] refers to the folder that contains raw audio files (in subfolder swb1_LDC97S62) and segment metadata (in subfolder ms98_transcriptions). [SEG_DIR] refers to the folder in which you want to save the segments.
To extract the segments, run the following from this directory:
python3 extract_segments.py --ldc_dir [LDC_DIR] --seg_dir [SEG_DIR]
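Once extraction finishes, the segment .wav files can be sanity-checked with Python's standard-library wave module. This is just a sketch; the filenames under [SEG_DIR] depend on extract_segments.py's output naming:

```python
import wave

def segment_duration(path):
    """Return the duration in seconds of an extracted segment .wav file."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / float(w.getframerate())
```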
If you find the SWB-Affect dataset or this code useful in your research, please cite the following paper:
@misc{romana2025,
  author       = {Amrit Romana and Jaya Narain and Tien Dung Tran and Andrea Davis and Jason Fong and Ramya Rasipuram and Vikramjit Mitra},
  title        = {Switchboard-Affect: Emotion Perception Labels from Conversational Speech},
  howpublished = {ACII 2025},
  year         = {2025},
}