DELTA-DoubleWise/OmniReason

Compose and Fuse: Revisiting the Foundational Bottlenecks in Multimodal Reasoning

This directory contains the code used in our paper “Compose and Fuse: Revisiting the Foundational Bottlenecks in Multimodal Reasoning”.


Setup

This repository does not pin a fixed set of package requirements. Instead, set up your environment according to the dependencies of the specific model you intend to run:

⚠️ Make sure the environment matches the model runner you select (see src/evaluation/models/); the code framework itself requires no dependencies beyond that model-specific setup.


Dataset

We publish per-subset configs on the Hub as ycwang11/OmniReason, with one config per subset:

  • alternative, independent, complementary, contradictory, equivalent, entailment, recognition

Media handling:

  • The dataset uses datasets.Image/datasets.Audio features. When pushed to the Hub, media are embedded into parquet shards. HF stores only a basename in the path field and the actual bytes in bytes.

  • To make this seamless for downstream scripts, we provide src/utils/hf_loader.py, which:

    • casts media columns with decode=False
    • materializes embedded bytes into deterministic local files and returns absolute paths
    • cache location: ~/.cache/omnireason_media_cache (override with OMNIREASON_MEDIA_CACHE)

Build + publish (optional): see dataset/hf_publish/build_hg_dataset.py for staging local CSVs + media and pushing to the Hub.


Evaluation

Minimal CLI is in src/evaluation/eval_pipeline.py. It wires a registered model runner and a task:

# Example: Qwen2.5-Omni on the `equivalent` subset
python src/evaluation/eval_pipeline.py \
  --model Qwen2.5-Omni \
  --task equivalent

  • Model runners: see src/evaluation/models/*.py (Qwen2.5-Omni, MiniCPM, Baichuan, Phi-4 Omni, etc.). Each runner contains paths and options you may need to adjust (e.g., the local checkpoint directory).
  • Tasks are registered via decorators in src/evaluation/tasks/.
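Decorator-based registration typically looks like the sketch below; the names (`TASKS`, `register_task`) are illustrative, not the repository's actual API:

```python
# Minimal sketch of decorator-based task registration: the registry maps
# a task name (as passed to --task) to a builder function.
TASKS = {}

def register_task(name):
    """Register a task builder under `name` so the CLI can look it up."""
    def decorator(fn):
        TASKS[name] = fn
        return fn
    return decorator

@register_task("equivalent")
def build_equivalent_task():
    # A real task would load the `equivalent` HF config and return
    # prompts/labels; this stub just returns a payload.
    return {"subset": "equivalent"}

# The eval pipeline can then resolve --task by name:
task = TASKS["equivalent"]()
```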

If you prefer -m invocation, set PYTHONPATH to the repository root and run:

python -m src.evaluation.eval_pipeline \
  --model Qwen2.5-Omni --task independent

Interpretation

Two standalone scripts consume the HF dataset via the shared loader, then run Qwen2.5-Omni for analysis.

  1. Extract Attention (src/interpretation/extract_attention.py)
  • Runs the model and exports layerwise attention vectors associated with facts, rules, and questions.

  • Key args:

    • --type {subset} (HF config)
    • --hf-repo-id, --split
    • --pooling {mean,max,none}: head pooling mode for exported vectors
    • --mod_order (optional): a permutation of IAT selecting which slot is used for each modality (see Attention Manipulation below)
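The --pooling modes plausibly behave like the pure-Python stand-in below (the script itself presumably operates on tensors; `pool_heads` is an illustrative name, not the script's API):

```python
def pool_heads(attn, mode="mean"):
    """Collapse the head axis of an attention matrix before export.

    attn: list of per-head attention vectors (heads x positions).
    mode "none" keeps one vector per head; "mean"/"max" reduce across heads.
    """
    if mode == "none":
        return attn
    cols = list(zip(*attn))  # transpose to positions x heads
    if mode == "mean":
        return [sum(c) / len(c) for c in cols]
    if mode == "max":
        return [max(c) for c in cols]
    raise ValueError(f"unknown pooling mode: {mode}")
```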

Example:

python extract_attention.py \
  --type independent --pooling mean

  2. Attention Manipulation (src/interpretation/attention_manipulation.py)
  • Adjusts per-head temperatures on selected layers and runs the model.

  • Key args:

    • --type {subset}: HF config/subset name (e.g., equivalent)
    • --setting {vanilla,layer} and --layer {bottom,middle,top}
    • --temp_mode {decrease,increase} and --scale (amount)
    • --mod_order (optional): a permutation of IAT to select which slot is used for each modality

Example:

python attention_manipulation.py \
  --type independent --setting layer --layer top --temp_mode decrease --scale 0.2
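Conceptually, per-head temperature manipulation divides a head's attention logits by a temperature before the softmax; a smaller temperature sharpens the distribution, a larger one flattens it. The sketch below illustrates one plausible mapping of --temp_mode and --scale onto a temperature (the exact formula used by attention_manipulation.py may differ):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]

def temper_attention(logits, temp_mode="decrease", scale=0.2):
    """Rescale attention logits by a temperature before normalizing.

    Assumed mapping (illustrative): temp_mode="decrease" with scale s
    uses T = 1 - s (sharper attention), "increase" uses T = 1 + s
    (flatter attention).
    """
    temp = 1.0 - scale if temp_mode == "decrease" else 1.0 + scale
    return softmax([x / temp for x in logits])
```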
