YaxitaAmin/surgical_robot_rl

🧠 Reinforcement Learning-Based Path Planning for Robotic Brain Surgery Simulation

Autonomous surgical needle path planning using Q-Learning and DQN in a PyBullet physics simulation with a Franka Panda robotic arm.

University of Maryland, College Park | Yaxita Amin & K. Manasanjani


πŸ“– Overview

This project presents a reinforcement learning approach to autonomous path planning in a robotic brain surgery simulation. An AI agent learns to navigate a surgical needle through complex 3D brain vasculature, avoiding blood vessels while reaching tumor targets, using Q-Learning and Deep Q-Network (DQN) algorithms.

Key highlights:

  • 98–100% training success rate with tabular Q-Learning over 3,000 episodes
  • 80–100% generalization to unseen tumor targets
  • 0% vessel collision rate vs. 40% for straight-line approaches
  • ≥4 mm safety margin guaranteed from all blood vessels
  • Full integration with a Franka Panda robotic arm via Inverse Kinematics

For a more detailed write-up, see the project report: https://drive.google.com/file/d/1P-ZHq9-bp2RFAncW7hEp7lvw01aQN47K/view?usp=drive_link

🎥 Demo Videos (© Yaxita Amin & K. Manasanjani)

Without robot arm:

Video Project 5

Video Project 6

With robot arm and needle:

πŸ—οΈ System Architecture



βš™οΈ Installation

Prerequisites

  • Python 3.10+
  • Ubuntu (recommended) or macOS

Dependencies

Install the required packages (also listed in requirements.txt):

pip install pybullet trimesh scipy numpy torch matplotlib

πŸš€ Usage

Run Q-Learning (Path Planning + PyBullet Simulation)

python final13.py

Run DQN Agent

python brain_surgery_dqn.py

πŸ§ͺ Results

Training Performance

| Episode | Q-Learning | DQN | Epsilon |
|---------|------------|-------|---------|
| 500 | 84.8% | 95.2% | ~0.22 |
| 1000 | 98.2% | 98.0% | ~0.05 |
| 2000 | 99.6% | 97.2% | ~0.01 |
| 3000 | 99.6% | 98.8% | 0.01 |

Q-Learning vs DQN

| Metric | Q-Learning | DQN |
|--------|------------|-----|
| Training time | ~2 min | ~15 min |
| Training success | 94.6% | 89.2% |
| Test success | 80% | 75% |
| Path quality | 1.1× | 1.15× |
| Memory usage | 4,003 states | 128 KB model |
| Inference time | <1 ms | ~5 ms |

Path Planning Method Comparison

| Metric | Dijkstra/A* | RRT | Potential Fields | Q-Learning |
|--------|-------------|-----|------------------|------------|
| Computation | 30–35 s | 5–10 s | 2–5 s | ~2 min train |
| Path quality | 1.0–1.3× | 1.4–1.6× | 1.2–1.4× | ~1.1× |
| Success rate | 60–65% | 70% | 65–70% | 98–100% |
| Vessel safety | Occasional | Generally | >4 mm* | Always ≥4 mm |
| Reproducibility | High | Low | Medium | High |
| Learning | None | None | None | Yes |

πŸ€– RL Formulation

State Space

3D voxel grid position (2mm resolution) relative to tumor location.

Action Space

6 discrete moves: {+X, −X, +Y, −Y, +Z, −Z} (2 mm per step)
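The state and action spaces above can be sketched as follows. This is an illustrative reconstruction, not the repository's actual code: the helper names `to_state` and `step` are assumptions, with the 2 mm voxel resolution and six axis-aligned moves taken from the description.

```python
import numpy as np

VOXEL_MM = 2.0  # grid resolution from the state-space description

# The six discrete moves: +/-X, +/-Y, +/-Z, each one voxel (2 mm) per step.
ACTIONS = np.array([
    [ 1, 0, 0], [-1, 0, 0],
    [ 0, 1, 0], [ 0, -1, 0],
    [ 0, 0, 1], [ 0, 0, -1],
]) * VOXEL_MM

def to_state(pos_mm, tumor_mm):
    """Map a needle-tip position (mm) to an integer voxel index relative to the tumor."""
    rel = (np.asarray(pos_mm, dtype=float) - np.asarray(tumor_mm, dtype=float)) / VOXEL_MM
    return tuple(np.round(rel).astype(int))

def step(pos_mm, action_idx):
    """Apply one discrete move and return the new tip position (mm)."""
    return np.asarray(pos_mm, dtype=float) + ACTIONS[action_idx]
```

Keying the state on the tumor-relative voxel is what lets a tabular policy generalize to unseen tumor targets, since two scenes with the same relative geometry share Q-values.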

Reward Function

R(s, a, s') = R_goal + R_collision + R_timeout + R_shaping
  • R_goal = +100 for reaching the tumor
  • R_collision = −100 for vessel proximity < 4 mm
  • R_timeout = −50 for exceeding 100 steps
  • R_shaping = Δd, the reduction in distance to the target
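The reward terms above can be combined into a single function. This is a hedged sketch under assumptions: the goal tolerance `GOAL_MM` and the arguments (distances precomputed by the environment) are illustrative, not taken from the repository.

```python
R_GOAL, R_COLLISION, R_TIMEOUT = 100.0, -100.0, -50.0
SAFETY_MM = 4.0   # minimum vessel clearance from the spec above
MAX_STEPS = 100   # episode step limit
GOAL_MM = 2.0     # assumed "reached tumor" tolerance (one voxel)

def reward(prev_dist, new_dist, vessel_dist, step_count):
    """R = R_goal + R_collision + R_timeout + R_shaping; returns (reward, done)."""
    r = prev_dist - new_dist           # R_shaping = delta-d toward the tumor
    done = False
    if vessel_dist < SAFETY_MM:        # violated the >=4 mm vessel margin
        r += R_COLLISION
        done = True
    elif new_dist <= GOAL_MM:          # reached the tumor
        r += R_GOAL
        done = True
    elif step_count >= MAX_STEPS:      # episode timeout
        r += R_TIMEOUT
        done = True
    return r, done
```

Note the ordering: a collision terminates the episode even if the move also reached the goal, which is what enforces the safety margin during training.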

Hyperparameters

| Parameter | Value |
|-----------|-------|
| Learning rate α | 0.15 |
| Discount factor γ | 0.95 |
| Epsilon (start → end) | 1.0 → 0.01 |
| Epsilon decay | 0.997 |
| Max steps/episode | 100 |
| Training episodes | 3,000 |

🦾 Robotic Arm Integration

  • Robot: Franka Emika Panda (7 DOF, simulated in PyBullet)
  • IK Solver: PyBullet damped least-squares with joint limit handling
  • IK success rate: 98%
  • End-effector accuracy: <0.5mm positioning error
  • Average execution time: 8.5 seconds per path

πŸ“ Project Structure

brain_surgery_docker/
β”œβ”€β”€ data/                       # STL models & brain vasculature data (~7.5MB)
β”œβ”€β”€ final13.py                  # Q-Learning main script (path planning + simulation)
β”œβ”€β”€ brain_surgery_dqn.py        # Deep Q-Network implementation
β”œβ”€β”€ Dockerfile                  # Docker container configuration
β”œβ”€β”€ requirements.txt            # Python dependencies
└── README.md

🐳 Docker

A Dockerfile is included for containerized, reproducible execution.

Build the Image

docker build -t brain-surgery-rl .

Run the Container

docker run --rm brain-surgery-rl

Note: PyBullet GUI visualization requires passing through a display. On Linux, use:

docker run --rm -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix brain-surgery-rl

⚠️ Limitations

  • Static environment (no tissue deformation modeling)
  • Discrete action space limits path smoothness
  • Single-needle, straight-segment paths only
  • Simulated environment differs from real surgical conditions

πŸ”­ Future Work

  • Extend to continuous state/action spaces using actor-critic methods
  • Incorporate curved needle steering for challenging targets
  • Multi-objective optimization (path length, clearance, energy)
  • Dynamic replanning with real-time MRI/CT intraoperative feedback
  • Sim-to-real transfer for physical Franka Panda deployment
  • Training on diverse anatomical models for patient-agnostic planning


πŸ™ Acknowledgments

We thank Dr. Jerry Wu for guidance throughout this project, and teaching assistants Siddhant and Aswin for valuable feedback during development.


πŸ“œ License

This project is intended for educational and pre-operative planning research purposes only. Not for clinical use.
