
RayLam2022/ppt-master

 
 


PPT Master — AI generates natively editable PPTX from any document



Drop in a PDF, DOCX, URL, or Markdown file — AI generates natively editable PowerPoint presentations with real shapes, not images. Every text box, chart, and graphic is a real PowerPoint object you can click and edit. Supports PPT 16:9, social media cards, marketing posters, and 10+ other formats.

🔥 NEW: Native Editable PPTX — Generated presentations now contain real PowerPoint shapes (DrawingML) by default — text, charts, and graphics are directly editable in PowerPoint without any extra steps. No more "Convert to Shape"!

💡 Architecture Update: The project uses a Skill-based architecture:

  1. Lower Token Consumption & Model Dependency: Significantly reduced token consumption. Now, even non-Opus models can generate decent results.
  2. High Extensibility: The skills folder is organized according to the Agent Skills standard, with each subdirectory being a fully self-contained Skill. It can be natively invoked by dropping it into the skills directory of compatible AI clients (e.g., .claude/skills/ or ~/.claude/skills/ for Claude Code; global skills directory referenced via .agent/workflows/ for Antigravity; .github/skills/ or ~/.copilot/skills/ for GitHub Copilot).
  3. Stable Fallback: Although the previous multi-platform architecture consumes more tokens, it has been more extensively tested. If the current version proves unstable, you can fall back to v1.3.0, the last release of the old architecture.
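
As a rough illustration of "dropping" the Skill into a client's skills directory, the sketch below copies skills/ppt-master from a checkout into a target directory. The helper name install_skill is hypothetical and not part of this repository; the only facts taken from this README are the skills/ppt-master/ layout and the presence of SKILL.md.

```python
import shutil
from pathlib import Path


def install_skill(repo_root: Path, skills_dir: Path, name: str = "ppt-master") -> Path:
    """Copy skills/<name> from a repository checkout into a client's skills directory.

    Hypothetical helper for illustration; the project itself does not ship it.
    """
    source = repo_root / "skills" / name
    # SKILL.md is the skill's main entry, so its absence means a wrong path.
    if not (source / "SKILL.md").is_file():
        raise FileNotFoundError(f"{source} does not look like a skill (no SKILL.md)")
    target = skills_dir / name
    # dirs_exist_ok (Python 3.8+) lets a re-run refresh an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

For Claude Code, a call such as `install_skill(Path("."), Path.home() / ".claude" / "skills")` would place the copy under ~/.claude/skills/ppt-master.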

Online Examples: GitHub Pages Preview — See actual generated results


🎴 Featured Examples

Example Library: examples/ · 15 projects · 229 pages

| Category | Project | Pages | Features |
|---|---|---|---|
| 🏢 Consulting Style | Attachment in Psychotherapy | 32 | Top consulting style, largest scale example |
| | Building Effective AI Agents | 15 | Anthropic engineering blog, AI Agent architecture |
| | Chongqing Regional Report | 20 | Regional fiscal analysis |
| | Ganzi Prefecture Economic Analysis | 17 | Government fiscal analysis, Tibetan cultural elements |
| 🎨 General Flexible | Debug Six-Step Method | 10 | Dark tech style |
| | Chongqing University Thesis Format | 11 | Academic standards guide |
| Creative Style | I Ching Qian Hexagram Study | 20 | I Ching aesthetics, Yin-Yang design |
| | Diamond Sutra Chapter 1 Study | 15 | Zen academic, ink wash whitespace |
| | Git Introduction Guide | 10 | Pixel retro game style |

📖 View Complete Examples Documentation


🏗️ System Architecture

User Input (PDF/DOCX/URL/Markdown)
    ↓
[Source Content Conversion] → pdf_to_md.py / doc_to_md.py / web_to_md.py
    ↓
[Create Project] → project_manager.py init <project_name> --format <format>
    ↓
[Template Option] A) Use existing template B) No template
    ↓
[Need New Template?] → Use /create-template workflow separately
    ↓
[Strategist] - Eight Confirmations & Design Specifications
    ↓
[Image_Generator] (When AI generation is selected)
    ↓
[Executor] - Two-Phase Generation
    ├── Visual Construction Phase: Generate all SVG pages → svg_output/
    └── Logic Construction Phase: Generate complete speaker notes → notes/total.md
    ↓
[Post-processing] → total_md_split.py (split notes) → finalize_svg.py → svg_to_pptx.py
    ↓
Output: Native editable PPTX with real shapes + SVG reference PPTX (auto-embeds speaker notes)
    ↓
[Optimizer_CRAP] (Optional, only if the first draft is unsatisfactory)
    ↓
If optimized: re-run post-processing and export
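
The first stage of the flow above, source content conversion, amounts to choosing a converter by input type. The following is a minimal sketch of that dispatch, not the project's own logic. The function name pick_converter and the extension list are assumptions, and web_to_md.py is assumed to live in the same scripts directory as the other converters.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

SCRIPTS = "skills/ppt-master/scripts"

# Assumption: extensions routed to doc_to_md.py (the README names DOCX, EPUB, LaTeX).
DOC_EXTENSIONS = {".docx", ".epub", ".tex"}


def pick_converter(source: str) -> Optional[str]:
    """Return the converter script for a source, or None if it is already Markdown."""
    if urlparse(source).scheme in ("http", "https"):
        return f"{SCRIPTS}/web_to_md.py"
    suffix = Path(source).suffix.lower()
    if suffix == ".pdf":
        return f"{SCRIPTS}/pdf_to_md.py"
    if suffix in DOC_EXTENSIONS:
        return f"{SCRIPTS}/doc_to_md.py"
    if suffix in (".md", ".markdown"):
        return None  # Markdown needs no conversion step
    raise ValueError(f"no converter known for {source!r}")
```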

📚 Documentation Navigation

| Document | Description |
|---|---|
| 🧭 AGENTS.md | Repository-level entry overview for general AI agents |
| 📖 SKILL.md | Canonical ppt-master workflow and rules |
| 🎨 Design Guidelines | Colors, typography, and layout specifications |
| 📐 Canvas Formats | PPT, Xiaohongshu (RED), WeChat Moments, and 10+ formats |
| 🖼️ Image Embedding Guide | SVG image embedding best practices |
| 📊 Chart Template Library | Standardized chart templates |
| 🔧 Role Definitions | Role definitions and technical references |
| 🛠️ Toolset | Usage instructions for all tools |
| 💼 Examples Index | 15 projects, 229 SVG pages of examples |

🚀 Quick Start

1. Configure Environment

Python Environment (Required)

This project requires Python 3.8+ for running PDF conversion, SVG post-processing, PPTX export, and other tools.

Platform Recommended Installation
macOS Use Homebrew: brew install python
Windows Download installer from Python Official Website
Linux Use package manager: sudo apt install python3 python3-pip (Ubuntu/Debian)

💡 Verify Installation: Run python3 --version to confirm version ≥ 3.8

Node.js Environment (Optional)

If you need to use the web_to_md.cjs tool (for converting web pages from WeChat and other high-security sites), install Node.js.

Platform Recommended Installation
macOS Use Homebrew: brew install node
Windows Download LTS version from Node.js Official Website
Linux Use NodeSource: curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash - && sudo apt-get install -y nodejs

💡 Verify Installation: Run node --version to confirm version ≥ 18

Pandoc (Optional)

If you need to use the doc_to_md.py tool (for converting DOCX, EPUB, LaTeX, and other document formats to Markdown), install Pandoc.

Platform Recommended Installation
macOS Use Homebrew: brew install pandoc
Windows Download installer from Pandoc Official Website
Linux Use package manager: sudo apt install pandoc (Ubuntu/Debian)

💡 Verify Installation: Run pandoc --version to confirm it is installed
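
The three verification steps above can also be run as a single preflight check. This is a minimal sketch with a hypothetical function name, check_environment. It confirms the Python version and only tests whether node and pandoc are on PATH; it does not check that Node.js is version 18 or later.

```python
import shutil
import sys


def check_environment(min_python=(3, 8)):
    """Report which prerequisites from the setup section are available."""
    return {
        # Required: Python 3.8+
        "python": sys.version_info >= min_python,
        # Optional: only needed for web_to_md.cjs
        "node": shutil.which("node") is not None,
        # Optional: only needed for doc_to_md.py
        "pandoc": shutil.which("pandoc") is not None,
    }
```

Printing the result of `check_environment()` gives a quick view of which optional tools are missing before you start a project.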

2. Clone Repository and Install Dependencies

git clone https://github.com/hugohe3/ppt-master.git
cd ppt-master
pip install -r requirements.txt

If you encounter permission issues, use pip install --user -r requirements.txt or install in a virtual environment.

3. Open AI Editor

Recommended AI editors:

| Tool | Rating | Description |
|---|---|---|
| Claude Code | ⭐⭐⭐ | Highly Recommended! Anthropic official CLI, native Opus support, largest context window |
| Codebuddy IDE | ⭐⭐ | Great Chinese AI IDE, good support for local models like Kimi 2.5 and MiniMax 2.7 |
| Cursor | ⭐⭐ | Mainstream AI editor, great experience but relatively expensive |
| VS Code + Copilot | ⭐⭐ | Microsoft official solution, cost-effective, but limited context window (200k, 35% reserved for output) |
| Antigravity | | Free but very limited quota and unstable. Alternative only. |

4. Start Creating

Open the AI chat panel in your editor and describe what content you want to create:

User: I have a Q3 quarterly report that needs to be made into a PPT

AI: Sure. First we'll confirm whether to use a template; after that Strategist will
   continue with the eight confirmations and generate the design spec.
   [Template Option] [Recommended] B) No template
   [Strategist] 1. Canvas format: [Recommended] PPT 16:9
   [Strategist] 2. Page count: [Recommended] 8-10 pages
   ...

💡 Model Recommendation: Claude Opus works best, but most mainstream models today (such as Kimi 2.5 and MiniMax 2.7, tested via Codebuddy IDE) also generate decent results, with only minor gaps in layout details. Opus can be unstable in some IDEs (such as Antigravity); if you run into problems there, switch to a more stable AI client.

📝 Post-Export Editing: The default exported PPTX (.pptx) contains native PowerPoint shapes — text, graphics, and colors are directly editable, no extra steps needed. A second SVG reference file (_svg.pptx) is also generated; for that version, select the content in PowerPoint and use "Convert to Shape" to edit. Requires Office 2016 or later.
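
To tell the two exports apart without opening PowerPoint, you can count elements inside the file, since a .pptx is a ZIP archive of XML parts. The sketch below counts `p:sp` (shape) and `p:pic` (picture) elements per deck. It assumes the native export is shape-heavy and the SVG reference version is picture-heavy; what svg_to_pptx.py actually emits is not confirmed here, and connectors and group containers are not counted.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# PresentationML namespace used by slide parts in Office Open XML.
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
SLIDE_RE = re.compile(r"ppt/slides/slide\d+\.xml")


def count_slide_objects(pptx_path):
    """Count shape and picture elements across all slides of a .pptx file."""
    counts = {"shapes": 0, "pictures": 0}
    with zipfile.ZipFile(pptx_path) as archive:
        for name in archive.namelist():
            if not SLIDE_RE.fullmatch(name):
                continue  # skip relationships, layouts, media, etc.
            root = ET.fromstring(archive.read(name))
            counts["shapes"] += len(root.findall(f".//{{{P_NS}}}sp"))
            counts["pictures"] += len(root.findall(f".//{{{P_NS}}}pic"))
    return counts
```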

💡 AI Lost Context? Ask the AI to read skills/ppt-master/SKILL.md first; use AGENTS.md as the repository-level entry overview.

5. Gemini Image Generation (Optional)

The nano_banana_gen.py tool can generate high-quality images via the Gemini API directly within AI clients. Configure the following environment variables before use:

# Required: Gemini API Key (obtain from https://aistudio.google.com/apikey)
export GEMINI_API_KEY="your-api-key"

# Optional: Custom API endpoint (for proxy services)
export GEMINI_BASE_URL="https://your-proxy-url.com/v1beta"

💡 Persist settings: Add the export commands above to ~/.zshrc (macOS/Linux zsh) or ~/.bashrc (Linux bash), then restart your terminal.
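
If you want to confirm the variables are visible to Python before invoking the tool, a small check like the one below can help. It is a sketch only: load_gemini_settings is a hypothetical helper, and how nano_banana_gen.py itself reads these variables is not shown in this README.

```python
import os


def load_gemini_settings(env=None):
    """Read the Gemini settings described above from the environment."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        # The key is required; fail early with an actionable message.
        raise RuntimeError("GEMINI_API_KEY is not set; export it before generating images")
    # The base URL is optional and only needed for proxy services.
    base_url = env.get("GEMINI_BASE_URL", "").strip() or None
    return {"api_key": api_key, "base_url": base_url}
```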

💡 If using the Antigravity proxy, pass the model parameter (-m gemini-3.1-flash-image).

💡 AI Image Generation Tip: For AI-generated images, we recommend generating them in Gemini and selecting Download full size for higher resolution. Gemini images have a star watermark in the bottom right corner, which can be removed using gemini-watermark-remover or this project's skills/ppt-master/scripts/gemini_watermark_remover.py.


📁 Project Structure

ppt-master/
├── skills/
│   └── ppt-master/                 # Main skill source
│       ├── SKILL.md                #   Main entry: workflow definition
│       ├── workflows/              #   Workflow entry files
│       ├── references/             #   Role definitions and specs
│       ├── scripts/                #   Tool scripts
│       └── templates/              #   Layouts, charts, icons
├── examples/                       # Example projects
├── projects/                       # User project workspace
├── AGENTS.md                       # General AI agent entry
└── CLAUDE.md                       # Dedicated Claude Code CLI entry

🛠️ Common Commands

# Initialize project
python3 skills/ppt-master/scripts/project_manager.py init <project_name> --format ppt169

# Archive source materials into the project folder
python3 skills/ppt-master/scripts/project_manager.py import-sources <project_path> <source_file_or_url...>

# Note: files outside the workspace are copied by default; files already in the workspace are moved into sources/

# PDF to Markdown
python3 skills/ppt-master/scripts/pdf_to_md.py <PDF_file>

# DOCX / Office documents to Markdown (requires pandoc)
python3 skills/ppt-master/scripts/doc_to_md.py <DOCX_file>

# Post-processing (run in order)
python3 skills/ppt-master/scripts/total_md_split.py <project_path>
python3 skills/ppt-master/scripts/finalize_svg.py <project_path>
python3 skills/ppt-master/scripts/svg_to_pptx.py <project_path> -s final
# Default: generates two files — native shapes (.pptx) + SVG reference (_svg.pptx)
# Use --only native  to skip SVG reference version
# Use --only legacy  to only generate SVG image version
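
Because the three post-processing scripts must run in order, it can be convenient to wrap them in one call that stops at the first failure. The wrapper below is a minimal sketch, not part of the project; the function names are hypothetical, while the script paths and the -s final and --only flags are taken from the commands above.

```python
import subprocess
import sys

SCRIPTS = "skills/ppt-master/scripts"


def postprocess_commands(project_path, only=None):
    """Build the three post-processing commands in the required order."""
    if only not in (None, "native", "legacy"):
        raise ValueError("only must be None, 'native', or 'legacy'")
    export = [sys.executable, f"{SCRIPTS}/svg_to_pptx.py", project_path, "-s", "final"]
    if only is not None:
        export += ["--only", only]
    return [
        [sys.executable, f"{SCRIPTS}/total_md_split.py", project_path],
        [sys.executable, f"{SCRIPTS}/finalize_svg.py", project_path],
        export,
    ]


def run_postprocess(project_path, only=None):
    """Run each step from the repository root; check=True aborts on the first failure."""
    for command in postprocess_commands(project_path, only):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes the order easy to inspect or log before anything runs.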

📖 For complete tool documentation, see Tools Usage Guide


❓ FAQ

Q: Can I edit the generated presentations?

Yes! The default export (.pptx) produces native PowerPoint shapes — all text, graphics, and colors are directly editable in PowerPoint without any conversion. An SVG reference version (_svg.pptx) is also generated; for that file, select the content and use "Convert to Shape" to unlock editing. Requires Office 2016 or later.

Q: What's the difference between the three Executors?

  • Executor_General: General scenarios, flexible layout
  • Executor_Consultant: General consulting, data visualization
  • Executor_Consultant_Top: Top consulting (MBB level), 5 core techniques

Q: Is Optimizer_CRAP required?

No. Only use it when you need to optimize the visual effects of key pages.

📖 For more questions, see SKILL.md and AGENTS.md


🤝 Contributing

Contributions are welcome!

  1. Fork this repository
  2. Create your branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Contribution Areas: 🎨 Design templates · 📊 Chart components · 📝 Documentation · 🐛 Bug reports · 💡 Feature suggestions


📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

  • SVG Repo - Open source icon library
  • Robin Williams - CRAP design principles
  • McKinsey, Boston Consulting, Bain - Design inspiration


🌟 Star History

If this project helps you, please give it a ⭐ Star!


Made with ❤️ by Hugo He
