Drop in a PDF, DOCX, URL, or Markdown file — AI generates natively editable PowerPoint presentations with real shapes, not images. Every text box, chart, and graphic is a real PowerPoint object you can click and edit. Supports PPT 16:9, social media cards, marketing posters, and 10+ other formats.
🔥 NEW: Native Editable PPTX — Generated presentations now contain real PowerPoint shapes (DrawingML) by default — text, charts, and graphics are directly editable in PowerPoint without any extra steps. No more "Convert to Shape"!
💡 Architecture Update: The project uses a Skill-based architecture:
- Lower Token Consumption & Model Dependency: Significantly reduced token consumption. Now, even non-Opus models can generate decent results.
- High Extensibility: The skills folder is organized according to the Agent Skills standard, with each subdirectory being a fully self-contained Skill. It can be natively invoked by dropping it into the skills directory of compatible AI clients (e.g., .claude/skills/ or ~/.claude/skills/ for Claude Code; the global skills directory referenced via .agent/workflows/ for Antigravity; .github/skills/ or ~/.copilot/skills/ for GitHub Copilot).
- Stable Fallback: Although the previous multi-platform architecture consumes more tokens, it has been more extensively tested. If you experience instability with the current version, you can always fall back to the last release of the old architecture: v1.3.0.
Online Examples: GitHub Pages Preview — See actual generated results
Example Library: examples/ · 15 projects · 229 pages
| Category | Project | Pages | Features |
|---|---|---|---|
| 🏢 Consulting Style | Attachment in Psychotherapy | 32 | Top consulting style, largest scale example |
| | Building Effective AI Agents | 15 | Anthropic engineering blog, AI Agent architecture |
| | Chongqing Regional Report | 20 | Regional fiscal analysis |
| | Ganzi Prefecture Economic Analysis | 17 | Government fiscal analysis, Tibetan cultural elements |
| 🎨 General Flexible | Debug Six-Step Method | 10 | Dark tech style |
| | Chongqing University Thesis Format | 11 | Academic standards guide |
| ✨ Creative Style | I Ching Qian Hexagram Study | 20 | I Ching aesthetics, Yin-Yang design |
| | Diamond Sutra Chapter 1 Study | 15 | Zen academic, ink wash whitespace |
| | Git Introduction Guide | 10 | Pixel retro game style |
📖 View Complete Examples Documentation
User Input (PDF/DOCX/URL/Markdown)
↓
[Source Content Conversion] → pdf_to_md.py / doc_to_md.py / web_to_md.py
↓
[Create Project] → project_manager.py init <project_name> --format <format>
↓
[Template Option] A) Use existing template B) No template
↓
[Need New Template?] → Use /create-template workflow separately
↓
[Strategist] - Eight Confirmations & Design Specifications
↓
[Image_Generator] (When AI generation is selected)
↓
[Executor] - Two-Phase Generation
├── Visual Construction Phase: Generate all SVG pages → svg_output/
└── Logic Construction Phase: Generate complete speaker notes → notes/total.md
↓
[Post-processing] → total_md_split.py (split notes) → finalize_svg.py → svg_to_pptx.py
↓
Output: Native editable PPTX with real shapes + SVG reference PPTX (auto-embeds speaker notes)
↓
[Optimizer_CRAP] (Optional, only if the first draft is unsatisfactory)
↓
If optimized: re-run post-processing and export
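The Source Content Conversion step at the top of the flow above can be sketched as a small dispatcher that picks a converter script from the input type. This is only an illustration: pick_converter and the CONVERTERS table are hypothetical names, and the sketch assumes web_to_md.py lives in the same scripts directory as pdf_to_md.py and doc_to_md.py. Markdown input is assumed to need no conversion.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Scripts directory as used by the commands in this README.
SCRIPTS_DIR = "skills/ppt-master/scripts"

# Hypothetical mapping from file extension to converter script.
CONVERTERS = {
    ".pdf": "pdf_to_md.py",
    ".docx": "doc_to_md.py",
}


def pick_converter(source: str) -> Optional[str]:
    """Return the converter script for a source, or None if it is already Markdown."""
    # URLs go to the web converter (assumed to sit next to the other scripts).
    if urlparse(source).scheme in ("http", "https"):
        return f"{SCRIPTS_DIR}/web_to_md.py"

    suffix = PurePosixPath(source).suffix.lower()
    if suffix in (".md", ".markdown"):
        return None  # Markdown is used as-is
    if suffix in CONVERTERS:
        return f"{SCRIPTS_DIR}/{CONVERTERS[suffix]}"
    raise ValueError(f"Unsupported source type: {source}")
```

For example, pick_converter("report.pdf") returns the path to pdf_to_md.py, while pick_converter("notes.md") returns None.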
| Document | Description |
|---|---|
| 🧭 AGENTS.md | Repository-level entry overview for general AI agents |
| 📖 SKILL.md | Canonical ppt-master workflow and rules |
| 🎨 Design Guidelines | Colors, typography, and layout specifications |
| 📐 Canvas Formats | PPT, Xiaohongshu (RED), WeChat Moments, and 10+ formats |
| 🖼️ Image Embedding Guide | SVG image embedding best practices |
| 📊 Chart Template Library | Standardized chart templates |
| 🔧 Role Definitions | Role definitions and technical references |
| 🛠️ Toolset | Usage instructions for all tools |
| 💼 Examples Index | 15 projects, 229 SVG pages of examples |
This project requires Python 3.8+ for running PDF conversion, SVG post-processing, PPTX export, and other tools.
| Platform | Recommended Installation |
|---|---|
| macOS | Use Homebrew: brew install python |
| Windows | Download installer from Python Official Website |
| Linux | Use package manager: sudo apt install python3 python3-pip (Ubuntu/Debian) |
💡 Verify Installation: Run python3 --version to confirm the version is ≥ 3.8
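The same version check can be done from inside Python, which may be handy in a setup script. A minimal sketch, assuming the 3.8 minimum stated above; python_ok and MIN_PYTHON are hypothetical names and not part of the project.

```python
import sys

# Minimum version stated in this README.
MIN_PYTHON = (3, 8)


def python_ok(version=None) -> bool:
    """Return True if the given (major, minor, ...) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_ok() else "too old, 3.8+ required"
    print(f"Python {current}: {status}")
```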
If you need to use the web_to_md.cjs tool (for converting web pages from WeChat and other high-security sites), install Node.js.
| Platform | Recommended Installation |
|---|---|
| macOS | Use Homebrew: brew install node |
| Windows | Download LTS version from Node.js Official Website |
| Linux | Use NodeSource: curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash - && sudo apt-get install -y nodejs |
💡 Verify Installation: Run node --version to confirm the version is ≥ 18
If you need to use the doc_to_md.py tool (for converting DOCX, EPUB, LaTeX, and other document formats to Markdown), install Pandoc.
| Platform | Recommended Installation |
|---|---|
| macOS | Use Homebrew: brew install pandoc |
| Windows | Download installer from Pandoc Official Website |
| Linux | Use package manager: sudo apt install pandoc (Ubuntu/Debian) |
💡 Verify Installation: Run pandoc --version to confirm it is installed
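Since Node.js and Pandoc are only needed for specific tools, it can help to check which of them are on PATH before running a conversion. A minimal sketch using the standard library; missing_tools and OPTIONAL_TOOLS are hypothetical names, and the sketch only tests for presence, not for a minimum version.

```python
import shutil

# Optional executables and the project tool that depends on each (per this README).
OPTIONAL_TOOLS = {
    "node": "web_to_md.cjs",
    "pandoc": "doc_to_md.py",
}


def missing_tools(which=shutil.which):
    """Return {executable: dependent_tool} for every executable not found on PATH."""
    return {
        exe: tool
        for exe, tool in OPTIONAL_TOOLS.items()
        if which(exe) is None
    }


if __name__ == "__main__":
    for exe, tool in missing_tools().items():
        print(f"{exe} not found; install it if you need {tool}")
```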
git clone https://github.com/hugohe3/ppt-master.git
cd ppt-master
pip install -r requirements.txt
If you encounter permission issues, use pip install --user -r requirements.txt or install in a virtual environment.
Recommended AI editors:
| Tool | Rating | Description |
|---|---|---|
| Claude Code | ⭐⭐⭐ | Highly Recommended! Anthropic official CLI, native Opus support, largest context window |
| Codebuddy IDE | ⭐⭐ | Great Chinese AI IDE, good support for local models like Kimi 2.5 and MiniMax 2.7 |
| Cursor | ⭐⭐ | Mainstream AI editor, great experience but relatively expensive |
| VS Code + Copilot | ⭐⭐ | Microsoft official solution, cost-effective, but limited context window (200k, 35% reserved for output) |
| Antigravity | ⭐ | Free but very limited quota and unstable. Alternative only. |
Open the AI chat panel in your editor and describe what content you want to create:
User: I have a Q3 quarterly report that needs to be made into a PPT
AI: Sure. First we'll confirm whether to use a template; after that Strategist will
continue with the eight confirmations and generate the design spec.
[Template Option] [Recommended] B) No template
[Strategist] 1. Canvas format: [Recommended] PPT 16:9
[Strategist] 2. Page count: [Recommended] 8-10 pages
...
💡 Model Recommendation: Claude Opus works best, but most mainstream models today (like Kimi 2.5 and MiniMax 2.7, tested via Codebuddy IDE) can also generate decent results with only minor gaps in layout details. Due to the instability of Opus on some IDEs (like Antigravity), trying other stable AI clients is recommended.
📝 Post-Export Editing: The default exported PPTX (.pptx) contains native PowerPoint shapes: text, graphics, and colors are directly editable, with no extra steps needed. A second SVG reference file (_svg.pptx) is also generated; for that version, select the content in PowerPoint and use "Convert to Shape" to edit. Requires Office 2016 or later.
💡 AI Lost Context? Ask the AI to read skills/ppt-master/SKILL.md first; use AGENTS.md as the repository-level entry overview.
The nano_banana_gen.py tool can generate high-quality images via the Gemini API directly within AI clients. Configure the following environment variables before use:
# Required: Gemini API Key (obtain from https://aistudio.google.com/apikey)
export GEMINI_API_KEY="your-api-key"
# Optional: Custom API endpoint (for proxy services)
export GEMINI_BASE_URL="https://your-proxy-url.com/v1beta"
💡 Persist settings: Add the export commands above to ~/.zshrc (macOS/Linux zsh) or ~/.bashrc (Linux bash), then restart your terminal.
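To confirm the variables are visible to Python before invoking the image tool, something like the following could be used. This is a sketch under assumptions: load_gemini_settings is a hypothetical helper, and how nano_banana_gen.py actually reads these variables is not shown here. Only the two variable names come from the configuration above.

```python
import os
from typing import Mapping, Optional, Tuple


def load_gemini_settings(
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Optional[str]]:
    """Return (api_key, base_url); base_url is None when no custom endpoint is set."""
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        # The key is required; fail early with a readable message.
        raise RuntimeError("GEMINI_API_KEY is not set")

    # The endpoint is optional and only needed for proxy services.
    base_url = env.get("GEMINI_BASE_URL", "").strip() or None
    return api_key, base_url
```

Passing a plain dictionary instead of os.environ makes the check easy to try without touching the real environment.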
💡 If using the Antigravity proxy, pass the model parameter (-m gemini-3.1-flash-image).
💡 AI Image Generation Tip: For AI-generated images, we recommend generating them in Gemini and selecting Download full size for higher resolution. Gemini images have a star watermark in the bottom right corner, which can be removed using gemini-watermark-remover or this project's skills/ppt-master/scripts/gemini_watermark_remover.py.
ppt-master/
├── skills/
│ └── ppt-master/ # Main skill source
│ ├── SKILL.md # Main entry: workflow definition
│ ├── workflows/ # Workflow entry files
│ ├── references/ # Role definitions and specs
│ ├── scripts/ # Tool scripts
│ └── templates/ # Layouts, charts, icons
├── examples/ # Example projects
├── projects/ # User project workspace
├── AGENTS.md # General AI agent entry
└── CLAUDE.md # Dedicated Claude Code CLI entry
# Initialize project
python3 skills/ppt-master/scripts/project_manager.py init <project_name> --format ppt169
# Archive source materials into the project folder
python3 skills/ppt-master/scripts/project_manager.py import-sources <project_path> <source_file_or_url...>
# Note: files outside the workspace are copied by default; files already in the workspace are moved into sources/
# PDF to Markdown
python3 skills/ppt-master/scripts/pdf_to_md.py <PDF_file>
# DOCX / Office documents to Markdown (requires pandoc)
python3 skills/ppt-master/scripts/doc_to_md.py <DOCX_file>
# Post-processing (run in order)
python3 skills/ppt-master/scripts/total_md_split.py <project_path>
python3 skills/ppt-master/scripts/finalize_svg.py <project_path>
python3 skills/ppt-master/scripts/svg_to_pptx.py <project_path> -s final
# Default: generates two files — native shapes (.pptx) + SVG reference (_svg.pptx)
# Use --only native to skip SVG reference version
# Use --only legacy to only generate SVG image version
📖 For complete tool documentation, see Tools Usage Guide
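Because the three post-processing scripts must run in order, a small wrapper that stops at the first failure can be convenient. The sketch below only assembles and runs the commands listed above; postprocess_commands and run_postprocess are hypothetical helpers, not part of the project, and the sketch assumes it is run from the repository root.

```python
import subprocess
import sys
from typing import List

# Scripts directory as used by the commands in this README.
SCRIPTS_DIR = "skills/ppt-master/scripts"


def postprocess_commands(project_path: str, python: str = sys.executable) -> List[List[str]]:
    """Build the post-processing commands in the required order."""
    return [
        [python, f"{SCRIPTS_DIR}/total_md_split.py", project_path],
        [python, f"{SCRIPTS_DIR}/finalize_svg.py", project_path],
        [python, f"{SCRIPTS_DIR}/svg_to_pptx.py", project_path, "-s", "final"],
    ]


def run_postprocess(project_path: str) -> None:
    """Run each step in turn; check=True raises on the first non-zero exit code."""
    for cmd in postprocess_commands(project_path):
        subprocess.run(cmd, check=True)
```

Calling run_postprocess with a project path runs the split, finalize, and export steps in sequence and raises subprocess.CalledProcessError if any step fails, so a broken intermediate step does not silently produce a stale PPTX.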
Q: Can I edit the generated presentations?
Yes! The default export (.pptx) produces native PowerPoint shapes — all text, graphics, and colors are directly editable in PowerPoint without any conversion. An SVG reference version (_svg.pptx) is also generated; for that file, select the content and use "Convert to Shape" to unlock editing. Requires Office 2016 or later.
Q: What's the difference between the three Executors?
- Executor_General: General scenarios, flexible layout
- Executor_Consultant: General consulting, data visualization
- Executor_Consultant_Top: Top consulting (MBB level), 5 core techniques
Q: Is Optimizer_CRAP required?
No. Only use it when you need to optimize the visual effects of key pages.
Contributions are welcome!
- Fork this repository
- Create your branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Contribution Areas: 🎨 Design templates · 📊 Chart components · 📝 Documentation · 🐛 Bug reports · 💡 Feature suggestions
This project is licensed under the MIT License.
- SVG Repo - Open source icon library
- Robin Williams - CRAP design principles
- McKinsey, Boston Consulting, Bain - Design inspiration
- Issue: GitHub Issues
- GitHub: @hugohe3
If this project helps you, please give it a ⭐ Star!
Made with ❤️ by Hugo He