Documentation ¶
Overview ¶
Package epub provides functionality for reading and parsing EPUB files.
It allows you to open EPUB files and extract their metadata, chapters, and other content. The package provides both high-level functions for common operations and low-level access to the internal structure of EPUB files.
Basic usage:
package main

import (
	"fmt"
	"io"
	"log"
	"os"

	"github.com/mszlu521/go-epub/epub"
)

func main() {
	// Open an EPUB file
	e, err := epub.Open("book.epub")
	if err != nil {
		log.Fatal(err)
	}
	defer e.Close()

	// Get book metadata
	fmt.Println("Title:", e.GetTitle())
	fmt.Println("Author:", e.GetAuthor())
	fmt.Println("Description:", e.GetDescription())

	// Get chapters
	chapters, err := e.GetChapters()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Found %d chapters\n", len(chapters))

	// Read content of the first chapter using io.Reader
	reader, err := e.GetChapterReader(0)
	if err != nil {
		log.Fatal(err)
	}

	// Copy the content to stdout
	_, err = io.Copy(os.Stdout, reader)
	if err != nil {
		log.Fatal(err)
	}
}
Index ¶
- type Chapter
- type Container
- type Epub
- func (e *Epub) Close() error
- func (e *Epub) GetAuthor() string
- func (e *Epub) GetChapterContent(chapterIndex int, opts ...Option) (string, error)
- func (e *Epub) GetChapterReader(chapterIndex int, opts ...Option) (io.Reader, error)
- func (e *Epub) GetChapters(opts ...Option) ([]Chapter, error)
- func (e *Epub) GetCover() (io.ReadCloser, error)
- func (e *Epub) GetDescription() string
- func (e *Epub) GetFileReader(path string) (io.ReadCloser, error)
- func (e *Epub) GetItems() []Item
- func (e *Epub) GetMetadata() Metadata
- func (e *Epub) GetTitle() string
- type Item
- type ItemRef
- type Metadata
- type NCX
- type NavPoint
- type Option
- type Package
- type Rootfile
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Chapter ¶
Chapter represents a book chapter
A Chapter contains the title, content, and order of a chapter in the EPUB file. Chapters are extracted based on the spine order defined in the EPUB package document.
type Container ¶
type Container struct {
	Rootfiles []Rootfile `xml:"rootfiles>rootfile"`
}
Container represents the container.xml file structure
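As an illustration of how this struct maps onto container.xml, the following is a minimal standard-library sketch, not the package's own parsing code. The Rootfile fields shown here (FullPath, MediaType) and the rootfilePath helper are assumptions, since the Rootfile type's fields are not listed in this documentation; the full-path attribute itself is part of the EPUB container format.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// Rootfile is a hypothetical mirror of the package's Rootfile type;
// its real fields are not shown in this documentation.
type Rootfile struct {
	FullPath  string `xml:"full-path,attr"`
	MediaType string `xml:"media-type,attr"`
}

// Container mirrors the struct documented above.
type Container struct {
	Rootfiles []Rootfile `xml:"rootfiles>rootfile"`
}

// sampleContainer is a typical META-INF/container.xml document.
const sampleContainer = `<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>`

// rootfilePath (hypothetical helper) returns the path of the first
// rootfile declared in container.xml.
func rootfilePath(data []byte) (string, error) {
	var c Container
	if err := xml.Unmarshal(data, &c); err != nil {
		return "", fmt.Errorf("parse container.xml: %w", err)
	}
	if len(c.Rootfiles) == 0 {
		return "", fmt.Errorf("container.xml declares no rootfile")
	}
	return c.Rootfiles[0].FullPath, nil
}

func main() {
	path, err := rootfilePath([]byte(sampleContainer))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("package document:", path)
}
```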
type Epub ¶
type Epub struct {
	File     *zip.Reader
	RootFile string
	Metadata Metadata
	Manifest []Item
	Spine    []ItemRef
	TOC      *NCX
	// contains filtered or unexported fields
}
Epub represents an EPUB file
The Epub struct contains the parsed contents of an EPUB file, including its metadata, manifest, spine, and table of contents. It also maintains a reference to the underlying zip.Reader for accessing the raw file contents.
func New ¶
New creates and parses an EPUB from a zip.Reader
New takes a zip.Reader and returns a pointer to an Epub struct holding the parsed contents of the EPUB. Use it when you already have a zip.Reader, for example one created with zip.NewReader over an open file, and want to parse it as an EPUB.
Example:
// Open the EPUB file; *os.File implements io.ReaderAt, which zip.NewReader requires
reader, err := os.Open("book.epub")
if err != nil {
	log.Fatal(err)
}
defer reader.Close()

// Get the file size, which zip.NewReader also requires
stat, err := reader.Stat()
if err != nil {
	log.Fatal(err)
}

// Create a zip reader
zipReader, err := zip.NewReader(reader, stat.Size())
if err != nil {
	log.Fatal(err)
}

// Parse as EPUB
e, err := epub.New(zipReader)
if err != nil {
	log.Fatal(err)
}
defer e.Close()
func NewReader ¶ added in v1.0.1
NewReader creates and parses an EPUB from an io.Reader
NewReader takes an io.Reader and returns a pointer to an Epub struct holding the parsed contents of the EPUB. Use it when the EPUB data is only available as a stream rather than as a file path or a zip.Reader. Note that this function reads the entire content into memory to provide the random access that ZIP parsing needs, so memory use grows with the size of the EPUB.
Example:
// When you have an io.Reader containing EPUB data
reader, err := os.Open("book.epub")
if err != nil {
	log.Fatal(err)
}
defer reader.Close()

// Parse as EPUB
e, err := epub.NewReader(reader)
if err != nil {
	log.Fatal(err)
}
defer e.Close()
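The note about reading the whole stream into memory follows from how ZIP archives work: zip.NewReader needs an io.ReaderAt and the total size. The sketch below shows one way to bridge an io.Reader to a *zip.Reader using only the standard library. It is an assumption about the general technique, not the package's actual implementation; zipFromReader and buildSampleArchive are hypothetical helper names.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"log"
)

// zipFromReader (hypothetical helper) buffers r fully in memory and
// opens the buffered bytes as a ZIP archive.
func zipFromReader(r io.Reader) (*zip.Reader, error) {
	data, err := io.ReadAll(r)
	if err != nil {
		return nil, fmt.Errorf("read archive: %w", err)
	}
	// bytes.Reader implements io.ReaderAt, giving zip the random access it needs.
	return zip.NewReader(bytes.NewReader(data), int64(len(data)))
}

// buildSampleArchive creates a tiny in-memory ZIP so the sketch runs
// without a real EPUB file.
func buildSampleArchive() (*bytes.Buffer, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	f, err := w.Create("mimetype")
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(f, "application/epub+zip"); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	buf, err := buildSampleArchive()
	if err != nil {
		log.Fatal(err)
	}
	zr, err := zipFromReader(buf)
	if err != nil {
		log.Fatal(err)
	}
	for _, f := range zr.File {
		fmt.Println("entry:", f.Name)
	}
}
```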
func Open ¶
Open opens and parses an EPUB file from a file path
The Open function takes a path to an EPUB file and returns a pointer to an Epub struct that represents the parsed contents of the file. It handles all the necessary parsing of the EPUB structure, including the container file, package document, and table of contents.
It is the caller's responsibility to call Close on the returned Epub when finished with it to free up resources.
Example:
e, err := epub.Open("book.epub")
if err != nil {
	log.Fatal(err)
}
defer e.Close()

title := e.GetTitle()
func (*Epub) Close ¶
Close closes the EPUB file
This method closes the underlying EPUB file and releases any associated resources. It should be called when finished working with the EPUB to prevent resource leaks.
Example:
e, err := epub.Open("book.epub")
if err != nil {
	log.Fatal(err)
}
defer e.Close() // Ensures the file is closed when done
func (*Epub) GetAuthor ¶
GetAuthor returns the book author
This method returns the creator/author of the EPUB book as defined in its metadata. If no author is defined in the EPUB metadata, an empty string is returned.
func (*Epub) GetChapterContent ¶
GetChapterContent returns the content of a specific chapter
This method returns the content of a chapter at the specified index as a string. The index is zero-based, so the first chapter is at index 0.
If the chapter index is out of range or an error occurs while retrieving the chapter content, an error is returned.
Example:
content, err := e.GetChapterContent(0)
if err != nil {
	log.Fatal(err)
}
fmt.Println(content)
func (*Epub) GetChapterReader ¶
GetChapterReader returns an io.Reader for a specific chapter
This method returns an io.Reader for the content of a chapter at the specified index. The index is zero-based, so the first chapter is at index 0.
This is useful when you want to stream the chapter content rather than load it entirely into memory. The returned reader can be used with standard Go io operations.
If the chapter index is out of range or an error occurs while retrieving the chapter content, an error is returned.
Example:
reader, err := e.GetChapterReader(0)
if err != nil {
	log.Fatal(err)
}

// Copy the content to stdout
_, err = io.Copy(os.Stdout, reader)
if err != nil {
	log.Fatal(err)
}
func (*Epub) GetChapters ¶
GetChapters returns all chapter content
This method extracts all chapters from the EPUB file based on the spine order defined in the package document. It only processes items with HTML media types and attempts to extract chapter titles from the table of contents.
The method returns a slice of Chapter structs containing the title, content, and order of each chapter. If the EPUB has no chapters or an error occurs during processing, the method may return an empty slice together with an error, so check the error before using the result.
Example:
chapters, err := e.GetChapters()
if err != nil {
	log.Fatal(err)
}
for _, chapter := range chapters {
	fmt.Printf("Chapter %d: %s\n", chapter.Order, chapter.Title)
}
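The spine-driven selection described above can be sketched without the library. This is an illustrative approximation under stated assumptions, not the package's code: ItemRef's IDRef field, the chapterItems and isHTML helpers, and the exact set of media types treated as HTML are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Item mirrors the documented manifest item (XML tags omitted here).
type Item struct {
	ID        string
	Href      string
	MediaType string
}

// ItemRef is a hypothetical mirror of a spine entry; the real type's
// fields are not shown in this documentation.
type ItemRef struct {
	IDRef string
}

// isHTML reports whether a media type looks like an (X)HTML document.
// The accepted set is an assumption.
func isHTML(mediaType string) bool {
	switch strings.ToLower(strings.TrimSpace(mediaType)) {
	case "application/xhtml+xml", "text/html":
		return true
	}
	return false
}

// chapterItems walks the spine in order, resolves each reference
// against the manifest, and keeps only HTML items.
func chapterItems(manifest []Item, spine []ItemRef) []Item {
	byID := make(map[string]Item, len(manifest))
	for _, it := range manifest {
		byID[it.ID] = it
	}
	var out []Item
	for _, ref := range spine {
		it, ok := byID[ref.IDRef]
		if !ok || !isHTML(it.MediaType) {
			continue // skip dangling references and non-HTML resources
		}
		out = append(out, it)
	}
	return out
}

func main() {
	manifest := []Item{
		{ID: "css", Href: "style.css", MediaType: "text/css"},
		{ID: "ch2", Href: "ch2.xhtml", MediaType: "application/xhtml+xml"},
		{ID: "ch1", Href: "ch1.xhtml", MediaType: "application/xhtml+xml"},
	}
	spine := []ItemRef{{IDRef: "ch1"}, {IDRef: "ch2"}}
	for i, it := range chapterItems(manifest, spine) {
		fmt.Printf("%d: %s\n", i, it.Href)
	}
}
```

Reading order comes from the spine rather than the manifest, which is why the manifest is indexed by ID first and then walked through the spine.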
func (*Epub) GetCover ¶
func (e *Epub) GetCover() (io.ReadCloser, error)
GetCover returns a reader for the cover image of the EPUB, if one exists
This method attempts to locate and return a reader for the cover image of the EPUB. Not all EPUBs have a cover image, and the location of the cover can vary between EPUB versions. If a cover image is found, an io.ReadCloser is returned which the caller must close. If no cover is found, nil is returned with no error.
Example:
cover, err := e.GetCover()
if err != nil {
	log.Fatal(err)
}
if cover != nil {
	defer cover.Close()
	// Process cover image
} else {
	fmt.Println("No cover image found")
}
func (*Epub) GetDescription ¶
GetDescription returns the book description
This method returns the description of the EPUB book as defined in its metadata. If no description is defined in the EPUB metadata, an empty string is returned.
func (*Epub) GetFileReader ¶
func (e *Epub) GetFileReader(path string) (io.ReadCloser, error)
GetFileReader returns an io.ReadCloser for a file in the EPUB by path
This method returns an io.ReadCloser for any file within the EPUB archive, identified by its path. The path should be relative to the root of the EPUB.
This is useful for accessing specific files within the EPUB, such as CSS files, images, or other resources. The caller is responsible for closing the returned ReadCloser when finished with it.
If the specified file is not found in the EPUB, an error is returned.
Example:
reader, err := e.GetFileReader("META-INF/container.xml")
if err != nil {
	log.Fatal(err)
}
defer reader.Close()

content, err := io.ReadAll(reader)
if err != nil {
	log.Fatal(err)
}
fmt.Println(string(content))
func (*Epub) GetItems ¶
GetItems returns all items in the EPUB manifest
This method returns the complete list of items declared in the EPUB manifest. Each item contains its ID, href (path), and media type. This can be useful for examining all resources included in the EPUB file.
Example:
items := e.GetItems()
for _, item := range items {
	fmt.Printf("ID: %s, Href: %s, MediaType: %s\n", item.ID, item.Href, item.MediaType)
}
func (*Epub) GetMetadata ¶
GetMetadata returns the complete metadata of the book
This method returns the complete metadata struct of the EPUB book, which includes all available metadata fields like title, author, subject, description, publisher, etc.
Example:
metadata := e.GetMetadata()
fmt.Println("Title:", metadata.Title)
fmt.Println("Author:", metadata.Creator)
fmt.Println("Publisher:", metadata.Publisher)
type Item ¶
type Item struct {
	ID        string `xml:"id,attr"`
	Href      string `xml:"href,attr"`
	MediaType string `xml:"media-type,attr"`
}
Item represents an item in the manifest
type Metadata ¶
type Metadata struct {
	Title       string `xml:"title"`
	Creator     string `xml:"creator"`
	Subject     string `xml:"subject"`
	Description string `xml:"description"`
	Publisher   string `xml:"publisher"`
	Contributor string `xml:"contributor"`
	Date        string `xml:"date"`
	Type        string `xml:"type"`
	Format      string `xml:"format"`
	Identifier  string `xml:"identifier"`
	Language    string `xml:"language"`
	Rights      string `xml:"rights"`
}
Metadata represents the metadata of an EPUB
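The XML tags on Metadata and Item suggest how these structs line up with an OPF package document. The sketch below decodes a small sample with encoding/xml to show that mapping. It is a simplified illustration rather than the package's parser: the packageDoc wrapper and parsePackage helper are hypothetical, and Metadata is trimmed to three of the documented fields.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// Metadata is a trimmed copy of the documented struct.
type Metadata struct {
	Title    string `xml:"title"`
	Creator  string `xml:"creator"`
	Language string `xml:"language"`
}

// Item mirrors the documented manifest item.
type Item struct {
	ID        string `xml:"id,attr"`
	Href      string `xml:"href,attr"`
	MediaType string `xml:"media-type,attr"`
}

// packageDoc is a hypothetical wrapper for the OPF root element.
type packageDoc struct {
	Metadata Metadata `xml:"metadata"`
	Manifest []Item   `xml:"manifest>item"`
}

const sampleOPF = `<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
    <dc:creator>Jane Doe</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ch1" href="ch1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
</package>`

// parsePackage (hypothetical helper) decodes an OPF document.
// Tags without a namespace, such as "title", match elements by local
// name, so the dc: prefixed elements decode into the plain fields.
func parsePackage(data []byte) (*packageDoc, error) {
	var doc packageDoc
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("parse package document: %w", err)
	}
	return &doc, nil
}

func main() {
	doc, err := parsePackage([]byte(sampleOPF))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Title:", doc.Metadata.Title)
	fmt.Println("Author:", doc.Metadata.Creator)
	for _, it := range doc.Manifest {
		fmt.Printf("%s -> %s (%s)\n", it.ID, it.Href, it.MediaType)
	}
}
```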
type NCX ¶
type NCX struct {
	Title string `xml:"docTitle>text"`
}
NCX represents the NCX file structure (table of contents)
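To show how an NCX document could supply chapter titles, the following sketch decodes a sample NCX and builds a lookup from content path to label. Only the Title field and its tag come from this documentation; the NavPoints field, the NavPoint layout, and the titlesByHref helper are assumptions made for illustration, and nested navPoint elements are ignored.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"strings"
)

// NavPoint is a hypothetical mirror of one table-of-contents entry.
type NavPoint struct {
	Label   string `xml:"navLabel>text"`
	Content struct {
		Src string `xml:"src,attr"`
	} `xml:"content"`
}

// NCX extends the documented struct with a hypothetical NavPoints field.
type NCX struct {
	Title     string     `xml:"docTitle>text"`
	NavPoints []NavPoint `xml:"navMap>navPoint"`
}

const sampleNCX = `<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <docTitle><text>Sample Book</text></docTitle>
  <navMap>
    <navPoint id="n1" playOrder="1">
      <navLabel><text>Chapter One</text></navLabel>
      <content src="ch1.xhtml"/>
    </navPoint>
    <navPoint id="n2" playOrder="2">
      <navLabel><text>Chapter Two</text></navLabel>
      <content src="ch2.xhtml#start"/>
    </navPoint>
  </navMap>
</ncx>`

// titlesByHref (hypothetical helper) maps each top-level navPoint's
// target file, with any #fragment removed, to its label.
func titlesByHref(data []byte) (map[string]string, error) {
	var ncx NCX
	if err := xml.Unmarshal(data, &ncx); err != nil {
		return nil, fmt.Errorf("parse NCX: %w", err)
	}
	titles := make(map[string]string, len(ncx.NavPoints))
	for _, np := range ncx.NavPoints {
		href := strings.SplitN(np.Content.Src, "#", 2)[0]
		titles[href] = strings.TrimSpace(np.Label)
	}
	return titles, nil
}

func main() {
	titles, err := titlesByHref([]byte(sampleNCX))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(titles["ch1.xhtml"])
	fmt.Println(titles["ch2.xhtml"])
}
```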
type Option ¶
type Option func(*epubOptions)
Option defines a functional option for configuring EPUB parsing
func WithChapterFilter ¶
WithChapterFilter sets a filter function for chapters
func WithContext ¶
WithContext sets the context for the EPUB parsing operation
func WithMaxContentLength ¶
WithMaxContentLength sets the maximum content length to process
func WithMetadata ¶
func WithMetadata() Option
WithMetadata includes full metadata in the parsed content
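The Option type follows Go's functional options pattern. Since the signatures of most With functions and the fields of epubOptions are not shown here, the sketch below uses a stand-in options struct and lowercase helper names to make clear that everything in it is hypothetical. It demonstrates the pattern only, not this package's actual options.

```go
package main

import (
	"context"
	"fmt"
)

// options is a hypothetical stand-in for the unexported epubOptions.
type options struct {
	ctx              context.Context
	maxContentLength int
	includeMetadata  bool
}

// Option has the same shape as the documented type, applied to the stand-in.
type Option func(*options)

// withContext, withMaxContentLength, and withMetadata are illustrative
// helpers; their parameters are assumptions.
func withContext(ctx context.Context) Option {
	return func(o *options) { o.ctx = ctx }
}

func withMaxContentLength(n int) Option {
	return func(o *options) { o.maxContentLength = n }
}

func withMetadata() Option {
	return func(o *options) { o.includeMetadata = true }
}

// applyOptions starts from defaults and applies each option in order,
// so later options override earlier ones.
func applyOptions(opts ...Option) options {
	o := options{ctx: context.Background()}
	for _, opt := range opts {
		if opt != nil {
			opt(&o)
		}
	}
	return o
}

func main() {
	o := applyOptions(
		withContext(context.Background()),
		withMaxContentLength(1<<20),
		withMetadata(),
	)
	fmt.Printf("max=%d metadata=%t\n", o.maxContentLength, o.includeMetadata)
}
```

With the real package, options are passed as trailing arguments to methods that accept opts ...Option, such as GetChapters, GetChapterContent, and GetChapterReader.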