This document provides an overview of the striprtf R package, which extracts plain text from Rich Text Format (RTF) files and strings. The package is designed to handle diverse RTF sources, character encodings, and complex document structures including tables.
For detailed usage instructions, see User Guide. For complete function documentation, see API Reference. For implementation details, see Implementation Details.
The striprtf package serves as a specialized text extraction tool for RTF documents. RTF is a proprietary document file format developed by Microsoft that allows cross-platform document interchange while preserving formatting information. This package strips away the formatting markup to extract clean, plain text content.
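The package addresses common challenges in RTF processing, including RTF produced by diverse sources, varied character encodings, and complex document structures such as tables.
Sources: DESCRIPTION:8, README.md:11, README.md:30-34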
| Feature | Description | Functions |
|---|---|---|
| File Processing | Extract text from RTF files on disk | read_rtf |
| String Processing | Extract text from RTF content in memory | strip_rtf |
| Table Support | Customizable table formatting with row and cell delimiters | read_rtf, strip_rtf with table parameters |
| Input Validation | Check if files are valid RTF format | looks_rtf |
| Internationalization | Support for multiple character encodings and code pages | Built into core functions |
| Performance | C++ implementation for efficient parsing | Core processing engine |
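The three functions in the table can be tried together in a short session. This is a minimal sketch, not the package's official example: the RTF sample string and the temporary file are made up for illustration, and it assumes `strip_rtf()` takes the RTF content as its first argument while `read_rtf()` and `looks_rtf()` take a file path.

```r
library(striprtf)

# Illustrative RTF content held in memory (hypothetical sample, not from the package)
rtf <- "{\\rtf1\\ansi Hello, world!\\par}"

# String processing: strip the markup from in-memory RTF content
plain <- strip_rtf(rtf)

# File processing: write the same content to a temporary file, then read it back
path <- tempfile(fileext = ".rtf")
writeLines(rtf, path)
from_file <- read_rtf(path)

# Input validation: check whether the file looks like RTF before parsing it
is_rtf <- looks_rtf(path)

print(plain)
print(from_file)
print(is_rtf)
```

Both extraction functions return a character vector, so the result can be passed straight to ordinary string tools such as `grepl()` or `paste()`.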
Sources: README.md:30-34, README.md:91-95, NEWS:16-17
Sources: README.md:15-26
Sources: README.md:38-48, README.md:101-104
Sources: DESCRIPTION:12-16, DESCRIPTION:20
Sources: README.md:30-34, NEWS:16-17
The package's API changed across releases to improve usability. Two functions were renamed, and table support and file validation were added later:
| Version | Change | Old Function | New Function |
|---|---|---|---|
| 0.3.1+ | Function renaming | striprtf() | read_rtf() |
| 0.3.1+ | Function renaming | rtf2text() | strip_rtf() |
| 0.4.1+ | Table support | N/A | Table parameters added |
| 0.5.2+ | File validation | N/A | looks_rtf() added |
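For code written against the old names, migrating comes down to swapping the function calls. The sketch below uses the current names and passes table delimiters explicitly. The argument names `row_start`, `row_end`, and `cell_end` are assumed from the package documentation and should be checked against `?strip_rtf` for the installed version. The RTF sample is hypothetical.

```r
library(striprtf)

# Old names from the table above, shown only as comments:
#   txt <- striprtf("report.rtf")   # now read_rtf("report.rtf")
#   txt <- rtf2text(rtf_string)     # now strip_rtf(rtf_string)

# Illustrative RTF content (hypothetical sample)
rtf_string <- "{\\rtf1\\ansi Quarterly report\\par}"

# Current name for in-memory extraction
txt <- strip_rtf(rtf_string)

# Table parameters (assumed argument names): choose the strings that mark
# the start of a row, the end of a row, and the end of each cell
txt_tsv <- strip_rtf(rtf_string, row_start = "", row_end = "", cell_end = "\t")

print(txt)
print(txt_tsv)
```

Using a tab as the cell delimiter is one possible choice that makes extracted table rows easy to split with `strsplit()`. This sample contains no table, so both calls should yield the same text.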
Sources: README.md:82-89, NEWS:105-112, NEWS:16-17, NEWS:73-76
The package has been tested with RTF files generated by:
Sources: README.md:50, README.md:98-99
Sources: README.md:117-119