
pdf-miner

Extract text and tables from PDF files with robust support for global market data formats (currencies, percentages, units). Use when: (1) User asks to read/extract content from a PDF file, (2) User needs text or tables from industry reports, research papers, or financial documents, (3) web_fetch or scrapling fail on a PDF. Supports: keyword search, metrics extraction, table of contents detection, PDF diff/comparison, LLM chunk splitting, batch processing, header/footer cleaning. NOT for: OCR on scanned image-based PDFs, editing/merging PDFs, or creating new PDFs.

Maintainer: openclaw
Updated: 4/7/2026
Stars: 4001
Forks: 1095
quick start

Installation and usage

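The metrics extraction named in the description (currencies, percentages, units) can be illustrated with a short sketch. This is not pdf-miner's own code: the `extract_metrics` name, the regular expression, and the lists of currency codes and magnitude words are all assumptions chosen for illustration. It uses only Python's standard `re` module and expects text that has already been extracted from a PDF.

```python
import re

# Hypothetical number pattern: either comma-grouped thousands or plain digits,
# each with an optional decimal part.
NUMBER = r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?"

# Hypothetical metric pattern: optional currency symbol or ISO code, a number,
# then an optional percent sign or magnitude word.
METRIC_RE = re.compile(
    rf"(?P<currency>[$€£¥]|\b(?:USD|EUR|GBP|JPY|CNY)\s?)?"
    rf"(?P<number>{NUMBER})"
    rf"(?:\s?(?P<unit>%|(?:million|billion|trillion)\b))?"
)


def extract_metrics(text):
    """Return numbers in `text` that carry a currency or a unit."""
    metrics = []
    for match in METRIC_RE.finditer(text):
        currency = match.group("currency")
        unit = match.group("unit")
        # Skip bare numbers such as years or page numbers.
        if not currency and not unit:
            continue
        metrics.append(
            {
                "currency": currency.strip() if currency else None,
                "value": float(match.group("number").replace(",", "")),
                "unit": unit,
            }
        )
    return metrics


sample = "Revenue rose 12.5% to $1,234.5 million, while costs fell to EUR 300 million in 2025."
for metric in extract_metrics(sample):
    print(metric)
```

The sketch drops numbers with neither a currency nor a unit, so the year in the sample sentence is ignored. Real reports use more formats (for example, a decimal comma or a currency code after the amount), which this pattern does not attempt to cover.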

Installation
$ install --globalskills.sh
Usage

After installation, you can use this skill by running the following command in your terminal:

skills use pdf-miner