llm-ai, data-ai
doc-to-vector-dataset-generator
Converts documents into clean, chunked datasets suitable for embeddings and vector search. Produces chunked JSONL files with metadata, deduplication logic, and quality checks. Use when preparing "training data", "vector datasets", "document processing", or "embedding data".
maintainer
patricio0312rev
Updated 1/12/2026
Stars
6
Forks
0
quick start
Installation and usage
Installation
$ install --globalskills.sh
Usage
After installation, run the following command in your terminal to use this skill:
skills use doc-to-vector-dataset-generator
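The skill description mentions chunking, metadata, deduplication, quality checks, and JSONL output. As a rough illustration of what such a pipeline can look like, here is a minimal Python sketch. It is not the skill's actual implementation: the function names (`chunk_text`, `normalize`, `build_records`, `to_jsonl`), the record fields, and the defaults (character-window chunking, SHA-256 hashing of normalized text, a minimum-length filter) are all assumptions chosen for the example.

```python
import hashlib
import json
import re


def chunk_text(text, max_chars=200, overlap=20):
    """Split text into overlapping fixed-size character windows (assumed strategy)."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + max_chars].strip()
        if piece:
            chunks.append(piece)
        # Stop once this window has reached the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash alike."""
    return re.sub(r"\s+", " ", text).strip().lower()


def build_records(docs, max_chars=200, overlap=20, min_chars=10):
    """Turn a {source: text} mapping into deduplicated chunk records."""
    seen = set()
    records = []
    for source, text in docs.items():
        for index, chunk in enumerate(chunk_text(text, max_chars, overlap)):
            # Quality check (hypothetical rule): drop very short fragments.
            if len(chunk) < min_chars:
                continue
            digest = hashlib.sha256(normalize(chunk).encode("utf-8")).hexdigest()
            # Deduplication: skip chunks whose normalized text was already seen.
            if digest in seen:
                continue
            seen.add(digest)
            records.append({
                "id": digest[:16],
                "text": chunk,
                "metadata": {
                    "source": source,
                    "chunk_index": index,
                    "char_count": len(chunk),
                },
            })
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    docs = {
        "a.md": "Hello world, this is a test document.",
        "b.md": "hello   world, this is a test document.",  # near-identical copy
    }
    print(to_jsonl(build_records(docs)))
```

Hashing the normalized text only catches exact duplicates up to case and whitespace. Catching near-duplicates would need a different technique (for example MinHash), and a real pipeline would likely chunk on sentence or token boundaries rather than raw characters.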