textsnap Command: Examples, Options, and Usage

Common examples

OCR a local image

textsnap [path/to/image.png]

OCR an image at a URL

textsnap [https://example.com/image.png]

Render a webpage and OCR the result

textsnap [https://example.com/article]

OCR an image already on the clipboard

textsnap

Write output to a specific file

textsnap [image.png] -o [out.txt]

Output plain text instead of Markdown

textsnap [image.png] --plaintext

Cap the decoder at 1024 generated tokens

textsnap [image.png] --max-tokens 1024

Use a custom local model directory

textsnap [image.png] --model-dir [path/to/model]

Print progress diagnostics to stderr

textsnap -v [image.png]

Description

textsnap is a command-line OCR utility built around the PaddleOCR-VL-1.5 vision-language model exported to ONNX. It reads an image from a file path, a URL, or the system clipboard and writes the recognized text to a file, by default inside ./textsnaps/. Only the output path is printed on stdout, so the command composes cleanly in shell pipelines. The model runs entirely on the CPU and needs neither a GPU nor cloud calls. Output is Markdown by default, which preserves structure such as tables, headings, and lists; --plaintext flattens it for callers that only want raw text. Webpage URLs are rendered before OCR, which makes the tool usable as a "screenshot to text" pipeline for content that is hard to copy directly.
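
Because only the output path goes to stdout, a wrapper can capture it with command substitution and then read the file. The following is a minimal sketch, not part of textsnap itself: the function name ocr_cat is hypothetical, and it assumes that textsnap prints exactly one line (the path) and returns a non-zero exit status on failure.

```shell
# ocr_cat: hypothetical helper that runs textsnap and prints the recognized text.
# Assumptions (not confirmed by the reference above): textsnap prints exactly
# one line, the output path, on stdout and exits non-zero on failure.
ocr_cat() {
    local out
    # Command substitution captures the path; stderr diagnostics pass through.
    out=$(textsnap "$@") || return 1
    # Refuse to continue if the path is empty or the file does not exist.
    [ -n "$out" ] && [ -f "$out" ] || return 1
    cat -- "$out"
}
```

With this defined, something like `ocr_cat [image.png] | grep -i total` would search the recognized text directly, without first looking up the generated file name.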

Options

-o, --output _PATH_: Write the OCR text to _PATH_. Default: ./textsnaps/_name_\_ocr.txt.
-v, --verbose: Print progress diagnostics to stderr.
--plaintext: Convert the default Markdown output into plain text (no tables, no headings).
--model-dir _PATH_: Use ONNX model files from _PATH_ instead of the cached download.
--max-tokens _N_: Cap the decoder at _N_ generated tokens (default 2048).
--max-pixels _N_: Limit the vision encoder pixel budget per image to _N_.
--no-verify: Skip SHA-256 verification of downloaded model files.
--generate-checksums: Re-download the model and rewrite the checksum manifest.
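
The options above can be combined in a batch run. The sketch below is an illustration only: the function name ocr_dir and the directory arguments are hypothetical, the token cap of 1024 is an arbitrary choice, and it assumes textsnap exits non-zero on failure. It uses only flags listed in this section.

```shell
# ocr_dir: hypothetical helper that OCRs every PNG in a directory to plain text.
# Usage: ocr_dir SRC_DIR DEST_DIR
ocr_dir() {
    local src dest img name
    src=$1
    dest=$2
    mkdir -p -- "$dest" || return 1
    for img in "$src"/*.png; do
        # Skip the unexpanded pattern when the directory holds no PNG files.
        [ -e "$img" ] || continue
        name=$(basename -- "$img" .png)
        # -o picks the destination file; anything printed on stdout is discarded.
        # Stops at the first failure (assumes a non-zero exit status on error).
        textsnap "$img" --plaintext --max-tokens 1024 -o "$dest/$name.txt" >/dev/null || return 1
    done
}
```

Writing each result with -o keeps the outputs next to one another under a name derived from the image, rather than relying on the default location.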

FAQ

What is the textsnap command used for?

textsnap extracts text from an image file, an image or webpage URL, or an image on the system clipboard. It runs the PaddleOCR-VL-1.5 model locally on the CPU, writes the result as Markdown (or plain text with --plaintext) to a file, and prints that file's path on stdout.

How do I run a basic textsnap example?

Run `textsnap [path/to/image.png]` in a terminal, then adjust file names, paths, flags, or remote targets for your system.

What does -o, --output _PATH_ do in textsnap?

It writes the OCR text to _PATH_ instead of the default location, ./textsnaps/_name_\_ocr.txt.
