How it works
- Object mode uses the COCO‑SSD model (on‑device) to find common objects and their rough positions (left/center/right, near/far).
- Read Text uses Tesseract.js OCR. Hold the camera steady on signage, menus, or documents.
- Voice reads results aloud using your system's text-to-speech voices. Toggle it with the checkbox or by pressing V.
- Accessibility: large buttons, high contrast, zoom and contrast sliders, and keyboard shortcuts (Space to scan, O to detect objects, R to read text, V to toggle voice, +/- to zoom).
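As an illustration of how Object mode's rough positions might be derived, here is a minimal sketch. It assumes each detection has the shape COCO-SSD returns (a `bbox` of `[x, y, width, height]` in pixels, a `class` label, and a `score`). The function names, the split of the frame into thirds, and the `NEAR_AREA_FRACTION` threshold are assumptions made for this example and may not match what the app does.

```typescript
// One detection in the shape COCO-SSD reports: bbox is [x, y, width, height] in pixels.
interface Detection {
  bbox: [number, number, number, number];
  class: string;
  score: number;
}

type Horizontal = "left" | "center" | "right";
type Distance = "near" | "far";

// Hypothetical threshold: a box covering at least this share of the frame counts as "near".
const NEAR_AREA_FRACTION = 0.15;

// Bucket a detection by where its box center falls (thirds of the frame width)
// and by how much of the frame the box covers (a crude stand-in for distance).
function describePosition(
  d: Detection,
  frameWidth: number,
  frameHeight: number
): { horizontal: Horizontal; distance: Distance } {
  const [x, , w, h] = d.bbox;
  const centerX = x + w / 2;
  const third = frameWidth / 3;
  const horizontal: Horizontal =
    centerX < third ? "left" : centerX < 2 * third ? "center" : "right";
  const areaFraction = (w * h) / (frameWidth * frameHeight);
  const distance: Distance = areaFraction >= NEAR_AREA_FRACTION ? "near" : "far";
  return { horizontal, distance };
}

// Build the short phrase that could be shown on screen or passed to the voice output.
function describeDetection(d: Detection, frameWidth: number, frameHeight: number): string {
  const { horizontal, distance } = describePosition(d, frameWidth, frameHeight);
  return `${d.class}, ${horizontal}, ${distance}`;
}
```

Box area is only a rough proxy for distance: a small object held close can cover as much of the frame as a large one far away, so the result should be read as a hint and not a measurement.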
Tip: For richer scene descriptions (relationships between objects, actions), you can later connect an optional server-side caption API; see the Describe Scene hook in the code comments.
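One possible shape for such a hook is sketched below. This is not the app's actual Describe Scene code: `CaptionConfig`, `PostImage`, `describeScene`, and the `{ caption }` response shape are all hypothetical names chosen for the example. The point it illustrates is the opt-in gate, where nothing is sent unless the caption API has been explicitly enabled.

```typescript
// Hypothetical settings for the optional caption server.
interface CaptionConfig {
  enabled: boolean; // off by default, so no image leaves the device
  endpoint: string; // URL of your own caption server (assumption)
}

// Minimal transport signature, injected so the hook can be exercised without a network.
// The { caption } response shape is an assumption about the server.
type PostImage = (url: string, jpegBytes: Uint8Array) => Promise<{ caption: string }>;

// Returns a caption string, or null when the external API is disabled,
// unconfigured, or returns an empty caption.
async function describeScene(
  jpegBytes: Uint8Array,
  config: CaptionConfig,
  post: PostImage
): Promise<string | null> {
  // Opt-in gate: bail out before any upload is attempted.
  if (!config.enabled || config.endpoint === "") return null;
  const { caption } = await post(config.endpoint, jpegBytes);
  const trimmed = caption.trim();
  return trimmed === "" ? null : trimmed;
}
```

In the browser, `post` would typically wrap a `fetch` POST of the current camera frame. Keeping the transport as a parameter makes it easy to verify that a disabled configuration never triggers a request.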
Privacy
All processing runs locally in the browser. No images are uploaded unless you enable an external caption API.