Local-First Intelligence
We built an AI navigation rig that translates the surrounding environment into audio and haptic feedback using a stereo depth camera, computer vision, and on-device AI models. Everything runs locally, so no network round-trip adds latency.
Sensory Awareness
Haptic Feedback
Three independent vibration motors pulse with an intensity that rises as obstacles get closer.
Spoken Alerts
Priority-ranked spatial audio for critical warnings like "Car ahead, 0.9m".
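One way to picture priority-ranked alerts is a small priority queue in which critical warnings always come out first and equal-priority alerts keep their arrival order. The sketch below is illustrative only: the priority levels, the `AlertQueue` class, and `format_alert` are hypothetical names, not the rig's actual code.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical priority levels; a lower number is more urgent.
CRITICAL, WARNING, INFO = 0, 1, 2


@dataclass(order=True)
class Alert:
    priority: int
    seq: int                       # arrival order, breaks ties between equal priorities
    text: str = field(compare=False)


class AlertQueue:
    """Min-heap of alerts: most urgent first, oldest first within a priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, priority, text):
        heapq.heappush(self._heap, Alert(priority, next(self._counter), text))

    def pop(self):
        """Return the next alert text to speak, or None if the queue is empty."""
        return heapq.heappop(self._heap).text if self._heap else None


def format_alert(label, direction, distance_m):
    """Build a spoken phrase in the style shown above."""
    return f"{label} {direction}, {distance_m:.1f}m"
```

A caller would push alerts as detections arrive and speak whatever `pop()` returns; interrupting speech that is already playing would need extra logic not shown here.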
Facial Recognition
Registered people are identified and announced by name and direction.
AI Scene Narration
On-device Google Gemma 4 provides natural language descriptions of your environment.
The Local-First Stack
NVIDIA Jetson Orin NX
Embedded GPU computer that runs the computer vision models and the language model on-device in real time.
ZED 2i Stereo Camera
Industrial-grade stereo depth sensing for indoor and outdoor use. As a passive stereo camera, it needs some ambient light.
Arduino Uno
Low-latency control for the three haptic motors and the attached sensors.
Google Gemma 4
State-of-the-art scene narration running entirely on-device, with no internet connection.
Intelligent Perception
Visual Engine
The ZED 2i stereo camera captures depth and motion at 1080p and 30 fps, detecting obstacles from stairs to low overheads up to 8 m away.
Intelligence
Powered by NVIDIA Jetson, our on-device models recognize faces and objects in real time without ever needing the cloud.
Haptic Mapping
A 3-zone vibration belt translates distance into intuitive pulses. Left, center, and right zones keep you centered and safe.
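As a rough illustration of the three-zone idea, a depth frame can be split into left, center, and right strips, with the nearest obstacle in each strip setting that motor's strength. This is a minimal sketch under stated assumptions: the frame is a plain list of rows of distances in meters, invalid pixels are `None` or non-positive, and the linear ramp is a placeholder. `nearest_per_zone` and `intensity` are hypothetical names.

```python
MAX_RANGE_M = 8.0  # sensing range quoted on this page


def nearest_per_zone(depth_rows, zones=3):
    """Return the nearest valid distance (m) in each vertical strip, left to right."""
    width = len(depth_rows[0])
    nearest = [float("inf")] * zones
    for row in depth_rows:
        for x, d in enumerate(row):
            if d is None or d <= 0:          # assumed encoding of invalid pixels
                continue
            zone = min(x * zones // width, zones - 1)
            nearest[zone] = min(nearest[zone], d)
    return nearest


def intensity(distance_m, max_range_m=MAX_RANGE_M):
    """Linear placeholder ramp: 0.0 at or beyond max range, approaching 1.0 at contact."""
    if distance_m >= max_range_m:
        return 0.0
    return 1.0 - distance_m / max_range_m
```

For example, `[intensity(d) for d in nearest_per_zone(frame)]` yields one drive level per motor. A zone with no valid pixels stays at infinity and therefore maps to 0.0.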
Smart Routing
Record familiar routes once to receive waypoint-based navigation with combined haptic and audio turn-by-turn cues.
Everything the Rig Can Do
Haptic Vision is a vest that reads your surroundings and turns them into haptic and audio feedback. Here is the full set of what it does today.
Core Perception
- Sub-Meter Depth: Depth mapping at 1080p and 30 fps, out to 8 meters.
- Threat Detection: Two YOLO models flag hazards in real time.
- Stairs and Curbs: Depth-profile detection for steps, drop-offs, and low overheads.
Recognition & Memory
- Face Recognition: A depth-based liveness check rejects photos and screens.
- Auto-Enrollment: New faces are saved and can be renamed from a phone.
- Event Memory: A rolling log of what the rig just saw grounds every narration.
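The depth-based liveness idea can be sketched very simply: a real face has a few centimeters of relief between the nose and the edges of the face, while a photo or screen is nearly flat. The code below is a simplified illustration, not the rig's actual check. The function names and the 1.5 cm threshold are assumptions, and a tilted flat photo would fool this version, so a fuller check would likely fit a plane and measure the residuals.

```python
def depth_relief_m(face_depths):
    """Spread between the 10th and 90th percentile depth (m) inside the face box."""
    valid = sorted(d for d in face_depths if d is not None and d > 0)
    if len(valid) < 10:                      # too few samples to judge
        return 0.0
    lo = valid[int(0.1 * (len(valid) - 1))]
    hi = valid[int(0.9 * (len(valid) - 1))]
    return hi - lo


def looks_live(face_depths, min_relief_m=0.015):
    """True if the face region shows at least min_relief_m of depth variation."""
    return depth_relief_m(face_depths) >= min_relief_m
```

Using percentiles rather than the raw minimum and maximum keeps a few noisy depth pixels from deciding the result.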
Haptic & Audio
- Scaled Haptics: Vibration strength follows Weber-Fechner (logarithmic) scaling, so changes feel even.
- Haptic Vocabulary: Distinct patterns for furniture, doorways, stairs, and warnings.
- Pre-emptive Audio: Priority-queued alerts interrupt for critical warnings.
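Weber-Fechner scaling rests on the observation that perceived intensity grows roughly with the logarithm of the physical stimulus. To make each step in proximity feel like an equal step in vibration, the motor amplitude can be raised by equal ratios rather than equal amounts. The sketch below shows that mapping; `weber_fechner_drive`, `a_min`, and `a_max` are hypothetical, and the amplitude range is a placeholder rather than a measured value.

```python
import math


def weber_fechner_drive(level, a_min=0.1, a_max=1.0):
    """Map a desired perceived level in [0, 1] to a motor amplitude.

    Equal steps in `level` multiply the amplitude by a constant ratio,
    which under Weber-Fechner should feel like equal perceptual steps.
    a_min is an assumed just-noticeable amplitude; a_max is full drive.
    """
    level = min(max(level, 0.0), 1.0)        # clamp out-of-range input
    return a_min * (a_max / a_min) ** level


def perceived_level(amplitude, a_min=0.1, a_max=1.0):
    """Inverse mapping: logarithmic perceived level for a given amplitude."""
    return math.log(amplitude / a_min) / math.log(a_max / a_min)
```

Note that a level of 0 maps to `a_min`, not to silence; a caller would switch the motor off separately when no obstacle is in range.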
Voice Control
- Push-to-Talk: Ask a question or run a command out loud.
- On-Device Speech: Recognition runs locally; no utterance leaves the rig.
- Ask Anything: Free-form questions go to the model with the live scene as context.
Scene Reading
- On-Demand OCR: Reads exit signs, room numbers, and package labels.
- One Tap: Runs only when asked, so there is no per-frame cost.
- Spoken Promptly: Recognized text is spoken at a priority just below safety alerts.
Navigation & Positioning
- Saved Places: Walk a route once, then get turn-by-turn cues later.
- Place-Aware Boot: A WiFi and BLE fingerprint recognizes where you are on startup.
- Step Counting: "Ten steps; the kitchen door is on your right."
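Place recognition from a WiFi and BLE fingerprint can be sketched as comparing the signal strengths heard at startup against fingerprints saved for known places. Everything in this example is an assumption made for illustration: the dictionary layout (transmitter ID to RSSI in dBm), the -100 dBm floor for unheard transmitters, the 12 dB acceptance threshold, and the function names.

```python
MISSING_DBM = -100.0  # assumed floor for a transmitter that was not heard


def fingerprint_distance(scan, stored):
    """Mean absolute RSSI difference (dB) over every transmitter in either set."""
    ids = set(scan) | set(stored)
    if not ids:
        return float("inf")
    total = sum(
        abs(scan.get(i, MISSING_DBM) - stored.get(i, MISSING_DBM)) for i in ids
    )
    return total / len(ids)


def recognize_place(scan, places, max_distance_db=12.0):
    """Return the best-matching place name, or None if nothing is close enough."""
    best_name, best = None, float("inf")
    for name, stored in places.items():
        d = fingerprint_distance(scan, stored)
        if d < best:
            best_name, best = name, d
    return best_name if best <= max_distance_db else None
```

Returning `None` for a poor match lets the rig start without a place context instead of guessing.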
Sensor Fusion
- Heading Lock: A magnetometer anchors heading to correct drift in haptic direction cues.
- Fall Detection: Barometer and IMU catch impacts and floor changes.
- Pocket Mode: Silences the belt while you hold the camera by hand for a one-off query.
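A common pattern for IMU-based fall detection is to look for a brief free-fall dip in acceleration followed shortly by an impact spike. The sketch below shows only that accelerometer pattern, assuming a stream of acceleration magnitudes in g; the thresholds, the window length, and the name `detect_fall` are hypothetical, and the barometer's role in confirming a height change is not modeled.

```python
def detect_fall(accel_g, free_fall_g=0.4, impact_g=2.5, window=50):
    """Return the index of the impact sample if a free-fall dip is followed
    by an impact spike within `window` samples, otherwise None.

    accel_g is a sequence of acceleration magnitudes in g (about 1.0 at rest).
    """
    dip_at = None
    for i, a in enumerate(accel_g):
        if a < free_fall_g:
            dip_at = i                       # remember the most recent dip
        elif dip_at is not None and a > impact_g and i - dip_at <= window:
            return i
    return None
```

Requiring the dip before the spike keeps a simple bump, such as brushing a doorframe, from counting as a fall.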
On-Device Intelligence
- Scene Narration: Gemma 4 describes the scene in a sentence or two.
- Multi-Language: Replies in the user's language, with fifteen languages supported.
- Natural Voice: Optional Piper voices in place of robotic speech.
No Cloud. No Internet.
Haptic Vision processes everything locally, in your backpack. Your visual data never leaves the device, which keeps it private and keeps the rig working where there is no connectivity.