Computer Vision Engineer-Edge
Viatouch Media Inc
About the Role
VICKI is an AI-powered IoT self-checkout company backed by one of the world’s largest payment providers. We’re hiring a Senior Computer Vision Engineer, starting immediately, to own a major platform upgrade: moving from a Raspberry Pi-based CV stack to GPU-accelerated Jetson Orin, and specifically delivering a production-ready 3–4 camera vision system inside the Vicki AI Machine.
This is an end-to-end role: multi-camera video pipelines → dataset strategy → model training → edge optimization (TensorRT) → production reliability.
What You’ll Do
Multi‑camera (3–4 camera) ownership on Jetson Orin
- Lead production computer vision pipelines running on Jetson Orin (JetPack-based Linux environment).
- Build and stabilize a 3–4 camera capture + inference pipeline suitable for real-time retail use.
- Drive practical multi-camera readiness:
  - Field-of-view coverage and camera placement tradeoffs
  - Calibration concepts (intrinsics/extrinsics), lens distortion handling, alignment checks
  - Multi-stream handling (throughput, dropped frames, pipeline stability)
Computer vision + machine learning (retail-first)
- Research, design, train, and ship object detection + tracking models for retail environments:
  - SKU / product recognition, multi-item scenes, occlusions
  - Glare/reflective packaging, variable lighting, crowded bins/shelves
- Own dataset evolution:
  - Collection strategy designed around multi-camera coverage and real failure modes
  - Labeling specs, QA processes, augmentation, hard-negative mining
  - Active learning loops based on production misses and edge cases
- Benchmark models for accuracy + robustness + speed, and define acceptance criteria for rollout.
Edge deployment + performance engineering
- Convert and optimize models for embedded deployment:
  - ONNX export and runtime considerations
  - TensorRT optimization (FP16/INT8 where appropriate)
- Improve throughput/latency/thermals/reliability for long-running production operation.
- Troubleshoot production issues spanning camera feeds, inference services, tracking behavior, and device constraints.
Leadership
- Mentor CV engineers and collaborate cross-functionally (embedded, backend, product, ops).
- Set standards for code quality, experimentation hygiene, reproducibility, and production readiness.
Required Qualifications
- MS or PhD in Computer Science, Electrical Engineering, or a related field with a strong focus on computer vision / deep learning / machine learning (or equivalent practical experience).
- 4+ years of professional experience in Deep Learning / Computer Vision / Machine Learning with production responsibility.
- Retail CV experience (required): hands-on experience training object detection models on retail products/objects (e.g., consumer packaged goods, SKU-level recognition, shelf/bin/basket scenes), including dataset design and iteration based on real-world failures.
- Embedded/Edge experience: 1+ year deploying CV/ML to embedded devices, including performance constraints and production troubleshooting.
- Proven ability to take models from training → deployment → evaluation → iteration in real environments.
Required Technical Skills
Programming
- Python (3.x) — training, experimentation, tooling, and/or production services
- C++ (C++14/17) — performance-critical inference/pipeline code on-device
- GPU fundamentals (CUDA-enabled stacks): can reason about bottlenecks and optimize for GPU deployment on Jetson-class devices
  (You don’t need to write custom CUDA kernels daily, but you must be effective optimizing in a CUDA-enabled environment.)
- Bash / Linux shell — device-level debugging, profiling, automation
Frameworks / Tooling
- PyTorch and/or TensorFlow (PyTorch strongly preferred)
- OpenCV
- TensorRT, ONNX (conversion + runtime considerations)
- Linux (Ubuntu-based stacks common for Jetson)
- Git
Preferred (Strong Pluses)
- Direct Jetson Orin / Xavier deployment experience + JetPack familiarity
- DeepStream and/or GStreamer for real-time multi-stream video pipelines
- YOLO-family, transformer-based detectors, lightweight/mobile architectures
- Multi-object tracking (DeepSORT/ByteTrack-style) and real-world tuning
- MLOps / experiment tracking / dataset versioning (W&B, MLflow, DVC, or equivalent)
- Experience building systems robust to packaging changes, planogram drift, glare, motion blur, and occlusions
Benefits
- 401(k)
- Health insurance
- Dental insurance
- Vision insurance
- Paid time off
- RSUs
- Stock options
- Employee stock purchase plan
- Gym membership
- Professional development assistance
- Tuition reimbursement
Job Type: Full-time
Pay: $88,294.67 - $135,000.00 per year
Work Location: In person
Location: New York, NY