
Computer Vision Engineer-Edge

Viatouch Media Inc.

About the Role

VICKI is an AI-powered IoT self-checkout company backed by one of the world’s largest payment providers. We’re hiring a Senior Computer Vision Engineer, starting immediately, to own a major platform upgrade: moving from a Raspberry Pi-based CV stack to GPU-accelerated Jetson Orin, and specifically delivering a production-ready 3–4 camera vision system inside the Vicki AI Machine.

This is an end-to-end role: multi-camera video pipelines → dataset strategy → model training → edge optimization (TensorRT) → production reliability.

What You’ll Do

Multi‑camera (3–4 camera) ownership on Jetson Orin

  • Lead production computer vision pipelines running on Jetson Orin (JetPack-based Linux environment).
  • Build and stabilize a 3–4 camera capture + inference pipeline suitable for real-time retail use.
  • Drive practical multi-camera readiness:
      • Field-of-view coverage and camera placement tradeoffs
      • Calibration concepts (intrinsics/extrinsics), lens distortion handling, alignment checks
      • Multi-stream handling (throughput, dropped frames, pipeline stability)

Computer vision + machine learning (retail-first)

  • Research, design, train, and ship object detection + tracking models for retail environments:
      • SKU / product recognition, multi-item scenes, occlusions
      • Glare/reflective packaging, variable lighting, crowded bins/shelves
  • Own dataset evolution:
      • Collection strategy designed around multi-camera coverage and real failure modes
      • Labeling specs, QA processes, augmentation, hard-negative mining
      • Active learning loops based on production misses and edge cases
  • Benchmark models for accuracy + robustness + speed, and define acceptance criteria for rollout.

Edge deployment + performance engineering

  • Convert and optimize models for embedded deployment:
      • ONNX export and runtime considerations
      • TensorRT optimization (FP16/INT8 where appropriate)
  • Improve throughput/latency/thermals/reliability for long-running production operation.
  • Troubleshoot production issues spanning camera feeds, inference services, tracking behavior, and device constraints.

Leadership

  • Mentor CV engineers and collaborate cross-functionally (embedded, backend, product, ops).
  • Set standards for code quality, experimentation hygiene, reproducibility, and production readiness.

Required Qualifications

  • MS or PhD in Computer Science, Electrical Engineering, or a related field with a strong focus on computer vision / deep learning / machine learning (or equivalent practical experience).
  • 4+ years of professional experience in Deep Learning / Computer Vision / Machine Learning with production responsibility.
  • Retail CV experience (required): hands-on experience training object detection models on retail products/objects (e.g., consumer packaged goods, SKU-level recognition, shelf/bin/basket scenes), including dataset design and iteration based on real-world failures.
  • Embedded/Edge experience: 1+ year deploying CV/ML to embedded devices, including performance constraints and production troubleshooting.
  • Proven ability to take models from training → deployment → evaluation → iteration in real environments.

Required Technical Skills

Programming

  • Python (3.x) — training, experimentation, tooling, and/or production services
  • C++ (C++14/17) — performance-critical inference/pipeline code on-device
  • GPU fundamentals (CUDA-enabled stacks) — can reason about bottlenecks and optimize for GPU deployment on Jetson-class devices

(You don’t need to write custom CUDA kernels daily, but you must be effective optimizing in a CUDA-enabled environment.)

  • Bash / Linux shell — device-level debugging, profiling, automation

Frameworks / Tooling

  • PyTorch and/or TensorFlow (PyTorch strongly preferred)
  • OpenCV
  • TensorRT, ONNX (conversion + runtime considerations)
  • Linux (Ubuntu-based stacks common for Jetson)
  • Git

Preferred (Strong Pluses)

  • Direct Jetson Orin / Xavier deployment experience + JetPack familiarity
  • DeepStream and/or GStreamer for real-time multi-stream video pipelines
  • YOLO-family, transformer-based detectors, lightweight/mobile architectures
  • Multi-object tracking (DeepSORT/ByteTrack-style) and real-world tuning
  • MLOps / experiment tracking / dataset versioning (W&B, MLflow, DVC, or equivalent)
  • Experience building systems robust to packaging changes, planogram drift, glare, motion blur, and occlusions

Benefits

  • 401(k)
  • Health insurance
  • Dental insurance
  • Vision insurance
  • Paid time off
  • RSUs

Job Type: Full-time

Pay: $88,294.67 - $135,000.00 per year

Benefits

  • Dental insurance
  • Employee stock purchase plan
  • Gym membership
  • Health insurance
  • Professional development assistance
  • RSU
  • Stock options
  • Tuition reimbursement

Work Location: In person

Location: New York, NY
