ML Engineer - Vision at photalabs.com (Expired)

About us

At Phota Labs, we’re building visual GenAI that helps people capture, express, and relive their memories — in ways that feel effortless, personal, and emotionally resonant. Our core technology enables personalized image generation that faithfully reflects who you are and the moments you experienced. Our first goal is to bring visual GenAI into everyday photography.

We're a small team of researchers, engineers, and designers who have always been at the forefront of how people capture, edit, and share images and videos. We build with our hands and hearts. We believe GenAI is the next shift for photography, and are seeking builders who share this vision — people like us, like you. We're just getting started!

The role

Connecting AI-generated visual content with real subjects and contexts requires more than just generation capabilities. It demands a deep understanding of identity, context, intentions, and semantic elements. As our first Machine Learning Engineer focused on computer vision, you will lead this crucial effort and build pipeline components that are as vital as the generative models themselves. You'll work with various tools including but not limited to large vision-language models, segmentation, and detection models. Additionally, you may occasionally support our data processing efforts by applying your expertise in image understanding.

What you’ll do

Rapidly experiment with diverse image understanding tools — including large vision-language models, segmentation, and detection methods — to solve real-world image understanding problems in our pipeline and enhance generative components.
Train and adapt semantic understanding models to meet specific product requirements, including model specialization and performance improvement.
Collaborate with our backend engineer to develop an end-to-end pipeline that integrates various understanding and generative components, and deploy them as APIs.
Work closely with GenAI researchers and product designers to identify and address key image understanding challenges.
Occasionally support our data team with your expertise in image understanding.

You may be a strong fit if you

Have experience training and fine-tuning image understanding models for segmentation, detection, and masking. Experience with faces and human subjects is preferred.
Are familiar with various open-source models for different image understanding tasks. Experience with efficient models is preferred.
Have experience using and prompting large Vision-Language Models (VLMs) — either open-source or via API — for image understanding. Experience with data specialization, fine-tuning, or reinforcement learning is a plus.
Possess strong proficiency in PyTorch and in transformers and other neural network architectures.
Have experience developing and integrating ML models and pipelines for products, with strong engineering practices and a deep understanding of product requirements.
Have demonstrated research capability through publications and models you've successfully deployed.
Take ownership, work persistently to solve real problems, and collaborate effectively to transform research models into real-world products.

Logistics

This role is based in San Jose, where we work in person. We believe the best ideas come from being in the same room.
We sponsor visas and are committed to working through the process together with the right candidates. If you're currently outside the US, we're also committed to helping you relocate to the US.
We offer generous health, dental, and vision coverage, unlimited PTO, paid parental leave, and relocation support as needed.
Don't meet every single qualification? That’s okay — we care more about your trajectory than checking every box. If the role excites you and the mission resonates, we'd love to hear from you.

Note: If your application is successful, any offer of employment will be conditional on the results of a background check performed by a third party acting on our behalf.
