Senior Computer Vision / Generative AI Engineer – High-Resolution Real Estate Photo Coloring - Contract to Hire

Upwork

Hi there,

We’re building AI-powered photo editing tools for real estate. Our goal: turn raw property photos into polished, booking-ready images quickly and consistently, with no manual editing.

We already have a large paired dataset of before/after images and working pipelines for sky replacement, perspective correction, etc. Now we’re focusing on the coloring model, which is currently based on CycleGAN but needs a full upgrade.

The challenge

CycleGAN is not enough. We need a production-ready image-to-image pipeline that:

  • Processes 3000×2000 photos in ≤30 seconds on A100 GPUs.
  • Matches our “after” style with a 90%+ acceptability rate (minimal re-edits).
  • Produces clean, consistent outputs (no artifacts, seams, or tone mismatches).
  • Can later scale to other editing tasks (sky replacement, blurring, object removal).

What we expect you to build

  • Fine-tuned Stable Diffusion img2img pipeline (LoRA preferred; DreamBooth if needed).
  • ControlNet integration (Canny + Depth) for structural consistency.
  • High-resolution optimization: tiling, FP16, memory-efficient attention.
  • Evaluation metrics: SSIM, ΔE color difference, side-by-side reviews.
  • Clean API or script for easy integration into our workflow.
  • Versioned, documented model with reproducible training.
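
To make the tiling item concrete, here is a minimal sketch of overlapped tiling with feathered blending, one common way to process a large frame in pieces without visible seams. It is an illustration under assumptions, not a required design: `fn` is a hypothetical stand-in for the per-tile model call, and the 1024 px tile and 128 px overlap are example values only.

```python
import numpy as np

def tile_starts(length, tile, overlap):
    """Start offsets so tiles of size `tile` cover [0, length) with shared borders."""
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the image edge
    return starts

def feather(tile, overlap):
    """1-D blend weights: ramp up over the first `overlap` px, down over the last."""
    w = np.ones(tile, dtype=np.float32)
    # Drop the 0.0 and 1.0 endpoints so every weight stays strictly positive.
    ramp = np.linspace(0.0, 1.0, overlap + 2, dtype=np.float32)[1:-1]
    w[:overlap] = ramp
    w[-overlap:] = ramp[::-1]
    return w

def process_tiled(image, fn, tile=1024, overlap=128):
    """Apply `fn` (hypothetical model call) per tile and blend the results.

    `image` is an H x W x C array; `fn` must return an array of the same
    shape as the tile it receives.
    """
    h, w, _ = image.shape
    th, tw = min(tile, h), min(tile, w)
    assert 0 < 2 * overlap <= min(th, tw), "overlap must fit inside half a tile"

    acc = np.zeros(image.shape, dtype=np.float32)     # weighted sum of outputs
    weight = np.zeros((h, w, 1), dtype=np.float32)    # sum of blend weights
    mask = (feather(th, overlap)[:, None] * feather(tw, overlap)[None, :])[..., None]

    for y in tile_starts(h, th, overlap):
        for x in tile_starts(w, tw, overlap):
            out = fn(image[y:y + th, x:x + tw].astype(np.float32))
            acc[y:y + th, x:x + tw] += out * mask
            weight[y:y + th, x:x + tw] += mask
    return acc / weight  # normalize so overlapping regions average smoothly
```

With these example values, a 3000×2000 photo splits into 4×3 tiles. Feathering only hides seams when neighbouring tiles already agree on tone, so global color consistency across tiles still has to come from the model or its conditioning.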

What we’re looking for

  • Strong Stable Diffusion fine-tuning experience (LoRA, DreamBooth).
  • Hands-on experience with high-res GPU optimization (tiling, FP16, xFormers).
  • Proven record of real-world deployment, not just research code.
  • Comfort with API development (FastAPI/Flask).
  • Experience with evaluation + QA (SSIM, ΔE, human review loops).
  • Bonus: Background in real estate or e-commerce imagery.
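
As an illustration of the ΔE check mentioned above, here is a minimal sketch that compares a model output against its paired “after” image in CIELAB space. It assumes 8-bit sRGB inputs and uses the simple CIE76 formula (Euclidean distance in Lab); the posting does not say which ΔE variant is intended, and the `threshold=5.0` default and the `color_report` helper are hypothetical choices for the example.

```python
import numpy as np

# Linear sRGB (D65) -> XYZ matrix and the D65 reference white.
_RGB_TO_XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                        [0.2126729, 0.7151522, 0.0721750],
                        [0.0193339, 0.1191920, 0.9503041]])
_WHITE_D65 = np.array([0.95047, 1.0, 1.08883])

def srgb_to_lab(rgb):
    """Convert 8-bit sRGB values (..., 3) to CIELAB."""
    rgb = np.asarray(rgb, dtype=np.float64) / 255.0
    # Undo the sRGB transfer curve to get linear light.
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    xyz = lin @ _RGB_TO_XYZ.T / _WHITE_D65
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3, np.cbrt(xyz), xyz / (3 * delta ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def delta_e76(img_a, img_b):
    """Per-pixel CIE76 color difference between two sRGB images."""
    return np.linalg.norm(srgb_to_lab(img_a) - srgb_to_lab(img_b), axis=-1)

def color_report(output, target, threshold=5.0):
    """Summarize ΔE; `threshold` is an assumed tolerance, not a project spec."""
    de = delta_e76(output, target)
    return {
        "mean": float(de.mean()),
        "p95": float(np.percentile(de, 95)),
        "within": float((de <= threshold).mean()),  # share of pixels under threshold
    }
```

A per-image summary like this can be tracked alongside SSIM and used to flag outputs for human review; CIEDE2000 tracks perceived difference more closely than CIE76 if the extra complexity is acceptable.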

How to apply (no copy-paste, please)

To avoid generic answers, please answer these 3 questions clearly and briefly:

  • Dataset alignment → Our before/after pairs are large and diverse. How would you ensure the model learns the correct transformation instead of artifacts?
  • Speed vs quality → What exact tricks would you use to guarantee ≤30s/image while keeping outputs natural and costs low?
  • Your edge → What’s one idea or method you’d bring that most others probably won’t suggest?

⚠️ Important: If your answer looks like a generic “fine-tune SD with LoRA” template, we’ll skip it. We want to see your own thought process. Short, specific, and practical beats long copy-paste answers.

Budget & Timeline

  • First milestone: working model at a 90% acceptability rate, including the API. What is your estimate?
  • Budget: $3,000–$5,000, depending on experience and approach.
  • Next: extend the pipeline to other industries.

Why this project is exciting

This isn’t academic R&D. It’s applied AI: your work will go straight into production, with measurable impact. Every improvement you make saves hours of manual editing for real estate businesses.

If you’re motivated by turning cutting-edge generative AI into real-world results, we’d love to hear from you.

Thank you for your time.
