[Remote] English Audio Model Trainer/QA

YO IT Group

Note: This is a remote position open to candidates in the USA. YO IT Group is a cutting-edge AI research initiative seeking detail-oriented individuals for the role of English Audio Model Trainer/QA. In this position, you will record audio clips that describe visual content to help build datasets for multimodal AI systems, contributing to the development of advanced models that understand and interact with the world.

Responsibilities

  • View a series of images and generate clear, concise, and natural-sounding spoken descriptions.
  • Record short audio clips (typically 2-3 minutes each) using provided tools or platforms.
  • Ensure recordings are high quality and free from background noise or distortion.
  • Follow specific linguistic, timing, or stylistic guidelines as outlined by the research team.
  • Collaborate with AI researchers and QA teams to review and iterate on data quality.

Skills

  • Excellent verbal communication and enunciation skills.
  • Native or near-native fluency in English (fluency in other languages is a plus).
  • Strong attention to detail and the ability to follow annotation guidelines precisely.
  • Comfortable working independently and handling repetitive tasks with consistency.
  • Prior experience with voice recording or data annotation is a plus, but not required.

Benefits

  • Flexible, remote-friendly work structure.

Company Overview

  • YO IT Group is a global leader in innovative IT solutions, dedicated to transforming businesses with cutting-edge technologies. Founded in 2018, it is headquartered in Al Ghuweifat, Abu Dhabi Emirate, AE, and has a workforce of 201-500 employees. Its website is https://shorturl.at/L4TnX.
