Senior Research Engineer - Voice

Synthesia

Reposted Yesterday

Remote

Hiring Remotely in Ireland, IRL

Senior level

Develop and optimize real-time generative speech and voice synthesis systems. Work on streaming and speech-to-speech models, conditioning inputs (emotion, prosody, speaker control), post-training optimizations (quantization, pruning, distillation), integrate novel architectures (neural codecs, diffusion, flow-matching), and define latency-aware evaluation metrics for production deployment.

The summary above was generated by AI

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London, with offices and teams across Europe and the US.

As AI continues to shape the way we live and work, Synthesia develops products to enhance visual communication and enterprise skill development, helping people work better and stay at the center of successful organizations.

Following our recent Series E funding round, where we raised $200 million, our valuation stands at $4 billion. Our total funding exceeds $530 million from premier investors including Accel, NVentures (Nvidia's VC arm), Kleiner Perkins, GV, and Evantic Capital, alongside the founders and operators of Stripe, Datadog, Miro, and Webflow.

What you'll do at Synthesia

As a Research Engineer, you will join a team of 40+ researchers and engineers within the R&D department working on cutting-edge challenges in the Generative AI space, with a focus on creating high-quality, expressive, real-time synthetic voices. Within the team, you’ll work on the applied side of our research efforts and directly impact solutions used worldwide by over 60,000 businesses.

If you are an expert in ML, LLMs, speech generation, or conversational models, this is your chance to make a global impact. You will join our Audio Post-Training Team, which works on generative speech and voice synthesis, ensuring our in-house voice models reach production-level quality, speed, and robustness. Typical projects include:

Develop and evaluate streaming and speech-to-speech systems, enabling low-latency, interactive voice synthesis.
Adapt models for new conditioning inputs (emotion, speed, prosody, speaker control, etc.).
Implement post-training optimization techniques (quantization, pruning, distillation) to improve efficiency and latency in real-time speech generation.
Integrate and test novel architectures, such as neural codecs, diffusion, or flow-matching models, to enhance realism and responsiveness.
Contribute to defining new evaluation metrics for conversational speech, including latency-aware and online MOS prediction systems.
Stay updated with the latest research in audio diffusion, autoregressive models, neural codecs, and multimodal LLMs.
Apply DPO (Direct Preference Optimization) and distillation to fine-tune large-scale speech models.

What we're looking for:

Strong understanding of generative modeling, ideally applied to sequential or multimodal data.
Hands-on experience with large language models (LLMs) or similar transformer-based architectures.
High proficiency in PyTorch, including experience with distributed training and model optimization.
Solid grasp of time-series modeling and tokenization, preferably in the context of audio or speech.
Demonstrated ability to prototype quickly, test hypotheses, and iterate efficiently.
Proven experience in training deep learning models end-to-end, from data preparation to evaluation.
Strong general software engineering skills, enabling contributions to a large, shared research infrastructure.

Nice to have:

Experience with real-time or streaming architectures is a big plus.
Familiarity with state-of-the-art architectures in audio and speech generation (e.g., diffusion models, neural codecs, flow-matching models, autoregressive decoders).
Experience with speech-to-speech or text-to-speech (TTS) systems.
Evidence of original research contributions, such as open-source work or publications in top-tier venues (e.g., ICASSP, Interspeech, NeurIPS, ICML).
