
PlayHT

Our mission is to make Voice AI accessible and useful to all.

Play is a Voice AI company that specializes in building conversational voice models capable of cloning any voice or accent and generating speech in real-time.
Active Founders
Hammad Syed
Founder
Mahmoud Felfel
Founder
Jobs at PlayHT
Palo Alto, CA, US
$150K - $250K
3+ years
Palo Alto, CA
$80K - $150K
1+ years
Palo Alto, CA
$100K - $160K
Any (new grads ok)
Palo Alto, CA, US
$120K - $180K
1+ years
PlayHT
Founded: 2021
Batch: W23
Team Size: 35
Status: Active
Location: Mountain View
Primary Partner: Gustaf Alstromer
Company Launches
PlayHT 2.0 Turbo ⚡️ - The fastest generative AI Text-to-Speech API (+ YC deal)

TL;DR

We are thrilled to announce the release of the FASTEST Voice LLM to date! It streams speech from text in real time, in 300ms or less. Dive in and test it using our Playground, the available SDKs, or these Replit demos for Node.js and a ChatGPT integration.

YC Deal

All YC companies get 50% off API plans for 2 years. Check it here.

https://youtu.be/QBvugSdHpW8

Introduction

At PlayHT, our vision is to redefine human interactions with AI agents. Whether it’s for customer support or sales calls, AI tutors, or bringing Gaming NPCs to life, our goal is to revolutionize the way humans communicate with generative AI agents.

Today we announce our latest milestone on the road to fulfilling that vision: the launch of PlayHT Turbo, a new version of our conversational voice model, PlayHT 2.0, that generates speech in under 300ms over the network and in under 100ms for on-premise solutions.

Input Text Streaming

PlayHT 2.0 Turbo supports input text streaming. This feature integrates seamlessly with LLMs such as ChatGPT. Simply feed the LLM's output stream of tokens or words to the SDK, and it will process them in a way that balances generating expressive, contextual speech against reducing the TTFB (time to first byte).
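To make the trade-off concrete, here is a minimal sketch of one way a client could group an LLM token stream into chunks before handing them to a text-to-speech call. It is not the PlayHT SDK's actual logic, which this post does not describe. The function name `chunk_tokens`, the `max_chars` limit, and the naive sentence-boundary rule are all assumptions made for illustration.

```python
from typing import Iterable, Iterator

# Hypothetical rule: treat these characters as sentence boundaries.
SENTENCE_END = (".", "!", "?")


def chunk_tokens(tokens: Iterable[str], max_chars: int = 200) -> Iterator[str]:
    """Group streamed LLM tokens into sentence-sized text chunks.

    Illustrative only. Flushing at sentence boundaries gives the voice
    model enough context for expressive speech, while the max_chars cap
    keeps a long sentence from delaying the first audio.
    """
    buffer = ""
    for token in tokens:
        buffer += token
        stripped = buffer.rstrip()
        # Flush when a sentence ends or the buffer grows too long.
        if stripped.endswith(SENTENCE_END) or len(buffer) >= max_chars:
            if stripped:
                yield buffer.strip()
            buffer = ""
    # Emit whatever is left once the LLM stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    llm_tokens = ["Hello", " there", ".", " How", " are", " you", "?", " Fine"]
    for chunk in chunk_tokens(llm_tokens):
        print(chunk)
```

Smaller chunks lower the time to first byte but give the model less context per request. The boundary rule here is deliberately naive: it would split on the period in "3.14", for example.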

Output Speech Streaming

Once Turbo receives text, it starts streaming audio in approximately 70ms. However, due to unavoidable network latency, users typically receive the first audio within a 200ms to 400ms window.
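If the 200ms to 400ms window includes the roughly 70ms of model time, the network accounts for about 130ms to 330ms of what the user observes. One way to check this for your own setup is to time the arrival of the first audio chunk on the client. The sketch below is an assumption-laden illustration: `measure_ttfb` and `fake_audio_stream` are hypothetical helpers, and the fake stream stands in for whatever iterator of audio bytes your client actually returns.

```python
import time
from typing import Iterable, Iterator, Optional, Tuple


def measure_ttfb(audio_chunks: Iterable[bytes]) -> Tuple[Optional[float], int]:
    """Consume an audio stream; return (time to first byte in ms, total bytes).

    The timer starts when this function is called, so call it right
    where the request is issued.
    """
    start = time.perf_counter()
    ttfb_ms: Optional[float] = None
    total_bytes = 0
    for chunk in audio_chunks:
        # Record the delay once, at the first non-empty chunk.
        if ttfb_ms is None and chunk:
            ttfb_ms = (time.perf_counter() - start) * 1000.0
        total_bytes += len(chunk)
    return ttfb_ms, total_bytes


def fake_audio_stream(delay_s: float = 0.05, n_chunks: int = 3) -> Iterator[bytes]:
    """Hypothetical stand-in for a real streaming TTS response."""
    time.sleep(delay_s)  # simulated model plus network delay
    for _ in range(n_chunks):
        yield b"\x00" * 1024


if __name__ == "__main__":
    ttfb, size = measure_ttfb(fake_audio_stream())
    print(f"TTFB: {ttfb:.1f} ms, received {size} bytes")
```

Measuring on the client captures the full path the user experiences, which is the number that matters for conversational use.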

Check out our demo showcasing the ChatGPT integration with both input and output streaming:

https://www.youtube.com/watch?v=hF6IueCacfg

Create Delightful Conversations

Ready to redefine human-AI communication? Want to build the next AI Therapist, AI Tutor, Gaming NPC, or Personal Assistant that actually sounds human? We built this API for you. Get started now for free, then join our Discord and show us what you are building!

How can you help?

  • If you have any connections or potential partners in the conversational AI, customer support, or AI agents space, please send intros.
  • Retweet.
  • Build cool stuff :)
Other Company Launches

PlayHT: The Generative AI voice platform

Generate or clone voices and turn any text into human-like speech