Clone with Audio Prompt

Overview

There are two ways to manage voices with the Deeptune API:

  1. Use Deeptune’s built-in voice management to upload and manage voices.
  2. Manage voices yourself (e.g. in your own database) and clone them with generate_from_prompt.

This tutorial covers option 2: managing voices yourself.
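
As an illustration of what "managing voices yourself" could look like, here is a minimal sketch that keeps a mapping from voice names to prompt audio in a local SQLite database. The table name, column names, and helper functions (create_voice_store, save_voice, get_prompt_audio) are hypothetical and are not part of the Deeptune API; any storage you already use would work just as well.

```python
import sqlite3


def create_voice_store(path=":memory:"):
    """Open (or create) a small SQLite store for voice prompts.

    Hypothetical helper: the schema is an assumption, not a Deeptune convention.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS voices ("
        "name TEXT PRIMARY KEY, "
        "prompt_audio TEXT NOT NULL)"
    )
    return conn


def save_voice(conn, name, prompt_audio):
    """Store or overwrite the prompt audio (URL or data URI) for a voice name."""
    conn.execute(
        "INSERT OR REPLACE INTO voices (name, prompt_audio) VALUES (?, ?)",
        (name, prompt_audio),
    )
    conn.commit()


def get_prompt_audio(conn, name):
    """Return the stored prompt audio for a voice name, or raise KeyError."""
    row = conn.execute(
        "SELECT prompt_audio FROM voices WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return row[0]


store = create_voice_store()
save_voice(store, "michael", "https://deeptune-demo.s3.amazonaws.com/Michael.wav")
```

The value returned by get_prompt_audio(store, "michael") can then be passed as the prompt_audio argument of generate_from_prompt.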

Using Prompt Audio

If you prefer to manage voices on your own, you can use your own audio file as a reference for the voice clone.

Using a URL prompt

The URL must be publicly accessible (so that our servers can download it) and must point to audio in a valid format.

from deeptune.client import Deeptune
from deeptune.utils import play

client = Deeptune(
    api_key="YOUR_API_KEY",
)

audio = client.text_to_speech.generate_from_prompt(
    text="Wow, Deeptune's text to speech API is amazing!",
    prompt_audio="https://deeptune-demo.s3.amazonaws.com/Michael.wav",
)
play(audio)

Using a file prompt

The file must be in a valid audio format and must be passed as a base64-encoded data URI.

import base64
from deeptune.client import Deeptune
from deeptune.utils import play

client = Deeptune(
    api_key="YOUR_API_KEY",
)

# Open the file and read its contents as bytes
with open("Michael.wav", "rb") as audio_file:
    audio_bytes = audio_file.read()

# Encode the bytes to base64
audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")

audio = client.text_to_speech.generate_from_prompt(
    text="Wow, Deeptune's text to speech API is amazing!",
    prompt_audio=f"data:audio/wav;base64,{audio_base64}",
)
play(audio)
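
If you build data URIs in several places, the read-and-encode steps above can be wrapped in a small helper. The sketch below is one possible way to do that; audio_file_to_data_uri is a hypothetical name, not part of the Deeptune SDK, and the default MIME type audio/wav is taken from the example above. Other MIME types are an assumption you would need to check against the formats Deeptune accepts.

```python
import base64
from pathlib import Path


def audio_file_to_data_uri(path, mime_type="audio/wav"):
    """Read an audio file and return it as a base64 data URI.

    Hypothetical helper (not part of the Deeptune SDK). The caller is
    responsible for passing a MIME type that matches the file contents.
    """
    audio_bytes = Path(path).read_bytes()
    audio_base64 = base64.b64encode(audio_bytes).decode("ascii")
    return f"data:{mime_type};base64,{audio_base64}"
```

The result can be passed directly as prompt_audio, for example prompt_audio=audio_file_to_data_uri("Michael.wav").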