{"info":{"_postman_id":"90fe79a8-8c90-4e6c-afec-89de2cce1039","name":"API V8","description":"<html><head></head><body><ul>\n<li><p>Use <code>/stream</code> for up to 1,000 characters in time-sensitive cases (e.g., a chatbot).</p>\n</li>\n<li><p>NEW: Use a WebSocket connection to <code>wss://api.v8.unrealspeech.com/streamWithTimestamps</code> to stream both audio and per-word timestamps.</p>\n</li>\n<li><p>Use <code>/speech</code> for up to 3,000 characters, which returns publicly accessible URLs to the audio and timestamps.</p>\n</li>\n<li><p>Use <code>/synthesisTasks</code> for up to 500,000 characters, which returns a task ID you can use to retrieve publicly accessible URLs to the audio and timestamps.</p>\n</li>\n</ul>\n<h1 id=\"migrating-from-v6-or-v7\">Migrating from V6 or V7</h1>\n<ol>\n<li><p>Change the endpoint from <a href=\"https://api.v6.unrealspeech.com\"><code>https://api.v6.unrealspeech.com</code></a> to <a href=\"https://api.v8.unrealspeech.com\"><code>https://api.v8.unrealspeech.com</code></a>.</p>\n</li>\n<li><p>Update the <code>VoiceId</code>.</p>\n</li>\n<li><p>Everything else is backward-compatible.</p>\n</li>\n</ol>\n<h1 id=\"common-request-body-schema\">Common Request Body Schema</h1>\n<p>The parameters below are used for the <code>/stream</code>, <code>/speech</code> and <code>/synthesisTasks</code> endpoints.</p>\n<div class=\"click-to-expand-wrapper is-table-wrapper\"><table>\n<thead>\n<tr>\n<th><strong>Property</strong></th>\n<th><strong>Type</strong></th>\n<th><strong>Required?</strong></th>\n<th><strong>Default Value</strong></th>\n<th><strong>Allowed Values</strong></th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td><code>VoiceId</code></td>\n<td>string</td>\n<td>Required</td>\n<td>N/A</td>\n<td><code>see below</code></td>\n</tr>\n<tr>\n<td><code>Bitrate</code></td>\n<td>string</td>\n<td>Optional</td>\n<td><code>192k</code></td>\n<td><code>16k</code> <code>32k</code> <code>48k</code> <code>64k</code> <code>128k</code> <code>192k</code> <code>256k</code> 
<code>320k</code></td>\n</tr>\n<tr>\n<td><code>Speed</code></td>\n<td>float</td>\n<td>Optional</td>\n<td>0</td>\n<td><code>-1.0</code> to <code>1.0</code></td>\n</tr>\n<tr>\n<td><code>Pitch</code></td>\n<td>float</td>\n<td>Optional</td>\n<td>1.0</td>\n<td><code>0.5</code> to <code>1.5</code></td>\n</tr>\n</tbody>\n</table>\n</div><h2 id=\"parameter-details\">Parameter Details</h2>\n<p><code>VoiceId</code></p>\n<ul>\n<li><p>American Female: Autumn, Melody, Hannah, Emily, Ivy, Kaitlyn, Luna, Willow, Lauren, Sierra</p>\n</li>\n<li><p>American Male: Noah, Jasper, Caleb, Ronan, Ethan, Daniel, Zane</p>\n</li>\n<li><p>Chinese Female: Mei, Lian, Ting, Jing</p>\n</li>\n<li><p>Chinese Male: Wei, Jian, Hao, Sheng</p>\n</li>\n<li><p>Spanish Female: Lucía</p>\n</li>\n<li><p>Spanish Male: Mateo, Javier</p>\n</li>\n<li><p>French Female: Élodie</p>\n</li>\n<li><p>Hindi Female: Ananya, Priya</p>\n</li>\n<li><p>Hindi Male: Arjun, Rohan</p>\n</li>\n<li><p>Italian Female: Giulia</p>\n</li>\n<li><p>Italian Male: Luca</p>\n</li>\n<li><p>Portuguese Female: Camila</p>\n</li>\n<li><p>Portuguese Male: Thiago, Rafael</p>\n</li>\n</ul>\n<p><code>Bitrate</code>: Defaults to <code>192k</code>. Use lower values for low bandwidth or to reduce the transferred file size. Use higher values for higher fidelity.</p>\n<p><code>Speed</code>: Defaults to <code>0</code>. Examples:</p>\n<ul>\n<li><p><code>0.5</code>: makes the audio 50% faster. (i.e., 60-second audio becomes 42 seconds)</p>\n</li>\n<li><p><code>-0.5</code>: makes the audio 50% slower. (i.e., 60-second audio becomes 90 seconds.)</p>\n</li>\n</ul>\n<p><code>Pitch</code>: Defaults to <code>1</code>. 
However, on the landing page, we default male voices to <code>0.92</code> as people tend to prefer lower/deeper male voices.</p>\n</body></html>","schema":"https://schema.getpostman.com/json/collection/v2.0.0/collection.json","toc":[{"content":"Migrating from V6 or V7","slug":"migrating-from-v6-or-v7"},{"content":"Common Request Body Schema","slug":"common-request-body-schema"}],"owner":"21194381","collectionId":"90fe79a8-8c90-4e6c-afec-89de2cce1039","publishedId":"2sAYXCmKEh","public":true,"customColor":{"top-bar":"FFFFFF","right-sidebar":"303030","highlight":"FF6C37"},"publishDate":"2025-02-13T20:45:40.000Z"},"item":[{"name":"/stream","id":"c13301ad-f392-400a-8ff7-96569f26d34e","protocolProfileBehavior":{"disableBodyPruning":true},"request":{"auth":{"type":"bearer","bearer":{"basicConfig":[{"key":"token","value":"<token>"}]},"isInherited":false},"method":"POST","header":[{"key":"Authorization","value":"Bearer API_KEY","type":"text"}],"body":{"mode":"raw","raw":"{\n    \"Text\": \"On the unfinished nature of kitchen knives\\n\\nI remember the first really sharp kitchen knife I used. At the time, I was a pissant stagiaire without a knife to my name, and this knife belonged to a piratical sous chef named Ho. Ho wanted me to cut a case of oranges for marmalade, the sort of task you give the stagiaire to keep them out of your way.\",\n    \"VoiceId\": \"Sierra\",\n    \"Bitrate\": \"192k\",\n    \"Pitch\": 1.02,\n    \"Speed\": 0.1\n}","options":{"raw":{"language":"json"}}},"url":"https://api.v8.unrealspeech.com/stream","description":"<p>Send up to 1,000 characters and receive a playback stream of audio back in 0.3 seconds. 
This is the endpoint to use for the fastest response.</p>\n<h1 id=\"endpoint-specific-body-schema\">Endpoint-Specific Body Schema</h1>\n<p>In addition to the common parameters, the <code>/stream</code> endpoint takes the parameters below.</p>\n<div class=\"click-to-expand-wrapper is-table-wrapper\"><table>\n<thead>\n<tr>\n<th><strong>Property</strong></th>\n<th><strong>Type</strong></th>\n<th><strong>Required?</strong></th>\n<th><strong>Default Value</strong></th>\n<th><strong>Allowed Values</strong></th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td><code>Text</code></td>\n<td>string</td>\n<td>Required</td>\n<td>N/A</td>\n<td>A string of text to be synthesized, up to 1,000 characters.</td>\n</tr>\n<tr>\n<td><code>Codec</code></td>\n<td>string</td>\n<td>Optional</td>\n<td><code>libmp3lame</code></td>\n<td><code>libmp3lame</code>, <code>pcm_mulaw</code>, <code>pcm_s16le</code></td>\n</tr>\n<tr>\n<td><code>Temperature</code></td>\n<td>float</td>\n<td>Optional</td>\n<td><code>0.25</code></td>\n<td><code>0.1</code> to <code>0.8</code></td>\n</tr>\n</tbody>\n</table>\n</div><h2 id=\"parameter-details\">Parameter Details</h2>\n<p><code>Text</code>: This is the text to be synthesized to audio. Up to 1,000 characters.</p>\n<p><code>Codec</code>: Defaults to <code>libmp3lame</code> (MP3). Use <code>pcm_mulaw</code> for phone calls. <code>pcm_s16le</code> returns 22050 Hz raw audio.</p>\n<p><code>Temperature</code>: Defaults to <code>0.25</code>. Lower values make the audio more deterministic and stable. Higher values make the audio more expressive and less deterministic. With a high <code>Temperature</code> value, the audio will be different every time. 
A high value also increases the probability of mispronunciation.</p>\n","urlObject":{"protocol":"https","path":["stream"],"host":["api","v8","unrealspeech","com"],"query":[],"variable":[]}},"response":[],"_postman_id":"c13301ad-f392-400a-8ff7-96569f26d34e"},{"name":"/speech","id":"294158b9-d2e0-4619-8f66-667520278637","protocolProfileBehavior":{"disableBodyPruning":true},"request":{"method":"POST","header":[{"key":"Authorization","value":"Bearer API_KEY","type":"text"}],"body":{"mode":"raw","raw":"{\n    \"Text\": \"Amid the intricate labyrinth of human neurons lies a molecule that has confounded and fascinated scientists for ages: the neurotransmitter known as dopamine. Often heralded as the pleasure molecule, dopamine's role is far more nuanced than just mediating euphoria.\",\n    \"VoiceId\": \"Sierra\",\n    \"Bitrate\": \"320k\",\n    \"OutputFormat\": \"uri\",\n    \"TimestampType\": \"word\"\n}","options":{"raw":{"language":"json"}}},"url":"https://api.v8.unrealspeech.com/speech","description":"<p>Send up to 3,000 characters and synchronously receive URLs to the MP3 audio and the JSON timestamps.</p>\n<h1 id=\"endpoint-specific-body-schema\">Endpoint-Specific Body Schema</h1>\n<p>In addition to the common parameters, the <code>/speech</code> endpoint takes the parameters below.</p>\n<div class=\"click-to-expand-wrapper is-table-wrapper\"><table>\n<thead>\n<tr>\n<th><strong>Property</strong></th>\n<th><strong>Type</strong></th>\n<th><strong>Required?</strong></th>\n<th><strong>Default Value</strong></th>\n<th><strong>Allowed Values</strong></th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td><code>Text</code></td>\n<td>string</td>\n<td>Required</td>\n<td>N/A</td>\n<td>A string of text to be synthesized, up to 3,000 characters.</td>\n</tr>\n<tr>\n<td><code>TimestampType</code></td>\n<td>string</td>\n<td>Optional</td>\n<td>sentence</td>\n<td><code>word</code> or <code>sentence</code></td>\n</tr>\n</tbody>\n</table>\n</div><h2 id=\"parameter-details\">Parameter 
Details</h2>\n<p><code>Text</code>: This is the text to be synthesized to audio.</p>\n<ul>\n<li>On average, 850 characters result in 1 minute of audio. In other words, 3,000 characters will result in approximately 3.5 minutes of audio.</li>\n<li>On average, it takes 1 second per 700 characters. In other words, 3,000 characters will take approximately 4 seconds.</li>\n</ul>\n<p><code>TimestampType</code>: By default, the endpoint returns per-sentence timestamps. Use <code>word</code> to get per-word timestamps. The timestamp feature is currently not supported via the <code>/stream</code> endpoint.</p>\n","urlObject":{"protocol":"https","path":["speech"],"host":["api","v8","unrealspeech","com"],"query":[],"variable":[]}},"response":[{"id":"b3832c3d-87ab-4148-8ad7-c971afc3ac31","name":"Successful Request","originalRequest":{"method":"POST","header":[{"key":"Authorization","value":"Bearer API_KEY","type":"text"}],"body":{"mode":"raw","raw":"{\n    \"Text\": \"The milestone Overture 2023-07-26-alpha.0 release includes four unique data layers: Places of Interest (POIs), Buildings, Transportation Network, and Administrative Boundaries. These layers, which combine various sources of open map data, have been validated and conflated through a series of quality checks, and are released in the Overture Maps data schema which was released publicly in June 2023. The Places dataset includes data on over 59 million places worldwide and will be a foundational element of navigation, local search, and many other location-based applications. 
The datasets are available for download at https://overturemaps.org/download/.\",\n    \"VoiceId\": \"Sierra\",\n    \"Bitrate\": \"320k\",\n    \"AudioFormat\": \"mp3\",\n    \"OutputFormat\": \"uri\",\n    \"TimestampType\": \"sentence\",\n    \"sync\": false\n}","options":{"raw":{"language":"json"}}},"url":"https://api.v8.unrealspeech.com/speech"},"status":"OK","code":200,"_postman_previewlanguage":"json","header":[{"key":"Server","value":"nginx/1.18.0 (Ubuntu)"},{"key":"Date","value":"Fri, 01 Sep 2023 22:05:22 GMT"},{"key":"Content-Type","value":"application/json"},{"key":"Content-Length","value":"234"},{"key":"Connection","value":"keep-alive"},{"key":"Access-Control-Allow-Origin","value":"*"}],"cookie":[],"responseTime":null,"body":"{\n    \"CreationTime\": \"2025-02-13T12:25:25.81Z\",\n    \"OutputUri\": \"https://unreal-expire-in-90-days.s3-us-west-2.amazonaws.com/996b0447-e400-409a-8069-99a6a6c6ee43-0.mp3\",\n    \"RequestCharacters\": 158,\n    \"TaskId\": \"996b0447-e400-409a-8069-99a6a6c6ee43\",\n    \"TaskStatus\": \"completed\",\n    \"TimestampsUri\": \"https://unreal-expire-in-90-days.s3-us-west-2.amazonaws.com/996b0447-e400-409a-8069-99a6a6c6ee43-0.json\",\n    \"VoiceId\": \"Sierra\"\n}"}],"_postman_id":"294158b9-d2e0-4619-8f66-667520278637"},{"name":"/synthesisTasks","id":"436b26dc-d6cc-4a99-9670-b68e17b813ae","protocolProfileBehavior":{"disableBodyPruning":true},"request":{"method":"POST","header":[{"key":"Authorization","value":"Bearer API_KEY","type":"text"}],"body":{"mode":"raw","raw":"{\n    \"Text\": [\"The milestone Overture 2023-07-26-alpha.0 release includes four unique data layers: Places of Interest (POIs), Buildings, Transportation Network, and Administrative Boundaries. These layers, which combine various sources of open map data, have been validated and conflated through a series of quality checks, and are released in the Overture Maps data schema which was released publicly in June 2023. 
The Places dataset includes data on over 59 million places worldwide and will be a foundational element of navigation, local search, and many other location-based applications. The datasets are available for download at https://overturemaps.org/download/.\"],\n    \"VoiceId\": \"Sierra\",\n    \"Bitrate\": \"320k\",\n    \"OutputFormat\": \"uri\",\n    \"TimestampType\": \"word\"\n}","options":{"raw":{"language":"json"}}},"url":"https://api.v8.unrealspeech.com/synthesisTasks","description":"<p>Send up to 500,000 characters and immediately receive a <code>TaskId</code>. GET <code>/synthesisTasks/TaskId</code> to check the status. Optimized for cost savings and longer requests. It generally processes ~700 characters per second. A 500,000-character request takes ~15 minutes and results in ~10 hours of audio.</p>\n<h1 id=\"endpoint-specific-body-schema\">Endpoint-Specific Body Schema</h1>\n<p>In addition to the common parameters, the <code>/synthesisTasks</code> endpoint takes the parameters below.</p>\n<div class=\"click-to-expand-wrapper is-table-wrapper\"><table>\n<thead>\n<tr>\n<th><strong>Property</strong></th>\n<th><strong>Type</strong></th>\n<th><strong>Required?</strong></th>\n<th><strong>Default Value</strong></th>\n<th><strong>Allowed Values</strong></th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td><code>Text</code></td>\n<td>string or array of strings</td>\n<td>Required</td>\n<td>N/A</td>\n<td>A string, or a list of strings, of text to be synthesized, up to 500,000 characters.</td>\n</tr>\n<tr>\n<td><code>TimestampType</code></td>\n<td>string</td>\n<td>Optional</td>\n<td>sentence</td>\n<td><code>word</code> or <code>sentence</code></td>\n</tr>\n<tr>\n<td><code>CallbackUrl</code></td>\n<td>string</td>\n<td>Optional</td>\n<td>N/A</td>\n<td>A callback URL to POST <code>TaskId</code> and <code>TaskStatus</code></td>\n</tr>\n</tbody>\n</table>\n</div><h2 id=\"parameter-details\">Parameter Details</h2>\n<p><code>Text</code>: This is the text to be synthesized to audio.</p>\n<ul>\n<li><p>On average, 800 characters result in 1 
minute of audio. In other words, 500,000 characters will result in approximately 10 hours of audio.</p>\n</li>\n<li><p>On average, it takes 1 second per 800 characters. In other words, 500,000 characters will take approximately 10 minutes.</p>\n</li>\n<li><p>If you POST a string, it'll return a single <code>OutputUri</code>. If you POST a list of strings, <code>OutputUri</code> will be a list of MP3 URLs.</p>\n</li>\n<li><p>For longer texts, we recommend sending a list of strings. This way, instead of waiting for the entire task to be completed to access audio, you can start accessing the first audio files in the list, e.g., <code>TaskId-0.mp3</code>.</p>\n</li>\n</ul>\n<p><code>TimestampType</code>: By default, the endpoint returns per-sentence timestamps. Use <code>word</code> to get per-word timestamps. The timestamp feature is currently not supported via the <code>/stream</code> endpoint.</p>\n<p><code>CallbackUrl</code>: If provided, the server will POST a JSON body to the <code>CallbackUrl</code>. A sample body is shown below; <code>TaskStatus</code> is either <code>completed</code> or <code>failed</code>.</p>\n<pre class=\"click-to-expand-wrapper is-snippet-wrapper\"><code class=\"language-json\"> {\n   \"TaskId\": \"8282b92d\",\n   \"TaskStatus\": \"completed\"\n }\n\n</code></pre>\n","urlObject":{"protocol":"https","path":["synthesisTasks"],"host":["api","v8","unrealspeech","com"],"query":[],"variable":[]}},"response":[{"id":"162d8429-9038-44ca-951d-b1afcf218ccf","name":"Successful Request","originalRequest":{"method":"POST","header":[{"key":"Authorization","value":"Bearer API_KEY","type":"text"}],"body":{"mode":"raw","raw":"{\n    \"Text\": \"The milestone Overture 2023-07-26-alpha.0 release includes four unique data layers: Places of Interest (POIs), Buildings, Transportation Network, and Administrative Boundaries. 
These layers, which combine various sources of open map data, have been validated and conflated through a series of quality checks, and are released in the Overture Maps data schema which was released publicly in June 2023. The Places dataset includes data on over 59 million places worldwide and will be a foundational element of navigation, local search, and many other location-based applications. The datasets are available for download at https://overturemaps.org/download/.\",\n    \"VoiceId\": \"Sierra\",\n    \"Bitrate\": \"320k\",\n    \"AudioFormat\": \"mp3\",\n    \"OutputFormat\": \"uri\",\n    \"TimestampType\": \"sentence\",\n    \"sync\": false\n}","options":{"raw":{"language":"json"}}},"url":"https://api.v8.unrealspeech.com/synthesisTasks"},"status":"OK","code":200,"_postman_previewlanguage":"json","header":[{"key":"Server","value":"nginx/1.18.0 (Ubuntu)"},{"key":"Date","value":"Fri, 01 Sep 2023 22:05:22 GMT"},{"key":"Content-Type","value":"application/json"},{"key":"Content-Length","value":"234"},{"key":"Connection","value":"keep-alive"},{"key":"Access-Control-Allow-Origin","value":"*"}],"cookie":[],"responseTime":null,"body":"{\n    \"SynthesisTask\": {\n        \"CreationTime\": \"2025-02-13T12:26:01.22Z\",\n        \"OutputUri\": [\n            \"https://unreal-expire-in-90-days.s3-us-west-2.amazonaws.com/b7cf8b52-c269-4906-8177-22c8fb1010c7-0.mp3\"\n        ],\n        \"RequestCharacters\": 4,\n        \"TaskId\": \"b7cf8b52-c269-4906-8177-22c8fb1010c7\",\n        \"TaskStatus\": \"scheduled\",\n        \"TimestampsUri\": [\n            \"https://unreal-expire-in-90-days.s3-us-west-2.amazonaws.com/b7cf8b52-c269-4906-8177-22c8fb1010c7-0.json\"\n        ],\n        \"VoiceId\": \"Sierra\"\n    }\n}"}],"_postman_id":"436b26dc-d6cc-4a99-9670-b68e17b813ae"},{"name":"/synthesisTasks","id":"95b3b7f4-708b-45e1-8c0e-6421764beea8","protocolProfileBehavior":{"disableBodyPruning":true},"request":{"method":"GET","header":[{"key":"Authorization","value":"Bearer 
API_KEY","type":"text"}],"body":{"mode":"raw","raw":"","options":{"raw":{"language":"json"}}},"url":"https://api.v8.unrealspeech.com/synthesisTasks/TaskId","description":"<p>Check the status of a speech synthesis task.</p>\n","urlObject":{"protocol":"https","path":["synthesisTasks","TaskId"],"host":["api","v8","unrealspeech","com"],"query":[],"variable":[]}},"response":[{"id":"29ebf6c8-e0e7-443c-bed5-d60b7fdcec8d","name":"Completed Request","originalRequest":{"method":"GET","header":[{"key":"Authorization","value":"Bearer API_KEY","type":"text"}],"url":"https://api.v8.unrealspeech.com/synthesisTasks/785010b1"},"status":"OK","code":200,"_postman_previewlanguage":"json","header":[{"key":"Server","value":"nginx/1.18.0 (Ubuntu)"},{"key":"Date","value":"Fri, 01 Sep 2023 22:06:09 GMT"},{"key":"Content-Type","value":"application/json"},{"key":"Content-Length","value":"236"},{"key":"Connection","value":"keep-alive"},{"key":"Access-Control-Allow-Origin","value":"*"}],"cookie":[],"responseTime":null,"body":"{\n    \"SynthesisTask\": {\n        \"CreationTime\": \"2025-02-13T12:26:01.22Z\",\n        \"OutputUri\": [\n            \"https://unreal-expire-in-90-days.s3-us-west-2.amazonaws.com/b7cf8b52-c269-4906-8177-22c8fb1010c7-0.mp3\"\n        ],\n        \"RequestCharacters\": 4,\n        \"StatusDetails\": \"N/A\",\n        \"TaskId\": \"b7cf8b52-c269-4906-8177-22c8fb1010c7\",\n        \"ProcessingTime\": \"0.6402294635772705\",\n        \"TaskStatus\": \"completed\",\n        \"TimestampsUri\": [\n            \"https://unreal-expire-in-90-days.s3-us-west-2.amazonaws.com/b7cf8b52-c269-4906-8177-22c8fb1010c7-0.json\"\n        ],\n        \"VoiceId\": \"Sierra\"\n    }\n}"}],"_postman_id":"95b3b7f4-708b-45e1-8c0e-6421764beea8"}]}