Hackviking He killed Chuck Norris, he ruled dancing so he took up a new hobby…

12Feb/160

Google Speech API – returns no result

Google Speech API takes a flac audio file and converts it to text. This API is aimed at Chromium and Android developers first and for most so the documentation for other plattforms isn't that good. This article contains C# code but the principal is the same for other languages. I haven't looked at this before today but got a question from a developer that was stuck only receiving {"result":[]} in return from the API even though the HTTP response was 200 OK. I was pretty impressed with the whole thing so here is what I found today and a simple example to get you started.

Get access

To get access to the API you need to obtain an API key. Simply logging into Google Cloud Console, creating a new project and adding the API will not work. First you need to join the Chromium-dev group on Google groups with the same Google account used in the cloud console.

Then you can head over to Google Cloud Console and:

  1. Create a new project.
  2. Go into API and search for Speech API and add it.
  3. Create credentials for the API, select Server Key.
  4. Put in your public IP and click Create.
  5. Save your API key generated somewhere.

Note: The API only allows you 50 requests a day!

Test is out

You can use the same flac file I used or get/create your own. Just make sure its 16-bit PCM and mono. That was the issue causing the {"result":[]} result.

I used a simple C# windows application with one button running this code:

string api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
string path = @"C:\temp\good-morning-google.flac";

byte[] bytes = System.IO.File.ReadAllBytes(path);

WebClient client = new WebClient();
client.Headers.Add("Content-Type", "audio/x-flac; rate=44100");
byte[] result = client.UploadData(string.Format(
				"https://www.google.com/speech-api/v2/recognize?client=chromium&lang=en-us&key={0}", api_key), "POST", bytes);

string s = client.Encoding.GetString(result);

Then you should get this result back:

{"result":[{"alternative":[{"transcript":"good morning Google how are you feeling today","confidence":0.987629}],"final":true}],"result_index":0}

Important about the flac file

For this to work you need to have a 16-bit PCM mono flac file. The issue that the developer I helped with this ran into was that his file was 32-bit float and stereo. If your not sure download Audacity and check/convert your file.

audacity google speech api

Posted by Kristofer Källsbo

Comments (0) Trackbacks (0)

No comments yet.


Leave a Reply

No trackbacks yet.