In April of this year, the Google Chrome browser added support for voice input. This means that you can speak into the microphone to populate an input field. I saw this feature in action at Wordnik, where you can speak the word whose definition you want to look up.
In this blog post, I shall discuss how simple it is to enable voice input for a text field in a web application. Of course, speech-enabled input is a work in progress, and currently it seems that only Google Chrome supports it. At times the results are quite funny, but that is not a problem. In the future, we should be seeing more of these interactions, right within the browser itself.
So, let's get going with the application. First, you can check out the application for yourself to get familiar with how it is all supposed to work. The preconditions are that you should have one of the latest versions of Google Chrome and a laptop/desktop with a working microphone. Go to Voice Enabled Quiz.
This should bring up a screen as shown below. Notice the little microphone shown at the end of the input text field.
The quiz is simple: all you have to do is identify the person in the photo. However, try not to type the answer. Simply click on the microphone, which should prompt you to speak as shown below:
Once it records your voice input, it contacts Google's servers to convert your speech to text, and the resulting value is put in the text field. If the answer is right (and which I won’t disclose here .. :-)), you will get some accolades. Go ahead and try it out. Just refresh the browser for a new try, since I have not paid too much attention to detailed coding here.
Let us look at the application code now:
Let us go through the important parts of the code.
- The first thing we want to check is whether the browser supports speech input. This is done via the onLoad event, which invokes the checkSpeechSupport() method. The checkSpeechSupport() method simply checks for the webkitSpeech attribute on an input element. If it is missing, the method alerts that your browser does not support speech input, which is what you get if you visit this application with Firefox, IE, or other browsers.
- The main form is straightforward, with an input field of type text. The x-webkit-speech attribute on the input element does the magic of showing the microphone at the end of the input text element.
- We trap the onwebkitspeechchange event, which is fired whenever your speech has been converted to text and populated in the input element. We call a method checkanswer() that checks your response against the answer and displays an alert.
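As a rough sketch of the logic described in the points above (this is not the quiz's actual source), the support check and the answer check could look something like the following. The helper names supportsSpeechInput and isCorrectAnswer, the element id "answer" and the placeholder expected answer are all made up for illustration. Only the webkitSpeech property and the onwebkitspeechchange handler come from the description above, and the sketch assumes a Chrome version that still exposes them.

```javascript
// Placeholder only: the real quiz answer is deliberately not disclosed.
const EXPECTED_ANSWER = "jane doe";

// Hypothetical helper: returns true when input elements created by the given
// document expose the webkitSpeech property (the check described above).
function supportsSpeechInput(doc) {
  const input = doc.createElement("input");
  return "webkitSpeech" in input;
}

// Hypothetical helper: compares the recognized text with the expected answer,
// ignoring case and extra whitespace, since recognized speech rarely matches
// the expected capitalization exactly.
function isCorrectAnswer(spoken, expected) {
  const normalize = (s) => s.trim().toLowerCase().replace(/\s+/g, " ");
  return normalize(spoken) === normalize(expected);
}

// Browser-only wiring. The guard lets the file load outside a browser too.
if (typeof document !== "undefined") {
  window.onload = function () {
    if (!supportsSpeechInput(document)) {
      alert("Sorry, your browser does not support speech input.");
      return;
    }
    // Assumes markup such as: <input type="text" id="answer" x-webkit-speech>
    const field = document.getElementById("answer");
    if (!field) {
      return;
    }
    // Fired once the spoken input has been converted to text and placed
    // in the field.
    field.onwebkitspeechchange = function () {
      if (isCorrectAnswer(field.value, EXPECTED_ANSWER)) {
        alert("Correct! Well done.");
      } else {
        alert("Not quite. I heard: " + field.value);
      }
    };
  };
}
```

Keeping the two checks as small functions that take their inputs as arguments means they can be tried out without a microphone, for example by calling isCorrectAnswer("  Jane   DOE ", EXPECTED_ANSWER) from the console.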
Simple, isn’t it? It is definitely a work in progress, as you will see if you throw various words at it. But if you think about this feature, it can be a great way to take input in different scenarios, or from people who might have trouble typing. One oddity to note is that you still need to use the mouse or keyboard to click the microphone.
Several other demos are available on the net. I suggest doing a Google search for “x-webkit-speech”.