Behavior, Content, Money – 3 Things you should never give away for free!!!

BCmoney MobileTV

Audio Recognition overview (TTS, STT, Voice vs. Speech)

Posted by bcmoney on July 16, 2015 in Multimedia, Semantic Web, Web Services


Yeti brand of USB microphone by the “Blue Microphones” company. (Photo credit: Wikipedia)

In many ways, these are still the early days of innovation across the several sub-categories of Audio Recognition.

Microphones

Thanks to technological advancements, microphones have become smaller and smaller (perhaps driven to some extent by the post-war and Cold War eras, when espionage became so critical that governments worldwide competed to produce ever-better audio recording technologies). Either way, a good microphone is the key technology for ensuring accurate, high-quality results. While software solutions are increasingly capable of making do with embedded microphones (such as the commodity-grade ones that tend to come installed in mobile phones, laptops and other devices), a good external microphone is still essential for high accuracy. Examples of external microphones include wearable headsets and standalone mics connected via Bluetooth, USB cable or analog/digital cords. The technology has now improved to the point that the average person can produce audio on par with that of major production studios, all within a reasonable budget.
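
To make "accuracy" a little more concrete: one common way to score a recognizer's output is the word error rate (WER), i.e. the number of word substitutions, deletions and insertions needed to turn the recognized text into the reference transcript, divided by the length of the reference. The sketch below is only an illustration under that assumption; the function name and the sample transcripts are made up for this example and do not come from any particular product.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Hypothetical transcripts: what was actually said vs. what a recognizer
# might return when fed audio from a noisy embedded microphone.
spoken = "turn on the kitchen lights"
recognized = "turn on a kitchen light"
print(word_error_rate(spoken, recognized))
```

A lower score is better: a perfect transcript scores 0.0, and comparing scores for the same recording captured with different microphones is one simple way to check how much the hardware matters.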

Speech Recognition

What was said?

Bell Labs pioneered advancements in this area with the creation of the first Text-To-Speech (TTS) technologies, and later Speech-To-Text (STT), as part of their ____ projects in the 19??s.

 

Voice Recognition

Who said it?

Security companies have been adding Voice Recognition capabilities to their systems since _____.
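
To illustrate the "who said it?" question, here is a minimal sketch of speaker identification. It assumes that each voice has already been reduced to a numeric feature vector (a "voiceprint") by some earlier processing step that is not shown, and it simply picks the enrolled speaker whose voiceprint is most similar to the new sample. The names, vectors and threshold are all hypothetical, and real systems are far more involved.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(sample, enrolled, threshold=0.8):
    """Return the best-matching enrolled name, or None if nobody is close enough."""
    best_name, best_score = None, threshold
    for name, voiceprint in enrolled.items():
        score = cosine_similarity(sample, voiceprint)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical voiceprints; real ones would come from audio feature extraction.
ENROLLED = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
}

print(identify_speaker([0.85, 0.15, 0.35], ENROLLED))  # close to alice's voiceprint
print(identify_speaker([0.0, 0.0, 1.0], ENROLLED))     # matches nobody well enough
```

Note that nothing here looks at the words being spoken: Voice Recognition only cares about the identity behind the audio, whereas Speech Recognition only cares about the content.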

 

Agents

Something the Semantic Web promised but did not initially deliver was the emergence of Intelligent Agents (i.e. code-powered Personal Assistants). Today, we finally see some of this promise being realized through products like Siri by Apple, Cortana by Microsoft, Google Now by Google and Alexa/Echo by Amazon.

 

Web APIs

Microsoft has shipped its Windows-specific, OS-level Speech API (SAPI) with the operating system since Windows XP, and developers have been integrating Voice/Speech into their Windows apps for a while now. It will soon also offer web-based APIs, following the announcement of “Project Oxford”. Project Oxford is aimed at building a set of intelligent services to support information retrieval, which can optionally tie into the Bing Search APIs (which support queries by content type, including Web, News, Images and Video).
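
As a rough idea of what calling a web-based speech API could look like from client code, the sketch below packages a short audio clip as a WAV file and wraps it in an HTTP POST request. Everything service-specific here is an assumption made for illustration: the URL, the query parameter, the authorization header and the key are placeholders, not the actual Project Oxford interface, so check the provider's documentation for the real values. The request is only built, never sent.

```python
import io
import urllib.request
import wave

# Hypothetical endpoint and key: placeholders, not a real service.
SPEECH_API_URL = "https://speech.example.invalid/recognize?language=en-US"
API_KEY = "YOUR-API-KEY"


def make_silent_wav(seconds=1, sample_rate=16000):
    """Build an in-memory 16-bit mono WAV of silence as stand-in audio."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * sample_rate * seconds)
    return buffer.getvalue()


def build_recognition_request(audio_bytes):
    """Wrap WAV bytes in a POST request for the (hypothetical) speech endpoint."""
    return urllib.request.Request(
        SPEECH_API_URL,
        data=audio_bytes,
        method="POST",
        headers={
            "Content-Type": "audio/wav",
            "Authorization": "Bearer " + API_KEY,  # assumed auth scheme
        },
    )


request = build_recognition_request(make_silent_wav())
print(request.get_method(), request.full_url, len(request.data), "bytes")
```

With a real service, the request would be sent with `urllib.request.urlopen(request)` and the recognized text read from the response, typically as JSON.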

 

 

