Behavior, Content, Money – 3 Things you should never give away for free!!!

BCmoney MobileTV

Speech Recognition – Apple’s Siri (iOS)

Posted by bcmoney on October 13, 2015 in Mobile, Semantic Web, Web Services with No Comments



Speech Recognition really hit the mainstream when Apple acquired the company and technology behind Siri, a voice-activated virtual assistant.

 

Speech Recognition – Google Now (Android) + Speech API & TTS hacks

Posted by bcmoney on September 20, 2015 in Mobile, Semantic Web, Web Services with No Comments



Google Now – Speech Recognition search assistant (Photo credit: Wikipedia)

In this next part of a series on Speech Recognition, I’ll take a look at some legacy Speech API & TTS hacks which led up to the introduction of Google Now (for Android devices). Google Now was intended to compete with Siri, but voice-based search was actually already available on both Android & iOS via the Google app.

 

 

Speech Recognition – Nuance’s Dragon NaturallySpeaking 14

Posted by bcmoney on August 26, 2015 in Multimedia, Semantic Web, Web Services with No Comments



A sample dictation in Microsoft Word 2010. (Photo credit: Wikipedia)

Nuance’s marketing and promotional material often calls it Voice Recognition, which doesn’t help the average user understand exactly what the product can do. In fact, up until recently Nuance’s suite of Audio Recognition software has focused strictly on Speech Recognition.

As such, they have emerged as one of the industry leaders in this field, now on version 14 of their flagship product Dragon NaturallySpeaking.

Nuance/Dragon Company Histories

They certainly have history on their side: the first academic iteration was created in 1975 by Dr. James Baker at Carnegie Mellon University, in a partnership with the IBM Thomas J. Watson Research Center. The prototype reached a “beta version” by 1982, when Dr. Baker left the University to start a company with his wife focused on commercializing the DRAGON system they had developed together. Due to financial struggles and a desire to improve the underlying recognition engine before entering the consumer market, however, the first production-grade 1.0 version was not released until June of 1997. The company went through financial turmoil and several mergers & acquisitions, but the common theme was that investors and consumers were truly interested in the products and services that Dragon would make possible. It finally found its stride when ScanSoft, an Optical Character Recognition (OCR) and document scanning company with ties to infamous futurist Ray Kurzweil, acquired the Dragon assets and then merged with another fledgling Speech Recognition company named Nuance Communications, which itself had roots in academia through SRI’s STAR laboratory.

Mainstream Breakthroughs

The following products/partnerships were the key breakthroughs:

  • Dragon NaturallySpeaking 9 achieves >90% recognition with training
  • Dragon NaturallySpeaking 11 achieves >90% recognition without training
  • Dragon Medical
  • Dragon Legal
  • Dragon Dictate iOS app
  • LG Smart TV 2012
  • Siri project/company partnership (speech recognition powered by Nuance/Dragon)
  • Siri sale to Apple for iOS integration
  • Apple Mac OSX

They’ve also recently announced that, after many years of requests, they will open up their software’s capabilities as a broader platform by publishing APIs and inter-connectable Web Services which other developers can use to build Speech Recognition into their own applications.

Nuance’s Dragon NaturallySpeaking – Voice Command Cheat Sheet

https://www.nuance.com/content/dam/nuance/en_us/collateral/dragon/command-cheat-sheet/ct-dragon-professional-individual-en-us.pdf

Audio Recognition overview (TTS, STT, Voice vs. Speech)

Posted by bcmoney on July 16, 2015 in Multimedia, Semantic Web, Web Services with No Comments



Yeti brand of USB microphone by the “Blue Microphones” company. (Photo credit: Wikipedia)

It is still in many ways the early days of innovation in the several sub-categories of Audio Recognition.

Microphones

Thanks to technological advancements, microphones have become smaller and smaller (perhaps driven to some extent by the post-war and Cold War eras, when espionage became so critical that governments worldwide competed to produce better and better audio recording technologies). Either way, a good Microphone is the key technology for ensuring accurate, high-quality results. While software solutions are increasingly capable of making do with embedded microphones (such as the commodity-grade ones that tend to come installed in Mobile Phones, Laptops or other devices), a good external Microphone is still essential for high accuracy. Examples of external microphones include wearable headsets or standalone mics connected via Bluetooth, USB cable or Analog/Digital cords. The technology has now improved to the point that the average person can produce audio on par with that of major production studios, all within a reasonable budget.

Speech Recognition

What was said?

Bell Labs pioneered advancements in this area with the creation of the first Text-To-Speech (TTS) technologies, and later Speech-To-Text (STT) during part of their ____ projects in the 19??’s.

 

Voice Recognition

Who said it?

Security companies have started adding Voice Recognition capabilities to their systems since _____ .

 

Agents

Something the Semantic Web promised but did not initially deliver was the emergence of Intelligent Agents (i.e. code-powered Personal Assistants). Today, we finally see some of this promise being realized through Siri by Apple, Cortana by Microsoft, Google Now and Alexa/Echo by Amazon.

 

Web APIs

Microsoft has offered a Windows-specific, OS-level Speech API (SAPI) since Windows XP, and developers have been integrating Voice/Speech into their Windows apps for a while now. With the announcement of “Project Oxford”, it will soon also offer web-based APIs. Project Oxford is aimed at building a set of intelligent services to support information retrieval, which can optionally tie into the Bing Search APIs (which support queries by content type, such as Web, News, Images and Video).

 

 

The Internet of Things – If this then what?

Posted by bcmoney on January 10, 2015 in Cloud Computing, Mobile, Web Services with No Comments



A technology roadmap of the Internet of Things. (Photo credit: Wikipedia)

The “Internet of Things” (or IoT) is an evolution of microprocessor engineering, sensor innovations, wireless communications technologies, and of course the Internet itself. An IoT “thing” could be any natural or man-made object that can be assigned an IP address and provided with the ability to transfer data over a network. Examples include inanimate objects (e.g. many cars have more built-in sensors than early NASA shuttles, doing everything from alerting the driver when tire pressure is low to regulating anti-lock braking systems or airbag deployments during emergencies), animals (e.g. a wild animal tagged with a biochip transponder to track position, population size or migration patterns) or people (e.g. an elderly person with a heart monitor or any other implant or device which tracks health data). In all of these examples, “things” are provided with unique identifiers and the ability to transfer data over a network without requiring human-to-human, human-to-animal, or human-to-computer interaction.

A major question for the Internet of Things is now what the “killer applications” will be. As in, what real-world problems will be solved, what efficiency improvements can be gained, or which tangible benefits can be realized for the end user? By connecting more and more devices (thanks to the proliferation of IPv6 addresses, enough to give every atom on Earth’s surface a dedicated IP), we are of course creating more and more usage data, observational data and metadata about the interactions of these devices and users with the rest of the world, which has also placed even more importance on Big Data.

Certainly, a big part of IoT will be task automation (the absence of a user during operation of devices and their software), enabling devices to function more and more autonomously and theoretically freeing users from manually entering commands via a command-line or clicking/tapping on controls within a user interface.
Enter If This Then That (IFTTT), a service which enables you to “wire together” the capabilities of two disparate sources, or otherwise integrate their data, to accomplish a particular goal.
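
To make the “if this then that” idea concrete, here is a minimal sketch in Python of a trigger/action rule, using the low tire pressure alert mentioned earlier as the example. This is not IFTTT’s actual API; the names `Rule`, `run_rules`, `tire_pressure_psi` and the 28 psi threshold are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An event is just a bag of sensor readings reported by a "thing".
Event = Dict[str, float]


@dataclass
class Rule:
    """One 'if this then that' pairing: a trigger test plus an action."""
    name: str
    trigger: Callable[[Event], bool]   # the "this"
    action: Callable[[Event], str]     # the "that"


def run_rules(rules: List[Rule], event: Event) -> List[str]:
    """Run the action of every rule whose trigger matches the event."""
    results = []
    for rule in rules:
        if rule.trigger(event):
            results.append(rule.action(event))
    return results


# Hypothetical threshold, for illustration only.
LOW_PRESSURE_PSI = 28.0

low_tire_rule = Rule(
    name="low tire pressure alert",
    # A missing reading is treated as "not low" so the rule stays silent.
    trigger=lambda e: e.get("tire_pressure_psi", float("inf")) < LOW_PRESSURE_PSI,
    action=lambda e: f"ALERT: tire pressure low ({e['tire_pressure_psi']} psi)",
)

if __name__ == "__main__":
    print(run_rules([low_tire_rule], {"tire_pressure_psi": 25.0}))
    print(run_rules([low_tire_rule], {"tire_pressure_psi": 32.0}))
```

In a real integration the trigger and the action would live on two different services (a car’s sensor feed and a notification channel, say), and the rule would only be the glue between them.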

BC$ = Behavior, Content, Money

The goal of the BC$ project is to raise awareness and make changes with respect to the three pillars of information freedom - Behavior (pursuit of interests and passions), Content (sharing/exchanging ideas in various formats), Money (fairness and accessibility) - bringing to light the fact that:

1. We regularly hand over our browser histories, search histories and daily online activities to companies that want our money, or, to benefit from our use of their services with lucrative ad deals or sales of personal information.

2. We create and/or consume interesting content on their services, but we aren't adequately rewarded for our creative efforts or loyalty.

3. We pay money to be connected online (and possibly also over mobile), yet we lose both time and money by allowing companies to market to us with unsolicited advertisements, irrelevant product offers and unfairly structured service pricing plans.
