Voice recognition has come a long way since I first became acquainted with it. Back in the late 1980s and early 1990s, I was a management consultant with Arthur D. Little, based at the US headquarters in Acorn Park (Cambridge, MA). Some colleagues and I took a short trip to BBN’s headquarters at Fresh Pond in Cambridge, MA, to get a briefing on their latest experimentation with voice recognition.
At that time, Bolt Beranek and Newman was still its own company. (In September 2009, Raytheon acquired BBN as a wholly owned subsidiary.) I didn’t go over to BBN that frequently – maybe two or three times. But the cooperation and conversations between ADL and BBN were eminently logical. Our founder, Arthur D. Little, was a chemist from MIT, and Beranek and Bolt were also from MIT (a professor of acoustics and a professor of physics, respectively). Our two firms shared an ‘anything is possible’ engineering mindset that led each firm to some impressive products.
They demonstrated a voice recognition system they could talk to: it could answer questions for information as well as take airline travel reservations. One of the engineers on the project was from Scotland, and he proudly demonstrated how their system understood his thick Scottish brogue.
But jumping back to the 1971/1972 time period: after getting out of the Army, I worked as a resident visitor at Bell Labs in Madison, NJ. (This was the truly wondrous time before the US Federal Government forced the divestiture that broke up the Bell System. Hell, if the US government can’t break up what was the absolute best communications company on the planet, then what can it do?)
During that time period, Bell Labs was experimenting with voice synthesis tied to the Job Control Language (JCL) used in the creation and submission of programming decks. Once the job had run, whether to success or not, the computer that ran it would call the phone on your desk, let you know that the job named (whatever name you embedded in the JCL) was done, and tell you which box in the IT department to go to to pick up the programming deck and the output. We were told that the computer had a hard time pronouncing ‘S’s and ‘F’s … naturally, we put as many S’s and F’s into the JCL as we could fit.
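For readers who never punched a deck: a JCL job began with a JOB card carrying the job’s name – the very name the computer would later pronounce over the phone. A minimal sketch of such a deck (the job name, account field, and class values here are hypothetical; IEFBR14 is IBM’s classic do-nothing program):

```
//SSFFSSFF JOB (ACCT),'S AND F TEST',CLASS=A
//*  The job name on the JOB card above (SSFFSSFF) is what
//*  the computer would read back when the job finished.
//STEP1    EXEC PGM=IEFBR14
```

A deck stuffed with S’s and F’s in that first field was, of course, the whole point of the prank.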
Quick math: for 50 years or so, I’ve experienced voice recognition in some form or another. Now, of course, we have voice response units (VRUs), which are their own form of hell. Hell for customers, but money-saving systems for companies across every industry (at least it seems like every industry is using them).
But society now has voice assistants from Amazon, Google, and Apple to help us throughout our day. Perhaps having them balances the hell that company VRUs put us through when we call a company and just want to speak to a human without slicing through a VRU tree before we get our query or objective resolved.
I can’t speak to (sorry, no pun intended) the Amazon or Google voice assistant capabilities. But just before the 2020 Holiday Season, we purchased an Apple HomePod Mini. And we love it: it does what we want (play music) and it can’t do what we don’t want it to (enable us to purchase anything). Supposedly our music requests (and our occasional requests for information) are encrypted: personally, I don’t care if they are or not.
I can see the obvious danger of having either an Amazon or Google voice assistant: running up some serious expenses. It is so easy to ask Siri to play a particular piece of music, or to play music by a specific artist, that I can immediately see the dangerous place I would quickly fall into: asking for … well, for whatever to be purchased and delivered to our home.
We’ve obviously come a long way from testing Scottish brogues or being warned about S’s and F’s when a machine speaks to us. Personally, I’m looking forward to having an interactive hologram at my beck and call, so I can not only ask for a specific song to be played but also see the artist performing the music.