Tuesday, July 16, 2024

Invisible, Intelligent, Intrusive


Chandni U
Assistant Editor

Intelligent Voice Assistants have made great strides, but some are questioning whether all the automation is worth sacrificing one’s privacy.

As we embrace Intelligent Voice Assistants in our homes and companies, it is worth examining both their technological evolution and their darker side.

Hey, Alexa. Do you know Siri?
Only by reputation.
Hey, Alexa. What about Bixby?
I am partial to all AI.

The world is filling up with AI, and voice concierges are becoming a community of their own. These intelligent voice assistants (IVAs) have evolved steadily through decades of technological investment.

The voice family tree

It all started in the early 1960s, when IBM demonstrated Shoebox, a voice-activated calculator. In 1966, Joseph Weizenbaum, an MIT professor, created ELIZA, the first natural language processing computer program. A decade of voice recognition software followed. Harpy, developed at Carnegie Mellon, was another addition to the lot: a master of roughly 1,000 words, it could understand about as many sentences as a three-year-old.

Tangora, another IBM project, was a voice-recognizing typewriter with a vocabulary of 20,000 words. Finally, the first real virtual assistant was Simon, created by IBM with digital speech recognition technology. Built on cognitive computing technologies, including AI, Machine Learning (ML), and voice recognition, today's IVAs identify and learn from data inputs and can predict user needs. The world now has Siri, Alexa, Bixby, Nina, Viv, and Mycroft.

Tracing the technological roots

There was a time when manual punch cards stored data and instructed the machine. With the advent of the programming era, the destiny of the IVA was written. Early computers were built around vacuum tubes; the inventions of the transistor and the microprocessor gave them the power IVAs would eventually need.

The Cognitive Computing (CC) phase brought the simulation of the human thought process into a computerized model. Leveraging self-learning systems, CC used natural language processing, pattern recognition, and data mining to try and imitate the human thinking process.

Having succeeded in emulating the human brain in terms of parallel processing and associative memory, CC demonstrated pattern recognition, robotic control, and emotional intelligence. It was a lauded technological feat that raised a twinge of fear.

The engineering behind CC developed context-based hypotheses with ML, offering the industry an in-depth understanding of how these voice assistants were built and their ability to interact with humans.

Making of the voice

After processing the human command against its datasets, an IVA converts the voice to text with NLP, formulates a reply, and converts it back into voice. As demand for IVAs skyrocketed, developers took a step further and decided to build large Deep Neural Networks (DNNs) to address the then-new challenges.
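The round trip described above can be sketched end to end. Everything below is a hypothetical stub: the function names, the lookup table, and the canned reply are invented for illustration, whereas real assistants run trained speech recognition, NLP, and TTS models at each stage.

```python
def speech_to_text(audio: bytes) -> str:
    """Stage 1: speech recognition (stubbed with a tiny lookup table)."""
    return {b"\x01": "what time is it"}.get(audio, "")

def understand(text: str) -> dict:
    """Stage 2: NLP turns the transcript into an intent plus slots."""
    if "time" in text:
        return {"intent": "get_time", "slots": {}}
    return {"intent": "unknown", "slots": {}}

def generate_reply(intent: dict) -> str:
    """Stage 3: natural language generation builds the response text."""
    if intent["intent"] == "get_time":
        return "It is 9 o'clock."
    return "Sorry, I didn't catch that."

def text_to_speech(reply: str) -> bytes:
    """Stage 4: TTS converts the reply back into audio (stubbed)."""
    return reply.encode("utf-8")

def handle_command(audio: bytes) -> bytes:
    """The full IVA round trip: voice in, voice out."""
    text = speech_to_text(audio)
    reply = generate_reply(understand(text))
    return text_to_speech(reply)
```

Each stage hands a progressively more structured representation to the next, which is why the stages can be developed and swapped independently.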

An open infrastructure model called DjiNN was developed for DNN as a service, and Tonic Suite consisted of seven end-to-end applications, including image, speech, and language processing.


Apart from NLP, a lesser-known AI technology, Natural Language Generation (NLG), powers these intelligent assistants. This is the machinery that creates the text and speech responses of VAs like Alexa, Siri, and Google Assistant.
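In its simplest form, NLG is template realization: the dialogue manager hands over an intent with slot values, and the generator fills in a canned sentence. The intents, templates, and slot names below are all hypothetical; production assistants mix templates with neural generation.

```python
# Hypothetical intent-to-template table; each template has named slots.
TEMPLATES = {
    "weather.report": "It is {temp} degrees and {condition} in {city}.",
    "timer.set": "Okay, timer set for {minutes} minutes.",
}

def realize(intent: str, slots: dict) -> str:
    """Fill the template for `intent` with the given slot values."""
    template = TEMPLATES.get(intent)
    if template is None:
        return "Sorry, I can't help with that yet."
    return template.format(**slots)

print(realize("weather.report",
              {"temp": 21, "condition": "sunny", "city": "Portland"}))
# → It is 21 degrees and sunny in Portland.
```

The appeal of templates is total control over wording; the cost is that every new response shape needs a new entry in the table.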

The next ML model tackled an IVA's grasp of emotion in the tone of a human voice: HEIM (Hybrid Emotion Inference Model). The model used Latent Dirichlet Allocation (LDA) to extract text features and a Long Short-Term Memory (LSTM) network to analyze the acoustic features and infer the emotions behind human voices. Models like these let assistants answer with confidence, and experts believe IVAs will soon simulate human psychology and form deeper connections with human cognition. A revolutionary algorithm, HEIM rekindled worries about just how intelligent IVAs are becoming.
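To make the acoustic half of that idea concrete, here is a toy, from-scratch LSTM that reads a sequence of acoustic feature frames (e.g. MFCCs) and emits a softmax over emotion classes. The dimensions and the random, untrained weights are invented for illustration, not taken from HEIM, so the scores themselves are meaningless.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden, n_emotions = 13, 8, 4  # MFCC dims, state size, classes

# One weight matrix and bias per LSTM gate: input, forget, cell, output.
W = {g: rng.normal(0, 0.1, (n_hidden, n_features + n_hidden)) for g in "ifco"}
b = {g: np.zeros(n_hidden) for g in "ifco"}
W_out = rng.normal(0, 0.1, (n_emotions, n_hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_emotion_scores(frames):
    """Run the LSTM over `frames` (T x n_features) and map the final
    hidden state to a softmax over emotion classes."""
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    for x in frames:
        z = np.concatenate([x, h])
        i = sigmoid(W["i"] @ z + b["i"])   # input gate
        f = sigmoid(W["f"] @ z + b["f"])   # forget gate
        g = np.tanh(W["c"] @ z + b["c"])   # candidate cell state
        o = sigmoid(W["o"] @ z + b["o"])   # output gate
        c = f * c + i * g                  # update long-term memory
        h = o * np.tanh(c)                 # expose short-term state
    logits = W_out @ h
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

scores = lstm_emotion_scores(rng.normal(size=(20, n_features)))
```

The gating structure is what lets the network carry emotional cues (pitch, energy patterns) across a whole utterance rather than reacting frame by frame.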

Are they listening?

In 2021, Siri let the cat out of the bag. Quizzed by reporters about Apple's next event, Siri declared, "The special event is on Tuesday, April 20th, at Apple Park in Cupertino, CA," before any official announcement from Apple. The brand made no immediate comment on Siri's revelation.

Admirable as IVAs are, as they become more human-like by the day, people are increasingly concerned about privacy, even more than security. Those familiar with Ultron from the Avengers franchise will understand the seriousness of it. Sci-fi aside, the thought of an IVA becoming a highly self-aware AI is scary.

Will an IVA one day make life-changing decisions for a human being without their knowledge and consent? In the distant future, it is very much possible.

Another invasive example dates back to 2018, when a family in Portland reported that their Amazon Echo device had recorded a private conversation and sent it to a random person from their phone contact list. Even Google Assistant has attracted its own nest of privacy complaints.

The truth is that it is in every IVA's nature to collect and understand data, especially voice data. Experts believe users do not realize that for an IVA to grow, hold better conversations, and be helpful, its makers must review the collected voice data.

Research conducted by Loup Ventures showed that of the 800 questions asked, Google Assistant got over 92% correct, Alexa answered 79% correctly, and Siri stood at 83%. In the previous round of testing, the figures had been 86, 61, and 79%, respectively. The trend suggests that, within reason, IVAs will soon understand nearly every question put to them.

Meanwhile, to streamline the process and ease people's worry, the GDPR, which came into force in 2018, requires all IVA providers to obtain consent. Users also have the right to be informed about the data collected, as well as the rights to rectification and erasure.

Google launched a federated learning model in 2019 to limit the accidental awakening of Google Assistant on Android. The system lets the IVA ask permission to save audio and speech recordings so it can learn from them over time. With the federated model, Google can keep the audio encrypted on the device instead of processing the data in the cloud or on its servers.
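The core idea of federated learning can be sketched in a few lines: each device improves the shared model on its own data, and the server only ever sees the resulting model updates, never the raw recordings. The tiny least-squares task below is invented for illustration; real deployments use neural models and secure aggregation.

```python
def local_update(weights, local_data, lr=0.1):
    """One gradient-descent step on a device's own data; the raw
    samples never leave this function."""
    grad = [0.0] * len(weights)
    for x, y in local_data:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grad[j] += 2 * err * xi / len(local_data)
    return [w - lr * g for w, g in zip(weights, grad)]

def federated_average(weights, devices):
    """Server step: collect locally updated models and average them."""
    updates = [local_update(weights, data) for data in devices]
    return [sum(u[j] for u in updates) / len(updates)
            for j in range(len(weights))]

# Three "devices", each holding a private sample of the rule y = 2*x.
devices = [[([1.0], 2.0)], [([2.0], 4.0)], [([3.0], 6.0)]]
weights = [0.0]
for _ in range(50):
    weights = federated_average(weights, devices)
# weights[0] converges toward 2.0 without any device sharing its data.
```

The privacy gain comes from what crosses the network: model deltas instead of audio, which is exactly the trade Google described for Assistant.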


Voice assistants have come a long way, from the programming era through cognitive computing to the AI era. Data availability, the falling cost of computing power, and better algorithms will keep enhancing their intelligence. Back in 2021, Microsoft was investing in Nuance to dive deeper into natural language processing and AI that can respond to humans, while Google and Amazon were working on eliminating 'wake' words entirely.

Meanwhile, Amazon and Google started selling their smart speakers as modern alarm clocks, and the Echo Look and the Echo Spot brought cameras into the bedroom. As the forces marshaled to push us toward voice grow in sophistication, we want our digital assistants not too friendly (not invading our privacy), but sassy enough.
