
Unlike its predecessors, Harpy could recognise entire sentences. That’s how Harpy, built at CMU, was born. A number of companies and academia including IBM, Carnegie Mellon University (CMU) and Stanford Research Institute took part in the programme. The most significant leap forward of the time came in 1971, when the US Department of Defense’s research agency Darpa funded five years of a Speech Understanding Research programme, aiming to reach a minimum vocabulary of 1,000 words. But all these systems were mostly based on template matching, where individual words are matched against stored voice patterns.

There were other efforts in the US, UK and the Soviet Union, with Soviet researchers inventing the dynamic time-warping (DTW) algorithm that they used to build a recogniser capable of working with a 200-word vocabulary. In the 1960s, several Japanese teams worked on speech recognition, with the most notable ones a vowel recogniser from the Radio Research Lab in Tokyo, a phoneme recogniser from Kyoto University, and a spoken-digit recogniser from NEC Laboratories.Īt the 1962 World Fair, IBM showcased its "Shoebox" machine, able to understand 16 spoken English words.

“With recognition of these 12 words, the telephone system was able to complete the transition to machine-only telephony,” says O’Gorman.Īudrey was not the only kid on the block, though. So, in the 1970s and 80s, a huge effort in Bell Labs’ speech research was to simply do the following: recognise zero to nine digits, and ‘yes’ or ‘no’. But as telephone switches became digital in the 1970s and 80s, they enabled faster and cheaper call routing, while staying dependent upon an operator recognising a person’s request to dial a number. Recognised speech would require much less bandwidth than the original sound waves. “I believe Audrey was initially developed to reduce bandwidth, the volume of data travelling over the wires,” says Bahr’s colleague Larry O’Gorman of Nokia Bell Labs. Audrey was an early bird – it preceded general purpose computers, and although it was not used in production systems, “it showed that speech recognition could be made practical”, says Bahr.īut there was another goal.
DRAGON NATURALLY SPEAKING SOFTWARE AMAZON MANUAL
“This was an amazing achievement for the time, but the system required a room full of electronics, with specialised circuitry to recognise each digit,” says Charlie Bahr of Bell Labs Information Analytics.īecause Audrey could recognise only voices of designated speakers, its use was limited: for instance, it could offer voice dialling by, say, toll operators, but it wasn’t really a necessity because in most cases manual push-button dialling of numbers was cheaper and easier. It worked with 70-80% accuracy for a few other designated speakers, but far less well with voices it was unfamiliar with. But regardless, Audrey could recognise the sound of a spoken digit – zero to nine – with more than 90% accuracy, at least when uttered by its developer HK Davis. It could recognise the fundamental units of speech sounds, which are called phonemes.īack then, computing systems were extremely expensive and inflexible, with limited memory and computational speed.


Made by Bell Labs, the huge machine occupied a six-foot-high relay rack, consumed substantial power and had streams of cables. The invention took off, with more and more offices around the globe sporting a secretary with a clunky earpiece, listening to the recordings and transcribing them.īut all those baby steps kept machines passive – until “Audrey”, the Automatic Digit Recognition machine, came along in 1952. The idea was to get rid of stenographers by using the machine to record dictation of notes and letters for a secretary, so that they could later be typed offline. The invention paved the way for the first recording machine, the "Dictaphone", patented in 1907. Then in 1881, Alexander Graham Bell, his cousin Chichester Bell and Charles Sumner Tainter built a rotating cylinder with a wax coating, with a stylus that would cut vertical grooves, responding to incoming sound pressure.
