Interactive Voice Response (IVR), sometimes loosely called Interactive Voice Recognition, is the technology a caller interacts with when they telephone a large business or enterprise. The call is answered by an automated system, which greets the caller with a customized recorded greeting and usually presents a menu of options. These options might be:
- Press one for Sales, or speak One
- Press two for Billing, or speak Two
- Press three for Technical Support, or speak Three
The simplest version of this menu relies on tone dialing alone: a caller wishing to speak to Sales presses the number one button on their phone. The phone then sends the tone for button ‘One’ to the system, which immediately recognizes it and connects the caller to Sales. Voice recognition comes into play when the caller speaks instead.
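The tone a phone sends for each button is a dual-tone multi-frequency (DTMF) signal: one low "row" frequency plus one high "column" frequency. As a rough illustration of how a system might recognize such a tone, the sketch below synthesizes the tone for a key and identifies it again using the Goertzel algorithm. The 8 kHz sample rate, the 50 ms tone length, and all function names are assumptions made for this example, not details of any particular IVR product.

```python
import math

# Standard DTMF frequency grid (Hz): each key is one row tone plus one column tone.
ROWS = [697, 770, 852, 941]
COLS = [1209, 1336, 1477, 1633]
KEYS = ["123A", "456B", "789C", "*0#D"]

SAMPLE_RATE = 8000  # assumed sample rate for this sketch


def tone_for_key(key, duration=0.05):
    """Synthesize the two-frequency tone a phone sends for one key."""
    if len(key) != 1:
        raise ValueError(f"not a DTMF key: {key!r}")
    for r, row in enumerate(KEYS):
        c = row.find(key)
        if c != -1:
            f_row, f_col = ROWS[r], COLS[c]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(SAMPLE_RATE * duration)
    return [
        0.5 * math.sin(2 * math.pi * f_row * i / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * f_col * i / SAMPLE_RATE)
        for i in range(n)
    ]


def goertzel_power(samples, freq):
    """Relative signal power at a single frequency (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def detect_key(samples):
    """Pick the strongest row and column frequency and map them to a key."""
    r = max(range(4), key=lambda i: goertzel_power(samples, ROWS[i]))
    c = max(range(4), key=lambda i: goertzel_power(samples, COLS[i]))
    return KEYS[r][c]
```

Calling `detect_key(tone_for_key("1"))` recovers the key that was pressed. A production detector would also need to cope with noise, silence, and tones that are too short, none of which this sketch attempts.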
However, IVR is much more complex than that: it can also handle the spoken alternative. If, instead of pressing button number one, the caller speaks the utterance ‘one’, the system can understand what was said and direct the caller to the correct option.
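Once the caller's input has been recognized, whether as a key press or as a spoken word, choosing the destination is a simple lookup. The sketch below assumes the recognition step has already produced a short text token such as "1" or "one"; the table names and the `route` function are hypothetical and only mirror the example menu above.

```python
# Example menu from the text: digit -> department.
MENU = {"1": "Sales", "2": "Billing", "3": "Technical Support"}

# Spoken alternatives mapped onto the same digits.
SPOKEN = {"one": "1", "two": "2", "three": "3"}


def route(caller_input):
    """Return the department for a key press or spoken word, or None if unknown."""
    token = caller_input.strip().lower()
    digit = SPOKEN.get(token, token)  # spoken words collapse to their digit
    return MENU.get(digit)
```

Here `route("1")` and `route("One")` both lead to Sales, while an unrecognized input returns `None`, at which point a real system would typically replay the menu.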
In this, its truest form, IVR requires algorithms and complex hardware to capture and analyze voice samples; after all, it has to work with every customer, regardless of local accent or speech impediment. This makes IVR a very complex science, but one that telephony companies and PBX manufacturers have pursued for a long time.
The future of IVR lies not just in recognizing single words but ultimately in distinguishing individual speakers, which has huge ramifications for telephone banking security and call center fraud. Reliable speaker recognition may still be some way off, but recognition of spoken words is already very advanced. It is no longer confined to security and is becoming commonplace in domestic appliances and as an aid for those who are less able-bodied.