Telephonic time-wasting

You may think this is a joke. I don't... !

The stupidity of ‘voice recognition’ and ‘voice activation’

Fido video Technical limitations
Sociolinguistic stupidity
Vocal Persona
Time wasting and the Exasperia example

What really bugs me about all this

Kafkaesque customer service — just one example
Typical of most "intelligent" (=stupid) systems.

The Fido call transcription

When I lived in Montréal, I needed to find out how to change my credit card details with my mobile phone service provider (Fido/Rogers). This is a reliable transcription of the actual phone call I made to them on the morning of Wednesday 5 October, 2005.
You can also check the call as an audio recording and in video form.

In this tabular transcript, all the Fido voices are pre-recorded.
My words or actions are in green and dead time is in orange.

Statement or action | Comment
Dial Fido
"Bienvenu(e) à Fido. For service in English, press '1', now." (pressed) | Request contradicted at 03:07
"Welcome to Fido. Please enter the ten digits of your Fido..." | Step repeated at 01:52
Fido phone number entered
(nothing happens, nothing is said) | Waiting: ½ second music, then silence; chaotic impression
"Please note that when using this system you can either tell me what you want to do or you can dial the information using your phone key pad, like when you want to enter your phone number, for example. Using your keypad is suggested if I'm having trouble understanding you." | Very low volume, difficult to apprehend. Very wordy, contains tautology ("like ... for example"). Keypad subsequently unusable in all instances except input of phone number.
Waiting ...
"Please hold while we access your account information." | Very loud indeed (if 0 dB, previous at -30 dB), chaotic impression
Waiting ...
"with your prepaid service", then nothing | Very low volume again, technologically erratic; and what does it mean?
"As you use this service, I'm going to ask you some questions. Whenever you know the answer you can just interrupt me. You can still use your handset keypad for entering things like phone numbers or access codes." | This is very wordy and treats me like a primary school kid. It is also untrue because it demands "yes" or "no" answers (which it doesn't understand, see 01:09, 01:13, 01:17, 01:28) and doesn't allow keypad input.
"I bloody well hope so." | Muttered very quietly
Waiting again...
"I didn't get that. Please say 'yes' or 'no'." | Contradicts messages at 00:58 and at 01:17
"No!" | Loud and clear, as instructed at 01:09
Waiting for response...
"Jesus Christ!" | Muttered under breath in quiet desperation
"Sorry, I still didn't understand." | Contradicts message at 01:09 and response at 01:13
"Well, you're stupid." | Normal voice
"Alright, then. Here's what I can help you with. You can say 'account information'..." | Not what I want to hear after another 3 seconds waiting for an intelligent response
"No" | As instructed at 01:09; no indication to the contrary: I "can" say "account information"; I don't have to.
"I didn't get that." | Clueless: see 01:09, 01:13, 01:17, 01:28
"Well, get a brain." | Normal voice, irritation growing
Still waiting for an intelligible response
"Sorry, I still didn't understand." | Incompetent!
"Oh, get a brain, woman." | Normal voice, irritation turning to despondency
Still waiting for a sensible response
"To help me route your call to the right person, please say or enter your ten-digit Fido number." | I already did this at 00:09
"Done that already." Mobile number entered again | Waste of time
Still waiting for something productive to happen
"I'm attempting to transfer your call. Thanks for being patient." | What a joke! A machine thanks me for being patient?...
Absolutely nothing for 64 seconds! | At 02:55 I mutter "not even hold music this time."
"Pour raisons d'assurance de qualité et d'information il se peut cet appel soit mis sur écoute ou enregistré." [In English: "For quality assurance and information purposes, this call may be monitored or recorded."] | Extremely low volume, very difficult to decipher. What happened to my choice of English? (00:05)
"Mais, j'avais choisi l'anglais." [In English: "But I had chosen English."] | Muttered very quietly
Low-volume hold music
"The current wait time is five to ten minutes." | It’s taken 3 minutes 34 seconds to find this out!

It took 3 minutes and 34 seconds to find out that I would have to wait another five to ten minutes before being able to speak to a human being. In fact I had to wait another twelve minutes. My question was only partly answered sixteen minutes after picking up the phone.

Time wasting

As the tabulated phone call transcript shows, dead time (waiting for a response or action from Fido) accounts for 57% of the call’s total duration, a small but significant part of which is occupied by the voice recognition system attempting to match what I say based on the impoverished range of syllables it has been programmed to deal with. Otherwise, entering digits on the keypad accounts for about 10%, listening to Fido’s preprogrammed messages 28% and my own statements or muttering the remaining 5%. Including the duplicated phone number input, my statements and actions occupy 15% and Fido’s 85% of the time wasted before learning that I had to wait another ten minutes. Who’s wasting whose time?
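As a rough check on those proportions, here is a minimal Python sketch. The percentages are the ones quoted above; the 214-second total is my assumption that the breakdown covers the 3 minutes 34 seconds before the wait-time announcement, so the second counts are only approximate.

```python
# Percentages as quoted in the text. TOTAL_SECONDS is an assumption: the
# 3 min 34 s (214 s) that passed before the wait-time announcement.
TOTAL_SECONDS = 214

shares = {
    "dead time (waiting for Fido)": 57,
    "Fido's pre-recorded messages": 28,
    "keying in digits": 10,
    "my own statements or muttering": 5,
}

# Group the four categories by who is responsible for the time spent.
fido_share = shares["dead time (waiting for Fido)"] + shares["Fido's pre-recorded messages"]
my_share = shares["keying in digits"] + shares["my own statements or muttering"]

# The four categories should account for the whole call.
assert sum(shares.values()) == 100

for label, percent in shares.items():
    seconds = round(TOTAL_SECONDS * percent / 100)
    print(f"{label}: {percent}% (about {seconds} s)")

print(f"Fido: {fido_share}%, me: {my_share}%")
```

The last line prints `Fido: 85%, me: 15%`, which matches the split given above.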

Apart from the personal inconvenience and irritation of being kept waiting, it is worth calculating what kind of effect a system like this has on the economy. For example, before the voice non-recognition system was introduced I could top up my credit card payment to Fido within about 60 seconds. To carry out the same task using the voice non-recognition system takes at least three minutes. Two minutes extra per month means 24 extra minutes per year. If the company has only 10,000 users, that means the loss of 4,000 man hours per year. Would Fido, with its silly “cuddly dog” logo, like it if the company’s unionised employees were to demand the right to an extra 4,000 man hours of paid absence (for illness, holidays, etc.) or if they had to waste that amount of time on the phone?
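The man-hours figure can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above: it assumes one top-up call per user per month, and the function name is made up for illustration.

```python
# Illustrative arithmetic only; the figures are the ones quoted in the text.
EXTRA_MINUTES_PER_MONTH = 2   # about 3 minutes now versus about 60 seconds before
MONTHS_PER_YEAR = 12          # assumes one top-up call per user per month
USERS = 10_000                # the deliberately low "only 10,000 users"


def extra_hours_per_year(extra_minutes_per_month: float, users: int) -> float:
    """Total customer hours lost per year across all users."""
    minutes_per_user = extra_minutes_per_month * MONTHS_PER_YEAR
    return minutes_per_user * users / 60


total_hours = extra_hours_per_year(EXTRA_MINUTES_PER_MONTH, USERS)
print(f"{EXTRA_MINUTES_PER_MONTH * MONTHS_PER_YEAR} extra minutes per user per year")
print(f"{total_hours:,.0f} hours lost per year across {USERS:,} users")
```

With these inputs the total comes to 4,000 hours. Changing `USERS` to a more realistic subscriber count scales the loss in direct proportion.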

According to Fido, nearly 90% of calls made using their voice non-recognition system do actually get through. They hasten to add that the survey did not investigate if customers were satisfied with the system. After all, I got a partial answer after sixteen minutes (of which the first 214 seconds are shown in the table above) and belong to that 90%, so what's the problem, apart from wasting our time?

Technical limitations

The most serious technical limitations are the system’s: [1] inability to distinguish speech from background noise (the cocktail party effect); [2] inability to decipher other variants of English than those it has been programmed to deal with; [3] paucity and rigidity of possible answers to particular questions and the inability to discern threads in conversation.

No cocktail party skills.

Humans can pick up and follow statements from fellow humans not only against background noise but also when several other conversations are going on at the same time. By way of contrast, Fido representatives advise customers to avoid any background noise (even fans and ventilation noise) if their speech is to be “recognised” by the system. Forget conversations going on in the background, traffic on the street, etc! What is the point of such a crude system? For example, if I’m downtown and discover there’s only one prepaid dollar left in my account, I currently can’t top it up without either seeking out a very quiet place (in downtown Montréal?!) or paying to log on at an internet café, if I can find one.

Linguistic ethnocentricity

English, as first or second language, is spoken more extensively worldwide than even Mandarin Chinese. Variants of spoken English are therefore innumerable. For example, in a multicultural city like Montréal, second-language English speakers preferring the English rather than French voice non-recognition system can have, as their first language, Arabic, Hindi, Urdu, Bengali, Tagalog, Greek, Russian, Polish, Portuguese, Italian, Cantonese, Mandarin, etc. The system even has problems with my (usually clear and correct) British English! Yet we all have to use a system which forces us all to speak with an accent that is not our own.

Paucity and rigidity

The system only accepts a very limited number of stock words and statements, to be enunciated in standard North American English (see above), as valid triggers for action on its part, and only in response to one particular prompt at a time. Synonyms, explanatory or polite turns of phrase and any other socially normal conversation strategies are right out of the question. Nor can the system be expected to follow the thread of the most simple dialogue, as can be seen from the fact that it wouldn’t take “no” for an answer when a prompt five seconds earlier had instructed me to answer “yes” or “no”. The system doesn’t even have the memory of a goldfish.

It is technically naïve to expect voice activation systems to deal with even the three basic points just mentioned which constitute the simplest examples of normal language behaviour among humans in any culture. So why not just accept that you’re dealing with a stupid machine and do what it wants? The reason is a technical naïvety that is deaf and blind to basic realities of socio-linguistic behaviour.
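To make the paucity and rigidity concrete, here is a toy Python sketch of a matcher that accepts only a fixed list of stock phrases for whichever prompt is currently active and remembers nothing from earlier prompts. It is purely hypothetical: I have no knowledge of how Fido's system is actually built, and the prompt names, the vocabulary and the `match` function are all invented for illustration.

```python
from typing import Optional

# Hypothetical vocabulary: each prompt accepts only its own stock phrases.
# None of these names come from Fido's actual system.
PROMPT_VOCABULARY = {
    "confirm": {"yes", "no"},
    "main_menu": {"account information"},
}


def match(prompt_id: str, utterance: str) -> Optional[str]:
    """Return the stock phrase if the utterance matches one exactly, else None."""
    normalised = utterance.strip().lower().rstrip("!.?")
    if normalised in PROMPT_VOCABULARY[prompt_id]:
        return normalised
    return None


# A stock answer to the prompt that is currently active is accepted...
print(match("confirm", "No!"))
# ...but the same answer is rejected once the prompt has moved on,
# because nothing from the previous prompt is remembered.
print(match("main_menu", "No"))
# Ordinary conversational phrasing never matches at all.
print(match("confirm", "I bloody well hope so."))
```

In this sketch the first call returns `no` and the other two return `None`. A real recogniser also has to cope with audio, accents and noise, which this toy ignores entirely.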

Socio-linguistic stupidity

The pre-recorded voice speaks to you using the normal speech parameters of diction, intonation, inflexion, accentuation, rhythm and timing. It also presents you with short but complete sentences. You, on the other hand, must not respond using complete sentences or any diction, intonation, inflexion, accentuation, rhythm or timing that might confuse the system. While the machine presents the sound of a human voice (some companies even give the machine a name and an ego: how sad is that!), you, the human, must respond to it like a machine. In short, the robot is humanised and you, the human, are robotised. It (machine) pretends to be what you are (human) and you (human) must pretend to be what it is (machine). This absurd perversion of everyday habits of conversation makes communication virtually impossible. If you do not respond like a machine to the machine that you are dealing with, despite its human pretences (“I”, “me”, normal speech patterns, etc.), you will not pierce its stupidity and gain access to the information or service you require. It is, in other words, a matter of social power because the system can say whatever it likes to you in whatever way it chooses while you can only say what it accepts, no more, no less.

It is by such mechanisms that authoritarianism and fascism work. If you get people used to behaving like machines in any way possible they’ll be more likely, as the automata they’ve been trained to become, to obey all orders and follow all “policies” without question. I object strongly, on ethical grounds, to any attempt to demean humans in this kind of way. I consistently tell Fido that I refuse to speak to a machine pretending to be human, and I tell them why. They think I’m a trouble-maker. Wrong! They’re causing the trouble and asking for more of it by forcing people to become machines. People are human beings and should be treated as humans, not robots, while machines should not parade as humans. It’s childishly dishonest (and a waste of time) to pretend otherwise.

If you follow the normal socio-linguistic conventions of everyday speech and someone talks to you with a friendly voice, you’ll probably respond in a similar manner. That’s standard human behaviour. Therefore, when a recorded voice, as in a voice ‘activation’ system, addresses you using normal intonation, inflexion, rhythm and sentence construction, you’ll naturally respond in a similar way. That’s why systems whose recorded messages require a severely restricted range of standardised vocal responses from the customer should not sound human.

It should be obvious that a robotic voice is much more likely than a human-sounding voice to elicit the kind of robotic response the system in fact needs to work at all. One other solution is to revert to keypad input, yet another to employ more real humans to answer queries and to provide services, another to rethink the provision of services in terms of what customers really want, yet another to subscribe to more phone lines and to allow people more direct access to the particular services they require (see last paragraph, below).

Finally about Fido

It took me sixteen minutes to find out just part of what I needed to know from Fido, with their expensive voice non-recognition system. Compare that with the 28 seconds it took me to phone up and find the time of the next 161 bus going east from the stop at the corner of my street. That’s 34 times faster than Fido to access highly specialised information! Even though there’s just one phone number for information covering the whole of the Montréal public transport system, there’s no fuss: no menu, no voice recognition, no pseudo-human machine fetishes, no farting about at all. The volume is also reasonably constant and every statement totally comprehensible. What a relief! Thank you, STM! Fido should learn from you. Capitalism sucks. Public services rule!
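For anyone who wants to check the "34 times" figure, the sum is trivial. Here is a short Python sketch using only the two durations quoted above; the variable names are mine.

```python
# Durations quoted in the text, converted to seconds.
FIDO_SECONDS = 16 * 60   # sixteen minutes for a partial answer from Fido
STM_SECONDS = 28         # time needed to get the next bus departure from STM

speedup = FIDO_SECONDS / STM_SECONDS
print(f"STM was about {speedup:.1f} times faster")
```

The ratio works out at roughly 34.3, so "34 times faster" is, if anything, slightly generous to Fido.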

Contacting Exasperia

A classic example of corporate phone contact mismanagement.
Here’s why you need to phone the company.

You’re having problems re-registering a paid-for item of software bought from Exasperia.
The reason is that there’s confusion about which string of alphanumeric characters is the serial number, registration key, serial key, license number, license key, product ID, product number, product key, invoice number, receipt number, verification code, etc. that they ask you to fill in on line.

Despite searches among Exasperia’s chaotically posted FAQs, most of which are left unanswered by other users and are misleadingly referred to as ‘Help’, you just can’t find which of those variable names apply to which strings of alphanumeric characters for which purposes in your dealings with the company. There’s no explanation of how those codes and numbers relate to your User-ID, user name, password, access key, customer number, account number, account type, account status, verification details, etc., etc.

Do they mean different things by all of those names for keys, codes, numbers, licenses, identities, types, status and passwords? If not, which variable names are alternatives to which others and what functions do they have? It’s not clear. The only way to find out is to phone Exasperia.

This is what happens.

What you do or think is in this font. What you actually say is in this font.
What you hear from Exasperia is in this font. Explanations/comments are in this font.

Statement or action
You go to the Exasperia website and spend 90 seconds trying to find their contact phone number hidden in a small font at the bottom of their “Contact us” page. You dial the number.
Hi! I’m Kelly. Thank you for calling Exasperia. Please enter your account number followed by the pound key (This assumes you know she means #, a.k.a. number key, a.k.a. sharp sign, not £, the actual pound key on a UK computer keyboard).
It’s obvious that Kelly’s a machine somewhere in North America, not a human being. You’re in a bit of a hurry and you inadvertently enter one of the 14 digits wrongly.
(Kelly the machine) I’m sorry. That’s not a valid customer identity number. Please enter your customer identity number followed by the hash key.
You have to put the phone down while you search for your account number so you can be sure you enter it correctly this time. It appears in a small font, without spaces, near the top right of an invoice, between the internal audit number and date of issue. You arrive back at this point two minutes later. It takes you another 10 seconds to key in the 14-digit number more carefully this time.
Thank you. Now please enter your 5-digit PIN so we can access your account information.
You enter your 5-digit PIN.
Thank you. Your details have been verified [pause]. So, what would you like to do? You can say ‘Accounts’, ‘Change of address’, ‘Products’ or ‘Services’.
Oo.... I don’t know. It’s really none of those.
Sorry. I didn’t understand that. You can say ‘Accounts’, ‘Change of address’, ‘Products’ or ‘Services’.
I said I didn’t know because it was none of those.
Sorry. I still didn’t understand that. You can say ‘Accounts’, ‘Change of address’, ‘Products’ or ‘Services’.
Geez! This mechanical woman is thick and stubborn!
Sorry. I still didn’t understand that... Cheesy music kicks in. Sounds like Kenny G...
OMG. Do I have to listen to this!?
Sorry. I didn’t get that. Cheesy music continues... Alright, then. Here's what we’ll do. Please listen carefully as we’ve recently changed our menu options. You can press 1 for special offers, 2 for ExaspAdvantage loyalty bonus points, 3 for upgrades, 4 for accounts, 5 for change of address, 6 for new products, 7 for ancillary services, 8 for our legal department, or 9 for job opportunities at Exasperia. You can also press zero at any time to go back to the start of your call, or press star to hear these options again. Press the number key if you want to die. If you’d like to speak to a customer service representative please stay on the line...
You opt to hold. Kenny G segues into a badly edited 30-second loop from Vivaldi’s Four Seasons.
Your call may be monitored for training and quality assurance purposes. We want to continue providing an excellent service to all our customers. [15 second pause]
We’re sorry but all our customer service representatives are currently helping other customers. We recommend that you press pound (#, not £) to leave us a callback number. [short pause] Or you can continue to hold so as not to lose your place in the queue.
Having already wasted nearly seven minutes so far you decide to soldier on in the hopes of eventually speaking to a human being.
Thank you for continuing to hold. You are moving forward in the queue. [pause] Did you know that you can find answers to most queries by visiting forward slash FAQ?
Bollocks! I just checked there! What a waste of your time! Still, no “sorry I didn’t understand” this time, so you don’t have to play pretend with ‘Kelly’. It’s clear you’re not being heard. Vivaldi again.
Thank you for continuing to hold. Your call is important to us and you are moving forward in the queue. We want to continue providing an excellent service to all our customers. [pause] Have you claimed your Exasperia Advantage loyalty bonus points? To see the amazing prizes you can win, go to forward slash loyalty and press ‘prizes’.
Stop wasting my time with inane loyalty bonuses! Vivaldi has now morphed into a badly sequenced montage of dull synthesised pop loops featuring an unconvincing sax sample.
Thank you for continuing to hold. Your call is important to us and you are moving forward in the queue. We want to continue providing an excellent service to all our customers. [pause] Did you know that Exasperia was voted ‘Timewaster of the Year’ by readers of Kool Kafka magazine? Learn more about this prestigious award at forward slash excellence forward slash wasteoftime (all one word).
(Thinks) OMG! They don’t even have a false sense of shame or modesty! Hold music continues.
We’re very sorry but right now we’re experiencing [as usual] an unusually high volume of calls. You can press the hash key to leave us a callback number and we will contact you at the earliest possible opportunity. [pause] You can also continue to hold so as not to lose your place in the queue.
Having wasted 9 minutes you decide to battle on in the hopes of eventually reaching a human being.
Thank you. A customer service adviser will be with you as soon as one becomes available.
More of the same cheap-sounding synth. Will this never end? You suffer another 5:15 of ‘being important’, ‘moving forward’, badly edited hold music loops and infantile marketing before finally...
Hi! My name is Gary. Can I have your Exasperia account number please?
‘Gary’ has quite a thick Tamil accent and is no more Gary than you’re Madhukaeshan. But at least he’s a real human being who’s polite and who means well. You give him your account number again (already entered at 02:05). He also wants you to confirm your address details and user-ID.
Thank you, sir. And how may I provide you with an excellent service today?
You assure Gary that you’re delighted to talk with a human being and suggest he pass on to his seniors your dissatisfaction with the firm’s exasperating phone contact system (see above). Then you describe your problems with re-registering the product because there’s no consistency as to which string of characters and/or numerals is the software’s serial number, registration key, serial key, license number, license key, etc. (see preamble to this table). But it’s hard to understand ‘Gary’ with his palatalised and/or unaspirated plosives (p, t, k, b, d, etc.). It’s frustrating and takes much effort from both of you before the issue is only partially resolved. You can’t blame ‘Gary’. He’s just as much a victim as you are of senior management making a cynical fast buck at both his and your expense.
c.21:00   End of call to Exasperia!

This call could have been much worse. At around 06:00 you could have pressed 7 for ancillary services and been presented with a submenu saying “You now have another five options to help us better route your call. Please listen carefully as our options may have changed.” Let’s say you choose 5 for “all other enquiries”. You’re transferred to a subsubmenu where you select another option which shunts you round in a circle back to the main menu or to the first submenu. Sound familiar? It’s happened to me several times.

Talking to a human being is what most customers want to do but it’s the last thing the system offers you. You’re just a pain in the corporate arse who has the gall to expect time to be spent attending to problems that are usually of the corporation’s own making. If ‘your call is important to us’ really meant ‘your call is important to us’ you wouldn’t have to wait so long and it wouldn’t be so hard to reach a human being.

However frustrating and cynical this corporate phone treatment may be, never take it out on the human being you might eventually reach if you’re lucky and persistent enough. Just tell your fellow human slaving away in the call centre in Dundee or Delhi how pleased you are that they’re human and ask them to pass on your frustrations to management. If enough of us complain, who knows, it may have some effect one day or another. Until then neither you nor I count. Our time and patience are not an issue unless we insist that they are. Your call is not important to the corporation because you’re really a pain in their arse.

To learn more about these stupidities, watch this short report from Australian TV. Try also this parody on voice ‘recognition’ (“I’m Phil. Did you say ‘Tom Jones’?”).

What really bugs me about all of this

It’s an exasperating waste of my time as a customer. It’s based on a whole set of misconceptions, on short-term greed and on sloppy thinking.

Corporations need to:

  • Employ more people (actual human beings) to answer the phone.
  • Employ people to answer the phone who come from the same language culture as the customers who phone in.
  • Understand that it is infantile, dishonest, hypocritical and extremely offputting for customers to be told how fantastic the corporation is, what a wonderful range of advantages its loyalty scheme offers, what an excellent service it provides, how important your call is etc. while simultaneously subjecting customers to endless menu options and wait times before they can (if ever) talk to a real person.
  • Scrap the crude and groundless fetish of voice activation automation.
  • Learn the rudiments of socio-linguistics.
    • Stop pretending that machines are humans. They aren’t.
    • Stop treating customers as machines. We aren’t.
  • Save customers time and frustration rather than squeeze another fast buck out of customers and junior employees.
  • Learn that capitalism sucks and get used to the fact that increasing numbers of individuals around the world are sick to the teeth of it. Voice ‘recognition’, endless menus and wait times are admittedly a very minor irritation compared to the capitalist system’s major lies about freedom of speech, information and expression, but they are significant as lies. They socialise us into negative entities seen as preventing corporations from making money. That means preventing us from being human beings which, in its turn, effectively prevents us from acting in a spirit of empathy and altruism.
  • Generally grow up and wise up!