
It's the machines' fault... and ours
Published: 21 January 2005 08:00 GMT
Everyone's had their day ruined by a computer crash at one time or another - but why does this happen in the first place? Peter Cochrane examines the causes and offers a solution to flaky tech.
How come technology seems to fail at the most critical times? There you are making good progress toward an important deadline, overcoming all obstacles and having a winning day, when - crunch - the printer stops functioning. Better still, your PC crashes. And then at that very instant the boss appears to ask how things are going and will he be getting his report in less than an hour?
What gives? Is technology baiting us or is this really the norm?
The answer to this sometimes frustrating conundrum comes in two parts - the understandable and the distinctly quirky.
First, the understandable. When we get married, start a company or make any major life changes, we tend to re-equip. That is, we buy the TV, oven, fridge, washing machine, dryer and vacuum cleaner all at the same time.
Strange as it might seem, all of these white and brown goods are designed to broadly the same lifetime specification. The Mean Time To Failure (MTTF) is around five years and the Mean Time To Death (MTTD) is around eight years. Ergo multiple and near-simultaneous failures are to be expected - they have actually been built in. In other words, if we buy 10 items at once, there is a pretty good chance that two or more will fail at a similar time.
This mechanism also applies to our automobiles, computers and other IT equipment - and everything else that is mass-produced. So buying a PC, printer, scanner and back-up drive all at once puts us in the same vulnerable position. Of course, the amount of use and abuse also influences the actual outturn of the MTTF and MTTD. Add to this the variability between manufacturers, suppliers and maintainers as well as that unpredictable commodity - software - and the stage is set.
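To put a rough number on the "buy 10 items at once" claim, here is a minimal simulation sketch. It assumes, purely for illustration, that each item's lifetime is normally distributed around the five-year MTTF with a spread of 1.5 years, and that "a similar time" means within three months. The spread, the window and the function name are assumptions of this sketch, not figures from any manufacturer.

```python
import random


def prob_clustered_failures(n_items=10, mean_years=5.0, sd_years=1.5,
                            window_years=0.25, trials=20000, seed=1):
    """Estimate the chance that at least two of n_items fail within
    window_years of each other, given items bought on the same day.

    Lifetimes are drawn from a normal distribution (an assumption of
    this sketch) and clamped at zero.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        lifetimes = sorted(
            max(0.0, rng.gauss(mean_years, sd_years)) for _ in range(n_items)
        )
        # After sorting, the closest pair of failures is always adjacent.
        if any(later - earlier <= window_years
               for earlier, later in zip(lifetimes, lifetimes[1:])):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, "items:", prob_clustered_failures(n_items=n))
```

Under these assumed numbers the estimate climbs steeply as more items are bought together, which is the point of the argument: the clustering comes from the shared purchase date and similar design lifetimes, not from bad luck.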
Another factor: When is your car most likely to fail? The day after it has been in the repair shop. Once the repairman has been inside the box it is far more likely to become unreliable. This is true of everything you own. It is certainly true of any and all software upgrades and installs for computers. Hence the old adages - 'leave well enough alone' or 'if it ain't broke don't fix it'.
Some generally unseen mechanisms can often lead to a cascade of technology failures as well. In all complex systems a single point mechanism can lead to multiple failures and, conversely, multiple small failures can combine into a single dominant failure. For example, a network hub failure may see the loss of an internet connection, printer and scanner. At the same time, opening one more application when almost all the RAM is full, the hard drive is severely fragmented and the mouse is dirty can cause a total system freeze.
And now for the quirky mechanisms - us. One of our first problems is recognising what is going wrong when tech failures occur and then diagnosing the cause. Very often we are not all that good at it. We tend to jump to the wrong conclusion and, especially when tired and stressed, make mistakes and compound problems through bad decisions and actions.
Add to all of this the fact there are a lot of us networked together, with different competence levels, all trying to achieve different objectives and you have a disaster in the making.
We also have run up our load of company, domestic and leisure activities to a point where almost everything is on a critical path. There is no slack - no room for error or failure. In a way we assume our technology will not let us down.
Why? Because much of our technology - heat, light, power, communication, transport - is reliable. Sure, IT is still flaky but it's an awful lot better than it was 20 years ago and continues to improve. With our current mindset, any and all failures come at a critical time because everything we do is critical. We have no back-up, no standbys and no extra members of staff who can pick up the ball.
Is there a solution? Yes. But it means becoming less efficient and building in some slack. I don't want to brag because what follows is a bit extravagant. But because a lot of people live in my home, effectively two families under one roof, I now have two washing machines, dryers, irons and kitchens - and four vacuum cleaners. Domestically I am in good shape - but I am not advocating this as a solution. A sharing agreement with a close friend or neighbour makes for a far more economic solution if it can be arranged.
On the IT front, if you want reliability you have to spend money on dual machines, hard drives, printers, scanners and everything else. And never upgrade software or install a new OS or application on all of your machines simultaneously. Do it sequentially, establishing stability a stage at a time.
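The "do it sequentially" advice can be sketched in a few lines. This is only an illustration: `staged_rollout`, the `upgrade` and `is_stable` callables and the machine names are all hypothetical, standing in for whatever install step and stability check you actually use.

```python
from typing import Callable, List, Sequence


def staged_rollout(machines: Sequence[str],
                   upgrade: Callable[[str], None],
                   is_stable: Callable[[str], bool]) -> List[str]:
    """Upgrade machines one at a time and stop at the first unstable one.

    Returns the machines that were upgraded and confirmed stable.
    Machines after the first failure are left untouched, so there is
    always a working fallback.
    """
    confirmed: List[str] = []
    for name in machines:
        upgrade(name)
        if not is_stable(name):
            break  # halt the rollout; the remaining machines keep the old setup
        confirmed.append(name)
    return confirmed


if __name__ == "__main__":
    touched: List[str] = []
    # Pretend the upgrade destabilises the laptop (hypothetical outcome).
    result = staged_rollout(
        ["spare-pc", "laptop", "main-pc"],
        upgrade=touched.append,
        is_stable=lambda name: name != "laptop",
    )
    print("stable after upgrade:", result)
    print("machines touched:", touched)
```

Ordering the list from least to most important machine means a bad upgrade is caught before it reaches the one you depend on most.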
Overall my most effective investments have been in back-up hard drives, both internal and external, plus several no-break power supplies. If a power glitch or outage occurs, my systems keep running. This single measure has saved me much grief and paid for itself many times over. And it was the least expensive of all my precautions - just $100 or so for battery backup for my server, router, hubs, drives, PC and peripherals.
On one level I stand in awe that modern society and technology work at all, but on another I can see all the inefficiencies. In the end failure is endemic and part of the learning process. We just need to continuously minimise the overall impact. And believe me, IT is getting better.
Whoops, there goes another light bulb.
Written after my printer ran out of black ink, a light failed in my office and my ISP went down for half an hour. All extremely rare events but grouped in the same hour. Column completed within the next hour and despatched to silicon.com via my Wi-Fi link.
Peter Cochrane is an engineer, scientist, entrepreneur, futurist and consultant. He is the former CTO and Head of Research at BT, with a career in telecoms and IT spanning over 40 years. Peter has also held a number of prominent academic positions including the UK's first Professor for the Public Understanding of Science and Technology. For more about Peter, see www.cochrane.org.uk.
Copyright ©1995-2008 CNET Networks, Inc. All rights reserved.