How to manage an outage – Your essential temforce Guide
Blog three in a series of three – How to manage an outage
So you’ve vetted your vendors, secured your system and kept diligent records of your inventory and performance history. And you still suffer an outage.
Unfortunately, no system is completely foolproof and there are endless ways in which an outage can sneak up on you. But good outage management can mean that you are down for a matter of minutes (or seconds!) rather than an entire day – and even better outage management will help prevent the same problem from happening again.
In essence, managing an outage well is a matter of preparation, knowledge and communication.
Once you’ve got these down, you’ll practically be hoping for an outage that lets you show off your skills.
Preparation
No matter how expensive, modern, multi-functional and over-engineered your telco system may be, you should always assume that you will experience outages. If the outage never happens – great! But if it does, you will be prepared.
This means knowing all your circuit IDs, having great relationships with all your vendors, and having a defined process in place in case of an outage, so that you know who you need to speak to and what you need to do.
Create an ‘outage preparation kit’ that contains instructions on how to identify, repair and prevent different types of system failure. Include detailed 24-hour contact information for every single vendor and account manager so that you can get in touch with them the second that an outage appears. If you have a backup plan in place (e.g. switching to an alternate cloud, or using a good old-fashioned power generator), make sure that the kit includes instructions on how to implement and manage it.
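One way to keep such a kit usable under pressure is to store it as structured data rather than loose notes. The following is only a minimal sketch in Python: the class names, fields, circuit IDs, vendor names and phone numbers are all hypothetical and would need to be replaced with your own inventory.

```python
from dataclasses import dataclass, field


@dataclass
class VendorContact:
    vendor: str
    account_manager: str
    phone_24h: str  # a number that is answered around the clock
    email: str


@dataclass
class OutageKit:
    circuits: dict = field(default_factory=dict)  # circuit ID -> vendor name
    contacts: dict = field(default_factory=dict)  # vendor name -> VendorContact
    runbooks: dict = field(default_factory=dict)  # failure type -> ordered steps

    def contact_for_circuit(self, circuit_id):
        """Return the 24-hour contact for the vendor that owns a circuit."""
        return self.contacts[self.circuits[circuit_id]]

    def missing_contacts(self):
        """List circuit IDs whose vendor has no contact record yet."""
        return sorted(c for c, v in self.circuits.items() if v not in self.contacts)


# Illustrative data only: every name, ID and number here is made up.
kit = OutageKit(
    circuits={"CKT-0001": "ExampleNet", "CKT-0002": "SampleTel"},
    contacts={
        "ExampleNet": VendorContact(
            "ExampleNet", "A. Manager", "+1-555-0100", "noc@examplenet.example"
        )
    },
    runbooks={"power loss": ["Start the backup generator", "Call the vendor"]},
)

print(kit.contact_for_circuit("CKT-0001").phone_24h)
print(kit.missing_contacts())
```

In this made-up example, `missing_contacts()` flags circuit CKT-0002 because no contact has been recorded for its vendor. A check like this is worth running before an outage, not during one.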
Every outage represents a new opportunity to hone your ‘preparation kit’. Ask yourself: How could I have handled that situation better? What could have been done to prevent that outage? Why did it take so long to resolve the outage? How can I cut down on the response time in the future?
Remember, when it comes to outage management, there is no such thing as being too prepared.
Knowledge
When an outage happens, the first thing everyone wants to know is: ‘What’s going on?’
If you can’t identify the source and cause of the outage immediately, your colleagues and your clients will start to lose faith in your ability to do your job. In high-octane companies where time is precious, any ambiguity over the length of the outage could send clients into a tailspin and ultimately devalue your business.
It is down to you to know your system inside and out. You need to understand every single circuit, wire and device that is under your jurisdiction and the relationships between them. You then need to be able to identify any issues straight away.
Once you know what you are dealing with, you can start managing your outage, whether that means placing a call to a vendor, flipping a switch in a fuse box, or simply reconnecting a power line.
Communication
In customer-facing companies, your response to an outage – no matter how minor – will reflect the quality of your customer service. A rapid and meaningful response will reassure customers and quell any panic.
Make sure that you have a template email ready to go to everyone who could possibly be affected by the outage. That way, you can respond to any concerns with just one click of the mouse, and reassure your colleagues (and clients) that you are on the case.
Social media is a great way to get information across to a disparate client base. Prepare and save a simple status update (e.g. “Our system is currently experiencing a brief outage. Please be patient while we restore all services to their usual high standards”) which can be copied and pasted into Twitter, Facebook, LinkedIn or any other social network where your company has a profile.
If the outage has not been resolved within 20 minutes, send another update with more details on the problem (e.g. “Thank you for your patience, we have identified the cause of the outage and are taking XYZ steps to repair it”).
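The 20-minute rule above is easy to lose track of in the middle of an incident, so it can help to let a small script keep time and fill in the saved template. This is a rough sketch in Python, not a finished tool: the function names and the wording of the template are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Interval between status updates, taken from the 20 minutes suggested above.
UPDATE_INTERVAL = timedelta(minutes=20)

# Hypothetical saved template; adapt the wording to your own company.
FOLLOW_UP = (
    "Thank you for your patience. We have identified the cause of the outage "
    "and are taking the following steps to repair it: {steps}"
)


def update_due(last_update, now, interval=UPDATE_INTERVAL):
    """Return True once at least `interval` has passed since the last update."""
    return now - last_update >= interval


def follow_up_message(steps):
    """Fill the saved template with a short description of the repair steps."""
    return FOLLOW_UP.format(steps=steps)


last_update = datetime(2024, 1, 1, 9, 0)
if update_due(last_update, datetime(2024, 1, 1, 9, 20)):
    print(follow_up_message("rerouting traffic to the backup circuit"))
```

The message itself would still be pasted or posted by a person; the sketch only decides when an update is due and keeps the wording consistent.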
By communicating regularly, you not only show that you are on the case and managing the situation, but you also reduce the likelihood of your customers – internal or external – feeling that they need to contact you to find out what is happening. That is frustrating for both of you and, of course, answering every call and email means it will probably take longer to find the solution to your outage!
Regular communication updates aren’t simply a ‘nice to have’; they are an absolutely vital part of the overall process.
Keep all your messages short and informative – this is not the time for joking around, launching into lengthy technical explanations or sharing unverified theories.
Once the outage has been resolved, be sure to let everyone know through all the relevant channels. Include some brief information about what caused the outage (in layman’s terms, of course!), make yourself available to answer any questions that your colleagues or clients may have, and let them know that a full report will follow.
Make sure that you follow up over the next few days to reassure the affected parties that this particular problem won’t happen again, explaining the corrective measures that you have taken. Finally, hold monthly supplier relationship reviews. They help you understand past outages and prevent them from recurring and, more importantly, help you identify trends so that the same outage does not occur in a different region or part of your network.
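Spotting trends of this kind can start with something as simple as counting past outages by region, vendor or cause. The sketch below uses Python with an entirely made-up outage log; the record fields and the `recurring` helper are hypothetical and only illustrate the idea.

```python
from collections import Counter
from typing import NamedTuple


class Outage(NamedTuple):
    region: str
    vendor: str
    cause: str


def recurring(outages, attribute, min_count=2):
    """Return values of `attribute` seen in at least `min_count` outages."""
    counts = Counter(getattr(o, attribute) for o in outages)
    return {value: n for value, n in counts.items() if n >= min_count}


# Illustrative records only: regions, vendors and causes are invented.
SAMPLE_LOG = [
    Outage("north", "ExampleNet", "fibre cut"),
    Outage("south", "ExampleNet", "power failure"),
    Outage("east", "SampleTel", "power failure"),
]

print(recurring(SAMPLE_LOG, "cause"))
print(recurring(SAMPLE_LOG, "vendor"))
```

In this sample, power failures appear in two different regions and one vendor appears twice, which is exactly the kind of pattern worth raising at a supplier review.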
Having the data at your fingertips, displayed in a simple and transparent manner, is key to managing and preventing outages going forward, and this is exactly what temforce’s outage management utility provides.
Want to know how to measure the ROI of your TEM? Don’t miss our post “The Top 4 Ways to Measure the ROI of TEM”.
Temforce is a provider of Telecom Category Management SaaS enabling teams to boost their Telecom Management!