Why Your Helpdesk Never Seems To Be Able To Catch Up (The Phoenix Project)

Where I come from in the world of consulting, you are either suffering from too much work or too little. Having worked with MSPs over the last decade, I can say the same is very true here in IT. We often blame our sales efforts for the issue of “feast or famine,” but all-you-can-eat Managed Services was supposed to help cure that by converting customers into recurring monthly subscribers.

So why is it that we still struggle with overflow on our helpdesk? Why are projects still not getting done, now that we don’t have to focus as much on sales as we did back when things were hourly?

A recent read through an industry classic (that I highly recommend to everyone here) got my mind spinning as to why MSPs struggle so much to keep their ticket count down. The Phoenix Project by Gene Kim, Kevin Behr, and George Spafford tells the (largely fictional) story of an internal IT department for the company Parts Unlimited that was constantly struggling to keep up with the bevy of incoming work. Instead of placing the blame on sales or development, the novel admits that internal IT operations is the problem when it comes to the constant state of emergency that many MSP helpdesks suffer from.

The book makes several key statements which I thought I’d summarize here. The end result should be shorter queue times for your clients, faster turnaround on work, and fewer tickets in your system. Also, you might not go grey as quickly as you did when everything was 15 alarms going off at once.

Lesson 1.) Identify Your Constraints

Helpdesks are often constructed in a rather flat fashion. You have tier 1, tier 2, and sometimes tier 3 for the really difficult problems. The idea behind this structure is to protect your “dragon slayers,” the top technicians who can get stuff done for you. If they were inundated with every little password reset request, then the entire helpdesk would grind to a halt. There are only so many hours in the day and you need to protect your best technicians so they can focus on the projects they need to focus on.

The thing people often don’t recognize is that their tier 3 techs might not be the bottleneck to their operation. Since the majority of the helpdesk is a lateral setup, that means you might have several people who are quite capable of handling tickets as they come in but maybe one tech is getting backed up or not properly prioritizing the tickets being given to them.

It takes a little time and perhaps a little development, but here are a few metrics I would be looking for to help find where tickets are getting stuck.

  • Longest average ticket close time (per technician)
  • Most open tickets (per technician)
  • Longest close time (per ticket)
  • Longest open tickets (per ticket)
  • Longest open tickets (per technician)
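
If your ticketing system can export its tickets, the per-technician numbers above could be pulled together with something like the sketch below. The Ticket fields, the technician names, and the sample dates are all made up for illustration; you would map them from whatever your own system exports.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Ticket:
    # Hypothetical record shape; map these fields from your own ticket export.
    ticket_id: str
    technician: str
    opened: datetime
    closed: Optional[datetime] = None  # None means the ticket is still open


def technician_metrics(tickets, now):
    """Per technician: average close time, open ticket count, oldest open age (hours)."""
    close_hours = defaultdict(list)
    open_ages = defaultdict(list)
    for t in tickets:
        if t.closed is not None:
            close_hours[t.technician].append((t.closed - t.opened).total_seconds() / 3600)
        else:
            open_ages[t.technician].append((now - t.opened).total_seconds() / 3600)

    metrics = {}
    for tech in set(close_hours) | set(open_ages):
        closed = close_hours.get(tech, [])
        ages = open_ages.get(tech, [])
        metrics[tech] = {
            "avg_close_hours": sum(closed) / len(closed) if closed else None,
            "open_tickets": len(ages),
            "oldest_open_hours": max(ages) if ages else None,
        }
    return metrics


# Made-up sample data, only to show the shape of the report.
now = datetime(2024, 1, 10, 12, 0)
tickets = [
    Ticket("T1", "alice", datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 11, 0)),
    Ticket("T2", "alice", datetime(2024, 1, 9, 9, 0), datetime(2024, 1, 9, 13, 0)),
    Ticket("T3", "bob", datetime(2024, 1, 5, 12, 0)),  # still open
    Ticket("T4", "bob", datetime(2024, 1, 9, 12, 0)),  # still open
    Ticket("T5", "bob", datetime(2024, 1, 9, 8, 0), datetime(2024, 1, 9, 18, 0)),
]
report = technician_metrics(tickets, now)
for tech, stats in sorted(report.items()):
    print(tech, stats)
```

Sorting the report by open_tickets or oldest_open_hours gives you a first guess at where work is piling up. It does not tell you why, which is the conversation to have with the technician.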

Does that mean you need to penalize your techs who have the worst average ticket close time? Not necessarily. The issue might not be the tech’s fault. In fact, they may be working their butt off and can’t seem to unbury themselves from the sheer mountain of work on their plate.

This issue often crops up for the best technicians in your MSP organization. They work hard, they get stuff done. But then they don’t take the time to document exactly what they did. Eventually they end up being some kind of “Specialist Savant” with your client’s machines and networks. As a result, you can’t seem to get much done for that client without your tech’s help because they know the ins and outs of the entire system and no one else does.

Identifying who is the weak link in the mountain of tickets will give you the opportunity to analyze why tickets are being backed up.

Identifying the tickets that took the longest to close (or have been open the longest) will help you with the next important point in the process of trying to speed up your helpdesk.

Lesson 2.) “Work-in-Progress is the silent killer”

The Phoenix Project identified the biggest time waster in all of IT: The queue.

More time is lost across your helpdesk to tickets simply waiting to be worked on than anywhere else.

So how do you tackle it? How do you bring down queue times?

The best answer for people who can afford it is to hire on a dispatcher. The job of the dispatcher is to make sure that tickets are rapidly assigned to the right resource and to reduce the backflow of tickets from one resource to another. In other words, tickets should be properly assigned to the right technician the first time, and they should be assigned at the speed they are being resolved.

A dispatcher can also protect your staff from prioritization errors. As the business owner you know who to prioritize, and why. But as a technician it is often easiest to just work on whichever wheel is the squeakiest. If someone is yelling loud enough, then they must be the most important to deal with.

This can rapidly build a very bad company culture, internally and externally. Clients feel they have to be jerks just to get good customer service, and your staff are constantly overwhelmed because they spend 99% of their day in high-adrenaline “fire-extinguisher” mode.

A dispatcher can protect your technical staff from this by enacting a system of ticket prioritization and making sure the right tickets are being worked on.

Don’t have funds to hire a dispatcher?

No problem. You can do this electronically, or have one of your techs take on the responsibilities of dispatcher. Also, if you are the primary tech in your MSP, then you might want to shuffle the role of dispatcher to another employee in your organization so they can help you prioritize the most important tickets to work on.

See my article here on who to hire first in your MSP.

The important part is that you have a single point of contact to manage the dispatching of new work to your technical staff. They protect your technical staff in a number of ways and will greatly improve your company culture if you don’t have one already.

If you can’t afford to hire someone yet and aren’t willing to do it electronically, then it might make sense to outsource it. I’m all for keeping things local, but this is a key job which is best separated from the technical staff. You need someone who can be courteous on the phone and organized enough to make sure that all incoming work is dealt with in the proper order.

Lesson 3.) “Multitasking is bad”

Multitasking is one of the biggest productivity killers of our age. A technician who is multitasking is probably wasting time on more than one thing. It might feel efficient to be working 10 tickets at once, but chances are you could have completed them faster, and probably better, one at a time.

Sure, there is always the “this process is going to take another 7 hours so I should start something else” case, but you will likely find that to be the exception rather than the rule. In the manufacturing world, the fix for an overloaded resource is called Drum-Buffer-Rope.

The idea is that you release work at the speed your constraint is able to complete it. So instead of piling 70 tickets on your key resource and saying “get to them as fast as you can,” you select the most important tickets and release new ones to the key resource only as earlier ones are completed.
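
To make the idea concrete, here is a minimal sketch of a dispatcher that holds the backlog and only releases a ticket when a technician has room for it. The Dispatcher class, its method names, and the wip_limit setting are my own illustration, not something from the book or from any ticketing product.

```python
import heapq
from collections import defaultdict


class Dispatcher:
    """Holds the backlog and releases tickets only when a technician has capacity."""

    def __init__(self, wip_limit=1):
        self.wip_limit = wip_limit           # max tickets a tech may hold at once (assumed tunable)
        self._backlog = []                   # min-heap of (priority, arrival order, ticket id)
        self._arrivals = 0                   # tie-breaker so equal priorities stay first-in, first-out
        self.in_progress = defaultdict(set)  # technician -> ticket ids currently released

    def submit(self, ticket_id, priority):
        """Queue a ticket; a lower priority number means more important."""
        heapq.heappush(self._backlog, (priority, self._arrivals, ticket_id))
        self._arrivals += 1

    def request_work(self, technician):
        """Release the most important waiting ticket, or None if the tech is at the limit."""
        if len(self.in_progress[technician]) >= self.wip_limit or not self._backlog:
            return None
        _, _, ticket_id = heapq.heappop(self._backlog)
        self.in_progress[technician].add(ticket_id)
        return ticket_id

    def complete(self, technician, ticket_id):
        """Mark a ticket done, which frees capacity for the next release."""
        self.in_progress[technician].discard(ticket_id)


desk = Dispatcher(wip_limit=1)
desk.submit("password-reset", priority=3)
desk.submit("server-down", priority=1)
desk.submit("new-laptop", priority=2)

first = desk.request_work("alice")    # the most important ticket comes out first
blocked = desk.request_work("alice")  # alice is at her limit, so nothing is released
desk.complete("alice", first)
second = desk.request_work("alice")   # capacity is free again, so the next ticket is released
print(first, blocked, second)
```

The technician never sees the whole backlog here. They ask for work, and the limit decides whether they get any, which is the "rope" part of the idea.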

Ever notice that you’re less stressed and you get more done when your inbox is empty? What if you had someone plan your work out for you for the day and you simply told them when you were ready for more?

Your techs suffer when they are staring at a board full of tickets. By releasing the important tickets to them first and then filling in the small ones as they go, your dispatcher can greatly improve their speed of resolution.

In fact, if you’re staring at a full board of tickets right now, the very first thing you should do is unassign all tickets not marked as in process. Then empty the board of in-process tickets as fast as you can. Once the board has been drained of work in process, have someone start going through the remaining tickets one at a time and, based on their importance, assign them to your techs as earlier tickets are completed. Your dispatcher can control the flow of work entering your technician’s inbox.
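
The first step of that reset might look like the sketch below. The board is a plain list of dictionaries, and the id, assignee, and status keys (along with the "in_process" status value) are assumptions for illustration; your ticketing system will have its own names for these.

```python
def drain_board(board):
    """Split a board into in-process tickets to finish and unassigned backlog tickets.

    `board` is a list of dicts with hypothetical keys: id, assignee, status.
    """
    keep, backlog = [], []
    for ticket in board:
        if ticket["status"] == "in_process":
            keep.append(ticket)  # already started: finish these first
        else:
            backlog.append({**ticket, "assignee": None})  # back to the dispatcher's queue
    return keep, backlog


# Made-up board contents for illustration.
board = [
    {"id": "T1", "assignee": "alice", "status": "in_process"},
    {"id": "T2", "assignee": "alice", "status": "new"},
    {"id": "T3", "assignee": "bob", "status": "waiting"},
]
keep, backlog = drain_board(board)
print([t["id"] for t in keep], [t["id"] for t in backlog])
```

The original board is left untouched, so you can review the split before changing anything in your real system.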

They will (generally) work faster, and be happier.

But more importantly, they won’t have to be responsible for deciding what is the most important ticket to work on. Technicians will almost always make the wrong choice because they’ll work the easiest tickets (or the loudest) and push off the hardest tickets until later. But later rarely comes (if ever).

This also gives your technicians time to start performing proper documentation. Your dispatcher can also enforce documentation rules. By controlling the release of work, they can make sure that your technicians have properly documented what they did (and why) before releasing new work to them.

Just because their inbox isn’t stuffed doesn’t mean you need to ease up on their tickets-closed-per-day metrics. They can still close plenty of tickets per day, but only work on what is being fed to them.

I’m not necessarily saying that you should only release one ticket at a time to your technicians. But limiting the tickets they have available to work on will greatly improve the speed at which they work an issue. You can guarantee they’ll get to a ticket if it is the only one in their inbox right now. And they’ll work a hard ticket if they only have a choice of two difficult ones at the time.

But maybe you find that you can safely release three or four easier tickets to your techs at a time without slowing them down too much. This will take some experimentation.

Just don’t overload them. Tell them what you want them to be working on so they can focus their energy on making technical decisions for your clients and not for your business.

Lesson 4.) Documentation is Key

You can’t improve on processes which are not written down. MSPs are all about automation and process improvement. If you have your clients on the “all-you-can-eat” plan of Managed Services then the more time you spend working their issues, the less money you make as an organization.

Chances are, you probably deal with the same issues over and over again as an IT organization. But if you haven’t started documenting what those issues are, then you’re going to find the creation of “Savant Specialists” within your company. These are people who seem to be the “only person who can fix X client’s issues”.

While you want to hire the smartest staff in the industry, it would be much better to have average staff who know how to write things down. Savant specialists all suffer from the same two problems: a.) they don’t have enough time in the day to help everyone (even if they wanted to) and b.) they will eventually leave your company (or just call in sick) and leave you struggling to continue their duties without them.

They basically have a vice grip on your MSP and you are going to struggle unless you start forcing them to record what they are doing for your clients.

I’m not talking about writing up ticket resolutions. I’m talking about using a service like IT Glue to write up cheat sheets on how to do tasks for your clients or internal operations.

As an example, here at Virtual Administrator, we have a document on how to properly manage each action inside of our GMS server. That way if our lead tech for GMS were to be unavailable, the business doesn’t grind to a halt as we wait for his return.

It is ok to allow people to groom their technical skills and even to specialize in a certain area of expertise. But you should never be in a position where you cannot function without them.

And this also applies to you as the MSP owner. Don’t think that just because the buck stops with you that you are exempt from documenting your own daily practices.

Lesson 5.) One small improvement per week (or even per day) is all it takes to make a great IT Organization

It isn’t important to focus on massive internal improvements. Making small changes each week can have an incredible long-term impact on your MSP. Once you know your processes and have them documented it becomes a LOT easier to find small ways to improve as an organization.

Maybe you automate something. Maybe you identify something your staff is doing which they shouldn’t have to be doing. Perhaps they regularly reboot a server which shouldn’t need to be rebooted; setting aside an hour or two to track down the cause might eliminate future tickets related to that issue.

The world of DevOps is highly focused on small rapid deployments of new software. In the world of IT it is the same thing. Small, rapid deployments of process improvement will take you from an overwhelmed and underwhelming backwater MSP to a top tier MSP in a surprisingly short period of time.

The important thing is to build an internal culture that isn’t afraid to try new things to further improve the company as a whole. 3M is famous for how they encourage their employees to take time to experiment. Things like Scotch tape would never have existed if 3M hadn’t encouraged their employees to try new things.

Your staff probably has a bevy of ideas to try. Implementing one at a time and testing to see results will give your MSP an edge over everyone else in the industry.


Bottom Line: If you haven’t yet read The Phoenix Project, then it is high time you did.

It is a fascinating book, an excellent story, and you’ll be grinning like I was when you read about all the servers being down and it isn’t your responsibility for once!

I think we can take a lot of pressure off our helpdesks that we are needlessly putting on them. I’d love to hear your thoughts on this as well. What have you done to improve your internal helpdesk operations? Hit me up in the comments below.

Get the Book on Amazon.