— September 5, 2017
In most companies, the IT Department has always been a bit like “the distant cousin” who knows how to fix the washing machine: its existence is remembered only when something goes wrong, and most of the time it ends up being blamed for the original problem.
The market’s self-correcting momentum has led to the emergence of supporting methodologies and frameworks that aim to promote both the effectiveness of IT Services (in the case of ITIL) and the place of the IT Department in support of the corporation (in the case of COBIT).
ITIL sets out a framework that classifies and manages the lifecycle of key corporate IT assets (Configuration Items), whereas COBIT promotes proper workflows between the IT Department and the remaining corporate areas, in order to leverage the potential of IT to support Corporate Core Business processes and address their pain points.
Both frameworks are perfectly aligned with the LEAN Manufacturing and Operations concept, as well as with Continuous Improvement and Error Mitigation frameworks and methodologies such as Six Sigma or Kaizen.
The point is that a Corporate Change Management Process towards going LEAN (which implies investing time and money, for example in implementing Six Sigma or Kaizen) will add value and raise efficiency throughout the entire organizational value chain, whereas implementing COBIT is primarily perceived as something that will, at least initially, mainly improve the IT Department’s ability to support the organization.
ITIL, on the other hand, shows a more direct line of sight to benefits, since it can contribute significantly to mapping, monitoring and improving organizational processes that directly impact the Corporate Core Business (direct cost savings).
Now, from the CIO’s standpoint, as the manager in charge of assuring the performance of internal corporate IT Services, it is vital to have the main operational performance data on the corporate IT Landscape at his/her disposal and on demand. This is achievable by defining and establishing mechanisms and processes that enable reliable measurement of Key Performance Indicators (KPIs).
A Key Performance Indicator is a measurable value that demonstrates how well a given system or process is performing against its established objectives.
KPIs are usually clustered in two groups that represent distinct levels of detail:
- High-level KPIs – focus on overall performance and the added value to the organization.
- Low-level KPIs – focus on the performance of specific systems or processes.
Now how can KPIs be developed?
Since KPIs measure adherence to established goals and objectives, one needs to begin by defining those clearly and unambiguously.
The following step consists of taking each goal and objective separately and defining:
- What are the target and thresholds – target value and acceptable deviation
- How to measure – where is the data coming from and how shall it be collected and stored.
- When to measure – what is the measuring frequency that allows the best and most assertive data collection which assertively enables a view of
As an example, let’s pick an intuitive KPI that applies to virtually all client software: “availability”. Every user wants the software that he/she uses to be available every time it is needed.
So, the KPI here could be “Client Software Availability”.
Now, what should be the goal here?
As mentioned, the goal is to have the software available every time it is needed; therefore the target is 100% availability during office hours, meaning Monday to Friday from 08:00 to 18:00.
What is the acceptable deviation?
We can say that if, once a month, a client is unable to use the software for 30 minutes, that will not have a serious impact on the company’s Core Business. So, we take 22 working days in a month, multiply them by 10 work hours per day, and get a total monthly work time window of 220 hours (or 13,200 minutes). Having the software down for 30 minutes each month means an availability of about 99.77% (13,170 out of 13,200 minutes).
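The arithmetic behind this threshold can be reproduced with a few lines of Python. This is only a sketch: the figures come from the example above, while the function and constant names are illustrative assumptions.

```python
# Figures taken from the example in the text.
WORKING_DAYS_PER_MONTH = 22
WORK_HOURS_PER_DAY = 10       # 08:00 to 18:00
TOLERATED_DOWNTIME_MIN = 30   # acceptable downtime per month, in minutes


def availability_pct(window_min: float, downtime_min: float) -> float:
    """Availability as a percentage of the measurement window."""
    if window_min <= 0:
        raise ValueError("the measurement window must be positive")
    return 100.0 * (window_min - downtime_min) / window_min


window_min = WORKING_DAYS_PER_MONTH * WORK_HOURS_PER_DAY * 60  # 13,200 minutes
threshold_pct = availability_pct(window_min, TOLERATED_DOWNTIME_MIN)
print(f"{threshold_pct:.2f}%")  # prints 99.77%
```

Any month whose measured availability falls below this value would then be outside the acceptable deviation.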
How can the measurement be made?
Every workstation has event logs that can be checked to assess which downtimes, errors and process activations have occurred, and it is easy to configure a local routine that forwards a message to a central information repository upon any given incident.
When should the measuring be active?
Since this is an event-triggered action, the measurement takes place every time the event happens.
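As a rough illustration of how such event-triggered records could feed the KPI, the sketch below counts only the part of each incident that falls within office hours (Monday to Friday, 08:00 to 18:00). The incident records are invented for the example, and the code assumes naive local timestamps; a real central repository would have its own format.

```python
from datetime import datetime, time, timedelta

OFFICE_START = time(8, 0)
OFFICE_END = time(18, 0)


def office_downtime_minutes(start: datetime, end: datetime) -> float:
    """Minutes of the outage [start, end) that fall inside office hours."""
    total = 0.0
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            window_start = datetime.combine(day, OFFICE_START)
            window_end = datetime.combine(day, OFFICE_END)
            overlap_start = max(start, window_start)
            overlap_end = min(end, window_end)
            if overlap_end > overlap_start:
                total += (overlap_end - overlap_start).total_seconds() / 60
        day += timedelta(days=1)
    return total


# Hypothetical incident records (start, end) forwarded by workstations.
incidents = [
    (datetime(2017, 9, 4, 17, 45), datetime(2017, 9, 4, 18, 30)),  # Monday evening
    (datetime(2017, 9, 9, 10, 0), datetime(2017, 9, 9, 11, 0)),    # Saturday
]
downtime = sum(office_downtime_minutes(s, e) for s, e in incidents)
print(downtime)  # prints 15.0
```

Only 15 minutes of the Monday incident count, and the Saturday incident does not count at all, because the KPI target was defined for office hours only.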
The CIO’s KPIs
The CIO is both a Manager in the organization (therefore, responsible for running a cost center) and an Internal Services Provider who usually resorts to 3rd party entities to assure such services towards the organization.
Accordingly, the CIO has two separate groups of KPIs that support his/her “steering” activities: “Organizational” and “Operational”.
The Organizational KPIs pertain to the IT Department’s own metrics, such as Financial Performance (e.g. being in line with the budget), while the Operational KPIs address Service topics (e.g. IT Systems availability).
Let’s now list some of the main KPIs according to the groups established above, with a short explanation of each:
- System availability (including Systems and network) – Most of the systems that run on any corporate network, as well as the network itself, exist to support the Core Business. Therefore, it is of the utmost importance that they are available 100% of the required time.
- Downtime – Although it may seem redundant given the previous KPI, the downtime metric is relevant in its own right for assessing the behavior of the overall IT Landscape, whereas System Availability applies to each individual system independently.
- Average queue time of incoming phone calls – The end user is a key element in the corporate world and having someone waiting on the phone to be supported by Service Desk is a major waste of productivity, hence the relevance of this KPI.
- Lost calls – A complementary KPI to the previous one is the number of lost calls to the Service Desk, for it shows how many users, although in need of support, abandoned the call because it was taking too long (in their perception) to get an answer.
- Service desk client satisfaction index – This is one of the main KPIs for assessing how satisfied the company is with the performance of the IT Department in its main role of supporting the business.
- Ticket handling time – This is an efficiency metric; when compared over time, it shows whether the service is improving or deteriorating.
- Adherence to Response time – Each system is classified according to the severity and extent of damage towards the Core Business that a severe incident in that same system is likely to represent. That leads to the establishment of required Response Times to mitigate such potential impact.
- Adherence to Resolution time – Known systems have service support patterns defined over time, which makes it possible to establish an acceptable average time for performing a corrective action, and therefore to define a Resolution Time. Note that this is a very “delicate” KPI, for in most cases critical incidents require involving the software manufacturer, since they are the ones with access to the code.
- Service Restoration time – It is most common to have this KPI defined instead of the one above, for it only requires having the system running again, even though the incident’s root cause may still be unknown.
- Backup and restore success rate – One of the main business resilience aspects of any company is the ability to safeguard vital data and have it restored in case of major loss. This KPI is in fact a combination of two: the Successful Backups rate, which measures to what extent regular backups run successfully, and the Successful Restore Operations rate, which measures to what extent restore procedures run successfully.
- Service desk tier 1 resolution rate – This KPI mirrors a key metric for any operation: the ability to continuously improve the rendered service. Once a system comes into place, it is normal to have most incidents escalated to the 2nd or 3rd level support teams. As time goes by, however, the knowledge of how to handle incidents and events needs to be documented in simple procedures, so that at a given point in time (usually within a year of deployment) the majority of incidents are solved by the Service Desk with no need for escalation to expert teams.
- Adherence to SLA – When resorting to 3rd party providers, an IT Services Support Contract is set in place between the corporate client and the provider company. The services to be rendered are regulated by Service Level Agreements (SLAs), which basically define WHAT is being supported, HOW, WHEN, for HOW MUCH, resorting to WHICH tools and assets, and measured according to which KPIs. A “must have” KPI is exactly this one, which demonstrates how well the provider is adhering to the overall SLA’s terms.
- Change driven Incidents rate – As someone once said, “the only constant in life is change”. As living entities, both the Corporation (driven by market trends and events) and its IT Landscape are subject to change. A change should not result in a new incident, so it is relevant to measure how many changes lead to incidents.
- Problems resolution rate – One of the most difficult things in rendering a continued service across a wide IT Landscape is identifying, dealing with and fixing Problems. A Problem here means a recurrent misbehavior of a given system or systems that shows up repeatedly across distinct incidents.
- Deployment success rate – Some regular changes within any IT Landscape concern patching, version updates and adherence to configuration standards. These are considered deployments, and they may each have a specific KPI per cluster of systems or even per system, or a global one, depending on the established corporate IT Policy.
- Security compliance rate – Security is another major pillar of any IT Landscape, and this KPI aims at assessing to what extent the landscape is compliant with established IT Security Policies. Again, the option may lie anywhere between one overall KPI and individual KPIs per system, depending on the approach and the degree of criticality at hand.
- Shadow IT percentage – A major factor in raising security and compliance risks within an IT Landscape is the proliferation or existence of unauthorized, non-sanctioned IT resources. These can range from simple unprotected access to a public cloud-based storage service, to which some user sends internal company documents, up to the installation of a server under someone’s desk that is connected to the network via a local shunt through his/her desktop. The percentage of these forms of IT resources within the entire Corporate IT Landscape is a strong indicator of security vulnerabilities.
- Preventive Maintenance backlog – One main factor in assuring operational status (meaning lowering the potential for disruption) is the accurate and timely performance of Preventive Maintenance Actions (a simple example would be cleaning database logs). This also counts towards the overall assessment of the Business Continuity and IT Resilience potential of a given organization.
- Percentage of systems covered by antivirus/anti-spyware software – Although it may seem logical that all systems need to have the latest version of such software installed and running, that is not the status quo in many companies. The only way to move towards full compliance is to be able to identify where the company stands “on the map” at a given point in time, hence the relevance of this KPI.
- Full-time employees (FTEs) as a percentage of total IT staff capacity – Outsourcing stands for many things, amongst which: flexibility, on-demand expertise, cost optimization, innovation and so on … or it should, if properly done. Therefore, it is relevant to understand the ratio between in-house resources and 3rd party ones.
- Average 3rd party hour rate – This is a KPI that contributes to a benchmark analysis and decision-making process on what needs to be optimized in terms of service costs.
- Average hour rate over Priority 1 and 2 Systems – In a similar way to the previously mentioned KPI, this one gives a picture of how much is required to assure support towards the most critical systems in the company and how such expenditure evolves.
- Costs of operating call center/service desk – Same as the previous one, but for the service desk.
- Adherence to budgeted expenses – A common, self-explanatory KPI for any Cost Center. Budgets need to be properly estimated and not overrun, for overruns negatively affect all major corporate KPIs such as Profitability, EBIT, and so on.
- Cycle time for expense reimbursements – This is a major KPI for team satisfaction. When any team member spends his/her own money serving the company and is entitled to be reimbursed, the company must assure an expeditious process.
- Cycle time to resolve an invoice error – In most companies, the data for each area’s or department’s invoices is filled in by someone from that same department. In case of error, the correction process needs to be swift, since the company may incur expenses, such as VAT payments, that take a long time to be reimbursed.
- Fixed costs optimization ratio – The effort towards Continuous Improvement also needs to keep in sight the financial component inherent in each asset and work process. One of the tasks of each department manager is to seek cost optimization.
- Percentage of payable invoices without purchase order – In most companies, issuing a Purchase Order without having involved the Procurement area is a major noncompliance. The alignment of the area and its manager with corporate rules and guidelines is also demonstrated through this KPI.
- Average number of training hours per employee – Having a team that spends 99% of its work time at the desk, having attended its last training action some years ago, does not stand for productivity but rather for technological illiteracy and an outdated skill set. One main requirement of a good manager is the ability to keep his/her team up to date with trends in the department’s core competencies and with new, improved ways of operating.
- Average time to competence – The efficiency of the welcome/onboarding process for new members represents time and money; therefore it is a core responsibility of any department manager to assure that this process is swift and effective.
- Productivity rate – Although one may wonder how it is possible to compare the productivity of an IT Department team with, let’s say, an assembly line team, the fact is that it is not only possible but advisable, for it allows both healthy internal competition and a clear picture of where there is room for improvement. It is nevertheless not an easy metric to harmonize across departments.
- Downtime – No, it’s not misplaced. The Downtime Operational KPI concerns a given SLA within a services contract, which a given technical team needs to assure towards the CIO/IT Department, the internally accountable party for assuring IT Support to the organization. This one, in turn, pertains to the responsibility the IT Department has before the organization to assure the availability of the IT Landscape. Similarly, in a factory, the Logistics Department will set up a Delivery Time KPI for its Transport Provider companies, while assuming responsibility for fulfilling the same KPI internally towards the Assembly area.
Please note that the KPIs described here do not represent the full scope, but rather the most common ones in the market.
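Several of the KPIs listed above are simple ratios between two counts. The sketch below shows one possible way of computing three of them from monthly totals. The field names and figures are hypothetical, and each organization will define the exact numerator and denominator in its own SLAs.

```python
from dataclasses import dataclass


@dataclass
class ServiceDeskMonth:
    """Hypothetical monthly counts; the field names are illustrative only."""
    calls_received: int
    calls_abandoned: int
    tickets_closed: int
    tickets_closed_tier1: int
    changes_deployed: int
    changes_causing_incidents: int


def pct(part: int, whole: int) -> float:
    """Percentage of part over whole; 0.0 when there is nothing to measure."""
    return 100.0 * part / whole if whole else 0.0


month = ServiceDeskMonth(
    calls_received=1200,
    calls_abandoned=60,
    tickets_closed=900,
    tickets_closed_tier1=630,
    changes_deployed=40,
    changes_causing_incidents=3,
)

kpis = {
    "lost_calls_pct": pct(month.calls_abandoned, month.calls_received),
    "tier1_resolution_pct": pct(month.tickets_closed_tier1, month.tickets_closed),
    "change_driven_incidents_pct": pct(
        month.changes_causing_incidents, month.changes_deployed
    ),
}

for name, value in kpis.items():
    print(f"{name}: {value:.1f}%")
```

Tracking these values month over month, rather than reading a single figure in isolation, is what reveals whether the service is improving.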
IT Department’s role
IT is a basic corporate “toolkit” for any modern-day organization, and the IT Department is the internal team that assures that the “tools” are:
- The appropriate ones (aligned with Business needs)
- Running smoothly
- Available as needed
“Looping” back to the initial paragraphs of this article: most IT Departments in major Corporations already assure ITIL-based services with proper KPIs in place, but still lack integration into the corporation’s operational workflow. One way to achieve it is (as mentioned) through the implementation of COBIT (or a similar methodology).
To give a brief holistic perspective on the subject, COBIT is a methodology that supports the alignment of enterprise IT Governance with the Corporate Core Business.
The main objectives of a COBIT implementation should, therefore, be to:
- Assign objectives and duties to both Business Department managers and the CIO that create value to the organization.
- Plan the deployment of resources that fully address the corporate operational scope (processes and functions), therefore building, monitoring, and improving both IT and Business processes, frameworks, KPIs, guidelines, maturity and operational models.
- Focus on cost optimization through innovation.
- Focus on Security and Business Continuity.
- Contribute to creating corporate-wide awareness of IT’s potential in supporting the Core Business.
- Set Governance and Management as two separate workflows, hence ensuring successful momentum (needs are driven by stakeholders while management assures steering).