Demo: Hyper-Automation of IT Change Requests and Incidents

Previous demos:
  1. Change Request auto-execution: Creation of Oracle GoldenGate deployment
  2. Incident auto-resolution example: A broken Oracle GoldenGate replication
  3. Change Request: Enterprise-wide omnivendor emergency password rotation

INTRODUCTION

Within the next 5-8 years, most skilled IT labor will become a commodity, similar to electricity or water. That means system, database, and network administration jobs will be replaced by code, automated, and billed hourly by the Cloud providers. Just as an OS arrives preinstalled on your laptop, these highly paid occupations will be digitized and offered as premium services with a Cloud product you subscribe to. Flip a switch, and voilà: you’ve just hired a 100-person team of top-notch system or database administrators at a fraction of the cost, and with less headache, than a human team.

Jobs that have disappeared over time include lamplighters, telegraph operators, and human computers (the NASA employees who performed calculations on graph paper, calculations that took weeks then and that any cell phone can do today in a fraction of a second). Those jobs became obsolete because something, a process or a device, could perform them faster, cheaper, or better (or all of the above). In the IT realm, that was impossible while hardware was physical. But now that servers, routers, and switches are reduced to simple JSON code, the jobs that maintained them will be reduced to code as well. This demo shows how that works.

Currently, IT professionals resolve ServiceNow Incidents and run Change Requests manually. These tasks consume as much as 90% of an administrator's workload, sometimes on weekends, on holidays, and outside business hours. The manual process is time-consuming and prone to delays and errors, and it raises obvious reliability, security, and scalability issues. On top of that, the on-call rotation eventually leads to personnel burnout.

There is a better way now: closed-loop automation, which is like putting most of your work on autopilot. The following 30-second video explains how the process works: 

Closed-loop vs open-loop automation 

Hyper-automation of an IT enterprise is achieved by two means:
  1. Incidents should be auto-resolved, from error detection through resolution, all the way to the knowledge repository update.
  2. Change Requests should be run by end users, because only they know WHAT needs to be done, though at a very high, non-technical level. Here is an example: "I need this data to flow over there, because we are onboarding a new merchant". The technical part, HOW it must be done, should be handled by automation.

The difference hyper-automation makes:

Blazing Speed

How would you like your six-month-long projects completed in under two months, and always under budget? What if you only had to wait seconds for a complex technical problem to be solved, instead of the weeks it usually takes for people to get around to it? Or a change request delivered faster than a vending machine sells you a Cola?

Here is the life cycle of a conventional IT change.

Life Cycle of an IT Change

The assessment, routing, approval, and implementation stages alone may take weeks. The execution stage often takes hours or even days as well. In some cases, the administrator may become ill or unavailable, prompting the escalation manager to seek an alternative. With hyper-automation, however, there is no waiting. The delivery starts in seconds, be it an Incident resolution or a Change Request execution.

Life Cycle of a Hyper-automated IT Change

Here is another example: a long-running project, such as a fleet upgrade or a DC migration (depicted below).

Project Duration Conventional 

These projects take time for two reasons. First, unlike robots, people don’t work 24x7x365. Second, people are poor at multitasking and tracking dependencies, especially if the project involves multiple parallel tasks executed by disparate teams, departments, or business units. Automation has neither limitation, which is why hyper-automation can condense the same six-month DC migration project into just two months.

Replayable, Automated project duration

Low Cost

What if you didn't have to hire a team of professionals to manually manage resource provisioning, asset configuration, issue troubleshooting, hardware setup, and so on? Automation saves time and money. How much money? Here is just one example of what hyper-automation can do.

Here is a comparison for an omni-vendor, enterprise-wide emergency password rotation performed on 10,000 database assets at the same time to address an immediate security threat. It compares a conventionally automated activity with a hyper-automated one. The duration of the task drops from 34 years (yes, that is how long it would take a team of professionals to do this!) to only a couple of hours, and the cost falls from $19.2 million to $200.

Password rotation automated vs hyper-automated.

You may ask: how is that even possible?

It became possible as soon as the manager assigned the emergency password rotation Change Request to a hyper-automated group, instead of a regular database team like the San Diego Database department group shown below. This department is not automated, of course.

the San Diego database group, un-automated

The conventional San Diego DBA team employs nine top-notch database administrators at a cost of slightly over $1 million per year, plus benefits. These people work only certain hours a day and certain days a week. Their knowledge and experience are not interchangeable, meaning that what one employee knows and does, the next may not be able to replicate immediately. Yet they are paid whether they are working, coasting, or entirely at rest. ServiceNow shows the hourly rate for a San Diego database administrator as $114/hr.

the San Diego database group, un-automated

But that San Diego team is a conventional, unautomated department. Let's assign the same Change Request to a hyper-automated team instead. This department is the perfect candidate: "Database Atlanta". It has only one human administrator, Bow Ruggeri; its second member is a hyper-automation module ("Replayable" is the name of the product).

the hyper-automated Database Atlanta group

Yes, the hyper-automation software appears as an employee in ServiceNow, but a very special one. This employee never eats, never sleeps, never has personal issues, and never asks for a raise. It works around the clock, 24x7. It is 1,000 times more productive than the nine-person San Diego database administration team. Here is how much the hyper-automated Database Atlanta group costs the enterprise: $17.12 an hour.

the hyper-automated Database Atlanta group description

Here is the hyper-automation module's HR entry in ServiceNow.

the hyper-automated Database Atlanta group description.


High Quality

The accepted human error rate is around 4%. Hyper-automation operates in an error-free realm. Unlike humans, it runs precisely what must be run, where it must be run, at the right time. That means projects delivered faster and well below budget, and features released ahead of schedule. That means a competitive edge, new customers, and bonuses for all involved (except for the automation software that made it possible, of course!).

People make mistakes. That's a fact. No matter how much effort your team puts in, they are bound to make errors, and several of them. These include running the wrong script, running it on the wrong database, not running it on time, not being prepared (not having the right password or connect string), not understanding the prerequisites or the handover process, and so on.

With Closed-Loop Automation, the execution logic remains unchanged. It doesn't depend on how much sleep the admin got the night before. It scans execution logs (sometimes thousands of them, in parallel) in real time as they are written. If it detects an error, it immediately opens an Incident in ServiceNow, with the error log attached, so that the DBA can investigate the failed CR. It does not understand the concept of "good enough" and honestly thinks that 9-5 is just a number.

Another way to describe hyper-automation is that it is always at its best, 24x7x365. Forget that: not just at ITS best, but at the INDUSTRY's best, because whatever the technology or vendor, hyper-automation modules have the vendors' best practices embedded. You execute Change Requests and resolve Incidents repeatedly, and get predictable results every single time. Automation=Quality=Control.

Here is the yearly error rate for Incident resolution, compared between a team of DBAs and the hyper-automation software.

Error rate comparison
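The log-scanning behavior described in this section can be pictured with a short Python sketch. It is only an illustration, not the product's actual code: the error patterns, the `scan_log` function, and the `open_incident` callback are all hypothetical names, and a real implementation would call the ticketing system's API where this sketch calls the callback.

```python
import re
from collections import deque
from typing import Callable, Iterable, List

# Hypothetical error signatures: Oracle-style "ORA-nnnnn" codes and
# lines tagged ERROR or FATAL. A real deployment would tune these.
ERROR_PATTERN = re.compile(r"\b(ORA-\d{5}|ERROR|FATAL)\b")


def scan_log(
    lines: Iterable[str],
    change_request: str,
    open_incident: Callable[[str, List[str]], None],
    context: int = 3,
) -> bool:
    """Read log lines as they arrive and stop at the first error.

    On an error, call open_incident with the Change Request number and
    the most recent `context` lines (the error line included), so the
    log excerpt travels with the Incident. Returns True if an error
    was found, False if the log ended cleanly.
    """
    recent = deque(maxlen=context)  # rolling window of the latest lines
    for line in lines:
        recent.append(line.rstrip("\n"))
        if ERROR_PATTERN.search(line):
            open_incident(change_request, list(recent))
            return True
    return False
```

Because `lines` can be any iterable, the same function works on a finished log file or on a generator that yields lines while the script is still running.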


Ownership

Resource configuration is a labor- and skill-intensive task, usually handled by highly paid and sought-after professionals. When one of them leaves the organization, they take their knowledge with them. Not only that, the years of training you've invested in that person will start benefiting your competition from then on. The replacement you hire will need time, effort, and even more training to understand the predecessor's code, which they will ultimately discard to write their own. That effort, in turn, won't be reused either. With hyper-automation, the code always remains with the organization.

What used to be physical (the servers) is now just definition code that tells the Cloud what each server is. You owe it to yourself to protect your investment. With hyper-automation, the task definitions and automation workflows never leave your repository, where they are always properly secured, versioned, and tagged. The administrators don't even see the low-level code, which runs by itself when the end user clicks the "Submit" button in ServiceNow. Because the code is modular, it doesn't require a DBA to understand it.

Replayable code


Accountability

There is never any finger-pointing with our software. When you need to trace changes to definition files, you can do it with ease. They are versioned, so all changes are recorded for your review at a later time. There's never any confusion about which module did what, when, where, and why.

With the advent of the Cloud, it became almost impossible to keep up with different vendors and technologies. With automation, all the code lives in a single source of truth.

Replayable, ServiceNow

There is one centralized code repo (GitHub, GitLab, or similar), one encryption Vault for storing passwords and certificates, and one location for execution logs (ServiceNow ticketing), no matter which Cloud vendor or product is used. There’s never any confusion about what the automation did, where, when, and why. You execute it repeatedly and get predictable results every single time. There is never any finger-pointing when it comes to IT services automation.

Replayable, ServiceNow

Infinite Scalability

The automated solution can run 1,000 Change Requests at the same time, elevating admins from low-level button pushers and script runners to a more prestigious status as architects. With a Closed-Loop implementation, you may increase your workload exponentially before there is even a need for a Control Node upgrade.

Replayable, DBA rested

The load a conventional IT service management team carries can be increased by 5%-10%, and only for a few days. If the number or the priority of the requests were to double today, the workload would become unmanageable for the current headcount. It doesn’t matter how much money you throw at your employees. Overworked, they will eventually burn out and quit to serve your competitor.

Replayable, DBA tired

Our closed-loop automation software will do an equally brilliant job whether it runs on one server or a thousand at the same time. The image below shows a Change Request targeting a single server named "lnx1000" (it is a screenshot of the automation inventory file).

Replayable, scalability settings

If you want the same task run on a thousand servers, all you have to do is replace "lnx1000" with "lnx[000:999]". The automation software will run the same CR on servers lnx000, lnx001, lnx002, and so on, all the way to lnx999. A thousand of them!

Replayable, scalability settings

That is the equivalent of hiring a thousand DBAs at a moment's notice during a busy period or when your company expands, and releasing them just as quickly.

Replayable, scalability settings
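To make the "lnx[000:999]" range from this section concrete, here is a minimal Python sketch of how such a pattern can be expanded into individual host names. It is an illustration only: `expand_host_pattern` is a hypothetical helper written for this page, not the automation software's own parser, and it handles just one numeric range per pattern.

```python
import re
from typing import List

# Matches one inclusive numeric range such as "[000:999]".
_RANGE = re.compile(r"\[(\d+):(\d+)\]")


def expand_host_pattern(pattern: str) -> List[str]:
    """Expand 'lnx[000:999]' into ['lnx000', ..., 'lnx999'].

    A pattern without a range is returned unchanged as a one-item list.
    The width of the start value decides the zero padding.
    """
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]
    start_text, end_text = match.group(1), match.group(2)
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"range start {start} is greater than end {end}")
    width = len(start_text)  # "000" keeps three digits: lnx007, not lnx7
    prefix = pattern[: match.start()]
    suffix = pattern[match.end():]
    return [f"{prefix}{n:0{width}d}{suffix}" for n in range(start, end + 1)]


if __name__ == "__main__":
    hosts = expand_host_pattern("lnx[000:999]")
    print(len(hosts), hosts[0], hosts[-1])
```

The point of the sketch is that the list of targets is data, not labor: going from one server to a thousand changes one string in the inventory, while the Change Request logic stays the same.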

Secure Encryption

The automation uses AES-256 symmetric encryption. That means all sensitive data, whether in transit or at rest, is always encrypted: passwords, variables, certificates, API keys, and other credentials. The automation Vault prevents any exposure of sensitive data, even once. The automation never leaves execution logs on the servers it manages; all logs are stored centrally in ServiceNow and Ansible.

Replayable, vault

What is the best way to ensure your mission-critical passwords are never distributed to unauthorized personnel? The answer is obvious: do not distribute them to any personnel at all. Let the robots handle the sensitive data. That is what they are best at.

Each instance of storing a password on a hard drive in clear text, even for a second, violates basic database security auditing rules. The same applies to unencrypted network communication. For a small company, that may not pose a credible threat. Large enterprises, however, are often under strict auditing rules, and security violations are a matter of life and death. For example, if a quarterly audit finds elevated-privilege passwords stored in clear text on a server or transmitted unencrypted over the network, the auditing entity may revoke a long-standing contract. Hundreds of millions of dollars are at stake, and countless jobs are on the line.

Replayable, Ansible vault

Here are the usual suspects for automation in database administration, for example:
  • Database schema refreshes
  • New user creations 
  • Existing user password resets
  • Tablespace out of space 
  • Patches 
  • Upgrades 
  • Migrations 
  • Backups or restores 
  • Routine database script runs (this very demo!)

The Demo

The following explains in very general terms how this demo works. Usually, IT administrators receive their assignments (Change Requests and Incidents) from a ticketing system, such as ServiceNow. Here is the lifecycle of a conventional IT change:

Life Cycle of a Conventional IT Change

Pic 1. The life cycle of a conventional IT change

 

Lifecycle of a Hyper-automated IT Change

Pic 2. The life cycle of a hyper-automated IT change

In a hyper-automated implementation, however, there is no human intervention in the process at all (other than the initial design and maintenance of the automation logic). The picture above shows what the unattended version of the same process looks like.

This demo shows the most common DBA task: a script run. Normally, a DBA is notified of the Change Request by email and gets the details from the familiar ServiceNow (or another tool's) interface. The administrator then runs the scripts in one or more databases, as requested in the Change Request.

The hyper-automated Change Request execution process bypasses the administrator entirely. The ticketing system communicates with the automation control node in real time via the API. The approval and verification processes are also done in real time. The automation node then connects to the asset to be maintained (an Oracle database in our demo) via SSH, SQL*Plus, or an API, and executes the Change Request.
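The flow just described (check approval, execute each script, report back, open an Incident on failure) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `ChangeRequest` fields, the state names, and the `run_script` and `open_incident` callbacks are hypothetical stand-ins for the real ticketing API and for the SSH or SQL*Plus connection, neither of which is shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChangeRequest:
    number: str                 # hypothetical ticket number, e.g. "CHG0001"
    target: str                 # asset to change, e.g. a database name
    scripts: List[str]          # scripts to run, in order
    approved: bool = False      # set by the approval step (e.g. CAB)
    state: str = "new"
    work_notes: List[str] = field(default_factory=list)


def process_change_request(
    cr: ChangeRequest,
    run_script: Callable[[str, str], str],
    open_incident: Callable[[ChangeRequest, str], None],
) -> str:
    """Run one Change Request end to end and return its final state."""
    if not cr.approved:
        # Nothing runs until the approval step has signed off.
        cr.state = "awaiting approval"
        return cr.state

    cr.state = "implementing"
    for script in cr.scripts:
        try:
            output = run_script(cr.target, script)
        except Exception as exc:
            # Close the loop: record the failure and raise an Incident.
            cr.state = "failed"
            cr.work_notes.append(f"{script}: FAILED: {exc}")
            open_incident(cr, f"{script} failed on {cr.target}: {exc}")
            return cr.state
        cr.work_notes.append(f"{script}: OK: {output}")

    cr.state = "closed"
    return cr.state
```

In this sketch the loop stops at the first failing script, so later scripts never run against an asset that is already in an unexpected state. That is a design choice of the example, not a statement about how the product behaves.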

Here are the tools involved and the automation logic.

What tools are involved in the hyper-automation demo

Pic 3. What tools are involved

 

automation logic workflow

Pic 4. The automation workflow for this demo: what steps are performed

Now, for the demo. There are three short screen capture sessions (2-7 minutes each). Please watch them in their entirety. They show how an end user performs a complex IT change without the help of an IT professional, just like ordering an item on Amazon or streaming a movie on Netflix without knowing all the complexities of what happens behind the scenes.

Demo 1: Change Request auto-execution example: A routine data change ticket with 50 database scripts

Demo 2: Change Request example: Creating an Incident for automation failure 

Demo 3: Change Request example: executing an elevated-privilege PRODUCTION Change Request requiring a CAB approval 

SUMMARY

Hyper-automation enables a business to accomplish much more with far fewer resources (technology and personnel). It is time you started hyper-automating the most common Incidents and Change Requests until all, or at least most, are on autopilot, as I demonstrated above. Please contact us if you need help hyper-automating your IT shop.