Wednesday, August 19, 2020

Similarities between SANER and Application Monitoring

Opsview Monitor 6.0 Dashboard

SANER falls into a space of healthcare that most Health IT developers aren't familiar with, or at least don't realize they are familiar with.

This post shows how measures of situation awareness fit into the existing math and science, quality measurement, and software monitoring and reporting techniques that software developers and system architects already understand well.

If you live in the enterprise or cloud-based software development space (as I have for decades), you've built and/or used tools for application monitoring.  Your tools have reported or graphed one or more of the following:

  1. The status of one or more services (up/down).
  2. The stability of one or more services.
  3. Utilization as compared to available capacity (file handles, network sockets, database connections).
  4. Events within a time period and cumulatively over time (total errors, restarts, other events of interest, hits on a web page).
  5. Queue lengths (outstanding http requests, services waiting on a database connection, database locks).
  6. Average service times (also 50th, 75th and 90th percentile times).

This is all about situation awareness, where the "situation" is your application.  There's tons of science (and math) around the use, aggregation, et cetera, of these sorts of measurements.  People write theses to get master's degrees and PhDs to advance the science (or math) here, or just to implement some of it.
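
As a rough illustration of two of the measurements in the list above (utilization against capacity, and average plus percentile service times), here is a minimal Python sketch. The function names and the nearest-rank percentile method are my own assumptions for the example; real monitoring tools often use other percentile definitions and streaming estimates.

```python
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are at or below it (an assumed definition)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Ceiling of pct * n / 100 using floor division, clamped to at least 1.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def summarize_service_times(samples_ms):
    """Average service time plus the 50th, 75th and 90th percentile times."""
    return {
        "mean": mean(samples_ms),
        "p50": percentile(samples_ms, 50),
        "p75": percentile(samples_ms, 75),
        "p90": percentile(samples_ms, 90),
    }


def utilization(in_use, capacity):
    """Fraction of available capacity currently in use (e.g. database connections)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return in_use / capacity


if __name__ == "__main__":
    # Hypothetical request times in milliseconds and a hypothetical connection pool.
    times_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(summarize_service_times(times_ms))
    print(utilization(in_use=45, capacity=60))
```

The same handful of functions covers most of what a dashboard shows: a count, a ratio against capacity, and a distribution summary.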

Let's look at this again from a different perspective:

  1. The status of one or more services (up/down):
    1. Is your ED open?
    2. Do you have power?
    3. Do you have water?
    4. Do you have PPE?
    5. Do you have staff?
  2. The stability of one or more services:
    1. Are you going to run out of a critical resource sometime soon?
    2. Do you have enough staff?
  3. Utilization as compared to available capacity:
    1. How many beds are filled and free in your hospital?
    2. Your ICU?
    3. How many ventilators are in use or free?
  4. Events within a time period and cumulatively over time:
    1. How many tests did you do today and over all time?
    2. How many were positive?
    3. How many admissions did you have?
    4. For COVID-19?
  5. Queue lengths:
    1. How many people with suspected or confirmed COVID-19 are waiting for a bed?
    2. How many COVID-19 tests are awaiting results?
  6. Average service times:
    1. How long is it taking to provide lab test results for patients?
    2. What is the average length of stay for a COVID-19 patient?

I'm not making these questions up; they come from real-world measures that many organizations around the world are reporting today.  All of the above are, at some level, essential elements of information for managing the public health response to COVID-19 or another emergency.
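
To make the parallel concrete, here is a small Python sketch of a hospital "status snapshot" that answers a few of the questions above with the same arithmetic an application monitor would use. Everything in it is hypothetical: the `HospitalSnapshot` class, its field names, and the sample numbers are assumptions made up for this example, and it is not the SANER data model or any organization's actual reporting format.

```python
from dataclasses import dataclass


@dataclass
class HospitalSnapshot:
    """Hypothetical point-in-time counts for one facility (illustrative only)."""
    ed_open: bool                 # status: is the ED open?
    beds_total: int               # capacity
    beds_occupied: int            # utilization
    ventilators_total: int
    ventilators_in_use: int
    tests_today: int              # events within the period
    tests_cumulative: int         # events over all time
    positive_tests_today: int
    patients_awaiting_bed: int    # queue length
    tests_awaiting_results: int   # queue length
    ppe_units_on_hand: int        # stability: how long will supplies last?
    ppe_units_used_per_day: int

    def bed_utilization(self) -> float:
        """Occupied beds as a fraction of total beds."""
        return self.beds_occupied / self.beds_total if self.beds_total else 0.0

    def ventilators_free(self) -> int:
        return self.ventilators_total - self.ventilators_in_use

    def positivity_rate_today(self) -> float:
        """Positive tests as a fraction of tests performed today."""
        return self.positive_tests_today / self.tests_today if self.tests_today else 0.0

    def ppe_days_of_supply(self) -> float:
        """Days until PPE runs out at the current daily use rate."""
        if self.ppe_units_used_per_day == 0:
            return float("inf")
        return self.ppe_units_on_hand / self.ppe_units_used_per_day


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    snapshot = HospitalSnapshot(
        ed_open=True, beds_total=200, beds_occupied=150,
        ventilators_total=40, ventilators_in_use=10,
        tests_today=80, tests_cumulative=5000, positive_tests_today=20,
        patients_awaiting_bed=6, tests_awaiting_results=35,
        ppe_units_on_hand=1200, ppe_units_used_per_day=300,
    )
    print(snapshot.bed_utilization(), snapshot.ventilators_free(),
          snapshot.positivity_rate_today(), snapshot.ppe_days_of_supply())
```

Swap "beds" for "database connections" and "patients awaiting a bed" for "outstanding HTTP requests" and the structure would not need to change.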

Hopefully you can see how the measures being requested are basically the same things you've been using all along to monitor your applications, except that here they are being used to monitor our healthcare systems.

   Keith

