Resilience Roundup - Navigating through Large Display Networks in Dynamic Control Applications - Issue #31

If you’re in the Austin area: I’ll be speaking at DevOps Days Austin this week. If you’re there, come by and say hi!


Navigating through Large Display Networks in Dynamic Control Applications

This is a paper by Dave Woods and Emilie M. Roth from the Proceedings of the Human Factors Society 34th Annual Meeting in 1990.

One of the things that I really like about this paper is that it was written just as control rooms (in nuclear power plants, for example) were starting to change over to computerized displays. That was almost 30 years ago. And yet the problems the authors describe don’t seem to have improved much; perhaps in pockets, but not across the board in software.

Obviously, some implementations of large monitoring displays have gotten better, but I don’t think that that is a universal truth. Thinking about some of the dashboards I’ve seen (or written for that matter), they vary widely in how effective they are in supporting diagnosis and investigation.

The authors say:

“we can use the power of the computer to put each piece of plant data in context that makes it meaningful for particular mode of system operation.”

I certainly agree; I just haven’t seen it much. Further, they say:

“we can bring the data they need to the practitioners and display the data in different ways corresponding to the practitioner’s task”.

I don’t recall seeing this very often either. Sure, there are dashboards for a database or application or network, but very few that I’ve seen are geared towards my task rather than towards the system subset that the metrics came from.

The good news is that there isn’t much of a technological limitation holding us back here, which leaves us lots of ways to improve our existing setups.

The authors contrast traditional displays that have control panels and hardwired gauges with replacement systems that have computer monitors. They mention that the classic way of representing data in a computer-based system is hierarchical, and that a hierarchical system could indeed work if the computerized system only had 30 to 40 individual displays available.

“but when we are dealing with fully computerized control center that includes thousands of displays, hierarchical organization is inadequate and does not provide the navigational tools needed to deal with the huge space of data display that is possible”

This seems very relevant to me today even almost 30 years on.

It’s important to note that the authors emphasize that many of the human factors issues in these control systems are not new. It’s not that the old system was great and is hard to replicate in a computer setup. It’s that the physical layout, along with the adaptations that grew up around it, had some benefits that are lost when we move to computerization. Also, the amount of data and the number of ways it can be displayed keep increasing. This opens up the possibility of new failure modes and new difficulties that weren’t present in the old system.

“The shift to more computerization and control centers, where the computer can do much more processing, doesn’t eliminate all of the hard problems in control center design”

In the old system, with hardwired physical control panels, you could walk into the room and immediately observe a lot of the design of the representation. You could see the work put into it. You could notice, for example, how things were clustered or how many inputs were available to monitor. In the computerized system it’s really the opposite: you can see almost none of the design of the representation. Sure, you can see where the monitors are, maybe even where the computers are located, but you can’t see any of the design results or the potential complexity that is there.

“the real design action potential complexity is behind the screens in the thousands of displays that an operator could call up”.

The authors give two examples: a then-new plant in France whose fully computerized control room had over 10,000 different displays that an operator could call up, and an operating room patient monitoring system that had over 150 different screens to operate.

Previously, in the old system, people didn’t have to navigate these large display spaces without aids. They could walk along the panels, so the navigational options were more apparent. With computers, we now have to navigate a “virtual space” of many, many more displays.

This opens us up to new problems where we can get lost in the data, or we can have tunnel vision and only focus on a subset of displays, not to mention the mental overhead, now increased, as we have to manage how the data is displayed in an interactive system. The authors note that even in the old system, data overload during fast-moving events was already a problem in control centers. So the move to computerized displays had (and still has) a chance to either help or hurt, depending on how we design and use them.

I found it interesting that the authors talk about how easy it is in the computer system for a designer to rapidly prototype or create new displays. This really struck home for me:

“The use of designer aids that support rapid prototyping of displays, as is necessary when one is developing large scale display systems, can lead to a proliferation of displays without adequate consideration of across display organization and navigation issues”

I’ll admit that there is the possibility that I simply have had bad luck in the monitoring dashboards that I’ve been exposed to across a number of teams and organizations. But I don’t think that that’s the case.

I think that there’s a good chance we’ve all had at least some subset of similar experiences. Whether you agree or disagree, please reply and let me know; I’d love to hear more about your experience.

I’ve definitely done this before, throwing together dashboards without these considerations. I think it’s probably the norm. And of course, it’s not that any one person or team is twirling their mustache as they design dashboards, happy to have done dastardly deeds. But in software we can be prone to not thinking enough about the operator, especially in emergencies.

“in part, this can occur because, when it is technologically easy to create and add a new display to the system, the solution to every problem can end up being a new computer display”

“This one case illustrates the potential for misuse of rapid prototyping: with rapid prototyping techniques, you can make the same mistakes, only more quickly and on a larger scale”

Designing workspaces

When you design a workspace, you really need to think about the types of data that are going to be shown and how they can be coordinated across displays. I think in software, especially for software engineers, we can fall into a trap where we try to make things incredibly flexible so that they can accommodate any sort of use. The authors address this:

“Total flexibility, i.e., any display chunk can appear in any viewport as the observer chooses, represents a failure to design the workspace”

That sums it up well. It’s not that we’re trying to do poorly here, obviously, but as an industry we may not be purposely designing workspaces very often.

The authors continually emphasize the danger of creating a keyhole effect, where the operator can no longer see the big picture and becomes fixated. This happens because the amount of data that we could display is so much larger than our available views. This is true regardless of whether those views are physical monitors or screen real estate. That leaves the main challenge in developing these systems: how to balance the flexibility and power we have with software against still being able to support the people who operate it and the critical functions they’re monitoring.

Using traditional methods to help evaluate

Looking at the traditional ways of solving this problem without more computation or software, some paradigms emerge that can be held up as an evaluation tool. We can use them to see if the design that we’ve come up with supports or is failing the cognitive activities that occur in their usage.

When you enter one of these rooms, the first thing you’d see is either large screen displays or a rectangular grouping of lights that show different statuses by being lit. Both of these give some direction and hints about the overall status of the system if you’re familiar with the domain. Stepping back from how that’s implemented in physical space, we can then notice a few different cognitive patterns.

We should remember that these control centers almost always involve multiple people or, as they like to say in the research, “multi-agent settings”. This could be anything from two or three pilots on a plane plus the autopilot to 15 to 20 people in mission control.

Overall status indicators, regardless of how they are implemented, provide a common frame of reference for all the different people to use for problem solving. This is especially true in the physical space, where you know all these people have this panel in common, so it must be at least on some level shared by all. This gives a jumping off point from which to coordinate and share information since they know they’re all using the same physical representation of the system.

This overview means that everyone can get at least some sort of quick look at the state of the system. The authors call this an “orienting function”.
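To make the “orienting function” a little more concrete for software dashboards, here is a minimal sketch of a shared status rollup. It is not from the paper: the `Status` levels, the `overall_status` helper, and the display names are all hypothetical, and a real overview would need far more domain knowledge than “take the worst state”.

```python
from collections import Counter
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical severity levels, ordered so that max() picks the worst."""
    OK = 0
    DEGRADED = 1
    FAILING = 2


def overall_status(display_statuses):
    """Return the worst status seen plus a count of displays in each state."""
    counts = Counter(display_statuses.values())
    worst = max(display_statuses.values(), default=Status.OK)
    return worst, counts


# Made-up statuses for three made-up displays.
statuses = {
    "database": Status.OK,
    "api-gateway": Status.DEGRADED,
    "network": Status.OK,
}
worst, counts = overall_status(statuses)
summary = ", ".join(f"{counts[s]} {s.name}" for s in Status)
print(f"overall: {worst.name}; {summary}")
```

The point of the sketch is only that everyone looks at the same single summary first, the way everyone in the room could see the same panel, before drilling into their own displays.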

It turns out that when something changed in the system, maybe some sort of safety system activated, the operators would walk down the control board or scan across it. This allowed them to update their knowledge about what the system was doing and evaluate their previous assessments. The walk let them look at the automatic systems and their state, and helped them begin to diagnose what was happening in the system and detect aberrations.

A skilled operator is actually able to detect some abnormalities that they were not alerted to or even necessarily looking for. They can do this because as they’re walking down that space or scanning across these different control centers, they’re also seeing the displays that are in between. There are things that can still catch their eye if they’re out of place.

Next, as is true of many of our software incidents, when there is trouble there are very often going to be new people coming into the incident or the scene to either help monitor, help control or help diagnose. This ability to bring new people up to speed and integrate them into the response without having to bother the operators is critical.

Another function we can extrapolate is attentional control: being able to shift where we’re paying attention. In this case, that specifically means the ability to quickly shift views so that we can continue to track an evolving incident. Effective representation systems keep us from becoming too focused on a single trouble spot and allow us to shift our attention to other areas that may also be experiencing trouble. They can help us step back from our zoomed-in view and size up the situation again with as little cognitive effort as possible, especially in a way that doesn’t disrupt our current diagnostic process.

Finally, we can consider how the person using this representation decides where to look next. If you’re looking at a single window of data, how do you know where to go next without the risk of getting overloaded by the data?

The representations that we create should support separating irrelevant data from relevant data across a variety of contexts. This goes back to the keyhole effect. If the representation doesn’t give a good way to zoom out, tell the user where to look next, or help them filter all the possibilities of where they could look next, then it’s likely that they will get trapped in this view and overloaded.
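As a rough illustration of what “where to look next” support could look like in a dashboard, here is a minimal sketch. Everything in it is an assumption of mine rather than something from the paper: the `Display` record, the `anomaly_score` field, the threshold, and the tie-breaking rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Display:
    name: str
    subsystem: str
    anomaly_score: float  # hypothetical 0.0-1.0 measure of how unusual the data looks


def look_next(displays, current_subsystem, limit=3, threshold=0.5):
    """Suggest a handful of displays worth opening next.

    Displays scoring below the threshold are filtered out as irrelevant for now.
    The rest are ranked most anomalous first; on a tie, displays outside the
    subsystem the user is already viewing come first, to counter tunnel vision.
    """
    candidates = [d for d in displays if d.anomaly_score >= threshold]
    candidates.sort(key=lambda d: (-d.anomaly_score, d.subsystem == current_subsystem))
    return [d.name for d in candidates[:limit]]


# Made-up displays and scores for illustration.
displays = [
    Display("db-latency", "database", 0.9),
    Display("db-connections", "database", 0.7),
    Display("net-packet-loss", "network", 0.7),
    Display("cache-hit-rate", "cache", 0.1),
]
suggestions = look_next(displays, current_subsystem="database")
print(suggestions)
```

Capping the list at a few entries is a deliberate choice here: the goal is to narrow the space of possible next views, not to hand the user another long list to scan.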

“the technological shift to a computer medium is a double-edged sword. While it provides new representational power for supporting cognitive work, it also undermines some partially successful adaptations that have been worked out for the previous medium. It provides the capability to create much worse control centers, as well as much better ones than our previous baseline”

Takeaways

  • The design of physical control panels can give us another way of evaluating our designs.
  • The ease with which we can create more views and dashboards poses a risk that we create many poorly designed ones.
  • Regardless of how we choose to represent data, it’s critical that we consider how we can allow for coordination and exploration of that data without damaging the diagnostic processes the users are going through.
  • We have a great ability to help these concerns or harm them.
  • Making our software and our views too flexible can make the problem worse.
