Cumulus Networks

Build/IXD + UX/Product/UI Design

Cumulus Networks’ First Product with UI

Leading Kleeen Studio’s Engagement Team

from March 2018 - March 2020

NetQ was already a highly scalable, modern network operations toolset that provided real-time visibility into networks, and troubleshooting of them, through a command-line interface (CLI). After 3 false starts trying to design & build their first product UI (including using another design firm), Cumulus hired Kleeen Studio to design the new UI and UX and to implement the front end.

GOAL
NetQ was intended to be the flagship product for Cumulus Networks. The UI was meant to increase customer engagement and help close sales with a good proof of concept (POC), build relationships and trust between consultants and customers, instill confidence in the customer, and give users a DIY-like pride of accomplishment through self-service.

TARGET USER
The target end-user of NetQ was either internal IT/NetOps personnel (including network administrators, application engineers, compute virtualization admins, and network architects) or Cumulus consultants who were hired to facilitate sales POCs. NetQ therefore needed the flexibility to support everyone from junior network/IT analysts to senior network architects, from installation through to managing and configuring the network.


Discovery & User Research

Existing Research

Coming into this project, Cumulus had a very clear understanding of their personas, use-cases, and capabilities. In many ways, this product’s underlying capabilities were fully baked. We therefore started with an excellent body of end-user research conducted by the product-management team. In addition, they had recently conducted an extensive panel with end-users, which produced many valuable quotes and insights. Our main task for discovery was to understand the CLI product’s capabilities, comb through the extensive documentation on the use-cases, review the notes from the panel, and of course perform a competitive analysis.

Based on our in-depth review of Cumulus’ material, we knew that: 

PERSONAS – The 4+ targeted personas had an extremely wide variety of use-cases, levels of detail granularity, and technical skills. Moreover, much of their shared vocabulary for features implied different degrees of complexity (e.g. an application engineer might expect one or two inputs to “configure,” while a network architect might expect dozens of controls).

FEATURE RICH – Cumulus provides an extremely extensive set of capabilities, KPIs, and metrics, many of which are critical to solving some problems but not all, leaving large holes in workflows.

WORKFLOWS – The end-user flow is highly unpredictable and depends largely on the in-the-moment context and data, the particular network, and the user’s experience. As a result, the prior designs by Cumulus internally and by their previous design firm were unable to scale because they lacked flexibility.

 

Workshops to Uncover Use-Cases

To kick off the engagement with Cumulus, we started with workshops where we tried to understand which use-cases (from the large variety their product can handle) were to be supported in NetQ, which specific capabilities related to them, and how those use-cases related to each other and to the various personas. We held workshops with resellers (a primary user) and with multiple personas at existing customers.

This resulted in some of the most complicated and interwoven requirements we had ever worked with. That was not because the product goals were bad, but simply because Cumulus had built such a flexible, powerful, and robust platform. We quickly realized through this research that we needed a non-standard approach, since the classic pages, tabs, and linking would not make for a streamlined and accessible UX.

From our review of the existing research and these workshops, we arrived at 4 distinct insights that directly informed our UX approach. They ultimately led us to conclude that we needed either to build 4 distinct products (one for each persona) or to create a single solution with an unusually high degree of flexibility that still preserved end-user context.

❶ CLI was great for power-users/architects, but inaccessible for less technical users 

❷ Alerting / Monitoring / Configuration blurred 

❸ 4+ conflicting personas (reseller, architect, engineers, IT) with different expectations but overlapping vocabulary

❹ Everyone’s “site” was different

 

Iterative Workshops & Rapid Sketching

The single most challenging part of this entire engagement was exploring the best approach to capture the flexibility of the Cumulus platform through a graphical user interface while simultaneously supporting all the required personas and use cases (which can pivot from configuration to management to visibility on a dime). This was a multi-week process with heavy consultation and review with the client.

Chrome + Workbench

The resulting solution broke down the user interface into two discrete (but deeply related) components: the product chrome and the workbench.

The chrome of the product would support an audit trail of what actions you performed in the product, a set of “widgets” for quick actions (e.g. frequent cards, clipboard, tools, notifications), a global search, and a sparkline showing the “recent” health of the network (for general visibility).

The workbench was the main area, where each end user could add cards and decks (collections of cards) and arrange them as needed.

Three key features make the workbench more than just a dashboard:

  • cards can be resized by the end-user to set the data granularity they wish to see (banner, card, sheet, full-screen trouser)

  • instead of linking from a card to a page, cards can spawn new cards during drill-ins (making the workbench itself an artifact of your job process)

  • workbenches can be saved/resumed/changed to allow for parallel processing of multiple tasks (and thus all “pages” in the main nav bar simply loaded different workbenches)

Even user preferences, global settings, etc. were all workbenches with dynamic and responsive content.
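To make the workbench idea concrete, here is a minimal TypeScript sketch of a workbench where drill-ins spawn new cards and the whole state can be saved and resumed. It is not the NetQ implementation: the names `WorkbenchCard`, `Workbench`, `drillIn`, `trail`, `save`, and `resume` are all assumptions made for illustration.

```typescript
// Illustrative sketch only; every name here is hypothetical.
interface WorkbenchCard {
  id: number;
  title: string;
  // id of the card whose drill-in created this one, or null if added directly
  spawnedFrom: number | null;
}

class Workbench {
  readonly name: string;
  private cards: WorkbenchCard[] = [];
  private nextId = 1;

  constructor(name: string) {
    this.name = name;
  }

  // Place a card on the workbench directly (for example from a deck).
  addCard(title: string, spawnedFrom: number | null = null): WorkbenchCard {
    const card: WorkbenchCard = { id: this.nextId++, title, spawnedFrom };
    this.cards.push(card);
    return card;
  }

  // A drill-in adds a new card to the workbench instead of navigating to a page.
  drillIn(fromId: number, title: string): WorkbenchCard {
    if (!this.cards.some((c) => c.id === fromId)) {
      throw new Error(`no card with id ${fromId}`);
    }
    return this.addCard(title, fromId);
  }

  // Titles from the first card down to the given one: the record of the work done.
  trail(id: number): string[] {
    const titles: string[] = [];
    let current = this.cards.find((c) => c.id === id);
    while (current !== undefined) {
      titles.unshift(current.title);
      const parent = current.spawnedFrom;
      current = parent === null ? undefined : this.cards.find((c) => c.id === parent);
    }
    return titles;
  }

  // Serialize the workbench so it can be saved and resumed later.
  save(): string {
    return JSON.stringify({ name: this.name, cards: this.cards, nextId: this.nextId });
  }

  // Rebuild a workbench from a string produced by save().
  static resume(saved: string): Workbench {
    const data = JSON.parse(saved) as {
      name: string;
      cards: WorkbenchCard[];
      nextId: number;
    };
    const bench = new Workbench(data.name);
    bench.cards = data.cards;
    bench.nextId = data.nextId;
    return bench;
  }
}
```

Because each spawned card remembers its parent, the workbench doubles as a record of how the user got to the current view, which is one way to read "an artifact of your job process."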

Card Design

Once approved, this approach streamlined the UI design process. No longer were there distinct “pages” that needed to be considered and designed. Instead, for each “feature” or set of KPIs, we simply needed to create one or more cards and determine what information/actions needed to be contained at each level of detail. 

  • Banner – a short summary KPI or action button, for easy visibility or resolution
  • Card – the “average” level of details, showing a single visualization, or a simple form (with many default values set) to launch a network process
  • Sheet – A tabbed view with multiple sets of details, visualizations, or more complete forms (with all parameters)
  • Full Screen – a trouser (full-screen modal) with multiple tabs, allowing for tables of raw data, complex network visualizations, and the most granular information for experts
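
The four sizes form an ordered scale of detail, which can be sketched as a small TypeScript type. This is an illustration under assumptions, not the shipped code: `CardSize`, `SIZE_ORDER`, `levelOf`, `expand`, and `shrink` are hypothetical names; only the four sizes themselves come from the list above.

```typescript
// Hypothetical names; the source only defines the four sizes.
type CardSize = "banner" | "card" | "sheet" | "fullScreen";

// Ordered from least to most detail.
const SIZE_ORDER: readonly CardSize[] = ["banner", "card", "sheet", "fullScreen"];

// Numeric level for a size: banner = 1, card = 2, sheet = 3, fullScreen = 4.
function levelOf(size: CardSize): number {
  return SIZE_ORDER.indexOf(size) + 1;
}

// Next larger size, or the same size when already at full screen.
function expand(size: CardSize): CardSize {
  const i = SIZE_ORDER.indexOf(size);
  return SIZE_ORDER[Math.min(i + 1, SIZE_ORDER.length - 1)] ?? size;
}

// Next smaller size, or the same size when already a banner.
function shrink(size: CardSize): CardSize {
  const i = SIZE_ORDER.indexOf(size);
  return SIZE_ORDER[Math.max(i - 1, 0)] ?? size;
}
```

Treating size as a single ordered value means a card only has to answer one design question per level: what belongs at this level of detail.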

Our design process going forward was to identify the information flow for a single card, then create a wireframe, and hand it to our development team to implement. At our maximum velocity, we were doing 2-3 cards per week.

Users could go from the small banner size, to the middle size (the card), then to the sheet, a wider tabbed interface, and finally, if they really wanted excruciating detail, to a full-screen interface. Cards covered everything from presenting information, to configuration, to launching system tasks.

This allowed Cumulus to launch the original product with an initial set of cards, and then progressively expose more functionality from the CLI by adding more and more cards. The key thing to keep in mind is that every single card was designed to solve a specific problem that we heard from end users.

Example Card: Network Health

We designed each card by going back to the interviews we had conducted and the extensive notes we took about everyone’s process and flows, and then we would start on what was, functionally, the UX design of that card.

The following image is the diagram that we would generate for each individual card. Notice that this is one step back from a wireframe: the diagram captures the data and workflow of the end user. It would then be handed off to our IX designer, who would create wireframes. Given the repeated pattern across all these cards, we rarely did high-fidelity UI mockups for each card (except for new visualizations), as we had created a standardized visual language and component system.

Every card would start off with a name and, more importantly, with the one question, the one problem, that the card was going to solve. For the network health card, that question is: is everything okay? We heard that from all of the personas we talked to. They want to know, at a given point in time, whether everything is okay.

The key thing is that what “okay” meant, the information they needed, and what they were going to do with it were radically different across the personas. So we would start off with a banner: we would always start with the simplest way that we could answer the question.

For the IT end-users, this banner might actually be the end of most journeys. For the more experienced architect, engineer, or reseller, it might just be the way they monitor or get alerts. So it served multiple roles for the different personas. In the write-up, we’d have a square representing the physical size of the card, along with the question or questions that the specific view would answer. In this case, the banner’s question was: is the network good or not? That was what users wanted to understand.

Then we would have the technical content: the information that needed to be presented to answer that question. This would include the database constraints that we got from talking to the engineering team, which was a critical aspect of this process.

It wasn’t just the user research that informed us, but also a really good relationship with the engineering and database teams and their knowledge of what they actually knew about the network.

We would also include notes here that tied directly back to the user interviews. In this case, is the network good or not? That’s the binary answer everyone was asking for. But what we heard specifically from some people was: it would be nice to know that it’s good, but we also want to know the baseline. How good is it? Is it much better than normal? Is it worse than normal, but still good? That was an interesting piece of context that we would document and bring into the design process.

We would keep that information here. The UX design would then move on to the second size. You could take the small card and expand it into what we called the L2, or card size. The idea here was to give the end-user the supporting detail behind what the L1 showed.

For someone like the IT persona, if they had a question, this would probably answer it and be sufficient. The more sophisticated users, though, might go straight to the L2 card and skip that first step, because they don’t really care about the overall health of the network; they care about four specific things, which they see as distinct and independent.

Once again, we would have the question that this specific card size would answer: what part of the network is not good? That came directly from our interviews, conversations, and workshops. Then we would ask how to answer it: in this case, with the network fabric health. Again, we would have the technical database constraints here, along with notes tied back to the original research.

One thing you’ll see here is some pink lines coming out of the card. This is where we would start to notate how the user workflow would actually go: not just from one card size to another, but clicking to take an action on the information presented, which would jump you to more sophisticated data or potentially to a completely different card, added dynamically to the workbench as the next part of the UX workflow.
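
The per-card diagram described so far (a name, one question, and for each size a question, technical content, database constraints, research notes, and outgoing actions) can be written down as a data structure. The TypeScript below is a sketch under assumptions: all type and field names (`CardSpec`, `ViewSpec`, `SpecAction`, `reviewSpec`) are hypothetical, and the example values are paraphrased from the Network Health discussion. The database constraints are left empty because the text does not state them.

```typescript
// Hypothetical types; field names are assumptions based on the diagram described in the text.
type SpecSize = "L1" | "L2" | "L3" | "L4";

interface SpecAction {
  label: string;
  // An action either resizes this card or adds another card to the workbench.
  target: { kind: "resize"; to: SpecSize } | { kind: "openCard"; card: string };
}

interface ViewSpec {
  size: SpecSize;
  question: string;          // the question this size answers
  content: string[];         // information presented to answer it
  dataConstraints: string[]; // limits learned from the engineering team
  researchNotes: string[];   // notes tied back to user interviews
  actions: SpecAction[];     // the "pink lines" in the diagram
}

interface CardSpec {
  name: string;
  question: string; // the one problem the card solves
  views: ViewSpec[];
}

const networkHealth: CardSpec = {
  name: "Network Health",
  question: "Is everything okay?",
  views: [
    {
      size: "L1",
      question: "Is the network good or not?",
      content: ["overall network health"],
      dataConstraints: [], // not given in the source
      researchNotes: ["Some users also want the baseline: how good compared to normal?"],
      actions: [{ label: "expand", target: { kind: "resize", to: "L2" } }],
    },
    {
      size: "L2",
      question: "What part of the network is not good?",
      content: ["network fabric health"],
      dataConstraints: [], // not given in the source
      researchNotes: [],
      actions: [{ label: "expand", target: { kind: "resize", to: "L3" } }],
    },
  ],
};

// Lists problems in a spec; an empty result means every view answers a question.
function reviewSpec(spec: CardSpec): string[] {
  const problems: string[] = [];
  if (spec.question.trim() === "") {
    problems.push(`${spec.name}: missing card question`);
  }
  for (const view of spec.views) {
    if (view.question.trim() === "") {
      problems.push(`${spec.name} ${view.size}: missing question`);
    }
    if (view.content.length === 0) {
      problems.push(`${spec.name} ${view.size}: no content answers the question`);
    }
  }
  return problems;
}
```

A check like `reviewSpec` mirrors the rule that every view must answer a question before it is handed off for wireframing.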

The next step in the design process was the L3, the sheet. The sheet was a much wider card. It was the same height as the L2, and it could have tabs, so that you could have multiple views and really dive into a more nuanced representation of the data. Think of it as aggregated data: you’re not seeing the raw data, but you’re getting much closer. It was very rare for the IT team to jump into this level, but it is the bread and butter of the architect using the tool.

Once again, each sheet would have its questions and the information needed to support them. Everything in any kind of user interface, I think, needs to answer a question and ideally give you the solution and be actionable. You can see lots of pink lines coming in to show the actions that could happen. These were modeled directly on the flows we were articulating in the workshops. We were constantly going back and forth and asking: so you would do this next, but why? What information would you need? What action would you take?

Very often we would find only slight variations, so we could reuse the same design. You could have different tabs looking at fabric health versus device health, for example. Functionally, the question (where is the specific score having the problem?) is the same; each tab just looks at a different subset of the data. We could reuse the UX patterns multiple times. This was useful not only for the interaction designer and the design process, but it also simplified what engineering needed to implement, which is always something you want to take into account.

Very often, though, there were multiple sheets that were fairly different. Here you can see one more sheet that differs from the first three we were just looking at. Some of the actions say “open traceroute result card.” You could literally run a system-level process, such as a traceroute or a ping, and have the result open as a new result card.

If it isn’t clear, these were actionable cards; they did not just present data. After the L3, with its aggregated data, we would move to the L4. Expanding to the L4 size opened a trouser, which, if you’re not familiar with the term, is a full-screen modal. These L4 trousers were by and large data tables. This is the output you would get from the CLI if you said, “tell me everything about all the devices on the network, with these columns.” It’s like an SQL query, if it helps to think about it that way. This is what the power users ultimately wanted and did not want to give up by losing access to the CLI. We had links that allowed architects to jump from an L2 or an L3 card directly to the pre-filtered deep-dive data, so they could naturally choose to go straight there.

Interconnected Card Network

This was a very natural process: we were following those workflows and allowing users to skip the steps they didn't need for their specific site, role, and expectations, all grounded in those conversations. Because we did so many interviews, we got a really deep understanding of what questions were being asked and how they tied together. In the diagram to the right, you can see how one workflow diagram would translate into numerous wireframes, and even that would be just one "card" in a deeply interconnected set of cards.

In particular, all the "black dots" represent slight variations of a base card. Repeating this process allowed us to ship around 12 cards in the MVP while leaving the engagement with 50+ cards ready to be implemented!

One outcome was that Cumulus ended up not needing to hire designers after we left the engagement. One of the reasons is that we produced a card flow decision tree. Obviously, working with a human interaction designer and user researcher is the highest-fidelity way to move forward and keep adding new content. However, because we had done so many cards, we were able to distill this down to a repeatable process: if an end user said, “I need to see some information” or “I have this other problem,” Cumulus could have a product manager or an engineer go through this flowchart and determine the next step. That could be updating a card to add a new piece of information, maybe an L3 tab or a new small sparkline on the L2. It could be splitting a card, when it holds so much information, or related things that can be pulled apart, that it works better as two. It could be adding a general card. Or it could be a brand new card as part of a new deck (a collection of cards that lets you add a bunch of cards onto a workbench at once).

Functionally, we taught design thinking to Cumulus, and this process solved for 95% of the use cases, because we had talked to so many people across all the different personas and multiple sites, and had gone through so many card designs with one consistent interaction pattern across the entire product.
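
As a rough illustration of how such a flowchart can be encoded, the TypeScript sketch below maps a new request to one of the four outcomes named above. The outcomes come from the text; the three yes/no questions (`relatedCardExists`, `relatedCardIsCrowded`, `startsNewWorkflow`) are placeholders invented for this example, since the actual questions in the decision tree are not given here.

```typescript
// Hypothetical sketch: the four outcomes come from the text, the questions do not.
type NextStep = "updateCard" | "splitCard" | "addGeneralCard" | "newCardInNewDeck";

interface FeatureRequest {
  relatedCardExists: boolean;    // does a shipped card already answer a related question?
  relatedCardIsCrowded: boolean; // would adding more overload that card?
  startsNewWorkflow: boolean;    // is this the first card of a new group of tasks?
}

// Walks the (assumed) decision tree from top to bottom and returns one outcome.
function nextStep(request: FeatureRequest): NextStep {
  if (request.relatedCardExists) {
    return request.relatedCardIsCrowded ? "splitCard" : "updateCard";
  }
  return request.startsNewWorkflow ? "newCardInNewDeck" : "addGeneralCard";
}
```

The point of the sketch is the shape rather than the specific questions: a short series of yes/no checks that a product manager or engineer can answer without a designer in the room.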

Appendix