Talking Data Visualization with Sarah Nahm and Ian Johnson (transcription)
We were lucky to have two Data Visualists, Sarah Nahm and Ian Johnson from Lever, join us for lunch at Pivotal SF last week to talk about the current state of Data Visualization. Nina wrote a great summary of their thoughts. Here’s an (almost) transcription of the talk. In accordance with my notes policy, any errors, misstatements or shenanigans should be attributed entirely to me and not the speakers.
What To Do With All Your Data
Sarah: My approach is that Data Visualization is goal oriented; it’s not about being on a chart. A lot of people assume that a line chart is Data Visualization, but I think Google search is Data Visualization. By building a service, they’re building a specific and powerful Visualization. There’s usually a form that data can take to hit a goal. It’s not about one answer, it’s about doing many. I’m a huge fan of unsexy Visualizations, like tables. Designing tables to be effective is hard; people tune them out. Designing for people’s attention is a huge part of Data Visualization. The second huge thing is to think about how data transitions. Is this a button? A portal? Does it start another activity? All those things are design principles that figure into my thinking. Ian is totally different in a really awesome way.
Ian: I’ve learned a ton getting into this field. I came from math and didn’t know Data Visualization was a separate thing. I just had numbers and I could turn them into pixels and colors. From that perspective, any transfer or exchange is a way to turn one data into another. What do we want to accomplish? Goal oriented design has been a really big positive change for me to deliver things that work for people.
Question: Can you talk about [didn’t hear] something that came together?
S: The original question was about “how do you break out of business dashboards?”
Tim McCoy: Can you talk about the tension between using data to tell your story and how data builds the story?
S: Where Data Visualization is now is about showing the data. Insights are hard, and not just because of “big data” or because the datasets are hard. This has driven me to learn statistics; I highly recommend the Cartoon Guide to Statistics. Learning more about machine learning, getting more of a literacy about it; it’s way too easy to find a correlation and wave that around as an insight, but it’s not that simple. Designing something around data you can’t anticipate is hard. Designing around a story is about being acutely aware of when you’re smart, and trying to be dumb. The anecdote that pulls this together is from our app. Basically, Lever is hiring software that isn’t about finding as much as about managing the data problem. When you find the 300 candidates, you need to keep track of them; it’s a confined but messy data problem. We have a list of candidates here. The big Data Visualization is that there isn’t a dashboard; data should be ambient and a portal to get to what you want to look at.
We call this the filterlist—it’s a name that we picked and it just stuck.
As you filter the list, you see the Data Visualization filter and update. As you change criteria, the list changes. Something that was really hard about this is that these aren’t all you’ll ever have about these people, they’re portals to lists. If you filter a list and the output is a list of people, but also, what’re lists that people really need? How else can we show this? “There are people I reached out to 3 months ago, and I dropped the ball, and how do I get back in touch?” Last week I was doing something about time but it was really fuzzy and heuristics based. We invented “Last Interaction view”, which shows irregular but meaningful timespans (e.g. 1 week, 3 months) drawn as little airplane windows with sparklines below. This chart gave us so much grief; there’s nothing scientific about it. It kind of bugs people because we made all these decisions. But the user looked at it with the user eye, the user looked confused. But when you look at the sparkline, you see “oh, that time was busy; that other time wasn’t busy.”
The goal was to let people know that if you look at this, you’ll see the list of people you dropped the ball on 3 months ago. A scatter plot would be more accurate, but it ultimately didn’t translate to how people thought about it. All the other teammates were like “maybe we should cut it”, but Ian said “maybe it should be a sparkline, people think about it that way.”
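A minimal sketch of the kind of bucketing a view like “Last Interaction” implies: each candidate’s last contact date is dropped into an irregular, hand-picked time window and the windows are counted. This is not Lever’s implementation. Only “1 week” and “3 months” come from the talk; the other windows, the function names and the candidate data are assumptions made up for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical, hand-picked windows as (upper bound in days, label).
# Only "1 week" and "3 months" are mentioned in the talk; the rest are assumed.
WINDOWS = [
    (7, "1 week"),
    (30, "1 month"),
    (90, "3 months"),
    (365, "1 year"),
]
OLDER = "over 1 year"


def window_for(last_contact: date, today: date) -> str:
    """Return the label of the first window that contains the interaction."""
    days_ago = (today - last_contact).days
    for upper_bound, label in WINDOWS:
        if days_ago <= upper_bound:
            return label
    return OLDER


def count_by_window(last_contacts: dict[str, date], today: date) -> dict[str, int]:
    """Count candidates per window, keeping the windows in display order."""
    counts = Counter(window_for(d, today) for d in last_contacts.values())
    labels = [label for _, label in WINDOWS] + [OLDER]
    return {label: counts.get(label, 0) for label in labels}


# Made-up candidates, for illustration only.
today = date(2013, 6, 1)
last_contacts = {
    "candidate_a": date(2013, 5, 30),
    "candidate_b": date(2013, 3, 10),
    "candidate_c": date(2013, 3, 20),
    "candidate_d": date(2011, 1, 1),
}
window_counts = count_by_window(last_contacts, today)
```

The windows are deliberately uneven, which matches the point that the chart is heuristic rather than scientific: the boundaries are a design decision about how people think about time, not a property of the data.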
Q: How do you start?
S: I always start with What It Needs to Be; that’s usually a text document. It’s so easy to be distracted. There’s so much fun stuff. We had all this stuff; 6 or 8 questions people needed an answer to about a candidate. It was all about focusing on that. Otherwise it’s too easy to do what’s fun or pretty.
Q: ? [didn’t hear]
S: We usually get negative feedback, because it’s hard for people to think about data in context. Until they can see their own data, it’s hard to grok. This is really reductionist. It’s really minimal to try to not overwhelm people. Our style of this ambient, around all the time, makes it hard to not have your own data.
TM: yeah, I did a finance project and the most challenging aspect was that without real data, it’s a total mess.
S: Maybe you should talk about Chris S.
I: This idea of fake data is extremely important; without data, you don’t have a Visualization. I try to be hard-line about this. We’re still really bad about creating data. People come to us with nothing. We have to comb through existing systems, spreadsheets, etc. We do paper sketches with existing data, some rough D3 sketches. Everyone who was higher-up than our client loved the data porn, but the client was like “where’s the stuff I want to see?” “Where do I get the answers to my questions?”
S: People using our product are beta testers at [REDACTED]. The only guy watching the data carefully; Ian did all sorts of lo-fi sketches of what visualizations might do. In many ways lo-fi helps, but that was just priceless in terms of interactivity, in terms of “what do I want to filter on?” or if there were a time-based graph I could see what formats I needed to export to. Prototyping is no new thing, but the nice thing about Data Visualization is that prototyping an interface around what the data can be is really important; it lets us put a play button on it. There’s something about the way an engineer throws a button on anything that can be really helpful; it’s really awesome.
Q: How do you know when to up the fidelity?
I: Data fidelity should get high ASAP, even if it’s a sub-sample of the data. It should have the final structure.
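One way to read “a sub-sample with the final structure”, sketched under assumptions: take a random subset of the real records and leave every field intact, so prototypes are built against the real shape of the data. The field names and records below are invented for illustration; they are not from the talk.

```python
import random


def subsample(records, k, seed=0):
    """Pick up to k records at random; each record is returned unchanged."""
    rng = random.Random(seed)  # fixed seed so the prototype data is repeatable
    return rng.sample(records, min(k, len(records)))


def same_structure(sample, full):
    """True if every sampled record has exactly the fields of the full data."""
    expected = set(full[0].keys())
    return all(set(record.keys()) == expected for record in sample)


# Made-up records with hypothetical fields.
records = [
    {"name": f"candidate_{i}", "stage": "screen", "days_since_contact": i * 3}
    for i in range(20)
]
sample = subsample(records, 5)
```

The sample is smaller, but every record keeps all of its fields, so a chart or table sketched against it should not need restructuring when the full dataset arrives.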
S: You can never wait long enough to do visual high-fidelity. Clients know about Data Visualization; some good things and some naive things. They know the questions they want to ask, when they need it, etc. People have really terrible taste in charts; they think they know what they want, but they never break down what they really need. Once you get them to say “I want to email this to my boss”, it changes the deliverable from the fully interactive website to an animated gif. The more it does the work that people need; you can make it pretty later. If you’re thinking about the charts as vying for your attention against this list of what you’re trying to do, there’s a degree to which the visual styling needs to be environmentally aware, but usually I go visual too soon.
Ofri Afex: I need to explore many directions while I’m doing charts, otherwise I get too used to the initial approach and get locked in to my first choice.
S: I rely on my design partners; they’ll take a headline and common chart elements and make a bunch of visual styles right at the beginning. I think that’s really valuable. It has to mingle with the rest of the application. I still feel that the charts have to be what they have to be, and that’s locked in by the reductive design decisions you make.
S: We have to show these variables. We have this much space. We have this much sampling fidelity. We have this much interactivity. E.g. for lookup, tables are the best. Charts are meant to show relationships. For so many business products, lookup is the task that’s needed, but clients want pretty charts. Like I said, Google search is Data Visualization.
Q: The reality is that stakeholders need to buy in.
S: The current bar is so, so low. It’s fun to surprise people with how nice of a place they can live in. It all stems from scoping the problems they have to answer, and making them feel confident. That’s the UX side: accessibility and confidence. Today’s tools are built for experts, but the need is getting democratized. Recruiters aren’t trained statisticians, but they know what questions they want to answer. In some ways, you make sure it meets those needs. Building for the next step: “what do you want to do once you know it?” is where current products tend to drop the ball. That’s where people get excited. On one screen, we did throw a bubble chart in just for visual sexiness and visual balance so it didn’t feel too dense or unplayful.
Q: ??? [couldn’t hear]
S: Many Data Visualizations have elements that feel like whitespace: it’s a waste unless you’re thinking about principles like balance and designing people’s attention. The bubble chart is like whitespace; it gives people a chance to take a breath. It’s a signpost that says “we’re in a new part”.
I: This particular bubble chart—a bar chart would work better—but it’s not the main thing.
Q: I just want to dive in.
I: Anything you click on—there’s hover states, etc, you build filters—but there’s no external changes other than highlighters.
Q: How do you think about that problem?
S: We have three redundant ways to indicate a filter is activated. We’re nervous that users won’t know.
Q: Do you split test it?
S: We should probably test more. This is the thing we worked on most recently. We want to ship and see how people work in the whole system. What we’re kind of asking is “When people change the data, how do they see the change?” The real time framework we’re building on is going to be this amazing way to manipulate things; we can affect anything on the page. The vision is that the delta of the data is as important as the new state; I’d love to show the ghost of the sparkline. The vision for the filtervis is to show that.
I: You can use the browser history API to show that stuff going back!
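To make “the delta of the data is as important as the new state” concrete, here is a rough sketch that compares the list before and after a filter change and reports what was added, removed and kept. The `apply_filter` and `delta` helpers, the field names and the candidates are all hypothetical; nothing here reflects how the filterlist is actually built.

```python
def apply_filter(candidates, **criteria):
    """Keep the candidates whose fields equal every given criterion."""
    return [
        c for c in candidates
        if all(c.get(field) == value for field, value in criteria.items())
    ]


def delta(before, after):
    """Describe the change between two filtered lists by candidate id."""
    before_ids = {c["id"] for c in before}
    after_ids = {c["id"] for c in after}
    return {
        "removed": sorted(before_ids - after_ids),
        "added": sorted(after_ids - before_ids),
        "kept": sorted(before_ids & after_ids),
    }


# Made-up candidates, for illustration only.
candidates = [
    {"id": 1, "stage": "interview", "source": "referral"},
    {"id": 2, "stage": "interview", "source": "inbound"},
    {"id": 3, "stage": "offer", "source": "referral"},
    {"id": 4, "stage": "screen", "source": "inbound"},
]
before = apply_filter(candidates, stage="interview")
after = apply_filter(candidates, source="referral")
change = delta(before, after)
```

A view could use the `removed` ids to draw the “ghost” of what just left the list, and keeping each past filter state around is what would make stepping back through history possible.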
S: Questions around change over time are where people are doing interesting things. That seems to be top of mind. Tributary makes that easy. Dealing with that medium of motion and time.
Q: You talked about representing change over time. How do you attack the problem of representing change over scale, e.g. a trillion-row database?
S: What does the user want? Do they want to get back to the big picture? There’s a natural inclination to set up data as a hierarchy, but the Visualization of the data need not mirror the data. Often data mirrors a company’s org-chart, but that need not be.