Tuesday, September 18, 2018

The structure delivers the story

In chapter 8 of The Functional Art, Alberto Cairo provides a framework for the design process of infographics. The design of an effective infographic goes beyond putting numbers, comments and graphics on paper, and it goes beyond prettying these up with typefaces and color palettes. An effective infographic is built around a structure that directs the reader's eye and understanding.

The design of the infographic begins with the identification of the story to tell. What is the subject? What ideas are being related? How are two trends interrelated? What point or points do you want your reader to take away? Answering these questions helps the designer construct the bones of the graphic. Do trends A and B develop in parallel to paint a bigger picture? If so, perhaps the elements of the visualization should be laid out next to each other in a way that directs the reader toward the overarching theme. Is the graphic made to illustrate a dichotomy between points A and B? Perhaps the graphic should be constructed with hard lines and sharp divisions, so that the reader gets the impression of this contrast without needing to read it explicitly.

The elements within the graphic make up its content and its appearance. If you lay out these elements as rectangular blocks, it is not difficult to arrange them in a visually appealing way that directs your reader towards the statement you are making. Of course, the elements don't need to be rectangles themselves, as long as the information fits inside a rectangle. Aligning the information in one block with the information in an adjacent block will naturally lead the reader from one piece of information to the next, along a single line of thought. Flipping the formatting of adjacent blocks will create a visual divide, across which the reader will understand that a new line of thought begins. Other tools to direct the reader's attention include, but are not limited to, use of color, breaking the rectangular element boundaries, and directional indicators.

A well-developed graphic can communicate the main idea through its structure even before the words and numbers are read.


Tuesday, September 11, 2018

Form and Function

The objective of any data visualization is to serve as a tool to communicate some message to the reader. Every tool has a purpose, and the first step in creating this tool should be to define that purpose. Ask "How will the reader use this tool?" The answer should determine how the graphic is constructed. The intended use of the graphic will dictate its form, in order to facilitate its reading and to avoid misinterpretation.

A well-constructed data visualization will serve several purposes. It should present the data at the right scale so that individual values are understood. These values should be organized in a manner that logically directs the reader towards the overall message. The graphic should be constructed so that individual values can be compared, and so that the reader can understand patterns and relationships in the data at a glance.

The following is a famous example of a graphic that did not follow these rules and, as a result, did not convey its intended message effectively.

NASA originally scheduled the space shuttle Challenger to launch on January 22, 1986. The launch was delayed by weather and technical problems, then delayed again, and again. Six days later, under schedule pressure, NASA management was eager to launch the shuttle in the cold morning. The night before the launch, an engineer from one of the shuttle contractors brought an objection to the shuttle's launch in the form of the graphic below.

[Image: the O-ring graphic presented before the launch]

NASA management briefly considered the graphic, then dismissed the objection and proceeded with the launch. The Challenger was destroyed shortly after liftoff because NASA did not heed this objection. Perhaps this disaster could have been avoided if the engineer making this graphic had taken an extra minute to consider his argument.

Q: What message am I trying to convey?
A: The frequency and severity of O-ring failures on the shuttle increase as the temperature drops, and more importantly, we definitely expect the O-ring to fail at the temperatures expected during tomorrow's launch time.

The graphic the engineer presented showed the type and location of failures from past launches, with notes for the temperature at which these failures occurred. Here it missed the mark: the message he was trying to communicate was the relationship between the temperature and the failure frequency. The temperatures should not have been a note on the graphic, but its central feature. A chart like the example below would have shown this relationship more clearly, and might well have convinced NASA management to delay the launch further.

[Image: chart of O-ring damage plotted against launch temperature]
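
To make the idea of putting temperature at the center of the graphic concrete, here is a minimal sketch in Python. It simply orders launches from coldest to warmest and draws one text bar per launch. The function name and the launch records are my own assumptions for illustration: the numbers are invented placeholders, not the real flight record.

```python
from typing import List, Tuple

# (launch temperature in degrees F, number of O-ring damage incidents)
# PLACEHOLDER values invented for illustration only; NOT the real flight record.
PLACEHOLDER_LAUNCHES: List[Tuple[int, int]] = [
    (75, 0), (55, 3), (70, 1), (80, 0), (60, 2),
]


def temperature_chart(launches: List[Tuple[int, int]]) -> List[str]:
    """Return one text row per launch, ordered from coldest to warmest."""
    rows = []
    # Sorting the tuples orders them by temperature first, so temperature
    # becomes the axis of the chart instead of a side note.
    for temp_f, incidents in sorted(launches):
        rows.append(f"{temp_f:>3} F | {'#' * incidents}")
    return rows


for row in temperature_chart(PLACEHOLDER_LAUNCHES):
    print(row)
```

Even in this crude form, any relationship between cold launches and damage would be visible at a glance, because the reader's eye travels along the temperature axis.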


Tuesday, September 4, 2018

Graphics to satisfy our desire for instant gratification

A data visualization is not made in a vacuum. The graphic is a tool to communicate to the reader. With that in mind, the graphic should be created with the user experience at the forefront of the design.

I would assume the primary consumer of most forms of digital media is a millennial. Millennials have been heavily criticized by older generations for having a short attention span. This should be unsurprising, as the millennial generation has been heavily influenced by the internet - an overflowing cornucopia of information, delivering all types of media in quick snippets from all directions.

The New York Times' How Y'all, Youse and You Guys Talk saw viral popularity because it delivered instant information that was directly relevant to its entire readership and their friends.
Other data visualizations that also deliver relevant, instant feedback in a visually pleasing way might expect to see the same popularity.

My personal favorite data visualization is Gendered Language in Teacher Reviews (link below). The interactive visualization is well-proportioned, smoothly animated, easy to use, and easy to understand. The visualization pulls data from 14 million reviews of teachers written on RateMyProfessors.com to show how language choice differs in reviews of male versus female professors.

For background, RateMyProfessors is a site widely used by college students to evaluate their professors. Students can grade their teachers for overall quality and level of difficulty, and write a review for a class that professor teaches. The reviews should be taken with a grain of salt, because I would imagine that most reviews are written by students who feel strongly enough to go out of their way to share their classroom experience. That is to say, the majority would be written by students who either hate or love the professor.

This graphic is especially relevant to me, because it reflects the opinions of my peers, and I could use it as a tool to quickly test some thought experiments. Here are two examples of hypotheses I tested with this data visualization:

At least among college age males like myself, there is a common stereotype that women are not as funny as men.


The graphic seems to reflect that stereotype, showing that in all fields, male professors are described as "funny" about twice as often as female professors. It also shows that the most frequent instances of "funny" professors occur in the communications fields - psychology, language, sociology, and English appear near the top. The more technical fields have far fewer funny professors, with engineering, computer science, chemistry and math appearing near the bottom.

RateMyProfessors has changed its format since I last used it in undergrad. Students used to be able to give a "hot chili pepper" in their reviews to professors they thought were physically attractive. How are words for physical attraction used in professor reviews?


Of the adjectives "hot," "handsome," and "sexy," "handsome" was used least often. Unsurprisingly, "handsome" is very rarely used in reviews of female professors. I was surprised to see that male professors were described as "sexy" more often, and by a large margin. Perhaps this indicates that female students are more willing to use the word "sexy" than male students. "Hot" is used ten times as often as "handsome" or "sexy," but neither gender is the clear winner. It is interesting to note that the gender difference in "hot" reviews is by far the most extreme for engineering professors. Perhaps this can be explained by engineering students' very limited exposure to women...


http://benschmidt.org/profGender/#%7B%22database%22%3A%22RMP%22%2C%22plotType%22%3A%22pointchart%22%2C%22method%22%3A%22return_json%22%2C%22search_limits%22%3A%7B%22word%22%3A%5B%22funny%22%5D%2C%22department__id%22%3A%7B%22%24lte%22%3A25%7D%7D%2C%22aesthetic%22%3A%7B%22x%22%3A%22WordsPerMillion%22%2C%22y%22%3A%22department%22%2C%22color%22%3A%22gender%22%7D%2C%22counttype%22%3A%5B%22WordCount%22%2C%22TotalWords%22%5D%2C%22groups%22%3A%5B%22unigram%22%5D%2C%22testGroup%22%3A%22B%22%7D

Tuesday, August 28, 2018

The human element

The job of data visualization is to bridge the gap between the journalist's set of raw information and the reader's understanding of that information. Humans are, by nature, subjective beings, and will impose their own worldview and their own interpretation onto whatever is presented to them. A reader's inherent confirmation bias will make them subconsciously manipulate a graphic in a way that makes it reflect their beliefs or downplay an opposing opinion. Similarly, a graphic can be intentionally made to skew data in a way that communicates something other than the real story.

Misleading graphics are very common in political and religious forums. Intentionally skewed graphics, or even merely ambiguous ones, are widely shared by thought leaders on social media, convincing millions of followers that they must be right "because the data says so."

This graphic circulated in Democratic-leaning media during the presidential race leading up to Trump's election in 2016:

[Image: pie chart of US discretionary spending, with its caption]

At a glance, this pie chart looks sound. Those are all familiar names. We all know the United States spends big on the military. In fact, every number represented on this chart is true, but paired with the caption, this chart misrepresents the big picture. The chart only shows discretionary spending, which makes up only ~40% of the overall US budget. A much larger slice of the pie isn't even shown. What's more, the food stamp program is actually part of the US budget's mandatory spending, which is not represented in this chart at all. A graphic better representing the data is shown below, from PolitiFact.



This is just a single example, which leaves us with the lesson: always look at a graphic critically, especially in the context of the big picture.
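
As a rough check on the pie chart example above, here is a minimal sketch of the arithmetic in Python. It rescales a slice of the discretionary-only pie to a share of the whole budget, using the approximate 40% figure from the text. The function name and the 10% example slice are my own assumptions for illustration, not numbers taken from the chart.

```python
# Approximate share of the overall US budget that is discretionary spending,
# as stated in the text above (~40%).
DISCRETIONARY_SHARE_OF_BUDGET = 0.40


def share_of_total_budget(share_of_discretionary: float) -> float:
    """Convert a slice of the discretionary pie into a slice of the whole budget."""
    return share_of_discretionary * DISCRETIONARY_SHARE_OF_BUDGET


# Hypothetical slice: a program shown as 10% of the discretionary-only pie.
example_slice = 0.10
print(f"{share_of_total_budget(example_slice):.1%} of the total budget")
```

Any slice on a discretionary-only chart shrinks by more than half once the rest of the budget is drawn in, which is why the choice of "whole" matters so much in a pie chart.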

Thursday, August 23, 2018

Welcome to my blog! The following posts will illustrate what I have learned about data visualization methods in my readings each week.