24. Data Analytics - Share Data Through the Art of Visualization - Week 1
Data visuals are used in two ways:
- looking at visuals in order to understand and draw conclusions about data
- creating visuals from raw data to tell a story
Quick rule for creating visualizations:
- Audience should know what they're looking at within the first five seconds
- Audience should then understand the conclusion the visualization is making within the five seconds after that
Data visualization is a great tool to fit a lot of information into a small space.
Steps: Organize and structure your thoughts > Think about patterns in the data and key findings
The four elements of effective data visualization are the information (data), the story (concept), the goal (function), and the visual form (metaphor); a successful data visualization must have all four elements.
Definitions:
Data visualization // graphic representation and presentation of data
Causation // occurs when an action directly leads to an outcome
Correlation // a measure of the degree to which two variables move in relationship to each other. EX: temperature goes up, ice-cream sales go up.
- Positive correlation // when one factor goes up and the other goes up
- Negative/Inverse correlation // when one factor goes down and the other goes up
- No correlation // when one factor goes up or down and the other shows no consistent change
* CORRELATION DOES NOT MEAN CAUSATION
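To make the three cases concrete, here is a minimal sketch that computes the Pearson correlation coefficient. Pearson is one common way to measure correlation; the notes above do not name a specific measure, so that choice is an assumption, and the temperature and sales figures are made-up illustration data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: near +1 is positive, near -1 is negative, near 0 is none."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data for illustration only
temperature = [18, 21, 24, 27, 30]
ice_cream_sales = [110, 135, 160, 190, 220]  # rises with temperature
hot_cocoa_sales = [90, 75, 60, 40, 25]       # falls as temperature rises

r_pos = pearson(temperature, ice_cream_sales)
r_neg = pearson(temperature, hot_cocoa_sales)
print(round(r_pos, 2), round(r_neg, 2))
```

A value close to +1 or -1 only says the two variables move together. It says nothing about whether one causes the other.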
Static visualizations // do not change over time unless they're edited. Useful when you want to control your story or dataset. EX: charts and graphs made in a spreadsheet.
Dynamic visualizations // visualizations that are interactive or change over time. Users have control over what they see and you have less control over data and the story.
Tableau // a business intelligence and analytics platform that helps people see, understand, and make decisions with data
Decision tree // a decision-making tool that helps you make decisions based on key questions you ask yourself. Like a pathway or a binary search tree.
Data composition // combining the individual parts in a visualization and displaying them together as a whole
Design thinking // process used to solve complex problems in a user-centric way
Headlines // a line of words printed in large letters at the top of the visualization to communicate what data is being presented. An attention grabber: keep it bold, simple, and above the chart.
Subtitle // supports the headline by adding more context and description
Labels // identify data in relation to other data. EX: axis labels; labels placed directly on the data can replace legends and keys
Legends (keys) // identify the meaning of various elements in a data visualization
Annotation // briefly explains data or helps focus the audience on a particular aspect of the data in a visualization.
Alternative text (alt text) // provides a textual alternative to non-text content
Ways to make data visualizations accessible for everyone:
* Thinking about everyone who might access the data and what obstacles they might run into.
- Labeling // make sure labeling is not confusing
- Text alternatives // add more alternative ways to read the data (voice-overs, translations, etc)
- Text-based format
- Distinguishing // use foreground and background colors that contrast well with each other
- Simplify // keep the visualization from becoming overly complicated
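One way to check the "Distinguishing" point is to compute a contrast ratio. The sketch below uses the WCAG 2.x contrast-ratio formula. That standard is not mentioned in these notes and is brought in here as an assumption, and the 4.5:1 threshold is the WCAG AA level for normal-size text.

```python
def _linear(channel):
    """Convert one sRGB channel (0-255) to linear light, per the WCAG definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    """Weighted sum of the linearized red, green, and blue channels."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background):
    """Ratio from 1 (identical colors) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)

black, white, light_gray = (0, 0, 0), (255, 255, 255), (200, 200, 200)
print(round(contrast_ratio(black, white), 1))    # 21.0
print(contrast_ratio(light_gray, white) >= 4.5)  # False: too faint for text on white
```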
How to highlight data visualizations:
- Headlines, subtitles, labels
Visualization components | Guidelines | Style checks |
---|---|---|
Headlines | Content: Briefly describe the data; Length: Usually the width of the data frame; Position: Above the data | Use brief language; Don’t use all caps; Don’t use italic; Don’t use acronyms; Don’t use abbreviations; Don’t use humor or sarcasm |
Subtitles | Content: Clarify context for the data; Length: Same as or shorter than headline; Position: Directly below the headline | Use smaller font size than headline; Don’t use undefined words; Don’t use all caps, bold, or italic; Don’t use acronyms; Don’t use abbreviations |
Labels | Content: Replace the need for legends; Length: Usually fewer than 30 characters; Position: Next to data or below or beside axes | Use a few words only; Use thoughtful color-coding; Use callouts to point to the data; Don’t use all caps, bold, or italic |
Annotations | Content: Draw attention to certain data; Length: Varies, limited by open space; Position: Immediately next to data annotated | Don’t use all caps, bold, or italic; Don’t use rotated text; Don’t distract viewers from the data |
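A few of the style checks in the table are mechanical enough to automate. The helper below is hypothetical: the name `check_label` and its messages are assumptions, not part of any charting library. It covers only the all-caps rule and the 30-character guideline for labels.

```python
def check_label(text, max_length=30):
    """Return a list of style problems for a label; an empty list means it passes."""
    problems = []
    letters = [ch for ch in text if ch.isalpha()]
    # All caps: every alphabetic character is uppercase
    if letters and all(ch.isupper() for ch in letters):
        problems.append("uses all caps")
    # Guideline from the table: usually fewer than 30 characters
    if len(text) >= max_length:
        problems.append(f"is {len(text)} characters; aim for fewer than {max_length}")
    return problems

print(check_label("Q3 revenue"))
print(check_label("TOTAL REVENUE BY REGION AND QUARTER"))
```

The first label passes with an empty list. The second is flagged twice, once for all caps and once for length.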
Five phases of the design process for design thinking:
- Empathize // think about the emotions and needs of the target audience. Is the visualization appropriate?
- Define // define the audience's needs and problems, and your insights
- Ideate // generate data visualization ideas, brainstorm how to formulate a visualization
- Prototype // put charts/visualizations together, or create and list potential final chart choices
- Test // test the visualization by showing it to team members
Elements for effective visuals:
- Clear meaning // the message is clear and the visualization is easy to understand
- Sophisticated use of contrast // knowing how to emphasize the message
- Refined execution // deep attention to detail using visual elements
David McCandless's Venn Diagram:
- Information (Data) // data is needed to create a story and to communicate new ideas/findings
- Story (Concept) // data needs a story; information alone is boring
- Goal (Function) // a goal makes the data useful and usable
- Visual Form (Metaphor) // visual elements give the visualization structure and make it beautiful
Principles of design:
1. Balance // when the key elements of a visualization (color, spacing, etc.) are distributed evenly
2. Emphasis // a focal point for the audience to concentrate on; the visualization should emphasize the most important data
3. Movement // the path the viewer's eye travels as they look at the visualization
4. Pattern // patterns can be shown with colors, shapes, and other elements
5. Repetition // repeating patterns/elements can add to the effectiveness of a visualization
6. Proportion // using color and size can help emphasize the importance of specific data in a visualization
7. Rhythm // creating a sense of flow or movement in the visualization.
8. Variety // variety in chart types, shapes, and other elements
9. Unity // final visualization should be cohesive
Elements of art:
- Line // can be curved/straight, thick/thin, vertical/diagonal, etc.
- Shape // should always be two-dimensional, good for size contrast
- Color // described by hue, intensity, and value. A shade adds dark values to a color; a tint adds light values.
- Space // area between, around, and in objects
- Movement // create a sense of flow/action in a visualization
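The tint and shade definitions can be sketched as simple RGB blending. This assumes tinting means mixing toward white and shading means mixing toward black in a linear way; the function names are made up for illustration.

```python
def mix(color, target, amount):
    """Blend an RGB color toward a target; amount 0 keeps the color, 1 reaches the target."""
    return tuple(round(c + (t - c) * amount) for c, t in zip(color, target))

def tint(color, amount):
    """Lighten a color by mixing it with white."""
    return mix(color, (255, 255, 255), amount)

def shade(color, amount):
    """Darken a color by mixing it with black."""
    return mix(color, (0, 0, 0), amount)

blue = (0, 100, 200)
print(tint(blue, 0.5))   # (128, 178, 228)
print(shade(blue, 0.5))  # (0, 50, 100)
```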
Types of visualizations:
Bar graphs // use size contrast to compare two or more values. Typically have categories on the X-axis and a scale of values on the Y-axis, but the axes can be switched. Effective for data that can be ranked.
Line graphs // help your audience understand shifts or changes in your data, especially change over a period of time. Have an X-axis and a Y-axis (like a stock market chart).
Pie charts // show how much each part of something makes up the whole. Show differences in proportion.
Maps // help organize data geographically
Histogram // a chart that shows how often data values fall into certain ranges. Sometimes resembles a bell curve.
Correlation charts // show relationships among data
Column charts // use the height of vertical columns to compare two or more values; essentially a bar graph drawn vertically
Heatmap // uses color to compare categories in a dataset. Mainly used to show the relationship between two variables, with a system of color-coding to represent different values.
Scatter plots // show the relationship between different variables. Typically used for two variables in a set of data.
Bubble chart // shows size comparison
Distribution graph // displays the spread of various outcomes in a dataset
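The histogram idea above (counting how often values fall into certain ranges) can be sketched without a plotting library. The scores are made-up illustration data, and the fixed bin width of 10 is an arbitrary choice.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count how many values fall into each range [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Made-up exam scores for illustration only
scores = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78, 82, 85, 91]
width = 10

for start, count in histogram_counts(scores, width).items():
    # One '#' per value in the bin gives a rough text histogram
    print(f"{start}-{start + width - 1}: {'#' * count}")
```

The 70-79 bin has the most values here, which gives the rough bell-curve shape mentioned in the definition.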
Patterns:
- Change // a trend or instance of observations that become different over time. Line or column chart.
- Clustering // a collection of data points with similar or different values. Distribution graph.
- Relativity // observations considered in relation or in proportion to something else. Pie chart.
- Ranking // position in a scale of achievement or status. Column chart.
- Correlation // shows a mutual relationship or connection between two or more things. Scatter plot.
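The pattern-to-chart pairings in this list work like a small lookup table. A minimal sketch, where the dictionary and function names are made up for illustration and the pairings are exactly those listed above:

```python
# Pattern-to-chart pairings as listed in the notes
CHART_FOR_PATTERN = {
    "change": ["line chart", "column chart"],
    "clustering": ["distribution graph"],
    "relativity": ["pie chart"],
    "ranking": ["column chart"],
    "correlation": ["scatter plot"],
}

def suggest_charts(pattern):
    """Return the chart types the notes pair with a pattern (case-insensitive)."""
    key = pattern.strip().lower()
    if key not in CHART_FOR_PATTERN:
        raise ValueError(f"unknown pattern: {pattern!r}")
    return CHART_FOR_PATTERN[key]

print(suggest_charts("Ranking"))  # ['column chart']
```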
Organization Frameworks
- The McCandless Method
https://www.informationisbeautiful.net/visualizations/what-makes-a-good-data-visualization/
INFORMATION > STORY > GOAL > VISUAL FORM
- Kaiser Fung's Junk Charts Trifecta Checkup
https://junkcharts.typepad.com/junk_charts/junk-charts-trifecta-checkup-the-definitive-guide.html
Checkup Questions:
- What is the practical question?
- What does the data say?
- What does the visual say?
Marks and Channels in Data Visualizations:
* Pre-attentive attributes // elements of a data visualization that people recognize automatically without conscious effort.
Marks // basic visual objects like points, lines, shapes.
- Position, Size, Shape, Color
Channels // visual aspects or variables that represent characteristics of the data.
- Accuracy // are the channels accurate at estimating the values represented?
- Popout // are the values easily distinguished from one another?
- Grouping // how good is a channel at communicating groups that exist in the data?
Design Principles:
Principle | Description |
---|---|
Choose the right visual | One of the first things you have to decide is which visual will be the most effective for your audience: sometimes a simple visual, sometimes a more complex one. |
Optimize the data-ink ratio | Optimizing the data-ink ratio entails focusing on the part of the visual that is essential to understanding the point of the chart. Try to minimize non-data ink, like boxes around legends or shadows. |
Use orientation effectively | Make sure the written components of the visual, like the labels on a bar chart, are easy to read. Change orientation if necessary. |
Color | Use color consciously and meaningfully, staying consistent throughout your visuals, being considerate of what colors mean to different people, and using inclusive color scales that make sense for everyone viewing them. |
Numbers of things | Think about how many elements you include in any visual. If your visualization uses lines, try to plot five or fewer. If that isn’t possible, use color or hue to emphasize important lines. Also, when using visuals like pie charts, try to keep the number of segments to fewer than seven, since too many elements can be distracting. |
Avoid misleading or deceptive charts:
What to avoid | Why |
---|---|
Cutting off the y-axis | Changing the scale on the y-axis can make the differences between different groups in your data seem more dramatic, even if the difference is actually quite small. |
Misleading use of a dual y-axis | Using a dual y-axis without clearly labeling it in your data visualization can create extremely misleading charts. |
Artificially limiting the scope of the data | If you only consider the part of the data that confirms your analysis, your visualizations will be misleading because they don’t take all of the data into account. |
Problematic choices in how data is binned or grouped | It is important to make sure that the way you are grouping data isn’t misleading or misrepresenting your data and disguising important trends and insights. |
Using part-to-whole visuals when the totals do not sum up appropriately | If you are using a part-to-whole visual like a pie chart to explain your data, the individual parts should add up to equal 100%. If they don’t, your data visualization will be misleading. |
Hiding trends in cumulative charts | Creating a cumulative chart can disguise more insightful trends by making the scale of the visualization too large to track any changes over time. |
Artificially smoothing trends | Adding smooth trend lines between points in a scatter plot can make it easier to read that plot, but replacing the points with just the line can make it appear that the points are more connected over time than they actually were. |
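Two of the rows above can be checked with simple arithmetic. The sketch below uses hypothetical helper names and made-up values: one function shows how much a cut-off y-axis exaggerates the visual difference between two bars, and the other checks that part-to-whole percentages add up to 100%.

```python
import math

def bar_height_ratio(a, b, axis_start=0):
    """How many times taller bar a is drawn than bar b when the y-axis starts at axis_start."""
    return (a - axis_start) / (b - axis_start)

def parts_sum_to_whole(percentages, tolerance=0.5):
    """True if the parts add up to 100% within a small rounding tolerance."""
    return math.isclose(sum(percentages), 100, abs_tol=tolerance)

# Made-up values for illustration only
print(bar_height_ratio(52, 50))                 # 1.04: a 4% difference
print(bar_height_ratio(52, 50, axis_start=48))  # 2.0: the same data looks doubled
print(parts_sum_to_whole([45, 30, 25]))         # True
print(parts_sum_to_whole([45, 30, 40]))         # False
```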
Additional Resources:
https://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization?language=en#t-150183
https://artscience.blog/home/the-mccandless-method-of-data-presentation
https://informationisbeautiful.net/
https://www.amazon.com/Street-Journal-Guide-Information-Graphics/dp/0393072959
https://visme.co/blog/best-data-visualizations/
https://www.tableau.com/learn/articles/best-data-visualization-blogs
https://datastudio.google.com/gallery?category=visualization
https://towardsdatascience.com/correlation-is-not-causation-ae05d03c1f53
https://www.data-to-viz.com/
https://www.youtube.com/watch?v=C07k0euBpr8
https://dataconomy.com/2019/05/three-critical-aspects-of-design-thinking-for-big-data-solutions/
https://www.enginess.io/insights/data-and-design-thinking