Michael Sandberg's Data Visualization Blog

Review – Part 2: MOOC, “Data Exploration and Storytelling: Finding Stories in Data with Exploratory Analysis and Visualization”


Readers:

In Part 1 of my review of the MOOC, Data Exploration and Storytelling: Finding Stories in Data with Exploratory Analysis and Visualization, I provided some background on the course framework, including what a MOOC is. I also provided the biographies of Alberto Cairo and Heather Krause from the course web site.

In Part 2 of this review, I am going to share some of the content that was taught in Module 1. I am also going to expand on one of the topics Professor Cairo addressed, The Hockey Stick Chart.

Again, I hope you find this review helpful, and I highly encourage you to take courses not only from Alberto and Heather, but also from other offerings in the MOOC space.

Best Regards,

Michael

Module 1 – Finding and Understanding Data

Visualization Defined

In the first video of this module, Professor Cairo provides us a definition of visualization. This is the same definition he discusses in his two seminal books, The Functional Art (2012) and The Truthful Art (2016).

Per Professor Cairo,

A visualization is a graphic representation designed to enable exploration, analysis, or communication.

He then provides two interesting stories related to data encoding. The first story deals with course grades given by an instructor. When the instructor used a 0-to-100 scale, where 0 was an F and 100 an A or A+, the average score was 72. When it came time for the students to write reviews of this instructor, they gave him less than favorable reviews. So the professor decided to perform an experiment: he changed his scale to 0 to 137, where 137 was an A or A+. With this change, the average score was now 96. In the next series of reviews, the students started giving him glowing reviews.

The question to raise here is whether a score of 96 out of 137 is a better average than 72 out of 100. After hearing Professor Cairo tell the story, you may be tempted to think 96 is the better average, as it is the larger number and most of us associate a score of 96 with a very high A. Professor Cairo pointed out that we humans are very bad at dealing with numbers, as they are “abstract representations of quantities.” [1]

He then shows the participants the data visualization below, where these numbers are mapped onto spatial properties. As you can now see visually, an average of 96 on a scale of 0 to 137 is not as good as an average of 72 on a scale of 0 to 100: 96 ÷ 137 ≈ 70%, slightly below 72 ÷ 100 = 72%.

Thayer - Exam Scores

This example fits the primary goal of the course: to show the participants ways to use data as a source to tell stories. As the course progresses, we will see more of the tools and techniques Alberto and Heather use to interrogate data for answers – gathering, cleaning, organizing, analyzing, visualizing and publishing data to find and tell stories. [1]

The Hockey Stick Chart

The next story Professor Cairo told us dealt with a very famous (and controversial) chart related to climate change. This chart is known as The Hockey Stick Chart.

The Hockey Stick Chart - Chart Only

Professor Cairo first shows us records of global temperatures from the year 1000 up to the year 2001. He presented these numbers as a data set in a numerical table (think of an Excel spreadsheet). Alberto points out that, viewing the numbers as a data set, it is almost impossible to see trends and patterns in the data unless you are a very good statistician or data scientist who is very good at extracting meaning from the data.

Professor Cairo then maps and transforms the data onto a time series line chart. He points out that this is one of the most famous and most persuasive data visualizations ever created. It is commonly called The Hockey Stick Chart and was designed by several climate scientists in 1998 and 1999. The story that it tells is very, very persuasive. [1]

At this point, I want to expand on Professor Cairo’s story and delve into more detail about The Hockey Stick Chart.

In 1998, a then-unknown climate scientist named Michael Mann (photo, right) and two of his colleagues published a paper that sought to reconstruct the Earth’s past temperatures going back 500 years before the era of thermometers, to show how out of whack recent warming has been.

The finding: Recent northern hemisphere temperatures had been “warmer than any other year since (at least) AD 1400.” The graph above depicting this result looked rather like a hockey stick: After a long period of relatively minor temperature variations (the “shaft”), it showed a sharp mercury upswing during the last century or so (“the blade”). [2]

The report spread quickly through climate science circles. Mann and another colleague soon lengthened the shaft of the hockey stick back to AD 1000. In 2001, the UN’s Intergovernmental Panel on Climate Change featured the hockey stick in its Third Assessment Report. Based on this evidence, the IPCC proclaimed that “the increase in temperature in the 20th century is likely to have been the largest of any century during the past 1,000 years.”

The Hockey Stick Chart - Revisited

Smoothed reconstructions of large-scale (Northern Hemisphere mean or global mean) surface temperature variations from six different research teams are shown along with the instrumental record of global mean surface temperature. Each curve portrays a somewhat different history of temperature variations and is subject to a somewhat different set of uncertainties that generally increase going backward in time (as indicated by the gray shading). This set of reconstructions conveys a qualitatively consistent picture of temperature changes over the last 1,100 years and especially over the last 400.

Then the National Academy of Sciences weighed in in 2006, vindicating the hockey stick as good science and noting:

The basic conclusion of Mann et al. (1998, 1999) was that the late 20th century warmth in the Northern Hemisphere was unprecedented during at least the last 1,000 years. This conclusion has subsequently been supported by an array of evidence that includes both additional large-scale surface temperature reconstructions and pronounced changes in a variety of local proxy indicators, such as melting on ice caps and the retreat of glaciers around the world.

All Hell Breaks Loose

Mann now faced myriad scientific and political attacks on his work. The Hockey Stick Chart was repeatedly attacked, and so was Mann himself. Congress got involved, with demands for Mann’s data and other information, including the computer code used in his research.

This report did not change the minds of climate deniers; in fact, it emboldened them further. Mann and his colleagues were drawn into the 2009 “Climategate” pseudo-scandal, which purported to reveal internal emails that (among other things) seemingly undermined The Hockey Stick Chart. Only, they didn’t.

In the meantime, climate scientists continued to work to prove (or disprove) Mann’s theories. Over the years, other researchers were able to test Mann’s work using “more extensive datasets, and more sophisticated methods. And the bottom line conclusion doesn’t change.” Mann’s single hockey stick chart soon became several dozen variations created by different groups of scientists. Mann referred to them as a “hockey team.”

Recent studies support the hockey stick more powerfully than ever. One report, published in Nature Geoscience with more than 80 authors, showed with extensive global data on past temperatures that the hockey stick’s shaft seems to extend back reliably for at least 1,400 years. In Science, Shaun Marcott of Oregon State University and his colleagues extended the original hockey stick shaft back 11,000 years. “There’s now at least tentative evidence that the warming is unprecedented over the entire period of the Holocene, the entire period since the last ice age,” says Mann.

“Climate deniers like to make it seem like the entire weight of evidence for climate change rests on the hockey stick,” explains Mann. “And that’s not the case. We could get rid of all these reconstructions, and we could still know that climate change is a threat, and that we’re causing it.” The basic case for global warming caused by humans rests on basic physics, and on basic thermometer readings from around the globe. The hockey stick, in contrast, is the result of a field of research called paleoclimatology (the study of past climates) that, while fascinating, provides only one thread of evidence among many for what we’re doing to the planet. [2]

Next Blog Post: Continuation of the Review of Module 1 – Finding and Understanding Data

Sources:

[1] Alberto Cairo and Heather Krause, Course Video: Module 1: Visualization for Discovery, Data Exploration and Storytelling: Finding Stories in Data with Exploratory Analysis and Visualization, Knight Center for Journalism in the Americas, The University of Texas – Austin, January 16–February 26, 2017.

[2] Chris Mooney, The Hockey Stick: The Most Controversial Chart in Science, Explained, TheAtlantic.com, May 10, 2013, https://www.theatlantic.com/technology/archive/2013/05/the-hockey-stick-the-most-controversial-chart-in-science-explained/275753/.

 



Review – Part 3: MOOC, “Data Exploration and Storytelling: Finding Stories in Data with Exploratory Analysis and Visualization”


Readers:

In Part 2 of my review of the MOOC, Data Exploration and Storytelling: Finding Stories in Data with Exploratory Analysis and Visualization, I reviewed two stories Professor Cairo presented regarding the importance of data visualization and how data encoding makes data easier to understand.

In Part 3 of this review, I am going to share some of the content that was taught in the third video of Module 1. This video was presented by Heather Krause (photo, right).

Again, I hope you find this MOOC series review helpful, and I highly encourage you to take courses not only from Alberto and Heather, but also from other offerings in the MOOC space.

Best Regards,

Michael

Module 1 – Finding and Understanding Data

Finding Data

In this third video for Module 1, Heather provides us with an in-depth review of some best practices for finding data. She points out that finding data is a lot easier than it used to be, thanks to the internet. There is also a growing number of open data sources that strongly embrace the ideas of transparency and accountability in research and data. Because of that, many non-academic organizations and individuals have access to all kinds of really interesting data.

So, where do we find the data? The most obvious place to start is Google. Enter your topic into Google, end with the word “data”, and there you go – you have some results. Heather used an example of refugee data, which she uses as a theme throughout many of the videos. At the time of recording, she was doing quite a lot of work on how to understand, support and put in place policies for refugee resettlement using data, so data science for refugee resettlement was one of her major projects.

Finding Excel Spreadsheets with Google

So if we Google “Refugee Data”, the screenshot below shows us what we get.

Google Refugee Data

Heather then showed us how, if we make a few small changes to our Google search, we can restrict our search for “Refugee Data” to only return results with the file type .xls, so we can focus only on Excel spreadsheets related to refugee data.

Google Refugee Data - Excel Sheets Only

As you can see in the screenshot above, Google immediately ignores everything that is not an Excel spreadsheet. An Excel spreadsheet is usually a very good place to start because a large amount of data can be stored in the spreadsheet.

Other extensions or file types you can try are .csv or .xlsx.
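For reference, Google also supports an explicit filetype: search operator that expresses the same restriction more compactly. These example queries are my own illustration, not from the course:

refugee data filetype:xls
refugee data filetype:csv
refugee data filetype:xlsx site:gov

The optional site: operator in the last query narrows results further, in this case to U.S. government domains.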

So, if we go to the file type .xls Google results (see screenshot below), we see that the State of Virginia is collecting data on refugees. Heather pointed out that this is very interesting and did not come up when we searched without a file type (she also pointed out that it probably did come up in those results, but hundreds and hundreds of pages back).

Virginia Refugee Data - Google Search

Reviewing the screenshot below, we can see that we now have a very interesting piece of data on refugees. It looks like the State of Virginia is keeping very careful public data on refugees: where they settle in Virginia, where they came from, and even what type of refugees they are.

A story angle to think about: can we show the flow of refugees by state? That would be a very interesting data journalism piece.

Virginia Refugee Data

On the next page of the Google search (see screenshot below), we see a topic for protection incident monitoring inventory.

Inventory Refugee Data - Google Search

We are not sure what this is, so we click on the link to review the data. It looks very interesting (see screenshot below).

Inventory Refugee Data

Heather points out,

One of the useful things about documents like this is that they tell you what other people are tracking. They tell you what data points, indicators and variables are important to people working on the ground in your story area. In this case, here we have a semi-public website. That’s very interesting. Semi-public is very interesting. It might be a good source or a good lead, something unusual. Children, there is an interesting topic that you might want to consider if you are going to work on refugees. And transgender, that’s also very interesting. So you can see how finding data really does contribute to the iterative process. Simply the act of looking for data will help you form possible angles for stories, building a foundation of understanding of your topic. This is a very important process. This data here has a lot of metadata, and we will be talking about metadata very soon. Metadata is data about your data, information about your data. So that’s very important. [1]

Unusual Sources in a Google Search

Now, if we use the .csv file type, we again see that the data shows up much more quickly – often data from sources you might not otherwise ever see (see screenshot below).

Google Search - Unusal Sources

Heather told us a story where very interesting data came up from things like school board meetings, where everything is public but not really published – it certainly is not published in a way that is easy to find. So searching by file type is very useful.

Images in a Google Search

Heather pointed out that another very useful trick is searching in images.

Google Search for Images

Heather noted,

This seems counterintuitive. You want data. But one of the very important aspects of data/data journalism, as our colleague Alberto Cairo knows quite well, is visualization, and it’s very important right now.

 

By searching the images section for refugee data, we will probably find people working on data with refugees and analyzing it in interesting ways. This doesn’t mean you should take all of it to be truth. There are a lot of bad and poorly done visualizations. However, it can be very helpful if you are extremely careful.

Visualoop

The screenshot above is from the Visualoop website, which is doing something really interesting with refugee data. We can’t really understand what they are doing from here, but if we follow the link, we end up with very interesting sources that are using data about refugees to answer questions.

Below are images from some very popular publications, such as The New York Times. Less well-known publications also have images, some of which are not in English. So this could be a very good lead for finding unusual and important data that you might not otherwise find anywhere else.

Refugee Data Image - The New York Times

 

Refugee Data Image - The Guardian

Refugee Data Image - Publico

Data Repositories

After a Google search, the next obvious place to look is data repositories. Some data repositories are indexed by Google, which means their results will show up in your Google searches; many others are not, which means they won’t show up.

Heather pointed out that, as part of the MOOC course material, they have provided a list of many different data repositories that they know about. She also pointed out that one of the great things about a MOOC is that you can make your own contributions so we can all learn from each other to find new data sources.

Here are a couple of examples.

The NICAR Data Library contains a database with some very interesting data in it. Note that some of the data in this repository is free, while some of it you pay for.

NICAR Data Library - Copy

The same is also true of ProPublica. They have an outstandingly good data store. It’s not very big, but the data is of very, very good quality and can answer some very interesting questions. Again, some of the data is free and some of it you will need to pay for.

ProPublica - Copy

Here is Enigma, which catalogs a lot of publicly available data in interesting ways. If we use it for refugee data, we will find Al Qaeda and Texas budget data sets. These are pieces of data that you would otherwise only find by searching through 100 pages of Google results, but Enigma makes it easier and also gives you ideas for angles you might take with the data that you had not thought of before.

Enigma

Open data portals are a very important source, although usually not the best source. The data is okay, but the data these portals make available to the public often won’t answer the questions you need answered. Feel free to Google anything you want (e.g., “Open Data”) and you will find it all there.

There is also a project called the Dataverse, which was developed at Harvard, and many similar ones exist throughout the world today. Within the Dataverse, researchers, academics and anyone else collecting their own data can upload that data for others to use.

Dataverse

Heather stressed that transparency and reproducibility have become a very important part of the modern research process. To facilitate that, researchers who use data to do research have to make their data available so you can check whether their results are real. It also allows others to use that data as a source for other stories.

Finding Data Sources in a PDF Report

After you have exhausted all of the data sources mentioned above, there are a few other ways to get data. One of them is reading reports really carefully – usually a PDF report from an organization that is studying or advocating on your topic.

Heather’s example is refugees. If you read these reports really carefully, there are usually tiny links to the data. Below is an example from the OECD migration report; if you look way down at the bottom, you will see an example of such a link.

OECD

If you click the OECD link shown above, you’ll get an Excel spreadsheet (see screenshot below) containing all the data that is behind the report. It won’t have the microdata, but it will have summarized tables.

OECD - Excel Spreadsheet

Oftentimes there are workbook tabs in the Excel spreadsheet (see screenshot below) containing tables that are not included in the PDF. This is behind-the-scenes analysis that they did but that may have been too complicated to report on. Heather notes that these tabs are gold in terms of data journalism.

OECD - Excel Data

Scraping Data

If you really can’t find your data anywhere, then you can scrape it. There are a couple of different ways to do this. Heather recommended the website Import.io (see screenshot below), which she uses a lot; she finds it a logical, quick and intuitive way to work with data. With a little practice, you can get data off of almost any website that has data. Alternatively, if you learn how to code (or hire someone who can), you can scrape data from almost any website that has data; a minimal sketch of that approach follows the screenshot below.

Scraping Data Software
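Heather demonstrated Import.io rather than code, but for the coding route, here is a minimal Python sketch of scraping an HTML table, assuming a hypothetical page that publishes one (the URL is illustrative, and pandas.read_html requires lxml or html5lib to be installed):

import pandas as pd

# Hypothetical URL of a page containing an HTML <table> of refugee data.
url = "https://example.org/refugee-arrivals.html"

# read_html() returns a list of DataFrames, one per <table> found on the page.
tables = pd.read_html(url)

# Inspect the first table, then save it for use in Tableau or Excel.
print(tables[0].head())
tables[0].to_csv("refugee_arrivals.csv", index=False)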

If you are going to scrape data, or use data that is a little hard to find, it is a very good idea to consider the ethics of doing so.

Scraping Data Online Blog

Heather notes,

There is no single consensus on what the ethics are of using data that is semi-public or scraped. Many people say that if it is on the internet, then it is public and it is usable. That’s fine, but that’s not legally written in anybody’s code of ethics that I’m aware of. I think there are many different perspectives on what data is free to use and what data isn’t. It is very important to read the user terms and conditions on any website you are scraping data from or any report you are taking data from.

Scraping Data Ethics

Heather also points out,

It is also important to look at the robots.txt file (see example screenshot below). If you are going to scrape data, you need to look at the robots.txt file at the root of the website, which will basically tell you the parts of the website you are and are not allowed to scrape.

Scraping Data Ethics Robots
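If you do write your own scraper, Python’s standard library can read and apply a robots.txt file for you. Here is a minimal sketch; the site and the user-agent name are hypothetical:

from urllib import robotparser

# Fetch and parse the robots.txt file at the root of a (hypothetical) site.
rp = robotparser.RobotFileParser()
rp.set_url("https://example.org/robots.txt")
rp.read()

# Ask whether our (hypothetical) user agent may fetch a given page.
print(rp.can_fetch("MyDataBot", "https://example.org/data/refugees.html"))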

There is also a very, very good article (click on the screenshot below to go to the article) on the ethics of scraping data. If you are using data that is in a report or is easily downloadable in an Excel spreadsheet from someone’s website, you are on very solid ethical ground.

Scraping Data Ethics Article

Bottom Line: If you are scraping websites, you might want to take some time to think about what you are doing, who it could expose, and the consequences of that.

Summary

Part 3 of this series has shown you many different ways to find data: Google searches, specialized Google searches, reports, and scraping. In Part 4, we will discuss the scenario where you now have the data you need: how do you understand that data before you start working with it?

Next Blog Post: Continuation of the Review of Module 1 – Understanding Data

Sources:

[1] Alberto Cairo and Heather Krause, Course Video: Module 1: Visualization for Discovery, Data Exploration and Storytelling: Finding Stories in Data with Exploratory Analysis and Visualization, Knight Center for Journalism in the Americas, The University of Texas – Austin, January 16–February 26, 2017.

 



Tableau Releases Version 10.3


Adds Data Driven Alerting, Smart Recommendation Engine, and Expands Data Connections

Last Thursday (June 1, 2017), Tableau Software announced the general availability of Tableau 10.3. This latest release will help organizations achieve data-driven insights faster than ever through automated table and join recommendations, powered by machine learning algorithms that simplify the search for the right data for analysis. It also includes data-driven alerts to allow for proactive monitoring of key metrics. 10.3 unlocks six new data sources for rapid-fire analysis, including a new connector for extracting data from PDF documents. Additionally, Tableau Online customers are able to try Tableau Bridge (in beta), which enables a direct connection from the cloud to data stored on premises.

Tableau Bridge

Tableau uses machine learning to recommend relevant tables and joins to simplify defining the right data models. Recommendations are based on collective insights derived from usage patterns across your entire organization, and help you get the relevant data to jumpstart your analysis.

“Organizations want to do more with the vast data they have at their disposal,” said Francois Ajenstat, Chief Product Officer at Tableau. “It’s not just simple analysis our customers are seeking, it’s the power to unlock all of their data with ease and efficiency. With smart recommendations, customers can get to the right data faster than ever – without having to spend time finding the right tables and joins. And with proactive monitoring of key metrics through features like data driven alerts, they can take action immediately and be more agile.”

Stay on top of your changing business with data driven alerts

Tableau 10.3 makes it easier for everyone to stay engaged with the metrics that matter most. With new data-driven alerts, customers can instantly receive notifications as their data crosses a pre-set threshold, ensuring they never miss an important change in their organization. Customers can set alerts simply by pointing at the data on which they want to be notified.

“The new data-driven alerts will be a game-changer for our organization,” said Paul Lisborg, manager of Business Intelligence and Analytics at Oldcastle Architectural. “Users will now be provided a way to automatically receive outliers on customer orders, sales, and production data, allowing us to quickly respond to potential issues and more effectively manage our business.”

Tableau Subscriptions

Harness your data with smart table and join recommendations powered by machine learning

Tableau 10.3 makes it easier for people to find the right data for their analysis with smart table and join recommendations. Leveraging machine learning algorithms, Tableau Server analyzes aggregate data source usage to recommend popular tables and corresponding joins across the organization. With recommendations, customers can save time by quickly identifying database tables that are relevant to their analysis and leveraging join recommendations to enrich their data. Now it’s easy for customers to automatically apply insights from experts and other users across their organization, increasing the overall quality of their data models.

Connect to more data, in the cloud and even PDFs

Tableau 10.3 makes it easy for teams to access data wherever it resides. In all, customers can now connect to more than 75 data sources via 66 connectors, without any programming. That includes a new PDF connector, which allows people to directly import PDF tables into Tableau with just one click. With an estimated 2.5 trillion PDFs worldwide (per Adobe), this unlocks a new realm of data that can be leveraged for rich analysis.

Additionally, Tableau now comes with new connectors to popular data sources such as Amazon Athena, ServiceNow, MongoDB, Dropbox, and Microsoft OneDrive. These new data connectors add to Tableau’s deep roster of built-in data connectors.

Tableau Connect to PDF

Hybrid data for the cloud

Tableau Online customers can now leverage data stored on premises directly in the cloud with the new Tableau Bridge. Available to all Tableau Online customers to try, this will allow a secure, live connection to on premises data, meaning it’s no longer necessary to move data to perform a live query from Tableau Online. Many organizations have data on premises and in the cloud, and Tableau Bridge allows these customers to easily connect live to all of their data no matter where it is. We’re helping organizations deliver cloud analytics with their existing on premises database investments.

Check out the full features list for Tableau 10.3 at www.tableau.com/new-features/10.3

Source: PRNewswire, Tableau 10.3 Adds Data Driven Alerting, Smart Recommendation Engine, and Expands Data Connections, Yahoo! Finance, PR Newswire, June 1, 2017, https://finance.yahoo.com/news/tableau-10-3-adds-data-130000797.html.


Tableau Conference on Tour 2017 – London: I Wasn’t There


Readers:

Last week was an exciting time for Tableau and Alteryx users. In Las Vegas, here in the States, Alteryx was holding its Inspire 2017 conference. On the other side of the pond, Tableau was holding its Tableau Conference on Tour in London.

Sadly, I was not at either of them.

However, I followed both religiously on Twitter. Below are some interesting photos that were posted from the Tableau Conference on Tour. I tried to add a few comments to frame the photos I selected.

It was a great conference. Too bad I wasn’t there.

Best regards,

Michael

Comment: I really would have liked to see Clive Benford’s presentation in person. I really liked his insights on Jaguar Land Rover’s approach to analytics. If anyone has a copy of the presentation they can send me, I would really appreciate it.

IMG_1297

IMG_1292 IMG_1293 IMG_1294

IMG_1296

Comment: Very powerful statement.

Data is now Everyone’s Job

IMG_1295

 

Joe Bullock’s presentation on Breaking Down Silos and Improving Data Governance. Joe is with the Medical Defence Union.

IMG_1298 IMG_1299 IMG_1300

Miguel Cisneros’ great Tableau dataviz was showcased at the conference.

IMG_1301

Combating Fake News: Keynote speaker David Spiegelhalter, Professor of the Public Understanding of Risk at the University of Cambridge.

IMG_1302 IMG_1303 IMG_1306 IMG_1307 IMG_1308 IMG_1309

 

IMG_1310

Incredible Tableau data visualizations in the 2017 DataViz Gallery.

IMG_1311 IMG_1312 IMG_1313 IMG_1314 IMG_1315 IMG_1316

I’m a big fan of Ryan Sleeper. Watch for his upcoming book!

IMG_1317

Congratulations to David Pires for winning the Iron Viz.

IMG_1318 IMG_1319 IMG_1320

An oldie but a goodie. Do you know the answer?

IMG_1321

Look what’s coming in Tableau v10.4

I need to get this book!

My goal is to provide a blog post of an interview with Andy Kriebel in the near future. Watch for it soon.

I hope to join the fun in Las Vegas in October!

 

 



Revisiting Tableau Desktop Fundamentals


Readers:

I am hosting Tableau training this week at work. It is always a good feeling watching the newbies working with Tableau and getting excited about how easy and fun it is to be able to visualize their data.

Since I never had any formal training with Tableau, I have been sitting in the class this week too. It is never too late to teach an old dog new tricks (or at least fill in the holes in my knowledge).

Below are a couple of things I wanted to share from the two-day Tableau Desktop Fundamentals class. The portion on Jacques Bertin and Marks is something I added in case anyone ever asks you what a “Mark” is.

Thanks to Evan Alini and Celeste Luna for keeping the class interesting and fun.

Best regards,

Michael

Tableau File Types

Celeste drew a nice visual explanation of Tableau File Types on the whiteboard. I liked it so much, I recreated it in PowerPoint and added a few more details. Below is a screenshot of the visual I created based on her example.

Tableau File Type

Dual Axis Pills

I learned something new: I had not noticed before that the inside rounded ends are missing from the pills to indicate that they participate in a dual axis.

Tableau Dual Axis Pills

Plus (+) and Minus (-) Icons on X-Axis

This is a nice thing to know. Tableau seems to provide you an infinite number of ways to do things.

Plus and Minus Icons

What the Heck is a Mark?

I have seen people get a puzzled look on their face when a Tableau instructor starts talking about “Marks.” I thought I would provide some history and background on where this term originated and how it ties into the Tableau Desktop product.

Our usual way of communicating is with words. Written words consist of single symbols (letters) that gain meaning when arranged in certain combinations. The question is: if there are basic visual symbols arranged in a particular way, can they be used to convey information in a similar manner? The early developments along these lines were primarily made for cartographic purposes; with the computerization of information, these visual variables were adapted and used for information visualization. The concept of information visualization began in the 1930s and was developed further by cartographers after the 1950s. Since then, the development of computers has revolutionized all aspects of information visualization. [2]

Jacques Bertin was a French cartographer and theorist, known for his book Semiologie Graphique (Semiology of Graphics), published in 1967. This monumental work, based on his experience as a cartographer and geographer, represents the first and broadest attempt to provide a theoretical foundation for information visualization. [4]

Mr. Bertin described “marks” as these basic units and also developed a number of methods through which these units can be modified, including position, size, shape, and color. These predefined modifications are called visual variables. Each of these variables can have certain characteristics. Sometimes visual variables are also called visual attributes. [2]

A mark is made to represent some information other than itself. It is also referred to as a sign. Marks can be:

  • Points are dimensionless locations on the plane, represented by signs that obviously need to have some size, shape or color for visualization.
  • Lines represent information with a certain length, but no area and therefore no width. Again, lines are visualized by signs of some thickness.
  • Areas have a length and a width, and therefore a two-dimensional size.
  • Surfaces are areas in a three-dimensional space, but with no thickness.
  • Volumes have a length, a width and a depth. They are thus truly three-dimensional.

Mr. Bertin defined seven Visual Variables consisting of:

Bertin Visual Variables

Noah Iliinsky refined this chart years later; his chart is shown below. [3]

VisualPropertiesTable2

O.K., so now you are asking: how did this end up in Tableau Desktop? Let’s introduce Jock Mackinlay.

Jock D. Mackinlay is an American information visualization expert and Vice President of Research and Experience at Tableau Software. [5] Jock invented a number of Information Visualization techniques such as the Information Visualization Reference Model. He also expanded the list of visual variables. In addition, he provided different sorting for their accuracy, based on the task. [2]

Ranking of perceptual tasks:

Mackinlay_PerceptualTask

The list was further expanded by several later publications. Most of them also group the visual variables, e.g., combining length, area and repetition into shape, or breaking position down into three dimensions of space and one of time. [2]

Since information is nowadays presented by computers, the addition of motion as a new visual variable has become important. Changes in motion can include direction, speed, frequency, rhythm, flicker, trails, and style. [2]

I hope this helps you understand why Tableau incorporated the concepts of Marks into their Tableau Desktop product.

Stay tuned for my notes from the Tableau Desktop Intermediate class in a few days.

Sources:
[1] Tableau Software, Tableau Desktop Fundamentals Training v10.2, Glendale, Arizona, June 12-13, 2017.

[2] Infovis Wiki, Visual Variables, http://www.infovis-wiki.net, http://www.infovis-wiki.net/index.php?title=Visual_Variables.

[3] Iliinsky, Noah, Properties and Best Uses of Visual Encodings, ComplexDiagrams.com, June 2012, http://complexdiagrams.com/properties.

[4] Wikipedia, Jacques Bertin, https://en.wikipedia.org/wiki/Jacques_Bertin.

[5] Wikipedia, Jock D. Mackinlay, https://en.wikipedia.org/wiki/Jock_D._Mackinlay.

 



Tableau Tips & Tricks, Data Blending Revisited, Workflows, Architecture and More


Readers:

As I mentioned in one of my previous blog posts, I hosted Tableau training last week at work. The second class, taught on Thursday and Friday, was steeped in information and tips & tricks related to Tableau.

Since I never had any formal training with Tableau, I had been sitting in the class last week too. It is never too late to teach an old dog new tricks (or at least fill in the holes in my knowledge).

Below are a few tips & tricks I wanted to share from the two-day Tableau Desktop Intermediate class. Also, I have expanded a bit on a previous blog post I did in 2014 about Tableau Data Blending.

Thanks again to Evan Alini and Celeste Luna for the excellent training they provided us last week.

Best regards,

Michael

An Overview of the Tableau Platform and Products [6]

Tableau Visual Intelligence Platform

The Tableau Desktop Architecture Workflow

I used a variety of sources to build what I feel is a better depiction of how a visualization is created using Tableau Desktop. [1][2][3][5][6]

Tableau Desktop Architecture Workflow

Tip: Some Extra Help in the Lower Left Corner

Evan showed us a tip I was not aware of. In the lower left corner of the Tableau worksheet is the following information.

Mark Information Snippet

This is a nice feature to help with your validation of the results in your workbook. If you look at the associated screenshot below, I have 17 marks (horizontal bars), 1 measure column (each horizontal bar represents Sales by Category, Sub-Category), and the total of all sales for the chart is $12,642,502 (if you removed the dimensions, the Sales measure would be the only field displayed).

Mark Information

Measure Values and Measure Names

Measure values and measure names are Tableau-generated fields that serve as containers for more than one measure. You can see the Measure Names field at the bottom of the list of Dimensions and the Measure Values field at the bottom of the Measures list in the Data pane. [1][2]

When you create a combined axis (measures using the same unit of measure, such as Sales or Profit) or a dual axis view (combining two different mark types), these fields appear in the view automatically, as does a Measure Values card that shows which fields are included. [2]

Measure Names and Values

Tip: Label versus Text on the Mark Card

When you have a picture (visual), you will see “Label” in the upper right of the Marks card. [2]

Mark Card - Label

When there is not a picture (visual), you will see “Text” in the upper right of the Marks card.

Mark Card - Text

Tip: Dashboard Actions

The Hover action runs when you rest the pointer over a mark in the view. This option works best for highlight actions within a dashboard. [2]

Hover Action

Tip: Enlarge Text in the Calculated Field Editor

Evan showed us a really neat trick: when you are editing in the Calculated Field Editor, holding CTRL and turning the scroll wheel on your mouse will enlarge or shrink the text, depending on the direction you move the wheel. [2]

Calculated Field Mouse Scroll Wheel

Tip: Gauges

In class last week, there was a brief discussion about how to create gauges in Tableau. I know there are people out there who have found many creative ways to create these in Tableau, but this fits in the same category as Tableau letting pie charts slip into their toolset. I will repeat what Stephen Few said about gauges when I originally took training from him over a decade ago (I am paraphrasing here).

“No!”

I concur. Leave gauges out of your data visualizations and out of Tableau. Nuff said.

Gauges

How Does Data Blending Work? [1][2][4]

Note: Check out my multi-part series on Data Blending in Tableau I blogged about in 2014 by clicking here.

First, to help us in discussing data blending, let’s look at a visual explaining the kinds of joins available.

sql-joins

Data blending is an alternative to joining, depending on factors like the type of data and its granularity. Data blending simulates a traditional left join. The main difference between the two is when the join is performed with respect to aggregation. [2][4]

Left Join (not cross-database)

Left Join

A single query is sent to the database where the join is performed. The results of the join are then sent back to Tableau for aggregation.

Data Blend

Blend

A query is sent to the database for each data source that is used on the sheet. The results of the queries are sent back to Tableau, and then combined. The view uses all rows from the primary data source, the left table, and the aggregated rows from the secondary data source, the right table, based on the dimension of the linking fields.

NOTE: Dimension values are aggregated using the ATTR aggregate function, which means the aggregation returns a single value for all rows in the secondary data source. If there are multiple values for the rows, an asterisk (*) is shown. Measure values are aggregated based on how the field is aggregated in the view. [2][4]
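To make that order of operations concrete, here is a minimal Python sketch with made-up data (my own illustration, not Tableau’s implementation): the secondary source is aggregated per linking-field value before being matched to the primary source, and an ATTR-like rule returns a single dimension value only when all secondary rows agree.

from collections import defaultdict

# Primary source: one row per transaction (hypothetical data).
primary = [
    {"quarter": "Q1", "sales": 100},
    {"quarter": "Q1", "sales": 150},
    {"quarter": "Q2", "sales": 120},
]

# Secondary source: quota rows, possibly several per quarter (hypothetical).
secondary = [
    {"quarter": "Q1", "region": "West", "quota": 200},
    {"quarter": "Q1", "region": "East", "quota": 250},
    {"quarter": "Q2", "region": "West", "quota": 130},
]

# Step 1: aggregate the secondary source per linking-field value.
quota_sum = defaultdict(int)
regions = defaultdict(set)
for row in secondary:
    quota_sum[row["quarter"]] += row["quota"]
    regions[row["quarter"]].add(row["region"])

# Step 2: combine the aggregated results, left-join style, on the linking field.
for quarter in sorted({r["quarter"] for r in primary}):
    sales = sum(r["sales"] for r in primary if r["quarter"] == quarter)
    # ATTR-like rule: a single value if all secondary rows agree, else "*".
    region = regions[quarter].pop() if len(regions[quarter]) == 1 else "*"
    print(quarter, sales, quota_sum[quarter], region)
# Prints: Q1 250 450 *   then   Q2 120 130 West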

When to Use Data Blending [1][2][4]

Data blending should be considered in the following situations:

You want to combine data from different databases that are not supported by cross-database joins.

First, let’s define a Tableau cross-database join.

When related data is stored in tables across different databases, you can use a cross-database join to combine the tables.

To create a cross-database join, you must create a multi-connection Tableau data source. You do so by adding and then connecting to each of the different databases (including Excel and text files) before you join. [7]

Cross-database joins do not support connections to cubes (for example, Oracle Essbase) or to some extract-only connections (for example, Salesforce). In this case, set up individual data sources for the data you want to analyze, and then use data blending to combine the data sources on a single sheet.

Data is at different levels of detail.

Sometimes one data set captures data using greater or lesser granularity than the other data set.

For example, suppose you are analyzing transactional data and quota data. Transactional data might capture all transactions. However, quota data might aggregate transactions at the quarter level. Because the transactional values are captured at different levels of detail in each data set, you should use data blending to combine the data.

Use data blending instead of joins under the following conditions:

Data needs cleaning.

If your tables do not match up with each other correctly after a join, set up data sources for each table, make any necessary customizations (that is, rename columns, change column data types, create groups, use calculations, etc.), and then use data blending to combine the data.

  • Joins cause duplicate data. Duplicate data after a join is a symptom of data at different levels of detail. If you notice duplicate data, instead of creating a join, use data blending to blend on a common dimension instead.
  • You have lots of data. Typically, joins are recommended for combining data from the same database. Joins are handled by the database, which allows joins to leverage some of the database’s native capabilities. However, if you’re working with large sets of data, joins can put a strain on the database and significantly affect performance. In this case, data blending might help. Because Tableau handles combining the data after the data is aggregated, there is less data to combine. When there is less data to combine, performance generally improves.

    Note: When you blend on a field with a high level of granularity, for example, date instead of year, queries can be slow.

Prerequisites for data blending

Your data must meet the following requirements in order for you to use data blending.

Primary and secondary data sources

Data blending requires a primary data source and at least one secondary data source. When you designate a primary data source, it functions as the main table or main data source. Any subsequent data sources that you use on the sheet are treated as a secondary data source. Only columns from the secondary data source that have corresponding matches in the primary data source appear in the view.

Using the same example from above, you designate the transactional data as the primary data source and the quota data as the secondary data source.

Note: Cube (multidimensional) data sources must be used as the primary data source. Cube data sources cannot be used as a secondary data source.

Defined relationship between the primary and secondary data sources

After designating primary and secondary data sources, you must define the common dimension or dimensions between the two data sources. This common dimension is called the linking field.

Continuing the example from above, when you blend transactional and quota data, the date field might be the linking field between the primary and secondary data sources.

  • If the date field in the primary and secondary data sources has the same name, Tableau creates the relationship between the two fields and shows a link icon next to the date field in the secondary data source when the field is in the view.
  • If the two dimensions don’t have the same name, you can define a relationship that creates the correct mapping between the date fields in the primary and secondary data sources.

Data blending limitations

There are some data blending limitations around non-additive aggregates, such as COUNTD, MEDIAN, and RAWSQLAGG.

Tip: Blue Versus Orange Data Sources [2]

Blue versus Orange Data Sources

I hope this helps you better understand data blending in Tableau and gain an appreciation for the great knowledge you can pick up by attending the Tableau Desktop II: Intermediate class.

I have a few more important Tableau topics to discuss, but I want to save them for another day so I can properly discuss and explain them.

Sources:

[1] Milligan, Joshua N., Learning Tableau 10 – Second Edition, Packt Publishing, 2016.

[2] Tableau Software, Tableau Classroom Training – Desktop II: Intermediate v10.2, Glendale, Arizona, June 14-15, 2017.

[3] Tableau Software, Tableau Desktop and Server Architecture, Tableau Software, Seattle, WA, January, 2013.

[4] Tableau Software, Blend Your Data, Tableau Help->Connect to and Prepare Data->Set Up Data Sources, http://onlinehelp.tableau.com/current/pro/desktop/en-us/multiple_connections.html.

[5] Pabba, Ramesh, Data Visualization with Tableau, Knowledgebee Trainings, November, 2015.

[6] Marc Rueter, Tableau Visual Intelligence Platform: Rapid Fire Analytics for Everyone Everywhere, Bloor Group, January, 2012.

[7] Tableau Software, Quick Start: Combine Tables Using Cross-Database Joins, Tableau Help, Tableau Software, Seattle, WA, 2017, https://onlinehelp.tableau.com/current/pro/desktop/en-us/qs_data_integration.html.



Tableau Secrets: Understanding Table Calculations Scope and Direction – Part 1


Readers:

I have been creating some internal documentation at work for our new Tableau community who have been chomping at the bit to start working with it on some real projects.

One of the topics that comes up a lot is the concept of COMPUTE USING or scope and direction. I have created some PowerPoint slides internally and thought I would share them with you.

I relied heavily on the fantastic Tableau book written by Joshua Milligan titled Learning Tableau 10, Second Edition (see cover image below). I will be blogging a review of this book in the next few weeks.

Joshua Milligan Learning Tableau 10 Second Edition Book Cover

So, here is Part 1 of my first Tableau Secrets post, about table calculations and their scope and direction. I hope you find it as helpful as I did while creating the PowerPoint slides.

Part 2 will focus on the scope and directions options in detail.

Best Regards,

Michael

Table Calculations - Slide 1
Table Calculations - Slide 2
Tableau Table Calculation Workflow
Table Calculations - Slide 3
Table Calculations - Slide 4
Table Calculations - Slide 5
Table Calculations - Slide 6
Table Calculations - Slide 7
Table Calculations - Slide 8
Table Calculations - Slide 9



Tableau Deep Dive: Trends – Part 1


Readers:

Since my life now seems very centered in the Tableau World, I have decided to add a new blog topic called Tableau Deep Dive. I will, to the best of my ability, take a deep dive into a topic related to Tableau Desktop or Tableau Server.

For my first dip of the toe into these waters, I will focus on trends. One of the great features of Tableau is that it enables you to quickly enhance your data visualizations with statistical analysis. Built-in features such as trending, clustering, distributions, and forecasting allow you to quickly add value to your visual analysis. Additionally, Tableau integrates with R, an extensive statistical platform that opens up endless options for statistical analysis of your data. [1]

In Part 1, I will go over the different ways of adding trend lines, how trends are calculated by Tableau after querying the data source, and how trend lines are drawn based on various elements in the view.

In Part 2, I will go over how to customize trend lines as well as the Trend Model.

In Part 3, I will finish up this deep dive by discussing how to analyze Trend Models.

I hope you enjoy this series about trending in Tableau.

Best regards,

Michael

Trending in Tableau

Much of the context, dataset, and Tableau workbook I am using for this blog post comes from the book I cite as the primary source at the end of this blog post (see book cover, right).

The dataset contains one record per country per year, from 1960 to 2015, measuring population. I will use this dataset to look at the historical population trends of various countries. In the example below, I show the change in population over time for Afghanistan and Australia. Country Name has been filtered to include only Afghanistan and Australia, and the field has additionally been added to the Color and Label shelves.

Observations

The growth of the two countries’ populations was fairly similar up to 1980. At that point, the population of Afghanistan went into decline until 1988, when it started to increase again. At some point around 1996, the population of Afghanistan exceeded that of Australia, and the gap has grown even wider since.

2017-09-08_12-38-05

 

Adding Trend Lines

Tableau offers several ways of adding trend lines:

  • From the menu, navigate to Analysis | Trend Lines | Show Trend Lines
  • Right-click an empty area in the pane of the view and select Show Trend Lines
  • Switch to the Analytics pane in the left sidebar and drag Trend Line onto the view, dropping it on the trend model of your choice (we’ll use Linear for now and discuss the models in more detail later in this post)

A screenshot of the third option is shown below:

2017-09-11_12-00-47

Two trend lines will be added to your view (one for each country). These lines are thinner than the regular lines and are dashed. Later in this post, I will show you how to customize these lines. Your view should now look like this.

2017-09-11_12-53-44

Trends are calculated by Tableau after querying the data source. Trend lines are drawn based on various elements in the view:

The two fields that define the X and Y coordinates: The fields on Rows and Columns that define the x and y axes give Tableau the coordinates it needs to calculate the various trend models. In order to show trend lines, you must use continuous (green) fields or discrete (blue) date fields, and have one such field on both Rows and Columns. If you use a discrete (blue) date field to define headers, the other field must be continuous (green).

Additional fields that create multiple, distinct trend lines: Discrete (blue) fields on the Rows, Columns, or Color shelves can be used as factors to split a single trend line into multiple, distinct trend lines.

The trend model selected: We’ll examine the differences in models later in this post.

2017-09-11_13-04-36

Notice in the screenshot above that there are two trend lines. Since Country Name is a discrete (blue) field on Color it defines a trend line per color by default.

Earlier, we observed that the population for Afghanistan increased and decreased within various historical periods. Notice that the trend lines are calculated along the entire date range (see screenshot below). What if we want to see different trend lines for those time periods?

2017-09-11_13-07-36

One way to do this is to simply select the marks in the view for the time period of interest. Tableau will, by default, calculate a trend line for the current selection. Below is an example where the points for Afghanistan from 1980 to 1990 have been selected and a new trend is displayed:

2017-09-11_13-18-41

 

Another option is to tell Tableau to draw distinct trend lines using a discrete field on Rows, Columns, or Color.

Create a calculated field called Period that defines discrete values for the different historical periods using the code below.

// Label each year with the historical period observed above
IF [Year] <= 1979
  THEN "1960 to 1979"
ELSEIF [Year] <= 1988
  THEN "1980 to 1988"
ELSE "1988 to 2015"
END

When you place this new calculated field on Columns, you’ll get a header for each time period, which breaks the lines and causes separate trends to be shown for each period (see screenshot below). You can also see that Tableau keeps the full date range in the axis for each period. You can set an independent range by right-clicking one of the date axes, selecting Edit Axis, and then checking the option Independent axis range for each row or column.

2017-09-11_13-23-49

 

In this view, transparency has been applied to Color to help the trend lines stand out. Additionally, the axis for Year was hidden (by unchecking the Show Header option on the field). Now you can clearly see the difference in trends for different periods of time: Australia’s trend changes only slightly in each period, while Afghanistan’s trends were quite different.

Next: Customizing Trend Lines and Trend Models

Source:

[1] I relied heavily on the fantastic Tableau book written by Joshua Milligan titled Learning Tableau 10, Second Edition (see cover image below). I will be blogging a review of this book in the next few weeks. Click here to purchase your own copy of this book.

Joshua Milligan Learning Tableau 10 Second Edition Book Cover

 

 



Tableau Deep Dive: Trends – Part 2


Readers:

In Part 1 of my deep dive into trending in Tableau, I went over the different ways of adding trend lines, how trends are calculated by Tableau after querying the data source, and how trend lines are drawn based on various elements in the view.

In Part 2, I will go over how to customize trend lines as well as the Trend Model.

In Part 3, I will finish up this deep dive by discussing how to analyze Trend Models.

I hope you enjoy this series about trending in Tableau.

Best regards,

Michael

Customizing Trend Lines and Trend Models

As I mentioned in Part 1, much of the context, dataset, and Tableau workbook I am using for this blog post comes from the book I cite as the primary source at the end of this blog post (see book cover, right).

Customizing trend lines

Let’s move on to another example using real estate trends. For this example, I will be using the Real Estate Listings data source provided with the book shown on the right. The scatterplot below shows a comparison of real estate listings by Price and Square Feet.

 

2017-09-12_10-57-55

In the screenshot above, I show a scatter plot with the sum of Size (Sq Ft) on Columns to define the X axis and the sum of Price on Rows to define the Y axis. Address has been added to Detail on the Marks card to define the level of aggregation, so each mark on the scatter plot is a distinct address plotted at a location defined by its size and price. Type of Sale has been placed on Color. Trend lines are shown; based on Tableau’s default settings there are three: one trend line per color.

Assuming a good model, the trend lines demonstrate how much and how quickly Price is expected to rise with an increase in size for each type of sale.

TIP

In this data set we have two fields, Address and ID, either of which defines a unique record. Adding one of those fields to the level of detail effectively disaggregates the data and allows us to plot a mark for each address. Sometimes you may not have a field in the data that defines uniqueness. In those cases, you can disaggregate the data by unchecking Aggregate Measures from the Analysis menu.

Alternatively, you can use the drop-down menu on each of the measure fields on Rows and Columns to change them from measures to dimensions while keeping them continuous. As dimensions, each individual value will define a mark. Keeping them continuous will retain the axes required for trend lines.

Now, let’s look at some of the options available for trend lines. You can edit trend lines by using the menu and navigating to Analysis | Trend Lines | Edit Trend Lines… or by right-clicking on a trend line and then selecting Edit Trend Lines…. When you do, you’ll see a dialog box similar to this:

2017-09-12_11-16-06

In the dialog box, we are provided with the following options:

  • Selecting a model type
  • Selecting applicable fields as factors in the model
  • Allowing discrete colors to define distinct trend lines
  • Showing confidence bands
  • Forcing the y-intercept to zero

We will examine these options in further detail. For now, experiment with the options for a bit. Notice how either removing the Type of Sale field as a factor or unchecking the Allow a trend line per color option results in a single trend line.

You can also see the result of excluding a field as a factor in the following view where Type of Sale has been added to Rows:

2017-09-12_11-25-27

 

In the screenshot above, Type of Sale is included as a factor in the left portion, which results in a distinct trend line for each type of sale. In the right portion, Type of Sale is excluded as a factor, so the same trend line, which represents the overall trend for all types, is drawn three times. This technique can be quite useful for comparing subsets of data to the overall trend.

Trend models

Let’s go back to the original scatter plot we started with today. I am going to use a single trend line as we consider the trend models available. The following models can be selected from the Trend Line Options window:

Linear

We would use this model if we assume that, as Size increases, Price increases at a constant rate. No matter how much Size increases, we would expect new data points to fall close to a straight line.
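For reference, the linear model takes the standard straight-line form:

Price = b0 + b1 × Size

where the intercept b0 and slope b1 are estimated from the data; b1 is the expected price increase per additional square foot.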

2017-09-12_11-38-44

Logarithmic

We would use this model if we expect that there is a law of diminishing returns in effect. That is, size can only increase so much before buyers will stop paying much more.
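For reference, the logarithmic model takes the form:

Price = b0 + b1 × ln(Size)

so equal multiplicative increases in size (each doubling, say) add a roughly constant amount to the predicted price, which is exactly the diminishing-returns shape described above.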

2017-09-12_11-43-37

Exponential

We would use this model to test the idea that each additional increase in size results in a dramatic (exponential!) increase in price.
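For reference, the exponential model is fit as a straight line on the natural log of the response:

ln(Price) = b0 + b1 × Size, or equivalently Price = e^b0 × e^(b1 × Size)

Because of the log transform, this model requires the response (Price) to be positive.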

 

Exponential – Defined [2]

In mathematics, an exponential function is a function of the form

f(x) = b^x

in which the input variable x occurs as an exponent. A function of the form f(x) = b^(x+c), where c is a constant, is also considered an exponential function and can be rewritten as f(x) = a·b^x, with a = b^c.

As functions of a real variable, exponential functions are uniquely characterized by the fact that the growth rate of such a function (i.e., its derivative) is directly proportional to the value of the function. The constant of proportionality of this relationship is the natural logarithm of the base b:

d/dx (b^x) = b^x · ln(b)
2017-09-12_11-54-51

Polynomial

We would use this model if we felt the relationship between Size and Price was more complex and followed more of an S-shaped curve where, initially, increasing the size dramatically increases the price, but at some point the price levels off. You can set the degree of the polynomial model anywhere from two to eight. The trend line shown below is a 3rd-degree polynomial.
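For reference, a 3rd-degree polynomial model takes the form:

Price = b0 + b1 × Size + b2 × Size² + b3 × Size³

Higher-degree terms let the curve bend more, which is what produces the S shape, but they also increase the risk of overfitting, so prefer the lowest degree that fits well.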

2017-09-12_12-01-01

Next: Analyzing Trend Models

Source:

[1] I relied heavily on the fantastic Tableau book written by Joshua Milligan titled Learning Tableau 10, Second Edition (see cover image below). I will be blogging a review of this book in the next few weeks. Click here to purchase your own copy of this book.

Joshua Milligan Learning Tableau 10 Second Edition Book Cover

[2] Wikipedia, Exponential Function, https://en.wikipedia.org/wiki/Exponential_function.

 


Filed under: Joshua Milligan, Predictive Analytics, Statistical Analysis, Tableau, Tableau Customer Conference, Tableau Deep Dive, tableau public, Tableau Secrets, Trends, Uncategorized

Tableau Deep Dive: Trends – Part 3


Readers:

In Part 2 of my deep dive into Trending in Tableau, I went over how to customize trend lines as well as the Trend Model.

In Part 3, I will finish up this deep dive by discussing how to analyze Trend Models.

I hope you enjoy this series about trending in Tableau.

Best regards,

Michael

Analyzing trend models

Joshua Milligan Learning Tableau 10 Second Edition Book Cover

As I mentioned in Parts 1 and 2, much of the context, the dataset, and the Tableau workbook I am using for this blog post come from the book cited as the primary source at the end of this post (see book cover, right).

Observing trend lines can be useful, but often we want to understand if the trend model we’ve selected is statistically meaningful. Fortunately, Tableau gives us some visibility into trend models and calculations.

When you hover over a single trend line, Tableau will reveal the formula as well as the R-Squared and P-Value for that trend line.
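For reference (these are the standard definitions rather than anything Tableau-specific), R-Squared measures the fraction of the variance in the response that the model explains:

R² = 1 − Σ(y_i − ŷ_i)² / Σ(y_i − ȳ)²

where ŷ_i are the model’s predictions and ȳ is the mean of the observed values; values closer to 1 indicate a tighter fit.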

2017-09-12_17-23-54

NOTE

P-value is a statistical concept describing the probability of obtaining results at least as strong as those predicted by the trend model if there were actually no relationship between the values (that is, by random chance alone). A P-value of 5% (.05) would indicate a 5% chance that random chance describes the relationship between values at least as well as the trend model, which is why a P-value of 5% or less is conventionally taken to indicate a significant trend model. If your P-value is higher than 5%, you should not consider the trend to describe the relationship with any statistical significance.
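In symbols, writing H0 for the assumption of no relationship:

P-value = P(results at least as extreme as those observed | H0)

Note that this is a statement about the data under the no-relationship assumption; it is not the probability that the trend model itself is correct.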

Additionally, you can see a much more detailed description of the trend model by navigating to Analysis | Trend Lines | Describe Trend Model… from the menu or by using the similar menu from a right-click on the view’s pane. When you do, you will see the Describe Trend Model window (see screenshot below).

 

2017-09-12_17-31-18

TIP

You can also get a trend model description in the worksheet description, which is available from the Worksheet menu or by pressing Ctrl + E. The worksheet description includes quite a bit of other useful summary information about the current view.

The wealth of statistical information shown in the window includes a description of the trend model, the formula, the number of observations, and the P-value for the model as a whole as well as for each trend line. Notice that, in the window shown above, the Type of Sale field was included as a factor, defining three trend lines. At times, you may observe that the model as a whole is statistically significant even though one or more trend lines may not be.

NOTE

Additional summary statistical information can be displayed in Tableau Desktop for a given view by showing the Summary. From the menu, select Worksheet | Show Summary. The information displayed in the summary can be expanded using the drop-down menu on the Summary card.

2017-09-12_17-40-50

 

2017-09-12_17-43-24

Exporting Trend Model Data

Tableau also gives you the ability to export data, including data related to trend models. This allows you to analyze the trend model itself more deeply, and even visually. Let’s analyze the 3rd-degree polynomial trend line of the real estate price and size scatter plot without any factors. To export data related to the current view, use the menu and select Worksheet | Export | Data. The data will be exported as a Microsoft Access database (.mdb), and you will be prompted for where to save the file.

NOTE

The ability to export data to Access is limited to the PC only. If you are using a Mac, you won’t have this option. In that case, you may wish to read through this section for informational purposes.

On the Export Data to Access screen, specify an Access table name and select whether you wish to export data from the entire view or the current selection (see second screenshot below). You may also specify that Tableau should connect to the data. This will generate the data source and make it available with the specified name in the current workbook.

2017-09-12_17-53-02
2017-09-12_17-54-42

The new data source connection will contain all the fields that were in the original view as well as additional fields related to the trend model. This allows us to build a view such as the following using the residuals and predictions.

2017-09-12_17-59-16

A scatter plot of predictions (X axis) and residuals (Y axis) allows you to visually see how far each mark was from the location predicted by the trend line. It also allows you to see if residuals are distributed evenly on either side of zero. An uneven distribution would likely indicate problems with the trend model.

You can include this new view along with the original in a dashboard to explore the trend model visually. Use the highlight button on the toolbar to highlight by the Address field:

Analyzing trend models

With the highlight action defined, selecting marks in one view will allow you to see them in the other. You could extend this technique to export multiple trend models and dashboards to evaluate several trend models at the same time.

2017-09-12_18-01-45

NOTE

You can achieve even more sophisticated statistical analysis by leveraging Tableau’s ability to integrate with R. R is an open source statistical analysis platform and programming language with which you can define advanced statistical models. R functions can be called from Tableau using special table calculations (all of which start with SCRIPT_). These functions allow you to pass expressions and values to a running R server, which will evaluate the expressions using built-in libraries or custom-written R scripts and return results to Tableau. You can learn more about Tableau and R integration from this whitepaper (you will need to register a free account first): http://www.tableau.com/learn/whitepapers/using-r-and-tableau.
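As a minimal sketch of what such a table calculation looks like (assuming an Rserve connection has already been configured in Tableau’s external service settings, and reusing the Price and Size (Sq Ft) fields from the earlier real estate example), the following calculated field asks R to compute the Pearson correlation between the two measures:

// .arg1 and .arg2 are placeholders for the aggregated vectors
// Tableau passes to the running R server.
SCRIPT_REAL("cor(.arg1, .arg2)", SUM([Price]), SUM([Size (Sq Ft)]))

Because SCRIPT_ functions are table calculations, the result is evaluated over the marks in the view according to the table calculation’s partitioning, not row by row in the underlying data source.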

Source:

[1] I relied heavily on the fantastic Tableau book written by Joshua Milligan titled Learning Tableau 10, Second Edition (see cover image below). I will be blogging a review of this book in the next few weeks. Click here to purchase your own copy of this book.

Joshua Milligan Learning Tableau 10 Second Edition Book Cover

 

 


Filed under: Joshua Milligan, Predictive Analytics, Statistical Analysis, Tableau, Tableau Customer Conference, Tableau Deep Dive, tableau public, Tableau Secrets, Trends, Uncategorized

Tableau Deep Dive: Creating a Forecast


First, Either a Joke or Quote

He uses statistics as a drunken man uses lamp posts — for support rather than illumination.

Andrew Lang (1844-1912)

Source: Treasury of Humorous Quotations

What is Forecasting?

Forecasting is a planning tool that helps management in its attempts to cope with the uncertainty of the future, relying mainly on data from the past and present and analysis of trends. [1]

Forecasting starts with certain assumptions based on management’s experience, knowledge, and judgment. These estimates are projected into the coming months or years using one or more techniques such as Box-Jenkins models, the Delphi method, exponential smoothing, moving averages, regression analysis, and trend projection. Since any error in the assumptions will result in a similar or magnified error in the forecast, the technique of sensitivity analysis is used, which assigns a range of values to the uncertain factors (variables).

Forecasting

Types of Forecasting Methods

There are two main approaches to forecasting – the Qualitative Method and the Quantitative Method. The image below shows two approaches for forecasting demand.

Forecasting-methods

Qualitative Methods: these are subjective and are based on the judgment and opinion of experts or consumers. When no past data is available, qualitative methods are used for making medium-to-long-range decisions. Market research is a type of qualitative forecasting method.

Quantitative Methods: in these, future data is forecast as a function of past data. These methods are appropriate when we have past numerical data, and when we can reasonably assume that some of the data patterns are likely to continue in the future. Quantitative methods are generally used for making short-term and medium-term decisions.

Average Method: forecasts of all future values equal the mean of the historical data. This method is appropriate for any type of data where past data is available. If we let the historical data be denoted by y_1, …, y_T, and write ŷ_(T+h|T) for the forecast of period T + h made using data through period T, then the forecasts are:

ŷ_(T+h|T) = ȳ = (y_1 + y_2 + ⋯ + y_T) / T

Even though time-series notation has been used here, it is also possible to use the average approach for cross-sectional data; the forecast for unobserved values is then the average of the observed values. (Formula: otexts.org)

Naïve approach: said to be the most cost-effective prediction model, it provides a benchmark against which other, more sophisticated models may be compared. This approach is only appropriate for time-series data. With the naïve approach, the forecasts are equal to the last observed value.
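In the same notation, the naïve forecast is simply

ŷ_(T+h|T) = y_T

so every future period is forecast to equal the most recently observed value.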

Drift Approach: this is a variation on the naïve approach. It allows forecasts to increase or decrease over time, where the drift (the amount of change over time) is set to the average change seen in the historical data. Hence, the forecast for period T + h is given by:

ŷ_(T+h|T) = y_T + h × (y_T − y_1) / (T − 1)

This is equivalent to drawing a line between the first observation and the last, and extrapolating it into the future. (Formula: Wikipedia)

Seasonal Naïve Method: accounts for seasonality by setting each forecast to be equal to the last observed value from the same season. For example, the prediction for every future month of May will equal the last observed May value. The forecast for period T + h is:

ŷ_(T+h|T) = y_(T+h−mk)

where m is the seasonal period and k is the smallest integer greater than (h − 1)/m. (Formula: Wikipedia)

The seasonal naïve approach is especially useful for data that has a particularly high level of seasonality.

Creating a Forecast in Tableau

Forecasting requires a view that uses at least one date dimension and one measure. For example:

  • The field you want to forecast is on the Rows shelf and a continuous date field is on the Columns shelf.
  • The field you want to forecast is on the Columns shelf and a continuous date field is on the Rows shelf.
  • The field you want to forecast is on either the Rows or Columns shelf, and discrete dates are on either the Rows or Columns shelf. At least one of the included date levels must be Year.
  • The field you want to forecast is on the Marks card, and a continuous date or discrete date set is on Rows, Columns, or Marks.

Note: You can also create a forecast when no date dimension is present if there is a dimension in the view that has integer values. See Forecasting When No Date is in the View.

To turn forecasting on, either right-click (Control-click on Mac) on the visualization and choose Forecast > Show Forecast, or choose Analysis > Forecast > Show Forecast.

With forecasting on, Tableau visualizes estimated future values of the measure in addition to actual historical values. The estimated values are shown by default in a lighter shade of the color used for the historical data:

Example 1

Prediction Intervals

The shaded area in the screenshot above shows the 95% prediction interval for the forecast. That is, the model has determined that there is a 95% likelihood that the value of sales will be within the shaded area for the forecast period. You can configure the confidence level percentile for the prediction bands, and whether prediction bands are included in the forecast, using the Show prediction intervals setting in the Forecast Options dialog box (see screenshot below).

Show Projection Intervals - Drop Down Workflow

If you do not want to display prediction bands in forecasts, clear the check box. To set the prediction interval, select one of the values or enter a custom value. The lower the percentile you set for the confidence level, the narrower the prediction bands will be.
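For intuition (this sketch assumes the usual normal-error approximation; the dialog box does not state the model’s error distribution), a two-sided prediction interval takes the form

forecast ± z × σ̂

where σ̂ is the estimated forecast standard error and z grows with the confidence level: roughly 1.28 for 80%, 1.645 for 90%, 1.96 for 95%, and 2.58 for 99%. This is why lowering the confidence level narrows the bands.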

How your prediction intervals are displayed depends on the mark type of your forecasted marks:

Forecast mark type | Prediction intervals displayed using
Line | Bands
Shape, square, circle, bar, or pie | Whiskers

In the following example, forecast data is indicated by orange shaded circles, and the prediction intervals are indicated by lines ending in whiskers.

Example 2

NOTE: Whiskers are typically seen in a box-and-whisker plot: a box is drawn around the quartile values, and the whiskers extend from each quartile to the extreme data points.

FYI, the box-and-whisker plot is good at showing the extreme values and the range of the middle values of your data. The box shows the middle values of a variable, while the whiskers stretch to the greatest and lowest values of that variable. The box-and-whisker plot was invented in the 1970s by John Tukey. [4]

Enhancing Forecasts

For each forecast value, consider verifying the quality or precision of your forecast by dragging another instance of the forecast measure from the Data pane to the Detail shelf on the Marks card and then, after right-clicking the field to open the context menu, choosing one of the available options. In my example below, I selected Precision %.

Percision Percent

Forecast Field Results Descriptions

Tableau provides several types of forecast results. To view these result types in the view, right-click (control-click on Mac) on the measure field, choose Forecast Result, and then choose one of the options.

The options are:

  • Actual & Forecast – Show the actual data extended by forecasted data.
  • Trend – Show the forecast value with the seasonal component removed.
  • Precision – Show the prediction interval distance from the forecast value for the configured confidence level.
  • Precision % – Show precision as a percentage of the forecast value.
  • Quality – Show the quality of the forecast, on a scale of 0 (worst) to 100 (best). This metric is scaled MASE, based on the MASE (Mean Absolute Scaled Error) of the forecast, which is the ratio of the forecast error to the error of a naïve forecast that assumes the value of the next period will be the same as the value of the current period. The actual equation used for quality is:

    Quality = 100 × MAX(1 − MASE, 0)

    The Quality for a naïve forecast (MASE = 1) would therefore be 0. The advantage of the MASE metric over the more common MAPE is that MASE is defined for time series that contain zeros, whereas MAPE is not. In addition, MASE weights errors equally, while MAPE weights positive and/or extreme errors more heavily.

  • Upper Prediction Interval – Shows the value below which the true future value should lie, at the configured confidence level, assuming a high-quality model. The confidence level percentage is controlled by the Prediction Interval setting in the Forecast Options dialog box.
  • Lower Prediction Interval – Shows the corresponding value below the forecast for the 90, 95, or 99 percent confidence level. The actual interval is controlled by the Prediction Interval setting in the Forecast Options dialog box.
  • Indicator – Show the string Actual for rows that were already on the worksheet when forecasting was inactive and Estimate for rows that were added when forecasting was activated.
  • None – Do not show forecast data for this measure.

BTW, you can repeat the process to add additional result types for each forecast value.

By adding such result types to the Detail shelf, you add information about the forecast to the tooltips for all marks that are based on forecasted data.

In my final example below, I added a bit more formatting and show how the tooltip will look in the final data visualization.

Final Example

Sources:

[1] businessdictionary.com, Forecasting, BusinessDictionary.com, http://www.businessdictionary.com/definition/forecasting.html.

[2] Market Business News, What is forecasting? Definition and meaning, Market Business News, http://marketbusinessnews.com/financial-glossary/forecasting-definition-meaning/.

[3] Tableau Software, Forecasting, Tableau Software Help, Breadcrumbs: All Tableau Help > Tableau Help > Design Views and Analyze Data > Work with Time > Forecasting, http://onlinehelp.tableau.com/current/pro/desktop/en-us/forecasting.html.

[4] Wilson, James W., Box and Whisker Plot, University of Georgia, Mathematics Education Department, Intermath Dictionary, http://intermath.coe.uga.edu/dictnary/descript.asp?termID=57.

 

 

 


Filed under: Box & Whisker Plots, Data Visualization, Forecast, Infographics, Interactive Data Visualization, John Tukey, Statistical Analysis, Tableau, Tableau Customer Conference, Tableau Deep Dive, tableau public, Tableau Secrets, Trends, Uncategorized

Free Tableau Desktop Boot Camp Notes (PDF)


Readers:

I put my Tableau Boot Camp notes in Dropbox for the Tableau community just now. They cover topics ranging from creating extracts and Tableau architecture to an introduction to dimensional modeling, a quick overview of SQL joins, Tableau tips and tricks, and more.

Here is the link.

 

https://www.dropbox.com/s/ul4rfsayjjrp5is/Tableau%20Training%20Additional%20Reference%20Material%202017-10-05.pdf?dl=0

Enjoy!

Michael

Boot Camp Cover Page


Filed under: Data Visualization, Infographics, Tableau, Tableau Customer Conference, Tableau Deep Dive, tableau public, Tableau Secrets, Uncategorized

Tableau Conference 2017 – Las Vegas – Data Myths, Honeywell Data Compliance Best Practices


IMG_4515

Readers:

Last week I attended the Tableau Conference 2017 in Las Vegas.

Since many people are blogging about the conference (14,000+ attended), I decided to just focus on the four data myths discussed by Adam Selipsky, CEO of Tableau Software, and the great data governance practices overview presented by Sheri Benzelock, VP of Business Analytics Transformation for Honeywell.

Best regards,

Michael

Moment of Silence

Elissa Fink, Tableau’s Chief Marketing Officer, started the keynote by having the audience join the Tableau Executive Team in a moment of silence in honor and memory of the victims of the Las Vegas Shooting that occurred the week before.

The best way to help a community in need is to show up. — Elissa Fink

Moment_of_Silence

Data Myths

Adam Selipsky, CEO of Tableau Software, started off his keynote with a very timely and prophetic statement.

Data leads to truth. Without truth, the vacuum is filled with myth. — Adam Selipsky

Data Myths

Adam said there are four key myths about the world of analytics and data.

Data Myth 1

Myth #1 – AI will replace the analyst

Some have seen AI as nothing more than “parlor tricks.” Many people assume AI will magically scan your data and make the best decisions for you. This is false. AI works on assumptions people have encoded or on patterns learned from behavior, and both of these can carry racial and gender bias. Statements made by technology companies relying on the results of AI have led to a new field known as Ethical AI.

In actuality, AI will assist and not replace the analyst. We will still use human intuition and creativity to answer the important questions.

Stanford AI Quote

Data Myth 2

Myth #2 – Data is only for the analysts

The growth of the amount of data in the world is exploding. IDC predicts that by the year 2025, the amount of data we have will increase twenty-fold. Not only will there be more data, it will become even more valuable.

The world’s most valuable resource is no longer oil, but data. –The Economist, May 6, 2017.

The number of analytics and data programs being taught in colleges and universities has doubled in the past five years. Companies cannot find workers fast enough. There are now 800 million knowledge workers. Excel used to be taught as a university class; now it is taught to children in grade school. Tableau’s goal is the same: to introduce Tableau to children in grade school today.

Data is not just for analysts. It is for everyone.

Data Skills are Hard to Find

Data Myth 3

Myth #3 – Data Governance Means No

Old conventional wisdom was that we needed to slow down the process of providing data. Data is valuable, so it needs to be protected. We needed to ensure it was safe, to the point where we impeded business partners from having data when they needed it. IT’s ability to deliver data became a bottleneck.

True governance means secure enablement!

Enabling the data unlocks curiosity, insights and agility. When you let the business partners have the data they need, shadow IT dies on its own.

No Data For You

Adam next introduced Sheri Benzelock, VP of Business Analytics Transformation for Honeywell.

Honeywell Data Governance Practices

Sheri Benzelock

Sheri mentioned that IT is often afraid that Tableau will introduce the “Wild, Wild West of Data” in their companies. Business Partners will create their own data sets, share them, send them attached to e-mails, etc. Well, surprise! This is already happening using Excel.

I Reckon

Sheri said they focused on two key areas of governance:

What is the Truth?

and

Who Gets to See this Truth?

The first thing Honeywell did was implement a Tableau license process where, when you acquire a Tableau license, you need to post a visualization on their Tableau server. It does not matter what kind of viz you create; the point is for you to know there is a Tableau Server and the business partners understand it is the most appropriate way to share your data visualizations. This helps reduce the risk of inappropriate sharing.

The second thing they did was to implement a sandbox and certified data sites for each of their businesses. This allows the businesses to play, but sharing has to go through a certification process first.

Here are the rules:
First, they have to post the data on a certified data site.
Second, it has to comply with Honeywell Data and Security Standards.
Third, it has to be SOX-compliant (if applicable).
Fourth, it has to be fit for purpose.

This helps reduce risk of ungoverned and unvetted data.

The third thing they did was to promote having dedicated data artisans in each area. These people are responsible for the data in their area, gathering and cleaning, and then publishing it.

This is a huge productivity enabler for Honeywell and saves them millions of dollars.

The last point she made was that the Tableau Server helps them see into their data economy. Who published the data, who is using the data, when was it last used, how popular is that data, etc.? For example, if you have an Excel data file that is being used continuously by 500+ users, then this is a data source you probably want to stand up to be populated and refreshed more systematically.

Their Tableau data deployment went viral. In less than two years, they have 20,000 users.

These data and governance standards helped Honeywell strike the right balance of empowering the business, enabling better visibility into their data, and instilling trust and governance in their data.

Honeywell considers this governance myth busted!

Data Myth 4

Myth #4 – There can be one, perfect source of the truth

Everyone talks about the Single Source of the Truth database. However, innovation occurs so rapidly today that it is hard to predict the different data sources required and what combinations of them you will need. Fifteen years ago, no one had heard of having your data in the Cloud. Ten years ago, no one had heard of NoSQL. Who knew five years ago that the Internet of Things (IoT) would gain so much traction so quickly?

We now live in a world of many sources of truth.

As data people, we need to embrace that we may require many different data sources to answer the questions our business partners have.

Ask yourself this question: Can you integrate rapidly with all new data sources you need to answer the questions your business partners have?

If the answer is “No,” then maybe it’s time you take a strong look at Tableau.

Myriad of Choices

Additional Images With Meme Expressions Copyright © October 15, 2017, Michael S. Sandberg.


Filed under: Adam Selipsky, Best Practices, Data Blending, Data Cleansing, Data Scientist, Data Visualization, Database, Dataviz, Elissa Fink, Francois Ajenstat, Myths, Sheri Benzelock, Tableau, Tableau Customer Conference, Tableau Deep Dive, tableau public, Tableau Secrets, Uncategorized

2017 KANTAR INFORMATION IS BEAUTIFUL AWARDS ANNOUNCE GROUNDBREAKING SHORTLIST


Readers:

Kantar Logo

The shortlist for the Kantar Information is Beautiful Awards 2017 was announced on Thursday, 19 October.

Celebrating global excellence in data visualization, infographics and information design, the Awards give $20,000 across eight new subject-based categories in 2017.

From gender and cyberbullying to global issues of terror, migration and climate change, graphics cut through accusations of ‘fake news’ to tell stories about what matters to people right now, whether it’s the music of David Bowie and how hip hop is turning on Donald Trump, or where to get a good meal in New York.

Data visualizers are expanding the definition of infographics, moving beyond pencil and paper to creating innovative projects including a font made of data and physical data objects made from materials including clay pottery and cigarettes.

The awards also highlight new and exciting markets for data visualization in the non-English speaking world, including two shortlistees from India and a record number from China.

Over the next week, I will be showcasing several entries on the shortlist that I particularly liked. You can also review all of the shortlist entries at http://www.informationisbeautifulawards.com.

Below is the first entry I am showcasing this week.

Best regards,

Michael

On Their Way: the Journey of Foreign Fighters by DensityDesign Lab

In the context of understanding the complex phenomenon of violent religious radicalization, this map details the journey of ISIS’ foreign fighters to the territories of the Caliphate, as well as of those who return. Starting from publicly available data, additional layers of information show how this phenomenon relates to the distance of each country from the destination, its total population and Islamic population.

The artwork was published in “La Lettura”, the cultural supplement of “Corriere della Sera”.

The creator of this work has supplied multiple images, please click here to view.

Credits

Serena Del Nero, Marco Mezzadra, Claudia Pazzaglia, Alessandro Riva, Alessandro Zotta

Award

isis1

isis2

isis3

isis4

isis5

isis6


Filed under: Data Visualization, David McCandless, Infographics, Information is Beautiful, Information is Beautiful Awards, ISIS, KANTAR, Tableau, tableau public, Uncategorized

Kantar Information is Beautiful Shortlist – Fenced Out by The Washington Post


Readers:

Kantar Logo

I am continuing my week-long review of the shortlist entries for the Kantar Information is Beautiful Awards 2017.

This entry was very powerful, engaging and chronicles the plight of Syrian refugees looking for new places to live in Europe, only to be “fenced out.”

Below is more detail on this entry.

Best regards,

Michael

Fenced Out by The Washington Post

Until the upheaval of 2015, Europe was home to the world’s most open frontiers. But within months, a messy effort to halt a mass flow of migrants fleeing wars in Syria, Iraq and Afghanistan cascaded into the construction of more border fences than anywhere else on the globe. For the most part, the once-open door to Europe has closed.

Credits

 

Categories

Story Link

www.washingtonpost.com

Fenced Out Snippet

 

2017-10-23_12-05-19

2017-10-23_12-05-57

2017-10-23_12-06-51

2017-10-23_12-07-16

Fenced Out - The Washington Post

2017-10-23_12-10-16

2017-10-23_12-11-11


Filed under: Data Visualization, David McCandless, Infographics, Information is Beautiful, Information is Beautiful Awards, Interactive Data Visualization, KANTAR, Political DataViz, Politics, Syrian Refugees, Tableau, Uncategorized, Washington Post (The)

Kantar Information is Beautiful Shortlist – Triple Play Art by Tresta Inc


Kantar Logo

Readers:

I am continuing my week-long review of the shortlist entries for the Kantar Information is Beautiful Awards 2017.

Below is more detail on this entry. I also added some context about what a Triple Play is after the dataviz screenshots.

Best regards,

Michael

Triple Play Art by Tresta Inc

A visualization of all 711 Triple Plays in MLB history, looking at the frequency of Triple Plays by position and sequence order.

Credits

Categories
Data Visualization Link
Triple Play Art - 1
Triple Play Art - 5
Triple Play Art - Fun Facts
Triple Play Art - How To Read

Adding Context to the Concept of the Triple Play

Baseball_Positions

To help those not familiar with baseball or the concept of a triple play, here is a brief discussion about what constitutes a triple play.

In baseball, a triple play (denoted as TP in baseball statistics) is the rare act of making three outs during the same continuous play.

Triple plays happen infrequently – there have been 716 triple plays in Major League Baseball (MLB) since 1876, an average of approximately five per season – because they depend on a combination of two elements, which are themselves uncommon:

First, there must be at least two baserunners, and no outs. From analysis of all MLB games 2011–2013, only 1.51% of at bats occur in such a scenario. By comparison, 27.06% of at bats occur with at least one baserunner and less than two outs, the scenario where a double play is possible.

Second, activity must occur during the play that enables the defense to make three outs. Common events – such as the batter striking out, or hitting a fly ball – do not normally provide opportunity for a triple play. A ball hit sharply and directly to an infielder, who then takes very quick action – or unusual action, confusion, or mistakes by the baserunners – is usually needed.

Examples of the Triple Play

The most likely scenario for a triple play is no outs with runners on first base and second base, which has been the case for the majority of MLB triple plays. In that context, two example triple plays are:

5-4-3 triple play

The batter hits a ground ball to the third baseman, who steps on third base to force out the runner coming from second (first out). The third baseman throws to the second baseman, who steps on second base to force out the runner coming from first (second out). The second baseman throws to the first baseman, with the throw arriving in time to force out the batter (third out). This is an example of grounding into a 5-4-3 triple play, per standard baseball positions.

During the 1973 season, Baltimore Orioles third baseman Brooks Robinson started two such 5-4-3 triple plays; one on July 7 against the Oakland Athletics, and one on September 20 against the Detroit Tigers.

On July 17, 1990, the Minnesota Twins became the first (and to date, the only) team in MLB history to turn two triple plays in the same game. Both were 5-4-3 triple plays, executed by fielders Gary Gaetti, Al Newman, and Kent Hrbek in a game against the Boston Red Sox.

Brooks Robinson is the all-time MLB leader for grounding into triple plays, with four in his career.

4-6-3 triple play

The baserunners start running in an attempt to steal or execute a hit and run play, and the batter hits a line drive to the second baseman, who catches it (first out). The second baseman throws to the shortstop, who steps on second base before the runner who started there can tag up (second out). The shortstop throws to the first baseman, who steps on first base before the runner who started there can tag up (third out). This is an example of lining out into a 4-6-3 triple play.

Most Recent MLB Triple Play

The most recent triple play in MLB was turned by the Detroit Tigers on September 8, 2017, against the Toronto Blue Jays in the bottom of the sixth inning. With runners on first and second, Kevin Pillar hit a sharp grounder that was fielded by Jeimer Candelario, who stepped on third (one out). Candelario threw the ball to second baseman Ian Kinsler (two outs), who then threw to first baseman Efren Navarro (three outs), completing the 5-4-3 triple play.

 

Source: Wikipedia, Triple Play (Baseball).


Filed under: Baseball, Data Visualization, David McCandless, Infographics, KANTAR, Sports, Sports DataViz, Tableau, tableau public, Triple Play, Uncategorized

Kantar Information is Beautiful 2017 Shortlist – AtF Spark – Code-free Sparkline Typeface by After the Flood


Readers:

I am continuing my week-long review of the shortlist entries for the Kantar Information is Beautiful Awards 2017.

Below is more detail on this entry. I also added some commentary on some tests I ran using these fonts in a Microsoft Word 2016 document.

Best regards,

Michael

AtF Spark – Code-free Sparkline Typeface by After the Flood

AtF Spark is a font that allows for the combination of text and visual data to show an idea and evidence in one headline. Sparklines are currently available as plugins or JavaScript elements. By installing the AtF Spark font, you can use them immediately without the need for custom code, and in any application or browser that supports OpenType. AtF Spark can be downloaded and used for free from After the Flood’s website, with further updates planned in future releases.

The creator of this work has supplied multiple images, please click here to view.

Credits

  • Max Gadney, Director & Founder
  • Mike Gallagher, Design Director
  • Sabih Ali, Commercial Director
Award
Categories
Web Site Link
2301d
2301b
2301e

Caveats

I really like AtF Spark and feel it will be a really helpful tool in the near future. However, I ran some tests to include it in a Microsoft Word 2016 document I am preparing and had a few issues. I don’t like to offer criticism while a product is in an awards contest, but for full transparency, I am including my tests below. This should not take away from the great font set created by After the Flood. I think they have a great idea and I look forward to its updates in the future.

AtF Spark Test


Filed under: After the Flood (AtF), Dataviz, David McCandless, Fonts, Infographics, Information is Beautiful, Information is Beautiful Awards, KANTAR, Sparklines, Tableau, Tableau Customer Conference, tableau public, Uncategorized

Kantar Information is Beautiful 2017 Shortlist – Cats without a home by Russell Spangler


 

 

Cats without a home by Russell Spangler

These beautiful cats are threatened by growing human populations, loss of habitat, illegal hunting (of both tigers and their prey species) and expanded trade in tiger parts used as traditional medicines.

Credits

 

Cats Without a Home

 

2017-10-29_11-12-30

 

2017-10-29_11-11-25

 

2017-10-29_11-10-35


Filed under: Animal Rights, Animals, Humanitarian, Information is Beautiful, Information is Beautiful Awards, KANTAR, Tableau, Tableau Customer Conference, tableau public, Uncategorized

Tableau Secrets: 10 tips for Viz in Tooltips (Jeffrey A. Shaffer)


Readers:

For this Tableau Secrets blog post, someone else has already done the heavy lifting, so I am going to refer you to his web site, Data + Science.

Back in July 2014, Jeffrey Shaffer wrote a blog post, Tooltip Canvas in Tableau – A Mockup for Chart Functionality within Tooltips, based on an idea he added to the Tableau Ideas forum back in April 2013. The basic idea was adding the capability of having a visualization inside of a tooltip. This functionality was demonstrated during the Devs on Stage session at the 2015 Tableau Conference. After that conference, Jeffrey watched beta after beta to see if this functionality had been added. The functionality was mentioned again at the 2016 Tableau Conference, but it wasn’t until earlier this year that he saw the feature in an alpha release. He was able to work with that alpha release and try out different features, and he knew that the beta release wouldn’t be too far behind.

Well, yesterday was the day. Viz in Tooltips was released as part of Tableau 10.5.

Before I provide you the link to Mr. Shaffer’s blog post, let me provide you a little information about him.

Jeffrey A. Shaffer Headshot

Jeffrey Shaffer is Vice President of Information Technology and Analytics at Recovery Decision Science and Unifund. Mr. Shaffer joined Unifund in 1996 and has been instrumental in the creation and development of the complex systems, analytics and business intelligence platform at Unifund. Mr. Shaffer holds a BM and MM degree from the University of Cincinnati and an MBA from Xavier University where he was the winner of the 2006 Graduate Student Scholarly Project in Research. Mr. Shaffer has attended the Harvard Business School’s Executive Education Program, is a Certified Manager of Quality and Organizational Excellence through the American Society for Quality, a Certified Project Management Professional through the Project Management Institute and has completed Six Sigma Green Belt and Black Belt training with the Xavier Consulting Group.

Mr. Shaffer is also Adjunct Professor at the University of Cincinnati in the Carl H. Lindner College of Business teaching Data Visualization where he was awarded the 2016 Adjunct Faculty of the Year Award for Operations, Business Analytics and Information Systems. He is a regular speaker at conferences, symposiums, universities and corporate training programs on the topic of data visualization, data mining and Tableau. Mr. Shaffer has taught data visualization at the KPMG Advisory University, KPMG Global Analytics and for the University of Cincinnati Center for Business Analytics. He was a finalist in the 2011 Tableau Interactive Visualization Competition, one of the Elite 8 in the 2014 Tableau Sports Visualization Contest, the winner of the 2014 Tableau Quantified Self Visualization Contest and competed in the 2014 Tableau Iron Viz Contest. He was selected as one of twenty-one Tableau Zen Masters in the world for 2015-2016, was selected as a Tableau Social Media Ambassador 2015-2017 and is a founder and leader of the Cincinnati Tableau User Group.

O.K., here is the link to Jeffrey’s post.

Tableau Tips – 10 tips for Viz in Tooltips (Now Available in Tableau 10.5)

Viz in Tooltips

Source: Shaffer, Jeffrey A., Tableau Tips – 10 tips for Viz in Tooltips (Now Available in Tableau 10.5), Data Plus Science, LLC, October 17, 2017, https://www.dataplusscience.com/TableauTips12.html.

Tableau Community Spotlight: An Interview With Brittany Fong


Readers:

Brittany Fong Headshot

Today, I am featuring an interview with Brittany Fong.

Brittany is a leading voice in the data visualization community and has been a Tableau power user since 2012. She is the organizer of the DC Tableau User Group, a co-organizer of the DC Data + Women group, and a Tableau User Group Ambassador (designation given by Tableau).

Brittany has taught numerous Tableau and data visualization best practice courses and workshops at universities and organizations in the United States. Brittany is located in Severn, MD. She enjoys developing creative, informative and functional visualizations to overcome business challenges.

When she’s not “Tableau-ing”, Brittany is coaching gymnastics, taking pictures, working on her car, or exploring the outdoors with her dog Maggie!

I hope you enjoy this interview and Brittany’s work as much as I enjoyed discussing it with her.

Best Regards,

Michael

An Interview With Brittany Fong

Michael: Hi Brittany, it is very nice to finally meet you. If you don’t mind I am going to dive right into the questions.

DC Tableau User Group

Michael: You organized and founded the Washington DC Tableau User Group. Can you tell us a bit about what kinds of topics you discuss at your meetings, how often you meet, and how people can contact you to attend your meetings?

Brittany: The DC user group up till this year has been pretty standard.  About 100-120 of us meet about every other month on a weekday afternoon (during work hours).  Generally we have 3 to 4 presentations with questions.  The presentations range from user stories, implementing Tableau, use case examples, Tableau tips and tricks, and more.  This year, I’m going to start trying some non-traditional meetings to see how it goes!

Data Plus Women

Michael: You are also one of the co-organizers of the DC Data + Women group. Can you tell us a little bit about the group and what kinds of services you offer women who work with data?

Brittany: This has been a really great group (Emily Kund, Erin Simpler-Kellett, and Julie Kim) to work with!  Our data + women group really focuses on soft skills.  How to market yourself, how to set goals, and how to achieve your goals.  We have had a few meetings where we focused on technical skills with some lightning talks.  Recently, we’ve partnered up with the DC She Talks Data group (Alexis Smith).

Additionally, we have had a workshop for high school aged girls to come and learn about data analytics and data visualization.  We received really great feedback on this workshop, and we’re planning on conducting it again a few times this year.

Michael: O.K., I am going to put you on the spot here. Tell us three of your favorite Tableau tips that you consider priceless.

Brittany:

Copying formatting!  This is a HUGE time saver.  Formatting takes so many clicks, and being able to setup the formatting on one sheet and pasting it over is really helpful!

Custom color palettes!  Especially when you’re working with companies who want their dashboards to be brand compliant, the custom palettes make life so much easier.  I don’t have to memorize HEX codes and then type them in every time.

Parameters!  I couldn’t use Tableau if it weren’t for parameters.  They can do so much, and add so much flexibility to your dashboards.  No more making the same dashboard over and over again, just so you can get a different dimension or measure.

BFongData

Michael: You recently ventured out on your own with BFONGDATA. Can you tell us a little bit about your company, the services you offer, etc.?

Brittany: I’ve been freelancing part-time for a while now and I decided at the end of 2016 that it was time to make the jump and go full-time.  As a full-time data visualization and UX designer, I really want to create dashboards that are creative and functional.  The fun part about my work is listening to the business needs that are voiced and silent, then developing something that is work-changing.  Data that the business never used to be able to see is now at anyone’s fingertips.  It’s really fun to see the enthusiasm of clients when they can see and use their data to make relevant decisions.

Michael: As a coach, how do you approach helping a person brand new to Tableau?

Brittany: I wrote a whole blog post on this, “I don’t know what to do, I’m new!”, since I was getting this question from so many people.  I really stress that you need to work with fun data (not work data, or data you’re really familiar with).  When you already know the data you have a preconceived idea of how it should be displayed.  Once you have data, watch the tutorial videos on Tableau’s website to learn the basics of the tool.  Lastly, jot down/sketch out some graph ideas and try to make them.  Don’t be afraid to mess up or click something wrong, just click around and learn the tool naturally.

Michael: Thank you, Brittany. I really appreciate your insights and look forward to seeing you again at the Tableau Customer Conference 2018 in New Orleans.

 

Brittany Fong Tableau Public
