Last time we took a quick pass through the basics of good data visualisation and then finished with a simple demonstration that showed a bit of what R can do with charts. This post will show how I created the chart in R. By the time we’re done you should have an overview of how graphics work in R and how to turn a basic chart into a presentation-quality graphic. Let’s begin with the data. Like Excel, R builds a chart from a data table. In Excel that means arranging data in a block of cells that has a header row to indicate variable names and a row for each data item. The R equivalent is a data frame,[…]

Analysis and persuasion. A large part of my day-to-day work involves visualising data for either or both of these purposes. I consolidate data into charts and tables, sometimes to help me see patterns and sometimes to help others see what the data has to say. Excel has been an important tool along the way, but tools on their own aren’t enough. Without due care, good tools can pave a path to a bad chart like this one, which fails to inform. Making charts that deliver a clear message about the story in the data, and that do so in the 8 seconds available to get a reader’s attention, is not something most of us are naturally good at. And it’s[…]

A useful but often overlooked Excel feature is the Analysis ToolPak. It’s useful because it packages about 20 commonly used statistical functions in a format that is very easy to use. It gets overlooked because it has to be activated through the Excel Options dialog before it even becomes visible. If the Analysis ToolPak is new to you, you will find it in the Data tab of the Excel ribbon. If you already use it and you are in the process of extending your analysis repertoire to include R, you may just be looking for a guide that shows how to do ToolPak tasks in R. Either way, this post fills a gap with a quick mapping from Excel to[…]

Excel is such a handy tool for data discovery and analysis that it’s fair to ask, “Why bother with anything else, especially an arcane scripting environment like R?” The truth is that the transition from Excel to R is very much a green eggs and ham experience: decidedly unappealing at the outset, even for adventurous folk, but rewarding in the long run. This post takes another look at the Titanic data set, this time using R to do the same analysis done last time in Excel. As before, the starting point is making readable text from the raw, coded data shown above on the left. R provides a substitution function that simply specifies the filter conditions for the variable[…]

If data is a window into a problem, then one of the main goals of data science is getting people to look through the window: Do you see patterns? Are the apparent patterns real? Do they help you understand the situation? Can you use what you see to predict beyond the horizon? So, how well do everyday desktop tools like Excel fare in helping to provide answers? Teachers of introductory statistics courses (like me) are fond of starting the discovery process as a game. The class is presented with the data set shown here, giving the basic facts of an unspecified risk event involving a large loss of life. The group is invited to ask questions about[…]