All models are wrong, but some are useful. (George Box) …and some are measurably better than others. (Douglas Hubbard). Such is the premise of the book How to Measure Anything in Cybersecurity Risk, in which Douglas Hubbard and Richard Seiersen take a critical look at conventional methods of assessing cybersecurity risk and offer an alternative. A continuation of Hubbard’s series on business statistics and quantitative decision analysis, this book dives deep into the problem of how to inform business decisions in complex situations when data is scarce. While business statistics may not be everyone’s favourite topic, the book is a remarkably engaging overview, and it can equally serve as a desk reference for anyone whose work involves helping organizations make informed[…]
Last time, we took a quick pass through the basics of good data visualisation and then finished with a simple demonstration that showed a bit of what R can do with charts. This post will show how I created the chart in R. By the time we’re done, you should have an overview of how graphics work in R and how to turn a basic chart into a presentation-quality graphic. Let’s begin with the data. Like Excel, R builds a chart from a data table. In Excel that means arranging data in a block of cells that has a header row to indicate variable names and a row for each data item. The R equivalent is a data frame,[…]
Analysis and persuasion. A large part of my day-to-day work involves visualising data for either or both of these purposes. I consolidate data into charts and tables, sometimes to help me see patterns and sometimes to help others see what the data has to say. Excel has been an important tool along the way, but tools on their own aren’t enough. Without due care, good tools can pave a path to a bad chart like this one, which fails to inform. Making charts that deliver a clear message about the story in the data, and that do so in the 8 seconds available to get a reader’s attention, is not something most of us are naturally good at. And it’s[…]
A useful but often overlooked Excel feature is the Analysis ToolPak. It’s useful because it packages about 20 commonly used statistical functions in a format that is very easy to use. It gets overlooked because it has to be activated through the Excel Options dialog before it even becomes visible. If the Analysis ToolPak is new to you, you will find it in the Data segment of the Excel Tool Ribbon. If you already use it and you are in the process of extending your analysis repertoire to include R, you may just be looking for a guide that shows how to do ToolPak tasks in R. Either way, this post fills a gap with a quick mapping from Excel to[…]
Excel is such a handy tool for data discovery and analysis that it’s fair to ask, “Why bother with anything else, especially an arcane scripting environment like R?” The truth is that the transition from Excel to R is very much a green eggs and ham experience: decidedly unappealing at the outset, even for adventurous folk, but rewarding in the long run. This post takes another look at the Titanic data set, this time using R to do the same analysis done last time in Excel. As before, the starting point is making readable text from the raw, coded data shown above on the left. R provides a substitution function that simply specifies the filter conditions for the variable[…]
If data is a window into a problem, then one of the main goals of data science is getting people to look through the window: Do you see patterns? Are the apparent patterns real? Do they help you understand the situation? Can you use what you see to predict beyond the horizon? So, how well do everyday desktop tools like Excel fare in helping to provide answers? Teachers of introductory statistics courses (like me) are fond of starting off the discovery process with a game. The class is presented with the data set shown here, giving the basic facts of an unspecified risk event involving a large loss of life. The group is invited to ask questions about[…]
The client wanted 30 minutes. More precisely, they needed to eliminate a 30-minute delay between completing a finished product test in a manufacturing quality control lab and communicating the result back to the operators on the production line. By making data more available to inform decisions about process equipment settings, simply automating that reporting step led to lower scrap rates and higher throughput – a win for everyone. Another client, also a manufacturer, needed to optimize finished goods inventory and production scheduling for a line of consumer electrical products that had 20 or so discrete models. A statistical model used a rolling 3 years of order data to predict optimum monthly production volumes for each product. Reviewing the model[…]
Live theatre, live music, dining out. Besides being three of my favourite pastimes, what do they have in common? Each is a performance in which skilled and talented people collaborate, bringing the best they have in order to create memorable experiences for the patrons they serve. The people who do it have to be on top of their game every day, individually and in the way they work together. If you happen to know anyone whose work is performance, you will also understand that it’s hard work, and much of it goes unseen. All of which brings me to the Fork and Cork Grill, a new restaurant that has just opened in Kitchener. The name says what this collaboration is about: food[…]
My last post was about why Use Cases should still be considered alive and well as an analysis technique in Agile projects. That raises the question: if people can dismiss Use Cases as old-fashioned and a waste of time, is there any place left for traditional Requirements documents? I am not a fan of Big Requirements Up-Front. Fat requirements documents based on the premise that if you don’t ask, you don’t get, tend to accumulate very long feature lists. Crafted early in a project, they risk becoming disconnected from what the front-line users will actually use. We try to mitigate this risk by prioritizing the features. But anyone who has been through the exercise of separating a few hundred features into[…]
A few weeks ago a discussion popped up on LinkedIn titled “Use Cases are old-fashioned and a waste of time and money”. The title was provocative, to be sure, and the debate that followed (at last count, 88 comments and building) was an entertaining romp through the personal biases of a number of experienced practitioners. There were two distinct camps. Arguing against use cases was a group that liked pictures better than words for displaying the sequence of events to fulfill a user goal. For them, taking the time to write use cases is an unnecessary formalism that is too slow to keep pace with the needs of an Agile team. On the other side was a more reflective group who viewed use cases as just[…]