Maths & Statistics Awareness Month

April marks Mathematics and Statistics Awareness Month (#MathStatMonth) in the USA, which aims to increase the level of interest in these subjects. In an increasingly data-driven world, the ability to draw meaningful insights from data is an essential business capability and requires specialised data science expertise.

Data science is the proactive use of data and advanced analytics to drive better decision making. This ‘proactive’ use of data is what distinguishes data science from traditional ‘statistical analysis’ and needs to be an active part of an organisation in search of insight, better decision making or improvement.

Data science as a career choice for maths, statistics, and science graduates

Graduates from maths, statistics and science backgrounds are increasingly attracted by a career in data science. Our current graduate placement student, Elizabeth, tells us more about her early interest in data science and why it presents a natural career path for those interested in mathematics and statistics. “Data science combines the skills and applications of mathematics and statistics with the use of big data and innovating technology to solve a variety of problems. I’m particularly interested in providing solutions to real-world problems and communicating these results at a high level within a business”, says Elizabeth.

“Throughout my placement I have seen the application of using mathematics and statistics within data science projects in performing exploratory data analysis to creating statistical models. My personal interest is in different types of statistical models, and I am due to study Time Series and Bayesian statistics in the final year of my degree”.

During her time at Mango, Elizabeth has benefited from seeing how mathematics and statistics are used to model complex situations and improve business decisions. Examples range from finding the optimum timing of routine maintenance, which avoids unnecessary reactive work and costs, to creating descriptive, diagnostic and predictive insights that delivered great value and a significant return on investment.

Growing demand for data science

Demand for data scientists was still rising into 2021, and the pandemic has created an even more urgent need for rapid decision making, informed and supported by constantly changing data sets and backed by effective visualisation, as highlighted by the World Economic Forum (WEF) in July.

Rich Pugh, Mango’s Chief Data Scientist, summarises, “Leaders increasingly understand the potential of using data to create smarter, leaner, more engaging organisations. As such, we are still seeing growing demand for ‘data scientists’ who are able to turn that data into acumen in a repeatable and scalable way. As a multi-disciplinary practice, ‘data science’ relies on the combination of ‘advanced analytics’ and ‘computer science’ skill – this, combined with an ability to creatively explore challenges that can be solved, is at the core of realising the value promised by data science”.

“At its core, data science relies on mathematics and statistical rigour to provide robust algorithms that can be relied upon to solve often-complex challenges. As interest in data science continues to grow, the work at the Royal Statistical Society becomes increasingly important – to drive the discussion around statistical governance, and the correct and ethical application of statistical routines”, Rich concludes.

Data visualisation is a key piece of the analysis process. At Mango, we consider the ability to create compelling visualisations to be sufficiently important that we include it as one of the core attributes of a data scientist on our data science radar.

Although visualisation of data is important in order to communicate the results of an analysis to stakeholders, it also forms a crucial part of the exploratory process. In this stage of analysis, the basic characteristics of the data are examined and explored.

The real value of data analyses lies in accurate insights, and mistakes in this early stage can lead to the realisation of the favourite adage of many statistics and computer science professors: “garbage in, garbage out”.

Whilst it can be tempting to jump straight into fitting complex models to the data, overlooking exploratory data analysis can lead to the violation of the assumptions of the model being fit, and so decrease the accuracy and usefulness of any conclusions to be drawn later.

This point was demonstrated in a beautifully simple way by statistician Francis Anscombe, who in 1973 designed a set of four small datasets, each showing a distinct pattern of results. Although the four datasets that make up Anscombe’s Quartet have identical or near-identical means, variances, correlations between variables, and linear regression lines, they look very different when plotted, which highlights the inadequacy of relying on simple summary statistics alone in exploratory data analysis.
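To make the point concrete, here is a minimal sketch in Python that computes those summary statistics for each of the four datasets. It is not taken from the Shiny app. The data values are the standard published quartet, typed in by hand, and the names `ANSCOMBE` and `summarise` are illustrative choices rather than part of any library.

```python
from math import sqrt

# Anscombe's Quartet: datasets I-III share the same x values, IV has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarise(x, y):
    """Return means, sample variances, Pearson correlation and the
    least-squares line y = intercept + slope * x for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


# Print the summaries side by side, rounded to two decimal places.
for name, (x, y) in ANSCOMBE.items():
    stats = summarise(x, y)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The printed rows should be almost indistinguishable from one another, which is exactly why the numbers alone are not enough. It takes a scatter plot of each dataset to reveal the differences between them.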

The accompanying Shiny app allows you to view various aspects of each of the four datasets. The beauty of Shiny’s interactive nature is that you can quickly change between each dataset to really get an in-depth understanding of their similarities and differences.

The code for the Shiny app is available on GitHub.