
It’s mostly preaching to the converted to say that ‘open-source is changing enterprises’. The 2020 Open Source Security and Risk Analysis (OSSRA) Report found that 99 per cent of enterprise codebases audited in 2019 contained open-source components, and that 70 per cent of all audited codebases were entirely open-source.

Hearts and minds have most certainly been won, then, but there are still a surprising number of enterprise outliers when it comes to adopting open-source tools and methods. It’s no surprise that regulated industries are one such open-source averse group.

It’s still difficult to shake off the reputation open-source resources can have for being badly-built, experimental, or put together by communities with less recognisable credentials than big players in software. When your industry exists on trust in your methods – be it protecting client finances in banking, or the health of your patients in pharma – it’s often easier just to make do, and plan something more adventurous ‘tomorrow’.

This approach made a certain amount of sense in years past, when embracing open-source was more a question of saving capex with ‘free’ software, and taking the risk.

Then along comes something like Covid-19, and back in March 2020 the CEO of Pfizer – a company now among those leading the way to a usable vaccine – was singing the praises of open-source approaches. Months down the line, AstraZeneca and Oxford University’s 70-per-cent-efficacy Covid-19 vaccine emerged. AstraZeneca is having a public conversation around how it’s “embracing data science and AI across [the] organisation” while it continues to “push the boundaries of science to deliver life-changing medicines”.

Maybe tomorrow has finally arrived.

At Mango, our primary focus is data science and analytics, but when we think about statistical programming we have a particular interest in the open-source programming language R. We’re not attached to R for any reason other than that we find it hugely effective in overcoming the obstacles the pharmaceutical industry recognises implicitly – accessing better capabilities, faster.

With a growing number of pharmaceutical companies starting to move towards R for clinical submissions, we thought it would be useful to find out why. Asking experts from Janssen, Roche, Bayer and more, we collected first-hand use cases, experiences and stories of challenges overcome, as well as finding out how these companies are breaking the deadlock of open-source’s reputation versus its huge potential for good in a world where everything needs to move faster, while performing exceptionally. Watch the full round table recording here.

If you’d like to find out more, please get in touch and we’d be happy to continue the conversation.

Author: Rich Pugh, Chief Data Scientist at Mango


In 2020 the EARL conference was held virtually due to the restrictions imposed by COVID-19. Although this removed the valuable networking element of the conference, the ‘VirtuEARL’ format meant we reached a geographically wider audience and still delivered a successful event. Thought leaders from academia and industry logged in to discover how R can be used in business, and over 300 data science professionals convened to join workshops or hear presenters share their novel and interesting applications of R. The flexibility of scheduling allowed talks to be picked according to personal or team interests.

The conference kicked off with workshops delivered by Mango data scientists and guest presenters, Max Kuhn of RStudio and Colin Fay from ThinkR, with topics including data visualisation, text analysis and modelling. The presentation day both began and finished with keynote presentations: Annarita Roscino from Zurich spoke about her journey from data practitioner to data & analytics leader – sharing key insights from her role as a Head of Predictive Analytics, and Max Kuhn from RStudio used his keynote to introduce tidymodels – a collection of packages for modelling and machine learning using tidyverse principles.

Between these great keynotes, EARL offered a further 11 presentations from across a range of industry sectors and topics. A snapshot of these shows just some of the ways that R is being used commercially: Eryk Walczak from the Bank of England revealed his use of text analysis in R to study financial regulations; Joe Fallon and Gavin Thompson from HMRC presented their impressive work behind the Self Employment Income Support Scheme, launched by the Government in response to the Covid-19 outbreak; Dr Lisa Clarke from Virgin Media gave an insightful and inspiring talk on how to maximise an analytics team’s productivity; and Dave Goody, lead data scientist at the Department for Education, presented on using R Shiny apps at scale across a team of 100 to drive operational decision-making.

Long-time EARL friend and aficionado Jeremy Horne of DataCove demonstrated how to build an engaging marketing campaign using R, and Dr Adriana De Palma from the Natural History Museum showed her use of R to predict biodiversity loss.

Charity donation 

Due to the reduced overheads of delivering the conference remotely in 2020, the Mango team decided to donate the profits of the 2020 EARL conference to Data for Black Lives. This is a great non-profit organization dedicated to using data science to create concrete and measurable improvements to the lives of Black people. They aim to use data science to fight bias, promote civic engagement and build progressive movements. We are thrilled to be able to donate just over £12,000 to this brilliant charity.

Whilst EARL 2020 was our first virtual event, the conference was highly successful. Attendees described it as an “unintimidating and friendly conference” with “high-quality presentations from experts in their respective fields”, and were delighted to see how R and data science in general are being used commercially. One attendee described the conference best: “EARL goes beyond introducing new packages and educates attendees on how R is being used around the world to make difficult decisions”.

If you’d like to learn more about EARL 2020 or see the conference presentations in full, click here.


Both R and distributed programming rank highly on my list of “good things”, so imagine my delight when two new packages used for distributed programming in R were released:

ddR (https://github.com/vertica/ddR) and

multidplyr (https://github.com/hadley/multidplyr)

 

Distributed programming is normally taken up for one of two reasons:

  • To speed up a process or piece of code
  • To scale up an interface or application for multiple users

There has been a huge appetite for this in the R community for a long time, so my first thought was “Why now? Why not before?”.

From a quick look at CRAN’s High Performance Computing task view, we can see the mass of packages already available for related problems. None of them has quite the same focus as ddR and multidplyr, though. Let me explain. R has many features that make it unique and great: it is high-level, it is interactive and, most importantly, it has a huge number of packages. It would be a huge shame to be unable to use those packages, or to lose these features, when writing R code to be run on a cluster.

Traditionally, distributed programming has contrasted with these principles, with much more focus on low-level infrastructure, such as communication between nodes on a cluster. Popular R packages that dealt with this in the past are the now-deprecated snow and multicore (released on CRAN in 2003 and 2009 respectively). However, working with the low-level functionality of a cluster can detract from analysis work, because it requires a slightly different skill set.

In addition, the needs of R users are changing, in part due to big data. Data scientists now need to run experiments on, analyse and explore much larger data sets, where computations can be time-consuming. Given the fluid nature of exploratory analysis, this can be a huge hindrance. For the same reason, there is a need to write parallelised code without having to think too hard about low-level considerations – code that is fast to write as well as easy to read. My point is that fast parallelised code should not just be for production. The answer to this is an interactive scripting language that can be run on a cluster.

The package written to replace snow and multicore is the parallel package, which includes modified versions of both. It starts to bridge the gap between R and lower-level work by providing a unified interface to cluster management systems. The big advantage of this is that the R code stays the same regardless of which protocol for communicating with the cluster is being used under the covers.
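As a minimal sketch of this unified interface (run on a local machine here, though the same code works against a remote cluster):

library(parallel)

# Start a cluster of two workers; the communication protocol is hidden from us
cl <- makeCluster(2)

# Run a function across the workers and gather the results as a list
parLapply(cl, 1:4, function(x) x^2)

# Shut the workers down
stopCluster(cl)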

Another huge advantage of the parallel package is the “apply”-type functions provided through this unified interface. This is an obvious but powerful way to extend R with parallelism, because any call to an “apply” function with, say, FUN = foo can be split into multiple calls to foo, executed at the same time. The recently released packages ddR and multidplyr extend the functionality provided by the parallel package, and they are similar in many ways. Most significantly, both introduce new data types designed specifically for parallel computing: functions on these data types “partition” data, describing how work can be split amongst multiple nodes, and a collect function gathers the pieces of work and combines them to produce a final result.

ddR also reimplements many base functions for these distributed data types, for example rbind and tail. ddR is written by the Vertica Analytics group, owned by HP, and is designed to work with HP’s distributedR, which provides a platform for distributed computing with R.
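To give a flavour of the ddR basics, here is a rough sketch (assuming ddR’s default local backend; dmapply() and collect() form the core of its API):

library(ddR)

# dmapply() is the distributed analogue of mapply(): the work is split
# across partitions and a distributed object is returned
dobj <- dmapply(function(x) x * 2, 1:4)

# collect() gathers the partitions back to the master as an ordinary R object
collect(dobj)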

Hadley Wickham’s package multidplyr also works with distributedR, in addition to snow and parallel. Where multidplyr differs from ddR is that it is written to be used with the dplyr package. The methods provided by dplyr are overloaded to work with the data types provided by multidplyr, furthering Hadley’s ecosystem of R packages.

After a quick play with the two packages, many more differences emerge.

The package multidplyr seems more suited to data-wrangling, much like its single-threaded equivalent, dplyr.

The partition() function can be given a series of vectors which describe how the data should be partitioned, very much like the group_by() function:

# Extract of code that uses the multidplyr package
library(dplyr)
library(multidplyr)
library(nycflights13)
planes %>% partition() %>% group_by(type) %>% summarize(n())

However, ddR has a very different “flavour”, with a stronger algorithmic focus, as can be seen from the example packages implemented with ddR: randomForest.ddR, kmeans.ddR and glm.ddR. As the code snippet below shows, certain algorithms such as random forests can be parallelised very naturally. Unlike multidplyr, the partition() function does not give the user control over how the data is split. However, the collect() function provides an index argument, which gives the user control over which workers to collect results from. The list returned by collect() can then be fed into do.call() to aggregate the results, for example using randomForest::combine().

# Skeleton code for implementing a very primitive version of random forests using ddR
library(ddR)
library(randomForest)

# Train four forests in parallel, one per worker
multipleRF <- dlapply(1:4,
                      function(n){
                        randomForest::randomForest(Ozone ~ Wind + Temp + Month,
                                                   data = airquality,
                                                   na.action = na.omit)
                      })

# Gather the forests back from the workers and combine them into a single model
listRF <- collect(multipleRF)
res <- do.call(randomForest::combine, listRF)
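Since randomForest::combine() returns an ordinary randomForest object, the combined model can be used just like one trained in a single session. For example (a sketch reusing the objects above):

# Predict ozone levels from the combined forest, exactly as with a single model
preds <- predict(res, newdata = na.omit(airquality))
head(preds)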

To summarise, distributed programming in R has been slowly evolving for a long time, but now, in response to high demand, many tools are being developed to suit the needs of R users who want to run different types of analysis on a cluster. The prominent themes are as follows:

  • Parallel programming in R should be high-level.
  • Writing parallelised R code should be fast and easy, and not require too much planning.
  • Users should still be able to access the same libraries that they usually use.

Of course, some of the packages mentioned in this post are very young. However, due to the need for such tools, they are rapidly maturing, and I look forward to seeing where they go in the very near future.

Author: Paulin Shek


Since we first demoed it on our really successful trip to Strata London last year, a few people have asked us how we made the awesome-looking Data Science Radar app that we were running on the tablets we had with us. In this post we’ll take a look at how we did it, and hopefully show you how easy it is to do yourself.

Mango is primarily known for its work with the R language, so it should come as no surprise that this is the secret sauce used in the creation of the app. More specifically, we used a Shiny app written by one of our Senior Data Scientists, Aimee Gott. The app uses the radarchart package, which you can find on GitHub.
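If you want to try radarchart for yourself, something along these lines should produce a similar chart (a minimal sketch: the trait names and scores below are made up for illustration, and chartJSRadar() is the package’s main plotting function):

library(radarchart)

# Hypothetical scores for one person across six data science traits
labs <- c("Communicator", "Data Wrangler", "Modeller",
          "Programmer", "Technologist", "Visualiser")
scores <- data.frame(Alice = c(7, 9, 4, 6, 5, 8))

# Draws an interactive radar chart (an htmlwidget) in the viewer or browser
chartJSRadar(scores = scores, labs = labs, maxScale = 10)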

I think the fact that it was written with Shiny has actually surprised a few people, largely because of how it looks and the way that we run it.

The tablets in question are cheap Windows 10 devices, nothing special, but we had to come up with a way of running the application that would be simple enough for the non-R-using members of the team. This meant that anything from the everyday world of R had to be abstracted away or made as simple to use as possible. In turn this means not starting RStudio, and not having to type anything in to start the app.

R and the required packages were installed on the tablets, and we were then ready to start the configuration that would allow the whole Mango team to use them in the high-pressure, high-visibility setting of a stand at an extremely busy conference.

We wrote a simple batch file to start the app. This only got us part of the way, though, because the default browser on Windows 10, Microsoft’s Edge, doesn’t have a full-screen mode, which makes the app look less slick. We therefore changed the default browser to Microsoft’s IE and put it in full-screen mode (with F11) when it first opened. The good news here is that IE remembers that it was in full-screen mode when you close and re-open it, so that’s another problem solved. The app now opens automatically and covers the full screen.

The code for the batch file is a simple one-liner and looks like this:

"C:\Program Files\R\R-3.3.0\bin\Rscript.exe" -e "shiny::runApp('/Users/training2/Documents/dsRadar/app.R', launch.browser = TRUE)"

Next, it was necessary to set the rotation lock on the tablets to avoid the display flipping round to portrait mode while in use on the stand. This is more cosmetic than anything else, and we did find that the Windows 10 rotation lock is slightly buggy, in that it doesn’t always seem to remember which way round the lock is, so it occasionally needs to be reset between uses. Remember, our app was written specifically for this device, so the layout is optimised for the resolution and landscape orientation; you may want to approach that differently if you try this yourself.

We also found that the on-screen keyboard wasn’t enabled by default with our devices (which have a detachable keyboard), so we had to turn that on in the settings as well.

Having the application start via a Windows batch file isn’t the prettiest way of starting an app, as it opens the Windows command prompt before launching the application itself. This is hidden behind the application once it’s fully started, but it just doesn’t look good enough. The problem can be avoided with a small amount of VBScript, which runs the contents of the batch file without displaying the command prompt. Unfortunately, the VBScript icon you end up with is pretty horrid. The easiest fix is to create a shortcut to the VBScript file and change the icon of the shortcut instead, which is much easier.

Here’s the VBScript:

Set objShell = WScript.CreateObject("WScript.Shell")

' Run the batch file: the 0 hides the command prompt window, True waits for it to exit
objShell.Run "C:\Users\training2\Desktop\dsRadar.bat", 0, True

Check out the video below to see it in action. We hope you agree that it looks really good, and we hope you find this simple method of turning a Shiny application into a tablet or desktop app as useful as we do!

 

Author: Mark Sellors

 


What is EARL?

The Enterprise Applications of the R Language Conference (EARL) is a cross-sector conference focusing on the commercial use of the R programming language. The conference is dedicated to the real-world usage of R, featuring some of the world’s leading practitioners.

Who is it For?

If you use R in your organisation, the EARL Conference is for you and your team. Whether you’re coding, wrangling data, leading a team of R users, or making data-driven decisions, EARL offers insights you can action in your company.

Why Attend?

The best in the business present their real-world projects, ideas and solutions at EARL each year and it’s the perfect opportunity to see how others are using R in production.

 

“Fantastic conference. EARL never lets me down in providing an insightful, applicable and fun experience to learn from other companies on how they apply R in their enterprises.”

 

If you’re considering submitting an abstract to present at future EARL events – and need a little more persuading, here are eight reasons why you should apply to present:

1. Networking

EARL attracts over 300 delegates from a huge range of industries. We make sure to provide plenty of opportunities for you to connect with your fellow R users. If that sounds daunting – don’t worry – plenty of delegates attend solo and we are delighted that EARL attracts such a welcoming and friendly crowd. As a speaker, networking will be easier as people will recognise you and have plenty of questions about your session.

2. Increase your professional experience

Many of our selected speakers have only presented a handful of times, and some never. If you have been aiming to increase your presenting experience, then EARL is the perfect venue. Our audiences are both appreciative and attentive and always have questions to ask – and it’s a great experience to add to your CV.

3. Help Others 

The R community is well known for being wonderfully inclusive, supportive and generous, so it’s nice to be able to contribute and help others. Presenting on your challenges and your solutions will definitely help others, even if they work in completely different industries.

4. Refine your ideas

As we all know, projects don’t always go to plan; by sharing your commercial usage of R, you might get advice or tips from the audience to take away.

5. Promote you

As a speaker you will get 30 minutes to share your work. EARL can give you a platform to share what you do with people from different industries and varying sizes of company.

6. Free Ticket

As a selected speaker you will get a free ticket for the day of your presentation along with a free ticket for our Wednesday night conference networking event*.

7. Be the Best Boss

If you head up a team and encourage a team member to submit a talk, they will enjoy a great conference and return with a sense of pride at having shared your team’s work, inspired by all that they’ve learned.

8. Have fun! 

Sharing your passion for R with a room full of like-minded people makes for a great environment; by sharing what you know you will be starting plenty of conversations. The networking events are a great place to celebrate your presentation and have a good time!

For more information about #EARLconf visit our Event page.

 


It’s easy to get stuck in the day-to-day at the office, with never enough time to upskill or even think about career development. However, to really grow and develop your organisation, it’s important to grow and develop your team.

While there are many ways to develop teams, including training and providing time to complete personal (and relevant) projects, conferences provide a range of benefits.

Spark innovation
Some of the best in the business present their projects, ideas and solutions at EARL each year. It’s the perfect opportunity to see what’s trending and what’s really working. Topics at EARL Conferences have included best practice for moving from SAS to R, Shiny applications, using social media data and web scraping, plus presentations on R in marketing, healthcare, finance, insurance and transport.

A cross-sector conference like EARL can help your organisation think outside the box because learnings are transferable, regardless of industry.

Imbue knowledge
This brings us to knowledge. Learning from the best in the business will help employees expand their knowledge base. This can keep them motivated and engaged in what they’re doing, and a wider knowledge base can also inform their everyday tasks enabling them to advance the way they do their job.

When employees feel like you want to invest in them, they stay engaged and are more likely to remain in the same organisation for longer.

Encourage networking
EARL attracts R users from all levels and industries and not just to speak. The agenda offers plenty of opportunities to network with some of the industry’s most engaged R users. This is beneficial for a number of reasons, including knowledge exchange and sharing your organisation’s values.

Boost inspiration
We often see delegates who have come to an EARL Conference with a specific business challenge in mind. By attending, they get access to the current innovations, knowledge and networking mentioned above, and can return to their team post-conference with a renewed vigour to solve those problems using their new-found knowledge.

Making the most out of attending EARL

After all of that, the next step is making sure your organisation makes the most out of attending EARL. We recommend:

Setting goals
Do you have a specific challenge you’re trying to solve in your organisation? Going with a set challenge in mind means your team can plan which sessions to sit in and who they should talk to during the networking sessions.

De-briefing
This is two-fold:
1) Writing a post-conference report will help your team put what they have learnt at EARL into action.
2) Not everyone can attend, so those who do can share their new-found knowledge with their peers who can learn second-hand from their colleague’s experience.

Following up
We’re all guilty of going to a conference, coming back inspired and then getting lost in the day-to-day. Assuming you’ve set goals and de-briefed, it should be easy to develop a follow-up plan.

You can make the most of inspired team members to put in place new strategies, technologies and innovations through further training, contact follow-ups and new procedure development.

EARL Conference can offer a deal for organisations looking to send more than 5 delegates.

Buy tickets now


Becoming a data-driven business is high on the agenda of most companies but it can be difficult to know how to get started or know what the opportunities could be.

Mango Solutions has a bespoke workshop, ‘Art of The Possible’, to help senior leadership teams see the potential and overcome some of the common challenges.

Find out more in our new report developed in partnership with IBM.


We are excited to announce the speakers for this year’s EARL London Conference!

Every year, we receive an immense number of excellent abstracts and this year was no different – in fact, it’s getting harder to decide. We spent a lot of time deliberating and had to make some tough choices. We would like to thank everyone who submitted a talk – we appreciate the time taken to write and submit; if we could accept every talk, we would.

This year, we have a brilliant lineup, including speakers from Auto Trader, Marks and Spencer, Aviva, Hotels.com, Google, Ministry of Defence and KPMG. Take a look below at our illustrious list of speakers:

Full length talks
Abigail Lebrecht, Abigail Lebrecht Consulting
Alex Lewis, Africa’s Voices Foundation
Alexis Iglauer, PartnerRe
Amanda Lee, Merkle Aquila
Andrie de Vries, RStudio
Catherine Leigh, Auto Trader
Catherine Gamble, Marks and Spencer
Chris Chapman, Google
Chris Billingham, N Brown PLC
Christian Moroy, Edge Health
Christoph Bodner, Austrian Post
Dan Erben, Dyson
David Smith, Microsoft
Douglas Ashton, Mango Solutions
Dzidas Martinaitis, Amazon Web Services
Emil Lykke Jensen, MediaLytic
Gavin Jackson, Screwfix
Ian Jacob, HCD Economics
James Lawrence, The Behavioural Insights Team
Jeremy Horne, MC&C Media
Jobst Löffler, Bayer Business Services GmbH
Jo-fai Chow, H2O.ai
Jonathan Ng, HSBC
Kasia Kulma, Aviva
Leanne Fitzpatrick, Hello Soda
Lydon Palmer, Investec
Matt Dray, Department for Education
Michael Maguire, Tusk Therapeutics
Omayma Said, WUZZUF
Paul Swiontkowski, Microsoft
Sam Tazzyman, Ministry of Justice
Scott Finnie, Hymans Robertson
Sean Lopp, RStudio
Sima Reichenbach, KPMG
Steffen Bank, Ekstra Bladet
Taisiya Merkulova, Photobox
Tim Paulden, ATASS Sports
Tomas Westlake, Ministry Of Defence
Victory Idowu, Aviva
Willem Ligtenberg, CZ

Lightning Talks
Agnes Salanki, Hotels.com
Andreas Wittmann, MAN Truck & Bus AG
Ansgar Wenzel, Qbiz UK
George Cushen, Shop Direct
Jasmine Pengelly, DAZN
Matthias Trampisch, Boehringer Ingelheim
Mike K Smith, Pfizer
Patrik Punco, NOZ Medien
Robin Penfold, Willis Towers Watson

Some numbers

We thought we would share some stats from this year’s submission process:


[Chart of submission statistics omitted; the breakdown shown was based on a combination of titles, photos and pronouns.]

Agenda

We’re still putting the agenda together, so keep an eye out for that announcement!

Tickets

Early bird tickets are available until 31 July 2018 – get yours now.


This article was first published on Nic Crane’s Blog and kindly contributed to the Mango Blog.

I’m going to begin this post somewhat backwards, and start with the conclusion: tidy eval is important to anyone who writes R functions and uses dplyr and/or tidyr.

I’m going to load a couple of packages, and then show you exactly why.

library(dplyr)
library(rlang)

Data wrangling with base R

Here’s an example function I have written in base R. Its purpose is to take a data set, and extract values from a single column that match a specific value, with both input and output both being in data frame format.

wrangle_data <- function(data, column, val){
  data[data[[column]] == val, column, drop = FALSE]
}

wrangle_data(iris, "Species", "versicolor") %>%
  head()
##       Species
## 51 versicolor
## 52 versicolor
## 53 versicolor
## 54 versicolor
## 55 versicolor
## 56 versicolor

It works, but it’s not great; the code is clunky and hard to decipher at a quick glance. This is where using dplyr can help.

Data wrangling with dplyr

If I was to run the same code outside of the context of a function, I might do something like this:

one_col <- select(iris, Species)
filter(one_col, Species == "versicolor")  %>%
  head()
##      Species
## 1 versicolor
## 2 versicolor
## 3 versicolor
## 4 versicolor
## 5 versicolor
## 6 versicolor

This has worked, but how can we turn this into a function?

I might naively attempt the solution below:

wrangle_data <- function(data, column, val){
  one_col <- select(data, column)
  filter(one_col, column == val)
}

wrangle_data(iris, "Species", "versicolor")  %>%
  head()
## [1] Species
## <0 rows> (or 0-length row.names)

However, this doesn’t work and returns 0 rows. This is due to a special quirk of dplyr that makes typical usage easier, but which we need to be aware of when writing functions. This snippet from dplyr’s programming vignette explains it best:

“Most dplyr functions use non-standard evaluation (NSE). This is a catch-all term that means they don’t follow the usual R rules of evaluation. Instead, they capture the expression that you typed and evaluate it in a custom way. This has two main benefits for dplyr code:

  • Operations on data frames can be expressed succinctly because you don’t need to repeat the name of the data frame. For example, you can write filter(df, x == 1, y == 2, z == 3) instead of df[df$x == 1 & df$y == 2 & df$z == 3, ].
  • dplyr can choose to compute results in a different way to base R. This is important for database backends because dplyr itself doesn’t do any work, but instead generates the SQL that tells the database what to do.”

In other words, because dplyr functions evaluate things differently to base R, by using a concept called quoting, we have to work with them a bit differently. I’d recommend checking out this RStudio webinar and the dplyr programming vignette for more detail.
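To make the contrast concrete, here are the two evaluation styles side by side (a tiny sketch using the iris data):

# dplyr quotes its arguments: Species is looked up inside iris,
# not in the global environment
filter(iris, Species == "versicolor") %>% head()

# base R evaluates its arguments immediately, so we must index explicitly
head(iris[iris$Species == "versicolor", ])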

Let’s go back to the previous example.

wrangle_data <- function(data, column, val){
  one_col <- select(data, column)
  filter(one_col, column == val)
}

wrangle_data(iris, "Species", "versicolor")  %>%
  head()
## [1] Species
## <0 rows> (or 0-length row.names)

This doesn’t work, as select and filter are looking for a column called “column” in their inputs and failing to find it. Therefore we must use tidy evaluation to override this behaviour.

Tidy evaluation with dplyr

Using the !! (bang-bang) operator and the sym() function from the rlang package, we can change this behaviour to make a version of our function which will run.

wrangle_data <- function(x, column, val){

  one_col <- select(x, !!sym(column))
  filter(one_col, !!sym(column) == val)
}

wrangle_data(iris, "Species", "versicolor") %>%
  head()
##      Species
## 1 versicolor
## 2 versicolor
## 3 versicolor
## 4 versicolor
## 5 versicolor
## 6 versicolor
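The same pattern extends to several columns at once: syms() builds a list of symbols, which the !!! (bang-bang-bang) operator splices into the call. A quick sketch (the column choice here is arbitrary):

select_cols <- function(data, cols){
  select(data, !!!syms(cols))
}

select_cols(iris, c("Species", "Sepal.Length")) %>%
  head()
##   Species Sepal.Length
## 1  setosa          5.1
## 2  setosa          4.9
## 3  setosa          4.7
## 4  setosa          4.6
## 5  setosa          5.0
## 6  setosa          5.4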

I’m not going to go into detail about these functions here, but if you want more information, check out my blog posts on using tidy eval in dplyr here and here.

In conclusion, whilst tidy eval is not necessary for all uses of dplyr or tidyr, it quickly becomes an extremely handy tool when working with these packages within the context of a function. There are some great resources about tidy eval out there, and as ever, I welcome feedback on this blog post via Twitter.

ANNOUNCEMENT: EARL London 2018 + abstract submissions open!
MEDIA RELEASE

14 February 2018

Mango Solutions are delighted to announce that loyalty programme pioneer and data science innovator, Edwina Dunn, will keynote at the 2018 Enterprise Applications of the R Language (EARL) Conference in London on 11-13 September.

Mango Solutions’ Chief Data Scientist, Richard Pugh, has said that it is a privilege to have Ms Dunn address Conference delegates.

“Edwina helped to change the data landscape on a global scale while at dunnhumby; Tesco’s Clubcard, My Kroger Plus and other loyalty programmes have paved the way for data-driven decision making in retail,” Mr Pugh said.

“Having Edwina at EARL this year is a win for delegates, who attend the Conference to find inspiration in their use of analytics and data science using the R Language.

“In this centenary year of the 1918 Suffrage Act, Edwina’s participation is especially appropriate, as she is the founder of The Female Lead, a non-profit organisation dedicated to giving women a platform to share their inspirational stories,” he said.

Ms Dunn is currently CEO at Starcount, a consumer insights company that combines the science of purchase and intent and brings the voice of the customer into the boardroom.

The EARL Conference is a cross-sector conference focusing on the commercial use of the R programming language with presentations from some of the world’s leading practitioners.

More information and tickets are available on the EARL Conference website: earlconf.com

END

For more information, please contact:
Karis Bouher, Marketing Manager: marketing@mango-solutions.com or +44 (0)1249 705 450