Covid infection rate

Two years ago, the Public Health Evidence & Intelligence team at Hertfordshire County Council numbered five. Fast forward to today, and the team has built its capability to 32 competent R users.

For Manager Will Yuill, it’s been an extremely busy few years since the urgency of the COVID-19 crisis took hold. The team’s workload doubled overnight, with extensive data sharing and analysis of daily infection rates for a range of partnership organisations. Within a week of infections hitting the UK, they were being asked to reprioritise workloads, model the pandemic and make local recommendations on how to limit the spread of infection.

With a team largely using Excel or a desktop version of R, Will knew changes had to be made to keep up with the quantity and speed of data and to maintain efficiency across the team. Faced with these challenges, the team knew they had to consider an outsourcing partnership to meet their immediate and long-term objectives.

In this blog, Will Yuill, Manager of the Public Health Evidence & Intelligence team at Hertfordshire County Council, tells us about his challenges and how his team have dramatically expanded their remit and developed their internal capability, whilst strengthening stakeholder collaboration across vital health partnerships to actively reduce the spread of the virus.

The team’s blockers:

  • Software – R was not supported internally 
  • Hardware – only able to use R on low-power VMs
  • Skills – predominantly an academic team, so R skills were not applied
  • Culture – limited opportunity to try new things with IT 
  • Capacity – limited time to try new things   

 How I built my case for R  

“Some of the team members were more familiar with using R, having exhausted the capability of Excel, so in our search for an immediate solution an R environment seemed a logical choice. Our IT teams were Windows focused and didn’t have the capacity or skills required to support an internal Linux environment. Their priority was to support the council’s migration to home working.

Before committing to an outsourced partnership, the team had tried high-specification laptops to help resolve the immediate challenges of managing data sets, but were hindered by corporate IT policy and firewalls. The laptops simply could not meet our needs for sharing analysis or complying with data governance. With Mango’s Managed RStudio, core stakeholders could have a reliable, secure enterprise environment for collaborating on data within days. There was no need for our own infrastructure, and it meant we could have 24x7x365 outsourced application monitoring, performance alerts and support, removing any internal dependency on an already stretched resource.

With the Managed RStudio, the team have successfully developed their own Shiny applications, both public and private. The public application is currently receiving c. 20,000 hits a month for detailed analysis around public health services. Internal, more data-sensitive applications allow effective dissemination of data trends by location and services, such as environmental health and the NHS.”

Through the use of RStudio Teams, the team is benefiting from a go-to tool which is empowering their statisticians to manage and develop their code. The ability to provide these tools on a centralised server, accessible from anywhere and without the computational constraints of a laptop, has been highly conducive to team productivity, success, and stakeholder engagement. The data is significantly more secure, with improved data governance, and there is far less work for the team, allowing them to focus on providing analytic value.

The lessons learnt  

“Over the last two years, working in an R environment, we have transformed our procedures, implemented best practice and significantly enhanced our stakeholder communications. Here are some lessons I learnt along the way.

Take your changes and run with them – if your team is working ineffectively, lacking processes and not delivering value, then I strongly recommend investing in a modern data analytics enterprise. This means striving to do more with fewer resources, which involves pushing productivity to the max to gain the best value.

Show ROI early – Our team were able to show the impact of our investment. Our data is effectively shared with partnership organisations daily – it is relevant, complete, timely and consistent. Gone are the days when organisations operated in silos; Managed RStudio has been vital for critical communications with key stakeholders.

Know what you are looking for – Sometimes an independent viewpoint prevents you from wasting significant amounts of time; thinking about data in terms of your objectives is a good place to start.

Start small and scale – With RStudio Teams, my team is benefiting from a go-to tool which is empowering statisticians to manage and develop their code. The ability to provide these tools on a centralised server, accessible from anywhere and without computational constraints of a laptop, has been highly conducive to team productivity, success, and stakeholder engagement.    

Deployment is hard – RStudio Connect and Pins have been invaluable for production and deployment. When we got R locally we thought we were set, and then realised we could only share analysis via R Markdown and email. RStudio Connect has allowed us to share and publish interactive analysis across partnership organisations.

Use Git – Git enables extensive team collaboration and helps manage version control. Utilising Git provides the security, collaboration and certainty required to create and reproduce code and analysis across the team.”

Will Yuill will be joining the NHS R Community as a guest speaker on xxx, where he will expand on his case for R and the impact it has had on the local authority. He will be joined by Matt Sawkins, Product Manager at Mango.

managed service

In a recent webinar, we provided an overview of our Managed RStudio platform and demonstrated how modern technology platforms like RStudio give you the ability to collect, store and analyse data at the right time and in the right format to make informed business decisions.

The Public Health Evidence & Intelligence team at Herts County Council demonstrated how they have benefitted significantly from the Managed RStudio – enabling collaborative development, empowerment and productivity at a time when they needed it most. In turn, they have been able to scale their department.

Many of the questions from the webinar focused on the governance and security aspects of Managed RStudio. In this blog, we’ve taken all your questions and, for further clarity, attached a document that can help with any other questions regarding architecture, data management and maintenance.

Many of the questions asked related to the management of data in the platform, from working with data on local drives and user interfaces to the management of large datasets.

There are several methods of getting data in and out of the Managed RStudio. These methods will largely depend on the type and size of the data involved.

For data science teams to work productively and deliver effective results for the business, the starting point is the data itself. Accuracy, relevance, completeness, timeliness and consistency are the key criteria against which data quality needs to be measured. Good data quality needs disciplined data governance, thorough management of incoming data, accurate requirements gathering, strict regression testing for change management and careful design of data pipelines. This is over and above data quality control programmes for data delivery from the outside and within.

Can you please elaborate on getting data into and out of the Managed RStudio platform?

Working with small data sets (< 100 MB)

For smaller data sets, we recommend using RStudio Workbench’s upload feature directly from the IDE. To do this, you can simply click on ‘upload’ in the ‘file’ panel. From here you can select any type of file from either your local hard disk or a mapped network drive. The file will be uploaded to the current directory. You can also upload compressed files (zip), which are automatically decompressed on completion. This means that you can upload data that is much larger than 100 MB when uncompressed.
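As a rough illustration of the compressed-upload tip, the Python sketch below zips a CSV file locally and reports whether the archive comes in under the 100 MB figure quoted above. The file names and the `zip_for_upload` helper are hypothetical, and the threshold is taken from this post rather than from any product documentation.

```python
import os
import tempfile
import zipfile

UPLOAD_LIMIT_BYTES = 100 * 1024 * 1024  # the 100 MB figure quoted in this post


def zip_for_upload(source_path, archive_path):
    """Compress one file into a zip archive and return the archive size in bytes."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname keeps only the file name, so the archive unpacks flat
        archive.write(source_path, arcname=os.path.basename(source_path))
    return os.path.getsize(archive_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical data file, generated here so the sketch is self-contained
        csv_path = os.path.join(workdir, "infection_rates.csv")
        with open(csv_path, "w") as handle:
            handle.write("day,area,cases\n")
            for row in range(1, 1001):
                handle.write(f"2021-01-01,area_{row % 10},{row}\n")

        zip_path = os.path.join(workdir, "infection_rates.zip")
        zipped_size = zip_for_upload(csv_path, zip_path)
        print(f"raw: {os.path.getsize(csv_path)} bytes, zipped: {zipped_size} bytes")
        print("fits under limit:", zipped_size < UPLOAD_LIMIT_BYTES)
```

Repetitive tabular text such as CSV tends to compress well, which is why zipping before upload can stretch the limit a long way.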

Working with large data sets (> 100 MB)

For larger data sets or real-time data, we recommend using an external service such as CloudSQL or BigQuery (GCP), Azure SQL Database or Amazon RDS. These can be accessed directly using R packages such as bigrquery, RMariaDB or RMySQL.
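The idea behind using an external database for larger data is that filtering and aggregation happen in the database, and only the small result travels to the analysis session. The sketch below shows that pattern in Python, using the standard-library `sqlite3` module as a local stand-in for a cloud database. The table, columns and figures are invented for illustration, and a real deployment would use the driver for whichever service is chosen.

```python
import sqlite3


def total_cases_by_area(connection):
    """Return (area, total_cases) rows, aggregated inside the database."""
    query = """
        SELECT area, SUM(cases) AS total_cases
        FROM daily_cases
        GROUP BY area
        ORDER BY area
    """
    return connection.execute(query).fetchall()


def build_demo_database():
    """Create an in-memory database with a few made-up rows (stand-in for a cloud service)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_cases (day TEXT, area TEXT, cases INTEGER)")
    rows = [
        ("2021-09-01", "North", 12),
        ("2021-09-01", "South", 7),
        ("2021-09-02", "North", 15),
        ("2021-09-02", "South", 9),
    ]
    conn.executemany("INSERT INTO daily_cases VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_database()
    # Only the two summary rows leave the database, not the full table
    for area, total in total_cases_by_area(conn):
        print(area, total)
    conn.close()
```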

For consuming real-time data, we recommend using either Cloud Pub/Sub or Azure Service Bus to create a message queue that R or Python can read from.

Sharing data between RStudio Pro/Workbench, Connect and other users

Data can easily be shared via ‘Pins’, allowing data to be published to Connect and shared with other users, across Shiny apps and RStudio.

Getting data out of Managed RStudio

As with upload, there are several methods to export data from Managed RStudio. RStudio Connect allows the publishing of Shiny apps, Flask and Dash applications, and Markdown documents. It also allows the scheduling of e-mail reports. For one-off analytics jobs, RStudio also allows you to download files directly from the IDE.

The Managed Service also allows uploading to any cloud service such as Cloud storage buckets.

Package Management

R Packages are managed and maintained by RStudio Package Manager giving the user complete control of which versions are installed.

RStudio Package Manager also allows the user to ‘snapshot’ a particular set of packages on a specific day to ensure consistency.

The solution to disciplined data governance

Accuracy, relevance, completeness, timeliness and consistency are the key criteria against which data quality needs to be measured. Good data quality needs disciplined data governance, thorough management of incoming data, accurate requirements gathering, strict regression testing for change management and careful design of data pipelines. This all leads to better decisions based on data analysis and also ensures compliance with regulation.
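To make these criteria a little more concrete, here is a minimal Python sketch of automated checks for three of them (completeness, timeliness and consistency) on incoming records. The field names, the two-day freshness window and the `check_records` helper are assumptions made for illustration; they are not part of Managed RStudio.

```python
from datetime import date, timedelta

REQUIRED_FIELDS = ("day", "area", "cases")  # hypothetical schema


def check_records(records, today, max_age_days=2):
    """Return a list of human-readable data quality problems found in records."""
    problems = []
    seen_keys = set()
    for index, record in enumerate(records):
        # Completeness: every required field must be present and non-empty.
        missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
        if missing:
            problems.append(f"row {index}: missing {', '.join(missing)}")
            continue
        # Timeliness: the record must not be older than the allowed window.
        if today - record["day"] > timedelta(days=max_age_days):
            problems.append(f"row {index}: stale record from {record['day']}")
        # Consistency: one row per (day, area), and counts cannot be negative.
        key = (record["day"], record["area"])
        if key in seen_keys:
            problems.append(f"row {index}: duplicate entry for {record['area']}")
        seen_keys.add(key)
        if record["cases"] < 0:
            problems.append(f"row {index}: negative case count")
    return problems


if __name__ == "__main__":
    today = date(2021, 9, 10)
    sample = [
        {"day": date(2021, 9, 9), "area": "North", "cases": 12},
        {"day": date(2021, 9, 9), "area": "North", "cases": 12},  # duplicate
        {"day": date(2021, 9, 1), "area": "South", "cases": 7},   # stale
        {"day": date(2021, 9, 9), "area": "", "cases": 3},        # incomplete
    ]
    for problem in check_records(sample, today):
        print(problem)
```

Checks like these can run on every incoming delivery, so problems surface before the data reaches a report or application.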

As a Product Manager at Mango, Matt is passionate about data and delivering products where data is key to driving insights and decisions. With over 20 years’ experience in data consulting and product delivery, Matt has worked across a variety of industries including Retail, Financial Services and Gaming to help companies use data and analytical platforms to drive growth and increase value.

Matt is a strong believer that the combined value of the data and analytics is the key to success of data solutions.


To celebrate EARL taking place this September we launched a competition to win a free online training course and the competition closes today at midnight – Submit your competition entry here.

You have the chance to win a 2-day training course for you and up to 9 other attendees from your company. The winner can select either ‘Intro to R for analytics’ or ‘Intro to Python for analytics’. The workshop will be delivered online in UK time – so you can enter from any location!

To enter, you just need to complete this form and answer a few questions about you and your team. The closing date is the 30th of September 2021, and the winner will be contacted in early October. Please read the full terms and conditions here.

 


Thank you to everyone who joined us for EARL 2021 – especially to all of the fantastic presenters! We were pleased to receive lots of really positive feedback from the online event and there are plenty of highlights to share.

Branka Subotic, NATS

It was great to kick off EARL 2021 with our first keynote of the day from Branka. She has worked for NATS since 2018 and is currently their Director of Analytics. Branka shared with us interesting ways to help teams to work together and also some unusual ways to upskill! Her talk was peppered with some videos showing us flight data and the impacts of Covid.

Chris Beeley, NHS – Stronger together, making healthcare open – building the NHS-R Community

We are always delighted to hear from the NHS at the EARL Conference and this year was no exception. We were treated to a passionate talk from Chris on how the NHS-R community has been built up over the years and how their conference has gone from strength to strength. We all know how supportive the R community can be, so it is great to see this in action.

Amit Kohli – Introduction to network analysis

Amit gave us an introduction to the principles of network analysis and shared several use-cases demonstrating their unique powers. Amit also included a fun way to interact with his talk with the use of a QR code  – we can always rely on Amit to entertain us! Our team thought it was a really interesting topic and it felt accessible to those who perhaps don’t know much on the subject.

Emily Riederer, Capital One – How to make R packages part of your team

We loved Emily’s fun concept of making R packages a real part of your team and her use of code, and the choices she made along the way. Her talk examined how internal R packages can drive the most value for their organisation when they embrace an organisation’s context, as opposed to open source packages which thrive with increasing abstraction. Read our interview with Emily here.

Dr. Jacqueline Nolis, Saturn Cloud

We closed the day with our final keynote talk from Jacqueline Nolis. She is a data science leader with over 15 years of experience in managing data science teams and projects, at companies ranging from DSW to Airbnb. She is currently the Head of Data Science at Saturn Cloud, where she helps design products for data scientists. Jacqueline spoke to us about taking risks in your career and shared the various risks she has taken over her own career and how they went! It was inspiring to hear from an experienced data scientist that it’s ok to take a risk every now and then – and refreshing to hear her honesty about what could have gone better and how she has ultimately learned and grown from this.

These are just a few of the brilliant talks from a fantastic conference day. It was a delight to have speakers and attendees joining us from across the world – so thank you again to all that came along.

We are hoping to be back in London next year to host EARL in-person again. We are tentatively holding the 6th-8th of September 2022 as our conference dates. If you’d like to keep up-to-date on all things EARL please join our mailing list. We will open the call for abstracts in January 2022.

 


Mango’s ‘Meet-Up’ at Big Data London on 22nd September features guest speaker Adam Hughes, Data Scientist for The Bank of England, whose remit involves working with incredibly rich datasets, feeding into strategic decision-making on monetary policy. You can read about Adam’s incredibly interesting data remit and his team’s journey through Covid-19, in this short Q&A.

Can you tell us about your interest in data and your role at Bank of England?

Working at the Bank it’s hard not to be interested in data! So much of what we do as an organisation is data driven, with access to some incredibly rich datasets enabling interesting analysis. In Advanced Analytics, we leverage a variety of data science skills to support policy-making and facilitate the effective use of big, complex and granular data sets. As a data scientist, I get involved in all of this, working across the data science workflow.

What’s the inspiration for your talk – effectively data science at speed?

As with so much recently – Covid. With how fast things have been moving and changing, traditional data sources that policymakers were relying on weren’t being updated fast enough to reflect the situation.

Can you tell us about your data team’s journey through covid-19 and the impact it has had?

In a recent survey, the Bank of England sought to understand how Covid has affected the adoption and use of ML and DS across UK Banks. Half of the banks surveyed reported an increase in the importance of ML and DS as a result of the pandemic. Covid created a lot of demand for DS skills and expertise within the Bank of England too. Initially this led to some long hours, but it was motivating and generally rewarding to work on something so clearly important. Working remotely 100% of the time was a challenge at first, but generally the transition away from the office has been remarkably smooth in terms of day-to-day working (though there are still disadvantages due to the lack of face-to-face contact). As outputs have subsequently been developed and shared widely in the organisation, they have been an excellent advert for data science, showing the value it can add. In particular, it’s been great to see the business areas we worked with building up their local data science skills as a consequence.

What’s the talk about and what are the key takeaways?

The talk will cover some of the techniques we used to get, process and use new data sources under time pressure, including what we’ve learnt from the process. The key takeaways are:

  • Non-traditional datasets contain some really useful information – and can form part of the toolkit even in normal times;
  • Building partnerships is key;
  • A suite of useful building blocks, such as helper packages or code adapted from cleverer people, helps speed things up;
  • Working fast doesn’t mean worse outcomes.

We look forward to seeing you at Mango’s Big Data London Meet-Up on 22nd September, 6-8pm, at the Olympia ML Ops Theatre. You can sign up here.

Guest speaker Adam Hughes is one of The Bank of England’s Data Scientists: https://www.linkedin.com/in/adam-james-hughes/


We are launching a competition to celebrate the Enterprise Applications of the R Language Conference taking place from the 6th-10th of September 2021.

You have the chance to win a 2-day training course for you and up to 9 other attendees from your company. The winner can select either ‘Intro to R for analytics’ or ‘Intro to Python for analytics’. The workshop will be delivered online in UK time – so you can enter from any location!

To enter, you just need to complete this form and answer a few questions about you and your team. The closing date is the 30th of September 2021, and the winner will be contacted in early October. Please read the full terms and conditions here.

EARL is a cross-sector conference focusing on the commercial use of the R programming language. The conference is dedicated to the real-world usage of R with some of the world’s leading practitioners. If you use R in your organisation, the EARL Conference is for you and your team. Whether you’re coding, wrangling data, leading a team of R users, or making data-driven decisions, EARL offers insights you can action in your company.

This year there are four online workshops you can join for £90 each and also a final day full of presentations on using R in enterprise, which is just £9.99.


The opening keynote at the Enterprise Applications of the R Language Conference presentation day will be given by Dr Branka Subotić. Branka has over 15 years of experience in the aviation industry and has worked for NATS for 12+ years. Branka will be joining Jacqueline Nolis as our second keynote speaker at EARL. The presentation day will run all day on Friday the 10th of September – tickets to this event are just £9.99!

Branka made her start as the Senior/Principal Human Factors Specialist at the Directorate of Safety, working mostly on the implementation of the new enroute electronic system for air traffic control (iFACTS) into live operation at Swanwick Area Control (the UK’s largest air traffic control centre).

In November 2018, Branka joined the CIO team to lead the development and implementation of the enterprise-wide Data Strategy for NATS; working with a range of colleagues from across the business. This led to Branka becoming Director of Analytics, leading a team of 80+ analysts including a Data Science team focused on the implementation of advanced analytics and AI/ML. Branka’s analytics team is responsible for enhanced data-driven customer insights, tools, and services. The team brings together the data analytics functions and delivers insight and data-driven solutions to NATS operation, corporate functions, and customers around the world.

Branka holds a PhD in air traffic management from Imperial College London (UK), MSc in aeronautical science from Embry Riddle Aeronautical University (USA) and MEng in air transport engineering from the University of Belgrade (Serbia). She is also a Chartered Engineer.

And if you needed any more information, Branka is from Belgrade (Serbia) and has lived in a few countries – Bosnia, France, USA, Germany and the UK!

Join other Rstats users and find out how other people are using R in enterprise – get EARL tickets today and get inspired!

 

 


WHAT IS EARL?

The Enterprise Applications of the R Language Conference (EARL) is a cross-sector conference focusing on the commercial use of the R programming language. The conference is dedicated to the real-world usage of R with some of the world’s leading practitioners.

WHO IS IT FOR?

If you use R in your organisation, the EARL Conference is for you and your team. Whether you’re coding, wrangling data, leading a team of R users, or making data-driven decisions, EARL offers insights you can action in your company.

WHERE IS EARL THIS YEAR?

We are back online from 6th-10th of September. EARL will be hosted on ‘Big Marker’ – event software that does not require any downloading, you simply join with a link. All of the EARL sessions will be recorded and sent out to ticket holders after the event.

WHY SHOULD YOU ATTEND?

  1. To hear from fellow Rstats users – you might have been working from home for the past year like so many of us, or perhaps you’re the only Data Scientist in your team or haven’t had time to chat with others as much. EARL is a great opportunity to find out how others are solving enterprise issues with R and to get ideas to take back to your team or work.
  2. Network – we know it’s not the same as having a cold glass of Pimms overlooking Tower Bridge, but you can chat to fellow Rstats fans while at home in your most comfortable leisurewear! You can talk to other attendees on the conference chat or send a private message to someone you have burning questions for! EARL online attracts an even more varied attendance list, with people joining from across the globe.
  3. You don’t need to travel – though I’m sure many of us are desperate to travel anywhere right now, you can join EARL from wherever is most convenient and comfortable for you. And if you don’t need to persuade your boss to pay for travel and accommodation, perhaps more of your team can come along!
  4. Cheaper tickets – as the conference is online, we simply don’t have as many overheads for this event when compared to our usual in-person event, and that means we can make the tickets a lot cheaper. The online half-day workshops are priced at £90, whereas in-person they would normally be £200. The final Friday presentation day session is only £9.99, which we hope will make EARL even more accessible.
  5. Learn a new skill – there will be four workshops you can join at EARL this year. All of the workshops will run from 2pm-5pm (BST) and will be recorded and sent out to ticket holders to watch again at their leisure! Maybe you want to learn what Shiny apps can do for you, how to develop a package in R, how to use Purrr for functional programming or how to web scrape and text mine. All the workshops will be taught by Mango’s expert Data Scientists, who will be happy to answer any questions you have.
  6. Be charitable – last year we donated EARL Conference profits to Data For Black Lives; this year we have chosen DataKind UK as our charity of the year to support. DataKind UK are a brilliant organisation that help social change orgs and charities use Data Science. So feel good knowing your ticket purchase will go towards helping people.
  7. Get inspired and energised – taking a break from work is good, you don’t need us to convince you of that. But taking the day or afternoon for you and your personal development can be hugely beneficial. You will always leave EARL with new ideas and ways to work. If you’re a team leader, EARL is a great opportunity to grow as a team: once you’ve attended the conference, your team can share what they have learnt and apply it at work.
  8. Start planning your talk for EARL London 2022! We are planning for EARL 2022 to be an in-person event again, in London next year – so start planning your abstract submission after watching Friday’s presentation day and share your work!

There is now less than a month until EARL online 2021 – so grab your tickets now and don’t miss out!


We caught up with Emily Riederer ahead of her presentation at the upcoming Enterprise Applications of the R Language Conference taking place online in September and asked her a few questions…

Hi Emily! So your talk at EARL is called ‘How to make R packages part of your team’, why is it so important to have a coherent internal ecosystem?

There are two reasons in my mind why I get excited about this. First is the most basic level, and the reason I got to thinking about it was as a pure efficiency play – it makes it so you can get the easy part of your job done and you can then look at the harder and more interesting complex parts. It makes work faster and more fun.

More holistically, I’ve realised that having a really solid internal ecosystem can actually be an amazing community building device. We are so lucky to have the external R Community, so having that structure to build off from, really helps to build a similar culture internally.

What problems could you run into if you didn’t have an internal ecosystem set up?

There can be a lot of inefficiencies in people solving similar problems in different ways and using different terms to describe their problems. I think that often large companies have silos – whether that’s a data silo or the structure of data teams. Using different tools can put us in boxes that we perhaps don’t need to be in.

Why do you think the R community is so special?

That’s a tough question! I think there are a couple of things that make it special – especially because R is a unique language – and I think it attracts a certain type of person. If you’re really passionate about R, you’re probably very curious about answering questions with data and maybe came to programming as your second choice – not your first. It’s an interesting group of people that are bound by similar challenges. Also thinking about the R community versus other tech communities, I think we have great informal leadership at the top, that really emphasises the importance of being inclusive and making space for newcomers. Other communities are defined by their ‘stars’ whereas we are lucky to have the whole community on board.

What can people expect to leave your talk knowing?

I hope that I can inspire the audience to think a little more critically and holistically about bringing internal packages into their own organisation in a couple of different ways. If they haven’t thought about it before, I’d like to spark some curiosity – could this work for their teams? I have also learned some hard lessons through trial and error, and for those who have started their journey, I hope I can lay out a more formal structure for them to follow. I want to help them to be able to succeed – we have many great external R packages too, and you don’t want to steal the limelight from those. It’s about finding the right balancing act of using internal packages and using them to solve the unique challenges where nothing else can work.

Are there any developments in the Rstats world or are there any things that are on your list to try or learn?

Where do I start! That’s a good question. A couple of things that come to mind for me are the phenomenal interoperability work that is going on right now – better integration with R and Python and reticulate (even in the tidymodels space) making it equally easy to have one unified interface to hook up the many different back ends of different R packages to a common framework. I think that’s another interesting challenge with internal R packages: you definitely have to find good ways to recruit teams that are mostly Python people or mostly Microsoft Office people. So I think in the internal enterprise setting, all of the interoperability work is super exciting to me because it means we can go places and talk to people that we couldn’t have talked to before.

You can hear from Emily and a host of other Rstats users at this year’s EARL online Conference – tickets for the Friday session are just £9.99.

There are also four workshops in the week leading up to the Friday session – each workshop is taught by one of Mango’s expert Data Scientists and it is £90 for a half-day workshop.


We are proud of the speaker lineup we have at this year’s online Enterprise Applications of the R Language Conference and we are delighted to share that Jacqueline Nolis will be joining us as keynote presenter on Friday 10th September.

Dr. Jacqueline Nolis is a data science leader with over 15 years of experience in managing data science teams and projects at companies ranging from DSW to Airbnb. She is currently the Head of Data Science at Saturn Cloud, where she helps design products for data scientists. Jacqueline has a PhD in Industrial Engineering and coauthored the book Build a Career in Data Science. For fun, she likes to use data science for humour – like using deep learning to generate offensive license plates. Which to us sounds like an incredible way to use deep learning!

Tickets are now on sale for EARL online 2021 – a ticket to Friday’s presentation session, which includes Dr. Nolis’s keynote and a whole host of other talks on how people are using rstats in real-world situations, is £9.99.

Also at EARL online, we are hosting four workshops – all are 2-5pm UK time and £90 each.

  • Introduction to Shiny
  • Package Development in R
  • Functional Programming with Purrr
  • Web Scraping and Text Mining Lyrics in R