
Our first speaker at LondonR is Giles Heywood, who works as Chief Data Scientist at Seven Dials Fund Management. As an alternative property specialist, he uses model-driven strategies to support residential property investment. As a user of R for 20 years and author of the ‘its’ package for irregular time series (published on CRAN), he naturally turns to R for all analysis.

Once a proof of concept, his robust and optimised product readily models district, area and regional property trends, cycles and risks. 

How can the right data support property choice?  

There is a growing appetite among investors for real estate alternatives – including student accommodation, senior housing, build-to-rent residential and hotels – which can offer better prospects for income growth. They can also offer better risk-adjusted returns than the traditional commercial real estate segments.

The seven-strong team at Seven Dials Fund Management takes a structured and systematic approach to direct real estate investment, and to indirect investment through funds.

Whilst commercial real estate lacks comprehensive open data on transactions, residential property benefits from transparent and complete data on the crucial variables of transaction price, floor area and income, which make it possible to model the dynamics of affordability.

Our approach is a meticulous analysis of the systematic drivers of return and the regular and often predictable patterns generated in long cycles. For a first-time buyer choosing between a small but expensive property and a larger but cheaper one, the right data could inform the most appropriate choice and show its impact on future progression up the property ladder.

Is modelling in a property-related application fairly unique?   

Although Seven Dials primarily advises institutional clients on large portfolios, some of the most exciting opportunities are in delivering quantitative insights to homebuyers and in particular high net worth investors. We see important synergies or at least significant overlaps between institutional and retail.   

For many, buying their property is one of the most significant financial decisions they will make. Imagine if data science could be used to support decision making in line with their mortgage in the future. The housing market has generally gone up since the key ‘Price Paid’ dataset appeared in 1995; however, in 2008 we saw falls of 15-20% nationwide, and in some areas prices have only recently regained 2007 highs. Both the relative returns and risks can be tracked, modelled and managed. Of course, institutions have models for property risk and return, and had sophisticated models back in 2007 which to some extent failed in the crash. Technology has moved on considerably, aided by t-copulas, non-parametric bootstrap and stress-testing. What our team has done is not to copy others, but to start from the ground up with the best repeat sales indices we can construct, factor risk models, and forecasting consistent with those foundations.
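For readers unfamiliar with repeat sales indices, the idea can be sketched in a few lines of base R. This is only an illustration of the classic Bailey-Muth-Nourse regression on synthetic data, not the Seven Dials model; the names `true_index`, `log_ratio` and `est_index` and all the numbers are invented for the example.

```r
# Sketch of a classic repeat-sales (Bailey-Muth-Nourse) index in base R.
# Assumption: synthetic data only; this is NOT the method described in the interview.
set.seed(1)

n_periods  <- 6
true_index <- c(0, 0.05, 0.12, 0.08, 0.15, 0.25)  # log index; period 1 is the base

# Each row is one property sold twice, in two distinct periods (first < second).
n_pairs <- 500
periods <- t(replicate(n_pairs, sort(sample.int(n_periods, 2))))
first   <- periods[, 1]
second  <- periods[, 2]

# Observed log price ratio log(P2 / P1) = index change plus property-level noise.
log_ratio <- true_index[second] - true_index[first] + rnorm(n_pairs, sd = 0.02)

# Design matrix: +1 in the second-sale period, -1 in the first-sale period.
X <- matrix(0, nrow = n_pairs, ncol = n_periods)
X[cbind(seq_len(n_pairs), second)] <-  1
X[cbind(seq_len(n_pairs), first)]  <- -1
X <- X[, -1, drop = FALSE]  # drop the base period so its log index is fixed at 0

fit       <- lm(log_ratio ~ X - 1)       # regression without an intercept
est_index <- c(0, unname(coef(fit)))     # estimated log index per period

round(exp(est_index), 3)                 # index level relative to period 1
```

Because only properties that sold at least twice enter the regression, the index compares like with like, which is the main attraction over a simple average of transaction prices.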

And how are data science models used for residential property investors now?   

There are some prototypical models on the major portals, and one of the most popular types is the automated valuation model (AVM). We don’t do that, for all kinds of reasons, but it’s very appealing for individuals to get updated valuations on their homes and maybe on others, including those that are not on the market.

What will your talk focus on, and what might be the key takeaways?

My specific contribution to modelling at Seven Dials is projecting relative return within sectors of residential real estate to an investment horizon, using factors. The first factor is the overall market direction, the sort of macroeconomic variable that is quite hard to predict: for example, an unforeseen pandemic did not hold back the market, to the surprise of many. However, relative price performance is more foreseeable since it is essentially driven by microeconomic forces, and in particular by affordability.

In addition to the models’ straightforward price-forecasting applications for homebuyers, the same analytical framework will be familiar to institutional investors and lenders, and can provide strategies for risk-controlled portfolio management.

I’ll take you on a highly focused and structured trip through a stack of three models and show how they both relate to familiar ideas like the ‘ripple effect’ and give precise insights into a long cycle driving relative returns, locally and nationally. Everything is in R, and I’ll link it to some of my package choices for getting both coding and analysis done fast and accurately, or at least I can answer questions about that.

Using R to Model UK Residential Property by Giles Heywood 

Will you be joining us at LondonR? Giles Heywood, who works as Chief Data Scientist at Seven Dials Fund Management, uses model-driven strategies to support residential property investment. In his talk, he will discuss how both the relative returns and risks in property investment can be tracked, modelled, and managed. Join us at LondonR


The next LondonR session will be a workshop focused on ‘Text Analysis in R’.

The workshop will be hosted by Hannah Alexander and Elizabeth Brown on Tuesday 23rd March from 3.30pm-5.30pm (GMT).

In this introductory workshop, we will show you how to get started with analysing text data – from simple manipulation through to sentiment analysis. A good working knowledge of R programming is assumed and familiarity with basic analytic techniques and linear modelling is required.

Tickets to the session are free. If you would like to join, please register for a ticket here.

If you’d like to find out more about our future events, click here.

Later this year, The Enterprise Applications of the R Language Conference will be hosted online from 6-10 September 2021. To stay up-to-date, join the mailing list here.


In 2020 LondonR moved online and whilst we have missed the opportunity to connect face-to-face over a few drinks, it has been great to keep in touch with the R stats community virtually.

The move to online meetups has enabled us to make LondonR more accessible – something we will take forward with us when we move back to in-person meetups – and so we will be streaming and recording all our future live events. If you would like to catch up on 2020’s LondonR meetups, you can find the recordings here.

We are starting our 2021 meetups with a bang! Faris Najo from Tercen will be presenting a talk entitled ‘Empower the Biologist and liberate the Bioinformatician’. Faris will discuss and demonstrate how to “empower” the Biologist and “liberate” the bioinformatician for data analysis using a platform called Tercen. Tercen aims to liberate bioinformaticians by removing them from the operational support and allowing them to concentrate on the algorithmic fundamentals of R packages.

Next on the agenda, we have Jonathan Ng, who will be taking advantage of the remote side of LondonR and joining us live from New Zealand. Jonathan may be familiar to the LondonR crowd as he has previously presented at our annual EARL Conference. He will be sharing career and business tips for data scientists, including advice on ‘how to get paid more working on more meaningful and enjoyable work while creating happier clients’, which we are sure will be an interesting topic to most of the data scientists joining us on the 2nd Feb.

Finally, we will be joined by another far-flung guest, this time from America – Rachael Dempsey. Rachael is from the RStudio team and will be sharing highlights from the recent rstudio::global conference, a 24-hour event which took place on 21st January – so there are sure to be plenty of highlights to cover.

LondonR is open to all, so if you’re new to R and want to get to know the community (which is incredibly friendly) then please join us on Tuesday 2nd February at 5pm GMT. It’s free to join – please register here for a ticket.




We were thrilled to host Hadley Wickham who delivered, as ever, a funny and engaging talk to a packed house at LondonR in August. In fact, to give you an idea of how highly anticipated this event was, tickets to see Hadley sold out in under two hours!

It’s always fascinating for us elder members of the R community, who remember the good old days, to witness the move from academic tools through to commercial adoption and engagement. For many years, R was proposed and rejected by many organisations due to the environment and architecture that existed. We used to spend time trying to work out data sizes and whether things would help.

I remember talking to Hadley at the first EARL in the US about creating toolsets that allowed organisations who didn’t “Love” R to use it and deploy it internally, comfortably. Hadley’s work, and latterly his team’s, has allowed the ecosystem around R to develop from introspection to a wide view of the analytic landscape, and I felt his talk reflected some of these shifts.

Hadley’s insight into the mistakes he has made rang very true when considering the scale of the user base today compared to when he started developing packages. That moment of clarity when you realise that you need to prepare things so that people you don’t know can pick them up and use them efficiently lies at the heart of good programming practice, but is sometimes easily forgotten. This has driven Hadley on to create better and easier codebases that are central platforms and that also spark others’ thoughts and developments.

It was great to hear someone like Hadley acknowledge that innovation isn’t a straight line and that forking and dead ends are essential parts of the process. Attendees we spoke to afterwards prized this message highly, and many seemed to leave with increased confidence to go out and try things without the fear of failure.

All in all a fantastic evening that reinforced just how great the R community is.

If you’d like to view Hadley’s LondonR presentation, you can download it here.


Entering UCL’s packed Darwin lecture theatre on Monday evening, holding a golden ticket so coveted that 400 names remained on a waitlist, the uninitiated could be forgiven for thinking this was the most popular meetup on the planet.

And perhaps, on this occasion at least, LondonR was the focal point of the R universe. Because after all, it’s not every day that one has the opportunity to attend an in-person presentation by the great Hadley Wickham.

If you’ve been living under a rock for the past ten years, Hadley Wickham is Chief Scientist at RStudio as well as an adjunct professor of statistics at the University of Auckland, Stanford University and Rice University. In short, he builds tools (computational and cognitive) that make data science easier, faster and more fun.

As soon as the title slide, “Tidyverse: The Greatest Hits”, was revealed, murmurs began to echo around the lecture theatre. And these murmurs increased in intensity when Hadley dismissed the title as somewhat misleading and instead promised to talk about the “biggest mistakes” that have been made since the tidyverse came into being.

The hushed predictions of many were confirmed when a giant ggplot2 sticker appeared on the screen and the presentation that followed was enlightening, entertaining and introspective in equal measure.

Mistakes Part I

Beginning with his eternal remorse over discovering the magrittr pipe only after the first major wave of ggplot2 uptake, and via a sheepish admission that masking stats::filter() was perhaps overly callous, it wasn’t too long before we arrived at the topic of tidyeval.

This was where Hadley was able to expand for the first time on one of the core messages of his talk: borrowing a quote from software development coach, GeePaw Hill, he advocated the benefits of making as many mistakes as possible as quickly as possible and described how, over the course of several years, he had experienced several “false epiphanies” which ultimately resulted in the creation of lazyeval and subsequently the tidyeval we all know and tolerate today.

Of course I jest; tidyeval is incredibly powerful, and Hadley was unwavering in his conviction that it will be the source of much future progress in R development, highlighting among other uses its fundamental role in innovative interface packages such as dbplyr and dtplyr.

He went on to acknowledge, however, that the number of people who share his passion for the underlying theory of tidyeval is rather small and subsequently reflected on the decision to reveal it to the world while still in its relative infancy. Although initially disappointed by the slowness of the community to warm to the new concepts, he has come to terms with the fact that not everyone will immediately jump onto the quosure bandwagon. Consequently there have been efforts in recent times to increase the accessibility of tidyeval, and we were proudly shown one of the latest developments: the “interpolation”, or “curly-curly”, or (Jenny Bryan’s wonderful coinage) “embrace” operator {{_}}.
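For anyone who has not met the embrace operator yet, here is a minimal sketch of what it buys you. The helper `mean_by()` is hypothetical, written for this post rather than taken from Hadley’s slides, and it assumes reasonably recent versions of dplyr and rlang.

```r
library(dplyr)

# Hypothetical helper: the mean of any column, grouped by any other column.
# {{ }} "embraces" an argument so that dplyr looks it up inside the data frame
# instead of in the calling environment.
mean_by <- function(data, group, value) {
  data %>%
    group_by({{ group }}) %>%
    summarise(mean = mean({{ value }}), .groups = "drop")
}

# Column names are passed unquoted, exactly as in an ordinary dplyr call.
mpg_by_cyl <- mean_by(mtcars, cyl, mpg)
mpg_by_cyl
```

Without the embrace, `group_by(group)` would look for a column literally called `group`, which is the stumbling block that used to need `enquo()` and `!!`.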

The trend towards user-friendliness and, in particular, self-explanatory functionality, is set to continue: we can look forward to the imminent release of tidyr 1.0.0, where the introduction of pivot_longer() and pivot_wider() is sure to delight those of us who never wrapped our heads around gather() and spread(), and to delight the rest of us, who DID eventually get to grips with them but still had to look up the syntax of both every time we wanted to use either.
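As a taste of the new functions, here is a small sketch using a made-up table of prices by district; the data and column names are invented for illustration, and it assumes tidyr 1.0.0 or later.

```r
library(tidyr)

# Made-up example data: one row per district, one column per year.
wide <- data.frame(
  district = c("A", "B"),
  `2018`   = c(100, 200),
  `2019`   = c(110, 190),
  check.names = FALSE
)

# Wide to long: one row per district and year.
# The older equivalent would be gather(wide, key = "year", value = "price", -district).
long <- pivot_longer(wide, cols = -district, names_to = "year", values_to = "price")

# Long back to wide: one column per year again.
back <- pivot_wider(long, names_from = year, values_from = price)

long
back
```

The argument names (`names_to`, `values_to`, `names_from`, `values_from`) say what they do, which is exactly the self-explanatory quality the talk was promising.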

But what about Python?

We couldn’t claim to have hosted a top R event if we hadn’t had some mention of Python from the audience. Hadley took the mandatory “R vs Python” question in his stride, perhaps unsurprisingly given the frequency with which he must face it.

In order to use Python, he argued, one must necessarily learn at least a small amount of programming – enough that someone coming from a purely data science perspective might be discouraged from continuing beyond the earliest stages of learning.

It’s possible to do useful data science work in R without learning any programming at all, and then as greater complexity is required, one can start to learn more about programming and about the language itself. Once someone has reached that point though, it is more a question of what is most suitable for the task at hand, in the context at hand.  And here Hadley animatedly encouraged us to “use Python!” if that was the sensible option.

Mistakes Part II

This sense of unity was a common theme throughout the presentation. Approaching the conclusion, Hadley expressed some regret for, in his view, one of the largest mistakes of all: the decision to denominate a certain group of packages as “the tidyverse”.

The intention, he elaborated, was never to provide a complete-but-isolated paradigm. Putting aside our human tendency to see conflict where there are merely options, there is no “base vs tidyverse” turf war. A tidyverse package can be used, is designed to be used, in exactly the same way as any other R package, i.e. in whichever context works best, with whatever other packages work best.

Hadley cited specific examples such as the effective combination of data.table and ggplot2, praising the utility and speed of the former in conjunction with the visualisation power of the latter. The name “tidyverse” is a blessing and a curse, he concluded.  Powerful as a label for the concepts it represents, but overly evocative of completeness and correctness.
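To show how little friction there is in mixing the two, here is a small sketch of our own (not an example from the talk) that aggregates with data.table and plots with ggplot2, using the built-in `mtcars` data.

```r
library(data.table)
library(ggplot2)

dt <- as.data.table(mtcars)

# data.table does the aggregation: mean mpg and row count per cylinder count.
summary_dt <- dt[, .(mean_mpg = mean(mpg), n = .N), by = cyl]

# ggplot2 does the drawing; a data.table is also a data.frame, so it is passed as-is.
p <- ggplot(summary_dt, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col() +
  labs(x = "Cylinders", y = "Mean miles per gallon")

p
```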

Love for the R Community

In response to a question about how the R community has developed over the years, Hadley described how, at every stage of the community’s slow transition from the original R-Help mailing list, through StackOverflow, and most recently to Twitter and the RStudio Community forums, “asking for help” has gradually become a much easier thing to do.

The openness and friendliness of the community is one of the major strengths of R, and Hadley was quick to praise the community at large, giving R-Ladies a special mention for the work they have been doing in recent years.

After concluding with some enticing hints about where his efforts might be focused in the near future (look out for new and improved vctrs, maybe…) our time was up and Hadley had to leave for his next engagement, but not before hinting at the possibility of another visit when his “travel budget” allows.  But before you rush to put your name down for the next event, breathe steady folks, he’s a busy man and it might be a little while before he’s back over here on the wrong side of the world.

It was refreshing to hear someone like Hadley acknowledge that innovation isn’t a straight line and that forking and dead ends are essential parts of the process. Attendees we spoke to afterwards prized this message highly, and many seemed to leave with increased confidence to go out and try things without the fear of failure.

For those of you who were not lucky enough to get a golden ticket this time, don’t worry, all is not lost.  You can see a recording of Hadley’s presentation here.

And if this inspires you to find out more about the R community, rest assured that spaces aren’t usually quite so keenly fought over. We’re a friendly lot and you’ll probably find you even get a seat at the next event!





My first LondonR took me back to my days at university as UCL hosted us for the evening.

Our first speaker of the night was Mike Smith from Pfizer. Mike had joined us to give a version of the talk that he delivered at this year’s rstudio::conf – ‘lazy and easily distracted report writing in R’. While there was a strong focus on his (wonderful) tidyverse themed t-shirt and his messy kitchen drawer, Mike had some home truths for us – we all get distracted very easily! This is why it’s so important to produce rmarkdown reports that help you to remember exactly what you were doing, not only for future you, or different people – but for presently distracted you!

He also emphasised how vital knowing your audience is, then showed us how easy it is to adapt an rmarkdown report for various audiences by parametrising it. I won’t go into any detail here, but it is definitely something worth looking into if your work (or play) involves producing rmarkdown docs for multiple audiences.
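If you have not seen parametrised reports before, the mechanics look roughly like this. This is our own minimal sketch rather than Mike’s code: the `audience` parameter and the one-line report are invented, and rendering only runs if pandoc is available on your machine.

```r
library(rmarkdown)

# Write a tiny report to a temporary file. The YAML header declares a parameter
# called `audience` (a made-up name) with a default value.
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Report\"",
  "params:",
  "  audience: \"technical\"",
  "---",
  "",
  "This version is written for a `r params$audience` audience."
), rmd)

# Inspect the declared parameters without rendering anything.
front <- yaml_front_matter(rmd)
front$params$audience

# Render the same source for a different audience by overriding the parameter.
if (pandoc_available()) {
  out <- render(rmd,
                output_format = "html_document",
                params = list(audience = "management"),
                quiet = TRUE)
}
```

One source file, several rendered outputs: the body of the report reads `params$audience` and can switch text, plots or level of detail accordingly.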

After Mike’s talk, Laurens Geffert from Nielsen Marketing Cloud showed us how to build a supercomputer using the cloudyr project and AWS. Laurens definitely got the message across that R can be made into a very powerful tool very easily. Something we can all relate to is how Laurens’s code has progressed over the years: from base R, to purrr, then on to furrr! Dropping package names like they were going out of fashion, Laurens introduced us to a suite of packages for parallel computing on AWS, with aws.ec2, future, remoter and the aforementioned furrr the stars of the show. He ended his talk with a call to action: the cloudyr project is looking for people to help with maintenance of its AWS packages (if this sounds like something that interests you then check out the cloudyr project).
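The step from purrr to furrr is smaller than it sounds. Here is a minimal local sketch of the pattern, not Laurens’s code: it runs on two background R sessions on your own machine rather than on AWS, and `slow_square()` is a made-up stand-in for real work.

```r
library(future)
library(furrr)

# Run on two local background R sessions. On a cloud cluster the plan would
# point at remote workers instead; that setup is not shown here.
plan(multisession, workers = 2)

slow_square <- function(x) {
  Sys.sleep(0.1)  # stand-in for an expensive computation
  x^2
}

# Same shape as purrr::map_dbl(1:8, slow_square), but spread across the workers.
squares <- future_map_dbl(1:8, slow_square)

plan(sequential)  # release the workers when finished
squares
```

The appeal is that the mapping code does not change when the plan does, so the same script can be developed on a laptop and then pointed at bigger hardware.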

Our last speaker was Mango’s very own Hannah Frick – providing the low down on all the news from this year’s rstudio::conf. This year’s conference was held in Austin and featured titans from the R community such as Joe Cheng and Hadley Wickham. Hannah didn’t have time to tell us about all the brilliant talks in a half-hour presentation – so I certainly won’t try to do it here. What I can offer is a link to all of the materials from the conference here, and all of the sessions were recorded and are freely available here for your viewing pleasure.

The night ended with a shameless plug for our annual EARL London conference in September (abstract submissions close on the 31st of March!). If you’re looking for a reason to attend, or more likely convince your boss that you should attend, then look no further than this blog post.

All the information from this event and past LondonR events can be found on our website. We hope to see you all again at the next one on the 15th of May – again at UCL. We’re always looking for speakers so please get in touch if you’ve got anything to talk about!


It’s the New Year and we’re kicking off 2019 with our first LondonR! The meetup took place on the 15th of January, and we were delighted to have about 100 people in attendance. With excellent speakers lined-up and a free bar for networking, we started 2019 with a BANG!

Please find all the presentations here.

Dawid Kaledkowski, ClickMeeting – Using sport package to predict speedway results

Dawid came all the way from Gdansk, Poland to talk to us about sport – the R package for online update algorithms that he authored. Dawid introduced us to his passion (and what is apparently the most popular Polish sport) –  speedway –  and to how you can use update algorithms to predict its results. Not only did he build a package to predict those results, but he also spent a good part of 3 years on building the most comprehensive database with speedway results. The talk finished with a lively discussion on algorithm and predictors’ specifics sprinkled with words of admiration for his devotion to speedway.

Colin Magee – R is for Racing

Colin made sure that we kept our racing hats on by sharing his passion for… horse racing! He told us about Ada Lovelace’s weakness for gambling and predicting horse race results before he moved onto his own take on horse race analytics. It turns out – surprise, surprise – R is a fantastic tool for data scraping, munging and feature engineering. And if it wasn’t clear to our audience already, Colin made it obvious in his live-coding demos! Colin is going to share more hints and tips on building accurate predictive models in his upcoming book, ‘R is for Racing’. We’re looking forward to reading it!

Dan Joplin, SparkBeyond – Automating fun: A choose your own adventure talk

It was definitely the most interactive and entertaining presentation of the evening. Dan went above and beyond to make sure that the audience saw all of his 81 (!) slides. He took us through the intellectual journey of how to build a tool that automatically finds the optimal pub-crawl route from point A to B. It would have been quite a typical talk if it was not for Dan offering quiz questions to the audience at various points where we could choose what approach to use in the next stage of the project. So not just entertaining but also very educational.

After the talks, we held our usual networking drinks – it was a great end to another fascinating LondonR. We hope to see you at our future meetups – join our mailing list or check out our website for more information.

Join Us For Some R And Data Science Knowledge Sharing In 2018

We’re proud to be part of the Data Science and R communities.

We recognise the importance of knowledge sharing across industries, helping people with their personal and professional development, networking, and collaboration in improving and growing the community. This is why we run a number of events and participate in many others.

Each year, we host and sponsor events across the UK, Europe and the US. Each event is open to everyone, experienced or curious, and aims to help people share and gain knowledge about Data Science and to get them involved with the wider community. To get you started we’ve put together a list of our events you can attend over the next 12 months:

Free community events


LondonR

We host LondonR in central London every two months. At each meetup we have three brilliant R presentations followed by networking drinks – which are on us. Where possible we also offer free workshops about a range of R topics, including Shiny, ggplot2 and the Tidyverse.

The next event is on 27 March at UCL; you can sign up to our mailing list to hear about future events.

Manchester R

Manchester R takes place four times a year. Following the same format as LondonR, you will get three presentations followed by networking drinks on us. We also offer free workshops before the main meeting so you can stay up-to-date with the latest tools.

Our next event is on 6 February where the R-Ladies are taking over for the night. For more information visit the Manchester R website.

Bristol Data Scientists

Our Bristol Data Science events have a wider focus, but they follow the same format as our R user groups – three great presentations from the community and then drinks on us. If you’re interested in Data Science, happen to be a Data Scientist or work with data in some way then you are welcome to join us.

This year, we’re introducing free Data Science workshops before the meeting, so please tell us what you’d like to hear more about.

The Bristol meetup takes place four times a year at the Watershed in central Bristol. If you’d like to come we recommend joining the meetup group to stay in the loop.


BaselR

This meetup is a little further afield, but if you’re based in or near Basel, you’ll catch us twice a year running this R user group. Visit the BaselR website for details on upcoming events.


OxfordR

As you may have guessed, we love R, so we try to support the community where we can. We’ve partnered up with OxfordR this year to bring you pizza and wine while you network after the main presentation. OxfordR is held on the first Monday of every month; you can find details here on their website.


BirminghamR

BirminghamR is under new management and we are helping them get started. Their first event for 2018 is coming up on 25 January; for more information check out their meetup page.

Data Engineering London

One of our newest meetup groups focuses on Data Engineering. We hold two events a year that give Data Engineers in London the opportunity to listen to talks on the latest technology, network with fellow engineers and have a drink or two on us. The next event will be announced in the coming months. To stay up-to-date please visit the meetup group.

Speaking opportunities

As well as attending our free events, you can let us know if you’d like to present a talk. If you have something you’d like to share just get in touch with the team by emailing us.

EARL Conferences

Our EARL Conferences were developed on the success of our R User Groups and the rapid growth of R in enterprise. R users in organisations around the country were looking for a place to share, learn and find inspiration. The enterprise focus of EARL makes it ideal for people to come and get some ideas to implement in the workplace. Every year delegates walk away feeling inspired and ready to work R magic in their organisations.

This year our EARL Conference dates are:

London: 11-13 September at The Tower Hotel
Seattle: 7 November at Loews Hotel 1000
Houston: 9 November at Hotel Derek
Boston: 13 November at The Charles Hotel

Speak at EARL

If you’re doing exciting things with R in your organisation, submit an abstract so others can learn from your wins. Accepted speakers get a free ticket for the day they are speaking.

Catch us at…

As well as hosting duties, we are proud to sponsor some great community events, including PyData London in April and eRum in May.

Plus, you’ll find members of the Mango team speaking at Data Science events around the country. If you’d love to have one of them present at your event, please do get in touch.

Wherever you’re based we hope we will see you soon.