North East Data Scientists Group Works As a Professional Group

R Consortium talks to Colin Gillespie (from Jumping Rivers) about how a relatively small area deals with increasing membership, what companies are doing to make their data science teams more efficient, and why we might want to consider how governments view data science.

What is the R community like in Newcastle upon Tyne?

CG: Newcastle upon Tyne is the largest city in the northeast of England. We are a large area with a small population. Correspondingly, our community is rather small. However, we do have a lot of people who are using R. There are a number of world-leading universities in the region (Newcastle University and Durham University). We also have a number of government agencies that use R, such as the Department for Work and Pensions. On top of that, the National Innovation Centre for Data and Jumping Rivers are situated in Newcastle.

The user group started around 2016 with around eight people. Around 2018, we rebranded as a data science user group named North East Data Scientists, and our attendance grew (almost overnight) to twenty people. We still talk about R applications, but we get a larger audience. Not all of our talks are about R, but a good portion of them are. In general, we have a data theme that covers R, Python, and Machine Learning. Newcastle is big enough to have one of everything (R, Machine Learning, and Python), but not big enough to have large single groups.

We have over a thousand members right now. During COVID we had around 25-45 members attending. Over the last six months, we’ve gone back to in-person events at the amazing Catalyst building. We typically have around 40 attendees.

A typical meet-up would have two talks per night, one short and one long. We’ve also just started running short tutorials.

How has COVID affected your ability to connect with members?

CG: Once COVID happened, we had more people coming from around the world. We kept up talks at the same pace (one every two months). We also tended to keep the meetings short (18:00 to 19:30). One big benefit is that we have been able to grab people from around the world to give talks. However, we always try to aim the talks at our group specifically.

Now that we’ve restarted in-person meet-ups, some things have changed (for the better)! We have a great new venue that is home to multiple data companies, including the National Innovation Centre for Data, Jumping Rivers, and DSTL. This has increased attendance and made organizing the meet-up much easier (my office is two floors above the meet-up room).

Can you tell us about one recent presentation or speaker that was especially interesting? What was the topic, and why was it so interesting?

CG: Dean Attali gave a presentation where he took a Shiny app that he had made and went through how to make it better. He showed tricks and other ways to make Shiny apps more efficient. This was an outstanding talk. We didn’t get a recording of it due to time constraints.

What trends do you see in the R language affecting your organization over the next year?

CG: Companies in our area are expanding their data science teams. One of the things they are wondering is how to better implement best practices and how to work in teams with data scientists. This is especially important when you work in small teams. This isn’t necessarily about R, but about data science in general.

Are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?

CG: My favorite one was the Financial Times piece on COVID. It is excellent. It shows the numbers on COVID, density plots, and how well vaccines are affecting infections. It also runs on R. One of the main authors gave a talk at RStudio last year as well.

Of the Funded Projects by the R Consortium, do you have a favorite project? Why is it your favorite?

CG: HTTP testing in R. Maëlle Salmon, who is writing the book, is always excellent. I think that a lot of work is being taught by blog posts, which can be a bit dangerous. So a canonical source would be great.

Of the Active Working Groups, which is your favorite? Why is it your favorite?

CG: I’m keeping my eye on the R Validation Hub. I’m not sure how much progress they are making. Jumping Rivers (the company I work for) often has clients who are interested in the topic. So it would be nice if it were solved through community effort.

There are four projects that are R Consortium Top-Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?

CG: Something around the legal aspects of R. Perhaps working with governments and big companies to demystify R. Because of the potential legal ramifications (for example, can you sell R code?), having a central answer would be great.

When is your next event? Please give details!

CG: Our next meet-up is scheduled for July 14th. But we are also running an in-person conference this October 6-7: Shiny in Production! The conference is going to be an afternoon of workshops on the 6th, followed by a day of talks from experts across a range of industries on the 7th. We’re really looking forward to welcoming people to the Catalyst!

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past four years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!

Apply Here