
R Consortium

Latin R talks about the Trials of Starting a Conference in Latin America

By Blog, Events

R Consortium talks to Yanina Bellini Saibene, Riva Quiroga, and Natalia da Silva on starting up a conference in Latin America, the importance of networking, and some of the difficulties that some people have in different parts of the world.

RC: What is the R community like in LatinR?

NDS: We are growing fast. Under new initiatives, we have grown to over 45 R-Ladies chapters in Latin America. We are now having conferences in Latin America like LatinR, SER in Brazil, and more.

RQ: Many people thought they were all alone. LatinR and different conferences have shown that the groups are connected. We met online without meeting in person. This was possible because we had connections through R-Ladies. This allows those connections to become visible, both in communities and in a broad region. It is growing very fast. For many people, LatinR was their first contact with the R community, and from there they created an R user group (RUG) in their community or started projects.

NDS: The first LatinR was in 2018 in Buenos Aires, and the 2019 edition was in Chile. We hoped to hold the 2020 edition in Montevideo but were unable to due to COVID. We ended up holding it virtually.

RC: How has COVID affected your ability to connect with members?

NDS: The pandemic situation has made it very difficult. LatinR is run by volunteers, and everyone had a lot of things going on. It was difficult to find time to organize a conference in this situation, but we did it. In the end, we had a great conference and we got more people involved in the R community.

RQ: I think another problem has to do with sponsors. If you are organizing an in-person conference, sponsors get something. They have a space for giving away swag or giving a talk, so they can feel the people are getting something for their money. And that is not as easy in a virtual event.

NDS: Not just not easy, it’s impossible to find a sponsor for these conferences. The only way that LatinR has survived is through volunteering.

RQ: Because it’s virtual, we don’t want to charge people for attending the conference. So not too many institutions can give us help for organizing the conference.

YS: However, R Consortium has helped us. They let us use Zoom for hosting the last three days of LatinR. It was a way to help where we didn’t have to pay for it out of our pocket.

RC: In the past year, did you have to change your techniques to connect and collaborate with members?  For example, did you use GitHub, video conferencing, online discussion groups more?  Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?  

NDS: We used Slack to organize the conference. We are also very active on Twitter. It’s basically for promotion and to figure out what is going on. We also have a web page that has a lot of information. We also use GitHub to share information from previous conferences as well as the presentations. We also have a YouTube channel with different presentations. That is the way that we are organizing the materials, by combining all of these tools.

RQ: Slack is also a social space for the members to gather. It’s not only for the organizing team but also for the people who want to attend the conference as well as for people who have attended before. They can participate, ask questions, or learn how to prepare for R conferences. Not just for LatinR, but other conferences like the RStudio Conference. It helps because if you want to give a talk at a conference you can go to the channel and talk to the committee and they can give you some feedback. It’s also a space that is used not just for organizing LatinR but also throughout the year. During the conference, we didn’t have a live stream, but we did have a live Q and A panel where we were able to ask questions.

RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting? 

YS: The fact that we had the conference here and brought some rockstar names from the R community to this part of the world is another huge benefit for South America. We had Alison Presmanes Hill and Maëlle Salmon for the last conference. We had a lot more talks, about reproducibility, data science, and more. It’s all volunteer work and we are proud because it’s done by hand. The fact that these people come and see the community that comes here is amazing.

NDS: Having these people here shows that we can get anyone. Right now, with COVID it’s not quite the same. It’s a lot easier for them as well as us. But, if we can get them in person, the effect will be much bigger on the community. That would be something important for us and the community as well.

RQ: And these talks were useful given the context. Maëlle Salmon gave a talk on making your website using distill. Alison Hill gave a talk about how to learn new things and that was impactful for people who were alone in their home and didn’t have a community in real-life spaces. It was really important because it was something that impacted their lives. We have tried to invite people who will talk about things that are important for our region. For example, when Hadley Wickham came, we asked him to run a workshop on how to make an R package. The idea is to have more people in our region making R packages for the community. More people after the workshop sent their packages to CRAN and had their packages published. It was really interesting, because that workshop was in 2019, and last year we had people who presented packages that they produced because of the workshop. We are trying to think strategically about what people can do that will help the community and help others.

NDS: I think, overall, we had an impact on package production for the R community after this workshop. 

YS: Also, the tutorials we did last year. In Buenos Aires and Chile we had tutorials. Last year we were able to have more because of the virtual part. We had the first two in Spanish and English. Last year we were able to do it in Portuguese. We were also able to have people certified as RStudio and Carpentries instructors teaching the tutorials, which allows us to have nice, high-quality tutorials. This shows that people from our region can give quality training in R. We like to say that we are not just a conference, but rather a community. The conference is a way for us to see each other and show what we have been doing since the last time. We see what works and what doesn’t work, as well as concerns in the community.

RC: What trends do you see in R language affecting your organization over the next year?

NDS: I think that R is changing and we have to pay a lot of attention. Right now, RStudio is involved in the R community and is taking a more major role in it. All these changes mean that we have to keep up to date. We cannot go to sleep and think that the world has changed. We need to keep inviting people to our conference to keep our community up to date. Day to day, RStudio is a companion. People are also using the Tidyverse, and you have to keep learning this all the time. I think that there will be some confusion between R and RStudio, and we need to make sure that the programs work well together.

YS: We have a really big player in RStudio and they lead some of the changes because they make good tools. And we use those tools. But they are a company. We need to make sure that we have a strong community that can ensure that RStudio pays attention to the wants of the community. For instance, more people want better accessibility in R, so we need to make sure that RStudio works on this as well. We need to have a balance because they are using an open-source language.

NDS: Maybe we need to make sure that we are teaching in R and not just in RStudio. Because R came first and is open source and no company runs it.

YS: I think that RStudio needs to be held accountable. One thing that I’m happy about is that the R Community has standards. For instance, the data camps, R-Ladies, AfricaR, and others give people the space to discuss and feel safe doing so.

RQ: We have to be very careful about what we want to showcase at the conference. Right now, the pipe being in base R is the new thing. We are talking about whether we want to use it and how we will do it. This is on the top of our list because there are no resources in Spanish or Portuguese to learn the pipe. Some people are blogging on it. We want to be up-to-date in base R, Tidyverse, data.table, etc. We want to show people a variety of ways to do something. So if you use base R, we need to offer this. If you are involved in Tidyverse and Tidymodels, we need to offer that. If you are a data.table user, we need to offer that as well. That means offering those different ways to address the same problem. Even though there are major players that may have a way to do things simply, we have to be aware that some people may not be able to use them. This is why we need to offer a wide variety of approaches to the same problem.

RC: Do you know of any data journalism efforts by your members?  If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?

RQ: People from Datasketch, a company from Bogotá, Colombia, started making tools for data journalists for people to use without using R. They made a lot of Shiny apps for people to use. If you have data you can upload the data, get a plot, and put it in your report. They are doing a lot. They ran a crowdsource a couple of years ago. They presented their Shiny apps in 2019 and 2020. Mostly they are used by data journalists in Latin America. They are also organizing meetups.

YS: We also have some people from Politics who are building a package that analyzes speeches.

NDS: In this field, there are people from Argentina and Uruguay who are working on similar packages to analyze this type of data.

RQ: We have both people doing data journalism as well as people making packages that would help people do their job.

RC: When is your next event? Please give details!

NDS: LatinR will be on November 10-12, and the previous week we will have workshops. Right now the call for papers is open until July 31.

RQ: People can present in Spanish, Portuguese, or English. We are very open in terms of language. We are a trilingual conference so people can present in any of those languages.

RC: Of the Funded Projects by the R Consortium,  do you have a favorite project?  Why is it your favorite?

YS: R-Ladies. That’s an easy question. For me, R-Ladies changed my life. It is amazing.

RQ: Because of R-Ladies we were able to meet each other and started organizing a conference.

NDS: We exist because of R-Ladies. Without R-Ladies, LatinR would not be and we would never have met.

YS: And a lot of people from the organizing team are from R-Ladies. And a lot of the other user groups have R-Ladies members in them. 

RQ: It has been easier to invite people to the conference because of the R-Ladies network. It was very difficult to ask people about a conference that has never happened before. We asked Jenny Bryan, and she said yes I think because we asked through the R-Ladies channels. 

RC: Of the Active Working Groups, which is your favorite?  Why is it your favorite?

NDS: Certification would be a good one to do. To have a common certification program would be very important. Also, this would be good for academics as well as work. Maybe having some sort of certification would be good.

RQ: Currently, the only way to show that you know anything about R is to take the RStudio certification and pay for that. There is no other way that you can prove that you know anything. Maybe your GitHub would be a way to show that you know something.

NDS: Maybe you can show your GitHub account and show people your work and what you can do.

RQ: Having an R certification would be great!

RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?

NDS: Something about diversity. The language barrier is hard. We have different barriers other than language. You have support for a conference, but it’s not the same for us in Latin America. Each country has its problems. Maybe different areas need different types of support.

YS: Yes. Something as simple as getting reimbursed is different in Latin America. It’s really hard for us. It took me 12 months to get a check for $97 for my chapter of R-Ladies. And I had to pay taxes and pay to get it. This is something that people don’t realize. This is something that you don’t know if you don’t live here. Not all countries in Latin America have postal mail delivery, for instance. Some of them have a central post office where you would have to travel to receive your post, pick it up and pay for it, while other countries in the region may have it. I have to pay a notary, go through customs, and more to get a shirt that someone sent me. These are the types of issues that we deal with. If you want to help people, you need to listen to us. You need to listen to people from this region in the places where decisions are made. The effort has to be made by the R Consortium to make these changes. I already have to work 3 or 4 times as much. I have to learn another language, be able to understand your language, and I have to try to speak it in a way that could be understood. We cannot afford to work on the process as well. A way to streamline support will help us immensely.

 

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past 4 years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!

R Ladies Montevideo, Uruguay, Explains How to Create a High Quality Conference in the Southern Hemisphere

By Blog, Events

R Consortium talks to Daniela Vázquez of R Ladies Montevideo on how they are building community in Latin America and trying to host a conference that people would attend. The initiatives they have done are helping create a sense of community and encourage people from different places to attend conferences together.

RC: What is the R community like in Latin America?

Our community is mostly Spanish speakers, but we also have English and Portuguese members. We have guests, especially keynote speakers, that only speak in English, but most of the talks are in Spanish. We would have rooms for related fields in different languages. The main talks were then translated into different languages. Because most conferences are in the Northern Hemisphere, we tried to have a conference of the same quality as those up north but without the traveling. The idea is to foster community by having a conference where people are close to each other. We also translated the book R for Data Science. We are in a space where we have a community and we share similar idiosyncrasies. When we attend a conference in the Northern Hemisphere, we have the same baseline of interests and act the same.

RC: How has COVID affected your ability to connect with members?

We were pretty active before COVID. In Montevideo, we didn’t have a mandatory lockdown. Luckily, both LatinR and our local R Ladies groups were able to stay active while socially distant. The organizers tend to be very busy, so we don’t meet in person as much. I haven’t even seen my mother except over the fence. We had to stop because we didn’t have time. In the little time we did have, we had to work. I’m a consultant myself, so my work time was very erratic. It was crazy. Everyone was having the same difficulties.

That was one thing, but the other thing was that you felt that we were all together. We were all in the R Ladies Group, and we were meeting regularly and had good communication. Others were not so talkative and were muted with cameras off. We did most of the talks because we did the introductions for the speakers. It was hard for us because it was difficult to build something where people felt comfortable talking. When you have 20 boxes where people are just looking at you, it can be very daunting. We did reach more people because you didn’t have to be in the city – which was good. The bad part was that you didn’t know the people because they were from different countries. On the one hand, it was good that you could take advantage of this. On the other, the people who always came to the meetups didn’t know any of the new people so they were more withdrawn and didn’t interact.

One added benefit was that we were able to invite and bring in speakers from far away. We were able to invite María Teresa Ortíz to talk about geospatial data, which we have never done before so that was good. 

RC: In the past year, did you have to change your techniques to connect and collaborate with members? For example, did you use GitHub, video conferencing, online discussion groups more? Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?  

We have tried a Slack channel where we were encouraging people to join, but we had very little interaction there. This may be a cultural thing. I was a founder of the Buenos Aires chapter, and everyone talked on the Slack channel there. But here it is the exact opposite. We were unable to take these relationships and make them digital. We would talk about one specific project and come up with a solution in person. We haven’t tried doing this online. We also didn’t know about Meetup. People just joined so they knew when the meetings were. It was just popular for people working in the software industry but not for others. Most of our community is academics. It wasn’t easy but we are making progress. Most people are on Twitter so we are using that now. We have a repo where we put the presentations and the materials so that others can review them if they miss the meeting.

RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting? 

We only had one speaker this year, Teresa. She gave a talk on geospatial data. We had a lot of interest in that presentation. It was great, because it was a subject that not many people knew how to use, and it was a way to facilitate many different topics for our members. She was great. It was great we were able to bring her on virtually, due to the pandemic. We are contacting two more speakers to talk about things that we don’t have locally.

RC: What trends do you see in R language affecting your organization over the next year?

One of the biggest issues that our members have is reproducibility. This is mostly because our members work in academia. They need to be sure that the results can be independently reproduced.

RC: Do you know of any data journalism efforts by your members? If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?

I love data journalism and it started to be a thing a few years ago. We have a newspaper that works with people who are specialists in fields like water, and lets them work in their field. They are trying to do things with data, and they are gradually acquiring the skills. That is something that people are improving on.

We also have an initiative called ILDA (Latin American Initiative for Open Data). They are conducting research based on femicide, among other things. Here, it is hard for a homicide to be cataloged as femicide, and they are trying to make the statistics comparable between Latin American countries because all countries have different ways to catalog those. I think they are trying to do some data journalism on that. I don’t know if they are doing any other topics, but they are trying on this front now.

RC: When is your next event? Please give details!

We are planning on having a new event by the end of June. Probably about reproducibility, but we still need to clarify and find the details. It’s not written in stone, and we have to book our slot on R Ladies Zoom.

RC: Of the Funded Projects by the R Consortium,  do you have a favorite project?  Why is it your favorite?

I had to read about that because I wasn’t aware of all the initiatives. I’ve seen tweets about some of them, but I’ve not checked them on Twitter. I like R-Ladies Global, which I think is crucial because, when we have an R User Group, we feel that women feel more comfortable being around other women and ask more questions. It’s been key for the R-Ladies to go to LatinR. Natalia was very interested in the visualization project because she works with that topic. I wasn’t aware of all the things that you do, but it’s awesome!

RC: Of the Active Working Groups, which is your favorite?  Why is it your favorite?

I was interested personally in R Certification. I think it’s pretty cool and would love to have that available. It was something that I was looking at before.


R Consortium Diversity & Inclusion – Speaker Nomination

By Blog, Events

By Samantha Toet, Derrick Kearney, Sydeaka Watson, Gwynn Sturdevant, Kevin O’Brien, and Joe Rickert 

We’re trying something new and we want your support. 

One of the goals of the R Consortium Diversity and Inclusion project is for R Consortium-affiliated events to be more representative of the wider R community. To that end, we put together a community form for you to nominate your peers to speak at upcoming R Community events. This is a great way to promote the work that your colleagues are doing, and draw speakers from varying levels of expertise.

We are working with several R-related conferences and events that are seeking recommendations for knowledgeable and engaging speakers on topics that are of interest to the R Community. Our aim is to encourage speakers from diverse backgrounds to consider speaking at R events, and we would like to build a platform to bring these potential speakers to the attention of conference program committees. 

Fill out the form here: https://forms.gle/oshLzCmWm1FWduVY6

Screenshot https://benubah.github.io/r-community-explorer/rugs.html

About the R-Consortium R Community Diversity & Inclusion Project

The goal of the R Community Diversity and Inclusion Project (RCDI) is to broadly consider how the R Consortium can best encourage and support diversity and inclusion across a variety of events and platforms. Anyone is welcome to join our team, and you can find more information about joining here: https://github.com/RConsortium/RCDI-WG

Stephan Kadauke from R/Medicine talks to R Consortium about Racial Bias

By Blog, Events

Stephan Kadauke, MD, PhD, is an Assistant Professor of Clinical Pathology and Lab Medicine. He runs the Cell and Gene Therapy Lab at the Children’s Hospital of Philadelphia (CHOP) and also sees patients on the Apheresis service. As the Lead of the Cell and Gene Therapy Data Operations team, he has implemented several apps that are being used internally for clinical operations. He co-founded the CHOP R User Group, and he is the chair of the R/Medicine Organizing Committee.

R Consortium talks to Stephan Kadauke on the topic of racial bias in algorithms. We also talk about how an overarching collaboration of R/Medicine and R in Pharma may help solve some difficult problems facing those in healthcare who use the R ecosystem.

RC: What is the R community like for R/Medicine?

The odd thing is that we are not all R programmers and we are not all health care professionals. After the R/Medicine 2020 conference, we looked at the demographics of our attendees and found out that 16% were practicing physicians. So a large contingent of our community actually takes care of patients. It’s also pretty international! Because of COVID, we switched from in-person to virtual in 2020. The previous two years, we had the R/Medicine conference at Yale (in New Haven, CT), then in Boston. We had about 100 people attend both of these. In 2020 we went virtual and grew five-fold, had participants from 43 different countries, and 1/3 of attendees were international. R/Medicine went global. 

RC: How has COVID affected your ability to connect with members?

Things have changed and for better or worse they aren’t going back. We can’t have an exclusively US-centric conference anymore. I missed the interpersonal connections of a real live in-person event, and we did have virtual Birds-of-a-Feather sessions in Zoom breakout rooms, but there is only so much you can do to try to replicate the in-person experience. For our next event, we plan on using a platform that helps with these interactions, but nothing approximates the in-person experience. Of course, we don’t want to do this in a way that loses the virtual experience or the global community. This is a very difficult situation and one we haven’t resolved. Once the COVID pandemic is history, it would be really cool to do a hybrid-distributive conference where there are lots of small watch parties in various places to have a virtual and in-person event globally. I don’t know if we will pull it off but that would be awesome.

RC: In the past year, did you have to change your techniques to connect and collaborate with members?  For example, did you use GitHub, video conferencing, online discussion groups more?  Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?  

For R/Medicine 2020 we used Crowdcast. I think Crowdcast is awesome. Compared to some other programs, Crowdcast is more limited because you don’t have an exhibit hall, multitrack, poster board, and those types of things. Instead, everyone is dropped into a single-track video stream with a chat window on the side. By removing degrees of freedom there are fewer user experience possibilities to consider for the org committee, and it also tends to lead to a more cohesive feeling for the participants. I was a big fan of using Crowdcast and we will use it again this year. For the Birds-of-a-Feather sessions, we used Zoom, which worked OK. For 2021, we’re looking into an alternative platform that does a better job allowing the kinds of chance or planned encounters and interactions that happen at a real live conference. Behind-the-scenes communication is super important too. We used a closed Slack channel, which was efficient for conference planning and putting out fires when necessary. Much of the planning relied on Google Docs. This stack I think is pretty common in planning online conferences these days.

RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting? 

Racial bias in medicine is such an important issue, especially when you’re using the electronic health record to feed predictive models that will have some kind of downstream effect on patient care. Both of our keynotes for R/Medicine 2021 will discuss this topic. There’s a paper in Science that looked at a widely used algorithm that is supposed to identify patients who are at high risk for disease and need extra care. They found that the algorithm disproportionately selects White patients for extra help, and if you’re Black you have to be, on average, much sicker to be selected. This is because the algorithm used health costs as a proxy for health needs, and because less money is spent on Black patients, it falsely concludes that Black patients are healthier than equally sick White patients. This is insane, and probably just the tip of the iceberg. But algorithmic equity really is a hard problem! For example, do you want your algorithm to consider a person’s race or not? It may seem to make sense that you want it to be color-blind, but then you can’t correct for biases embedded in the training data. Another issue is the trade-off between accuracy and fairness. Predictive algorithms nowadays exclusively optimize for predictive accuracy. How can we make them consider measures of fairness? This is something that Michael Kearns discusses in his book The Ethical Algorithm but something that we aren’t talking about in mainstream AI yet. You can mathematically define equity, and you can design your algorithm so it optimizes for a specific trade-off between accuracy and equity. We need to talk about this. I want people to get more fired up about this. I think the R community is one of the wokest communities. We should be the leaders in ethical AI! When will tidymodels support socially aware modeling? There are so many fields where algorithmic bias can screw with people’s lives. 
Cathy O’Neil did an awesome job capturing this in her book Weapons of Math Destruction, which is both amazing and depressing at the same time… to see where machine learning algorithms are systematically discriminating against minorities in law enforcement, lending, and health care. Especially in health care where we “first do no harm”!
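The cost-as-proxy problem described above can be illustrated with a toy example. What follows is a minimal sketch in Python, not the algorithm studied in the Science paper: the patient data, the group labels "A" and "B", the 45% spending gap, and the helper names make_patients, select_top, and summarize are all assumptions invented for illustration. It ranks the same patients once by cost and once by need, and compares who gets selected for extra care.

```python
from statistics import mean

# Hypothetical toy data: each patient is (group, health_need, annual_cost).
# Assumption for illustration only: at the same level of need, 45% less
# money is spent on group "B" patients than on group "A" patients.
def make_patients():
    patients = []
    for need in range(1, 11):  # need scores 1..10, one patient per group
        patients.append(("A", need, 1000 * need))
        patients.append(("B", need, 550 * need))
    return patients

def select_top(patients, key_index, k):
    """Return the k patients with the highest value in column key_index."""
    return sorted(patients, key=lambda p: p[key_index], reverse=True)[:k]

def summarize(selected):
    """Per group: how many patients were selected and their mean need."""
    out = {}
    for group in ("A", "B"):
        needs = [p[1] for p in selected if p[0] == group]
        out[group] = {"n": len(needs), "mean_need": mean(needs) if needs else None}
    return out

patients = make_patients()

# Proxy target: select the 6 patients with the highest cost.
by_cost = summarize(select_top(patients, key_index=2, k=6))

# Direct target: select the 6 patients with the highest need.
by_need = summarize(select_top(patients, key_index=1, k=6))

print("ranked by cost:", by_cost)
print("ranked by need:", by_need)
```

In this toy data, ranking by cost selects five group A patients with a mean need of 8 and a single group B patient with a need of 10, while ranking by need selects three patients from each group. The gap comes entirely from the spending assumption built into the data, which is the mechanism the interview describes.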

RC: What trends do you see in R language affecting your organization over the next year?

How do you put R into production? How do you make clinical-grade R applications? There are some major efforts trying to establish standards and guidelines, both for software engineering and validation, and this will really help, but as of yet there isn’t a consensus. This year, we are trying to gather ideas from people in the field who are thinking about this professionally. We’re planning to have a fireside chat with folks who have made R based applications that undergo regulatory scrutiny and folks who made frameworks for production-grade Shiny apps. Hopefully, eventually we’ll be able to offer a workshop to package some of these best practices into a teachable format. We hope this will lower the bar for creating good clinical software. 

RC: When is your next event? Please give details!

R/Medicine 2021 will be held August 24-27. The org committee is working full steam on the program and logistics as we speak. R Consortium is helping tons with that, as is the Linux Foundation. We will have two days of workshops and two days of presentations. This will be all virtual. Registration is open now, and it’s all-inclusive, with all workshops and videos, for $50. If you are an academic it will be $25. All students and trainees can attend for $10. If this price point presents a financial hardship, we can talk about that as well. We are trying our best to keep the barrier low for attending. I’m biased, of course, but I think it’ll be an awesome event – we have some great workshops, keynotes, and sessions lined up!

RC: Of the Funded Projects by the R Consortium,  do you have a favorite project?  Why is it your favorite?

I’m a big fan of R Validation Hub! Validation is really important when it comes to clinical software – you really have to be able to show, to a reasonable degree, that what you think your software does is what it actually does. Another important part is software engineering for reliability, scale, and usability. When we think about best practices, we think about software engineering as well as validation, and how we bring those together is an important question.

RC: Of the Active Working Groups, which is your favorite?  Why is it your favorite?

I am a big fan of R in Pharma. The R in Pharma conference last year was amazing and had tons of great workshops and speakers. Thematically, R/Medicine is really well aligned with R in Pharma. Of course not all of medicine is pharma, and arguably not all of pharma is medicine, but we do have a lot of overlap.

RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?

I would love to have an “R in Healthcare” project and working group to capture some of the synergies between the R/Medicine and R in Pharma communities. This working group could hammer out some of the issues that are shared between pharma and medicine, including the question of how to build R-based apps that can withstand regulatory scrutiny, as well as some of the important societal issues with algorithmic bias that we talked about earlier. I think the funding could also go to fill some of the gaps in the R ecosystem that healthcare researchers face – for example, building CONSORT diagrams with ggplot2, or creating high-level functionality to de-identify sensitive microdata. It would be great to have R Consortium funding for one or two engineers building open-source solutions here.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past 4 years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!

Working with Databases in R – Video Presentation from NairobiR and R-Ladies Nairobi

During this Working with Databases in R online presentation, Christopher Maronga shares his years of practical experience in accessing and working with databases in R. R Consortium assisted by providing access to Meetup.com Pro as a platform for information sharing.

John Mutiso, a statistician and member of Nairobi R, introduces the presentation on Working with Databases in R and points to the support of the R-Ladies Nairobi Organizers and NairobiR Organizers. 

Christopher Maronga, a data manager, then shares his hands-on experience of how we can turn R into a powerful tool for accessing MySQL databases, writing SQL code, and pulling and querying data from within the R environment.

Maronga structured this session to be mainly hands-on learning, using coding examples and implementations. In his presentation he teaches how to efficiently connect R to an RDBMS, query data stored in the RDBMS via R, connect to and export data from REDCap, and handle API security. He introduces what an RDBMS (relational database management system) is and how it is used for the storage and management of data. He then goes on to explain REDCap, a secure web application for building and maintaining online surveys and databases. Maronga then jumps into practical examples illustrated through the use of a local SQL database.
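The code from the talk is not reproduced in this post. As a rough, language-neutral illustration of the connect, query, and pull-results-back pattern described above, here is a minimal sketch in Python using the standard-library sqlite3 module against an in-memory database. The table name, columns, rows, and helper function names are all made up for illustration; the talk itself used R against a local SQL database, where the DBI package provides a comparable connect-and-query interface.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with a hypothetical `visits` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (patient_id INTEGER PRIMARY KEY, age INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", [(1, 34), (2, 71), (3, 58)])
    conn.commit()
    return conn


def load_visits(conn, min_age):
    """Return (patient_id, age) rows for patients at least `min_age` years old."""
    query = "SELECT patient_id, age FROM visits WHERE age >= ? ORDER BY patient_id"
    # The ? placeholder lets the driver bind the value, rather than pasting
    # it into the SQL string by hand.
    return conn.execute(query, (min_age,)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    print(load_visits(conn, 50))  # [(2, 71), (3, 58)]
    conn.close()
```

The point of the pattern is that filtering happens in the database, so only the rows that are needed travel back into the analysis environment.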

In the concluding section, it is emphasized that just knowing how to get data into R efficiently is half the battle in the path towards using R in data science. The speaker ended the session on the note that we should expect to see many more collaborations between R-Ladies Nairobi Organizers and NairobiR Organizers in the future.

We’re looking forward to it!

Full video here

R Conferences that Serve a Diverse Population

The R Conference has been able to thrive with the changes that have occurred due to COVID. They also are planning for life post-COVID, with the lessons that they have learned from a digital world working their way into how they organize conferences going forward. R Consortium talked to Jared Lander about the issues with online conferences, how they have seen increased attendance, and how they will incorporate them into a new hybrid system.

RC: What is the R community like at the R conferences in New York and DC?

Each conference is a microcosm of the area that it is located in. We see all fields at the conference since all groups come together.

For the New York conference, it’s mostly people from the metro area, but since New York City is a hub people will stop by and attend. People are always visiting beyond the geographic area. People from Europe come and talk. We get a lot of people we wouldn’t otherwise because it’s a hub. 

The DC conference is a government conference. It’s a way to focus on the DC community and their interests, and it centers on government and public life: military personnel talking about military matters, intelligence officers talking about working behind a secure network, economists working with economic data, and teachers talking about how to analyze student data.

RC: How has COVID affected your ability to connect with members?

We used Hopin for the conference. This was a real resource drain for a lot of the attendees. I needed to run two computers to keep it going. We had to instruct people to turn off all other programs and run Zoom from the browser. It worked decently, had a good stage area, a good chat room. However, our conference has always been very lively. We normally had walk-on music for the conference. We had a visiting professor tell students at the conference to not expect this at any other conference and that this one is not normal. 

To try to replicate that at the virtual conference I played walk-on music through my speakers. It was okay. We were able to get a mathematical comedian to attend, and we got a whiskey and rum producer to give a lesson on their products and offer attendees a discount on them.

Usually, the speakers are local people or work for companies that let people travel to give talks. When we went virtual we could get a lot more people. We got Rob Hyndman to give a talk because, with no travel involved, he could. Going forward, we plan on taking a hybrid approach so that people can attend from anywhere.

All of our content is available on our website.

RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting? 

Andrew Gelman was great. He went up there with no slides and talked for 40 minutes. He is the only person I know of who could do that. This talk was on open science and how to make it work. You need to publish your data, methods, and code to make this work.

For the government conference, there was Graciela Chichilnisky, who gave a talk about carbon offsets and went over how they helped at the time and how they would work going forward. She went over how carbon offsets can save you money and help the climate at the same time.

RC: When is your next event? Please give details!

The New York Conference is going to be September 8-10 and will be HYBRID! We will have speakers announced soon. We have 8 to 10 planned right now. We are very excited. It’s going to be in Midtown East, and it will be online as well. Our R in Government conference will also be both in-person and virtual. It will be held in early December, but there are no dates yet.

RC: Of the Funded Projects by the R Consortium,  do you have a favorite project?  Why is it your favorite?

CVXR is such a nice way to do optimization methods, and it’s so explainable, and it can do quasi complex programming and not just linear programming.

RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?

I would like to add something that would allow vendors to support R better. We have people like database companies and people like Nvidia who have APIs for every platform but R. They say there will be community support because there is no market for it. Even companies that have an R API do not update it as much as their other APIs. They tend to only update the tools that are for their target area, not realizing that there is a market for people who work in R. It wouldn’t even be that hard since R was written in C, so all you would have to do is modify an existing C API.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past 4 years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!

New York City’s R Community’s Diverse Population Provides a Depth of Speakers

R Consortium talked with Jared Lander about the R Community in the New York Metro area. The R Community’s ability to draw upon not only their large metro area but also their status as a hub city has allowed them to have many different speakers. This has allowed them to grow the community to an impressive size.

RC: What is the R community like in New York?

We are not just the City, but we have a broad reach. We have an active and vibrant community that is very enthusiastic. We have everyone from Brooklyn hipster data scientists with MacBooks riding a fixie [Editor’s note: A type of bicycle with just one gear, considered cool and stylish] to office workers in slacks and ties just off from work, and everything in between.

We have old school data scientists who consider themselves hardcore statisticians, people just learning the field, people with 30 years of experience, and people with one year of experience dipping their toes into a different field. We have so many people that the topic of the meeting determines who shows up. If we have a finance talk we will get finance and insurance people. Marketing topics will bring in marketing and business people. If we have a pharmaceutical talk we will get biologists and biostatisticians.

We also have talks on topics like racing, military, doctors, and lawyers. We have talks and members from all walks of life. We have over 12,200 members. We usually have a core group that shows up to meetings, but others will come because a topic speaks to them. Some won’t come for 6 months till another topic interests them. Attendance comes down to the meeting.

RC: How has COVID affected your ability to connect with members?

We lose the in-person connection. We usually have people show up at 6:15-6:30pm for pizza and (budget allowing) drinks. Usually, people meet for 30 or 40 minutes and talk, socialize, and start making friends. We have a speaker go on for 45 minutes or so. We then go to the bar to hang out, whether or not people drink. We get a great lecture, the best night school, that’s free, sandwiched between hanging out with people who are interested in your field or something close to your field.

Even though we are missing out on that, we are meeting every month. Zoom fatigue is setting in, and we are losing people. So we do clearly lose out on that social interaction. In-person meetings can have several people having a chat, while a Zoom meeting can only have, at most, two people talking at a time. It becomes a one-way lens.

People chat in a Slack room where they can ask their questions. However, I prefer in-person meetings, where people can nod to show understanding or sit next to someone and whisper to them. We are missing out on that.

We were interested in using gather.town and spatial.chat, but were unable to. Due to meetup.com restrictions, we are forced to use Zoom. 

However, the nice thing about going virtual is that we got a worldwide attendance from those who wouldn’t otherwise be able to attend. 

Going forward, we will have the in-person meetup (hopefully soon) and Zoom as well. This way, people can participate online and via Slack as well as in person. People can also access all of our meetings via our website or our YouTube channel.

RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting? 

Will Landau wrote the drake package and the targets package. Targets is a real game changer at my work. It makes my life so much easier because it tracks dependencies so that steps that are already up to date are not rerun, and it can run tasks in parallel. I was able to get Will to give a talk after he attended another meetup. I have done that several times actually, where people who wrote packages attended, and I have been able to get them to speak at future meetups.
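To make the "don't rerun what is already up to date" idea concrete, here is a toy sketch in Python. It is not how targets is implemented and does not use its API; the `Pipeline` class, the step names, and the hand-maintained `version` label (standing in for "this step's code changed") are all assumptions made for illustration, and parallel execution is left out entirely.

```python
import hashlib


class Pipeline:
    """Toy dependency-aware pipeline: a step reruns only when its inputs change."""

    def __init__(self):
        self.steps = {}  # name -> (function, dependency names, version label)
        self.cache = {}  # name -> (fingerprint, result)
        self.runs = []   # names of steps that actually executed, in order

    def add(self, name, func, deps=(), version="1"):
        # `version` is a hand-maintained stand-in for "the step's code changed".
        self.steps[name] = (func, tuple(deps), version)

    def make(self, name):
        func, deps, version = self.steps[name]
        # Bring upstream steps up to date first (assumes there are no cycles).
        inputs = [self.make(dep) for dep in deps]
        fingerprint = hashlib.sha256(repr((version, inputs)).encode()).hexdigest()
        cached = self.cache.get(name)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]  # up to date: skip the work
        result = func(*inputs)
        self.runs.append(name)
        self.cache[name] = (fingerprint, result)
        return result


def double_all(values):
    return [2 * v for v in values]


if __name__ == "__main__":
    p = Pipeline()
    p.add("raw", lambda: [1, 2, 3])
    p.add("scaled", double_all, deps=["raw"])
    p.add("total", sum, deps=["scaled"])
    print(p.make("total"), p.runs)  # first build runs every step
    print(p.make("total"), p.runs)  # second build runs nothing new
    # Changing only the middle step reruns it and its downstream step, not "raw".
    p.add("scaled", lambda values: [3 * v for v in values], deps=["raw"], version="2")
    print(p.make("total"), p.runs)
```

In this sketch a step is considered stale when its version label or the values flowing into it change, which is enough to show why editing one step of a long analysis does not have to mean rerunning the whole thing.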

RC: What trends do you see in R language affecting your organization over the next year?

It’s getting more respect. People are starting to learn that R is a full production language, and I have used it in many sectors. We are getting more people who are learning it for a job, and they come and visit us, and they learn that there is more to it.

RC: Do you know of any data journalism efforts by your members?  If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?

We have a lot of members who are journalists who either use data to figure out what’s going on and/or use data to inform their readers. We have members from the AP, the NY Times, the Wall Street Journal, Bloomberg, and more. It’s so cool seeing this because they want to use data to find out what’s going on. They love it, and it’s so great to see these people who are using the code to find the outcome.

RC: When is your next event? Please give details!

Our next meetup, on June 15, is a talk by Emil Hvitfeldt on text mining. The July 13 meetup will feature Sean Taylor, with no topic chosen yet.

RC: Of the Funded Projects by the R Consortium,  do you have a favorite project?  Why is it your favorite?

SFDBI, because I do a lot of geospatial work. I have always asked people: I can work with geospatial data in PostGIS, and I can work with geospatial data in R, so can I use PostGIS from R? I’m hoping that I can do that now.

RC: Of the Active Working Groups, which is your favorite?  Why is it your favorite?

Distributed computing. Most things don’t need it, but we had a speaker who worked on Ballista (now merged with Arrow DataFusion), which is a language-agnostic distributed computing program that could be integrated into R. This could push companies like NVIDIA to port cuDF to R.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past 4 years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!

Bioconductor Asia Membership Increasing Due to Going Virtual

R Consortium talked to Bioconductor Asia co-organizer Matt Ritchie about the upcoming conference (BioC Asia 2021), how COVID has affected attendance at the conference, and how they deal with multilingualism at their meetings.

RC: What is the R community like for Bioconductor Asia?

Compared with other Bioconductor groups, such as those in Europe and America, Bioconductor Asia is quite small. We are building up, and are grateful to receive R Consortium support for our event. The first BioC Asia was held in 2015 as a satellite of the GIW/InCoB meeting in Japan. The program included a workshop day, followed by a mini-conference where people could learn about Bioconductor and present their research. Martin Morgan (project lead at the time) gave a talk and an introductory workshop on Bioconductor to around 30 participants. In the following years, we organized our meetings as satellite events to the larger ABACBS annual conference in Australia (Brisbane 2016, Adelaide 2017, Melbourne 2018 and Sydney 2019). In 2020, we were planning an in-person meeting in China hosted by Tsinghua University and were expecting a large audience of around 200 attendees. Somewhat unexpectedly, after we went virtual we ended up with 400 registrants. Thanks to the generous support of our sponsors, BioC Asia 2020 offered free registration and more people could participate in the workshops than ever before. In 2021, we will again run our event virtually, with lead organizer Kozo Nishida from RIKEN in Japan. We are also keen to have conferences in other countries. Last year we had several people from South Korea attend, so maybe future BioC Asia events can be hosted virtually (or in person) by South Korean researchers.

RC: How has COVID affected your ability to connect with members?

We canceled the in-person conference and went virtual. Zoom was the main tool that we used to communicate, and we set up a #biocasia2020 channel on the Bioconductor community Slack for further discussion. Going virtual let us increase attendance in the workshops: we had 100-plus people online, versus 30 or so for in-person workshops, which are often limited by physical space. For the conference, we used the Orchestra platform that Sean Davis developed for running workshops in the cloud. We recently used this platform to run virtual training in Africa, so it’s been well tested now in different parts of the world and can scale up or down very easily. We are likely to keep a hybrid option in the future as it is more accessible for students and people without a travel budget. We want our meetings to be as accessible as possible. It’s also nice to have a meeting in your general time zone as opposed to one that is scheduled for when you were hoping to be asleep, which is often the case for meetings based around other time zones.

RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting? 

I enjoyed the presentation by Koki Tsuyuzaki on tensor decomposition and the work that has gone into applying this approach to single-cell data analysis in the scTensor package (a video of this presentation is available from the Bioconductor YouTube channel). F1000Research, who also sponsored our meeting, hosted the slides from this presentation which were accessible before and after the talk so people could easily follow along at home. F1000Research also allows hosting of poster presentations, which is great for virtual poster sessions.

RC: What trends do you see in R language affecting your organization over the next year?

The growing influence that the tidyverse is having on the Bioconductor project, with software such as tidybulk and plyranges applying these principles to genomic data analysis. Both packages have been developed by researchers based in our region, and it will be exciting to see further applications in the future.

RC: Do you know of any data journalism efforts by your members?  If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?

Although not directly related to Bioconductor, the work led by Rafael Irizarry that I saw presented at an AMSI Bioinfosummer public lecture in 2019 on estimating the number of excess deaths in Puerto Rico in the aftermath of Hurricane Maria (bioRxiv preprint here and news story here) is very inspiring. They modeled data to look at excess deaths and found that the government-reported figures grossly underestimated the real impact of the disaster, which lasted long after the hurricane had ended.

RC: When is your next event? Please give details!

BioC Asia 2021 will be held Nov 1-4 as a virtual event. The lead organizer is Kozo Nishida from RIKEN, who represents our region on the Bioconductor Community Advisory Board. This is the second time the event is hosted in Japan, albeit virtually this time around.

RC: Of the Funded Projects by the R Consortium,  do you have a favorite project?  Why is it your favorite?

An early one funded in the 2016 round: ‘Software Carpentry R Instructor Training’ by Dr Laurent Gatto was a really valuable contribution for teaching countless people how to use R. Software Carpentry is an amazing platform for onboarding new users, and Laurent and colleagues are currently planning a new curriculum that introduces the world of Bioconductor using this approach.

RC: Of the Active Working Groups, which is your favorite?  Why is it your favorite?

R / Medicine, as a lot of Bioconductor tools are used in clinical research and there is a lot of interest in Bioconductor from that sector. R / Medicine has an annual conference that will be held virtually this year on August 24-27th.

RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?

More support for teaching R in other languages would be great for our work. At BioC Asia 2020 we ran workshops in Mandarin, which proved very popular. We have also published RNA-seq analysis workflows in English and Chinese. It would be great to see more multilingual vignettes and workflows so that people can learn about different packages in whatever language suits them best. We are looking at redeveloping the Bioconductor website and aim to have key landing pages and training material translated into different languages. Adding closed captioning in English to talks can also improve accessibility.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past 4 years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!

R-Ladies Philly – Building our Online Community During the Pandemic

Authored by: the R-Ladies Philly organizing team

Since the start of the COVID-19 pandemic in March 2020, R-Ladies Philly has shifted from local in-person meetups to virtual events. Hundreds of local, national, and even international R enthusiasts joined us for monthly virtual meetups and social activities! We organized more than 10 R workshops and co-hosted a datathon with local partners. We even launched a YouTube channel to make our workshop recordings widely available. Based on feedback from our members, this has been very successful despite the difficulties associated with COVID-19. In this post, we share a bit more information on our events and what has worked for our group.

Workshops

Starting from April 2020, we organized 11 R workshops, ranging from basic data cleaning and visualization to more advanced R usage like R package development and popular topics such as machine learning. One of our goals is to celebrate gender diversity in the R community by highlighting different speakers. We also aim to engage R users at all levels and encourage speakers to share their learning experiences. We made all workshop content and notes captured during Q&A available to the public through our YouTube channel and event recaps on the R-Ladies Philly blog. See below for links to our workshop events:

  1. Tidyverse (April 2020) – Kelsey Keith
  2. A/B Testing in R (May 2020) – Elea McDonnell Feit
  3. Introduction to R lightning talks (June 2020)
    • 6 R Packages to Add to Your Workflow – Jinyi Kuang
    • How to install R and RStudio for the first time – Tess Cherlin
  4. How to Test and Roll (July 2020) – Elea McDonnell Feit and Ron Berman
  5. Hands on Machine Learning (August 2020) – Jaclyn Taroni
  6. Decision Trees & Random Forests (September 2020) – Trang Le and Karla Fettich
  7. Flexdashboard and debugging with shinyobjects (October 2020) – Jake Riley
  8. Your first R package in 1 hour (November 2020) – Shannon Pileggi
  9. Data Visualization (December 2020) – Trang Le
  10. Introduction to R dashboard development with Shiny Dashboard (January 2021) – Anastasia Lucas
  11. From Learn-R to Teach-R (February 2021) – Ama Nyame-Mensah, Cass Wilkinson Saldaña and Silvia Canelón
A screenshot from our panel event that is posted on YouTube. Here we used slido.com to facilitate a panel session with R educators. The top upvoted questions are: “How can I teach my coworkers R?”, “Learning what to google when something went wrong was such a big factor in my own learning curve. Any advice for helping beginners learn what to search?” and “How do you come up with good active learning opportunities, or examples?”

Datathons

We asked Datathon 2021 participants how they would describe the experience in one word or phrase. This word cloud visualizes their responses. Teamwork, learning, and cats feature prominently.

Every year, R-Ladies Philly organizes a datathon that aims to bring R enthusiasts and community organizations together to create insights through data and give participants exposure to new techniques, real-world data, and a diverse group of local data science professionals. These datathons consist of in-person kickoff and conclusion events, and 6 weeks of online collaboration in between. 

The pandemic caught us in the middle of our 2020 datathon, which was a collaborative effort with other local data groups to help address the opioid epidemic in Philadelphia, so the conclusion meetup had to change format from in-person to online (see a recording here). 

In 2021, we had to switch to a fully online format, where the kickoff meetup was held via Zoom and participants organized themselves into groups through breakout rooms. Our 2021 datathon explored judicial patterns in Philadelphia courts, including bail, sentencing, and the concept of ‘judge harshness’. Participants worked together using Zoom, Slack, GitHub, and a shared Google doc for Q&A that also allowed the partner organization to answer questions asynchronously. Our conclusion meetup (view the recording here) showcases some of the highlights of this year’s datathon findings and the work that participants have put into analyzing a large and complex real-world dataset. 

Insights from a virtual year

Overall, we are looking forward to returning to our pre-pandemic format when it is safe to do so. Still, being forced to adapt our approach has had some benefits. We were able to reach a broader audience that was not previously able to travel to our events. We were also able to try new event formats and new technology. For example, we held two virtual social events where we experimented with different formats to get to know each other remotely. We also used breakout rooms in Zoom for our datathon, and online tools like sli.do for polls and Q&A sessions at our panel and datathon events. We also tried to keep our events as interactive as possible, with lively chats and Google Docs used to track and answer all participant questions during workshop events. These practices will be useful for all future workshops, whether virtual, in-person, or hybrid.

We are looking forward to continuing to build our online presence with more YouTube videos and blog posts, even when we are able to meet again face to face. If you are interested in joining us, please look for upcoming events on our meetup page. We are also seeking volunteers to plan and lead hands-on workshops for the remainder of 2021. Please learn more by visiting our website.

Some photos of our in-person meetups before the pandemic. We look forward to eating pizza together again one day.

Tidyverse Overtakes For Loops in Melbourne, Australia

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. The wealth of knowledge in the community and the drive to learn and improve is inspiring. We had a chance to talk with Jiun Siew, Data Scientist and organizer of R User Group Melbourne, to find out more about the R community in Melbourne, how they’re holding up during the pandemic, trends in R, and what the future holds. 

If you are interested in applying to the RUGS program for your organization, see the How do I Join? section at the end of this article.

RC: What is the R community like in Melbourne?

Melbourne has a vibrant community, with a mixture of students, professionals, and industries involved. Our R Meetup has almost 3,000 members who are interested in R and Data Science. We do get a little support in organizing from some of the companies in the area.

RC: How has COVID affected your ability to connect with members?

Melbourne had a very strict lockdown. We were only allowed to travel within a 5 km radius, and there were restrictions on going to grocery stores and the like. Because of that, we couldn’t meet in person. To get around this, we held our meetings primarily on Zoom.

RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting? 

In the last meetup in November, WhyHive, which does analytics using Shiny, presented the work they did for the International Women’s Development Agency. It was amazing to see, and they were such a young crew. WhyHive is a social enterprise that works with non-profits to analyze data and make data-driven decisions. Typically, our better-attended Zoom meetings draw around 30 to 40 people. For this one, we had 70 to 80. The conversation was so good that we had to cut off questions at the end due to time.

RC: What trends do you see in R language affecting your organization over the next year?

The overall trend among our users is toward the tidyverse and all things tidy. For better or worse, that is where they are heading. We have seen a lot of time series, tidy objects, tidy models, and the like. In our organization, it has become almost the de facto method, to the point where people don’t usually use for loops anymore.

RC: When is your next event? Please give details!

We are currently in the planning stage, and we are looking for a speaker. More details to come soon!

RC: Of the Funded Projects by the R Consortium, do you have a favorite project? Why is it your favorite?

Due to my work, Mater 2.0 stuck out to me. It would be very nice to deal with larger data frames. It would help with scaling projects to a larger size.

RC: Of the Active Working Groups, which is your favorite?  Why is it your favorite?

The one that stood out to me, distributed computing, was really interesting. Being able to scale is one of the problems and one of the limitations that our members run into. You can multithread it or hack parallels or distribute a job, but it would be awesome if it were more native. This would also help with scaling projects up for people who work in industry.

RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?

I think there should be more put into diversity and inclusion. This is a really important field that I believe should be emphasized. Another one would be looking into scalability in R and making it more usable in work environments. I attended an R User Conference in Brisbane some years ago, and what I saw showed a lack of scaling. In industry we try to use R in production, and the lack of scalability is an issue in R. This issue is becoming more important.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past 4 years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!