
R/Database: Using R at Scale on Database Data


Abstract: Many users, organizations, and enterprises rely on R as a powerful language and environment for statistical analysis, data science, and machine learning. Many of these same users work with data in – or extracted from – Oracle databases. As data volumes increase, moving data complicates life for data professionals, such as data engineers and data scientists, as well as for solution developers and administrators. In this session, you’ll learn how to increase overall solution performance with R tightly integrated with Oracle databases. Oracle Machine Learning for R (OML4R) leverages the database as a high-performance computing environment, enabling users to explore, transform, and analyze data faster and at scale, while allowing the use of familiar R syntax and semantics. In-database parallelized machine learning algorithms are exposed through a natural R interface, including support for R formulas. R users can run user-defined R functions in R engines spawned and managed by the database environment, invoking them through R, SQL, and REST interfaces – even taking advantage of system-enabled data parallelism. User-defined R functions and other R objects can be stored directly in the database to simplify solution deployment while benefiting from database security – avoiding the use of flat files. Join us for this engaging session highlighting multiple use cases with demonstrations.

Main Sections 

0:00 Introduction

2:54 Agenda

3:33 Background: R for databases

8:43 R Packages for database access

13:22 Memory and scalability

15:24 Leveraging parallelism

22:03 OML4R introduction

26:00 Demonstration: OML4R data exploration and preparation, dplyr

31:40 In-database algorithms

34:13 Demonstration: OML4R modeling

37:14 Embedded execution

39:14 Demonstration: OML4R embedded R execution

49:01 Using additional third-party packages with ODB/ADB

50:21 Oracle Machine Learning related components

51:08 Summary and more information

Speakers

Mark Hornick, Senior Director, Oracle Machine Learning

Mark Hornick is senior director of product management for Oracle Machine Learning. Mark has more than 20 years of experience integrating and leveraging machine learning with Oracle software as well as working with internal and external customers to apply Oracle’s machine learning technologies. He has been involved with R technology for the past 15 years. Mark is Oracle’s representative to the R Consortium and is an Oracle Adviser of the Analytics and Data Oracle User Community. He has been issued seven US patents. Mark holds a bachelor’s degree from Rutgers University and a master’s degree from Brown University, both in computer science. Follow him on Twitter @MarkHornick and connect on LinkedIn. He blogs at blogs.oracle.com/machinelearning.

Sherry LaMonica, Consulting MTS, Oracle Machine Learning

Sherry is a member of the Oracle Machine Learning Product Management team. She has 20 years of software experience focused on enabling the commercial use of open-source data analysis software, including R and Python, for data science and machine learning projects. She has worked with customers in fields as diverse as pharmaceutical research, financial analysis, manufacturing, and healthcare IT.


The R Adoption Series

This is a new series of webinars focused on the adoption of R. Each session will include a case study, and many will include panels or discussions so that those starting their journey with R can ask questions.

R Consortium will keep this page updated with information on future webinars in the R Adoption series. If you are looking for specific information and don’t see it here, feel free to email us at info@r-consortium.org.