Confirmed Sessions for Data Day Texas 2018

Take advantage of advance discount prices at the official conference hotel - The AT&T Conference Center.

We are just now beginning to announce the confirmed sessions. Check this page regularly for updates.

Introduction to SparkR in AWS EMR (90-minute session)

Alex Engler - Urban Institute

This session is a hands-on tutorial on working in Spark through R and RStudio in AWS Elastic MapReduce (EMR). The demonstration will cover how to launch and access Spark clusters in EMR with R and RStudio installed. Participants will be able to launch their own clusters and run Spark code during an introduction to SparkR, including the sparklyr package, for data science applications. Theoretical concepts of Spark, such as the directed acyclic graph and lazy evaluation, as well as mathematical considerations of distributed methods, will be interspersed throughout the training. Follow-up materials on launching SparkR clusters and tutorials in SparkR will be provided.
Intended Audience: R users who are interested in a first foray into distributed cloud computing for the analysis of massive datasets. No big data, DevOps, or Spark experience is required.