Who will speak at Data Day Texas 2018?

Keynote Lukas Biewald (SF Bay) @l2k

Lukas Biewald (Wikipedia / LinkedIn / GitHub) is the founder and CEO of CrowdFlower. Founded in 2007, CrowdFlower provides Labor-on-Demand to help companies outsource high-volume, repetitive tasks to a massively-distributed global workforce.
Before founding CrowdFlower, Lukas was a senior scientist and manager within the Ranking and Management Team at Powerset, Inc., acquired by Microsoft in 2008. He led the Search Relevance Team for Yahoo! Japan after graduating from Stanford University with a B.S. in Mathematics and an M.S. in Computer Science. Recently, Lukas won the Netexplorateur Award for GiveWork – a collaboration with Samasource that brings digital work to refugees worldwide. Lukas is also an expert-level Go player.
Check out Lukas' recent interview with Ben Lorica for the O'Reilly Data Show
Lukas will be giving the Data Day / AI Weekend presentation: Deep Learning in the Real World

Jans Aasman (SF Bay)

Jans Aasman (Wikipedia / LinkedIn) is a Ph.D. psychologist and expert in Cognitive Science - as well as CEO of Franz Inc., an early innovator in Artificial Intelligence and provider of the graph database, AllegroGraph. As both a scientist and CEO, Dr. Aasman continues to break ground in the areas of Artificial Intelligence and Knowledge Graphs as he works hand-in-hand with numerous Fortune 500 organizations as well as US and foreign governments. Jans recently authored an IEEE article on “Enterprise Knowledge Graphs”.
Dr. Aasman spent a large part of his professional life in telecommunications research, specializing in applied Artificial Intelligence projects and intelligent user interfaces. He gathered patents in the areas of speech technology, multimodal user interaction, and recommendation engines while developing precursor technology for tablets and personal assistants. He was also a professor in the Industrial Design department of the Technical University of Delft. Dr. Aasman is a noted conference speaker at such events as Smart Data, NoSQL Now, International Semantic Web Conference, GeoWeb, AAAI, Enterprise Data World, Text Analytics, and TTI Vanguard, to name a few.
Jans will be giving the following presentation: Navigating Time and Probability in Knowledge Graphs.

John Akred (SF Bay) @BigDataAnalysis

John Akred is the Founder and CTO of Silicon Valley Data Science. In the business world, John Akred likes to help organizations become more data driven. He has over 15 years of experience in machine learning, predictive modeling, and analytical system architecture. His focus is on the intersection of data science tools and techniques; data transport, processing and storage technologies; and the data management strategy and practices that can unlock data driven capabilities for an organization. A frequent speaker at the O'Reilly Strata Conferences, John is host of the perennially popular workshop: Building A Data Platform.
John will be giving the following AI Weekend presentation: Machine Learning: From The Lab To The Factory

Mara Averick (Boston) @dataandme

Mara Averick (LinkedIn / GitHub / Medium) is a polymath and self-confessed data nerd. With a strong background in research, she has a breadth of experience in data analysis, visualization, and applications thereof. Currently, by day, she’s a Consultant at TCB Analytics. By night, you’ll find her sharing dope R-related stuff on Twitter and translating heavily technical subject matter into easy reading for a non-technical audience. When she’s not talking data, she's diving into NBA stats, exploring weird and wonderful words, and/or indulging in her obsession with all things Archer. (Thanks to Mango Solutions for bio.)
Mara will be speaking as part of R User Day.

Dave Bechberger (Houston) @bechbd

Dave Bechberger is a Sr. Architect at Gene by Gene, a genetic genealogy and bioinformatics company, where he works extensively on developing their next-generation data architecture. Dave has spent his career engaging in full stack software development but specializes in building data architectures in complex data domains such as bioinformatics, oil and gas, and supply chain management. He uses his knowledge of graph and other big data technologies to build out highly performant and scalable systems. Dave has previously spoken at a variety of international technical conferences including NDC Oslo, NDC London, and Graph Day Texas.
Dave will co-present the following Graph Day session: Improving Graph Based Entity Resolution using Data Mining and NLP.

Ryan Boyd (SF Bay) @ryguyrg

Ryan Boyd (Linkedin) is a SF-based software engineer focused on helping developers understand the power of graph databases. Previously he was a product manager for architectural software, built applications and web hosting environments for higher education, and worked in developer relations for twenty products during his 8 years at Google. He enjoys cycling, sailing, skydiving, and many other adventures when not in front of his computer.
Ryan will present the following Graph Day session: Data Science Tools: Cypher for Data Munging.

Ben Bromhead (SF Bay)

Ben Bromhead (Linkedin) is Co-founder and CTO at Instaclustr, where he sets the technical direction for the company. Ben is well known as an active member of the Apache Cassandra community. Prior to Instaclustr, Ben had been working as an independent consultant developing NoSQL solutions for enterprises. He ran a high-tech cryptographic and cyber security formal testing laboratory at BAE Systems and Stratsec.
Ben will be giving the following presentation: Cassandra and Kubernetes.

Jeff Carpenter (Scottsdale, Arizona) @jscarp

Jeff Carpenter (Linkedin) is a technology evangelist at DataStax, where he leverages his background in system architecture, microservices and Apache Cassandra to help developers and operations engineers build distributed systems that are scalable, reliable, and secure. Jeff has worked on projects ranging from a complex battle planning system in an austere network environment to a cloud-based hotel reservation system, and is the author of Cassandra: The Definitive Guide, 2nd Edition.
Jeff will be giving the following Cassandra presentation: Cassandra Architecture FTW!

Glauber Costa (Toronto) @glcst

Glauber Costa (Linkedin) is a Principal Architect at ScyllaDB. He shares his time between the engineering department working on upcoming Scylla features and helping customers succeed.
Before ScyllaDB, Glauber worked with Virtualization in the Linux Kernel for 10 years, with contributions ranging from the Xen and KVM Hypervisors to all sorts of guest functionality and containers.
Glauber will be presenting the following session: Go big or go home! Does it still make sense to do Big Data with Small Nodes?

Lucy D'Agostino (Nashville) @LucyStats

Lucy D'Agostino McGowan (LinkedIn / GitHub) is a Biostatistics PhD candidate at Vanderbilt University where her research focuses on observational studies, large-scale inference, and methods for quantifying and estimating the effect of unmeasured confounding. She is the co-founder of R-Ladies Nashville and is enthusiastic about learning from and uplifting other women in the R and STEM communities.
Lucy will be presenting the R User Day session: Making Causal Claims as a Data Scientist: Tips and Tricks Using R.

Jasmine Dumas (Connecticut) @jasdumas

Jasmine Dumas (LinkedIn / GitHub) is a Data Scientist at Simple Finance where she is focused on experimentation and data product development. She earned a B.S.E. in Biomedical Engineering from the University of Hartford and has experience in Aerospace Manufacturing, Medical Devices and Financial Technology. She is an active member of the R programming community, has developed the open source packages shinyGEO, ttbbeer, shinyLP, and gramr, and has participated in Google Summer of Code, NASA Datanauts, R-Ladies, and Forwards. She is currently developing a course on shiny with DataCamp and co-organizing the regional Noreast'R Conference.
Jasmine will be giving the following R User Day talk: R, What is it good for? Absolutely Everything.

Joey Echeverria (SF Bay) @fwiffo

Joey Echeverria is the platform technical lead at Splunk, where he builds applications for scaling IT operations built on the Apache Hadoop platform. Joey is a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a software engineer at Cloudera, where he contributed to several ASF projects including Apache Flume, Apache Sqoop, Apache Hadoop, and Apache HBase. Joey is also a coauthor of Hadoop Security, published by O'Reilly Media.
Joey will be co-presenting the following session: Debugging Apache Spark

Jonathan Ellis (Austin) @spyced

Jonathan Ellis is CTO and co-founder at DataStax. Prior to DataStax, Jonathan worked extensively with Apache Cassandra while employed at Rackspace. Prior to Rackspace, Jonathan built a multi-petabyte, scalable storage system based on Reed-Solomon encoding for backup provider Mozy.
Jonathan will be speaking in the Cassandra track at Data Day Texas.

Alex Engler (Washington, D.C.) @alexcengler

Alex Engler (LinkedIn / GitHub / Urban Institute / Georgetown / Johns Hopkins) is the Program Director and Lecturer for the M.S. in Computational Analysis and Public Policy program at the University of Chicago. He is also a contributing data scientist to the Urban Institute, where he worked before UChicago. Alex also previously taught visualization and data science for policy analysis at Georgetown University and Johns Hopkins University.
Alex will be presenting the following workshop: Introduction to SparkR in AWS EMR, as part of R User Day.

Dr. Denise Koessler Gosnell (Charleston) @DeniseKGosnell

In August 2017, Dr. Denise Gosnell transitioned into a Solutions Architect position with DataStax, where she aspires to build upon her experiences as a data scientist and graph architect to further their established line of graph solutions. Prior to her role with DataStax, Dr. Gosnell was a Data Scientist and Technology Evangelist at PokitDok. During her three years with PokitDok, she built software solutions for and spoke at over a dozen conferences on permissioned blockchains, machine learning applications of graph analytics, and data science within the healthcare industry.
Dr. Gosnell earned her Ph.D. in Computer Science from the University of Tennessee. Her research on how our online interactions leave behind unique identifiers that form a “social fingerprint” led to presentations at major conferences from San Diego to London and drew the interest of such tech industry giants as Microsoft Research and Apple. Additionally, she was a leader in addressing the underrepresentation of women in her field and founded a branch of Sheryl Sandberg’s Lean In Circles.
Dr. Gosnell will be giving the following Graph Day presentation: Everything is not a graph problem (but there are plenty).

Michael Grove (Washington, DC) @mikegrovesoft

Michael Grove is VP of Engineering and co-founder of Stardog, where he oversees the development of the Stardog Knowledge Graph Platform. Michael studied Computer Science at the University of Maryland and is an alumnus of its well-regarded MIND Lab, which specialized in semantic technologies. Before Stardog, he worked at Fujitsu Research on the use of graphs and semantic technologies in pervasive computing environments. Michael is an expert in large scale database and reasoning systems and has worked with graphs and graph databases for nearly fifteen years.

Dikang Gu (San Francisco) @dikanggu

Dikang Gu (Linkedin) is a Staff Software Engineer at Facebook. He has years of experience working with big data/cloud computing platforms. Dikang will be speaking as part of the Cassandra track.

Jon Haddad (Los Angeles) @rustyrazorblade

Jon Haddad (Linkedin) is the Principal Consultant at The Last Pickle, as well as a committer and PMC member for Apache Cassandra. Prior to The Last Pickle, Jon was a technical evangelist at DataStax. He has worked on dozens of Cassandra clusters across a wide variety of hardware, both on-prem and in the cloud. Jon has contributed to a wide variety of open source projects and has almost 20 years experience in the field.
Jon will be giving the following presentation: Cassandra Performance Tuning and Crushing SLAs.

Kristian Hammond (Chicago) @kj_hammond

Kristian Hammond (LinkedIn) is chief scientist at Narrative Science and professor of computer science and journalism at Northwestern University. Previously, Kris founded the University of Chicago’s Artificial Intelligence Laboratory. His research has been primarily focused on artificial intelligence, machine-generated content, and context-driven information systems. He currently sits on a United Nations policy committee run by the United Nations Institute for Disarmament Research (UNIDIR). Kris was also named 2014 innovator of the year by the Best in Biz Awards. He holds a PhD from Yale.
Kristian will be giving the following AI Weekend presentation: Here and now: Bringing AI into the enterprise.

Humayun Irshad (Palo Alto) @humayunirshad

Humayun Irshad (LinkedIn) is a computer scientist with expertise in machine learning, deep learning, computer vision, medical image analysis and statistical methods. He is developing computational techniques that combine computer vision, machine learning and statistical approaches for automated object detection, segmentation and classification in 2D and 3D images, with applications in medicine, retail, self-driving cars, and other domains. He completed a three-year postdoc at Harvard Medical School and received a PhD in Computer Science from the University of Grenoble, France, where he developed machine learning and deep learning techniques including region of interest detection and classification, and nuclei and gland detection, segmentation and classification in 2D and 3D histopathological images (H&E stained, IHC stained and fluorescence images).
Humayun will be co-presenting the following workshop: Hands on Machine Learning / Deep Learning Apps using AWS/Keras/Tensorflow.

Chester Ismay (Portland) @old_man_chester

Chester Ismay (LinkedIn / GitHub) is Curriculum Lead at DataCamp. He was formerly an Adjunct Professor of Sociology at Pacific University and an Instructional Technologist and Consultant for Data Science, Statistics, and R at Reed College. He obtained his PhD in statistics from Arizona State University and has taught courses and led workshops in statistics, data science, mathematics, computer science, and sociology. He is the co-author of the fivethirtyeight R data package and is the author of the thesisdown R package. He is also a co-author of an open source textbook entitled ModernDive: An Introduction to Statistical and Data Sciences via R.
Chester will be speaking as part of R User Day.

Holden Karau (San Francisco) @holdenkarau

Holden Karau is a transgender Canadian, Apache Spark committer, an active open source contributor, and co-author of Learning Spark & High Performance Spark. When not in San Francisco working as a software development engineer at IBM’s Spark Technology Center, Holden talks internationally on Spark and holds office hours at coffee shops at home and abroad. She makes frequent contributions to Spark, specializing in PySpark and Machine Learning. Prior to IBM she worked on a variety of distributed, search, and classification problems at Alpine, Databricks, Google, Foursquare, and Amazon. She graduated from the University of Waterloo with a Bachelor of Mathematics in Computer Science. Outside of computers she enjoys dancing & playing with fire.
Holden will be co-presenting the following session: Debugging Apache Spark

Mayank Kejriwal (Los Angeles) @kejriwal_mayank

Mayank Kejriwal is a research scientist and lecturer at the University of Southern California's Information Sciences Institute (ISI). He received his Ph.D. from the University of Texas at Austin under Daniel P. Miranker. His dissertation involved Web-scale data linking, and in addition to being published as a book, was recently recognized with an international Best Dissertation award in his field. Some of his projects at ISI, all funded by either DARPA or IARPA, include: automatically extracting information from large Web corpora and building search engines over them (the topic of his talk); 'automating' a data scientist with advanced meta-learning techniques; representing, and reasoning over, terabyte-scale knowledge graphs; combining structured and unstructured data for causal inference; constructing, embedding and analyzing networks over billion-tweet scale social media; and building a platform that makes research easy for geopolitical forecasters. His research sits at the intersection of knowledge graphs, social networks, Web semantics, network science, data integration and AI for social good. He is currently co-authoring a textbook on knowledge graphs (MIT Press, 2018), and has delivered tutorials and demonstrations at numerous conferences and venues, including KDD, AAAI, ISWC and WWW.
Mayank will be giving the following presentation: Building advanced search and analytics engines over arbitrary domains...without a data scientist.

Jason Kessler (Seattle) @jasonkessler

Jason Kessler (LinkedIn) is a lead data scientist at CDK Global, where he analyzes language use and consumer behavior in the online auto-shopping ecosystem. Prior to joining CDK, Jason was the founding data scientist at PlaceIQ and worked as a research scientist for JD Power and Associates. He has published peer-reviewed papers on algorithms and corpora for sentiment and belief analysis and has sat on program committees and reviewed for several AI and NLP conferences. Most recently, he has conducted research on identifying persuasive and influential language and the visualization of differing corpora.
Jason will be giving the following presentation: Lexicon Mining for Semiotic Squares: Exploding Binary Classification

Albert Y. Kim (Amherst) @rudeboybert

Albert Y. Kim (LinkedIn / GitHub) is a Lecturer in Statistics in the Mathematics & Statistics Department at Amherst College. Born in Montreal, Quebec, he earned his BSc in Mathematics and Computer Science from McGill University in 2004 and his PhD in Statistics from the University of Washington in 2011. Prior to joining Amherst College, he was a Decision Support Engineering Analyst in the AdWords division of Google Inc, a Visiting Assistant Professor of Statistics at Reed College, and an Assistant Professor of Statistics at Middlebury College.
Albert will be giving the following R User Day talk: Something old, something new, something borrowed, something blue: Ways to teach data science (and learn it too!).

Gunnar Kleemann (Berkeley / Austin) @GunnarKleemann

Gunnar Kleemann is a Data Scientist with the Berkeley Data Science Group (BDSG). He is interested in how data science facilitates biological discovery and lowers the barrier to high-throughput research, particularly in small, independent labs. In addition to his work with BDSG, he is also involved in the development and implementation of technologies like the ATX Hackerspace Biology Laboratory.
Gunnar holds a PhD in Molecular Genetics from Albert Einstein College of Medicine and a Master’s in Data Science from UC Berkeley. He did post-doctoral research on the genomics of aging at Princeton University, where his research focused on developing high throughput robotic assays to understand how genetic changes alter lifespan and reproductive biology.

Dr. Steve Kramer (Austin) @ParagonSci_Inc

Steve Kramer (LinkedIn) is the President and Chief Scientist of Paragon Science, a company he founded with the goal of developing cutting-edge technologies to aid in the counter-terrorism efforts of the United States. He has since expanded Paragon Science's scope to focus on providing valuable business intelligence in the commercial data-mining industry.
In 2005, Dr. Kramer started his current research in graph theory, network analysis, and complex systems theory, yielding Paragon's patent-pending dynamic anomaly detection technologies. He has performed data-mining consulting work for multiple clients, including The Advisory Board, Digital Motorworks, RetailMeNot, and Vast.com. He presented his paper "Anomaly detection in extremist web forums using a dynamical systems approach" at the 2010 ACM SIGKDD Workshop on Intelligence and Security Informatics (ISI-KDD 2010) and at the Pentagon. He also recently served as a program committee member and paper reviewer for IEEE International Conferences on Intelligence and Security Informatics 2011, IEEE International Conferences on Intelligence and Security Informatics 2012, ACM SIGKDD Workshop on Intelligence and Security Informatics 2012, IEEE Intelligence and Security Informatics 2013, and FOSINT-SI 2013 (International Symposium on Foundations of Open Source Intelligence and Security Informatics).

Chris LaCava @uxchrislacava

Chris LaCava has spent the past two decades defining, designing and building software for a variety of industry verticals. He has experience as a usability engineer, interaction designer, front-end developer as well as product manager for both consulting and product-oriented organizations. Chris leads Expero's efforts in defining visualization for graph datasets.
Chris LaCava will co-present the following AI Weekend session: Vital Role of Humans in Machine Learning.

Corey Lanum (Boston) @corey_lanum

Corey Lanum (LinkedIn) has a distinguished background in graph visualization. Over the last 15 years he has managed technical and business relationships with dozens of the largest defense and intelligence agencies in North America, in addition to working with many security and anti-fraud organizations in private industry. Prior to joining Cambridge Intelligence as their US Manager, Corey helped the customers of i2 (now IBM) and SS8 solve their most complex graph data challenges.
Corey is the author of Visualizing Graph Data from Manning Publications.
Corey will co-present the following Graph Day session: How to Destroy Your Graph Project with Terrible Visualization.

Jared Lander (NYC) @jaredlander

Jared Lander (LinkedIn) is the Chief Data Scientist of Lander Analytics, a data science consultancy based in New York City; the Organizer of the New York Open Statistical Programming Meetup and the New York R Conference; and an Adjunct Professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. His work for both large and small organizations ranges from music and fund raising to finance and humanitarian relief efforts.
Jared specializes in data management, multilevel models, machine learning, generalized linear models, and statistical computing. He is the author of R for Everyone: Advanced Analytics and Graphics, a book about R Programming geared toward Data Scientists and Non-Statisticians alike, and is creating a course on glmnet with DataCamp.
Jared will be speaking as part of R User Day.

Dor Laor (Sunnyvale, CA) @dorlaor

Dor Laor is the CEO of ScyllaDB, the company behind the open source Cassandra-compatible database of the same name. Previously, Dor was part of the founding team of the KVM hypervisor under Qumranet, which was acquired by Red Hat. At Red Hat, Dor managed KVM and Xen development for several years. Dor holds an MSc from the Technion and a PhD in snowboarding.

Victor Lee (Kent, Ohio)

Dr. Victor Lee is Senior Product Manager at TigerGraph, bringing together a strong academic background, decades of experience in the technology sector, and a strong commitment to quality and serving customer needs. His first stint in Silicon Valley was as an IC designer and technology transfer manager, before returning to school for his computer science PhD, focusing on graph data mining. He received his BS in Electrical Engineering and Computer Science from UC Berkeley, MS in Electrical Engineering from Stanford University, and PhD in Computer Science from Kent State University. Before joining TigerGraph, Victor was a visiting professor at John Carroll University.
Dr. Lee will be giving the following Graph Day presentation: Real-time deep link analytics: The next stage of graph analytics

William Lyon (SF Bay) @lyonwj

William Lyon is a software developer at Neo4j, the open source graph database. As an engineer on the Developer Relations team, he works primarily on integrating Neo4j with other technologies, building demo apps, helping other developers build applications with Neo4j, and writing documentation. Prior to joining Neo4j, William worked as a software developer for several startups in the real estate software, quantitative finance, and predictive API fields. William holds a Master's degree in Computer Science from the University of Montana. You can find him online at lyonwj.com.
William will be giving the following Graph Day presentation: Graph Analysis of Russian Twitter Trolls

Rob McDaniel (Seattle)

Rob McDaniel is the founder of Lingistic, the machine learning team behind howbiased.com, which has a focus on NLP problems related to politics, debate analysis and the detection of bias in news media. HowBiased.com hopes to help humans learn to be more critical of the material they ingest, by identifying traits and cues in the language which may be hidden or non-obvious.
Rob has a diverse background in engineering and machine learning, both with major corporations and startups. He has worked on problems related to machine translation, taxonomy classification and information extraction, and has a passion for unsupervised methods and graph theory. When not working on his startup, Rob is also Manager of Applied Science at Rakuten, where he manages AI that expands the depth and quality of Rakuten's global product catalog.
Rob will be giving the following NLP Day presentation: Detecting Bias in News Articles

Patrick McFadin (SF Bay) @patrickmcfadin

Patrick McFadin (Linkedin), VP of Developer Relations at DataStax, is regarded as one of the foremost experts of Apache Cassandra and data modeling techniques. While at DataStax, he has helped build some of the largest deployments in the world. Previous to DataStax, Patrick was Chief Architect at Hobsons, an education services company. There, he spoke often on web application design and performance.
Patrick will be giving the following Cassandra presentation: What have we done!? 10 years of Cassandra.

Jessica Minnier (Portland) @datapointier

Jessica Minnier (LinkedIn / GitHub) is an Assistant Professor of Biostatistics at Oregon Health & Science University. She is a faculty member of the OHSU-PSU School of Public Health with appointments in the Knight Cardiovascular Institute and Knight Cancer Institute Biostatistics Shared Resource. Her statistical research interests include risk prediction with high dimensional data sets and the analysis of genetic and other omics data. She is also interested in statistical computing (mostly in R), reproducible research and open science.
Jessica teaches Mathematics/Statistics II, a statistical inference course for the MS in Biostatistics program at OHSU-PSU School of Public Health. Jessica has an A.M. and Ph.D. in Biostatistics from Harvard University and a B.A. in Mathematics with minor in Computer Science from Lewis & Clark College.
Jessica will be presenting the R User Day session: Building Shiny Apps: Challenges and Responsibilities.

Qazaleh Mirsharif (San Francisco)

Qazaleh Mirsharif (Linkedin / Google Scholar) is a Machine Learning Scientist specializing in Computer Vision. She received a PhD in computer science from the University of Houston, Texas, focusing on applications of computer vision in developmental studies of children. She received her MSc in Artificial Intelligence and has worked on areas as diverse as medical image processing, object segmentation and motion analysis in videos, and parking sign detection in street view images.
Qazaleh will be co-presenting the following workshop: Hands on Machine Learning / Deep Learning Apps using AWS/Keras/Tensorflow

Jonathan Mugan (Austin) @jmugan

Jonathan Mugan (Linkedin) is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses in the area of deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis was centered in developmental robotics, which is an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he recently also spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.
Jonathan will be giving the following AI Weekend presentation: Generating Natural-Language Text with Neural Networks

Misty Nodine (Austin)

Misty Nodine (Linkedin / GitHub) has a long history of being interested in trying to understand, organize, and make sense of complexity. She is a respected researcher and developer in the areas of natural language processing, information and knowledge management, agent-based information systems, communications system management, and collaboration management. More recently, she has been focused more on data pipelines and data architectures, specifically for developing a comprehensive understanding of users to improve recommendations and ad targeting.
Misty received her Ph.D. in Computer Science from Brown University in 1993. She received her S.B. and S.M. in EECS from Massachusetts Institute of Technology. She has 30+ years of experience in computer and data science, both in industrial research and in startup companies. She is currently the Data Architect at Spiceworks.
Misty will be giving the following Graph Day presentation: Understanding People Using Three Different Kinds of Graphs

Jonathan Nolis (Seattle) @skyetetra

Jonathan Nolis (LinkedIn / GitHub) is the Director of Insights & Analytics at Lenati, and is the lead of the Customer Insights & Analytics team. He has over a decade of experience in solving business problems using data science. Jonathan has provided insights and strategic advice in industries such as retail, manufacturing, aerospace, health care, and e-commerce. Jonathan helps create proprietary technology for Lenati including the Loyalty Program ROI Simulator – a tool that uses big data to predict the value of a loyalty program. He has a PhD in industrial engineering, and has several academic publications in the field of applied optimization. Prior to joining Lenati, Jonathan was a Lead of Advanced Analytics at Promontory Financial Group, a regulatory compliance consulting firm.
Jonathan will be speaking as part of R User Day.

Hilary Parker (San Francisco) @hspter

Hilary Parker (LinkedIn / GitHub) is a Data Scientist at Stitch Fix and co-host of the Not So Standard Deviations podcast. She is an R and statistics enthusiast determined to bring rigor to analysis wherever she goes. At Stitch Fix she works on teasing apart correlation from causation, with a strong dose of reproducibility. Formerly a Senior Data Analyst at Etsy, she received a PhD in Biostatistics from the Johns Hopkins Bloomberg School of Public Health.
Hilary will be speaking as part of R User Day.

Lynn Pausic (Austin) @lynnpausic

As head of the Design team at Expero, co-principal and business strategist, Lynn Pausic takes multitasking to the next level. By combining expertise in strategy, innovation and design, Lynn brings the breadth and depth of complex problems to light and figures out how to break them down into useful, usable and manageable pieces that form a holistic experience.
Lynn’s extensive background in user experience ranges from designing user interfaces for wearable devices, to creating enterprise software solutions and mobile UIs, to innovating scenarios beyond the 2D screen. She has ever-growing expertise with timely topics such as Big Data, the Internet of Things, UI Design Pattern Libraries and High-Performance Computing, in industries as varied and diverse as Austin itself. Lynn’s recent clients are in agronomy, enterprise management, energy, biotechnology and other verticals.
Prior to founding Expero, Lynn earned a B.S. from Carnegie Mellon University and worked as a Director of Product Management, a Consulting Manager and a Director of Human-Computer Interaction (HCI). At Trilogy, she led the HCI team and established user-centered design as an integral part of the company’s software development process.
Lynn often speaks on user experience and design, including at Nielsen Norman Group conferences around the world. Lynn created the popular tutorial “Complex Applications & Websites” (which she co-presents with John Morkes). Lynn also has presented at Carnegie Mellon University’s HCI Institute, Cornell University’s Media Lab and ACM’s SIGCHI conference.
Lynn will co-present the following AI Weekend session: Vital Role of Humans in Machine Learning.

Aaron Ploetz (Minneapolis) @APloetz

Aaron Ploetz (LinkedIn) has been a professional software developer since 1997, and has been named a DataStax MVP for Apache Cassandra three times (2014-17). While not at work, he has been a computer hobbyist since 1987 (when his mother first brought home a Tandy 1000 EX). He still works on a variety of projects in his home lab, including (but not limited to) building Linux servers, gaming machines, and test Cassandra clusters. Aaron received a Bachelor of Science degree in Management/Computer Systems from the University of Wisconsin - Whitewater in 1998, and a Master of Science degree in Software Engineering (emphasis on Database Technologies) from Regis University in 2013. He and his wife Coriene live with their three children in the Twin Cities. When not in front of a computer he enjoys amateur astronomy, writing, and coaching his sons' baseball and ice hockey teams.
Aaron will be presenting the following Cassandra session: Performance Data Modeling at Scale

Jason Plurad (Raleigh-Durham) @pluradj

Jason Plurad is a software developer on IBM's Open Technologies team. He is a committer on Apache TinkerPop, an open source graph computing framework. Jason engages in full stack development (including front end, web tier, NoSQL databases, and big data analytics) and promotes adoption of open source technologies into enterprise applications, services, and solutions. He has spoken previously at IBM conferences (Innovate, Insight) and Triangle Hadoop Users Group meetups.
Jason will be presenting the following Graph Day session: Powers of Ten Redux

Steve Purves (Tenerife, Islas Canarias) @stevejpurves

Steve Purves, Senior Software Developer at Expero, describes himself as an engineer first and foremost. He is comfortable working full-stack, cross-platform in a range of languages and is happiest when there is some mathematical or scientific analysis sprinkled in. He graduated in electrical engineering specializing in signal and image processing, which he took into the scientific computing field in the Oil and Gas industry.
During that time his work was largely split into three: development of low-level number-crunching libraries (C, C++, CUDA) and the cross-platform desktop application with 3D visualization to drive it; applied research in signal processing, numerical analysis algorithm development for 3D seismic analysis, during which he was an IEEE journal geek; and finally management of R&D and Product development teams as CTO, championing practices like TDD, BDD and Agile to get it done.
Around 5 years ago, the excitement of daily binary builds wore thin and Steve got hooked on building applications for the web, starting out with web-desktop integration work for seismic analysis on the iPad. Since then activities have included working on full-stack web applications, with and without desktop integration, for startups in sectors such as Dental, TV Production and Software Micro-Consulting.
Today, he builds reactive web applications with Expero, which feeds his desire to learn and work on industrial-strength projects. Steve waits patiently, with ES6 JavaScript and Jupyter Notebooks at the ready, for the imminent explosion of scientific computing on the web.
Steve will be giving the following Graph Day presentation: Graph Convolutional Networks for Node Classification

Gabriela de Queiroz (San Francisco) @gdequeiroz

Gabriela de Queiroz (LinkedIn / GitHub) is the Lead Data Scientist at SelfScore. Formerly, Gabriela was a data scientist at Sharethrough, where she developed statistical models from concept creation to production, designed, ran, and analyzed experiments, and employed a variety of techniques to derive insights and drive data-centric decisions. Gabriela is the founder of R-Ladies, an organization created to promote diversity in the R community, which now has over 25 chapters worldwide. Currently, she is developing an online course on machine learning in partnership with DataCamp.
Gabriela will be speaking as part of R User Day.

Karthik Ramasamy (San Francisco) @karthikz

Karthik Ramasamy (LinkedIn) is the co-founder of Streamlio - a company that focuses on building next generation real time infrastructure. Before Streamlio, Karthik was the engineering manager and technical lead for real-time infrastructure at Twitter where he co-created Twitter Heron. He has two decades of experience working in parallel databases, big data infrastructure, and networking. He co-founded Locomatix, a company that specializes in real-time streaming processing on Hadoop and Cassandra using SQL, that was acquired by Twitter. Before Locomatix, he had a brief stint with Greenplum, where he worked on parallel query scheduling. Greenplum was eventually acquired by EMC for more than $300M. Prior to Greenplum, Karthik was at Juniper Networks, where he designed and delivered platforms, protocols, databases, and high availability solutions for network routers that are widely deployed on the internet. Before joining Juniper, at the University of Wisconsin he worked extensively in parallel database systems, query processing, scale out technologies, storage engines, and online analytical systems. Several of these research projects were later spun off as a company acquired by Teradata. Karthik is the author of several publications, patents, and Network Routing: Algorithms, Protocols and Architectures. He has a Ph.D. in computer science from the University of Wisconsin, Madison with a focus on big data and databases.
Karthik will be giving two presentations: Next Generation Real Time Architectures and Autopiloting #realtime processing in Heron

Andrew Ray (Bentonville, Arkansas)

Andrew Ray is a Senior Technical Expert at Sam’s Club Technology. He is passionate about big data and has extensive experience working with Apache Spark and Hadoop. Andrew is an active contributor to the Apache Spark project including SparkSQL and GraphX. At Walmart Andrew built an analytics platform on Hadoop that integrated data from multiple retail channels using fuzzy matching and distributed graph algorithms. Andrew also led the adoption of Spark at Walmart from proof-of-concept to production. Andrew earned his Ph.D. in Mathematics from the University of Nebraska, where he worked on extremal graph theory.
Andrew will be giving the following Graph Day presentation: Writing Distributed Graph Algorithms

David Robinson (NYC) @drob

David Robinson (LinkedIn / GitHub) is a data scientist at Stack Overflow with a PhD in Quantitative and Computational Biology from Princeton University. He enjoys developing open source R packages, including broom, gganimate, fuzzyjoin and widyr, as well as blogging about statistics, R, and text mining on his blog, Variance Explained.
David will be giving the following R User Day presentation: We R What We Ask: The Landscape of R Users on Stack Overflow.

Emily Robinson (NYC) @robinson_es

Emily Robinson (LinkedIn / GitHub) works as a Data Analyst at Etsy with the search team to design, implement, and analyze experiments on the ranking algorithm, UI changes, and new features. Emily earned her master’s in Organizational Behavior from INSEAD in 2016 and her bachelor’s in Decision Sciences from Rice University (where she took classes from Hadley Wickham). She's a co-organizer of the NYC chapter of R-Ladies, a global organization to promote gender diversity in the R community. She enjoys blogging about A/B Testing, conferences, and data science projects on her blog, Hooked on Data.
Emily will be giving the following R User Day presentation: The Lesser Known Stars of the Tidyverse.

Juan Sequeda (Austin) @juansequeda

Dr. Juan Sequeda is the co-founder of Capsenta, a spin-off from his research, and the Senior Director of Capsenta Labs. He holds a PhD in Computer Science from the University of Texas at Austin. His research interests lie at the intersection of Logic and Data, in particular between the Semantic Web and Relational Databases for data integration, ontology based data access and semantic/graph data management. Juan is the recipient of the NSF Graduate Research Fellowship, received 2nd Place in the 2013 Semantic Web Challenge for his work on ConstituteProject.org, Best Student Research Paper at the 2014 International Semantic Web Conference and the 2015 Best Transfer and Innovation Project awarded by the Institute for Applied Informatics. Juan is the General Chair of AMW 2018, was the PC chair of the ISWC 2017 In-Use track, is on the Editorial Board of the Journal of Web Semantics, is a member of multiple program committees (ISWC, ESWC, WWW, AAAI, IJCAI) and is co-creator of the Consuming Linked Data Workshop series. Juan is a member of the Graph Query Languages task force of the Linked Data Benchmark Council (LDBC) and has also been an invited expert member and standards editor at the World Wide Web Consortium (W3C).
Juan will be giving two presentations: Integrating Semantic Web Technologies in the Real World: A journey between two cities and G-CORE: A Core for Future Graph Query Languages, designed by the LDBC Graph Query Language Task Force

Julia Silge (Salt Lake City) @juliasilge

Julia Silge (LinkedIn / GitHub) is a data scientist at Stack Overflow. She enjoys making beautiful charts, the statistical programming language R, black coffee, red wine, and the mountains of her adopted home here in Utah. She has a PhD in astrophysics and an abiding love for Jane Austen. Her work involves analyzing and modeling complex data sets while communicating about technical topics with diverse audiences.
Julia will be giving the following R User Day presentation: Text Mining Using Tidy Data Principles.

David Smith (Chicago) @revodavid

David Smith is the R Community Lead at Microsoft. With a background in data science, he writes daily about applications of predictive analytics at the Revolutions blog (blog.revolutionanalytics.com), and is a co-author of Introduction to R.
David will be speaking as part of R User Day.

Nick Strayer (Nashville) @NicholasStrayer

Nick Strayer (LinkedIn / GitHub) has worked in many different realms, including as a Journalist at the New York Times, data scientist at Dealer.com in Vermont, and as a "data artist in residence" at tech startup Conduce in California. Currently, he is a PhD student in biostatistics at Vanderbilt University and also an intern at the Johns Hopkins Data Science Lab. Recently (May '15), he graduated from the University of Vermont where he majored in mathematics and statistics and minored in computer science.
Nick likes data. Manipulating it, modeling it, making it (simulation), visualizing it and yes, even cleaning it. He does these things with some combination of R, Python and Javascript (d3.js in particular). Most recently he has been fascinated with conveying complex statistical topics and methods using intuitive and interactive graphics.
Nick's current research interests include: data gathering, extracting inference from machine learning, data visualization and scientific communication. When not in "school mode" Nick loves to bike places, read science fiction and wander around gardens/museums.
Nick will be presenting the R User Day session: Making Magic with Keras and Shiny

Denis Vrdoljak (SF Bay)

Denis Vrdoljak (Co-Founder and Managing Director at the Berkeley Data Science Group (BDSG)): Denis is a Berkeley-trained Data Scientist and a Certified ScrumMaster (CSM), with a background in Project Management. He has experience working with a variety of data types, from intelligence analysis to electronics QA to business analytics. In Data Science, his passion and current focus is in Machine Learning-based Predictive Analytics and Network Graph Analysis. He holds a Master's in Data Science from UC Berkeley and a Master's in International Affairs from Texas A&M.

Claudius Weinberger (Köln, Germany) @weinberger

Claudius Weinberger is the CEO and Co-founder of ArangoDB GmbH - the company behind the identically named NoSQL multi-model database. Claudius has been a serial entrepreneur for the majority of his life. Together with his co-founder, he has been busy building databases for more than 20 years. He started with in-memory to mostly memory databases, moved to K/V stores, multi-dimensional cubes and ultimately graph databases. Throughout the years he focused mostly on product and project management, further sharpening his vision of the database market. He co-founded ArangoDB in 2012. Claudius studied economics with business informatics as a key aspect at the University of Cologne. He spends all his free time with his two little daughters, is a judo enthusiast and occasionally enjoys gardening.
Claudius will present the following Graph Day session: Fishing Graphs in a Hadoop Data Lake.

Ted Wilmes (Oklahoma City) @trwilmes

Ted Wilmes, Data Architect at Expero, is a graduate of Trinity University where he studied computer science and art history. He started his professional career at a not-for-profit research and development institution where he performed contract software development work for a variety of government and commercial clients. During this time he worked on everything from large enterprise systems to smaller, cutting edge research and development projects. One of the most rewarding parts of each of these projects was the time spent collaborating with the customer.
As Ted’s career continued, he moved on to an oil and gas startup and continued to dig deeper into the data side of software development, gaining an even deeper interest in how databases work and how to eke as much performance out of them as possible. During this time he became interested in the application of graph databases to certain problem sets. Today, at Expero, Ted enjoys putting his deep knowledge of transactional graph computing to work as he helps customers of all types navigate the burgeoning property graph database landscape.
Outside of work, Ted enjoys spending time with his family out-of-doors, listening to and playing loud music, and contributing to the Apache TinkerPop project as a committer and PMC member.
Ted will be giving the Graph Day keynote: The State of JanusGraph 2018

Daniel Woodie (Austin) @DanielWoodie5

Daniel Woodie is the founder and lead scientist of Bamboo Analytics, a data science services firm. He trained originally as a statistician and has worked on applications ranging from systems neuroscience to global supply chains. With Bamboo Analytics he offers analytical consulting and training to early stage startups and Fortune 500 companies alike.
Daniel will be emcee for R User Day at Data Day Texas.



Emil Eifrem of Neo4j describing the evolution of the property graph model.

Rob McDaniel (Seattle) of Lingistic was one of the highest rated speakers at Data Day Texas 2017