ESPM 157: Data Science & Global Change Ecology

An upper-division course for UC Berkeley undergraduates. This course is designed to appeal both to students pursuing degrees in environmental or life sciences and looking to improve their technical skills, and to students in data science, computer science or statistics seeking to learn more about possible application areas.

Instructor: Carl Boettiger
GSI: Dana Paige Seidel
Location: Mulford 230
Times: Tu/Th 12:30P - 2:00P
CCN: 46582


Many of the greatest challenges we face today come from understanding and interacting with the natural world: from global climate change to the sudden collapse of fisheries and forests, from the spread of disease and invasive species to the unknown wealth of medical, cultural, and technological value we derive from nature. Advances in satellites and micro-sensors, computation, informatics and the Internet have made available unprecedented amounts of data about the natural world, and with it, new challenges of sifting, processing and synthesizing large and diverse sources of information.

In this course, students will learn to apply the fundamentals of modern data science, computing, statistics and modeling to leading questions in ecology and environmental science. Students will master R programming techniques, relational database fundamentals, data management, version control, working with remote data and REST APIs, and collaborative and reproducible research practices and workflows, while analyzing topics such as global climate change, ecological population dynamics, fisheries collapse and mass extinctions.


This course will use a flipped classroom model: new material is introduced in reading assignments prior to class, while class time focuses on applying these skills to explore interesting data sets. We will move through four modules, each introducing a new data set and new scientific questions, while also introducing a new skill area and building on previous skills. Students will be expected to work collaboratively in and out of class, and course content and grading will emphasize communication and reproducibility of an analysis as much as scientific or technical completeness. The Course Syllabus provides an overview of the modules and topics covered as well as links to weekly reading, assignments, and any lecture material. This syllabus is preliminary and always subject to change.


We will use Grolemund and Wickham's R for Data Science as the primary text for this course. A hard copy of the book is not required; the openly licensed full text is available on the authors' website for the book.

Students will also be referred to sections in Wickham's Advanced R for more on fundamentals of R programming, as well as various other references as necessary.