Data4All workshop introduces high school students to data science research

May 3, 2023 (last updated on December 10, 2024)

The collaborative effort between the Center for Spatial Data Science (CSDS), the Data Science Institute (DSI), Argonne National Laboratory, and the Office of Civic Engagement opens high schoolers up to the “thrill of discovery” of using data science to solve research problems.

By Sarah Steimer

Right now, 24 individuals are conducting inquiry-driven research. They’re learning Python, a popular programming language for data science; training on how to investigate data problems and present findings; and developing critical collaboration, problem-solving, and communication skills, all under the leadership of UChicago’s Data Science Institute, Argonne National Laboratory, the Center for Spatial Data Science, and the Office of Civic Engagement. They’re also all high schoolers.

High school students learn the workshop content in 6 small groups of 4 students each.

The program, called Data4All, is a bridge workshop to introduce Chicago high school students to data science research and inspire them to consider careers in data science and AI. It addresses three main gaps: only a third of students have been exposed to data analysis or data visualization in their schools; many data science students struggle to critically solve problems with data; and there is a need for more data scientists in general, and especially from underrepresented groups.

“Data4All is critically important to DSI’s efforts to broaden participation and increase equity and access to cutting-edge data science education,” says David Uminsky, executive director of DSI. “We are fortunate to have such outstanding experts to partner with in the CSDS, ANL and OCE.”

Currently in its third round, Data4All helps high school students see how data science is relevant to their interests and to critical problem-solving. Not only does the eight-week program improve their technical data science skills, but it also connects them with college and career preparation and internships at UChicago and Argonne.

“We know that data-centered skills and perspectives are becoming increasingly valuable in college and the workforce,” says John Domyancich, Manager of STEM Education Learning at Argonne National Laboratory, who leads the curriculum development and instruction of the program. “Data4All has allowed us to envision what data science education could look like at the high school level. We hope to take what we learn from Data4All and expand on its successes to empower more students, especially ones from underserved communities.”

The program was developed to include gamified statistical concepts, small group activities, and field trips. For example, students have visited Ashlyn Sparrow, Assistant Director of the Weston Game Lab. Sparrow developed a spatial optimization game called Infection City that lets students experiment with where to put clinics to minimize STDs. Students played the game prior to their visit, where Sparrow explained how technology and reasoning are used to design these games.

“Students can really get an intuitive sense of how these concepts work and get some practice,” says Julia Koschinsky, who is on Data4All’s curriculum development team and operationally directs the Center for Spatial Data Science, with faculty direction by Luc Anselin. “Learning how to analyze data is more like skiing. As opposed to just memorizing something, you really need to get the practice down.”

Data4All participants discuss how to address one of the challenges in their programming exercise.

For CSDS, Koschinsky says the workshop is part of a larger project (https://puttingscienceintodatascience.org/) to determine whether data science or spatial data science could be taught to high schoolers and undergraduates within a scientific reasoning framework: Does that engage students? Does it lead them to think more critically about solving problems with data? Does it broaden their understanding of what data science is from programming to quantitative reasoning and scientific analysis? And does it get students away from a mechanical application of statistical tools and methods, toward a thrill of discovery reminiscent of solving a mystery?

“Can we get to those outcomes if we don't just teach programming, but teach programming in the context of scientific reasoning?” Koschinsky asks.

The answer appears to be yes. From an internal evaluation report of last fall’s program, which included quotes from students, it’s clear that the connections between data and social science are clicking. Not only did students speak — unprompted — of how they were grateful for a greater understanding of data science, but they appreciated seeing how it's applicable in their own lives or potential careers.

The high schoolers are also mentored by UChicago students, such as Doug Williams, who says he entered the university as “not a math kid.” But a work project after his freshman year helped him realize both his passion for data science and his capacity to pursue it, and he saw Data4All as an experience that could spark a similar passion in high schoolers. He sees his role as a mentor in the program as helping students tackle any technical skills they need to pursue their ideas.

Doug Williams, who double-majors in data science and public policy at the College, mentors high schoolers in his small group.

“As the program progressed, students realized that good data science is driven by critical thinking and formulating strong questions,” Williams says. “Once they came to see the intimidating math and code as nothing but tools to aid their exploration of a problem, they became super proactive about trying new computational or statistical methods in pursuit of their ideas.”

Data4All is also hitting its goal of engaging underrepresented groups: 90% of the students in last year’s cohort were minorities or girls, and students hailed from 18 public schools across the Chicago region. These were students without prior experience in data analysis or data visualization who, after the program wrapped, reported increased confidence in these areas.

“It opened up students to how they approach data science in a way that also incorporates social, scientific, and humanities thinking,” Koschinsky says. “And that was what kept me hooked and why I'm still engaged, because I feel like we're onto something here: This special sauce of integrating these fields is working. It's exciting.”

Dr. Evelyn Campbell explains how she used data science to tackle the big data problems in her microbiology dissertation.