NSF Awards: 1912408
North Carolina Central University (NCCU) Mathematics faculty members, in collaboration with social science researchers from Cynosure Consulting, partnered in the Data Science for Social Justice (DSSJ) project to develop resources and materials that leverage data science as a vehicle for identifying, describing, and addressing social inequities. The DSSJ project developed tools and resources that support incorporating social-justice-embedded data science exploration into freshman seminar courses at NCCU, a public HBCU. The aim is to give students an early introduction to data science exploration using real-world data sets on social justice topics that NCCU students identified as being of great interest, highlighting the power and relevance of data science as a means of promoting the engagement of underrepresented students in STEM.
The DSSJ project leadership has worked to disseminate the project tools and resources more widely by establishing a project website, which gives high school-aged and undergraduate students beyond NCCU broader access to, and a way to engage in, exploring and visualizing stories within real-world social-justice-embedded data sets. The DSSJ project website provides open access to the DSSJ curated datasets and supplemental resources.
This video highlights how the DSSJ project leverages data science to promote important discussion and exploration of social justice issues, and it provides an overview of the project's key tools, data topics, and access information.
Rebecca Lowe
Senior Consultant
We are so excited to engage in discussion around our work and to garner feedback from the larger community of STEM educators and researchers interested in Data Science and Social Justice. We are particularly interested in learning about key introductory Data Science content that should be prioritized in our curriculum activities.
Marie Himes
Hi Rebecca, Ravanasamudram, and Adrienne,
I'm very excited that we have been able to connect through this community! Although my team's focus at NC State's Friday Institute is not data science, I have colleagues (Hollylynne Lee, Gemma Mojica, Emily Thrasher, among others) in the Hub for Innovation and Research in Statistics Education (HI-RiSE) who could be constructive thought partners in examining key introductory data science content as it relates to the Data Science for Social Justice Project.
With gratitude,
Marie
Ravanasamudram Uma
Professor
Hi Marie! That would be great. Maybe we could set up a meeting after the showcase to discuss?
Marie Himes
Absolutely! My email is mphimes@ncsu.edu
Kristin Flaming
Hi Rebecca,
Our model is driven by the passion of the students for their research topics. As you know, it is so powerful when students can answer questions of interest to them. We too find that our students choose hot topics like the datasets that were more popular in your project. Our Passion-Driven Statistics model is always looking for good public-access archival datasets to use with our students. We only use datasets that have a large number of variables and cases, a good codebook, and clean data. Based on your work with the curated datasets, do you have any that you recommend I explore first?
In the summer of 2019, Kristel Gallagher, a partner at Thiel College, and I held a boot camp at Campbell University Medical School, about an hour south of you. The partner at that university is hosting a boot camp using our model this summer for another cohort of medical students.
I had not heard of CODAP. I was thankful that your website links to resources for me to explore. This looks like a great option for our middle school students in STEM/STEAM classes who want to incorporate data but do not have the time or experience with coding and data.
I am sure as I explore your website I will have more thoughts and questions.
Ravanasamudram Uma
Professor
Hi Kristin,
You can check out our datasets at https://sites.google.com/view/dssj/projects. How many variables would you consider for it to be "large"? In curating our datasets, for the most part we retained all the variables as is. However, for several of the datasets, we do limit the number of data points (rows) since CODAP prefers datasets with under 5000 rows and if you provide a dataset with more rows, it samples 5000 rows.
Please feel free to share your feedback if you use our datasets.
Best,
Uma
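For readers preparing their own data for CODAP, the row limit mentioned in the post above can be handled before import. What follows is a minimal sketch in Python, not the DSSJ team's curation procedure. The 5000 figure is taken from the post, while the function names, the fixed seed, and any file paths are hypothetical and chosen only for illustration.

```python
import csv
import random

# Row count quoted in the post above; treated here as an assumption,
# not as a documented CODAP constant.
ROW_LIMIT = 5000


def downsample_rows(rows, limit=ROW_LIMIT, seed=0):
    """Return at most `limit` rows, sampled uniformly without replacement.

    The surviving rows keep their original relative order.
    """
    if len(rows) <= limit:
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    keep = sorted(rng.sample(range(len(rows)), limit))
    return [rows[i] for i in keep]


def downsample_csv(src_path, dst_path, limit=ROW_LIMIT, seed=0):
    """Copy a CSV file, keeping the header and at most `limit` data rows."""
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        rows = list(reader)
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(downsample_rows(rows, limit, seed))
```

Sampling ahead of time with a fixed seed means every student works from the same subset, which may be easier to discuss in class than a sample drawn by the tool at import time.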
Kristin Flaming
Thanks Uma. I did find your website and the data. I was just curious if there are one or two that y'all gravitate toward.
In terms of data points, 5000 is enough. We do use much larger datasets when that is their size and the software we use allows for it. However, the AddHealth dataset I tend to gravitate toward has around 5000 participants.
Ravanasamudram Uma
Professor
Hi Kristin - two datasets that most of our students are interested in are Fatal Police Shootings and Fatal Police Violence. A significant number of students are also interested in domestic violence and sexual assault. Another common ask is for LGBTQ+ data - I am working on that and hope to make it available by the end of the summer.
Sarah Olsen
Hi Rebecca - thanks for sharing your work in this great video! Data science is a rapidly growing field with such important implications for addressing systemic issues, but it can be a challenge to engage students with data analysis—especially for younger students who might not have a lot of math skills or interests. Did you investigate students' perceptions of data analysis and if so, what did you find? What are some of the long term impacts you would hope to see for students who participate in this type of learning?
Thanks for taking a look at our video on community science investigations for social justice!
Best regards,
Sarah
Adrienne Smith
President
Hi Sarah, I wanted to jump in and respond. We did look at students' perceptions of data science and we found statistically significant changes in pre-to-post measures of student interest in learning about data science as a career option and their desire to become more knowledgeable about data science. Ultimately, we hope interventions like these will seed interest in STEM careers and will have students view STEM as a vehicle for social justice, which is a powerful motivator in career selections, particularly for marginalized groups.
Kathryn Kozak
This is great. As a teacher of introductory statistics, the biggest problem is finding datasets. Having datasets that focus on social justice issues is also very important and helpful in our classes. I don't have great datasets, but have you looked to see what Our World in Data has that may be social justice focused?
Ravanasamudram Uma
Professor
Hi Kathryn,
Yes, we found out about Our World in Data one year into our project. When we started this project, we wanted to keep the focus on the data on social justice issues that are closer to home for our students. So our focus was on data from North Carolina and the US. Now as we are planning to expand our work, we will be tying in relevant data from Our World in Data as well.
Best,
Uma
Kathryn Kozak
I would love to see more of your datasets. The hardest thing to do in statistics is find good clean data. I also want to have some that have a social justice focus. Thank you for putting this together. I look forward to seeing what other datasets you have.
Ravanasamudram Uma
Professor
Thanks Kathryn! Feel free to check out our current list of datasets at https://sites.google.com/view/dssj/projects. Also feel free to let us know if you are looking for datasets in any particular topic(s) and if possible, we can add it to this collection. You can email me at ruma@nccu.edu.
Kathryn Kozak
Thanks. This is a great site.
Barbara Hopkins
Hi Rebecca & Ravanasamudram! I love the CODAP project! We discovered it and considered integrating it with our high school storyline challenge. Although we are still editing the final document for the website, you can see the personalization teachers used to focus the storyline on 10th grade students' interests. Data is always more exciting when the students are asking the questions! The communication of that data plays a huge role in influencing public perception and decision-making. We found that students experienced that influence as they debated or critiqued others' arguments. I love that you use a data approach to visualize social justice issues. Demonstrating explicit evidence goes a long way in convincing opinions! Bravo!
Remy Dou
Assistant Professor
Dr. Lowe and Team, thank you for your wonderful contribution to the showcase and for the tremendous work you all do. I love the focus on social justice issues as personally-relevant and culturally situated topics on which to support science learning. I especially liked how multiple points of entry to making personally relevant connections seemed to be present in your example topic, racial disparities in maternal mortality. I'm curious about the choice of topics and I'm encouraged by your student-centered approach to determining those. What other factors do/did you consider when deciding on which topics to prioritize in your program development? As a showcase facilitator, I'm also curious about others' experiences or thoughts on topic selection.
Ravanasamudram Uma
Professor
Thank you Remy! Given that we are an HBCU, our students' choice of topics was the highest priority. Additionally, we added COVID-19 once the pandemic began and we started seeing how it was affecting African Americans and other minorities disproportionately. Our advisory board also suggested topics that not all students may think of but that nevertheless impact them, such as environmental justice.
Remy Dou
Assistant Professor
Thank you for your thoughtful and informative response!
Harrison Pinckney
Assistant Professor
The two components of this project that stand out are the use of relevant examples to display the injustice (e.g., mortality rates) and the focus on teaching students how to find and communicate data in a way that is meaningful to others. Together these factors can contribute greatly to attracting scientists from marginalized communities. I would like to learn more about how the students were positioned to lead this study and advocate for the needs of their communities.
Ravanasamudram Uma
Professor
Hi Harrison - in its current implementation it is one project in a Freshman Seminar course. So with limited time devoted to the project, the focus is more on shedding light on the problem through data and helping the students visualize the problem and come up with some solutions that may address it. But due to time constraints, the students don't have the bandwidth to discuss their solutions, refine them, and use them to advocate for their communities. We are in the process of couching this in semester-long courses that we hope will enable the students to extend their analyses to advocacy.
LaShawnda Lindsay
Research Scientist
Greetings,
This seems like a very interesting project. I am very interested in learning more about methodologies used to engage high school and college students in data science. What were some of the outcomes of students' learning experiences?
Adrienne Smith
President
Hi LaShawnda - There were several techniques that were used to pique student interest and make data science more accessible, relevant, and interesting. These include giving students autonomy in selecting data sets of interest to them (which were chosen based on student interest), using CODAP as a beginner-friendly data exploration tool, and situating the work within a problem-based project, which is supported by best practices. We learned early on that many students had very little background knowledge on data science, so we added a series of videos that could fill in that gap. These are all available on the project website. Outcomes included increases in students' data science knowledge, interest in learning about data science as a career option, greater awareness of big data use across employment sectors, and an increase in desire to become more knowledgeable about data science.
Daniel McGarvey
I love this. Pretty much everybody participating in this Video Showcase is thinking hard about engaging students and/or the public in scientific thinking. All sorts of neat approaches to a similar set of goals, but this project is the first one I've seen that lets college freshmen choose a topic of personal importance, then lets that serve as the motivation to learn as much as they can. In my experience, grad students usually take the "minimum" amount of statistical coursework. But I wonder if their mentality would be different if they started with data that has deep personal relevance, then followed the trail of "think about how much more you could do with the data if you learned these other techniques" motivation. What a great idea. I'm going to think about how to implement this strategy in my own teaching and the grad program that I lead.
Ravanasamudram Uma
Professor
Awesome Daniel! Would love to hear your experiences with using this data once you have implemented it.
Iris Wagstaff
Thanks for this work. Using an emerging field like data science to address social justice issues gives students an opportunity to leverage the STEM expertise they are developing to address issues in their community and issues that they care about in the real world. I look forward to learning more about the potential impact this project will have.
Ravanasamudram Uma
Professor
Thanks Iris! We will share our results at the conclusion of our project.
Josephine Louie
Hi DSSJ Team,
It is great to learn about your work! We had a video in the showcase last year sharing our social justice data literacy curriculum modules targeted toward high school students in non-AP mathematics classes from historically marginalized populations, also using CODAP (https://stemforall2021.videohall.com/presentati...). This year our video entry describes our project supporting data literacy in middle school science classes in low-income rural communities, with a focus on examining large-scale extreme weather data (https://stemforall2022.videohall.com/presentati...). Our missions are aligned!
For the age groups we've been working with, and given the limited time that teachers can give to our curriculum units (usually 3 weeks max per fall or spring half of the school year), we found we needed to constrain the questions that students could choose from to explore in their data investigations and problem-based projects. Although we give students access to dozens of variables and thousands of cases from the U.S. decennial census, the American Community Survey, and from NOAA, we have needed to create curated datasets to help students meet lesson and unit learning objectives. How much have students explored freely with the data you have provided, and what have been the results?
In addition, one of the research foci in our high school social justice statistics project has been to help students recognize and understand multivariable relationships (particularly interaction effects among three variables). We have found that analyzing data in the context of real and compelling social questions helps students with this type of thinking, but it's still hard for them. What efforts have you made in this domain, which is so critical when working with large, multivariable datasets?
Ravanasamudram Uma
Professor
Thanks Josephine for your thoughtful questions! Similar to your project, in our implementation as well (in a Freshman Seminar class), students had only limited time (3-4 lectures per semester) that they could devote to this project. Within that limited time, they had to learn how to use CODAP, understand the data, and visualize the data and hopefully do some analyses. And remote learning didn't make it any easier. So we don't know how much additional time students devoted to freely exploring the data. Anecdotally, we are aware of a few students doing it. And as Adrienne and Rebecca have noted in an earlier response: "We did look at students' perceptions of data science and we found statistically significant changes in pre-to-post measures of student interest in learning about data science as a career option and their desire to become more knowledgeable about data science."
Thanks for raising the second question. In the current implementation we have not had the time to discuss the interactions of multivariable relationships. Our freshman seminar students included both STEM and non-STEM majors. In future, we are planning to use these topics/datasets in a semester-long course for STEM majors where we hope to address these issues in more depth.
Ambika Silva
Hi,
Thank you for sharing about your project and for sharing your site! I really love that you're sharing your data sets; I have been wanting more cleaned-up data sets to go into these topics that you have looked at.
Ravanasamudram Uma
Professor
Thanks Ambika! If you are looking for datasets in any particular topic(s) do let us know. If possible, we will add it to this collection. You can email me at ruma@nccu.edu.
Suzanne Dikker
I love this project! I was wondering if you had explored the possibility of connecting with human behavioral datasets or survey data (e.g., from mTurk), and if so, what your experience might have been? We are exploring ways to connect student authentic inquiry using behavioral tasks & surveys to existing datasets that they can explore for inspiration/iteration.
Ravanasamudram Uma
Professor
Thanks Suzanne! We are not familiar with mTurk (we will check it out). We do have plans to collect qualitative data. But given the nature of the social justice topics we are working with, we will need to be sensitive in how we design the questionnaire and to that end we are intending to collaborate with psychology researchers to help us design and collect such data.