ChatGPT is the fastest-growing software application ever, reaching about 100 million users just two months after launching due to its remarkable ability to generate complex, realistic text.
Educators and students quickly uncovered significant concerns about ChatGPT’s place in a learning environment. These include an increased potential for plagiarism and a tendency for the tool to produce incorrect claims that sound true, such as stating the opposite of what researchers actually published or producing a wrong math solution.
And yet, the promise of ChatGPT-style systems is greater than their current, very real drawbacks. This technology has already shown proficiency in difficult tasks such as synthesis across information domains, writing clear prose and even authoring computer code.
ChatGPT can provide rapid and personalized feedback. And the application is easy for most people to use.
Technology Already Being Applied
A new Impact Research survey of K-12 schools reveals that 51% of surveyed educators and a third of students already use ChatGPT in the classroom, with an overwhelming majority stating that it has had a positive impact on their teaching and learning.
In the short term, there are many implications for learning with ChatGPT as it exists today. Chris Piech from Stanford is already using the tool to train teachers, who interact with the models to practice their pedagogical moves. “ChatGPT is a bad teacher, but a great student,” Piech says.
Rather than having schools ban the technology, as New York City recently did, the education community has an opportunity to guide the development of GPT-style systems.
We can influence both their product management (i.e., by and for whom are these systems built, for which tasks, and how can we ensure inclusivity?) and their technical development. Specifically, the field requires more high-quality education datasets so that GPT-style models can work for students and teachers.
GPT-style systems will continue to deliver subpar results for education-related tasks if they are not trained on education-specific data, such as teacher-student dialogue or feedback on student work. This is because AI systems learn to reproduce patterns in their training data. So for an AI system to perform well on a task, it must have seen similar tasks before.
But education is one of the wonderfully messy parts of humanity’s experience: it represents complex and always-changing information, mediated through a series of teacher-student and peer relationships.
Education is highly contextual: interventions and curricula work differently for different teachers and learners at different times. Data for AI systems should therefore carefully describe real education contexts, actions, and outcomes.
For instance, for a GPT-style system to work well for middle school math education, it needs more exposure to real middle school math curricula and learning objectives, concepts and misconceptions, useful pedagogical strategies and real student outcomes on a variety of important measures.
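To make the idea of describing context, action and outcome more concrete, here is a minimal sketch of what a single record in such a dataset might look like. Everything in it is an assumption for illustration: the class name, the field names and the example values are invented here and are not drawn from any real dataset or standard.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class MathLearningRecord:
    """One hypothetical dataset row; every field name is illustrative."""
    # Context: who the interaction was for and what it aimed at
    grade_level: int
    learning_objective: str
    # Action: what the student and the teacher actually did
    student_work: str
    misconception: str
    teacher_feedback: str
    pedagogical_strategy: str
    # Outcome: whatever was measured afterwards
    outcomes: dict = field(default_factory=dict)


def to_jsonl_line(record: MathLearningRecord) -> str:
    """Serialize one record as a single JSON Lines row."""
    return json.dumps(asdict(record), ensure_ascii=False)


# Made-up example values, not taken from real student data.
example = MathLearningRecord(
    grade_level=7,
    learning_objective="Add fractions with unlike denominators",
    student_work="1/2 + 1/3 = 2/5",
    misconception="Adds numerators and denominators separately",
    teacher_feedback="What would each fraction look like in sixths?",
    pedagogical_strategy="Guiding question",
    outcomes={"revised_answer_correct": True},
)

print(to_jsonl_line(example))
```

Keeping context, action and outcome together in one record is the design choice that matters here: a model trained only on the feedback text would never see which misconception prompted it or whether it helped.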
Supporting ChatGPT with Dataset Generation
To address these needs, the education field should support open-source datasets that are designed for large statistical systems. Not only will domain-specific training data help improve the performance of these models in an educational context, but open-source data will also catalyze research and development efforts for the entire learning science and education field. “Data as a public good” efforts allow everyone to make progress on critical education technology systems, not just the big tech companies.
One recent example is the “Feedback Prize: English Language Learners” project led by expert Scott Crossley. This project curated a dataset of essays written by English language learners and used AI to determine the level of English proficiency in student writing.
The project crowdsourced solutions on Kaggle, a leading data science platform. The resulting dataset allows AI systems to better model how school writing differs among students with different levels of language proficiency, a topic so central that it has its own office in the U.S. Department of Education.
At Schmidt Futures, we’re supporting efforts like high-quality dataset generation to help make breakthrough progress on key challenges such as significantly improving the rate of middle school math learning. This is a pivotal moment for the education community to play a role in guiding AI progress toward the most important needs of teachers and learners.
Just as slide rules became calculators, which became personal computers, which then gave us operating systems and the internet – now foundational to our educational experiences – new computing technologies like AI systems can provide significant educational opportunities. We just need to give them the data to do so.