For those following at home, I spent the end of my last post focused on thinking through the existing skills that I teach in my course. I now want to spend some time thinking about new skills, whether they exist, whether they’re worth teaching, and how they might be taught.
There’s a whole lot written about this and a whole lot of controversy surrounding it. But all of it feels related to an oft-cited Charles Babbage quote about his mechanical computer:
On two occasions I have been asked, ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
Let me explain what I mean. As far as I can tell there are two approaches to ensuring that my students can adapt to a gen AI world:
Approach 1: Keep doing what I’m doing
The main point of this approach is to say that all the underlying ideas needed to use AI well are already embedded in courses that teach the fundamental content of expertise. As evidenced in research and likely your own experience, AI appears to enhance the capabilities of experts who can critically evaluate its outputs, while potentially misleading or slowing down people who don’t know the substantive content required to direct the tools effectively.
Similarly, I’ve seen a fair number of arguments that boil down to “AI literacy is information literacy”. Even with hallucinations emerging as a real barrier to working with these tools (a problem which, narrowly conceived, I believe is rapidly going away), teaching people to critically evaluate information is the same as it ever was. Beyond basic verification skills, deeper content expertise should move us in the right direction when it comes to evaluating AI outputs.
If that’s the case, then we should be doubling down on generating expertise, and treat generative AI as just another tool that enhances its value.
Returning to the Charles Babbage quote, I think of this position as arguing that entering “the wrong figures” into the machine is the problem, and understanding what “the right answers” means is where we should be focusing our attention.
Approach 2: Lean hard into AI skills
On the flip side, there are those who argue that using AI effectively is itself a skill distinct from the sort of expertise that we generally try to instill in our students. Just like any tool, the argument goes, LLMs have particularities that must be understood and practiced in order to use them well. Effective prompting to reduce hallucinations (shameless plug), using few-shot and chain-of-thought techniques, understanding temperature and reasoning-level parameters, and developing skills and connectors to enhance the quality of outputs are all things that I have had to learn since 2022.
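To make “few-shot” concrete for readers who haven’t tried it: the technique amounts to packing a handful of worked examples into the prompt before the real question, so the model continues the demonstrated pattern rather than answering cold. A minimal sketch (the labeling task and examples here are hypothetical; in real use the assembled string would be sent to an LLM API):

```python
def build_few_shot_prompt(examples, question):
    """Assemble a few-shot prompt: worked examples first, then the new task.
    The model is expected to continue the pattern the examples establish."""
    parts = []
    for text, label in examples:
        parts.append(f"Review: {text}\nSentiment: {label}")
    # End with the unanswered case so the model fills in the label
    parts.append(f"Review: {question}\nSentiment:")
    return "\n\n".join(parts)

# Hypothetical sentiment-labeling examples
examples = [
    ("The analysis was clear and well organized.", "positive"),
    ("The figures were mislabeled and confusing.", "negative"),
]
prompt = build_few_shot_prompt(examples, "The dataset was easy to work with.")
print(prompt)
```

The point of the pattern is that the examples, not instructions, carry most of the information about the desired output format.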
Empirically, there’s some evidence that AI-specific training enhances the work of engineering students. But I admit to being more convinced by anecdotal evidence in my own life. Time and time again I have seen brilliant colleagues being completely misled by an AI’s outputs (pro tip: asking “why did you answer that way” is not an effective way to understand a model’s behavior) despite having more substantive expertise than I could ever hope to have.
The Babbage quote is apt here: this position claims that the problem is the notion that someone could think a machine could turn wrong figures into right answers. AI has often been sold and interpreted as a black box where magic happens (which I understand, given that these things are pretty complicated!), and the implications of using such tools without understanding how they work are concerning.
As a result, several schools have started incorporating AI skills into their training. From my colleagues in the K-12 space, it seems that “AI literacy” is already an overused term coming from school administrators, governments, and the private sector.
A quick aside: It’s been interesting to see arguments opposing “AI literacy” as a freestanding skillset coming from both detractors and proponents of generative AI. I’ve seen AI detractors argue that Approach #2 is substantively thin and the product of AI marketing, distracting us from our real goals as educators. At the same time, I’ve had proponents of AI’s power argue that these tools may actually give the right answers to the wrong figures someday! Essentially, these tools will be able to understand what we really mean from our questions, probe us to better understand what we want, and produce the outputs that we were actually looking for to begin with. Regardless of the angle, both sides frame these skillsets as brittle, superficial, and changing too quickly to be worth teaching.
Operationalizing the Messy Middle
I have tried to avoid straw-manning these approaches, because I think they have real merit. The fundamentals of critical thinking, interpreting evidence to better understand the world, and expressing ourselves effectively remain as important as ever, and I’m highly skeptical of the kinds of “AI concentrations” that I linked above. On the other hand, I think it is folly to act as though generative AI doesn’t present genuinely new and difficult challenges, or that it is somehow so good that you don’t need to know how to use it to capture its benefits.
But as you may have guessed, I view both as too extreme. In the Bayesian updating parlance that we teach in my course, I’d call Approach #1 under-updating our priors to the reality of a changing landscape, where everything we’ve been teaching until now just happens to be perfectly suited to a technology that is transforming or destabilizing almost every part of society. On the other hand, I’d call Approach #2 over-updating: we are scrambling to redesign our curricula to meet the moment, when the moment is itself very hard to pin down.
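To ground the parlance for readers outside the course: in the simplest Bayesian setup, a posterior belief is the prior reweighted by how well each hypothesis explains the evidence; under-updating leaves the posterior too close to the prior, over-updating swings it too far. A toy sketch, with numbers invented purely for illustration:

```python
def bayes_update(prior, likelihood_h, likelihood_not_h):
    """Posterior probability of hypothesis H after seeing evidence E,
    via Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)."""
    evidence = likelihood_h * prior + likelihood_not_h * (1 - prior)
    return likelihood_h * prior / evidence

# Made-up numbers: a 50/50 prior, and evidence twice as likely under H.
posterior = bayes_update(prior=0.5, likelihood_h=0.8, likelihood_not_h=0.4)
print(round(posterior, 3))  # 0.667: belief shifts toward H, but not to certainty
```

The normative answer (here, two thirds) sits between clinging to the 50/50 prior and leaping straight to certainty, which is exactly the middle ground I’m after pedagogically.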
Coding in English
So how to find a middle ground in my course? One could imagine supplementing the existing curriculum with AI assistance, where we rely heavily on LLMs to help students learn the core ideas of the course (to that end, we do have a guardrailed tutorbot called StatGPT that students can ask questions to, and we as faculty can review the questions they ask). This might look like having students work with LLMs to do data analysis from the start of the course, essentially replacing Excel work. A fascinating effort to do this by Jacob Bien and Gourab Mukherjee in an intro data science course for MBAs suggests that it is in fact possible to abstract away the analysis machine (Excel, R, Python) and have AI be the medium through which they interact with that machine. This was more involved than “just ask AI what you want!”, suggesting the development of AI skills in the process (I quite like the three prompting principles they use).
On the other hand, one can imagine this approach also abstracting away the productive failure that helps solidify deep understanding. I can try my best to teach students to be critical of AI output, to ask questions about the underlying assumptions of an analysis, and to request sensitivity analyses that probe the robustness of results. But until a student sees the million little assumptions they have to make when putting together an analysis and struggles to justify them, they won’t have developed a mindset to question assumptions before checking to see whether a result is agreeable to them. A recent arXiv publication by Favero et al. frames the risk as a cognitive death spiral:
Cognitive offloading reduces opportunities for effortful reasoning; diminished effort contributes to illusory learning and overtrust; overtrust weakens learner agency; and reduced agency exacerbates emotional harms such as anxiety, diminished self-efficacy, and dependency.
The literature on generative AI’s effect on learning outcomes is admittedly weak. But higher quality studies suggest that AI without heavy guardrails can quickly harm learning (Bastani et al. 2025). On the other hand, Kestin et al. found large benefits to guardrailed AI use, suggesting that these tools can be introduced well. So maybe guardrailed AI use, along the lines of what I’ve been doing for the past 1-2 years, is the right approach.
A Wrinkle: The Expertise Reversal Effect
But not so fast. There’s a well-documented concept in cognitive load theory called the expertise reversal effect. In a nutshell, it suggests that while a large amount of instructional scaffolding improves learning outcomes for novices, that effect diminishes and actually reverses for people with more expertise. That is to say: throwing novices in the deep end can create so much cognitive overhead that they fail to actually learn or internalize; on the flip side, hand-holding of students who have high familiarity with the underlying concepts can short-circuit existing pathways and lead to less benefit.
It’s worth noting that this effect is primarily documented in STEM courses, but it’s certainly compelling: my impulse to “protect” my students from the ease of AI tools through guardrails and strict AI policies is likely a good place to start, but not a good place to end.
A potential sequence
So what does this imply for my course? It’s making me lean toward the following structure:
- Start with productive struggle to build foundational schemas. Potentially use guardrailed AI to help provide scaffolding to students during exercises, but emphasize direct struggle with the underlying skills. The scaffolding here is about learning the concepts themselves;
- As competence develops, provide structured exercises where students use generative AI models to replicate and build off of their manual skills. Assignments should emphasize alternating between the two to further reinforce what they know and to get them used to what these tools can and can’t do at the moment. The scaffolding here is about learning to use AI;
- Gradually fade scaffolding in both concepts and AI tools. This means having them use commercially available, non-guardrailed tools to solve problems that they don’t necessarily know how to solve.
This actually reminds me of a sequence I introduced in the first problem set of our course, where we’re exposing students to creating scatterplots in Excel. It goes something like this:
Q1) Choose two of the ten variables in this dataset that interest you and that you believe might be related. Suggest two distinct reasons that could explain the relationship you found between the variables.
Q2) Use the CORREL function in Excel to calculate the correlation coefficient. (Use StatGPT for help navigating Excel and troubleshooting your formulas. You should carry out the analysis yourself using Excel.)
Q3) AI tools can also be used to conduct analyses and generate figures, though they can also make mistakes and assumptions that you might not agree with. Try uploading the dataset to another AI tool and asking it to create a visual that builds on the scatterplot you made in Q2.
Q4) While these tools are extremely powerful, they are also known to make mistakes. Think of one spot check you can do of the graph to make sure that its data is accurate (e.g., pick a set of points and check the spreadsheet to make sure they are accurate). What did you check for and how did you validate it? Does it look like StatGPT was accurate?
In this sequence, I’m trying to do conceptual and technical scaffolding without (Q1) and with (Q2) AI, then have them use a commercial AI tool with additional scaffolding to help ground their use in existing knowledge and build new skills (Q3 and Q4). The unrestricted, “hands off” part of my proposed sequence isn’t in this problem set, but I think it’s a start.
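As a side note on Q2 and Q4: Excel’s CORREL computes the Pearson correlation coefficient, and one natural spot check once students move beyond Excel is to recompute that quantity by hand. A minimal Python sketch, with made-up data standing in for whatever two variables a student picks:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: the same quantity
    Excel's CORREL function returns for two ranges."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up data: hours studied vs. exam score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 74]
print(round(pearson_r(hours, scores), 3))  # strong positive correlation
```

Comparing a hand computation like this against what Excel (or an AI tool) reports is exactly the kind of grounding check Q4 is gesturing at.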