The last two times I taught G-Chem 1, I briefly told students how a generative A.I. such as ChatGPT can be useful and what some of its limitations are. Last semester (Fall 2025), I made no mention of A.I. use in any of my classes until the last week of the term. I then surveyed the students, asking whether they used A.I. in my class, how so, how often, and whether they found it helpful. The questions were open-ended, and students could answer (or decline to do so) in any way they wished. I prefaced the survey by saying that I had no problem with A.I. use and that their responses would help me provide guidance to my future classes. I taught two sections of G-Chem 1 and one section of Biochem. My musings on the results focus mainly on G-Chem because of the larger class sizes.
In G-Chem, 12% of students said they did not use any A.I., while 88% did. ChatGPT was by far the main source, with Gemini a distant second. (Other apps got only one or two mentions.) Only a small proportion of students said they used it a lot; most used it sparingly or occasionally. A.I. was most often used shortly before exams (in conjunction with getting answers to my study guides) and in the stoichiometry unit, where students wanted help with step-by-step calculations. From my limited tests, GPT-4o is noticeably better than GPT-3 (which wasn't very good) at providing a correct stoichiometry solution, although typically a verbose one.
Interestingly, a few students used the chatbot to recommend YouTube videos to help them understand a topic. (Many students just use Google or go straight to YouTube to look for such videos.) Most students said they found it helpful in “explaining” concepts or how to solve problems. Several students specifically said they used it to generate practice problems or to quiz themselves. One student said it helped them “decode” their notes and explain them in a simple way. Students said it was particularly helpful when they missed class, one even saying “I didn’t need to go to office hours… it gave me the answer from anywhere I liked.” While the majority of students said they found ChatGPT useful, a handful did not.
A number of students offered specific caveats about their usage. One student wrote: “I would strongly recommend not heavily depending on it for homework, as it ends up being more harmful than beneficial. You must know how you obtained your answer, not just copy and paste.” Another student: “These models are constructive for learning as long as you use them productively and have them guide you instead of answering for you.” A third noted: “It was helpful, but some ideas it presented contradicted my notes, so I am not sure how accurate it is.” And another: “While not always correct, I felt that it would usually get me started in the right direction to finish understanding the topic or solving the question on my own.” Interestingly, the students who made these types of comments almost all earned A’s or B’s as their final grade. Also noteworthy: the 12% of students who did not use A.I. also earned A’s or B’s. (The average grade was in the C+ range, so slightly less than 50% of the students earned A’s or B’s.) The students who used it sparingly or rarely were, again, the A or B students. This is perhaps not surprising: the students who knew the material felt less need to use A.I.
Since the best use of a generative A.I. is to generate test questions and study guides, I’m glad to see many students mention using it in this way. Even more used it for explanations or answers, which is more hit-or-miss, but I’m glad that students noticed this. Here’s one thoughtful student comment: “When it comes to studying equations, ChatGPT was very helpful because it showed me step-by-step how to solve it. I also used this model to create practice problems for me. In terms of elaborating the material from class, it was moderately helpful. It mostly gave me vague explanations.” This student thought the vagueness was a limitation of the free version and mused that a paid version might have given better results. One student would load the study guide in and then ask ChatGPT to provide timed quiz questions so that they would feel like they were in an exam.
In Biochem, I saw similar trends: 15% of the students did not use A.I. (All three earned A’s and were among the top five.) There aren’t many math-related or calculation questions in Biochem, so most students used it to clear up things they weren’t sure about, again usually pertaining to the study guides or my lecture slides (which I provide to the students). Since this is a smaller class, I’m not sure whether any trends are significant.
My takeaways: Students are going to use A.I. in a chemistry class regardless of whether you have a policy. The majority of them already do so and feel that it is helpful, so they will keep doing so. The academically stronger students use it less, likely because they understand the material in class and can solve problems without outside help most of the time. Many students leverage the generative capabilities of a large language model to generate test questions, although whether those questions are sufficiently complex is less clear. Some students notice the weaknesses of A.I. answers yet still find it helpful as a guide. Students think A.I. helps to “simplify” a concept they are struggling with; whether it is over-simplified is less clear. Students still gravitate to video explanations to supplement the text explanations of A.I., and YouTube remains a key source for students.