Friday, October 20, 2023

Chemistry Prompt Engineering

 

After my initial foray into ChatGPT in March and April, I haven’t used it much. The context in which I used ChatGPT was to explore how it could be leveraged in chemical education. Clearly it has many limitations, but it could be useful to students in generating study guides, test questions, initial ideas for a research topic, and bits of code. Critiquing ChatGPT’s answers to chemistry questions could also help sharpen a student’s conceptual understanding, since they have to think carefully about the answers and differentiate the right from the wrong.

 

I had given little thought to how GPT might aid my research or the chemical enterprise more generally. I have seen A.I. methods used for materials discovery, retrosynthetic analysis, and cheminformatics, but these were usually optimized for the particular problem to be solved rather than using a more general LLM (large language model). So it was clickbait for me when I saw the title and abstract of a recently published paper (shown below).

 

[Image: title and abstract of the paper]
Reading through the paper, the “tests” are somewhat limited in scope, but I appreciated how the authors organized their investigation around different aspects of the multifaceted chemical research enterprise. Not surprisingly, GPT-4 can do simple tasks reasonably well, though it gets some things wrong. The examples are interesting and somewhat illuminating. In one case, after being given temperature and vapor-pressure data, GPT-4 is asked to find the boiling point, and its results are compared to a Bayesian optimization. (GPT does worse.) In another case, GPT-4 is used to generate Python code to control a robot arm.
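For reference, the boiling-point task can also be done the classical way: fit the Clausius–Clapeyron relation, ln P = −ΔHvap/(RT) + C, to the temperature/vapor-pressure data, then solve for the temperature at which P = 1 atm. Here is a minimal sketch in Python, using illustrative textbook-style values for water rather than the paper's actual dataset:

```python
import math

# Illustrative vapor-pressure data for water: (T in K, P in kPa).
# These are textbook-style values, NOT the dataset used in the paper.
data = [(313.15, 7.38), (333.15, 19.93), (353.15, 47.37), (363.15, 70.12)]

# Clausius-Clapeyron: ln P = -dHvap/R * (1/T) + C, i.e. linear in 1/T,
# so an ordinary least-squares line through (1/T, ln P) gives the fit.
xs = [1.0 / T for T, _ in data]
ys = [math.log(P) for _, P in data]

n = len(data)
x_mean, y_mean = sum(xs) / n, sum(ys) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = y_mean - slope * x_mean

# Solve ln(101.325) = slope * (1/T) + intercept for T (P = 1 atm = 101.325 kPa).
T_boil = slope / (math.log(101.325) - intercept)
print(f"Estimated boiling point: {T_boil:.1f} K")  # roughly 373 K for this data
```

The extrapolation is only as good as the assumption that ΔHvap is constant over the temperature range, but for these four points it lands near the familiar 373 K.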

 

The authors include several caveats. Molecular recognition is still a problem for GPT-4, but one could have GPT-4 talk to another modeling system (it already does so with Wolfram for math) that handles the encoding and decoding of molecules cheminformatics-style. GPT-4’s training data doesn’t always include the most recent chemical literature. Again, one could build a local model that includes it, which has the added advantage of keeping data local and proprietary in the chemical industry. GPT-4 still gets things “wrong,” but that’s because GPT isn’t optimized to get things right, only to sound plausible. Will GPT get better? Probably. But for cutting-edge work in the chemical enterprise, I expect local, specifically trained models with chemical goals in mind to do better. The authors also consider how one might define “language objects” pertaining to chemistry, and this might be an approach worth further testing.
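The delegation idea, where GPT-4 hands a chemistry-specific subtask to a deterministic local tool rather than computing it itself, can be sketched in a few lines. Everything here (the dispatcher, the tiny molar-mass tool, the query format) is a hypothetical illustration, not something from the paper:

```python
import re

# Hypothetical sketch of tool delegation: the LLM front end recognizes a
# chemistry intent and routes it to a deterministic local function instead
# of generating the answer token-by-token. All names here are made up.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}

def molar_mass(formula: str) -> float:
    """Molar mass (g/mol) of a simple formula like 'H2SO4' (no parentheses)."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol:
            total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total

def answer_chemistry_query(query: str) -> str:
    """Toy dispatcher: handle what the tool can, defer the rest to the LLM."""
    match = re.match(r"molar mass of (\S+)", query)
    if match:
        return f"{molar_mass(match.group(1)):.2f} g/mol"
    return "(defer to the language model)"

print(answer_chemistry_query("molar mass of H2SO4"))  # prints "98.07 g/mol"
```

The point of this design is that the number comes from a lookup table and arithmetic rather than from token prediction, so it is either right or it raises an error; it is never merely plausible.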

 

Here's a picture of what the authors think GPT-4 can and cannot do. I add the caveat that the example prompts they used were limited, so even in the areas where they put green check marks, I would say it works better in some subareas than others. None of these have been “solved” by GPT.

 

[Image: the authors' chart of what GPT-4 can and cannot do]
 
