Creating new molecules is one of the most difficult tasks in chemistry. Whether the goal is a life-saving drug or a cutting-edge material, each compound must be constructed through a carefully planned series of reactions. Planning these steps requires deep expertise and strategic thinking, so chemists often spend years mastering the processes.
One major hurdle is retrosynthesis. In this approach, chemists start with the molecule they ultimately want and work backwards to find simpler starting materials and possible reaction pathways. This involves many decisions, such as choosing appropriate building blocks, deciding when to form a ring, and deciding whether sensitive parts of the molecule need to be protected. Although computers can scan vast “chemical spaces,” they still have difficulty matching the strategic judgment of experienced chemists.
Another challenge involves reaction mechanisms, which explain how reactions proceed step-by-step through the transfer of electrons. Understanding these mechanisms allows scientists to predict new reactions, improve efficiency, and avoid costly trial and error. Current computational tools can suggest many possible paths, but often lack the intuition needed to pinpoint the most realistic path.
A new AI approach to chemical reasoning
Researchers led by Philipp Schwaller at EPFL have developed a new method for using large language models (LLMs) as reasoning tools in chemistry. Rather than directly generating chemical structures, these models serve as evaluators that guide existing computational systems.
The new framework, called Synthegy, combines traditional search algorithms with AI that can interpret chemical strategies written in natural language.
“User interfaces are critical when creating tools for chemists, and previous tools relied on cumbersome filters and rules,” said Andres M Bran, first author of the Synthegy paper published in Matter. “Synthegy allows chemists to iterate faster and navigate more complex synthetic ideas by simply speaking.”
How Synthegy improves retrosynthesis planning
Synthegy begins with a target molecule and simple instructions written in everyday language. For example, a chemist may require that certain rings be formed early or that unnecessary protecting groups be avoided. Standard retrosynthesis software then generates many possible routes to the target.
Each of these paths is converted to text and reviewed by the language model. Synthegy scores how well each option matches the chemist’s instructions and explains why. This makes it easy to rank and filter the best routes. By guiding searches with natural language, chemists can quickly focus on strategies that meet their goals.
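The score-and-rank loop described above can be sketched in a few lines of Python. This is only an illustration, not Synthegy's actual code: the names `ScoredRoute`, `rank_routes`, and `toy_scorer` are hypothetical, the 0 to 10 score scale is an assumption, and the language-model call is replaced by a trivial keyword counter so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ScoredRoute:
    route_text: str   # the route, serialized as plain text
    score: float      # how well it matches the instruction (assumed 0-10 scale)
    rationale: str    # short explanation returned with the score


# A scorer maps (route_text, instruction) to (score, rationale).
# In a real system this would be a call to a language model.
Scorer = Callable[[str, str], Tuple[float, str]]


def rank_routes(routes: List[str], instruction: str, scorer: Scorer,
                min_score: float = 0.0) -> List[ScoredRoute]:
    """Score every route against the instruction, filter, and sort best-first."""
    scored = []
    for route_text in routes:
        score, rationale = scorer(route_text, instruction)
        if score >= min_score:
            scored.append(ScoredRoute(route_text, score, rationale))
    return sorted(scored, key=lambda r: r.score, reverse=True)


def toy_scorer(route_text: str, instruction: str) -> Tuple[float, str]:
    """Hypothetical stand-in for the LLM: penalizes protection-related steps.

    It ignores the instruction text and simply counts the substring
    'protect' (which also matches 'deprotect').
    """
    n = route_text.lower().count("protect")
    score = max(0.0, 10.0 - 3.0 * n)
    return score, f"{n} protection-related step(s) found"


# Two made-up routes written as text, as a retrosynthesis tool might emit them.
routes = [
    "protect amine -> amide coupling -> deprotect amine",
    "direct amide coupling -> ring closure",
]
ranked = rank_routes(routes, "avoid unnecessary protecting groups", toy_scorer)
for r in ranked:
    print(r.score, r.route_text, "|", r.rationale)
```

Keeping the scorer as a plain function argument means the ranking logic does not depend on which model, or which prompt, produces the scores.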
Understanding reaction mechanisms with AI
Synthegy applies a similar approach to reaction mechanisms. It breaks reactions down into elementary electron movements and explores the possible sequences of steps. A language model evaluates each step and guides the search toward chemically meaningful paths.
The system can also incorporate additional details, such as reaction conditions and expert hypotheses, provided as text. This flexibility allows researchers to refine their analyses and explore more realistic scenarios.
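One way to picture an evaluator-guided search over elementary steps is a small beam search. The article does not say which search algorithm Synthegy uses, so beam search is an assumption here. The function `guided_search`, the state labels, and the numeric step scores are all hypothetical, and a lookup table stands in for the language model's judgment of each step.

```python
from typing import Callable, Dict, List, Optional, Tuple


def guided_search(start: str, target: str,
                  moves: Dict[str, List[str]],
                  evaluate: Callable[[str, str], float],
                  beam_width: int = 2,
                  max_steps: int = 5) -> Optional[List[str]]:
    """Beam search: keep only the highest-scoring partial paths at each depth."""
    beam: List[Tuple[float, List[str]]] = [(0.0, [start])]
    for _ in range(max_steps):
        candidates = []
        for total, path in beam:
            state = path[-1]
            if state == target:
                return path
            for nxt in moves.get(state, []):
                if nxt in path:  # do not revisit a state on the same path
                    continue
                candidates.append((total + evaluate(state, nxt), path + [nxt]))
        if not candidates:
            return None
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    # Final check in case the target was reached on the last expansion.
    for _, path in beam:
        if path[-1] == target:
            return path
    return None


# Hypothetical elementary steps between abstract states.
moves = {
    "reactants": ["intermediate_A", "intermediate_B"],
    "intermediate_A": ["products"],
    "intermediate_B": ["products"],
}

# Hypothetical plausibility scores; a language model would supply these.
step_scores = {
    ("reactants", "intermediate_A"): 0.9,
    ("reactants", "intermediate_B"): 0.1,
    ("intermediate_A", "products"): 0.8,
    ("intermediate_B", "products"): 0.2,
}


def toy_evaluate(state: str, nxt: str) -> float:
    return step_scores.get((state, nxt), 0.0)


path = guided_search("reactants", "products", moves, toy_evaluate, beam_width=1)
print(path)
```

With a beam width of 1 the search follows only the step the evaluator rates highest at each depth. A wider beam keeps alternative mechanisms alive for longer, at the cost of more evaluator calls.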
Performance and verification by chemists
In synthesis planning, Synthegy was able to identify pathways that matched complex strategic instructions. In a double-blind study, 36 chemists provided 368 valid ratings, and their ratings matched the system’s results an average of 71.2% of the time.
The framework can flag unnecessary protection steps, assess how feasible a reaction is, and prioritize efficient solutions. The researchers also show that LLMs can operate at multiple levels, from analyzing functional groups to evaluating entire synthetic routes. Larger models performed best, while smaller models showed more limited capabilities.
The new role of AI in chemistry
This research highlights a different way AI can support chemistry. Rather than replacing human decision-making, Synthegy positions language models as guides that help interpret and refine computational results. Chemists can explain their goals in plain language and get solutions that reflect their strategies.
This approach has the potential to accelerate drug discovery, improve reaction design, and make advanced tools more accessible to scientists.
“The relationship between synthetic planning and mechanisms is very interesting. We usually use mechanisms to discover new reactions that allow us to synthesize new molecules,” says Andres M Bran. “Our job is to bridge that gap computationally through a unified natural language interface.”
Other contributors
- National Centre of Competence in Research Catalysis (NCCR Catalysis)
- b12 lab

