
Machine learning models can accelerate the discovery of new materials by making predictions and suggesting experiments. However, most models today consider only a few specific types of data or variables. Human scientists, by contrast, work in a collaborative environment and draw on experimental results, the broader scientific literature, imaging and structural analysis, personal experience and intuition, and the input of colleagues and peer reviewers.

Now, MIT researchers have developed a method for optimizing material recipes and planning experiments that combines information from different sources, including insights from the literature, chemical compositions, microstructure images, and more. The method is part of a new platform called Copilot for Real-world Experimental Scientists (CREST), which also uses robotic equipment to perform high-throughput materials testing; the results are fed back into large multimodal models to further optimize material recipes.

Human researchers can talk to the system in natural language, without coding, and the system makes its own observations and hypotheses along the way. Cameras and vision-language models also let the system monitor experiments, detect problems, and propose corrections.

“In AI for science, the key is designing new experiments,” said Ju Li, the Carl Richard Soderberg Professor of Power Engineering in MIT’s School of Engineering. “We use multimodal feedback: for example, information from previous literature on the performance of palladium in fuel cells at a given temperature, along with human feedback, to supplement experimental data and design new experiments. We also use robots to synthesize materials, characterize their structures, and test their performance.”

The system is described in a paper published in Nature. The researchers used CREST to explore more than 900 chemistries and perform 3,500 electrochemical tests, discovering a catalyst material that delivers record power density in fuel cells that run on formate salts to generate electricity.

Joining Li on the paper are co-first authors Zhen Zhang, a doctoral student; Zhichu Ren PhD ’24; doctoral student Chia-Wei Hsu; and postdoc Weibin Chen. Their co-authors are Iwnetim Abate, an assistant professor at MIT; Pulkit Agrawal; Yang Shao-Horn, the JR East Professor of Engineering; Aubrey Penn, a researcher at MIT.nano; Zhang-Wei Hong PhD ’25; Hongbin Xu PhD ’25; Daniel Zheng PhD ’25; MIT graduate students Shuhan Miao and Hugh Smith; MIT postdocs Yimeng Huang, Weiyin Chen, Yungsheng Tian, Yifan Gao, and Yaoshen Niu; former MIT postdoc Sipei Li; and collaborators Chi-Feng Lee, Yu-Cheng Shao, Hsiao-Tsu Wang, and Ying-Rui Lu.

Smarter system

Materials science experiments can be time-consuming and expensive. They require researchers to carefully design workflows, make new materials, and perform a series of tests and analyses to understand what is going on. The results are then used to decide how to improve the material.

To improve the process, some researchers have turned to a machine learning strategy called active learning, which makes efficient use of previous experimental data points to decide whether to explore new regions of a design space or exploit known ones. Paired with a statistical technique called Bayesian optimization (BO), active learning has helped researchers identify new materials for applications such as batteries and semiconductors.

“Bayesian optimization is like the way Netflix recommends the next movie to watch based on your viewing history, except it suggests the next experiment to run,” Li explained. “But basic Bayesian optimization is too simplistic. It uses a boxed design space, so if I say I’m going to use platinum, palladium, and iron, it just varies the proportions of those elements within that small space. Real materials have many more dependencies, and BO often gets lost.”
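As a rough illustration of the Bayesian optimization loop described above, here is a minimal sketch, not CREST's actual implementation: fit a Gaussian process surrogate to past measurements and pick the candidate with the highest expected improvement. The toy objective, candidate grid, and function names are invented for illustration.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(mu, sigma, best_y, xi=0.01):
    """Expected-improvement acquisition: how much a candidate is
    expected to beat the best observation so far."""
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best_y - xi) / sigma
    return (mu - best_y - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def suggest_next(X_obs, y_obs, X_candidates):
    """Fit a GP surrogate to past experiments, then return the
    candidate recipe with the highest expected improvement."""
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X_obs, y_obs)
    mu, sigma = gp.predict(X_candidates, return_std=True)
    return X_candidates[int(np.argmax(expected_improvement(mu, sigma, y_obs.max())))]

# Toy example: maximize a 1-D "performance" curve from four measurements.
X_obs = np.array([[0.1], [0.4], [0.6], [0.9]])
y_obs = -(X_obs[:, 0] - 0.7) ** 2          # hidden optimum near x = 0.7
X_cand = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
next_x = suggest_next(X_obs, y_obs, X_cand)
print(next_x)
```

This "boxed" one-dimensional search is exactly the limitation Li describes: the optimizer only moves within the fixed coordinate box it is given.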

Most active learning methods also rely on a single stream of data, which does not capture everything that happens in an experiment. To give computational systems more human-like knowledge while still leveraging the speed and control of automated systems, Li and his collaborators built CREST.

CREST’s robotic equipment includes liquid-handling robots; a carbothermal shock system that can rapidly synthesize materials; automated electrochemical workstations for testing; characterization equipment, including automated electron and optical microscopes; and auxiliary equipment such as pumps and gas valves, all of which can be controlled remotely. Many processing parameters can also be adjusted.

Through the user interface, researchers can chat with CREST and tell it to use active learning to find promising ingredient recipes for different projects. CREST can include up to 20 precursor molecules and substrates in its recipes. To guide material design, CREST’s models search scientific papers for descriptions of elements or precursor molecules that may be useful. When human researchers tell CREST to pursue a new recipe, it begins orchestrating the robots for sample preparation, characterization, and testing. Researchers can also ask CREST to analyze images from scanning electron microscopy, X-ray diffraction, and other sources.

Information from these processes is used to train active learning models that draw on both literature knowledge and current experimental results to propose further experiments and accelerate materials discovery.

“For each recipe, we use previous literature text or databases to create a rich representation of that recipe, drawn from the prior knowledge base, even before the experiment is conducted,” Li said. “We perform principal component analysis in this knowledge-embedding space to reduce the search space while capturing most of the variability in performance. We then use Bayesian optimization in this reduced space to design new experiments. After each new experiment, we feed the newly obtained multimodal experimental data and human feedback to large language models to augment the knowledge base and re-derive the search space, so the search is continually refined.”
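The dimensionality-reduction step Li describes can be sketched as follows. The embeddings here are synthetic low-rank data standing in for literature-derived recipe representations; the recipe count, embedding size, and number of components are invented for illustration and are not CREST's actual values.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for literature-derived recipe embeddings:
# 200 recipes, 512-dimensional vectors with ~8 underlying factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 8))     # hidden recipe factors
mixing = rng.normal(size=(8, 512))     # projection into embedding space
embeddings = latent @ mixing + 0.05 * rng.normal(size=(200, 512))

# Principal component analysis compresses the knowledge-embedding
# space to a few axes that capture most of the variance; Bayesian
# optimization then searches over these reduced coordinates rather
# than the raw high-dimensional recipes.
pca = PCA(n_components=8)
reduced = pca.fit_transform(embeddings)

print(reduced.shape)                   # (200, 8)
print(round(pca.explained_variance_ratio_.sum(), 3))
```

In the full pipeline described in the article, this reduced space is not static: new multimodal experimental data and human feedback update the knowledge base, and the reduction is recomputed.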

Materials science experiments can also face reproducibility challenges. To address this, CREST monitors its experiments through cameras, looks for potential problems, and proposes solutions to human researchers via text and audio.

Using CREST, the researchers developed an electrode material for a high-power-density fuel cell known as a direct formate fuel cell. After exploring more than 900 chemistries over three months, CREST discovered a catalyst material made of eight elements that achieves a 9.3-fold increase in power density per dollar over pure palladium, an expensive precious metal. In further testing, the CREST-derived material delivered record power density in a working direct formate fuel cell, even though the cell contained only one-quarter of the precious metal used in previous devices.

The results suggest that CREST has the potential to find solutions to real-world energy problems that have challenged the materials science and engineering communities for decades.

“A major challenge for fuel cell catalysts is the use of precious metals,” Zhang said. “For fuel cells, researchers have used various precious metals, such as palladium and platinum. We used a multi-element catalyst that also incorporates many other cheap elements to achieve optimal catalytic activity and resistance to poisoning species, such as carbon monoxide and adsorbed hydrogen atoms. People have been looking for alternatives for years.”

Helpful assistant

Early on, poor repeatability was a major problem, limiting the researchers’ ability to run their new active learning techniques on experimental datasets. Material properties can depend on how precursors are mixed and processed, and any number of problems can subtly alter experimental conditions, requiring careful inspection to correct.

To partially automate this process, the researchers combined computer vision and vision-language models with domain knowledge from the scientific literature, allowing the system to hypothesize sources of irreproducibility and propose solutions. For example, the models might notice a millimeter-scale deviation in a sample’s shape, or that a moving part is out of place. The researchers adopted some of the models’ recommendations, which improved consistency, suggesting that these models are becoming good experimental assistants.

The researchers noted that humans still handle most of the experimental debugging.

“For human researchers, CREST is an assistant, not a replacement,” Li said. “Human researchers are still essential. In fact, we use natural language so the system can explain what it is doing and offer observations and hypotheses. But it’s a step toward a more flexible, self-driving lab.”
