
Programmers can now use large language models (LLMs) to generate computer code more quickly. However, this only makes programmers’ lives easier if that code follows the rules of the programming language and doesn’t cause the computer to crash.

Some methods exist for ensuring an LLM conforms to the rules of whatever language it is generating text in, but many of them either distort the model’s intended meaning or are too time-consuming to be feasible for complex tasks.

A new approach developed by researchers at MIT and elsewhere automatically guides an LLM to generate text that adheres to the rules of the relevant language, such as a particular programming language, and is also error-free. Their approach allows the LLM to allocate effort toward outputs that are most likely to be valid and accurate, while discarding unpromising outputs early in the process. This probabilistic approach boosts computational efficiency.

Due to these efficiency gains, the researchers’ architecture enabled small LLMs to outperform much larger models at generating accurate, properly structured outputs for several real-world use cases, including molecular biology and robotics.

In the long run, this new architecture could help nonexperts control AI-generated content. For instance, it could allow businesspeople to write complex queries in SQL, a language for database manipulation, using only natural language prompts.

“This work has implications beyond research. It could improve programming assistants, AI-powered data analysis, and scientific discovery tools by ensuring that AI-generated outputs remain both useful and correct,” says João Loula, an MIT graduate student and co-lead author of the paper.

Loula is joined on the paper by co-lead authors Benjamin LeBrun, a research assistant at the Mila-Quebec Artificial Intelligence Institute, and Li Du, a graduate student at Johns Hopkins University; co-senior authors Vikash Mansinghka ’05, MEng ’09, PhD ’09, a principal research scientist and leader of the Probabilistic Computing Project in the MIT Department of Brain and Cognitive Sciences; Alexander K. Lew SM ’20, an assistant professor at Yale University; Tim Vieira, a postdoc at ETH Zurich; and Timothy J. O’Donnell, an associate professor at McGill University and a Canada CIFAR AI Chair at Mila, who led the international team; as well as several others. The research will be presented at the International Conference on Learning Representations.

Enforcing structure and meaning

One common approach for controlling the structured text generated by LLMs involves checking an entire output, such as a block of computer code, to make sure it is valid and will run error-free. If not, the user must start over, racking up computational resources.

Alternatively, a programmer can stop to check the output along the way. While this ensures the code adheres to the programming language and is structurally valid, incrementally correcting the code may cause it to drift from the meaning the user intended, hurting its accuracy in the long run.
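To make the contrast concrete, here is a minimal, hypothetical sketch of the first strategy: generate a complete output, check it, and start over on failure. The candidate pool and sampler are toy stand-ins for an LLM; only the syntax check, via Python’s ast module, is real. An incremental variant would instead apply a similar validity check after every token, which is what risks drifting from the intended meaning.

```python
import ast
import random

# Toy stand-in for an LLM: sample complete "programs" from a fixed pool.
CANDIDATES = [
    "print('hello'",                 # invalid: unbalanced parenthesis
    "for i in range(3) print(i)",    # invalid: missing colon
    "print(sum(range(3)))",          # valid Python
]

def is_valid_program(code: str) -> bool:
    """Full-output structural check: does the whole program parse?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def generate_then_validate(max_attempts: int = 100):
    """Sample whole programs and restart whenever one fails the check.

    Every rejected attempt throws away the compute spent generating it.
    """
    for _ in range(max_attempts):
        program = random.choice(CANDIDATES)   # stands in for one full LLM generation
        if is_valid_program(program):
            return program
    return None

print(generate_then_validate())
```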

“It is much easier to enforce structure than meaning,” Loula says. “We can quickly check whether something is in the right programming language, but to check its meaning you have to execute the code. Our work is also about dealing with these different types of information.”

The researchers’ approach involves engineering knowledge into the LLM to steer it toward the most promising outputs. These outputs are more likely to follow the structural constraints defined by the user and to carry the meaning the user intends.

“We are not trying to train an LLM to do this. Instead, we are engineering some knowledge that an expert would have and combining it with the LLM’s knowledge, which offers a very different approach to scaling than you see in deep learning,” Mansinghka adds.

They accomplished this using a technique called sequential Monte Carlo, which enables parallel generations from an LLM to compete with one another. The model dynamically allocates resources to different threads of parallel computation based on how promising their outputs appear.

Each output is given a weight that represents how likely it is to be structurally valid and semantically accurate. At each step of the computation, the model focuses on those with higher weights and throws out the rest.
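As a rough, self-contained illustration of this idea (not the researchers’ implementation), the sketch below extends several partial outputs, or “particles,” in parallel, scores each with a toy weight function standing in for the structural and semantic checks a user would supply, and resamples so that computation concentrates on the high-weight threads.

```python
import random

# Toy vocabulary and an intended query shape; the weight function below is a
# simplistic stand-in for user-supplied structure and meaning checks.
VOCAB = ["SELECT", "name", "FROM", "users", ";", "oops"]
TARGET = ["SELECT", "name", "FROM", "users", ";"]

def extend(particle):
    """Stand-in for one LLM decoding step: append a sampled token."""
    return particle + [random.choice(VOCAB)]

def weight(particle):
    """Stand-in for a combined structure-and-meaning score: the fraction
    of the partial output that matches the intended query so far."""
    matches = sum(1 for got, want in zip(particle, TARGET) if got == want)
    return matches / len(particle) if particle else 1.0

def smc_generate(num_particles=16, num_steps=len(TARGET)):
    particles = [[] for _ in range(num_particles)]
    for _ in range(num_steps):
        particles = [extend(p) for p in particles]         # 1. extend threads in parallel
        weights = [weight(p) for p in particles]           # 2. score each partial output
        if sum(weights) == 0:
            continue                                       # nothing promising this step
        particles = random.choices(                        # 3. resample: keep promising
            particles, weights=weights, k=num_particles)   #    threads, drop the rest
    return max(particles, key=weight)

print(" ".join(smc_generate()))
```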

In a sense, it is as if the LLM has an expert looking over its shoulder, making sure it makes the right choices at each step while keeping it focused on the overall goal. The user specifies the desired structure and meaning, as well as how to check the output, and then the researchers’ architecture guides the LLM to do the rest.

“We’ve worked out the hard math so that, for any kinds of constraints you’d like to incorporate, you are going to get the proper weights. In the end, you get the right answer.”

Boosting small models

To test their approach, they applied the framework to LLMs tasked with generating four types of output: Python code, SQL database queries, molecular structures, and plans for a robot to follow.

Compared with existing approaches, the researchers’ method performed more accurately while requiring less computation.

For example, on Python code generation, the researchers’ architecture enabled a small, open-source model to outperform a specialized, commercial closed-source model that is more than double its size.

“We are very excited that we can allow these small models to punch way above their weight,” Loula says.

Going forward, the researchers want to use their technique to control larger chunks of generated text, rather than working one small piece at a time. They also want to combine their method with learning, so that as it controls the outputs a model generates, the model learns to be more accurate.

In the long run, this project could have broader applications for non-technical users. For instance, it could be combined with systems for automated data modeling and with querying generative models of databases.

Mansinghka adds that the approach could also enable machine-assisted data analysis systems, in which a user converses with software that accurately models the meaning of the data and the questions the user asks.

“One of the fundamental questions of linguistics is how the meaning of words, phrases, and sentences can be grounded in models of the world, accounting for uncertainty and vagueness in meaning and reference. LLMs, which predict likely token sequences, don’t address this problem. Our paper shows that, in narrow symbolic domains, it is technically possible to map from words to distributions on grounded meanings. It’s a small step toward the deeper questions in cognitive science, linguistics, and artificial intelligence needed to understand how machines can communicate about the world like we do,” O’Donnell says.

The research was funded in part by the Canada CIFAR AI Chairs Program and by the Siegel Family Foundation via a gift to the MIT Siegel Family Quest for Intelligence.
