Imagine a coffee company trying to optimize its supply chain. The company sources beans from three suppliers, roasts them at two facilities into dark or light roast, and then ships the roasted coffee to three retail locations. Suppliers have different fixed capacities, and roasting and shipping costs vary from place to place.
The company seeks to minimize costs while meeting a 23 percent increase in demand.
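To make the scenario concrete, here is a minimal, self-contained sketch of such a supply-chain problem. All numbers are illustrative (they are not from the article), and the model is deliberately simplified: each roastery buys from one supplier and each café is served by one roastery, so the plan can be found by brute-force enumeration.

```python
from itertools import product

# Illustrative per-kg costs (made-up numbers, not from the article).
supply_cost = {("S1", "R1"): 4, ("S1", "R2"): 6,
               ("S2", "R1"): 5, ("S2", "R2"): 3,
               ("S3", "R1"): 7, ("S3", "R2"): 5}
ship_cost = {("R1", "C1"): 2, ("R1", "C2"): 4, ("R1", "C3"): 5,
             ("R2", "C1"): 5, ("R2", "C2"): 2, ("R2", "C3"): 3}
capacity = {"S1": 120, "S2": 100, "S3": 80}   # kg per week per supplier
demand = {"C1": 50, "C2": 60, "C3": 40}       # kg per week per cafe

best_cost, best_plan = float("inf"), None
# Enumerate every assignment: a supplier for each roastery,
# and a roastery for each cafe (3*3 * 2*2*2 = 72 combinations).
for s1, s2 in product(capacity, repeat=2):
    supplier_of = {"R1": s1, "R2": s2}
    for cafes in product(["R1", "R2"], repeat=3):
        load = dict.fromkeys(capacity, 0)
        cost = 0
        for cafe, roastery in zip(demand, cafes):
            q = demand[cafe]
            load[supplier_of[roastery]] += q
            cost += q * (supply_cost[supplier_of[roastery], roastery]
                         + ship_cost[roastery, cafe])
        # Keep the cheapest plan that respects every supplier's capacity.
        if all(load[s] <= capacity[s] for s in capacity) and cost < best_cost:
            best_cost = cost
            best_plan = {"suppliers": supplier_of,
                         "cafes": dict(zip(demand, cafes))}

print(best_cost, best_plan)  # prints 840 and the assignment that achieves it
```

Even this toy version shows why the problem is combinatorial: every added supplier, product, or location multiplies the number of candidate plans, which is what makes real instances intractable without a dedicated solver.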
Wouldn’t it be easy for the company to just ask ChatGPT to come up with an optimal plan? In fact, for all their incredible capabilities, large language models (LLMs) often perform poorly when asked to directly solve such complicated planning problems.
Rather than trying to change the model to make an LLM a better planner, MIT researchers took a different approach. They introduced a framework that guides an LLM to break the problem down the way a human would, and then automatically solve it using powerful software tools.
A user only needs to describe the problem in natural language; no task-specific examples are needed to train or prompt the LLM. The model encodes the user’s text prompt into a format that can be unraveled by an optimization solver designed to efficiently crack extremely tough planning challenges.
During the formulation process, the LLM checks its work at multiple intermediate steps to make sure the plan is described correctly to the solver. If it spots an error, rather than giving up, the LLM tries to fix the broken part of the formulation.
When the researchers tested their framework on nine complex challenges, such as minimizing the distance warehouse robots must travel to complete tasks, it achieved an 85 percent success rate, whereas the best baseline only achieved a 39 percent success rate.
The versatile framework could be applied to a range of multistep planning tasks, such as scheduling crews or managing machine time in a factory.
“Our research introduces a framework that essentially acts as a smart assistant for planning problems. It can figure out the best plan that meets all of your needs, even if the rules are complicated or unusual,” says Yilun Hao, lead author of a paper on this research.
He is joined on the paper by Yang Zhang, a research scientist at the MIT-IBM Watson AI Lab, and senior author Chuchu Fan, an associate professor of aeronautics and astronautics. The research will be presented at the International Conference on Learning Representations.
Optimization 101
Fan’s group develops algorithms that automatically solve what are known as combinatorial optimization problems. These vast problems have many interrelated decision variables, each with multiple options, which quickly add up to billions of potential choices.
Humans solve such problems by narrowing them down to a few options and then determining which one leads to the best overall plan. The researchers’ algorithmic solvers apply the same principle to optimization problems that are far too complex for a human to crack.
But the solvers they develop tend to have steep learning curves and are typically used only by experts.
“We figured that LLMs could allow nonexperts to use these solving algorithms. In our lab, we take a domain expert’s problem and formalize it into a problem our solver can solve. Could we teach an LLM to do the same thing?” Fan says.
With the framework the researchers developed, called LLM-Based Formalized Programming (LLMFP), a person provides a natural-language description of the problem, background information about the task, and a query that describes their goal.
LLMFP then prompts the LLM to reason about the problem and determine the decision variables and key constraints that will shape the optimal solution.
LLMFP asks the LLM to detail the requirements of each variable before encoding the information into a mathematical formulation of the optimization problem. It writes code that encodes the problem and calls the attached optimization solver, which arrives at an ideal solution.
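The article does not show the actual code LLMFP generates or name the solver it calls, but the structure it describes, decision variables, constraints, and an objective handed to a solver, can be sketched in a generic way. The following is a hypothetical illustration with made-up variable names and costs, using exhaustive search as a stand-in for a real optimization solver.

```python
from itertools import product

# Hypothetical sketch of a formalized problem: how much dark and light
# roast to produce. Names, ranges, and costs are illustrative only.
variables = {
    "dark_kg": range(0, 151, 10),    # candidate dark-roast quantities (kg)
    "light_kg": range(0, 151, 10),   # candidate light-roast quantities (kg)
}
constraints = [
    lambda v: v["dark_kg"] + v["light_kg"] <= 150,  # roasting capacity
    lambda v: v["dark_kg"] >= 40,                   # dark-roast demand
    lambda v: v["light_kg"] >= 30,                  # light-roast demand
]

def cost(v):
    # Objective to minimize: per-kg production cost (illustrative rates).
    return 3 * v["dark_kg"] + 2 * v["light_kg"]

# A real solver prunes intelligently; this stand-in just enumerates every
# assignment, filters by the constraints, and keeps the cheapest.
names = list(variables)
candidates = (dict(zip(names, combo)) for combo in product(*variables.values()))
best = min((v for v in candidates if all(c(v) for c in constraints)), key=cost)

print(best)  # prints {'dark_kg': 40, 'light_kg': 30}
```

The key point is the division of labor: the LLM’s job is to produce a faithful formalization like the `variables`, `constraints`, and `cost` above from a plain-language description, while the heavy lifting of searching the solution space is left to dedicated solver software.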
“It is similar to how we teach undergraduates about optimization problems at MIT. We don’t teach them just one domain. We teach them the methodology,” Fan says.
As long as the inputs to the solver are correct, it will give the right answer. Any mistakes in the solution come from errors in the formulation process.
To ensure it has found a working plan, LLMFP analyzes the solution and modifies any incorrect steps in the problem formulation. Once the plan passes this self-assessment, the solution is described to the user in natural language.
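This formulate-solve-check-repair cycle can be summarized as a simple control loop. The sketch below is an assumption about the overall flow, not LLMFP’s actual implementation; the callables `formulate`, `solve`, `evaluate`, and `repair` are hypothetical stand-ins for the LLM prompting steps and the solver call.

```python
def solve_with_self_check(problem, formulate, solve, evaluate, repair, max_rounds=3):
    """Formulate the problem, solve it, and self-check the result; on failure,
    feed the specific errors back so only the broken parts are reworked."""
    formulation = formulate(problem)
    for _ in range(max_rounds):
        plan = solve(formulation)
        errors = evaluate(formulation, plan)  # e.g. a missed implicit constraint
        if not errors:
            return plan  # plan passed self-evaluation
        # Repair targets the flagged errors rather than starting over.
        formulation = repair(formulation, errors)
    return None  # no valid plan found within the round budget
```

The design choice worth noting is that the evaluation step returns a description of what is wrong, so the repair step can fix the specific broken piece of the formulation instead of regenerating everything from scratch.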
Perfecting the plan
The self-assessment module also allows the LLM to add any implicit constraints it missed the first time around, Hao says.
For instance, if the framework is optimizing a supply chain to minimize costs for a coffee shop, a human knows the coffee shop can’t ship a negative amount of roasted beans, but an LLM might not realize that.
The self-assessment step would flag that error and prompt the model to fix it.
“Plus, an LLM can adapt to the preferences of the user. If the model realizes a particular user does not like to change the time or budget of their travel plans, it can suggest changing things that fit the user’s needs,” Fan says.
In a series of tests, their framework achieved an average success rate between 83 and 87 percent across nine diverse planning problems using several LLMs. While some baseline models were better at certain problems, LLMFP achieved an overall success rate about twice as high as the baseline techniques.
Unlike these other approaches, LLMFP doesn’t require domain-specific examples for training. It can find the optimal solution to a planning problem right out of the box.
In addition, a user can adapt LLMFP for different optimization solvers by adjusting the prompts fed to the LLM.
“With LLMs, we have an opportunity to create an interface that allows people to use tools from other domains to solve problems in ways they might not have been thinking about before,” Fan says.
In the future, the researchers want to enable LLMFP to take images as input to supplement the descriptions of a planning problem. This would help the framework solve tasks that are particularly hard to fully describe with natural language.
This work is funded in part by the Office of Naval Research and the MIT-IBM Watson AI Lab.