
Research shows that large language models (LLMs) tend to overemphasize information at the beginning and end of a document or conversation, while ignoring the middle.

This “position bias” means that if a lawyer uses an LLM-powered virtual assistant to retrieve a certain phrase in a 30-page affidavit, the LLM is more likely to find the right text if it appears on the initial or final pages.

MIT researchers have discovered the mechanism behind this phenomenon.

They created a theoretical framework to study how information flows through the machine-learning architecture that forms the backbone of LLMs. They found that certain design choices, which control how the model processes input data, can cause position bias.

Their experiments revealed that model architecture choices, particularly those that affect how information is spread across the input within the model, can cause or exacerbate position bias, and that training data also contribute to the problem.

In addition to pinpointing the origins of position bias, their framework can be used to diagnose and correct it in future model designs.

This could lead to more reliable chatbots that stay on topic during long conversations, medical AI systems that reason more fairly when handling large amounts of patient data, and code assistants that pay close attention to all parts of a program.

“These models are black boxes, so as an LLM user, you probably don’t know that position bias can cause your model to be inconsistent. You just feed it your documents in whatever order you want and expect it to work. But by understanding the underlying mechanism of these black-box models better, we can improve them by addressing these limitations,” says Xinyi Wu, a graduate student in the MIT Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems (LIDS), and the first author of the paper on this study.

Her co-authors include Yifei Wang, a postdoctoral fellow at MIT; Stefanie Jegelka, an associate professor of electrical engineering and computer science (EECS) and a member of IDSS and the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Ali Jadbabaie, a professor and head of the Department of Civil and Environmental Engineering, a core faculty member of IDSS, and a principal investigator in LIDS. The research will be presented at the International Conference on Machine Learning.

Analyzing attention

LLMs like Claude, Llama, and GPT-4 are powered by a neural network architecture called a transformer. Transformers are designed to process sequential data, encoding a sentence into chunks called tokens and then learning the relationships between tokens to predict what words come next.

These models are very good at this because of the attention mechanism, which uses interconnected layers of data-processing nodes to make sense of context by allowing tokens to selectively focus on, or attend to, related tokens.
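
For readers who want a concrete picture, the sketch below shows a single attention head in a few lines of NumPy. It is an illustrative toy, not code from the study.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention for a single head (illustrative sketch)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # how strongly each token matches every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: attention weights sum to 1 per token
    return weights @ V                               # each output is a weighted mix of token values

# Toy example: 4 tokens, each represented by an 8-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = attention(x, x, x)   # self-attention: queries, keys, and values all come from the same tokens
print(out.shape)           # (4, 8)
```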

However, if every token can attend to every other token in a 30-page document, that quickly becomes computationally intractable. So when engineers build transformer models, they often employ attention-masking techniques that limit the words a token can attend to.

For example, a causal mask only allows each word to attend to the words that came before it.
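
In code, a causal mask is simply a lower-triangular matrix of allowed positions; disallowed scores are set to negative infinity before the softmax so they receive zero weight. A toy sketch:

```python
import numpy as np

n = 5  # sequence length
causal_mask = np.tril(np.ones((n, n), dtype=bool))   # lower triangle: position i may see positions 0..i
print(causal_mask.astype(int))

# Inside attention, masked-out positions get a score of -inf before the softmax,
# so they receive zero weight and later tokens stay invisible to earlier ones.
scores = np.random.default_rng(1).normal(size=(n, n))
masked_scores = np.where(causal_mask, scores, -np.inf)
```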

Engineers also use positional encodings to help the model understand the location of each word in a sentence, improving performance.
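
Sinusoidal encodings are one common positional scheme; the paper’s analysis treats positional encodings more generally, so the snippet below is only a minimal illustration of the idea.

```python
import numpy as np

def sinusoidal_encoding(num_positions, dim):
    """Classic sine/cosine positional encodings, added to token embeddings."""
    pos = np.arange(num_positions)[:, None]
    i = np.arange(dim)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / dim)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

print(sinusoidal_encoding(num_positions=6, dim=8).shape)  # (6, 8): one encoding vector per position
```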

The MIT researchers built a graph-based theoretical framework to explore how these modeling choices, attention masks and positional encodings, affect position bias.

“Everything is coupled and entangled in the attention mechanism, so it’s hard to study. Graphs are a flexible language for describing the dependencies between words in the attention mechanism and tracking them across multiple layers,” Wu said.

Their theoretical analysis showed that causal masking gives the model an inherent bias toward the beginning of an input, even when no such bias exists in the data.

Even if the earlier words are relatively unimportant to a sentence’s meaning, causal masking can cause the transformer to pay more attention to the beginning anyway.

“While it is often true that earlier and later words in a sentence are more important, these biases can be extremely harmful if an LLM is used on a task that is not natural language generation, like ranking or information retrieval,” Wu said.

As a model grows bigger, with additional layers of the attention mechanism, this bias is amplified because earlier parts of the input are used more frequently in the model’s reasoning process.

They also found that using positional encodings to link words more strongly to nearby words can mitigate position bias. The technique refocuses the model’s attention in the right place, but its effect can be diluted in models with more attention layers.
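
One way to link words more strongly to their neighbors is to penalize attention scores in proportion to token distance, as relative schemes such as ALiBi do. The toy sketch below illustrates that general idea; it is not claimed to be the specific encoding the researchers analyzed.

```python
import numpy as np

n, slope = 5, 0.5
positions = np.arange(n)
distance = np.abs(positions[:, None] - positions[None, :])   # |i - j| between token positions
scores = np.random.default_rng(2).normal(size=(n, n))
biased_scores = scores - slope * distance   # far-apart tokens are penalized, so attention stays local
```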

These design choices are only one cause of position bias; some can come from the training data the model uses to learn how to prioritize words in a sequence.

“If you know your data are biased in a certain way, then you should also fine-tune your model on top of adjusting your modeling choices,” Wu said.

Lost in the middle

After establishing the theoretical framework, the researchers ran experiments in which they systematically varied the position of the correct answer within text sequences for an information retrieval task.

The experiments revealed a “lost in the middle” phenomenon, in which retrieval accuracy followed a U-shaped pattern. Models performed best when the correct answer was at the beginning of the sequence; performance declined the closer the answer got to the middle, then recovered somewhat when the answer was near the end.
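
A probe of this kind can be sketched in a few lines: slide the answer-bearing passage through a stack of filler documents and record whether the model retrieves it. The `query_model` function below is a hypothetical placeholder, and `build_context` and `probe_position_bias` are illustrative helpers, not the researchers’ actual setup.

```python
def build_context(filler_docs, answer_doc, position):
    """Insert the answer-bearing passage at a given position among filler documents."""
    docs = filler_docs[:position] + [answer_doc] + filler_docs[position:]
    return "\n\n".join(docs)

def probe_position_bias(filler_docs, answer_doc, question, answer, query_model):
    """Return retrieval accuracy as a function of where the answer appears."""
    accuracy = {}
    for position in range(len(filler_docs) + 1):
        context = build_context(filler_docs, answer_doc, position)
        response = query_model(f"{context}\n\nQuestion: {question}")
        accuracy[position] = float(answer.lower() in response.lower())
    return accuracy

# Plotting accuracy against position typically traces the U-shaped curve described above.
```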

Ultimately, their work shows that using different masking techniques, removing extra layers from the attention mechanism, or strategically employing positional encodings can reduce position bias and improve a model’s accuracy.

“By combining theory and experiments, we were able to look at the consequences of model design choices that weren’t clear at the outset. If you want to use a model in high-stakes applications, you have to know when it will work, when it won’t, and why,” Jadbabaie said.

In the future, the researchers hope to further explore the effects of positional encodings and to study how position bias could be strategically exploited in certain applications.

“These researchers offer a rare theoretical lens into the attention mechanism at the heart of the transformer model. They provide a compelling analysis that clarifies longstanding quirks in transformer behavior, showing that attention mechanisms, especially with causal masks, inherently bias models toward the beginning of sequences. The paper achieves the best of both worlds: mathematical clarity paired with insights that reach into the guts of real-world systems,” says Saberi, a professor and director of the Center for Computing Market Design at Stanford University, who was not involved in the work.

The study was supported, in part, by the U.S. Office of Naval Research, the National Science Foundation, and an Alexander von Humboldt Professorship.
