Several researchers have taken a broad view of scientific progress over the past 50 years and reached the same troubling conclusion: scientific productivity is declining. It now takes more time, more funding, and larger teams to make discoveries that once came faster and cheaper. Although a variety of explanations have been offered for the slowdown, one is that as research becomes more complex and specialized, scientists must spend more time reviewing publications, designing sophisticated experiments, and analyzing data.
Now, the philanthropically funded research lab FutureHouse is seeking to accelerate scientific research with an AI platform designed to automate many of the critical steps on the path to scientific progress. The platform consists of a series of AI agents specialized for tasks including information retrieval, information synthesis, chemical synthesis design, and data analysis.
FutureHouse founders Dr. Sam Rodriques and Andrew White believe that by giving every scientist access to their AI agents, they can break through the biggest bottlenecks in science and help solve some of humanity’s most pressing problems.
“Natural language is the real language of science,” Rodriques said. “Other people are building foundation models for biology, where machine learning models speak the language of DNA or proteins, and that’s powerful. But discoveries aren’t represented in DNA or proteins. The only way we know how to represent discoveries, hypothesize, and reason is with natural language.”
Looking for big problems
For his PhD research at MIT, Rodriques sought to understand the inner workings of the brain in the lab of Professor Ed Boyden.
“The whole idea behind FutureHouse was inspired by the impression I got during my PhD at MIT that even if we had all the information we needed about how the brain works, we wouldn’t know it, because no one has time to read all the literature,” Rodriques explained. “Even if they could read it all, they wouldn’t be able to assemble it into a comprehensive theory. That was a foundational piece of the FutureHouse puzzle.”
Rodriques wrote about the need for new kinds of large research collaborations in the final chapter of his 2019 PhD dissertation. Although he spent some time running a lab at the Francis Crick Institute in London after graduation, he found himself drawn to broad problems in science that no single lab could take on.
“I was interested in how to automate or scale up science, and in what kinds of new organizational structures or technologies would unlock higher scientific productivity,” Rodriques said.
When Chat-GPT 3.5 was released in November 2022, Rodriques saw a path toward more powerful models that could generate scientific insights on their own. Around that time, he also met Andrew White, a computational chemist at the University of Rochester who had been granted early access to Chat-GPT 4. White had built the first large language agent for science, and the two researchers teamed up to start FutureHouse.
The founders set out to create distinct AI tools for tasks such as literature search, data analysis, and hypothesis generation. They started with data collection and eventually released PaperQA in September 2024, which Rodriques calls the world’s best AI agent for retrieving and summarizing information from scientific literature. Around the same time, they released Has Anyone, a tool that lets scientists determine whether anyone has conducted a specific experiment or explored a specific hypothesis.
“We were just sitting around asking, ‘What are the kinds of questions that we as scientists ask all the time?’” Rodriques recalled.
When FutureHouse officially launched its platform on May 1 of this year, it renamed some of its tools. PaperQA is now Crow, and Has Anyone is now called Owl. Falcon is an agent that can compile and review more sources than Crow. Another new agent, Phoenix, can use specialized tools to help researchers plan chemistry experiments. Finch is an agent designed to automate data-driven discovery in biology.
On May 20, the company demonstrated a multi-agent scientific discovery workflow that automates key steps of the scientific process, using it to identify a new therapeutic candidate for dry age-related macular degeneration (dAMD), a leading cause of irreversible blindness worldwide. In June, FutureHouse released ether0, a 24B reasoning model for chemistry.
“You really have to think of these agents as part of a larger system,” Rodriques said. “Soon, the literature search agents will be integrated with the data analysis agent, the hypothesis generation agent, and an experiment planning agent, and they will all be engineered to work together seamlessly.”
Agents for everyone
Today, anyone can access FutureHouse’s agents at platform.futurehouse.org. The company’s platform launch has generated excitement in the industry, and stories have started to come in about scientists using the agents to accelerate their research.
One of FutureHouse’s scientists used the agents to identify a gene that could be associated with polycystic ovary syndrome and to propose a new treatment hypothesis for the disease. Another researcher, at Lawrence Berkeley National Laboratory, used Crow to create an AI assistant capable of searching the PubMed research database for information related to Alzheimer’s disease.
Scientists at another research institution used the agents to conduct systematic reviews of genes relevant to Parkinson’s disease and found that FutureHouse’s agents performed better than general-purpose agents.
Rodriques said scientists who think of the agents less like Google Scholar and more like a smart assistant scientist get the most out of the platform.
“People who are looking for speculation tend to get more mileage out of Chat-GPT o3 deep research, while people who are looking for really faithful literature reviews tend to get more out of our agents,” Rodriques explained.
Rodriques also believes FutureHouse will soon reach a point where its agents can use the raw data from research papers to test the reproducibility of their results and verify their conclusions.
In the longer run, to keep scientific progress moving forward, Rodriques said FutureHouse is working to embed tacit knowledge in its agents so they can perform more sophisticated analyses, while also giving the agents the ability to use computational tools to explore hypotheses.
“There have been so many advances around foundation models for science and around language models for proteins and DNA that we now need to give our agents access to those models and all of the other tools people commonly use to do science,” Rodriques said. “Building the infrastructure to allow agents to use more specialized tools for science is going to be critical.”