FrontierNews.ai

How AI Workflows Can Cut Energy Use by 73 Percent Without Sacrificing Speed

A new system called Murakkab can slash the energy demands of complex AI workflows by nearly three-quarters, according to researchers at MIT and Microsoft. The breakthrough addresses a critical inefficiency in how artificial intelligence applications are currently designed and deployed across cloud platforms, where wasted computation translates directly into wasted electricity and money.

What Are AI Agent Workflows and Why Do They Waste So Much Energy?

AI agent workflows are sophisticated software systems that chain together multiple AI models and external tools to handle complicated tasks. Think of a video question-answering application that extracts key frames, generates a transcript, and then answers user questions about the video. These workflows are becoming the backbone of what major cloud providers offer, but they're often inefficient by design.

The problem stems from how developers currently build these systems. They must hard-code technical choices upfront, specifying which AI models to use, in what order, and on what hardware configuration to run them. This manual approach leaves enormous room for waste. "Even if you wanted to do all this manually, it is unlikely that you'll be able to configure the workflow optimally because the space of possible configurations is so large," explained Gohar Chaudhry, an MIT electrical engineering and computer science (EECS) graduate student and lead author of the research.

"Agentic workflows are getting very complicated and quickly becoming the backbone of what cloud providers are doing. Energy usage is a huge concern, so we need to be very careful about how efficient these workflows are. It is very easy to over-allocate resources, wasting energy and money," Chaudhry said.

When a new AI model is released that could improve performance or efficiency, developers often need to start from scratch to integrate it. Meanwhile, cloud data centers deploying these applications can't see inside the workflows to allocate hardware resources intelligently in real time.
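To get a feel for why the configuration space is too large to tune by hand, the short Python sketch below counts the options for a three-stage video question-answering workflow. Every model name, hardware label, and batch size here is a made-up placeholder, not something taken from the Murakkab research; the point is only that the count multiplies across stages.

```python
from itertools import product

# Hypothetical options for a three-stage workflow (all names are illustrative).
STAGE_MODELS = {
    "frame_extraction": ["extractor-small", "extractor-large"],
    "transcription": ["asr-small", "asr-medium", "asr-large"],
    "answering": ["llm-7b", "llm-70b"],
}
HARDWARE = ["cpu", "gpu-a", "gpu-b"]
BATCH_SIZES = [1, 4, 16]


def count_configurations(stage_models, hardware, batch_sizes):
    """Count every way to pick one (model, hardware, batch size) triple per stage."""
    total = 1
    for models in stage_models.values():
        total *= len(models) * len(hardware) * len(batch_sizes)
    return total


def enumerate_configurations(stage_models, hardware, batch_sizes):
    """Yield each full configuration as a dict of stage -> (model, hardware, batch)."""
    per_stage = [
        list(product(models, hardware, batch_sizes))
        for models in stage_models.values()
    ]
    for combo in product(*per_stage):
        yield dict(zip(stage_models.keys(), combo))


if __name__ == "__main__":
    print(count_configurations(STAGE_MODELS, HARDWARE, BATCH_SIZES))
```

Even this toy setup has thousands of possible configurations, and adding a single new model to one stage multiplies the total again, which is why re-tuning by hand each time a model is released scales poorly.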

How Does Murakkab Automatically Optimize These Workflows?

Murakkab takes a fundamentally different approach. Instead of requiring developers to specify every technical detail, it lets them describe what they want the workflow to accomplish in plain language. The system then automatically identifies the best existing models and tools to combine, determines which components can run in parallel versus sequentially, and figures out the ideal hardware configuration.
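One part of that process, deciding which components can run in parallel and which must run in sequence, can be illustrated with a small dependency graph. The sketch below is not Murakkab's algorithm; it uses Python's standard `graphlib` module, and the stage names and the assumption that frame extraction and transcription are independent of each other are illustrative.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each stage maps to the set of stages it depends on.
DEPENDENCIES = {
    "extract_frames": set(),
    "transcribe_audio": set(),
    "answer_question": {"extract_frames", "transcribe_audio"},
}


def parallel_waves(dependencies):
    """Group stages into waves; stages in the same wave have no unmet
    dependencies on each other and could run at the same time."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for index, wave in enumerate(parallel_waves(DEPENDENCIES), start=1):
        print(f"wave {index}: {wave}")
```

Under these assumptions the first wave contains both extraction stages and the second contains the answering stage, so the two extraction steps need not wait on each other.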

The system makes configuration decisions dynamically, meaning if a new GPU accelerator or AI model becomes available tomorrow, developers don't need to manually redesign their application. Murakkab also gives cloud providers visibility into multiple workloads simultaneously, enabling them to share computational resources more efficiently while respecting each user's constraints, whether that's minimizing cost or maximizing speed.

What Are the Real-World Energy and Cost Savings?

When tested on diverse agentic workflows for video question-answering and code generation, Murakkab delivered dramatic efficiency gains. The system met user performance requirements while using only about 35 percent of the computation required by traditional approaches. More importantly for sustainability, it consumed only about 27 percent as much energy, a reduction of roughly 73 percent, at less than 25 percent of the cost.
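The reported figures are stated as fractions of the baseline, so the corresponding savings follow from simple arithmetic. The small Python sketch below makes the conversion explicit; the function names are illustrative, and the inputs are the approximate percentages quoted above (the cost figure is an upper bound, since the article says "less than 25 percent").

```python
# Approximate share of baseline usage reported for Murakkab, in percent.
REPORTED_SHARE_OF_BASELINE = {
    "computation": 35.0,
    "energy": 27.0,
    "cost": 25.0,  # reported as "less than 25 percent", so savings are at least this
}


def reduction_percent(share_of_baseline):
    """Convert 'uses X percent of the baseline' into 'cuts usage by Y percent'."""
    return 100.0 - share_of_baseline


def improvement_factor(share_of_baseline):
    """How many times less of the resource is used compared with the baseline."""
    return 100.0 / share_of_baseline


if __name__ == "__main__":
    for resource, share in REPORTED_SHARE_OF_BASELINE.items():
        print(
            f"{resource}: {reduction_percent(share):.0f}% reduction, "
            f"about {improvement_factor(share):.1f}x less"
        )
```

Using 27 percent of the baseline energy corresponds to the 73 percent cut in the headline, or roughly 3.7 times less energy.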

The system also lets users balance tradeoffs dynamically. In one test case, Murakkab reduced the energy consumption of an agentic workflow by more than a factor of ten with only about a 2 percent drop in accuracy for the customer. The researchers also found that Murakkab identified an unexpectedly effective configuration for a model that selects video frames, optimizing performance in ways that would be nearly impossible for a developer to achieve manually.
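A tradeoff like this can be framed as a constrained choice: among all candidate configurations, pick the one that uses the least energy while staying above an accuracy floor the user accepts. The Python sketch below illustrates that idea only; the candidate names and every number in it are invented for the example and are not measurements from the research, and the selection rule is an assumption rather than Murakkab's actual method.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float       # fraction of questions answered correctly
    energy_joules: float  # energy per request


# Invented numbers for illustration only.
CANDIDATES = [
    Candidate("baseline", accuracy=0.90, energy_joules=1000.0),
    Candidate("mid", accuracy=0.89, energy_joules=400.0),
    Candidate("lean", accuracy=0.88, energy_joules=90.0),
    Candidate("tiny", accuracy=0.70, energy_joules=20.0),
]


def pick_lowest_energy(
    candidates: Iterable[Candidate], min_accuracy: float
) -> Optional[Candidate]:
    """Return the lowest-energy candidate meeting the accuracy floor, or None."""
    feasible = [c for c in candidates if c.accuracy >= min_accuracy]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.energy_joules)


if __name__ == "__main__":
    choice = pick_lowest_energy(CANDIDATES, min_accuracy=0.88)
    print(choice)
```

With the floor set two points below the baseline accuracy, the hypothetical "lean" configuration wins; tightening the floor to the baseline accuracy leaves only the baseline itself, which shows how a small accuracy allowance can open up much cheaper options.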

Ways to Understand Murakkab's Impact on AI Efficiency

  • Automatic Model Selection: Murakkab identifies the best existing AI models and tools to combine without requiring developers to manually evaluate every option, saving both time and computational resources.
  • Dynamic Hardware Allocation: The system configures hardware in real time based on user priorities like cost or speed, preventing the over-allocation of resources that typically wastes energy in cloud deployments.
  • Parallel Processing Optimization: Murakkab determines which workflow components can run simultaneously versus sequentially, reducing overall execution time and energy consumption.
  • Real-Time Adaptation: When new models or accelerators become available, the system automatically incorporates them without requiring developers to redesign their applications from scratch.

The research will be presented at the USENIX Symposium on Operating Systems Design and Implementation, a major venue for systems research. The work was supported in part by the Semiconductor Research Corporation and the U.S. Defense Advanced Research Projects Agency.

Why Does This Matter Beyond Just Saving Money?

As AI applications become more complex and widespread, their energy footprint grows with them. Cloud data centers already consume enormous amounts of electricity, and inefficient workflow design adds to that burden. By automatically optimizing how these workflows are structured and deployed, Murakkab addresses a critical sustainability challenge at scale.

The researchers plan to expand their system to handle even more complex workflows and larger computing clusters while exploring opportunities to optimize new agentic applications. "There is a lot of potential to make these workflows more resource-optimal so they consume far less energy, but we need to be thinking about this at the scale of major cloud platforms," Chaudhry noted.

This development aligns with broader industry efforts to make AI more sustainable. Universities and research institutions are increasingly focused on green AI practices, from optimizing laboratory operations to developing more efficient computing systems. The challenge now is scaling these innovations across the cloud platforms that power millions of AI applications worldwide.