
MIT Researchers Develop an Effective Way to Train More Reliable AI Agents
Fields ranging from robotics to medicine to government are trying to train AI systems to make meaningful decisions of all kinds. For instance, using an AI system to intelligently control traffic in a congested city could help motorists reach their destinations faster while improving safety or sustainability.
Unfortunately, teaching an AI system to make good decisions is no simple task.
Reinforcement learning models, which underlie these AI decision-making systems, still often fail when faced with even small variations in the tasks they are trained to perform. In the case of traffic, a model might struggle to control a set of intersections with different speed limits, numbers of lanes, or traffic patterns.
To boost the reliability of reinforcement learning models for complex tasks with variability, MIT researchers have introduced a more efficient algorithm for training them.
The algorithm strategically selects the best tasks for training an AI agent so it can effectively perform all tasks in a collection of related tasks. In the case of traffic signal control, each task could be one intersection in a task space that includes all intersections in the city.
By focusing on a smaller number of intersections that contribute the most to the algorithm's overall effectiveness, this method maximizes performance while keeping the training cost low.
The researchers found that their technique was between five and 50 times more efficient than standard approaches on an array of simulated tasks. This gain in efficiency helps the algorithm learn a better solution faster, ultimately improving the performance of the AI agent.
"We were able to see incredible performance improvements, with a very simple algorithm, by thinking outside the box. An algorithm that is not very complicated stands a better chance of being adopted by the community because it is easier to implement and easier for others to understand," says senior author Cathy Wu, the Thomas D. and Virginia W. Cabot Career Development Associate Professor in Civil and Environmental Engineering (CEE) and the Institute for Data, Systems, and Society (IDSS), and a member of the Laboratory for Information and Decision Systems (LIDS).
She is joined on the paper by lead author Jung-Hoon Cho, a CEE graduate student; Vindula Jayawardana, a graduate student in the Department of Electrical Engineering and Computer Science (EECS); and Sirui Li, an IDSS graduate student. The research will be presented at the Conference on Neural Information Processing Systems.
Finding a middle ground
To train an algorithm to control traffic lights at many intersections in a city, an engineer would typically choose between two main approaches. She can train one algorithm for each intersection independently, using only that intersection's data, or train a larger algorithm using data from all intersections and then apply it to each one.
But each approach comes with its share of drawbacks. Training a separate algorithm for each task (such as a given intersection) is a time-consuming process that requires an enormous amount of data and computation, while training one algorithm for all tasks often leads to subpar performance.
Wu and her collaborators sought a sweet spot between these two approaches.
For their method, they choose a subset of tasks and train one algorithm for each task independently. Importantly, they strategically select individual tasks that are most likely to improve the algorithm's overall performance on all tasks.
They leverage a common trick from the reinforcement learning field called zero-shot transfer learning, in which an already trained model is applied to a new task without further training. With transfer learning, the model often performs remarkably well on the new, neighboring task.
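To make the idea concrete, here is a minimal sketch of zero-shot evaluation, assuming a hypothetical policy/environment interface (act, reset, step) rather than any specific library or the researchers' own code: a policy trained on one task is simply run, frozen, on a different task.

```python
# Minimal sketch of zero-shot transfer (hypothetical interface, not the authors' code):
# a policy already trained on one task (e.g., one intersection) is evaluated on a
# different task with no additional training.

def zero_shot_evaluate(policy, target_env, episodes=10):
    """Run a frozen, pre-trained policy on a new task and return its average reward."""
    total_reward = 0.0
    for _ in range(episodes):
        obs = target_env.reset()
        done = False
        while not done:
            action = policy.act(obs)               # no gradient updates; weights stay frozen
            obs, reward, done = target_env.step(action)
            total_reward += reward
    return total_reward / episodes
```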
"We know it would be ideal to train on all the tasks, but we wondered if we could get away with training on a subset of those tasks, apply the result to all the tasks, and still see a performance increase," Wu says.
To identify which tasks they should select to maximize expected performance, the researchers developed an algorithm called Model-Based Transfer Learning (MBTL).
The MBTL algorithm has two pieces. For one, it models how well each algorithm would perform if it were trained independently on one task. Then it models how much each algorithm's performance would degrade if it were transferred to each other task, a concept known as generalization performance.
Explicitly modeling generalization performance allows MBTL to estimate the value of training on a new task.
MBTL does this sequentially, choosing the task which yields the highest performance gain first, then selecting additional tasks that provide the biggest subsequent marginal improvements to overall performance.
Since MBTL focuses only on the most promising tasks, it can dramatically improve the efficiency of the training process.
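The selection loop described above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: est_train_perf and est_transfer_drop are hypothetical stand-ins for MBTL's two models (per-task training performance and the performance lost under zero-shot transfer), and the greedy loop picks tasks by their estimated marginal improvement across the whole task space.

```python
# Rough sketch of MBTL-style greedy task selection (illustrative only).
# Assumes two user-supplied estimators:
#   est_train_perf(t): predicted performance of a policy trained directly on task t
#   est_transfer_drop(s, t): predicted performance lost when a policy trained on
#                            source task s is applied zero-shot to target task t

def select_training_tasks(tasks, est_train_perf, est_transfer_drop, budget):
    """Greedily pick up to `budget` tasks that maximize estimated overall performance."""
    selected = []
    # Best estimated performance currently achievable on each task given the tasks
    # selected so far (assumes a non-negative performance scale, zero before training).
    best_perf = {t: 0.0 for t in tasks}

    for _ in range(budget):
        best_gain, best_task = 0.0, None
        for cand in tasks:
            if cand in selected:
                continue
            # Marginal gain: estimated improvement across all tasks if we also train
            # on `cand` and transfer that policy zero-shot to every other task.
            gain = 0.0
            for t in tasks:
                perf_via_cand = est_train_perf(cand) - est_transfer_drop(cand, t)
                gain += max(0.0, perf_via_cand - best_perf[t])
            if gain > best_gain:
                best_gain, best_task = gain, cand
        if best_task is None:
            break  # no remaining candidate adds estimated value
        selected.append(best_task)
        for t in tasks:
            perf_via = est_train_perf(best_task) - est_transfer_drop(best_task, t)
            best_perf[t] = max(best_perf[t], perf_via)
    return selected
```

In practice, the two estimators would themselves be fit from data; training then proceeds independently on the selected subset, and the resulting policies are transferred zero-shot to the remaining tasks.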
Reducing training costs
When the researchers tested this method on simulated tasks, including controlling traffic signals, managing real-time speed advisories, and executing several classic control tasks, it was five to 50 times more efficient than other approaches.
This means they could arrive at the same solution by training on far less data. For instance, with a 50x efficiency boost, the MBTL algorithm could train on just two tasks and achieve the same performance as a standard method that uses data from 100 tasks.
"From the perspective of the two main approaches, that means data from the other 98 tasks was not necessary, or that training on all 100 tasks is confusing to the algorithm, so the performance ends up worse than ours," Wu says.
With MBTL, adding even a small amount of additional training time could lead to much better performance.
In the future, the researchers plan to design MBTL algorithms that can extend to more complex problems, such as high-dimensional task spaces. They are also interested in applying their approach to real-world problems, especially in next-generation mobility systems.