Why Human Guidance Still Matters for AI Agents
AI agents are getting better at performing tasks, but they still rely heavily on human input to learn and improve. Recent research shows that these AI systems need specific procedural knowledge, known as skills, to do their jobs well. However, the agents tested could not produce useful skills on their own and still depended on human guidance to reach higher performance levels.
Introducing SkillsBench: A New Way to Test AI Skills
Researchers created a new benchmark called SkillsBench to evaluate how well AI agents perform across different tasks. The benchmark includes 84 tasks spanning 11 industries, like healthcare, manufacturing, cybersecurity, and software engineering. The goal was to see how AI performs under different conditions and what kind of support it needs to succeed.
The researchers ran each task under three conditions. In the first, with no skills, the AI received only the task instructions. In the second, with curated skills, the AI was given resources such as code snippets, documentation, and tools to help it perform better. In the third, the system prompted the AI to develop its own skills, without any prior guidance.
Findings Show Human Input Still Crucial
The results indicated that AI agents performed best when given curated skills. On average, these agents scored 16.2 percentage points higher than those with no skills at all. This suggests that AI systems still need human involvement to develop the procedural knowledge necessary for complex tasks.
Interestingly, in about 16 of the 84 tasks, adding curated skills actually made performance worse. This shows that human guidance isn’t always helpful and can sometimes confuse or mislead the AI. Performance also varied widely across industries: curated skills had the biggest impact in healthcare tasks and a smaller benefit in software engineering.
When asked to generate their own skills, the AI agents didn’t show any improvement. This highlights that AI still relies on human-curated resources to perform well. Simply asking an AI to “figure it out” doesn’t yet lead to better results, which underscores the ongoing need for human guidance in AI development.
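To make the comparison concrete, here is a minimal sketch of how an average gain in percentage points, and the set of tasks where a condition hurt performance, could be computed from per-task scores. The task names, the scores, and the helper functions `mean_delta_pp` and `tasks_hurt` are all hypothetical and made up for illustration; they are not SkillsBench data or the researchers' code.

```python
from statistics import mean

# Hypothetical per-task pass rates (0.0 to 1.0) under each condition.
# These values are invented for illustration, not SkillsBench results.
scores = {
    "task_a": {"no_skills": 0.25, "curated": 0.75, "self_generated": 0.25},
    "task_b": {"no_skills": 0.50, "curated": 0.25, "self_generated": 0.50},
    "task_c": {"no_skills": 0.00, "curated": 0.50, "self_generated": 0.00},
}


def mean_delta_pp(scores, condition, baseline="no_skills"):
    """Average change versus the baseline, in percentage points."""
    return 100 * mean(s[condition] - s[baseline] for s in scores.values())


def tasks_hurt(scores, condition, baseline="no_skills"):
    """Names of tasks where the condition scored below the baseline."""
    return sorted(t for t, s in scores.items() if s[condition] < s[baseline])


print(mean_delta_pp(scores, "curated"))         # average gain from curated skills
print(mean_delta_pp(scores, "self_generated"))  # average gain from self-generated skills
print(tasks_hurt(scores, "curated"))            # tasks where curated skills hurt
```

An average like this can hide tasks that moved in the opposite direction, which is why it helps to report the count of degraded tasks alongside the mean.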
Overall, these findings reinforce that AI agents are not yet at a point where they can fully teach themselves. Human expertise remains a vital part of training AI to handle real-world tasks effectively. As AI continues to develop, the balance between human input and machine autonomy will be key to unlocking its full potential.














