AI is getting better at doing things, but still bad at deciding what to do?

Reddit r/artificial / 5/6/2026

💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • The author reports experimenting with AI workflows/agents and finding that AI’s execution skills (writing, summarization, multi-step task handling) are strong, but failures stem from weak decision-making rather than raw capability.
  • Common breakdowns include choosing the wrong context, missing edge cases, continuing when the system should ask for clarification, and applying logic in the wrong situation.
  • A lead qualification and outreach automation example worked on clean data but behaved incorrectly on messy, incomplete, or ambiguous inputs without failing loudly, indicating fragile “judgment” under real-world conditions.
  • The post suggests the bottleneck is less about improving model outputs via prompts or retrieval alone, and more about structuring context and decision layers (or orchestration logic) across the workflow, citing approaches like "60x AI."
  • The author asks readers whether the primary bottleneck is better model output quality or better system-wide decision-making mechanisms such as context, logic, and orchestration.

I've been experimenting with AI workflows/agents over the past few weeks, and something keeps coming up that I can't quite figure out. On one hand, AI is incredibly good at execution: writing content, summarizing, even handling multi-step workflows. But the failures I keep seeing aren't really about capability. They're about small decisions like:

- choosing the wrong context

- missing edge cases

- continuing when it should stop and ask for clarification

- applying the right logic in the wrong situation

What's weird is that these aren't hard problems; they're the kinds of judgment calls humans make without thinking. A simple example I ran into: I tried automating a basic lead qualification + outreach flow using AI. It worked great on clean data, but as soon as inputs got messy (incomplete info, slightly ambiguous intent), the system didn't fail loudly. It just kept executing, incorrectly. (There's a rough sketch of what "failing loudly" could look like after the list below.)

It feels like execution is mostly solved, but decision-making inside workflows is still very fragile. I recently came across approaches like 60x AI that seem to focus on structuring context and decision layers around workflows, rather than just improving prompts or chaining tools.

I'm curious how people think about this. Do you see the main bottleneck now as:

- improving model outputs (better prompts, better retrieval) or

- improving how decisions are made across a system (context, logic, orchestration)?
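To make "fail loudly / stop and ask" concrete, here's a minimal Python sketch of a decision gate sitting in front of the outreach step. Everything in it (the `Lead` fields, the `NeedsClarification` exception, the 0.7 confidence threshold) is a hypothetical illustration, not code from my actual flow:

```python
# Minimal sketch of an explicit decision gate in front of an AI outreach step.
# Field names and the 0.7 threshold are illustrative assumptions.

from dataclasses import dataclass
from typing import Optional


class NeedsClarification(Exception):
    """Raised to stop the workflow and ask a human, instead of guessing."""


@dataclass
class Lead:
    name: Optional[str]
    email: Optional[str]
    intent: Optional[str]           # free-text intent, e.g. from a form
    intent_confidence: float = 0.0  # score from an upstream classifier


def gate(lead: Lead) -> Lead:
    """Fail loudly on missing fields; escalate when intent is ambiguous."""
    missing = [f for f in ("name", "email") if not getattr(lead, f)]
    if missing:
        # Loud failure: reject the lead instead of executing anyway.
        raise ValueError(f"lead rejected, missing fields: {missing}")
    if not lead.intent or lead.intent_confidence < 0.7:  # threshold is a guess
        # Decision point: stop and ask rather than continuing incorrectly.
        raise NeedsClarification(f"intent unclear for {lead.email}")
    return lead


def run_outreach(leads: list[Lead]) -> None:
    for lead in leads:
        try:
            qualified = gate(lead)
        except NeedsClarification as exc:
            print(f"queued for human review: {exc}")
            continue
        except ValueError as exc:
            print(f"dropped: {exc}")
            continue
        # Only cleanly qualified leads reach the model for drafting/sending.
        print(f"drafting outreach for {qualified.email}")


run_outreach([
    Lead(name="Ada", email="ada@example.com", intent="wants demo", intent_confidence=0.9),
    Lead(name=None, email="bob@example.com", intent="pricing?", intent_confidence=0.4),
])
```

The point isn't the specific checks; it's that ambiguity becomes an explicit branch instead of something the model glosses over.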

Would love to hear from people who've tried building or running these in real-world scenarios.

submitted by /u/Tough_Daikon_4321