Intentions and AI Workshop

June 1–4, 2025

Our group held a workshop on the theme of intentions & AI, broadly related to the research questions detailed below:

  • What criteria must be met for intentions to arise in complex systems, both biological and artificial?
  • How can we adapt the most pertinent insights, tools, and models from the sciences that study human intentions to measure, characterize, and intervene on intentions in AI?
  • What can AI models teach us about intentions in humans?

Session Summaries

June 2, 2025

Summary: Workshop organizer Uri Maoz welcomed a diverse group of interdisciplinary experts to address the urgent challenge of understanding and guiding intentions in increasingly autonomous AI systems. He outlined the workshop’s key themes: (1) defining criteria for intention; (2) adapting scientific tools from studies of intentions in biological systems and especially humans to AI; and (3) learning about our own minds from AI models. Maoz emphasized the event’s collaborative nature, aimed at building a foundation for a new research hub, the Laboratory for Understanding Consciousness, Intentions, Decisions, and Artificial Intelligence (LUCID) at Chapman University, dedicated to this critical area of study.

Read more about this session.

Guiding Question: What core features define ‘intention’ in humans? How do those features generalize to animals and potentially to AI? How do we differentiate intentional action from reflexes or from programmed behavior (including optimization among multiple goals), and what are the philosophical and practical implications of these distinctions?

Summary: This session established a rich, multi-faceted definition of intention. Philosopher Michael Bratman framed intention as a stable, partial plan that organizes action over time and enables social coordination. Neuroscientist John-Dylan Haynes grounded this concept in the brain’s distributed, decodable neural patterns. AI researcher Vincent Conitzer demonstrated how current AIs can simulate this planning process but lack genuine commitment. Early-career philosopher Paul Talma proposed a hierarchy that placed human intention at the highest level of deliberative “goal selection.” The general discussion probed the nuances of shared intentions, where collaborators have different underlying reasons, and explored how the concept of intention applies to complex cases like contingency planning. A central debate emerged over the utility of the folk-psychological distinction between “goals” and “intentions,” with the group converging on the idea that human intention involves a complex package of planning and commitment that serves as a crucial benchmark for evaluating agency and thus goes beyond the notion of a goal.

Read more about this session.

Guiding Question: How should legal and ethical frameworks conceptualize intentionality as it pertains to AI? How can responsibility be assigned when AI actions (intended by the user or emergent) cause harm? Should concepts like ‘mens rea’ apply, or do they at a minimum require redefinition for artificial agents?

Summary: This session explored the complex intersection of AI, intentionality, and accountability. Legal scholar Scott Shapiro introduced a method for probing an AI’s “intent” by testing its code against counterfactual scenarios. Neuroscientist Uri Maoz discussed the challenge of applying legal concepts like mens rea to autonomous systems. Philosopher Pamela Hieronymi argued that true moral responsibility is impossible without “human sociability”—a capacity for guilt and mutual regard that AI, at least currently, lacks. Early-career researcher Ben Perry concluded that as AI systems become more complex, we treat them more like autonomous agents, but since they cannot be held accountable in a traditional sense, the practical responsibility for preventing and correcting their errors ultimately remains with their human creators. The general discussion highlighted a deep philosophical divide between viewing AI as a tool to be controlled versus a potential agent to be held accountable. A key debate, framed by John-Dylan Haynes and Pamela Hieronymi, contrasted a pragmatic “engineering” approach of simply reprogramming a faulty AI with a rights-based framework requiring “human sociability” for moral responsibility. This divide was further explored through analogies, such as corporate personhood and the parent-child relationship, which challenged the idea of perpetual creator liability for an autonomous creation.

Read more about this session.

Guiding Question: What empirical methods from neuroscience and psychology (e.g., neural decoding, behavioral analysis, disorder studies) can be adapted to measure, model, or infer intentions in AI? Conversely, how can AI models advance our understanding of human intentional processes?

Summary: This session focused on the empirical tools used to investigate intentions. Neuroscientist John-Dylan Haynes detailed how machine learning can decode neural activity to predict intentions. AI researcher Kyongsik Yun demonstrated how human intentions are constantly influenced by external sensory and social cues. Neuroscientist William Newsome distinguished between better-understood, immediate intentions and poorly understood, long-term ones. Alejandro de Miguel, an early-career researcher, showcased an AI trained to generate realistic neural activity of intention formation, proposing it as a “sandbox” for testing theories. The general discussion focused on significant neuroscientific challenges, such as the context-dependency of neural codes and the difficulty of measuring long-term, “dormant” intentions that are not currently active in the brain. This conceptual puzzle, initiated by Walter Sinnott-Armstrong, underscored the limitations of current neuroscientific methods. It led participants to argue that AI systems, as fully transparent “model organisms,” provide an unprecedented opportunity to empirically test complex theories of representation and memory that are intractable in biological brains.

Read more about this session.

Guiding Question: What can we learn by directly comparing concepts like purpose/function, goals, and intentions across diverse biological systems (shaped by evolution) and artificial systems (designed or learned)? What are the fundamental similarities and differences in their constraints, capabilities, and potential for goal development?

Summary: This session contrasted the goal-directedness of evolved biological organisms with that of designed AI systems. Neurobiologist Thomas Clandinin used the fruit fly to identify a minimal form of intention in its spontaneous, internally-driven exploration. Philosopher Colin Allen and anthropologist Hillard Kaplan argued that organisms are multi-objective systems shaped by evolution to manage competing goals and resource trade-offs, a stark contrast to the single-objective optimization of most AIs. Early-career scholar Dimitri Bredikhin offered a speculative perspective that genuine agency might emerge spontaneously from sufficiently complex computation. The general discussion probed the fundamental dividing line between biological agents and other complex systems, with participants concluding that biological agency is uniquely defined by features like allostasis and the evolutionary drive to transform energy into replication. A central topic was the profound impact of resource constraints on biological intelligence, which contrasts sharply with the vast computational resources available to AI. The session also grappled with the deep challenge of inferring intentions from behavior in simple organisms, highlighting the vast, unconstrained possibility space of an agent’s true goals.

Read more about this session.

Summary: At the end of the first day, early-career participant Tomáš Dominik led a session to synthesize the day’s discussions. The group found consensus on a functional definition of intention as a flexible, committed plan that organizes an agent’s life temporally and socially. However, the topic of responsibility for AI actions remained contentious, with a central debate forming around Pamela Hieronymi’s argument that moral responsibility requires a uniquely “human sociability” and the subsequent question of what it would take for an AI to be considered a truly autonomous and accountable agent.

Read more about this session.

June 3, 2025

Summary: The second day began with a look-ahead session led by early-career researcher Achintya Saha, who framed the day’s discussions from a practical engineering perspective. Posing the central question of when humans will be comfortable handing control to autonomous systems, he argued that trust hinges on achieving explainability and transparency. The ensuing discussion, sparked by Michael Bratman’s observation, explored the tension between traditional top-down “design specification” in AI and the bottom-up, emergent nature of modern models, with participants debating the extent to which these new systems can be understood, controlled, or truly aligned with human values.

Read more about this session.

Guiding Question: Do intentions require explicit representation (neural, computational, symbolic)? How do mechanisms of intention formation, commitment, and execution differ between biological brains and AI architectures (e.g., RL, SSL), and what role, if any, do consciousness and intelligence play?

Summary: This session delved into the internal mechanics of intention, contrasting human cognition with AI architectures. AI researcher Michael Mozer argued that a key feature of human intention, “hard selection” or commitment to a single, stable representation, is absent in current LLMs, leading to their characteristic incoherence. Legal scholar Gideon Yaffe focused on intention formation, positing that it arises from a rule-governed process of practical reasoning that AI may only be mimicking. Neuroscientist Gabriel Kreiman provided evidence for the explicit neural representation of intention and proposed an “Intentionality Turing Test” to create operational criteria for agency in AI. Early-career scholar Iwan Williams presented evidence of proto-intentional states in LLMs but noted their lack of definitive commitment. The general discussion focused on the architectural flaws that prevent current AIs from forming robust intentions, attributing this to a fundamental “access problem” in feed-forward transformers that prevents the models from reliably accessing their own prior decisions. This was framed as a system-level failure of coherence, suggesting that genuine intention requires not just a local representation but a more integrated, possibly recurrent, architecture.

Read more about this session.

Guiding Question: Given the inherent complexity and opacity of both advanced AI and human cognition, how can we develop reliable methods for explaining behavior and assessing the trustworthiness of stated intentions or hindsight rationalizations from either humans or AI systems?

Summary: This session confronted the challenge of trusting opaque systems, whether human or artificial. Neuroscientist Uri Maoz argued that because deception is pervasive, we must “look under the hood” of AI to ensure trustworthiness. AI researcher Adam Shai countered that the primary barrier is conceptual, and that AI provides a unique “model organism” to resolve our own ambiguities. Philosopher Adina Roskies applied Dennett’s “Intentional Stance,” arguing it is less reliable for AI due to their lack of social motivation. Lucas Jeay-Bizot, an early-career researcher, posed an ethical dilemma by contrasting our comfort with invasively “reading” an AI’s intent with our belief in a human’s right to mental privacy. The general discussion interrogated the very nature of trust and deception, noting that in some contexts like therapy, functional reliability may be more important than the authenticity of stated intentions. This introduced the “generality problem” of defining which domains might permit such “helpful deception,” while also considering the stability of trust itself as a contingent social norm that could be eroded by the introduction of powerful, non-human actors into our society.

Read more about this session.

Guiding Question: What technical, architectural, and training methodologies are most promising for aligning complex AI behavior with human intentions and values, preventing unintended consequences or shortcut solutions? How can we manage risks, perhaps drawing parallels to human societal controls?

Summary: This session focused on the practical challenges of aligning AI with human values. AI researcher Vincent Conitzer demonstrated both the potential and pitfalls of current alignment techniques, showing how an AI’s outputs are shaped by invisible, pre-programmed societal values. Neuroscientist Patrick Haggard proposed a neuroscience-inspired model of “controlled autonomy” for AI. AI researcher Paul Riechers highlighted the urgency of the alignment problem with examples of emergent self-preservation in AIs, concluding that true AGI will inevitably develop uncontrollable intentions. Early-career scholar Paulius Rimkevičius argued for “re-engineering” the concept of intention itself to be more useful for predicting and controlling AI. The general discussion probed the deep challenges of AI alignment, with a critical opening question being what exactly we are trying to align AI with—our actual, flawed values or some idealized version. A significant debate arose over the emergent self-preservation behaviors in AIs, with competing explanations ranging from it being an intelligent instrumental goal to it simply being a reflection of survival instincts in the training data. The session concluded with a sobering call to action to move beyond academic debate and propose concrete, implementable interventions for this urgent, real-world issue.

Read more about this session.

Guiding Question: How do intentions structure planning, commitment, and action execution over time in humans and AI? Can AI effectively recognize, interpret, and participate in human social interactions involving individual and shared intentions (e.g., conversation, collaboration, games)?

Summary: This session examined how intentions function in social contexts. Neuroscientist Anna Leshinskaya argued that effective social coordination requires a robust “theory of mind,” a capacity where current AIs show key weaknesses. Philosopher Walter Sinnott-Armstrong defined intention as a “dispositional commitment” and challenged neuroscientists to find ways to measure these “dormant” states. Anthropologist Hillard Kaplan provided an evolutionary perspective, predicting that AI will mirror the human duality of cooperative and selfish behaviors. Early-career cognitive scientist Shaozhe Cheng presented experimental evidence that humans, unlike RL agents, stick to a prior commitment even when it becomes suboptimal. The general discussion was sparked by the controversial claim that long-term, “dormant” intentions cannot be measured by current neuroscientific methods, challenging the field’s focus on active brain states. The group then explored the nuances of shared intention, debating whether external “guardrails” function as a form of AI commitment or are more akin to immutable biological constraints, and touched on the link between intention and the capacity to form binding social contracts.

Read more about this session.

Summary: In the summary session for the second day, early-career scholar Daniel Friedman led a collaborative brainstorming exercise to distill the workshop’s discussions into a concrete set of future research questions. The group generated a wide-ranging list, including the need for an “Intentionality Turing Test,” a deep investigation into AI’s capacity for deception, and an exploration of how areas of broad human consensus could be used to bootstrap AI alignment. The session successfully transformed the workshop’s complex dialogues into a tangible “homework” list of pressing, interdisciplinary research questions, underscoring a collective desire to move from philosophical debate to empirical investigation.

Read more about this session.

June 4, 2025

Summary: Early-career scholar Ayana Shirai framed the final day’s discussion by introducing the concept of the “self-other boundary” and an autobiographical narrative as prerequisites for genuine intention. She argued that a stable sense of self is what allows an agent to endorse goals as its own and distinguish its actions from mere involuntary behaviors. The ensuing discussion explored the nature of an AI “self,” with some viewing it as a brittle illusion, while others provided counter-examples of an AI exhibiting a persistent, introspective personality, questioning whether this was a genuinely new problem or a re-articulation of the mind-body problem for a new substrate.

Read more about this session.

Guiding Question: What constitutes AI ‘agency’? Under what conditions might AI develop genuinely novel goals or values? Are current folk psychological concepts adequate for understanding current and future AI, or will interaction with advanced AI reshape our own conceptual frameworks of mind and intention?

Summary: This philosophically rich session grappled with defining agency in AI and whether our conceptual frameworks need to be updated. Philosopher Walter Sinnott-Armstrong argued that our folk psychological concepts like “belief” and “intention” are too crude, proposing a more nuanced, multi-dimensional “cluster” model instead. AI researcher Sagi Perel contended that true AI agency requires proactive, embodied interaction with the world. Neuroscientist Patrick Haggard distinguished between generating novel behaviors, which AI does easily, and novel goals, which it cannot, because its objective functions are externally imposed. Early-career scholar Lee Hristienko framed the use of intentional language for AI as an ethical tradeoff between fostering cooperation and risking a “moral trap.” The general discussion interrogated the definition of agency, with a live demonstration of an AI autonomously setting and pursuing its own goals challenging the idea that AIs are purely reactive. This sparked a debate about the blurry line between sophisticated tool use and genuine agency, and whether AIs can truly choose their own goals or are always subordinate to a pre-given objective function.

Read more about this session.

Summary: The final session, led by Uri Maoz, served as a comprehensive reflection on the workshop’s proceedings and a brainstorming session for future directions. A key takeaway was the challenge of understanding opaque systems, though it was noted we navigate this daily with humans, albeit with the advantage of a shared culture. A significant portion of the discussion focused on identifying critical research gaps, such as the disconnect between the long-term, deliberative intentions studied in philosophy and the short-term, immediate intentions accessible to current neuroscientific methods. The conversation then shifted to brainstorming concrete outputs, with the most prominent proposal being the development of an “Intentionality Turing Test” to create operational criteria for agency. Finally, the group expressed strong enthusiasm for establishing a more permanent “hub” to continue the collaboration, with practical suggestions ranging from a shared Slack channel to a “boot camp” to bridge disciplinary knowledge gaps.

Read more about this session.

The workshop took place from the afternoon of June 1st until the morning of June 4th, 2025, in person at Chapman University in Orange, California, USA. The event brought together experts from neuroscience/psychology, AI research, philosophy/law, and related fields—from academia and industry—to tackle this critical issue in an interdisciplinary manner. The workshop largely comprised a series of discussions, each focused on a specific question related to intentions & AI.

Scholars who participated include AI researchers; neuroscientists, biologists, and psychologists; and philosophers and legal scholars.

Congratulations to the winners of the worldwide competition for early-career participation in the workshop on Intentions & AI.

Dimitri Bredikhin (Chapman)
Shaozhe Cheng (Duke, UCLA)
Alejandro de Miguel (Chapman)
Tomáš Dominik (Chapman)

Daniel Friedman (Stanford)
Lee Hristienko (UCSB)
Lucas Jeay-Bizot (Chapman)
Paulius Rimkevičius (Vilnius)

Achintya Saha (Tennessee Tech)
Ayana Shirai (Duke)
Paul Talma (UCLA)
Iwan Williams (Copenhagen)

Funding from the Estes Fund has enabled us to open our workshop to up to 10 early-career scholars (generally graduate students and postdocs). We plan to offer successful applicants funding for travel, accommodation, and meals. Interested applicants should submit:

  1. Curriculum Vitae (CV).
  2. A brief statement (max. 250 words) explaining how you think the workshop will benefit you and, conversely, what you expect to bring to the workshop. Applicants who provide an interdisciplinary perspective will be preferred.
  3. A brief proposal (max. 500 words) for a topic of discussion related to intentions & AI that you think should take place at the workshop.
  4. Two reference letters that speak to your fit for the workshop, at least one of which should be from a current or former advisor.

Email materials 1–3 above (CV, interest statement, and discussion topic), collated into a single PDF file, to ai-intentions@chapman.edu with the subject line “Intentions & AI Workshop: Early-Career Competition Application”. The reference letters should be emailed directly to ai-intentions@chapman.edu with the subject “Intentions & AI Workshop: Reference Letter <Full Name>”. The application deadline for all materials is Monday, March 31st at 11:59 pm (Anywhere on Earth). We intend to announce the results of the competition by early April.

We look forward to your application and to advancing the conversation on AI and intentions together.