SAP AI agents and ERP data: data quality & governance

Author: Laura Parri Royo, Marketing Director at TJC Group

This article is part of the SAP AI and ERP Data Series. Subscribe to read the next issue.

SAP AI agents can retrieve information, recommend actions, and automate parts of a business process. But an agent can only work with the records, permissions, rules, and context available to it. Missing dependencies, duplicate master records, or outdated status information can cause an agent to recommend the wrong action even when the model operates as designed. Reliable agent behaviour therefore depends on data governance and data quality, combined with clear limits on what an agent may retrieve, recommend, and execute.

Introduction
Why ERP data quality matters for SAP AI agents
How a governed SAP AI agent workflow works in practice
Why access controls must apply to AI-driven workflows
How lifecycle and compliance rules apply to AI workflows
Why AI-driven decisions must remain traceable and auditable
Why governed data is necessary but not sufficient
How TJC Group can support a governed SAP data foundation
Conclusion
Sources of information

Introduction

Traditional SAP workflows usually follow predefined rules. A user performs an action, the system checks the relevant authorisation, and the transaction is recorded.

AI agents introduce a more dynamic layer. They may gather information from several applications, interpret a situation, recommend the next step, or trigger an action as part of a wider workflow.

This creates new questions around whose permissions the agent uses, which sources it can trust, when human approval is required, and what evidence must be retained afterwards.

SAP describes Joule Agents as using business-process context and information from connected applications. That context matters because business data, such as a supplier status, a delivery delay, or a customer balance, is rarely meaningful in isolation.

Why ERP data quality matters for SAP AI agents

An AI agent can process information quickly, but it cannot automatically resolve every weakness in the underlying ERP landscape. The agent may operate as designed while still reaching the wrong conclusion if the information available to it does not represent the full business situation.

Three problems are particularly important: incomplete process information, duplicate or outdated records, and data that has lost its business context.

[Figure: SAP AI agents need quality data]

Incomplete and inconsistent information

ERP processes often span several records and applications.

An invoice may appear in the finance system while its purchase order, goods receipt, contract, and supplier correspondence sit elsewhere. If the agent sees only part of that chain, it may misinterpret the cause of an exception.

An apparently overdue invoice, for example, may result from a missing goods receipt rather than a payment problem. An agent that sees only the invoice status could escalate the wrong issue.

Inconsistency creates a similar risk. Business units may use different definitions for the same status, while regional systems may record currencies, units, dates, or organisational structures differently. Custom SAP fields may also have meanings understood locally but not documented centrally.

Before an agent uses information from several sources, the workflow should address the following questions:

Data question	Risk if unresolved
Are all records required for the process available?	The agent may act on an incomplete view
Do connected systems use the same definitions?	Similar fields may carry different meanings
Are dates, currencies, and units aligned?	Values may be compared incorrectly
Are transaction dependencies preserved?	Records may be separated from the documents or master data needed to interpret them
Are custom fields documented?	Organisation-specific logic may be misunderstood
Which source is authoritative?	Conflicting records may produce an unreliable decision

Where systems disagree, the workflow should define which source takes priority, whether the conflict stops automation, and who must resolve it.

The information does not necessarily need to sit in one database. It does need to be reconciled and suitable for the task.
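
The priority rule described above can be sketched in a few lines. This is a minimal illustration rather than an SAP API: the `ConflictRule` and `Resolution` types, the `resolve` function, and the source names `central_finance` and `regional_erp` are all hypothetical, and a real workflow would define one rule per field or record type.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConflictRule:
    """Hypothetical per-field rule; none of these names come from SAP."""
    authoritative_source: str   # which source takes priority
    stops_automation: bool      # whether a conflict halts automated action
    resolver: str               # who must resolve the disagreement


@dataclass(frozen=True)
class Resolution:
    value: Optional[str]
    conflict: bool
    automation_allowed: bool
    escalate_to: Optional[str]


def resolve(values_by_source: Dict[str, str], rule: ConflictRule) -> Resolution:
    """Compare the value each connected system reports and apply the rule."""
    distinct = set(values_by_source.values())
    if len(distinct) <= 1:
        # The sources agree (or only one reported a value): no conflict.
        value = next(iter(distinct), None)
        return Resolution(value, False, value is not None, None)
    # The sources disagree: prefer the authoritative one, if it reported a value.
    value = values_by_source.get(rule.authoritative_source)
    allowed = value is not None and not rule.stops_automation
    return Resolution(value, True, allowed, rule.resolver)


# Illustrative data: two systems report different statuses for one invoice.
rule = ConflictRule("central_finance", stops_automation=True,
                    resolver="finance data owner")
result = resolve({"central_finance": "blocked", "regional_erp": "open"}, rule)
print(result)
```

In this example the authoritative value is still reported, but automation is withheld and the conflict is routed to the named owner, which mirrors the three decisions the workflow has to make.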

Duplicated or outdated records

Duplicate master data can give an agent conflicting versions of the same business entity.

A supplier may appear under several vendor numbers after an acquisition, regional rollout, or incomplete cleansing exercise. One record may contain recent quality incidents, while another contains the current contract and payment details.

If the agent treats them as separate suppliers, it may underestimate risk or recommend action against the wrong record.

Outdated information creates a related problem. An expired contract, inactive customer, closed company code, former employee, or obsolete bank account may remain technically accessible even though it should no longer influence an operational decision.

Such records cannot always be removed immediately. They may remain connected to historical transactions or be subject to retention requirements. Depending on the situation, they may need to be harmonised, blocked, marked as inactive, or excluded from the workflow.

This is why SAP data archiving and data cleansing should not be treated as the same activity.

Archiving controls the volume of eligible completed data in live SAP tables. Cleansing and master-data governance address duplication, accuracy, consistency, and ownership. A reliable foundation may require both.

Data without sufficient business context

ERP data is shaped by the process that created it.

A delayed invoice could indicate a payment issue. It could also result from a dispute, pricing error, missing proof of delivery, incorrect tax code, or internal approval delay.

Historical information creates an additional challenge. Organisational structures, account mappings, exchange rates, approval rules, and custom logic may change over time.

A model comparing several years of transactions cannot assume that every value was produced under the same definitions and rules.

The necessary context may include relationships between transactions and master data, the meaning of process statuses, the responsible organisational unit, the configuration in effect at the time, and supporting documents or approvals.
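
One way to respect "the configuration in effect at the time" is to keep an effective-dated history and look up the entry that applied on the posting date. The sketch below assumes a hypothetical account-mapping history; the dates, account numbers, and the `mapping_in_effect` function are invented for illustration.

```python
from bisect import bisect_right
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical history of one account mapping: (valid_from, mapped_account),
# sorted by valid_from. Dates and account numbers are invented.
MAPPING_HISTORY: List[Tuple[date, str]] = [
    (date(2019, 1, 1), "400100"),
    (date(2022, 7, 1), "410200"),
]


def mapping_in_effect(history: List[Tuple[date, str]],
                      posting_date: date) -> Optional[str]:
    """Return the mapping valid on posting_date, or None if none applied yet."""
    valid_from_dates = [valid_from for valid_from, _ in history]
    # Index of the last entry whose valid_from is on or before posting_date.
    index = bisect_right(valid_from_dates, posting_date) - 1
    return history[index][1] if index >= 0 else None


print(mapping_in_effect(MAPPING_HISTORY, date(2020, 5, 1)))
print(mapping_in_effect(MAPPING_HISTORY, date(2023, 3, 15)))
```

Two transactions posted to the same source account can therefore resolve to different mapped accounts, depending on when they were posted.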

SAP Knowledge Graph can provide semantic relationships and business context for AI. Organisations must still address their own undocumented customisations, duplicate identities, local terminology, and historical exceptions.

How a governed SAP AI agent workflow works in practice

Consider an agent reviewing a supplier after a series of delayed deliveries.

First, the workflow verifies the requester’s entitlements and confirms that the technical identity used by the agent is authorised to retrieve the required records and perform the permitted action.

It then retrieves the approved information needed for the task. This may include current purchase orders, delivery history, quality incidents, contract status, payment information, and supplier master data.

The workflow identifies that the supplier appears under two vendor numbers. It applies an approved entity-matching rule. If the result remains uncertain, the records remain separate and the discrepancy is escalated for review.

The agent also finds conflicting status information. S/4HANA shows the supplier as active, while the contract repository indicates that the current agreement has expired.

The workflow has been configured not to resolve this type of conflict automatically. It flags the discrepancy and limits the output to a recommendation.

A procurement manager reviews the evidence and approves, rejects, or modifies the proposed action. Any resulting change in SAP is recorded together with the requester, approver, source records, technical identity, and workflow version.

[Figure: How the governed SAP AI agent workflow works]

The extraction, recommendation, approval, and final transaction then follow the applicable retention and audit rules.

This scenario shows how reliable agent behaviour depends on connected controls. Data quality alone is not enough. The workflow also needs task-specific permissions, authoritative-source rules, limits on autonomy, human intervention, and traceable evidence.
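
The entity-matching step in the scenario can be sketched as a rule with three outcomes, where an uncertain result keeps the records separate and sends them for review. The rule itself is invented for illustration (a shared tax ID decides, a matching name alone does not); the `VendorRecord` type, the vendor numbers, and the function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MatchOutcome(Enum):
    SAME_ENTITY = "same_entity"
    DIFFERENT = "different"
    UNCERTAIN = "uncertain"  # keep the records separate and escalate


@dataclass(frozen=True)
class VendorRecord:
    vendor_number: str
    name: str
    tax_id: Optional[str] = None  # may be missing on older records


def normalise(name: str) -> str:
    """Lower-case the name, drop full stops and commas, collapse whitespace."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def match(a: VendorRecord, b: VendorRecord) -> MatchOutcome:
    """Invented rule: a shared tax ID decides; a name match alone does not."""
    if a.tax_id and b.tax_id:
        same = a.tax_id == b.tax_id
        return MatchOutcome.SAME_ENTITY if same else MatchOutcome.DIFFERENT
    if normalise(a.name) == normalise(b.name):
        return MatchOutcome.UNCERTAIN
    return MatchOutcome.DIFFERENT


def needs_review(outcome: MatchOutcome) -> bool:
    """Uncertain matches are escalated instead of being merged automatically."""
    return outcome is MatchOutcome.UNCERTAIN


first = VendorRecord("100234", "Acme Components Ltd.", tax_id="GB123456789")
second = VendorRecord("200871", "ACME Components Ltd")
outcome = match(first, second)
print(outcome, needs_review(outcome))
```

Here the names agree after normalisation, but the second record has no tax ID to confirm the match, so the sketch returns `UNCERTAIN` and flags the pair for review.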

Why access controls must apply to AI-driven workflows

An agent should never expand the requester’s authorised access or use information beyond the approved scope of the workflow.

The same principle applies when the agent performs an action rather than only retrieving data.

If an agent can create a purchase requisition, change a payment block, update master data, or trigger an approval workflow, its authority must be clearly defined.

Organisations need to govern both the user requesting the task and the technical identity carrying it out. The agent must not become a route around controls that would apply to the same action in SAP.

[Figure: Access controls for AI-driven workflows]

User permissions and sensitive information

Agent access should reflect the requester’s authorised scope and the minimum data required for the task.

A user may have access to a transaction without being authorised to view every field, legal entity, employee group, or supporting document connected to it.

Sensitive ERP information can include payroll data, bank details, tax identifiers, pricing agreements, and confidential commercial terms.

The issue becomes more complex when a workflow combines information from several systems. Permission to view data in one application does not automatically permit its reuse in another process.

A governed design should define:

Control	Question to answer
Requester identity	Who initiated the task?
Data entitlement	Which records, fields, and entities may the requester access?
Task scope	What information is necessary for the workflow?
Action authority	Which transactions may the agent create or change?
Execution identity	Which technical credentials carry out the action?
Approval rule	When must a person approve the result?
Escalation route	What happens when the request exceeds the authorised scope?

These controls should be designed before the agent enters a live process.
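
As a rough sketch of how several of these controls combine, the agent's readable fields can be treated as the intersection of the requester's entitlement and the task scope, with anything else routed to escalation. The types, field names, and action names below are hypothetical and do not correspond to SAP authorisation objects.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet


@dataclass(frozen=True)
class WorkflowPolicy:
    task_fields: FrozenSet[str]      # task scope: fields the workflow needs
    allowed_actions: FrozenSet[str]  # action authority granted to the agent


@dataclass(frozen=True)
class Decision:
    readable_fields: FrozenSet[str]
    denied_fields: FrozenSet[str]
    action_allowed: bool

    @property
    def must_escalate(self) -> bool:
        # Anything outside the authorised scope goes to the escalation route.
        return bool(self.denied_fields) or not self.action_allowed


def authorise(policy: WorkflowPolicy,
              requester_fields: AbstractSet[str],
              requested_fields: AbstractSet[str],
              action: str) -> Decision:
    """Allow only fields that are both in the task scope and in the requester's entitlement."""
    in_scope = policy.task_fields & frozenset(requester_fields)
    requested = frozenset(requested_fields)
    readable = requested & in_scope
    return Decision(readable, requested - readable,
                    action in policy.allowed_actions)


policy = WorkflowPolicy(
    task_fields=frozenset({"invoice_status", "po_number", "goods_receipt"}),
    allowed_actions=frozenset({"summarise", "recommend_review"}),
)
decision = authorise(
    policy,
    requester_fields={"invoice_status", "po_number", "goods_receipt", "bank_account"},
    requested_fields={"invoice_status", "bank_account"},
    action="recommend_review",
)
print(decision, decision.must_escalate)
```

The requester is entitled to see `bank_account`, but the field is outside the task scope, so the sketch denies it and marks the request for escalation.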

Purpose-based access to ERP data

Technical access does not permit information to be used for every purpose.

Customer data collected to fulfil an order may not be suitable for an unrelated AI use case. Employee information used for payroll may require additional approval before it supports workforce analysis. Historical transactions retained for tax purposes should not automatically become training data.

Each workflow should therefore have a defined business objective, approved data sources, intended users, and permitted actions. It should also state whether the information may be used for model training or improvement.

Data minimisation applies as well. An agent investigating an invoice exception may need the invoice, purchase order, goods receipt, and approval status. It is unlikely to need the customer’s complete historical record.

Restricting the workflow to the information required for the task reduces exposure and makes the outcome easier to explain.

Accountability for automated actions

Not every agent action carries the same level of risk.

Providing a summary of overdue invoices is different from changing a supplier’s bank details. Recommending a payment block is different from applying it automatically.

A useful approach is to divide agent authority into three levels:

Agent role	Example	Typical control
Inform	Summarise overdue invoices	Authorised access and source references
Recommend	Suggest which invoices require review	Supporting evidence and human decision
Act	Create, change, release, or approve a transaction	Strict permissions, validation, and risk-based approval

Approval requirements should reflect financial value, reversibility, use of sensitive information, impact on customers, suppliers or employees, legal or fiscal consequences, master-data changes, and conflicts between source records.

High-impact or irreversible actions may require human approval even when the agent can prepare the transaction.
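
The three authority levels and a risk-based approval rule could be expressed along the following lines. The threshold value and the attribute names are assumptions made for this sketch; each organisation would set its own criteria from the factors listed above.

```python
from dataclasses import dataclass
from enum import Enum


class AgentRole(Enum):
    INFORM = "inform"
    RECOMMEND = "recommend"
    ACT = "act"


@dataclass(frozen=True)
class ProposedAction:
    role: AgentRole
    amount: float = 0.0               # financial value affected
    reversible: bool = True
    touches_master_data: bool = False
    uses_sensitive_data: bool = False
    source_conflict: bool = False     # source records disagree


# Hypothetical limit above which a person must approve an automated action.
APPROVAL_THRESHOLD = 10_000.0


def requires_human_approval(action: ProposedAction) -> bool:
    """Inform: no approval. Recommend: a person decides. Act: risk-based."""
    if action.role is AgentRole.INFORM:
        return False
    if action.role is AgentRole.RECOMMEND:
        return True
    return (
        action.amount >= APPROVAL_THRESHOLD
        or not action.reversible
        or action.touches_master_data
        or action.uses_sensitive_data
        or action.source_conflict
    )


small_change = ProposedAction(AgentRole.ACT, amount=500.0)
bank_update = ProposedAction(AgentRole.ACT, touches_master_data=True,
                             uses_sensitive_data=True)
print(requires_human_approval(small_change), requires_human_approval(bank_update))
```

Keeping the rule in one function makes it easier to review and version than approval logic scattered across individual agent prompts.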

Accountability must also remain clearly assigned. The organisation needs a process owner who defines the workflow, a data owner who approves the information used, a control owner who monitors compliance, and an escalation route when the workflow cannot proceed safely.

How lifecycle and compliance rules apply to AI workflows

ERP information does not become exempt from retention, privacy, or legal-hold rules when it enters an AI workflow.

AI workflows may create extracts, indexes, prompt histories, cached records, recommendations, analytical data sets, and transaction logs. If these copies are not included in lifecycle management, an organisation may remove information from SAP while leaving the same data elsewhere.

[Figure: Lifecycle and compliance rules for AI workflows]

Retention, legal holds, and lifecycle management across derived copies

The workflow design should incorporate the lifecycle status and permitted-use rules attached to the source information.

It should account for whether a record remains within its approved retention period, whether personal fields need to be masked, and whether a legal hold prevents its removal.

The same rules may need to cover derived copies. When a customer record becomes eligible for removal under the applicable retention policy, related extracts, indexes, cached data, or prompt histories may also need to be addressed.

The treatment can differ by copy. A temporary extract may be removed when the task ends, while an audit record may need to remain longer.

The organisation should know where each copy is stored, which rule applies to it, whether the lifecycle action must propagate, and what evidence confirms that the process was completed.

A legal hold requires additional care. It may suspend normal lifecycle processing for records connected to litigation, an investigation, or another legal matter. If held information appears in an AI-generated extract or recommendation, the organisation must be able to identify and preserve that copy.

Preservation does not permit unrestricted reuse. Information retained for litigation may still need to be excluded from model training or unrelated automation.
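
A simple registry of derived copies illustrates the points above: each copy records where it is stored and which rule applies to it, and a legal hold suspends removal. This is a sketch under stated assumptions, not SAP ILM functionality; the `DerivedCopy` type, the copy kinds, the storage locations, and the dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class DerivedCopy:
    copy_id: str
    source_record: str  # identifier of the SAP record the copy came from
    location: str       # where the copy is stored
    kind: str           # e.g. "extract", "cache", "prompt_history", "audit"
    retain_until: date  # the rule that applies to this particular copy
    legal_hold: bool = False


def copies_to_remove(copies: Iterable[DerivedCopy], source_record: str,
                     today: date) -> List[DerivedCopy]:
    """Copies of one source record whose own retention has ended and that are not on hold."""
    return [
        c for c in copies
        if c.source_record == source_record
        and not c.legal_hold
        and c.retain_until <= today
    ]


registry = [
    DerivedCopy("x1", "CUST-0042", "agent-workspace", "extract", date(2025, 1, 31)),
    DerivedCopy("x2", "CUST-0042", "audit-store", "audit", date(2032, 12, 31)),
    DerivedCopy("x3", "CUST-0042", "vector-index", "cache", date(2025, 1, 31),
                legal_hold=True),
]
due = copies_to_remove(registry, "CUST-0042", today=date(2025, 6, 1))
print([c.copy_id for c in due])
```

Only the temporary extract is selected: the audit record has a longer retention period, and the cached copy is preserved because of the legal hold. The returned list can also serve as evidence of what was processed.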

GDPR, purpose limitation, and fiscal requirements

Privacy requirements affect how information is handled throughout the workflow.

Personal data should be used for a defined purpose, limited to what is necessary, protected from unauthorised access, and not kept longer than justified.

An AI workflow may therefore require field exclusion, masking, restricted retrieval, or shorter storage periods for temporary extracts and prompts.
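
Field exclusion and masking can be applied before a record reaches the agent. The sketch below is illustrative only: the field names, the masking rule (keep the last four characters), and the sample values are assumptions, and the bank identifier is deliberately not a real account number.

```python
from typing import AbstractSet, Dict, Mapping


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def prepare_for_agent(record: Mapping[str, str],
                      allowed_fields: AbstractSet[str],
                      masked_fields: AbstractSet[str]) -> Dict[str, str]:
    """Drop fields outside the task scope and mask the sensitive ones that remain."""
    prepared: Dict[str, str] = {}
    for field_name, value in record.items():
        if field_name not in allowed_fields:
            continue  # field exclusion
        prepared[field_name] = mask(value) if field_name in masked_fields else value
    return prepared


record = {
    "supplier": "100234",
    "bank_account": "XX00TEST0000001234",  # invented value
    "payroll_group": "X1",
}
safe = prepare_for_agent(record,
                         allowed_fields={"supplier", "bank_account"},
                         masked_fields={"bank_account"})
print(safe)
```

The payroll field never leaves the source, and the bank identifier is reduced to its last four characters, which is often enough for a reviewer to confirm which account is meant.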

TJC Group’s work around SAP ILM and GDPR focuses on retention, blocking, and controlled end-of-life processing within SAP landscapes.

Fiscal requirements create a different obligation. Invoices, accounting records, and supporting evidence may need to remain readable, complete, and traceable for defined periods.

An analytical copy may be transformed for a particular model, but the governed source required for audit or tax evidence should remain intact.

TJC Group’s guidance on SAP data archiving and legal compliance explains why accessibility, retention, and protection must work together.

Related records may also have different end-of-retention dates. An invoice, customer record, approval log, and attached correspondence may not become eligible for lifecycle processing at the same time.

TJC Group’s guidance on managing different retention periods in SAP ILM addresses these dependencies.

How SAP ILM supports lifecycle enforcement

SAP Information Lifecycle Management provides controls for retention, blocking, legal holds, and end-of-life processing within SAP environments.

For AI governance, the workflow should use the lifecycle status and permitted-use rules associated with the source information when deciding whether data can be retrieved, reused, restricted, or processed further.

AI-related copies should not operate under a disconnected lifecycle policy.

Why AI-driven decisions must remain traceable and auditable

An organisation should be able to reconstruct a significant AI-driven action.

If an agent recommends placing a supplier on hold, reviewers may need to know which supplier records, delivery failures, contracts, incidents, and business rules influenced the result.

“The AI decided” is not sufficient evidence.

Depending on the risk, the audit trail may need to record:

Evidence category	What it records
Identity and timing	Requester, execution identity, date, and time
Workflow configuration	Agent version, instructions, rules, and connected tools
Source evidence	Records consulted and any transformations applied
Agent outcome	Recommendation or proposed action
Human oversight	Approval, rejection, or modification
Business result	Final SAP transaction, exceptions, and reversals
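
The evidence categories in the table map naturally onto a structured record. The following sketch assumes a hypothetical `AuditRecord` type serialised as JSON; the field names and sample identifiers are illustrative, and where such records are stored and for how long would follow the organisation's own retention rules.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditRecord:
    # Identity and timing
    requester: str
    execution_identity: str
    # Workflow configuration
    workflow_version: str
    # Source evidence
    source_records: List[str]
    # Agent outcome
    agent_outcome: str
    # Human oversight
    human_decision: Optional[str] = None   # "approved", "rejected", "modified"
    approver: Optional[str] = None
    # Business result
    final_transaction: Optional[str] = None
    timestamp: str = field(default_factory=_utc_now)

    def to_json(self) -> str:
        """Serialise with sorted keys so records are easy to compare."""
        return json.dumps(asdict(self), sort_keys=True)


entry = AuditRecord(
    requester="j.smith",
    execution_identity="svc-procurement-agent",
    workflow_version="supplier-review-1.4",
    source_records=["vendor:100234", "vendor:200871", "contract:C-7781"],
    agent_outcome="recommend: place supplier on hold",
    human_decision="approved",
    approver="procurement.manager",
)
print(entry.to_json())
```

A low-risk summary might populate only the first few fields, while an action that changes money or master data would be expected to fill in all of them.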

Traceability becomes harder when information moves between SAP, SAP BTP, external applications, and other data or orchestration platforms.

Each handoff can create a gap if source identifiers, timestamps, lineage, and action logs are not preserved. Auditability must therefore cover the complete workflow rather than only the final SAP transaction.

Change management is also part of traceability. An agent may behave differently after changes to its model, prompts, retrieval sources, business rules, permissions, transformations, or connected tools.

Material changes to a high-impact workflow should trigger testing, review, and approval before the new version returns to production. The organisation should also be able to identify which version produced a past recommendation.
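
One way to identify which version produced a past recommendation is to store a fingerprint of the workflow configuration with each audit entry. The sketch below hashes a canonical JSON form of the configuration using SHA-256; the configuration keys and values are hypothetical examples of the elements listed above.

```python
import hashlib
import json
from typing import Any, Dict


def workflow_fingerprint(config: Dict[str, Any]) -> str:
    """Hash a canonical JSON form so that key order does not change the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical configuration; every value here is invented for illustration.
config_v1 = {
    "model": "example-model-2025-01",
    "prompt": "Review supplier delivery performance.",
    "retrieval_sources": ["s4hana", "contract_repository"],
    "tools": ["create_recommendation"],
}
config_v2 = dict(config_v1, prompt="Review supplier delivery and quality performance.")

fp1 = workflow_fingerprint(config_v1)
fp2 = workflow_fingerprint(config_v2)
print(fp1 != fp2)
```

Any change to the model, prompt, sources, or tools yields a different fingerprint, which gives reviewers a simple signal that a new version needs testing and approval.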

Controls should remain proportionate. A low-risk summary does not need the same evidence package as a transaction affecting money, personal data, employment, tax, or master data.

SAP’s responsible AI principles include transparency, accountability, privacy, security, and human oversight. Organisations still need to translate those principles into controls suited to their own SAP processes and risk levels.

Why governed data is necessary but not sufficient

A governed SAP data foundation improves the conditions under which an agent operates. It does not guarantee that every recommendation or action will be correct.

Even with accurate and properly controlled information, an agent may misinterpret instructions, select an unsuitable action, respond with excessive confidence, or fail when a connected system is unavailable.

Organisations therefore need controls around both the information and the agent.

These include testing agent behaviour, defining action limits, monitoring outcomes, setting exception thresholds, and providing fallbacks when the workflow cannot complete safely.

Human intervention should remain available when evidence conflicts, a required system is unavailable, or the potential impact exceeds the approved level of autonomy.

TJC Group’s role sits primarily in the governed data foundation. Data archiving, ILM, historical access, and lifecycle controls support reliable automation, but they remain one part of a wider agent-governance programme.

How TJC Group can support a governed SAP data foundation

TJC Group can help organisations strengthen the information foundation used by SAP AI agents.

Its role can be summarised through three connected outcomes:

Agent-governance requirement	TJC Group contribution
Managed live-data footprint	Archiving and the Archiving Sessions Cockpit (ASC) remove eligible completed data from active SAP tables
Enforced lifecycle rules	SAP ILM provides retention, blocking, legal-hold, and end-of-life controls, while ASC automates supported SAP archiving and ILM processes
Governed historical access	ELSA preserves controlled access after SAP and non-SAP applications are retired

Through its S/4HANA data management services, TJC Group can assess SAP data volumes, archiving objects, retention requirements, and historical-access needs.

The Archiving Sessions Cockpit supports the recurring execution and monitoring of SAP data archiving and supported ILM processes. It helps turn an approved policy into an ongoing operational process rather than an irregular manual exercise.

ELSA supports governed access to data and documents from decommissioned SAP and non-SAP systems. Where a future AI or analytical workflow needs historical information, selected data can be supplied through a controlled process without keeping the original application operational.

Together, these capabilities help organisations manage the volume, lifecycle, and availability of SAP information. They provide part of the governed foundation required for reliable automation, while model behaviour, business decisions, and agent authority remain within the organisation’s wider AI governance framework.

Conclusion

SAP AI agents offer significant potential to accelerate and improve business processes, but their effectiveness depends directly on the quality, governance, and lifecycle management of the underlying ERP data. Organisations that invest in a governed SAP data foundation will be better positioned to deploy AI agents reliably and responsibly.

The key points to take away from this article are:

Prioritise data quality and consistency. Incomplete, duplicated, or outdated records will undermine AI agent recommendations regardless of how well the model performs. Address data cleansing and master-data governance alongside SAP data archiving.
Apply access controls to every AI workflow. An agent must never expand a user’s authorised access. Define requester identity, data entitlements, task scope, action authority, and approval rules before any agent enters a live process.
Enforce lifecycle and compliance rules across all copies. AI workflows create extracts, cached records, and logs that must follow the same retention and privacy rules as the source data. SAP ILM can support this enforcement within SAP environments.
Ensure traceability for every significant action. Organisations must be able to reconstruct how an AI-driven decision was made, including the source records, workflow version, and human oversight involved.
Recognise that governed data is necessary but not sufficient. Data quality and lifecycle management form the foundation, but organisations also need controls around agent behaviour, autonomy limits, and exception handling.

TJC Group, with over 25 years of expertise in SAP data volume management, helps organisations build the governed data foundation that reliable AI automation requires. Contact us today to discover how we can support your SAP data governance strategy.

Sources of information

This article is a research-led piece based on official SAP product pages, SAP Learning resources, technical input from Thierry Julien, and TJC Group’s own review of SAP Joule for Consultants.

SAP Joule Agents
https://www.sap.com/products/artificial-intelligence/ai-agents.html

SAP Joule Studio
https://www.sap.com/products/artificial-intelligence/joule-studio.html

SAP Knowledge Graph
https://www.sap.com/products/artificial-intelligence/knowledge-graph.html

SAP Responsible AI principles
https://www.sap.com/products/artificial-intelligence/ai-ethics.html

TJC Group’s review of SAP Joule for Consultants
https://www.tjc-group.com/blogs/is-sap-joule-for-consultants-a-great-assistant-our-opinion/

FAQs

Q1. Should an SAP AI agent inherit all of a user's permissions?

No. The agent should never exceed the requester’s authorised scope. Each workflow should also be limited to the minimum data and actions required for the task, while the technical execution identity remains separately controlled.

Q2. Can retained SAP data automatically be reused for AI?

No. Retention establishes that information must or may remain preserved. It does not automatically permit the data to be used for training, analysis, or automated decisions. Its purpose, permissions, quality, and privacy requirements must still be assessed.

Q3. What should happen when connected ERP systems disagree?

The workflow should define which source is authoritative and whether the discrepancy prevents automation. Material conflicts should be escalated to an identified business or data owner rather than resolved without an approved rule.

Q4. What happens when data used by an agent reaches the end of its retention period?

The applicable rule may need to cover extracts, indexes, prompt histories, cached records, and other copies created by the workflow. The organisation should know where those copies exist and be able to demonstrate that eligible information was processed in accordance with the retention policy.

Q5. What evidence should be retained for an automated SAP action?

The evidence depends on risk, but it may include the requester, technical identity, source records, workflow version, instructions, agent outcome, human approval, final SAP transaction, and any later exception or reversal.
