
Security & Architecture Overview - AI

How EngAIge Protects Your Data

EngAIge is built with security, transparency, and data integrity as foundational principles. It operates directly within the Data Talks CDP environment and does not rely on uncontrolled data generation or unrestricted database access. Every response is grounded strictly in your organisation's live CDP data model.


Controlled Data Access

EngAIge enforces strict boundaries on all data operations. Natural language requests are handled by a dedicated Data Explorer agent that fetches your live schema and translates intent into SQL, which is then validated before execution.

Security mechanisms include:

  • SELECT-only enforcement — write operations (INSERT, UPDATE, DELETE, DROP, TRUNCATE, MERGE) are explicitly detected and rejected at the application layer before any query reaches the database
  • Automatic row limits — queries are capped at 50 rows by default, with a hard maximum of 100 rows, enforced automatically
  • Query complexity limits — a maximum of 8 JOIN operations per query is enforced to prevent resource exhaustion
  • Query deduplication — identical queries within a session are detected and blocked from re-execution
  • Schema-grounded generation — the AI works directly from your live CDP schema, fetched from your environment at runtime, ensuring it cannot reference tables or fields that do not exist in your data model
  • Multi-tenant isolation — all data access is scoped to your organisation's context; cross-client data access is structurally prevented
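The guardrails above can be sketched as a single application-layer validation pass. This is an illustrative Python approximation, not the actual EngAIge implementation (which runs on .NET); the function name and session-set parameter are hypothetical, while the limits (50-row default, 100-row cap, 8 JOINs, SELECT-only) come from the list above.

```python
import re

DEFAULT_ROW_LIMIT = 50   # per the documented default cap
MAX_ROW_LIMIT = 100      # per the documented hard maximum
MAX_JOINS = 8            # per the documented complexity limit
FORBIDDEN = ("INSERT", "UPDATE", "DELETE", "DROP", "TRUNCATE", "MERGE")

def validate_query(sql: str, seen: set) -> str:
    """Reject writes, cap rows, limit JOINs, and deduplicate per session."""
    normalised = sql.strip().rstrip(";")
    upper = normalised.upper()

    # SELECT-only enforcement: reject anything that is not a read.
    if not upper.startswith("SELECT"):
        raise ValueError("Only SELECT statements are allowed")
    for keyword in FORBIDDEN:
        if re.search(rf"\b{keyword}\b", upper):
            raise ValueError(f"Write operation detected: {keyword}")

    # Complexity limit: at most MAX_JOINS JOIN operations.
    if len(re.findall(r"\bJOIN\b", upper)) > MAX_JOINS:
        raise ValueError("Too many JOIN operations")

    # Deduplication: block identical queries within the same session.
    if upper in seen:
        raise ValueError("Duplicate query in this session")
    seen.add(upper)

    # Row limits: apply the default, clamp explicit limits to the maximum.
    m = re.search(r"\bLIMIT\s+(\d+)\b", upper)
    if m is None:
        return f"{normalised} LIMIT {DEFAULT_ROW_LIMIT}"
    if int(m.group(1)) > MAX_ROW_LIMIT:
        return re.sub(r"\bLIMIT\s+\d+\b", f"LIMIT {MAX_ROW_LIMIT}",
                      normalised, flags=re.IGNORECASE)
    return normalised
```

Because every check runs before the query reaches the database, a rejected statement never touches your data at all.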

Multi-Agent Architecture

EngAIge is powered by a structured multi-agent system built on the Microsoft Agent Framework (a pure implementation replacing the legacy Semantic Kernel V1 module). Responsibilities are cleanly separated across four specialised agents:

  • Customer Success Agent — General business insights, campaign performance, segment effectiveness, and calendar intelligence
  • Data Explorer Agent — Schema exploration, SQL generation, query execution, and data visualisation
  • Campaign Agent — Structured campaign planning, suggestion workflows, and monthly plan generation
  • Help Documentation Agent — Platform feature guidance and documentation search

Requests are routed based on intent, guided by explicit routing rules within each agent's system prompt. This separation ensures that analytical logic and activation logic remain isolated, reducing the risk of cross-domain errors.
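In the real system this routing is driven by rules in each agent's system prompt; a rough Python sketch of the dispatch shape, with hypothetical agent names and keyword rules, might look like this:

```python
# Hypothetical routing table: keywords stand in for the prompt-based
# intent rules described above.
ROUTING_RULES = {
    "data_explorer": ("schema", "sql", "query", "table", "chart"),
    "campaign": ("campaign plan", "suggestion", "monthly plan"),
    "help_docs": ("how do i", "documentation", "feature"),
}

def route(request: str) -> str:
    """Pick an agent by intent; Customer Success is the general fallback."""
    text = request.lower()
    for agent, keywords in ROUTING_RULES.items():
        if any(k in text for k in keywords):
            return agent
    return "customer_success"
```

The point of the pattern is that one request lands with exactly one specialised agent, so analytical and activation logic never interleave.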

Each agent exposes a curated set of tools — thin wrappers registered via AIFunctionFactory that delegate business logic to dedicated service classes. This keeps AI interaction surfaces minimal and auditable.
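AIFunctionFactory is part of the .NET Microsoft Agent Framework, so the following Python sketch only mirrors the shape of the pattern: a thin, named wrapper around a service-class method, with all names hypothetical.

```python
class CampaignService:
    """Hypothetical service class that owns the business logic."""
    def performance(self, campaign_id: str) -> dict:
        # Illustrative stub result; real logic lives in the service layer.
        return {"campaign_id": campaign_id, "open_rate": 0.42}

def make_tool(fn, name: str, description: str) -> dict:
    """Wrap a service method as a minimal, auditable tool surface."""
    return {"name": name, "description": description, "invoke": fn}

service = CampaignService()
tools = [make_tool(service.performance, "get_campaign_performance",
                   "Fetch performance metrics for one campaign")]
```

Keeping the wrapper this thin means the auditable surface is just the tool name, description, and delegation target, nothing more.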


Deterministic Workflow Execution

Complex multi-step operations are handled through deterministic workflows built with a WorkflowBuilder pattern. Rather than relying on the AI to sequence each step, workflows are explicitly wired with typed edges and executed in order — the AI only interprets the final aggregated result.


Examples include:

  • Campaign Suggestion Workflow — deterministically gathers match data, commercial focus, segment data, and email templates before AI content generation
  • Monthly Planning Workflow — sequentially collects sport match schedules, campaign performance history, and existing campaigns to inform a full-month plan
  • Deep Analysis Workflow — orchestrates multi-step data enrichment before delivering structured insights

This architecture prevents the AI from making ad hoc tool calls in unpredictable sequences for high-stakes operations.
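A minimal Python sketch of the WorkflowBuilder idea, with hypothetical names, shows why the sequencing is deterministic: the edges are wired up front and the AI only ever sees the aggregated context.

```python
from typing import Callable

class WorkflowBuilder:
    """Illustrative builder: steps run in the exact order they are added."""
    def __init__(self):
        self._steps: list[tuple[str, Callable[[dict], object]]] = []

    def add_step(self, name: str, fn: Callable[[dict], object]) -> "WorkflowBuilder":
        # Each step receives the shared context and contributes one result.
        self._steps.append((name, fn))
        return self

    def run(self) -> dict:
        context: dict = {}
        for name, fn in self._steps:  # fixed, explicit ordering
            context[name] = fn(context)
        return context

# Shape of the campaign-suggestion flow: gather data first, draft last.
result = (WorkflowBuilder()
          .add_step("matches", lambda ctx: ["Derby, Saturday"])
          .add_step("segments", lambda ctx: ["lapsed_fans"])
          .add_step("draft", lambda ctx: f"{len(ctx['matches'])} match(es) found")
          .run())
```

Because the draft step runs only after the data-gathering steps have populated the context, the AI cannot reorder or skip the preparation work.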


Transparent Processing

EngAIge provides a real-time reasoning timeline that surfaces processing steps, tools invoked, and workflow execution phases as they happen. This is powered by an EventManager that pushes named checkpoints to the frontend via SignalR as each step completes.

A structured effects system drives UI rendering: agent responses carry typed effect payloads (campaign suggestions, data tables, navigation actions, deep analysis results, etc.) serialised as polymorphic JSON. The frontend renders each effect type through a dedicated component — ensuring the display is always structured and traceable, never free-form.
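The effect-dispatch idea can be sketched in a few lines of Python. The effect type names and renderer functions here are hypothetical; the real payload schema belongs to EngAIge's frontend contract.

```python
import json

# Hypothetical renderers: each effect type maps to one dedicated component.
RENDERERS = {
    "data_table": lambda p: f"table with {len(p['rows'])} rows",
    "navigation": lambda p: f"navigate to {p['target']}",
}

def render_effects(payload: str) -> list[str]:
    """Deserialise polymorphic JSON effects and route each by its type tag."""
    out = []
    for effect in json.loads(payload):
        renderer = RENDERERS.get(effect["type"])
        if renderer is None:
            raise ValueError(f"Unknown effect type: {effect['type']}")
        out.append(renderer(effect))
    return out

raw = json.dumps([
    {"type": "data_table", "rows": [[1], [2]]},
    {"type": "navigation", "target": "/campaigns"},
])
```

An unknown type fails loudly rather than being rendered as free-form text, which is what makes the display traceable.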


Observability & Tracing

All agent interactions are fully traced through OpenTelemetry with LangFuse integration. Every request captures:

  • Token usage and model latency
  • Full prompt and completion content (with sensitive data tracing enabled in development)
  • Agent identity, user identity, organisation context, and session thread ID
  • Custom metadata including customer code, organisation name, and CUID

This provides complete audit trails for every insight generated, covering both the reasoning process and the data accessed.


Secure Infrastructure

The system uses:

  • Streaming responses via SignalR for secure real-time agent interaction and live event push
  • Distributed cache-backed conversation threads with a 1-hour sliding expiration window — the cache is the authoritative source for active conversation state
  • Query deduplication logic to prevent redundant execution within a session
  • Strict application-layer enforcement of read-only data operations at every entry point in the Data Explorer service

EngAIge V2 fully replaces the legacy Intelligence V1 module (which was built on Semantic Kernel) with a hardened Microsoft Agent Framework implementation designed for scalability, auditability, and reliability.


Data Isolation & Organisational Context

EngAIge operates exclusively within the context of your specific organisation. It does not access cross-client data and does not train on your proprietary information. All insights are generated dynamically from your live CDP instance using your schema at the time of the request.

Localisation logic — including language and geography detection — is applied based on your configured organisation settings, ensuring contextual alignment without external data exposure.


Summary

EngAIge combines conversational AI with a secure, validated data execution layer. Its architecture enforces application-level read-only access, schema-grounded query generation, row and complexity limits, query deduplication, and agent-level responsibility separation — ensuring that every insight is accurate, traceable, and generated from your own data environment.