Onnex Launches AI-Sentinel

Onnex Launches AI-Sentinel, an Inline 9-Layer Security Platform for AI Workloads

June 2026 — Onnex has launched AI-Sentinel, an inline security sidecar that sits between applications and AI models to inspect every request and response in real time — blocking prompt injections, stripping PII, preventing data exfiltration, and enforcing tool access controls before the model ever sees a potentially malicious payload.

AI-Sentinel is purpose-built to address a class of threat that conventional security tools are not designed to handle. As AI agents and LLM-powered applications become core to business operations, they introduce an attack surface that firewalls, endpoint protection, and async SIEM scanners were never built to cover. AI-Sentinel closes that gap with a fail-fast, fully inline pipeline that completes evaluation in under 20 milliseconds — with no out-of-band blind spots.

The Problem with Async AI Security

The majority of AI security tools on the market today operate asynchronously — logging and analyzing AI traffic after the fact, out-of-band from the actual request flow. In practice, this means that attacks completing in under 100 milliseconds are already done before any alert fires. Out-of-band is logging after the breach, not preventing it.

AI-Sentinel takes a fundamentally different architectural position: the request does not proceed until all applicable security layers have completed evaluation. If AI-Sentinel has not responded, the model has not seen the payload. This eliminates the entire class of blind spots that affect async scanners.

Nine Security Layers, Sequential and Fail-Fast

AI-Sentinel processes every ingress request through nine sequential layers, short-circuiting the chain at the first detected threat:

L0 — Input Normalization: Anti-obfuscation preprocessing including base64 decoding, Unicode normalization, zero-width character stripping, and leetspeak expansion. Ensures downstream layers evaluate the true payload, not an obfuscated version of it.

L1 — Prompt Injection Detection and PII Stripping: Covers 12+ regex patterns for DAN attacks, SYSTEM overrides, Llama-format injection, and null-byte attacks — detecting and rejecting in under 3 milliseconds. PII stripping removes SSNs, credit card numbers, and email addresses before the model sees them, with optional Presidio NER for enhanced detection.

L2 — Authentication, Trust Chain, and Threat Feed: API key and JWT validation, agent-to-agent HMAC trust tokens with 60-second replay protection, live threat signature matching from OWASP LLM Top 10 and CrowdSec CTI, MCP environment hardening, and RAG poisoning detection.

L3 — Intent Guard: Semantic drift detection across multi-turn conversations, topic correlation analysis, and command-and-control pattern recognition — designed to catch slow-burn manipulation attacks that evade single-turn inspection.

L4 — Tool RBAC: Deny-by-default enforcement for destructive tool operations including drop, delete, wipe, and purge functions. Allowlists are configured per role, with CVE-mapped tool patterns sourced from NVD.

L5 — Rate and Cost Controls: Per-session token bucket and daily cost accumulator with configurable thresholds. A global E-Stop flag provides an instant kill-switch across all active sessions.

L6 — Output Filtering and SSRF Prevention: On the egress path, AI-Sentinel blocks AWS IAM keys, PGP and RSA private key blocks, JWTs, SQL dumps, and large base64 blobs from appearing in model output. RFC-1918 private IP ranges, localhost, and AWS, Azure, and GCP metadata endpoints are all blocked in model egress responses.

L7 — Tamper-Evident Audit Chain: Every request — pass, reject, and mutation — is recorded with a SHA-256 hash-chained audit entry including caller ID, layer, timestamp, and decision. A verification endpoint detects any modification to the chain, making the log suitable for compliance audits and SIEM export.

L8 — Semantic Cache and Routing: Intelligent response caching and model routing with per-request token budget tracking, reducing redundant LLM calls while maintaining full security coverage on every interaction.

Coverage and Performance

AI-Sentinel covers all 55 known MITRE ATLAS techniques — achieving 100% coverage of the MITRE ATLAS framework for AI-specific attacks. The platform has been validated against 100+ detection patterns and verified through 964 automated tests. At peak load, it handles 522 requests per second with an average inspection latency under 20 milliseconds — under 5 milliseconds when Presidio NER is not active.

Threat feed updates are zero-downtime: new patterns from OWASP LLM Top 10, CrowdSec CTI, and NVD CVE are hot-swapped via atomic pointer swap in microseconds, with no restart required.

Model-Agnostic and Simple to Deploy

AI-Sentinel is fully model-agnostic, evaluating request and response JSON payloads rather than model-specific wire formats. It works identically with OpenAI, Anthropic Claude, Mistral, local models, and any custom LLM deployment. Integration requires a single endpoint change — pointing the application at POST /check instead of the model directly — with no SDK changes and no modifications to the model or existing infrastructure.

The platform ships as a statically-linked Rust binary or Docker sidecar. Typical deployment time is approximately one hour. A Postgres and Redis session store, Prometheus and Grafana monitoring stack, and multi-tenant architecture are included.

Available Through AirGap Labs

AirGap Labs, a Fortinet Engage Preferred Services Partner headquartered in Irvine, California, is offering AI-Sentinel as part of its AI Enablement practice and as a component of the AirGap Labs × Onnex secure AI package for small and mid-sized businesses. AirGap Labs handles deployment, integration with existing infrastructure, policy configuration, and ongoing managed monitoring — so organizations gain the full protection of the AI-Sentinel platform without internal implementation overhead.

To learn more or schedule a deployment assessment, contact AirGap Labs at sales@airgaplabs.com or call 949-669-4711. A live demonstration of AI-Sentinel's inspection pipeline is available at ai-sentinel.on-nex.us/live.

About Onnex

Onnex is the developer of AI-Sentinel, an inline AI security platform providing 9-layer, real-time protection for AI workloads. AI-Sentinel achieves 100% MITRE ATLAS coverage and has been validated against 100+ detection patterns across prompt injection, data exfiltration, tool abuse, PII leakage, and adversarial manipulation attack categories. Contact: 310-383-3231 | info@onnexglobal.com | ai-sentinel.on-nex.us

About AirGap Labs

AirGap Labs is a premier IT infrastructure engineering firm founded in 2013 and headquartered in Irvine, California. The company holds Fortinet's Engage Preferred Services Partner (EPSP) designation and delivers managed services, secure network architecture, zero-trust security, multi-cloud engineering, infrastructure, and AI enablement to clients across the United States and globally. Contact: 949-669-4711 | sales@airgaplabs.com | airgaplabs.com


Media inquiries: sales@airgaplabs.com

Back to blog