Security-first AI gateway · litellm-compatible

The AI gateway with the firewall built in, not bolted on.

LLMSecure is a private, drop-in replacement for your LLM proxy. The injection, PII and secrets firewall runs inside the gateway process — so prompts never leave your boundary, there is no separate port to route around, and you remove a CVE-prone component instead of adding another.

Request a Pro trial Why inline matters

Private / self-hosted Prompts & responses stay in your environment Shadow mode & tunable thresholds

Built for security-forward teams in finance, AI, Web3 and regulated infrastructure

FINANCEAI LABSWEB3FINTECHREGULATEDSAAS

The problem

A firewall in front of a vulnerable gateway is a firewall you can walk around.

The common pattern today is an LLM firewall chained in front of an open-source gateway like litellm. That leaves two gaps the chained model can't close:

Network bypass

Compromise any host on the same segment and you can talk to litellm directly — the front firewall never sees the request. SSRF inside the environment does the same.

Data leaves to scan

External guardrails (SaaS scanners) require shipping the prompt out to a third party to inspect it. For regulated and financial workloads that is often a non-starter.

Infra still exposed

Content guardrails only protect the conversation. The gateway's own SSRF / auth / injection CVEs remain — none of them are defended by a content filter.

LLMSecure collapses gateway and firewall into one hardened, private component. The firewall is inline by architecture — there is no bare-gateway port to skip — and the scanning happens in-process, so nothing is sent to a third party to be inspected.

Architecture

Two pillars, one hardened component

The name says it: LLM Secure + Secure LLM. Securing the gateway itself, and securing the calls that pass through it.

Pillar 1

LLM Secure — the gateway's own security

AI infrastructure security.

Memory-safe Rust core, minimal dependencies, no Python supply chain.
Parameterized queries, enforced authorization, least privilege by default.
SSRF egress allow-listing, signed builds, SBOM, isolated CI/CD.
Replacing litellm removes a component with a real critical-CVE history.

An add-on firewall can't give you this — it sits outside the thing that's exposed.

Pillar 2

Secure LLM — securing every call

AI runtime security.

Prompt-injection & jailbreak detection on input.
Secrets, credentials and PII detection on the way to the model.
Response data-leak / DLP, content moderation & malicious-URL checks (output — Enterprise).
Shadow mode + adjustable thresholds with transparent false-positive/negative reporting.

All in-process — your prompts are never forwarded to an external scanner.

Competitors bolt Pillar 2 onto a fragile Pillar 1 (litellm). We fuse both into a single hardened component — and we solve the infrastructure security that content filters ignore.

Why inline

Three things only an in-process design can do

Inline by architecture

The firewall lives in the gateway process. There is no standalone "bare gateway" listener for an attacker to reach around to. (Honest scope: gateway-host compromise is addressed by Pillar 1 hardening, not by this claim.)

Scanning never leaves

Detection runs in-process on your infrastructure. Prompts and responses are not sent to any external service — including ours. The control plane only ever receives metadata and verdicts.

Solves infra security

By replacing the gateway rather than wrapping it, you shrink the attack surface instead of growing it — one hardened component instead of two.

Comparison

How we compare

Against gateways and AI-security vendors, on the dimensions that matter for regulated deployments.

Capability	LLMSecure	Cloudflare	Kong	Palo Alto AIRS	F5
Private deploy / data stays in your boundary	Yes	SaaS	Partial	Partial	Enterprise-heavy
Native firewall depth	Yes	Limited	Bolt-on	Yes (chained)	Yes (chained)
Solves infrastructure security (Pillar 1)	Yes	No	No	No	Partial
Firewall inline / no port to route around	In-process	No	No	Chained	Chained
Scanning never sent to a third party	Yes	No	No	No	No
Replaces litellm (shrinks attack surface)	Yes	No	General GW	Add-on	Partial
litellm-compatible drop-in	Yes	No	No	No	No

Comparison reflects typical deployment models as positioned by each vendor; verify against your own requirements.

vs litellm — the drop-in we replace

A litellm-compatible API surface, so existing SDK code keeps working — what changes is the security posture underneath.

Capability	LLMSecure	litellm
litellm-compatible API surface (drop-in base-URL swap)	Yes	Reference
Implementation	Rust — memory-safe core, minimal deps	Python — large dep tree, history of critical CVEs
Inline LLM firewall (prompt-injection / jailbreak)	Built-in, in-process	Not built-in
Secrets & PII detection in prompts	Built-in (regex + ML)	Not built-in
Output firewall (response DLP & moderation)	Enterprise tier	Not built-in
SSRF egress hardening	Default-deny + DNS-pinned IP; blocks 6to4 / NAT64 smuggling	Standard outbound, no pinning
Tamper-evident audit log	Hash-chained; control plane sees metadata only	Basic logs
SOC / SIEM integration (syslog RFC 5424)	Enterprise — Splunk / Elastic / Sumo / ArcSight / QRadar	Not built-in
One-command deployment	Single docker compose unit (gateway + detector)	Python app + dependencies
Provider breadth	OpenAI / Azure / Anthropic / Bedrock at MVP, growing	100+ providers
Virtual keys, budgets & rate limits	Yes — fail-operational; refund on upstream failure	Yes
Streaming (SSE) & multi-provider fallbacks	Yes	Yes
Update channel	Signed bundles (Enterprise)	pip / git

litellm is the closest open-source peer; this is positioning, not a slight. Verify against your own requirements.

Capabilities

Gateway parity, plus security

Gateway, aligned with litellm

OpenAI-compatible unified API across providers
Multi-provider routing with fallbacks and retries
Virtual keys, per-key budgets and rate limits
Usage metering and cost tracking
Streaming (SSE) responses
Drop-in base-URL swap — point your SDK at LLMSecure

Security firewall

Input firewall — Pro & Enterprise

Prompt-injection & jailbreak detection
Secrets & credential detection in prompts
PII detection on the way to the model

Output firewall — Enterprise

Response data-leakage / DLP & redaction
Content moderation & malicious-URL checks in output

Both tiers

Shadow mode, tunable thresholds, transparent FP/FN reporting
Immutable audit trail; one-click bypass for troubleshooting

Tiers

Two ways to run it

No public self-serve. Start on Pro, move to Enterprise when you're ready. Both are priced per engagement.

Pro

Private data plane, SaaS control plane — close to a SaaS experience, without the data exposure.

One-line install — one command / docker run
Input firewall — injection, secrets & PII on the way to the model
Auto-registers with the control plane, pulls policy, demo in minutes
Data plane stays private — control plane sees metadata & verdicts only
Dashboard surfaces metadata + decisions only — plaintext stays in your boundary by design.
Hosted admin dashboard with magic-link sign-in (no SSO at this tier)
Per-rule fail-open / fail-closed, one-click bypass
Runs disconnected (cached policy) if the control plane is unreachable

Request a Pro trial

For regulated & air-gapped

Enterprise

Fully private or air-gapped, with a curated update channel and compliance artifacts.

Everything in Pro, fully self-contained
Adds the output firewall — response DLP, data-exfiltration & content moderation
SSO / SAML / OIDC for the admin console, plus a self-hosted admin with prompt retention for forensic case analysis (plaintext stays inside your boundary)
SOC / SIEM integration — syslog forwarder (RFC 5424 over TCP/TLS) for Splunk, Elastic, Sumo, ArcSight, QRadar or any compliant collector
Air-gapped operation with a curated, signed update channel
Compliance artifacts (SBOM, signed builds), BYOK
SLA, priority support, and onboarding
Try Pro first, then graduate to Enterprise

Talk to us

Honest scope

We tell you what it does — and what it doesn't.

Detect & mitigate

Prompt-injection defense is detection and mitigation, not a guarantee. We run in shadow mode first and report false positives and negatives openly.

Data residency by default

Prompts and responses never leave your deployment. The control plane receives metadata and verdicts only; sample sharing is strictly opt-in and redacted.

Defense in depth

Inline means no network port to route around — not "unbreakable." Host hardening, least privilege and signed builds carry the rest.

Get started

Request access

Pro and Enterprise are both by application. Tell us about your setup and we'll get you into a Pro trial.