Where is my data stored and processed?

All inference and any associated data stay inside the platform we operate for you, in EU or German data centers. Nothing is forwarded to public AI APIs, and nothing is used to train shared models.

Is the API really OpenAI-compatible?

Yes. We expose OpenAI-compatible endpoints, so most existing client libraries, SDKs, and integrations work by changing the base URL and key. We help with the rollout.

Which integrations are supported today?

Nextcloud, GitLab, and any tool that speaks the OpenAI-compatible API — including Claude Code, Cursor, Continue, and CLI agents. Custom integrations are part of the engagement.

How does pricing work?

Three packages (Starter 490€, Business 1490€, Enterprise from 2490€) with a monthly base fee and an included credit quota. Token usage is billed via the credit system; see the pricing section. A 12-month commitment adds 20% credits; prepayment adds 30%.

Can it run on-premise or in our own cloud?

Yes. The platform can be deployed in our managed environment, on dedicated bare metal we operate for you, or — for some setups — inside your own infrastructure as a managed service.

Which models are available?

A curated set of production-grade open-source models from the Llama, Qwen, Mistral, and DeepSeek families — grouped into Standard, Advanced, and Premium classes. The model list is refreshed regularly; we'll share the current set on request.

How long are logs and data retained?

Operational logs are kept short-term and rotated automatically. Inference content is never used to train models and is not retained beyond processing. Exact retention windows are set contractually.

Is a DPA (Data Processing Agreement) available?

Yes. We provide a GDPR-compliant DPA that covers EU hosting, sub-processors, and technical & organizational measures.

How do billing and cancellation work?

Monthly billing (standard) or 12-month commitment / prepayment with bonus credits. Monthly plans can be canceled at the end of the month. Commitment plans run for the agreed term; renewal is optional.

Is there a pilot or trial program?

Yes. Pilot programs let you validate in your environment — we tailor scope, models, and limits to your use case. Get in touch and we'll size one for you.

What's the support response time per tier?

Starter: email support during business hours. Business: email plus optional Slack Connect / Teams. Enterprise: priority support including phone, with response times per the agreed SLA.

Managed AI API Gateway · EU-hosted · GDPR-compliant

Private AI for your business — hosted in Europe

OpenAI-compatible endpoints, curated open-source models, and transparent credit-based billing. Pilot with Starter from 490€ / month and scale to Business and Enterprise — without sending data to US providers.

Pilot programs available — talk to us.

Hosting: Germany & EU
Compliance: GDPR-compliant
API: OpenAI-compatible

Use cases

Real AI, in the tools you already use

We don't just host models — we bring AI directly into your existing platforms, so your data stays where it lives.

Nextcloud AI

Document summarization, semantic search, and AI chat over your files — built into the Nextcloud you already operate.

GitLab AI

Code explanations, merge request summaries, and a private dev assistant — running inside your GitLab, not someone else's cloud.

Agentic coding & IDE assistants

A private endpoint for Claude Code, Cursor, Continue, and CLI agents. Your code stays in your environment.

Platform

Your private AI platform — fully managed

Built for regulated environments where data sovereignty and predictable operations matter more than benchmarks.

Privacy-first

GDPR-compliant by design. No data leaves the platform we operate for you, and nothing is shared with public AI APIs.

EU & Germany hosting

Operated in European data centers. Choose Germany or another EU region depending on your compliance needs.

OpenAI-compatible API

Drop-in compatible endpoints. Point your existing tooling at our platform without rewriting your integrations.

Modern open-source models

Curated, production-grade open-source models. Updated and operated by us — no model sprawl on your side.

High-performance inference

Powered by modern AI infrastructure on H100-class hardware, tuned for real-world workloads — not benchmarks.

Transparent usage

Clear usage tracking and predictable cost structure. No surprise bills from token spikes.

Drop-in compatible

OpenAI-compatible — switch in 5 lines

Same SDKs, same endpoints. Point your base URL and API key at b'nerd and your existing tooling runs on private AI.

$ curl https://api.bnerd.com/v1/chat/completions \
  -H "Authorization: Bearer $BNERD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

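import os

from openai import OpenAI

# Point the OpenAI SDK at the b'nerd endpoint instead of the default one.
client = OpenAI(
    base_url="https://api.bnerd.com/v1",
    api_key=os.environ["BNERD_API_KEY"],  # read the key from the environment
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)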

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.bnerd.com/v1",
  apiKey: process.env.BNERD_API_KEY,
});

const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [{ role: "user", content: "Hello!" }],
});

Example model name; we'll share the current set of available models on request.

Architecture

Built for control and transparency

A platform you can grow into — from a first pilot to production-grade AI features across your tools.

Shared infrastructure with clear isolation: Workloads run on shared platform infrastructure with strict tenant isolation and tier-based prioritization. No surprise model swaps, no foreign data in your inference.
Hosted in Europe: Operated in EU and German data centers, under European jurisdiction. Data residency is a deployment choice, not a footnote.
Open architecture: Open-source models, OpenAI-compatible API, and standard integrations — so you can move, swap, or self-host later. No vendor lock-in.
From Starter to Enterprise: Pilot with Starter, scale to Business for production load, move to Enterprise for governance, compliance, and custom SLAs.

Pricing

Three packages for any workload. Token prices and the credit system apply uniformly across all tiers.

All prices excl. VAT. B2B only.

Which package fits you?

Starter

For smaller production workloads, internal assistants, RAG prototypes, and controlled API usage.

Business

For team and enterprise workloads with higher throughput, more stable usage, and prioritized processing.

Enterprise

For business-critical AI workloads with governance, compliance, integration, and custom operational requirements.

Starter

Prototypes & smaller production workloads.

490€ / month

Start a pilot

What's included

20M credits / month
Shared best-effort AI infrastructure
OpenAI-compatible API
Standard queue priority
Standard API limits & context
Baseline monitoring
GDPR-compliant hosting in the EU
Availability: up to 99.5%
Email support

Recommended

Business

Prioritized processing for team workloads.

1490€ / month

Request a demo

What's included

50M credits / month
Prioritized processing within shared infrastructure
Extended API limits
Higher requests / tokens per minute
Extended context limits
Full access to Premium models
Optional VPN access · SSO available
Monitoring & extended usage reporting
Availability: up to 99.9%
Email support · Slack Connect / Teams optional

Enterprise

Business-critical · Compliance · Custom.

from 2490€ / month

Talk to sales

What's included

Custom credit quotas
Highest priority within shared infrastructure
Custom API limits & concurrency
Extended context limits
Full access to Premium models
Private networking available
VPN / SSO integration
Audit logging & extended reporting
Custom model integrations optional
Custom SLA agreements
Priority support (email, phone, Slack Connect / Teams)

Token pricing

Prices per 1M tokens. Applies uniformly across all packages.

Model class (typical use): included usage / on-demand
Standard (chatbots, RAG, automations): 1.90€ / 2.90€
Advanced (coding assistants, agents, complex assistants): 4.90€ / 6.90€
Premium (reasoning, high-end AI, complex analysis): 9.90€ / 14.90€

Credit system

The platform bills based on credits. Different model classes consume different amounts of credits per token.

Standard 1× credits
Advanced 3× credits
Premium 6× credits

Example

10M Standard tokens = 10M credits
2M Advanced tokens = 6M credits
0.5M Premium tokens = 3M credits

Total usage: 19M credits
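
The example above can be reproduced with a few lines of Python. This is a minimal sketch: the credit factors are the ones listed in this section, while the function and variable names are illustrative and not part of any b'nerd API.

```python
# Credit factors per model class, as listed in the credit system above.
CREDIT_FACTORS = {"standard": 1, "advanced": 3, "premium": 6}


def credits_used(tokens_by_class: dict[str, float]) -> float:
    """Return the total credits for token counts keyed by model class."""
    return sum(CREDIT_FACTORS[cls] * tokens for cls, tokens in tokens_by_class.items())


# Token counts in millions, taken from the example above.
usage = {"standard": 10, "advanced": 2, "premium": 0.5}
total = credits_used(usage)
print(f"Total usage: {total:g}M credits")
```

Because the factors are plain multipliers, the same function works for raw token counts or for millions of tokens, as long as the unit is used consistently.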

Credit calculator

Estimate your usage and see what each package would cost — including on-demand overage.

Enter millions of tokens per month. Credit factors: Standard 1×, Advanced 3×, Premium 6×.

Non-binding estimate. Usage beyond the included quota is billed at on-demand rates (allocated proportionally across model classes).
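
For a rough offline estimate, the calculation can be sketched in Python. The base fees, included credits, credit factors, and on-demand rates come from this page. How the overage is allocated is an assumption: here each model class carries the overage in proportion to its share of total credits, and the overage credits are converted back to tokens before the on-demand rate is applied. Enterprise is left out because its quota is custom. Treat the result as non-binding; the invoice logic on the platform may differ.

```python
CREDIT_FACTORS = {"standard": 1, "advanced": 3, "premium": 6}
# On-demand prices in EUR per 1M tokens, from the token pricing section.
ON_DEMAND_EUR_PER_M_TOKENS = {"standard": 2.90, "advanced": 6.90, "premium": 14.90}
# Package name -> (monthly base fee in EUR, included credits in millions).
PACKAGES = {"Starter": (490, 20), "Business": (1490, 50)}


def estimate_monthly_cost(tokens_m: dict[str, float], base_fee: float,
                          included_credits_m: float) -> float:
    """Estimate the monthly cost in EUR for token usage given in millions per class."""
    credits = {cls: CREDIT_FACTORS[cls] * t for cls, t in tokens_m.items()}
    total = sum(credits.values())
    overage = max(0.0, total - included_credits_m)
    cost = float(base_fee)
    if overage == 0:
        return cost
    for cls, used in credits.items():
        share = used / total  # assumed proportional allocation of the overage
        overage_tokens_m = overage * share / CREDIT_FACTORS[cls]
        cost += overage_tokens_m * ON_DEMAND_EUR_PER_M_TOKENS[cls]
    return round(cost, 2)


usage = {"standard": 20, "advanced": 5, "premium": 1}  # millions of tokens per month
for name, (fee, included) in PACKAGES.items():
    print(name, estimate_monthly_cost(usage, fee, included))
```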

Term & prepayment

Longer terms or prepayment increase your credit quota — list prices stay the same.

Monthly (standard): standard pricing, flexible usage, no minimum term.
12-month commitment (+20% credits): 20% additional credits per month and a stable price baseline for 12 months.
12-month prepayment (+30% credits): 30% additional credits per month; prepaid, with one invoice per year.
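
As a quick illustration of what the bonuses mean in practice, the sketch below applies them to the included monthly quotas of Starter (20M credits) and Business (50M credits). It assumes the bonus is applied to the included quota only; the term keys and function name are illustrative.

```python
# Bonus credits in percent per billing term, from the options above.
BONUS_PERCENT = {"monthly": 0, "commitment_12m": 20, "prepayment_12m": 30}
# Included credits per month, in millions, from the package descriptions.
INCLUDED_CREDITS_M = {"Starter": 20, "Business": 50}


def monthly_credits(package: str, term: str) -> float:
    """Return the monthly credit quota in millions, including the term bonus."""
    return INCLUDED_CREDITS_M[package] * (100 + BONUS_PERCENT[term]) / 100


for package in INCLUDED_CREDITS_M:
    for term in BONUS_PERCENT:
        print(f"{package}, {term}: {monthly_credits(package, term):g}M credits")
```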

Fair usage & performance

An API-first managed service: you focus on integration, we run the infrastructure and models. Tier-based limits keep performance stable under load.

Requests per minute

Tier-based RPM limits protect the platform and ensure predictable response times.

Tokens per minute

TPM limits scale with your package — Business and Enterprise have significantly higher throughput.

Context limits

Maximum context size per request, depending on package and model.

Queue prioritization

Prioritized lanes for Business and Enterprise keep response times stable under load.
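
On the client side, requests-per-minute and tokens-per-minute limits are usually handled by retrying with exponential backoff. The sketch below is generic and uses only the standard library. It assumes the platform signals a limit with an HTTP 429 response, as OpenAI-compatible APIs conventionally do; `RateLimited` is a hypothetical exception your own client code would raise in that case.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by your client code on an HTTP 429 response."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request(); on RateLimited, wait with exponential backoff plus jitter and retry."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            delay = base_delay * 2 ** attempt
            sleep(delay + random.uniform(0, delay / 2))


# Demo with a fake request that is rate limited twice before succeeding.
calls = {"count": 0}


def fake_request():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise RateLimited()
    return "ok"


print(call_with_backoff(fake_request, sleep=lambda seconds: None))
```

If you use the openai Python SDK, the equivalent error is `openai.RateLimitError`, and the client can also retry on its own through its `max_retries` option.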

Private AI — FAQs

Common questions about data residency, integrations, and the engagement model.


Do you have questions or would you like a personalized offer? We are happy to advise you.

Submit Inquiry

Contact

Our cloud experts are happy to provide personalized advice.

Sillemstraße 76A

20257 Hamburg, Germany

Mon - Fri: 09:00 AM - 06:00 PM

+49 40 239 69 754 0

hello@bnerd.com