A concrete scenario
Your team is building an agent that books appointments for end-users. The agent sees pages full of names, emails, phone numbers, and date-of-birth fields. You want to use JD Codec to keep token cost down, but you need to be confident that:
- PII never reaches our infrastructure.
- Your sessions can’t see another customer’s data, even by accident.
- Traffic in flight can’t be intercepted.
- Old sessions don’t sit around waiting to be exfiltrated.
The four pillars
1. The on-device privacy boundary
The full treatment is in Privacy posture. The summary:
- The connector runs on your machine. Before any snapshot crosses the network, a regex-and-rule pack matches structured PII (emails, phone numbers, credit cards, dates of birth, addresses, API keys, tax IDs, IP addresses, and more) and replaces matches with category-level placeholders like {{REDACTED_EMAIL}}.
- The cloud never sees raw PII. It sees the redacted snapshot plus category-level counts (e.g. { email: 2, CC_GENERIC: 1 }).
- The cloud refuses requests that arrive without the audit signal proving the redaction layer ran. A misconfigured connector fails loudly on the first request, so the unsafe path is impossible to take by accident.
- The connector source is public. You can inspect the redaction rules or run the package locally with the privacy layer in audit-only mode to see what would be redacted before sending anything to us.
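The redaction step described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual rule pack: the two patterns, the RULES table, and the redact function are all hypothetical, and the real rules cover many more categories.

```python
from __future__ import annotations

import re

# Hypothetical rule pack: category name -> pattern. Illustrative only;
# the published connector source is the reference for the real rules.
RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(snapshot: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a category placeholder and count matches per category."""
    counts: dict[str, int] = {}
    for category, pattern in RULES.items():
        placeholder = "{{REDACTED_" + category + "}}"
        snapshot, n = pattern.subn(placeholder, snapshot)
        if n:
            counts[category] = n
    return snapshot, counts
```

Only the redacted text and the per-category counts leave the function, which mirrors the idea that the cloud receives placeholders and counts rather than raw values.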
2. Authenticated transport
- HTTPS only. All connections to api.jdcodec.com go over TLS. The connector refuses to send to a non-HTTPS endpoint (loopback excepted for local development).
- Bearer-token authentication. Every request carries Authorization: Bearer <api_key_id>.<api_key_secret>. Your api_key_id is the public half (it appears in logs and request IDs); your api_key_secret is the private half.
- Secret half is hashed, not stored. When we issue an API key, we store a hash of the secret using a salted, iteratively stretched key-derivation function. The raw secret value is shown to you once at issuance time and is not recoverable from our side. If you lose it, we issue a new key; we cannot recover the old one.
- Comparison is constant-time. Authentication uses a constant-time comparison so timing attacks can’t enumerate valid keys.
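To make the transport rules concrete, here is a small sketch using only the Python standard library. It assumes PBKDF2-HMAC-SHA256 as the key-derivation function and an arbitrary iteration count; the service's actual KDF and parameters are not stated here, and the function names and in-memory key store are hypothetical.

```python
from __future__ import annotations

import hashlib
import hmac
import os
from urllib.parse import urlparse

PBKDF2_ITERATIONS = 200_000  # assumed work factor, not the service's real setting


def hash_secret(secret: str, salt: bytes | None = None) -> tuple[bytes, bytes]:
    """Derive a salted, stretched hash; only (salt, digest) would be stored."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", secret.encode(), salt, PBKDF2_ITERATIONS)
    return salt, digest


def verify_bearer(header: str, stored: dict[str, tuple[bytes, bytes]]) -> bool:
    """Check an 'Authorization' value of the form 'Bearer <key_id>.<secret>'."""
    scheme, _, token = header.partition(" ")
    key_id, _, secret = token.partition(".")
    # The key ID is the public half, so an early exit on it reveals nothing secret.
    if scheme != "Bearer" or not secret or key_id not in stored:
        return False
    salt, expected = stored[key_id]
    _, candidate = hash_secret(secret, salt)
    return hmac.compare_digest(candidate, expected)  # constant-time comparison


def endpoint_allowed(url: str) -> bool:
    """Allow HTTPS anywhere, and plain HTTP only on loopback."""
    parts = urlparse(url)
    if parts.scheme == "https":
        return True
    return parts.scheme == "http" and parts.hostname in {"localhost", "127.0.0.1", "::1"}
```

Because only the salt and digest are kept, a lost secret cannot be recovered from the stored record, which is why reissuing a key is the only option.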
3. Tenant isolation
- Each session runs inside its own isolated compute instance. The instance is keyed by a combination of your api_key_id and the session_id, so two customers cannot land on the same instance even if they happen to pick the same session_id.
- A request that presents a session_id from a different api_key_id than the one that created it is refused.
- The instance holds in-memory state only for the lifetime of the session (see §4). When the session ends, the state is gone.
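The composite-key idea can be illustrated with a toy registry. This is a sketch under stated assumptions: the real service uses isolated compute instances rather than dictionary entries, and the class and method names below are hypothetical.

```python
from __future__ import annotations


class SessionRefused(Exception):
    """Raised when no session exists for this (api_key_id, session_id) pair."""


class InstanceRegistry:
    def __init__(self) -> None:
        # Keyed by (api_key_id, session_id): the same session_id under two
        # different keys always maps to two separate entries.
        self._instances: dict[tuple[str, str], dict] = {}

    def create(self, api_key_id: str, session_id: str) -> dict:
        state: dict = {}
        self._instances[(api_key_id, session_id)] = state
        return state

    def lookup(self, api_key_id: str, session_id: str) -> dict:
        try:
            return self._instances[(api_key_id, session_id)]
        except KeyError:
            raise SessionRefused("unknown session for this key") from None

    def end(self, api_key_id: str, session_id: str) -> None:
        # Dropping the entry discards the in-memory state with it.
        self._instances.pop((api_key_id, session_id), None)
```

With this keying, a caller presenting another customer's session_id simply finds nothing under its own key, so the refusal falls out of the lookup itself rather than a separate ownership check.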
4. Minimal-by-design retention
The product is designed to keep as little as possible, for as short a time as possible:

| Data class | What we keep | How long |
|---|---|---|
| Raw snapshot YAML | Nothing — never written to disk | Zero — never persisted |
| Compressed snapshot output | Nothing — never written to disk | Zero — never persisted |
| In-memory session state | Only during an active session | Cleared at idle TTL, absolute TTL, or process restart |
| Usage metadata (no content) | Session/task/step IDs, character counts, timings, redaction category counts, redacted URL, error codes | 90 days, then hard-deleted |
| Logs | Metadata only (request IDs, key IDs, timings, error codes) | Platform default; revisited as we harden |
Once a session's in-memory state has been cleared, a request that reuses its session_id fails cleanly with a “start a fresh session” error. Per-key overrides are available for use cases that need different bounds; reach out.
The 90-day usage-metadata window is configurable per-deployment and contains zero content — only counts and identifiers. We use it for compression analytics and support conversations.
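The idle and absolute TTLs in the table can be pictured with a small in-memory store. This is only a sketch: the TTL values, the SessionStore class, and the injectable clock are assumptions made for illustration, not the service's actual bounds or implementation.

```python
from __future__ import annotations

import time
from typing import Callable

# Assumed bounds for illustration; the real TTLs are not stated in this section.
IDLE_TTL_S = 15 * 60
ABSOLUTE_TTL_S = 4 * 60 * 60


class SessionExpired(Exception):
    """Tells the caller to start a fresh session."""


class SessionStore:
    """In-memory only: nothing is written to disk, and a restart loses everything."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # session_id -> [created_at, last_seen_at, state]
        self._sessions: dict[str, list] = {}

    def start(self, session_id: str) -> None:
        now = self._clock()
        self._sessions[session_id] = [now, now, {}]

    def touch(self, session_id: str) -> dict:
        """Return the session state, or raise if it is unknown or expired."""
        entry = self._sessions.get(session_id)
        now = self._clock()
        if entry is not None:
            created_at, last_seen_at, state = entry
            idle_ok = now - last_seen_at <= IDLE_TTL_S
            absolute_ok = now - created_at <= ABSOLUTE_TTL_S
            if idle_ok and absolute_ok:
                entry[1] = now  # activity resets the idle timer only
                return state
            del self._sessions[session_id]  # drop expired state immediately
        raise SessionExpired("start a fresh session")
```

Note that activity extends the idle window but never the absolute one, so even a continuously busy session is eventually cleared.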
What logs and error messages contain
- Logs include request IDs, key IDs, session/task/step IDs, character counts, timings, and error codes.
- Logs do not include snapshot bodies, compressed output, redacted values, or the secret half of any API key.
- Error response bodies follow the same rule. Error messages are category-level — they never echo any portion of the submitted snapshot.
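One common way to enforce a metadata-only logging rule like the one above is an allow-list applied just before a record is written. The sketch below assumes hypothetical field names; only the categories (IDs, counts, timings, error codes) come from the list above.

```python
from __future__ import annotations

# Hypothetical field names chosen to match the categories listed above.
ALLOWED_LOG_FIELDS = frozenset(
    {
        "request_id",
        "key_id",
        "session_id",
        "task_id",
        "step_id",
        "char_count",
        "duration_ms",
        "error_code",
    }
)


def to_log_record(event: dict[str, object]) -> dict[str, object]:
    """Keep only allow-listed metadata; every other field is dropped."""
    return {k: v for k, v in event.items() if k in ALLOWED_LOG_FIELDS}
```

An allow-list fails safe: a newly added field that carries content stays out of the logs until someone explicitly permits it, whereas a deny-list would let it through by default.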
Audit trail — what’s enforced, not just documented
- The on-device redaction layer is mandatory. The connector cannot ship a snapshot without it. The cloud refuses requests that don’t carry the audit signal.
- Automated checks on every connector commit prevent codec-internal vocabulary or sibling-repo references from leaking into customer-visible code.
- A weekly audit rescans every published artefact (npm, PyPI, public source repo) for the same patterns — last-line defence if anything slips past the local checks.
- The connector source is public at github.com/jdcodec/connector. You can verify the auth, transport, and redaction logic yourself.
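As a rough picture of what refusing a request without the audit signal might look like, consider the sketch below. The shape of the signal is not described in this document, so the redaction_audit field, its ran and counts keys, the status codes, and the error strings are all hypothetical.

```python
from __future__ import annotations


def check_audit_signal(body: dict) -> tuple[int, str]:
    """Return (status, code); refuse any request lacking a well-formed audit signal."""
    audit = body.get("redaction_audit")  # hypothetical field name
    if not isinstance(audit, dict) or audit.get("ran") is not True:
        return 400, "missing_redaction_audit"
    counts = audit.get("counts")
    if not isinstance(counts, dict):
        return 400, "invalid_redaction_audit"
    for category, n in counts.items():
        # Counts are category-level integers only, never redacted values.
        if not isinstance(category, str) or type(n) is not int or n < 0:
            return 400, "invalid_redaction_audit"
    return 200, "ok"
```

The error codes stay category-level and never echo any part of the submitted body, in line with the logging rules earlier on this page.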
What’s not yet covered
We aim to be honest about the gaps so you can plan around them.
- Server-side secondary check for PII. A defence-in-depth layer where the cloud scans for PII patterns as a backstop (in addition to refusing requests without the audit signal). Reserved in our API contract but not enforced today. The on-device layer is the authoritative boundary.
- Per-customer retention overrides for the 90-day usage-metadata window. Currently a single global default; per-key override is on the roadmap.
- Custom redaction rule packs. The on-device rule pack is shared across all customers today. Allow-listing custom domains or adding custom redaction rules is on the roadmap.
- SOC 2 / ISO 27001 attestations. We are not certified today. We can share our security posture and engage with your vendor-review process; reach out.
- Penetration test reports. Available on request under NDA.
Reporting a security issue
Email [email protected] with Security: in the subject line. We’ll respond within one business day. Please do not file issues on the public repo for security-sensitive reports.
Related
- Privacy posture — the deep-dive on the on-device redaction layer.
- What is JD Codec — what the product does and how it fits in.
- Sessions, tasks, steps — the unit model.
- API Reference — error codes, request shapes, headers.