VGC v1 Live Optimization Demo

Try the public VGC demo, or request protected access for paid-model testing.

Run free hosted open-model tests here, then request testing tokens when you want expanded commercial/provider comparisons in the protected live lab.

Why care: instead of forcing one model everywhere, VGC turns model choice into operating leverage you can see.

Access boundary: public runs use hosted open-weight models; commercial/provider runs require protected access, testing tokens, server-side credentials, or an approved BYOK workspace.

Free public testing

Use hosted open-weight models to see the VGC workflow, scorecard, sample cost read, and claim gates without entering credentials.

Protected paid-model testing

Request testing tokens for OpenAI, Claude, Gemini, DigitalOcean, or private-connector lanes, or for larger repeat batteries with ledger-backed reporting.

Private/BYOK workspace

Provider keys stay server-side or in an authenticated workspace vault. The public demo never collects provider API keys.

Gated Inference Evidence
61.98%

Protected benchmark-map high-water evidence, dated in the current claim scorecard; not a guarantee from one free public sample run.

Gated Token-Cost Evidence
68.54%

Protected benchmark-derived posture. Hard-dollar claims require provider usage or billing reconciliation.

Gated Latency Evidence
44.19%

Measured protected p99/floor evidence, not a universal provider latency claim.

Fit Preservation
98.69%

Measured bounded fit-preservation evidence from protected claim artifacts, not a broad truth guarantee.

Optimization Results - Test VGC v1 Savings Live

This is the fastest way to understand the page and why it matters. Run one prompt across real hosted models, then watch VGC show where the route, savings, and operating advantage come from.

Step 1. Select a model, or let VGC choose the lane.

Use compare mode for side-by-side raw model results, or switch to architecture-selected modes to let VGC route the task.

Step 2. Create a prompt, or use one we provide.

Paste a real use case, use the current preset, or generate another sample prompt to fit the story you want to show.

Step 3. Get a real response from hosted models.

The server runs the selected models, captures the outputs, and updates the run monitor and event feed as work completes.

Step 4. See VGC optimization and savings live.

The scoreboard and delta view show whether the VGC route improves on a flat raw-model choice for the current task.

2. Prompt and Response Lab

Start here. This is the operator-facing workflow for the run: the current prompt, the latest response, and the exact text VGC is routing through the benchmark surface.

Current Prompt

The current prompt will appear here.

Latest Routed Response

Run the live benchmark to capture a response.

3. Live Run Monitor

Then watch the run happen. This section is the live state layer for the hosted benchmark: progress, per-model bars, and what the architecture is doing right now.

4. Run Event Feed

This event log keeps the benchmark legible. It shows model starts, completions, timeouts, and stop events so the demo feels like a transparent product run, not a black box.
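
As a rough illustration of the event feed described above, the sketch below models the four event kinds named in this section (start, completion, timeout, stop) and derives which models are still running. The class names, field names, and model labels are hypothetical; the actual feed schema is not documented on this page.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EventKind(Enum):
    START = "start"
    COMPLETE = "complete"
    TIMEOUT = "timeout"
    STOP = "stop"


# Assumption for this sketch: any of these kinds ends a model's run.
TERMINAL = {EventKind.COMPLETE, EventKind.TIMEOUT, EventKind.STOP}


@dataclass
class RunEvent:
    model: str          # hypothetical model label, e.g. "model-a"
    kind: EventKind
    at_seconds: float   # seconds since the run began


@dataclass
class EventFeed:
    events: List[RunEvent] = field(default_factory=list)

    def record(self, model: str, kind: EventKind, at_seconds: float) -> None:
        """Append one event to the feed in arrival order."""
        self.events.append(RunEvent(model, kind, at_seconds))

    def still_running(self) -> List[str]:
        """Return models that have started but logged no terminal event."""
        started = [e.model for e in self.events if e.kind is EventKind.START]
        finished = {e.model for e in self.events if e.kind in TERMINAL}
        return [m for m in started if m not in finished]


feed = EventFeed()
feed.record("model-a", EventKind.START, 0.0)
feed.record("model-b", EventKind.START, 0.1)
feed.record("model-a", EventKind.COMPLETE, 1.8)
print(feed.still_running())  # ['model-b']
```

Keeping the feed as an append-only list means the monitor view can be rebuilt from the log alone, which is one way to keep a run legible after the fact.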

5. Commercial Rollout Read

This is the private-lab commercial view. It answers which provider lane should come next, how it unlocks, and what private-preview posture we should use while the public demo stays clean.

Choose a provider action above to see the recommended BYOK, subscription, or private-lane rollout read for that lane.

6. Private Request Ledger

This is the private reporting layer behind the benchmark. It shows recent recorded runs, average savings signals, and the model lanes winning most often on this host.

7. What Changed In This Run
8. What This Run Indicates
9. VGC Savings Scoreboard

This is the VGC layer in measurable form: inference savings, token-cost reduction, latency improvement, and fit preservation, tied back to the current prompt, route, and model fingerprint. When control mode is baseline core, treat this section as the no-layer reference before comparing against a governed pass.

10. Claim Evidence Packet

This is the current proof ledger for visible claim posture. Public demo hosts summarize session and public-gate evidence; the protected live lab carries detailed commercial/provider rows.

Primary/Secondary Benchmark Map

This is the production scoreboard contract: primary rows carry work, cost, truth, and p99/floor latency gates; secondary rows carry checkpoint, prefix, session, and attribution evidence without lowering the protected hold-line benchmark.

Model-Specific Claim Packets

These packets keep OpenAI, Claude, DigitalOcean fixed, router, and mechanism evidence separated by profile so strong results are not blended into broader claims or dropped when the benchmark map changes.

Commercial Gate Readiness

This is the authoritative launch-readiness view. It lists each lane against the scope, comparator, truth, work, cost, p99/floor latency, GUI evidence, and hold-line gates.
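
One way to picture the readiness view is as a per-lane checklist over the gates listed above. The sketch below assumes a lane is ready only when every gate passes and treats a missing result as a failure; that aggregation rule, the gate identifiers, and the function name are assumptions for illustration, not the product's documented logic.

```python
from typing import Dict, List, Tuple

# Gate names taken from the list above; the identifiers are hypothetical.
GATES = (
    "scope",
    "comparator",
    "truth",
    "work",
    "cost",
    "p99_floor_latency",
    "gui_evidence",
    "hold_line",
)


def lane_readiness(results: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Return (ready, failing_gates) for one lane.

    Assumption: a gate with no recorded result counts as not passed.
    """
    failing = [gate for gate in GATES if not results.get(gate, False)]
    return (len(failing) == 0, failing)


example = {gate: True for gate in GATES}
example["gui_evidence"] = False
print(lane_readiness(example))  # (False, ['gui_evidence'])
```

Returning the failing gates alongside the verdict keeps the reason for a hold visible instead of collapsing it into a single pass/fail flag.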

Claim Gate Scorecard
Tail/Spike Mechanism Gate

This shows whether the selector can react to repeat-backed tail, spike, and dip signals before the next comparable call. It is mechanism proof only until paired live A/B runs prove p99/floor, token, work, cost, and semantic outcomes.
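
For readers unfamiliar with the p99/floor pairing, the sketch below computes both from a list of latency samples. It assumes "floor" means the fastest observed latency and uses the nearest-rank method for p99; neither choice is confirmed by this page, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def p99_and_floor(latencies_s: Sequence[float]) -> Tuple[float, float]:
    """Return (p99, floor) for latency samples given in seconds.

    Assumptions: p99 uses the nearest-rank method, and floor is the
    minimum observed latency.
    """
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_s)
    n = len(ordered)
    rank = (99 * n + 99) // 100  # integer form of ceil(0.99 * n)
    return ordered[rank - 1], ordered[0]


samples = [float(i) for i in range(1, 101)]  # 1.0 s through 100.0 s
print(p99_and_floor(samples))  # (99.0, 1.0)
```

With few samples the p99 collapses toward the maximum, which is one reason a repeat-backed battery is needed before a tail claim means much.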

Tail/Spike Paired Proof Gate

This shows the current promotion shape for turning tail and spike reaction into commercial evidence. It stays a rehearsal until deterministic rows are replaced with live paired A/B provider rows.

Tail Predictor Report

This separates harmless comparator-fast floor events from VGC-tail events and shows which signals are predictive enough for tail-watch gating.

VGC Starter Profile Catalog

VGC is the adaptive control layer. This catalog is only a distribution wrapper: publish starter profiles for common model surfaces, then send customer-specific lanes through the live lab before custom sidecar or SDK work.

Profile Export Packet

This is the bridge from lab proof to deployment: a candidate VGC profile with the control defaults, metering boundary, promotion gates, and resource-allocation read needed before sidecar, SDK, or plugin rollout.

Volume Savings Calculator

Enter a workload amount and choose one measured lane. The calculator extrapolates the same lane percentage over that volume so savings stay tied to the benchmark result that produced them.
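
The extrapolation described above is a single multiplication. A minimal sketch, assuming the lane result is expressed as a percentage between 0 and 100; the function name and the 25% figure are illustrative and not measured values from this page.

```python
def extrapolate_savings(volume: float, lane_pct: float) -> float:
    """Apply one measured lane percentage to a workload volume.

    volume   -- workload amount in any unit (tokens, requests, dollars)
    lane_pct -- the lane's measured savings percentage, 0 to 100
    """
    if volume < 0:
        raise ValueError("volume must be non-negative")
    if not 0 <= lane_pct <= 100:
        raise ValueError("lane_pct must be between 0 and 100")
    return volume * lane_pct / 100.0


# Hypothetical inputs: 1,000,000 units of work on a lane measured at 25%.
saved = extrapolate_savings(1_000_000, 25.0)
print(f"{saved:,.0f}")  # 250,000
```

Because the result is linear in the lane percentage, it carries exactly the same caveats as the benchmark row it came from.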

11. Blind Spot Radar

This is the production-readiness guardrail. It shows what VGC can claim now, what is still proxy signal, where measurement may be altering the response, and which gates must pass before a broader public or commercial claim.

12. Frontier Limits

This shows the practical ceiling we can currently justify and the tests that would extend it. It keeps today's exact-lane wins visible while showing why stricter ClaimGateV2 production claims are still held.

13. Raw Vs VGC Delta View

This gives the fastest visual answer to why the architecture matters on the current run: what a flat raw choice would optimize for, what VGC routes instead, and where the measurable deltas actually show up.
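
A delta of this kind can be read as a percent reduction relative to the raw baseline. The sketch below shows that arithmetic under the assumption that lower is better for the metric (latency, tokens, cost); the function name and sample values are hypothetical.

```python
def percent_reduction(raw: float, routed: float) -> float:
    """Percent improvement of the routed value over the raw baseline.

    Positive means the routed run used less; negative means it used more.
    """
    if raw <= 0:
        raise ValueError("raw baseline must be positive")
    return (raw - routed) / raw * 100.0


# Hypothetical latencies in seconds.
print(percent_reduction(2.0, 1.5))  # 25.0
print(percent_reduction(2.0, 2.5))  # -25.0
```

Reporting negative values rather than clamping at zero keeps regressions visible on the same scale as wins.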

14. VGC Control Matrix

This is the explicit with-VGC versus baseline-core read for each lane on the current run. It shows where VGC is active, where it is only matching a governed lane, and where a lane is still running in neutral-core posture.

15. VGC Phase Stack

This shows the full VGC operating path on the current run: anticipation before generation, governance during generation, mirror/audit after generation, and whether the trajectory/forecast layer is inline or still lab-attached.

16. Execution Substrate Read

Both open-weight and commercial lanes are currently benchmarked as request/response runs. This section keeps the honest boundary in view: host-local runtime control versus remote provider API opacity.

17. Packet-Aware Read

This shows whether the current comparison is an isolated lane or a mixed live packet, and whether VGC is using a packet-aware commercial posture on the focus lane.

18. Objective Tuning And Fit Cage

This separates what VGC is optimizing for from what happened to bounded fit on the run. Use it to tune for budget, latency, or stability while keeping the fit cage transparent and isolated from the savings claim.

19. Objective Sweep Comparison

This runs the same selected packet through explicit VGC objectives so we can compare budget, latency, and stability on one surface instead of assuming that token trimming will automatically become latency savings.

20. Human Spot Check

This is the clean human-in-the-loop view. It keeps the response text, fit cage, claim posture, and direct runtime numbers together without filtering the underlying row into a prettier story.

21. Anomaly Watch

This flags where VGC is active but flat, where a provider is saturating the fit proxy, and where controller activity may be outrunning measurable payoff.

22. Winners
23. Where VGC v1 Fits

The raw models matter, but the product value comes from the VGC layer above them. This section explains what a flat model choice would look like versus the routed VGC decision.

Raw Models Through VGC

Each row begins as a raw model execution on the same prompt. VGC then evaluates that raw output and turns it into a routing, savings, and deployment decision surface.

Model | Role | Raw Latency (s) | Visible / Work Tokens | Router Policy | Raw Chars | VGC Action Fires | VGC Action Delta | VGC Substitution Fires | Baseline Substitution | VGC Sub Delta | VGC Fit Preservation | VGC Deployment Use

Per-Model Savings and Fit

These cards translate the benchmark into model-specific operating choices. They help the person running the demo show why an agentic system should switch models by task instead of forcing every request through one compromise model.

Six-Layer Architecture In Operation

This live lab keeps the full Specialized Stack in view. We are not trading away the architecture to get the demo.

Signal Strips

Each strip shows controller fit preservation, structure fit preservation, and lane events over the evaluated token sequence for that model.

Response Previews
What The Numbers Mean
Operating Read

This lab is already model-in-the-loop and live. The hosted version runs on the server so the person receiving the demo only needs a browser. The architecture, benchmark backend, and model runtime all stay inside the hosted product surface.

What we gain: real prompt-matched model switching, live benchmark packets, grounded model-role comparisons, and a place for prospects to test their own use cases on the spot.

Current boundary: public demo hosts run open/public lanes and hide private connector controls; the protected live lab can run configured commercial/provider lanes, with authenticated GUI E2E still required before public claim promotion.

Raw Packet (Backend Naming)

This is the direct backend packet for operators and debugging. Some internal field names still use the legacy truth-preservation schema even though the demo surface now presents that metric as fit preservation.
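
When reading the raw packet next to the demo surface, it can help to translate the legacy names. The sketch below renames keys that contain a legacy truth-preservation token to the fit-preservation wording. The key strings, function name, and sample packet are hypothetical, since the real field names are not listed on this page.

```python
from typing import Any

# Hypothetical key fragments; the backend's actual field names may differ.
LEGACY = "truth_preservation"
DISPLAY = "fit_preservation"


def to_display_names(value: Any) -> Any:
    """Recursively rename legacy keys in a JSON-like structure."""
    if isinstance(value, dict):
        return {
            (k.replace(LEGACY, DISPLAY) if isinstance(k, str) else k):
                to_display_names(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [to_display_names(item) for item in value]
    return value  # scalars pass through unchanged


packet = {"model": "model-a", "truth_preservation_pct": 97.5}
print(to_display_names(packet))
# {'model': 'model-a', 'fit_preservation_pct': 97.5}
```

The function returns a renamed copy and leaves the original packet untouched, so the backend naming stays available for debugging.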

Run the live benchmark to inspect the packet.