SMALL
MODEL
MARKET
PLACE
On-device AI
Infrastructure
——
Run any model
Any device
Right now
We don't make models. We make models shippable.
ANY Device. No exceptions.
1 SDK. One integration.
DAY 1 New hardware support.

Why nothing works

01

Broken Infrastructure

Apple Intelligence is locked to new hardware. Android depends on cloud-based Gemini. CoreML is unreliable in production. The ANE chip sits unused in most iPhones because Apple's own tooling can't expose it consistently.

02

Recurring Cloud Costs

Developers want AI that's invisible — OCR, intent parsing, classification. Cloud APIs create ongoing costs and a hard dependency. Every call costs money. It doesn't have to.

03

Privacy & Compliance

GDPR, CCPA, HIPAA, and SOC 2 compliance is hard, sometimes impossible, when user data passes through your servers. On-device processing eliminates entire categories of data-transfer requirements. Data that never leaves the device is data regulators can't touch.

04

Hardware Fragmentation

Apple calls it Neural Engine. Qualcomm calls it Hexagon. Google ships TPUs on Pixel phones. Every chipset is different. No universal standard. Gets worse every year as new silicon ships.

The Solution

An SDK that reliably runs small models on any device hardware — handling chip routing, optimization, and reliability so developers don't have to. On top of that, a marketplace where model creators distribute specialized models directly to developers.
Pre-Ship

Optimize & Predict

Automatic model optimization for target device profiles. Chip split visibility — GPU vs CPU vs NPU per device class. Estimated latency and memory footprint before you ship. Flags models that won't run acceptably before they reach users.
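A minimal sketch of what the pre-ship feasibility check could look like. Everything here is illustrative: the device names, TOPS figures, memory budgets, and the `estimate` function are invented for the example, not measured data or a real API.

```python
# Hypothetical sketch: pre-ship feasibility check against device profiles.
# All device names and numbers are illustrative, not measured.

from dataclasses import dataclass

@dataclass
class DeviceProfile:
    name: str
    npu_tops: float   # assumed peak NPU throughput, trillions of ops/sec
    ram_mb: int       # assumed memory budget available to the app

def estimate(model_gflops: float, model_mb: int, dev: DeviceProfile,
             efficiency: float = 0.3, budget_ms: float = 100.0):
    """Rough latency estimate: compute / (peak throughput * realistic efficiency)."""
    latency_ms = model_gflops / (dev.npu_tops * 1000 * efficiency) * 1000
    fits = model_mb < dev.ram_mb and latency_ms <= budget_ms
    return latency_ms, fits

fleet = [
    DeviceProfile("flagship-2024", npu_tops=35.0, ram_mb=3000),
    DeviceProfile("midrange-2021", npu_tops=5.0, ram_mb=1200),
]
for dev in fleet:
    ms, ok = estimate(model_gflops=900, model_mb=850, dev=dev)
    print(f"{dev.name}: ~{ms:.0f} ms  {'OK' if ok else 'FLAG'}")
```

With these made-up numbers the flagship passes the 100 ms budget while the mid-range device gets flagged before any user ever sees it.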

At-Ship

Dead Simple Integration

Single SDK regardless of device or OS. Write once — SDK handles CoreML / NNAPI / QNN routing underneath. Model bundling, download management, and staged loading built in.
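The "write once, route underneath" idea can be sketched as follows. The backend names (CoreML, NNAPI, QNN) are real; the `pick_backend` function and `Runtime` class are invented stand-ins for whatever the SDK actually exposes.

```python
# Hypothetical sketch of single-SDK backend routing.
# pick_backend and Runtime are illustrative, not a real API.

def pick_backend(os_name: str, chipset: str) -> str:
    """Map a device's OS/chipset to the accelerator backend to load."""
    if os_name == "ios":
        return "coreml"       # Apple Neural Engine via CoreML
    if chipset.startswith("snapdragon"):
        return "qnn"          # Qualcomm Hexagon via QNN
    return "nnapi"            # generic Android fallback

class Runtime:
    """One call site for the app; backend selection is hidden inside."""
    def __init__(self, os_name: str, chipset: str):
        self.backend = pick_backend(os_name, chipset)

    def run(self, model_id: str, inputs):
        # A real SDK would load and execute the model on the chosen backend.
        return {"backend": self.backend, "model": model_id, "inputs": inputs}

result = Runtime("android", "snapdragon-8g3").run("ocr-small", b"...")
print(result["backend"])  # qnn
```

The app code never names a backend; swapping device or OS changes only what happens underneath.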

Post-Ship

Production Observability

Real user device telemetry — actual latency, memory, thermal events, failures. Segmented by device model, OS version, and chip generation. Drift detection when performance degrades after OS updates.
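A toy sketch of segmenting telemetry and catching drift after an OS update. The event records, field names, and 1.5x threshold are all illustrative assumptions, not the real schema.

```python
# Hypothetical sketch: segment on-device telemetry and flag latency drift
# after an OS update. Field names and numbers are illustrative.

from collections import defaultdict
from statistics import median

events = [  # one record per on-device inference, as the SDK might report
    {"device": "iPhone 13", "os": "17.4", "chip": "A15", "latency_ms": 41},
    {"device": "iPhone 13", "os": "17.4", "chip": "A15", "latency_ms": 44},
    {"device": "iPhone 13", "os": "17.5", "chip": "A15", "latency_ms": 95},
    {"device": "iPhone 13", "os": "17.5", "chip": "A15", "latency_ms": 99},
]

def segment_medians(events):
    """Median latency per (device model, OS version, chip generation)."""
    buckets = defaultdict(list)
    for e in events:
        buckets[(e["device"], e["os"], e["chip"])].append(e["latency_ms"])
    return {seg: median(v) for seg, v in buckets.items()}

def drifted(before_ms, after_ms, threshold=1.5):
    """Flag a segment whose median latency regressed past the threshold."""
    return after_ms / before_ms > threshold

m = segment_medians(events)
old = m[("iPhone 13", "17.4", "A15")]
new = m[("iPhone 13", "17.5", "A15")]
print(f"17.4: {old} ms, 17.5: {new} ms, drift: {drifted(old, new)}")
```

Here the same phone and chip more than doubles in median latency after the OS bump, which is exactly the regression this layer exists to surface.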

Put your model in.
We handle every device.
You get full visibility into how.

Why the alternatives fail

The closest existing solution is ONNX Runtime (2018). The major alternatives are either community-maintained open source with no monetization incentive to stay current, or vendor tools locked to one platform. When Apple ships a new chip generation, wrappers lag. We guarantee Day 1 support because our revenue depends on it.

ONNX Runtime
Technical reality: Wraps CoreML with no direct ANE access. Static workload routing. Cloud-first design.
Sustainability: Community-maintained. No revenue incentive to update for new hardware.
Gap: No production telemetry. No chip split visibility. Inconsistent device reliability.

CoreML
Technical reality: Apple-native. Best ANE access available.
Sustainability: Apple-only by definition. Incentive is ecosystem lock-in, not cross-platform support.
Gap: Poorly documented edge cases. Reliability varies across A-series generations.

TFLite / LiteRT
Technical reality: Google's edge inference answer. Delegate system for hardware acceleration.
Sustainability: Google-maintained. History of sunsetting developer tools.
Gap: NNAPI delegate is inconsistent. No production observability.

→ US
Technical reality: Abstracts all backends (CoreML, NNAPI, QNN). Dynamic runtime routing. Single SDK.
Sustainability: B2B SaaS with revenue aligned to staying current. Paid SLAs.
Gap closed: Pre-ship optimization, chip split visibility, post-ship telemetry. Full stack.

Who we build for

Day-One TAM

Chinese iOS Developers

Legally blocked from Western cloud APIs. The pitch is not cheaper — it is the only viable option. High urgency. Strong OCR and translation use cases.

Simple Sale

EU / GDPR Developers

'Never leaves the device' eliminates their legal review entirely. Strong signal in German and Dutch indie dev communities. Privacy angle makes the sales conversation simple.

Pure ROI

High-Volume Apps

Apps processing large image or text volumes pay real API costs. Pure cost elimination. Quantifiable savings. Direct sales motion.

Higher ACV

Enterprise App Teams

Shipping SLM-powered features. Can't bet on open-source maintainability. Need SLAs and support. Longer sales cycle. Best fit for Enterprise tier.

How we make money

The SDK is free. We monetize the tooling and guarantees around it — not the runtime itself. Free instrumentation, paid insights and reliability. Sentry / Datadog / Segment model.

Free

$0

SDK + runtime. Limited device matrix. Community support. For teams evaluating and building.

Pro

Monthly

Full device test matrix. Production telemetry. Chip split visibility. Latency reports.

Marketplace

Rev Share

Platform cut on paid model listings. Model creators get distribution; we take a cut.

Sequencing

Now · Phase 1

Own the runtime layer

Get into real codebases. Port first models in-house or via flat-fee contracts. No public announcement. No open marketplace yet.

Near Term · Phase 2

SDK soft launch

SDK soft launch on GitHub. Tutorials go live. Engage dev communities as a resource, not a product. Become the default Cursor/Copilot suggestion. Documentation written for LLM consumption first, humans second.

Public · Phase 3

Public launch

Launch once real usage exists. Lead with the technical story: we solve cross-device AI. Device testing platform goes live. Marketplace opens.

Long Term · Moat

Network effects

When model companies hit the iOS/Android wall, we are already there. Marketplace fills with specialized models. Going direct becomes less attractive than listing on the platform everyone already uses. We think about this 100% of the time. Model companies think about device distribution 2% of the time. That gap only grows.