What a Capability Is
When you right-click a file and the Transmute menu shows you what that file can become, each entry is reachable through one or more capabilities — typed operations a cartridge knows how to perform on a typed input to produce a typed output.
A capability is not a button in MachineFabric’s code. It’s a contract:
- What it accepts: an input media URN (e.g. `media:pdf`, `media:wav;bytes`, `media:llm-generation-request;json;record`).
- What it produces: an output media URN (e.g. `media:textable;page`, `media:llm-text-stream;ndjson`).
- How it's named: a Cap URN of the form `cap:in="media:X";<tags>;out="media:Y"` that identifies the operation precisely enough for the planner to route to it.
- What arguments it takes: optional parameters extracted from stdin, CLI positions, or flags.
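As a rough illustration of the contract above, the following sketch pulls the input, output, and tags out of a Cap URN string. It is a hypothetical helper written for this page, not the CapDAG parser: `split_segments`, `ParsedCap`, and `parse_cap_urn` are made-up names, and the normalization and wildcard handling that the protocol defines are ignored here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def split_segments(body: str) -> List[str]:
    """Split on ';' while leaving semicolons inside double quotes alone."""
    segments, current, in_quotes = [], [], False
    for ch in body:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == ";" and not in_quotes:
            segments.append("".join(current))
            current = []
        else:
            current.append(ch)
    segments.append("".join(current))
    return [s for s in segments if s]


@dataclass
class ParsedCap:
    in_media: Optional[str] = None   # stays None when the URN has no in= segment
    out_media: Optional[str] = None  # stays None when the URN has no out= segment
    tags: Dict[str, Optional[str]] = field(default_factory=dict)


def parse_cap_urn(urn: str) -> ParsedCap:
    """Illustrative only: separate in=, out=, and the remaining tags."""
    if not urn.startswith("cap:"):
        raise ValueError(f"not a Cap URN: {urn!r}")
    parsed = ParsedCap()
    for segment in split_segments(urn[len("cap:"):]):
        key, sep, value = segment.partition("=")
        # A segment without '=' is a bare marker tag such as 'transcribe'.
        clean = value.strip('"') if sep else None
        if key == "in":
            parsed.in_media = clean
        elif key == "out":
            parsed.out_media = clean
        else:
            parsed.tags[key] = clean
    return parsed


cap = parse_cap_urn(
    'cap:in="media:wav;bytes";transcribe;model=whisper;out="media:textable"'
)
print(cap.in_media, "->", cap.out_media, cap.tags)
```

The quote-aware split matters because media URNs such as `media:wav;bytes` carry their own semicolons inside the quoted `in=` and `out=` values.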
The naming, matching, and routing of capabilities is defined by CapDAG, the open protocol MachineFabric is built on. This page covers what capabilities mean from inside MachineFabric — what the Transmute menu shows, what the standard ones are, what cartridges declare. For protocol-level details:
- CapDAG Cap URN structure — how a Cap URN composes input, output, and tags.
- CapDAG Tagged URN domain — URN syntax, normalization, wildcard semantics.
- CapDAG Dispatch — the single rule that decides which provider serves a request.
- CapDAG Ranking — how the most specific provider is selected when more than one matches.
- CapDAG Media URNs — the typed payloads that flow between cartridges.
How the Transmute Menu Reaches a Capability
You right-click a file. Here’s what happens before the menu opens:
- MachineFabric reads the file's media type: a PDF becomes `media:pdf`, a WAV becomes `media:wav;bytes`, and so on.
- The planner walks the capability graph: every cap that accepts that input directly, plus every cap reachable through one or more chained cartridges, contributes a possible destination.
- The menu shows destinations, not steps: "render to image," "transcribe," "summarize." You pick the destination; the planner picks the route.
- The route is shown before it runs: for multi-step pipelines the graph appears so you can see the journey before a single byte moves.
- Cartridges are invoked in order: each one in a sandboxed XPC service, each step's progress streamed back live, results stored in your library.
When the same destination is reachable by more than one route, ranking picks the most specific match. A PDF-specific extractor beats a generic document one; an LLM cap that names a specific model beats a generic one.
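The graph walk described above can be pictured as a breadth-first search from the file's media type. The sketch below is a toy model under stated assumptions, not the MachineFabric planner: the `CAPS` table, its destination labels, and the `media:summary;textable` URN are invented for illustration, and media types are compared by exact string equality rather than by the URN matching rules CapDAG defines.

```python
from collections import deque

# Hypothetical capability table: (input media, destination label, output media).
CAPS = [
    ("media:pdf", "extract text", "media:textable;page"),
    ("media:pdf", "render to image", "media:png;bytes"),
    ("media:textable;page", "summarize", "media:summary;textable"),
    ("media:wav;bytes", "transcribe", "media:textable"),
]


def reachable_destinations(start_media, caps):
    """Return {destination label: route}, where a route is the list of steps to run."""
    routes = {}
    seen = {start_media}
    queue = deque([(start_media, [])])
    while queue:
        media, path = queue.popleft()
        for in_media, label, out_media in caps:
            if in_media != media:
                continue
            route = path + [label]
            # Breadth-first order means the first route found is a shortest one.
            routes.setdefault(label, route)
            if out_media not in seen:
                seen.add(out_media)
                queue.append((out_media, route))
    return routes


for destination, route in reachable_destinations("media:pdf", CAPS).items():
    print(destination, "via", " -> ".join(route))
```

In this toy graph "summarize" shows up as a destination for a PDF even though no single cap accepts `media:pdf` and produces a summary; the route goes through text extraction first.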
Standard Capabilities
CapDAG ships a small set of standard caps that every cartridge can rely on. The mandatory one is CAP_IDENTITY; the rest are common operations the registry has helpers for.
Mandatory
| Cap URN | Purpose |
|---|---|
| `cap:` (`CAP_IDENTITY`) | Pass input through unchanged. Every cartridge must declare it. The runtime auto-registers a handler; you only need to include it in the manifest. |
Standard (optional)
| Cap | Purpose |
|---|---|
| `CAP_DISCARD` | Accept any input, produce `media:void`. Default implementation provided. |
| `CAP_ADAPTER_SELECTION` | Inspect content and return the media URNs that describe it. Used for file-type detection. |
LLM, Embeddings, Models
The cartridge SDK exposes URN constants and builders for the LLM, embeddings, and model-management caps that MachineFabric routes through every day:
- LLM inference: `CAP_LLM_INFERENCE_GGUF`, `CAP_LLM_INFERENCE_MLX`, `CAP_LLM_INFERENCE_CANDLE`, `CAP_LLM_INFERENCE_CONSTRAINED`
- LLM introspection: `CAP_LLM_VOCAB`, `CAP_LLM_MODEL_INFO`
- Embeddings: `CAP_GENERATE_EMBEDDINGS`, `CAP_EMBEDDINGS_DIMENSIONS`
- Vision: `CAP_DESCRIBE_IMAGE`
- Models: download, list, status, contents, availability, path (URN builders in the standard library)
The associated payload media URNs:
| Media URN | Carries |
|---|---|
| `media:llm-generation-request;json;record` | An `LlmGenerationRequest`: prompt, model spec, sampling params, optional constraints |
| `media:llm-text-stream;ndjson` | An `LlmStreamMessage` stream: tokens, status, completion, tool requests, errors |
| `media:llm-vocab-response;json;record` | An `LlmVocabResponse` |
| `media:llm-model-info;json;record` | An `LlmModelInfo` |
The full request/stream/vocab/info shapes live in the SDK Reference. The dispatch rule that routes a generation request to the right backend (GGUF, MLX, or Candle) follows the same ranking logic as everything else: the request's `model_spec` and tags together determine the provider.
Document and content caps
Standard URN builders in the CapDAG library cover the common document operations:
- `disbind_urn(input_media)`: decompose a document into pages with text content
- `render_page_image_urn(input_media)`: render document pages as images
- `coercion_urn(source_type, target_type)`: generic format coercion
- `format_conversion_urn(in_media, out_media)`: JSON / YAML / CSV interconversion
- `generate_json_urn(lang_code)`, `make_decision_urn(lang_code)`, `make_multiple_decisions_urn(lang_code)`: structured-output LLM caps
For everything else (custom extraction, sentiment tagging, OCR, transliteration), cartridges define their own caps. The only requirements are that the URN parses (per CapDAG URN syntax) and that the input and output media URNs it references either exist in the registry or are proposed alongside the cartridge (see Contributing).
```
# Examples of valid custom caps
cap:in="media:text;sentiment-input;textable";out="media:sentiment-tag;textable";tag-sentiment
cap:in="media:png;bytes";analyze;target=faces;out="media:json;record"
cap:in="media:wav;bytes";transcribe;model=whisper;out="media:textable"
```
What a Cartridge Declares
Cartridges declare their capabilities in a CapManifest returned at startup. The manifest is what the host reads at install time to register the cartridge with the router.
The minimal Python shape looks like this (the Getting Started tutorial walks through every line):
```python
from capdag.bifaci.manifest import CapManifest, default_group
from capdag.cap.definition import Cap, CapArg, CapOutput, PositionSource, StdinSource
from capdag.standard.caps import CAP_IDENTITY
from capdag.urn.cap_urn import CapUrn, CapUrnBuilder


def build_manifest() -> CapManifest:
    # The custom cap. Its URN comes from the builder rather than a hand-written string.
    tag_sentiment = Cap(
        urn=(
            CapUrnBuilder()
            .marker("tag-sentiment")
            .in_spec("media:text;sentiment-input;textable")
            .out_spec("media:sentiment-tag;textable")
            .build()
        ),
        title="Tag sentiment",
        command="tag-sentiment",
    )

    # One required argument, read from stdin or from the first CLI position.
    tag_sentiment.args = [
        CapArg(
            media_urn="media:text;sentiment-input;textable",
            required=True,
            sources=[
                StdinSource("media:text;sentiment-input;textable"),
                PositionSource(0),
            ],
            arg_description="UTF-8 text to classify.",
        )
    ]
    tag_sentiment.output = CapOutput(
        media_urn="media:sentiment-tag;textable",
        output_description="One of 'positive', 'neutral', 'negative'.",
    )

    # The mandatory identity cap; the runtime supplies its handler.
    identity = Cap(
        urn=CapUrn.from_string(CAP_IDENTITY),
        title="Identity",
        command="identity",
    )

    return CapManifest(
        name="sentiment-tagger",
        version="0.1.0",
        channel="nightly",
        registry_url=None,
        description="Classify text as positive, neutral, or negative.",
        cap_groups=[default_group([identity, tag_sentiment])],
    )
```
A few rules the host enforces:
- `CAP_IDENTITY` is mandatory. Every cartridge's manifest must list it. The Python runtime auto-registers a handler; you just declare the cap.
- `channel` and `registry_url` are baked in. They must agree with the engine's compile-time configuration and with the on-disk install record (`cartridge.json`); a mismatch is rejected at discovery time.
- `cap_groups` are atomic registration units. A group succeeds or fails as a whole; the host uses groups to keep related caps consistent.
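The "succeeds or fails as a whole" rule for groups is an ordinary all-or-nothing pattern. The sketch below shows one way to get that behavior, assuming a registry that is just a list of `(cap URN, provider name)` pairs. The `register_group` function and the `is_valid` check are hypothetical stand-ins; what the host actually validates is not described on this page.

```python
from typing import Callable, List, Tuple

Registry = List[Tuple[str, str]]  # (cap URN, provider name), in registration order


def register_group(
    registry: Registry,
    provider: str,
    cap_urns: List[str],
    is_valid: Callable[[str], bool],
) -> bool:
    """Add every cap in the group to the registry, or add none of them."""
    staged = []
    for urn in cap_urns:
        if not is_valid(urn):
            return False  # the registry has not been touched yet
        staged.append((urn, provider))
    # Only commit once the whole group has passed validation.
    registry.extend(staged)
    return True


def looks_like_cap_urn(urn: str) -> bool:
    # Stand-in validation for the sketch; real validation is the host's business.
    return urn.startswith("cap:")


registry: Registry = []
good = ["cap:", 'cap:in="media:png;bytes";analyze;out="media:json;record"']
bad = ["cap:", "not-a-urn"]

print(register_group(registry, "image-analyzer", good, looks_like_cap_urn))
print(register_group(registry, "broken-cartridge", bad, looks_like_cap_urn))
print(len(registry))
```

Staging the entries before committing is what keeps a half-valid group from leaving stray caps behind.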
The Cap URN builder API (CapUrnBuilder().marker().in_spec().out_spec().build()) is the only safe way to construct a Cap URN — the runtime canonicalizes tags alphabetically, and a hand-written URN string almost certainly won’t match the canonical form. See the Getting Started tutorial for the build-once-and-keep-the-string pattern.
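To see why a hand-written string is fragile, consider a simplified normalizer that only sorts the top-level segments of a Cap URN alphabetically. This is an assumption-laden sketch, not CapDAG's canonical form: the `normalize` function is hypothetical, and the real rules may treat quoting, the tags inside media URNs, or the placement of `in` and `out` differently. In real cartridges the builder remains the way to obtain a canonical URN.

```python
import re

# Matches one top-level segment: runs of non-';' characters, where a
# double-quoted section may contain semicolons of its own.
_SEGMENT = re.compile(r'(?:[^;"]|"[^"]*")+')


def normalize(urn: str) -> str:
    """Sort top-level segments alphabetically (illustrative only)."""
    prefix, _, body = urn.partition(":")
    segments = _SEGMENT.findall(body)
    return prefix + ":" + ";".join(sorted(segments))


a = 'cap:in="media:wav;bytes";transcribe;model=whisper;out="media:textable"'
b = 'cap:model=whisper;out="media:textable";in="media:wav;bytes";transcribe'

print(a == b)                        # the raw strings differ
print(normalize(a) == normalize(b))  # the normalized forms agree
print(normalize(a))
```

Two strings that describe the same operation compare unequal until both are normalized, which is the failure mode the builder avoids.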
Routing in Practice
When MachineFabric needs a capability:
- It builds a request — a Cap URN describing what is needed.
- The router runs the dispatch predicate against every registered provider and collects the eligible ones.
- Among eligible providers, ranking picks the most specific.
- The selected cartridge is spawned (if not already running), the request is sent over the Bifaci protocol, and the response is streamed back.
- For multi-step requests, the planner finds the path through the capability graph and the executor runs each step in order, with parallel execution where the graph allows it.
If two providers match with equal specificity, the first registered wins. Because ranking prefers the most specific match, a specialized cartridge for a specific file type reliably overrides a more general one for that type, while the general cartridge stays available everywhere else.
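The selection steps above can be sketched in a few lines. Everything here is a simplified model: the eligibility test (exact input and output match, provider tags a subset of the request's tags) and the specificity measure (number of tags) are assumptions made for illustration, and `Provider`, `eligible`, and `select` are hypothetical names. The actual predicate and ordering are defined by the CapDAG Dispatch and Ranking pages.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    in_media: str
    out_media: str
    tags: FrozenSet[str]


def eligible(p: Provider, in_media: str, out_media: str, tags: FrozenSet[str]) -> bool:
    # Assumed rule: media must match exactly and the provider may not
    # require any tag the request does not carry.
    return p.in_media == in_media and p.out_media == out_media and p.tags <= tags


def select(providers: List[Provider], in_media: str, out_media: str,
           tags: FrozenSet[str]) -> Optional[Provider]:
    best = None
    for p in providers:  # iterated in registration order
        if not eligible(p, in_media, out_media, tags):
            continue
        # Strict '>' keeps the earlier provider on a tie: first registered wins.
        if best is None or len(p.tags) > len(best.tags):
            best = p
    return best


providers = [
    Provider("generic-stt", "media:wav;bytes", "media:textable", frozenset({"transcribe"})),
    Provider("whisper-a", "media:wav;bytes", "media:textable",
             frozenset({"transcribe", "model=whisper"})),
    Provider("whisper-b", "media:wav;bytes", "media:textable",
             frozenset({"transcribe", "model=whisper"})),
]

request_tags = frozenset({"transcribe", "model=whisper"})
print(select(providers, "media:wav;bytes", "media:textable", request_tags).name)
print(select(providers, "media:wav;bytes", "media:textable", frozenset({"transcribe"})).name)
```

With these inputs the whisper-specific request goes to the first whisper provider registered, while a plain transcription request still falls back to the generic one.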