You Don't Have a Content Problem. You Have a Pre-Production Problem.
shotkit: The Explainer · 90 seconds · the pre-production system we use to ship hundreds of videos a month, every platform, every aspect ratio · github.com/whystrohm/shotkit
The diagnosis. Vague brief plus model roulette equals brand drift.
You ship a brief. "One social video this week. Make it pop. Make it brandy." A teammate or a contractor opens Midjourney, then Flux, then Ideogram, then GPT Image, looking for the one that lands. Each generator rewards a different prompt syntax. Each output drifts in a slightly different direction.
By the fifth iteration, the brand colors have shifted. The framing has shifted. The on-screen text is fighting with the photography. Nothing is technically wrong. None of it feels like the brand.
You read the result and you do not say "the AI is broken." You say "we have a content problem." Then you double down on output volume, hoping consistency comes from quantity. It does not.
The problem is upstream of the generator. The brief is vague. There is no shared spec. There is no encoded brand state. There is no audit trail. The generator is doing what it always does: regressing toward the mean of whatever was popular last quarter. The drift is structural.
Who shotkit is for.
shotkit is built for one specific person: the founder, operator, or content lead who runs AI-assisted video for a real brand and is tired of the model-roulette tax.
You are in the target if you are nodding at any of these:
- You ship more than a few branded videos per month and the consistency keeps slipping.
- You bounce between Midjourney, Flux, Ideogram, and GPT Image looking for the right one and re-doing prompt syntax each time.
- You have a brand book on Notion that nobody references when the generator is open.
- You want to scale video output without scaling the founder bottleneck.
- You want to be able to answer "what brand version was this approved against" six months from now.
- You build inside Claude Code and you want skills that compose with the rest of your stack.
shotkit is not for hobbyists generating one image at a time. The structure is overhead for casual use. It is also not for teams that prefer SaaS storyboard tools with synchronous editing in a browser. It is for operators who think in files.
Score your current content infrastructure with the free /scan diagnostic →
Five layers locked top to bottom.
Every image prompt that ships from a serious AI video pipeline is composed from five locked layers. Change one layer, the others stay still. This is the architectural property that makes the whole pipeline maintainable as you scale.
1. Brand lock. The constant across the entire project. Palette as hex values. Typography with explicit weights. Mood adjectives that contrast against generic alternatives. A "never" list that rules out entire visual categories. Aspect ratios. Color grade direction. Voice rules. This file changes maybe twice a year for a stable brand.
2. Series lock. The constant across one storyboard. Character anchor (who is in frame). Environment. Lighting setup. Color grade specifics. This locks visual continuity across a single piece of content so the same person in shot one is the same person in shot seven, in the same room, under the same light.
3. Shot spec. Variable per shot. Framing, angle, motion, depth of field, subject, rationale. Each shot earns its individual identity within the series-locked world. Every shot has a one-sentence rationale. Why this beat. Why this duration. Why this framing.
4. Text layer. On-screen copy with its own font, color, position, animation, and timing. Never in the image prompt. Composited separately because text rendering in image generators is still imprecise in 2026, and because brand fonts at brand weights at brand colors only survive a separate compositing pass.
5. Generator adapter. Applied at compose time. Same shot data, different syntax. Midjourney rewards short high-signal phrases. Flux wants natural-language sentences. Ideogram is text-aware. GPT Image is paragraph-form. The adapter is the only layer this skill applies. Everything above it is input.
The mistake every storyboard SaaS makes is collapsing these five layers into one prompt string. It is fast for a single shot. It is fragile across a series. shotkit keeps the layers separate so a brand-color edit propagates to every shot, every prompt, every render in one move.
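Concretely, the separation looks something like this. The file names and field names below are invented for illustration; the templates that ship in the repo are the source of truth.

```yaml
# brand-lock — constant for the whole project (illustrative fields, not the shipped schema)
palette: ["#0E1B2C", "#2F6F6A", "#E8DCC8"]
typography: "Inter; headings 700, body 400"
never: ["neon gradients", "stock-photo smiles"]

# series-lock — constant across one storyboard
character_anchor: "founder, mid-30s, navy overshirt"
environment: "loft studio, soft morning light"

# shot spec — varies per shot
shot_03:
  framing: "medium close-up"
  motion: "slow push-in"
  rationale: "the reframe beat needs the viewer closer"

# text layer — composited separately, never sent to the image generator
overlays:
  shot_03: { copy: "You have a pre-production problem.", in: 1.2, out: 4.0 }
```

Edit the palette line once and every composed prompt downstream picks up the change on the next run.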
Brand-lock snapshots. Defense-grade audit trail.
Six months from now, somebody on your team is going to look at a video and say "what version of our brand was this approved against." The answer to that question is the difference between a content system and a content situation.
shotkit answers it by snapshotting the brand-lock file at the time of every storyboard run. The snapshot is dated. It references the source path. It is frozen. If the brand evolves, the snapshot stays valid against its original inputs. The video remains reproducible six months later, six brand revisions later, by the same team or a new one.
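In practice the snapshot is a small file. An illustrative header (format assumed, not the exact file shotkit writes):

```
source: brand-packs/whystrohm.md
captured: 2026-03-14
storyboard: output/storyboard.md
note: frozen at generation time; later brand revisions do not touch this file
```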
This is not novelty. It is the audit trail pattern from defense systems engineering applied to commercial content. Every artifact has a chain back to the source decision. You can answer "why does this look this way" by reading the file, not asking the designer.
Most teams skip this and do not feel the cost until the second or third year, when nobody can remember why a launch from March looks slightly off compared to a launch from June. By then, the chain is already broken. Read more on the audit trail pattern in The Founder Fingerprint, where we walk through the same discipline applied to brand voice.
What shotkit produces.
Run the four skills against a brief and a brand-lock. shotkit returns this:
```
output/
├── storyboard.md            # Human-readable, shot-by-shot
├── shots.json               # Schema-validated, machine-readable
├── text-overlays.json       # On-screen text + timing
├── brand-lock.snapshot.md   # Frozen brand state at generation time
├── prompts/                 # Per-generator prompts, copy-paste ready
│   ├── midjourney.txt
│   ├── flux.txt
│   ├── ideogram.txt
│   ├── gpt-image.txt
│   ├── nano-banana.txt
│   ├── seedream.txt
│   └── runway-sora.txt
└── preview.html             # Single file. Shareable. Printable. Brand-aware.
```
Files. Not panels. Not a SaaS dashboard. Files an editor, motion designer, or developer can act on without asking a follow-up question. The shots.json file drives image generation. The text-overlays.json drives the editorial timeline. The brand-lock snapshot is the audit anchor. The preview.html is the artifact you send to a stakeholder before a single image is generated.
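To make that concrete, a single entry in shots.json might look roughly like this (field names assumed for illustration; the schema that ships with the repo is authoritative):

```json
{
  "id": "shot_03",
  "duration_s": 4.0,
  "framing": "medium close-up",
  "angle": "eye level",
  "motion": "slow push-in",
  "depth_of_field": "shallow",
  "subject": "founder at standing desk, turning toward camera",
  "rationale": "the reframe beat needs the viewer closer to the subject"
}
```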
The prompts directory is the cross-generator part. Same shot data, seven syntax variations. Switch from Midjourney to Flux without redoing the storyboard. Switch from Flux to Ideogram for a text-in-image override. Run Seedream for high-volume cost-efficient series work. Same spec, different syntax.
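For a feel of what that syntax switch means, here is how one hypothetical shot might read in two of the prompt files (illustrative text, not output copied from the repo):

```
# prompts/midjourney.txt — short, high-signal phrases
founder at standing desk, medium close-up, slow push-in, shallow depth of field,
warm morning light, navy and teal palette, clean negative space --ar 9:16

# prompts/flux.txt — natural-language sentences
A medium close-up of a founder at a standing desk, captured with shallow depth of
field in warm morning light. A navy and teal palette, with clean negative space
left open for a separately composited text overlay. Vertical 9:16 frame.
```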
shotkit vs SaaS storyboard tools.
Most storyboarding tools on the market are SaaS apps with a UI you log into. The output never leaves the platform. shotkit takes the opposite stance.
The trade-off is real. SaaS tools are easier for synchronous review (a panel grid in a browser). shotkit is easier for shipping (files an editor, agency, or developer can act on without follow-up). Pick by your bottleneck. If your team prefers dashboards over files, Boords or Storyflow will fit better. If your team thinks in files, shotkit will fit faster.
How shotkit was used to make this film.
The 90-second explainer at the top of this post is itself a shotkit run. Six shots. One brand-lock snapshot. One storyboard.md. One shots.json. One text-overlays.json. The deterministic video composition that rendered the MP4 reads from the same shots data. Every beat in the video is annotated in the storyboard.md.
The artifact set is in the public repo at github.com/whystrohm/shotkit/tree/main/skills/storyboard-architect/examples/shotkit-explainer. Clone it, fork it, swap your brand-lock, ship your own version.
Install in 60 seconds.
Two commands. No accounts. No email gates. No API keys.
```
git clone https://github.com/whystrohm/shotkit.git
cd shotkit && ./install.sh
```
The install script copies the four skills into ~/.claude/skills/. Restart your Claude Code session and they trigger on natural-language prompts.
Once installed, this is what a real workflow looks like. Open Claude Code in any directory and type:
```
# Brief in
"30-second founder explainer for [your brand]. Pain reframe promise.
Use brand-packs/[your-brand].md. Aspect 9:16."

# Storyboard out (one minute later)
output/
├── storyboard.md
├── shots.json
├── text-overlays.json
├── brand-lock.snapshot.md
├── prompts/midjourney.txt
└── preview.html

# Per-generator prompts, copy-paste into the model of your choice
"Generate Flux prompts for these shots."

# Critique the rendered images against shot spec
"Critique this image against shot_03 and the brand-lock."
```
The skills compose because they cooperate by file format, not by import. Every step is a file boundary. Every artifact is auditable.
Companion tools.
shotkit composes with the rest of the WhyStrohm open-source ecosystem:
- media-tsunami extracts a brand-pack from your existing assets. Pairs with shotkit's brand-packs/ directory.
- whystrohm-audit scores your content against a 5-layer framework. Use it on the videos shotkit produces.
- whystrohm-voice-extract turns any URL into a structured voice profile.
- whystrohm-voice-scorer measures voice drift between site and social content.
- digital-twin codifies founder voice as a system prompt.
- ritual is the orchestration layer. Schedules the rest.
Frequently asked.
Does shotkit work with the image generator I already use?
Yes, if it is one of Midjourney, Flux, Ideogram, GPT Image, Nano Banana (Gemini 2.5 Flash Image), Seedream, or Runway. Each has a dedicated adapter. If you use a different generator, the same shot data is generator-agnostic markdown plus JSON, so writing your own adapter is two pages of pattern matching.
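If your generator is not on that list, the adapter really is small. A minimal sketch in Python, assuming the illustrative shots.json fields from earlier in this post (your adapter would read the shipped schema instead):

```python
import json


def compose_prompt(shot: dict, brand: dict) -> str:
    """Compose one prompt string from a shot spec plus brand-lock values.

    Field names are illustrative. Adjust the join style to whatever your
    generator rewards: short phrases, full sentences, explicit flags.
    """
    fragments = [
        shot["subject"],
        shot["framing"],
        shot["motion"],
        f"{shot['depth_of_field']} depth of field",
        f"palette {', '.join(brand['palette'])}",
    ]
    return ", ".join(fragments)


if __name__ == "__main__":
    # Assumes shots.json is a flat list of shot objects; check the shipped schema.
    with open("output/shots.json") as f:
        shots = json.load(f)
    brand = {"palette": ["#0E1B2C", "#2F6F6A"]}
    for shot in shots:
        print(compose_prompt(shot, brand))
```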
Do I need a specific video framework to use shotkit?
No. shotkit stops at storyboards and prompts. The output works with any video assembly path: manual editorial in After Effects, Premiere, or Resolve, or a deterministic programmatic video framework that reads JSON. The repo includes a video-pipeline bridge document for teams that want to wire it up themselves, and a demo composition that renders the README explainer GIF.
How is shotkit different from Boords, Storyflow, or LTX Studio?
SaaS storyboard tools live in a browser dashboard and produce panels you log in to view. shotkit produces files (markdown, JSON, HTML) that an editor, motion designer, or developer can act on without follow-up. Plus shotkit is generator-agnostic, brand-lock-snapshotted, and Apache 2.0. See the comparison above.
Is shotkit free? Will it stay free?
Yes and yes. shotkit is Apache 2.0. The methodology is what we publish. The operated pipeline (running generators, managing rendering, scheduled publishing across many brands) is the WhyStrohm commercial offering. Two layers, two audiences, no overlap.
Can I use shotkit without Claude Code?
You can. Each skill is a Markdown specification with YAML frontmatter and reference docs. Other agents that support the SKILL.md open standard work too. The brand-lock files, shots.json schema, and templates are all framework-agnostic. Claude Code is the primary install target because it auto-discovers skills from ~/.claude/skills/.
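For orientation, the frontmatter on a skill file is only a few lines. An illustrative sketch (the shipped skills define their own names and descriptions):

```yaml
---
name: storyboard-architect
description: Turn a brief plus a brand-lock into a shot-by-shot storyboard, shots.json, text overlays, and per-generator prompts.
---
```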
How do I create a brand-pack for my brand?
Three options. Hand-write one using the template in brand-packs/_template.md, which is 90 lines and covers identity, palette, typography, mood, never list, aspect ratios, color grade, motion language, and voice rules. Or extract one from your existing assets using media-tsunami. Or fork brand-packs/whystrohm.md as a starting point and edit.
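Whichever route you take, the pack stays readable. An illustrative excerpt (headings follow the template's coverage list; your values will differ):

```markdown
## Palette
- Primary: #0E1B2C (deep navy)
- Accent: #2F6F6A (muted teal)

## Never
- Neon gradients
- Stock-photo smiles
- Text baked into generated imagery
```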
What does the audit trail actually buy me?
Reproducibility. Every storyboard freezes the brand-lock state at run time. Six months later, when the brand has refreshed twice and a stakeholder asks "what version of our brand was this approved against," the answer is in the snapshot file, not a Slack thread. This is the same defense-grade discipline applied to commercial content.
The boundary.
shotkit stops at specs and prompts. It does not call image-generation APIs. It does not run a video render pipeline. It does not publish to social. The boundary is deliberate.
Generators churn monthly. Flux 2 Pro replaced Flux 1.1 Pro inside a quarter. Seedream 4.5 dropped right after 4.0. If shotkit hard-coded any specific API integration, half of it would be broken every quarter. By stopping at prompts and specs, the methodology stays stable across the generator landscape.
The full pipeline (image generation, video rendering, automated publishing, brand-aware monitoring across many active brands) is what WhyStrohm runs commercially as managed content infrastructure. The methodology is open. The operator is paid.
Take the next step.
If you want to see what your brand looks like running through the pre-production layer, install shotkit and feed it your own brief. Two commands. Sixty seconds. Hundreds of videos a month, every platform, every aspect ratio. At scale.
Open shotkit on GitHub → Score your content infrastructure →
Or if you want the operated version, where one operator runs the full pipeline (voice extracted, brand encoded, video rendered, publishing scheduled), see whystrohm.com/pricing and the case files at whystrohm.com/results.
Related reading: The Founder Fingerprint on brand voice as code. The Content Spiral on accelerating-zoom video pacing. Founder activity vs founder infrastructure on what to systematize first.
shotkit v0.1.0 ships May 2026. Apache 2.0. Built by WhyStrohm. Hundreds of videos a month, every platform, every aspect ratio. At scale.
Free in 10 seconds
Find out what's costing you time, trust, and conversions.
The WhyStrohm Content Audit scores your published content against 5 layers of infrastructure-grade standards. Vocabulary. Structure. Proof density. Voice consistency. Buyer alignment. You get a number, the exact quotes that earned it, and a live rewrite of your weakest piece.
Or reach out directly.
Tell me about your brand: name, email, and one line. I'll get back to you within 24 hours.