Strategic Analysis

Stylify: AI Provider Resilience & Dependency Strategy

Executive Summary

Stylify operates across three distinct AI dependencies: (1) Anthropic Claude API for caption generation (product), (2) Claude CoWork for operations/strategy, and (3) Claude Code for engineering and social media execution. The good news: none of these are as exposed as they first appear. The bad news: there are real single points of failure, and one is live right now: a model deprecation that may already be causing production errors.

Bottom line: Immediate action required on the Haiku model deprecation (likely breaking captions now). Medium-term resilience is mostly there already. No strategic pivot needed, but two tactical improvements would reduce risk significantly.

CRITICAL — Immediate Action Required:
Model audit reveals Anthropic's Claude Haiku 3.5 was retired February 19, 2026 (9 days ago). If caption generation still references this model, captions are currently failing silently. This is not hypothetical — this is likely happening right now. Stitch needs to verify the hardcoded model string in backend/src/services/contentService.js immediately.

1. Risk Assessment: Current State

What We're Actually Dependent On

The phrase "dependent on Anthropic" collapses three very different things:

These require different risk evaluations because the impact profile is completely different.

Severity and Likelihood Assessment

| Component | Outage Type | Likelihood (Annual) | Impact | Recovery Time |
|---|---|---|---|---|
| Caption generation API | Service outage (Anthropic), network, or client error | 3-5% downtime (11-18 days/year) | Stylists can't create posts; revenue at risk | Minutes to hours (depends on root cause) |
| Claude Code availability | Service outage or extended downtime | 2-4% downtime (7-14 days/year) | Feature work halts; ops continues; no revenue impact (pre-revenue startup) | Hours (rebuild context) |
| Claude CoWork availability | Service outage or extended downtime | 2-4% downtime (7-14 days/year) | Strategy/operations paralyzed; handovers can't happen | Hours to days (depends on context recovery) |
| Model deprecation | Retirement without a clear migration path | Historically ~1 every 6-12 months | Sudden breakage; code changes required | Minutes to days (depends on discovery time) |

Reality Check: Anthropic's availability is exceptionally high. Claude CoWork and Claude Code go down less frequently than our internet connection. But model deprecations happen on a predictable schedule — Anthropic retires models every 6-12 months — and we've already missed one. This is not a freak risk; it's a recurring maintenance task.

2. What Already Works in Our Favor

Built-In Resilience We Already Have

The OpenAI fallback is your biggest existing protection. The architecture already specifies Claude Haiku 4.5 as primary with GPT-4o-mini as fallback for caption generation (the pattern is sketched below). This means:

  - An Anthropic outage degrades the caption feature rather than taking it down.
  - The response to a provider incident is verifying that the fallback engaged, not shipping emergency code.
  - The remaining work is verification (see Gap 2), not architecture.

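As an illustration of this primary/fallback pattern, here is a minimal sketch in Node.js. It assumes the service wraps both SDKs behind a single generateCaption call; the function name, env variables, prompt shape, and model IDs are placeholders, not Stylify's actual contentService.js code:

```js
// caption-fallback-sketch.js -- illustrative only; names and model IDs are assumptions.
import Anthropic from "@anthropic-ai/sdk";
import OpenAI from "openai";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Model IDs come from env so a deprecation is a config change, not a code change.
const PRIMARY_MODEL = process.env.CAPTION_MODEL_PRIMARY ?? "claude-haiku-4-5"; // assumed ID
const FALLBACK_MODEL = process.env.CAPTION_MODEL_FALLBACK ?? "gpt-4o-mini";

export async function generateCaption(prompt) {
  try {
    const res = await anthropic.messages.create({
      model: PRIMARY_MODEL,
      max_tokens: 300,
      messages: [{ role: "user", content: prompt }],
    });
    return { provider: "anthropic", text: res.content[0].text };
  } catch (err) {
    // Any primary failure (outage, retired model ID, rate limit) falls through here.
    console.error("Primary caption model failed, using fallback:", err.message);
    const res = await openai.chat.completions.create({
      model: FALLBACK_MODEL,
      messages: [{ role: "user", content: prompt }],
    });
    return { provider: "openai", text: res.choices[0].message.content };
  }
}
```

The design point is the catch-all fallback: a retired model ID fails the same way an outage does, so one code path covers both.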
Your agent coordination system is provider-agnostic by design. DECISIONS.md, handovers, inbox, session-context.md, and CLAUDE.md are plain markdown files on OneDrive. These work regardless of which AI tool is running the agents. This is not accidental; it is good architecture: the coordination layer has no hard dependency on any single vendor's tooling.

You have a documented model audit process. Charlotte's weekly platform risk check (Feb 28 inbox file) explicitly monitors Anthropic API changes and model deprecations. This is exactly the discipline that catches problems like the Haiku 3.5 retirement early.

You operate at the pre-revenue stage. If Claude Code or Claude CoWork goes down, it delays feature work. That's painful but not a business emergency (no paying customers are blocked). The only revenue-blocking dependency is the caption generation API, and even that has a fallback.

What This Means

Your actual resilience posture is stronger than "concentrated on Anthropic" suggests. You've built in the most important safety nets (fallback API provider, provider-agnostic coordination, documented risk checks). The risk isn't catastrophic or even high at your current stage.

3. Remaining Gaps: The Real Single Points of Failure

Gap 1 (ACTIVE NOW): Model deprecation discovery lag.

Haiku 3.5 was retired Feb 19. Caption generation likely broke Feb 19-28 (9 days of silent failures). The weekly platform check caught it, but only on Feb 28, after more than a week of a potentially broken production feature. The gap is between "model is retired" and "we discover we need to change code." At this stage, that's acceptable. But as you grow, a discovery lag of a week or more becomes unacceptable.
Gap 2: Fallback testing & verification.

The OpenAI fallback is designed in, but there's no evidence it's been tested end-to-end. If caption generation fails and the system tries to fall back to GPT-4o-mini, will it work? Will it produce acceptable quality? Is the cost model acceptable? This gap is moderate — it's straightforward to close, but it needs to happen before you have paying customers.
Gap 3: Tooling layer portability untested.

Your coordination files (CLAUDE.md, handovers, inbox) are provider-agnostic in theory. But in practice, they're designed for Claude. If you needed to swap Charlotte out for a different agentic tool, would the handover format work? Would the inbox system work? The files might be portable, but the process is Claude-specific. This gap is minor at pre-launch but grows with scale.
Gap 4: No automated model deprecation alerts.

Charlotte's weekly check is manual (task-based). It's good, but manual processes fail if the task is skipped or the interval changes. Anthropic publishes deprecation notices in their API docs, but Stylify doesn't have an automated way to detect "one of our hardcoded models is about to retire." This is a low-probability gap (the weekly check is reliable), but it's fixable.

4. Strategic Recommendation: What to Do & When

Phase 1: Immediate (This Week)

Action: Stitch audits and fixes the Haiku model string. This is non-negotiable; a sketch of the fix follows the cost notes below.

Cost of action: 30 minutes of Stitch's time.

Cost of inaction: Captions silently fail for new users. When Meta App Review starts testing, this becomes a blocking issue.
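
For the audit itself, a minimal sketch of what the fix can look like, assuming contentService.js currently bakes the retired ID into the source (the constant name and model IDs below are placeholders, not the real file contents):

```js
// contentService.js (sketch of the fix) -- IDs are assumptions; verify against Anthropic's docs.
// Before: the retired model hardcoded inline.
//   const MODEL = "claude-3-5-haiku-20241022"; // retired Feb 19
// After: current model, overridable via env, so the next retirement
// is a config change rather than a code deploy.
const MODEL = process.env.CAPTION_MODEL ?? "claude-haiku-4-5"; // assumed current ID
```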

Phase 2: Before Production (Next 2-3 Weeks)

Action: Test the OpenAI fallback end-to-end. This is low-risk but essential; a smoke-test sketch follows the cost notes below.

Cost of action: 2-3 hours, small API cost for testing.

Cost of inaction: Fallback is untested. If Anthropic actually goes down, you'll discover the fallback doesn't work mid-incident.
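
One way to make that test concrete, sketched under the assumption that the fallback is wrapped as in the earlier generateCaption sketch (file names and model IDs here are illustrative):

```js
// fallback-smoke-test.js -- illustrative; builds on caption-fallback-sketch.js above.
// Force the primary path to fail, then verify the OpenAI fallback end-to-end.
process.env.CAPTION_MODEL_PRIMARY = "claude-model-that-does-not-exist"; // guaranteed failure

// Dynamic import so the env override above is picked up at module load.
const { generateCaption } = await import("./caption-fallback-sketch.js");

const result = await generateCaption(
  "Write a short Instagram caption for a balayage transformation."
);

if (result.provider !== "openai" || !result.text?.trim()) {
  throw new Error("Fallback path failed: " + JSON.stringify(result));
}
console.log("Fallback OK. Sample caption:\n" + result.text);
```

Run it once now and again before launch; judging whether the sample caption's quality is acceptable is the human half of the test.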

Action: Document model deprecation schedule & set calendar reminders.

Cost of action: 15 minutes.

Cost of inaction: Another model deprecation surprise in ~50 days. The pattern repeats every 6-12 months indefinitely.

Phase 3: Scale (6+ Months, After Product Launch)

Do NOT do this now. Only pursue either option if its trigger actually materializes:

  1. Option A's trigger: model deprecations start blocking work despite the weekly manual check, or you're running multiple models.
  2. Option B's trigger: provider lock-in becomes a live concern, for example after extended Anthropic outages or a genuine need to move tooling.

Option A: Automated Model Deprecation Watcher

Write a lightweight script (sketched below) that:

  - keeps a small, hand-maintained file mapping the model IDs Stylify uses to their published retirement dates, copied from Anthropic's deprecation docs;
  - scans the codebase for those model strings; and
  - alerts (non-zero exit code, email, or an inbox file) when a referenced model is within ~30 days of retirement.

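A minimal sketch of such a watcher, assuming a hand-maintained deprecations.json and the backend/src layout named earlier (file names, paths, and the 30-day window are all assumptions):

```js
// model-watcher-sketch.js -- illustrative only. Retirement dates must be copied
// by hand from Anthropic's published deprecation notices into deprecations.json.
import { readFileSync, readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// deprecations.json (hand-maintained), e.g.:
//   { "claude-3-5-haiku-20241022": "2026-02-19", "claude-haiku-4-5": null }
const schedule = JSON.parse(readFileSync("deprecations.json", "utf8"));
const WARN_DAYS = 30;

// Collect every .js source file under backend/src (assumed layout).
function walk(dir, files = []) {
  for (const name of readdirSync(dir)) {
    const p = join(dir, name);
    if (statSync(p).isDirectory()) walk(p, files);
    else if (p.endsWith(".js")) files.push(p);
  }
  return files;
}

const source = walk("backend/src").map((f) => readFileSync(f, "utf8")).join("\n");
const now = Date.now();
let failures = 0;

for (const [model, retiredOn] of Object.entries(schedule)) {
  if (!source.includes(model) || !retiredOn) continue; // unused, or no retirement announced
  const daysLeft = Math.floor((Date.parse(retiredOn) - now) / 86_400_000);
  if (daysLeft < 0) {
    console.error(`RETIRED: ${model} was retired on ${retiredOn} and is still referenced.`);
    failures++;
  } else if (daysLeft <= WARN_DAYS) {
    console.warn(`WARNING: ${model} retires in ${daysLeft} days (${retiredOn}).`);
    failures++;
  }
}
process.exit(failures ? 1 : 0);
```

Wire it into cron or CI on a weekly schedule and the manual platform check gets a safety net instead of being the only line of defense.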
Cost: ~4-6 hours for Stitch to build. Ongoing: minimal (automated).

Value: Reduces discovery lag from 7 days to near-zero. Only worth it if you're running multiple models or if model deprecations become blocking.

Option B: Multi-Tool Orchestration (Overseer + Executor Pattern)

Have Stitch verify that a non-Claude coding assistant could theoretically pick up the work using the CLAUDE.md handover format. This is less about switching now and more about knowing you could switch if you needed to.

Cost: Zero now. ~2-3 hours post-launch if needed (testing a different tool with your existing docs).

Value: Psychological safety + optionality. Moderate if provider lock-in feels existential; minimal if you're happy with Claude.

5. Multi-Provider Orchestration: Is It Worth It?

Jason asked whether there's a strategic case for an "overseer + executor" architecture: one AI managing and reviewing while another executes. Here's the honest answer:

The Case For

  - Tooling-layer redundancy: an extended Claude Code or Claude CoWork outage wouldn't halt all agent work.
  - Optionality: knowing a non-Claude tool could pick up the handover docs (Option B above) buys psychological safety against lock-in.

The Case Against (Much Stronger)

  - Operational complexity tax: two providers mean two sets of context, conventions, billing, and failure modes to maintain.
  - Marginal benefit: the realistic resilience gain is maybe 1-2% better uptime, at a stage where tooling downtime blocks no revenue.
  - Speed: the current all-Claude setup keeps context loops tight; splitting oversight and execution across providers would slow a pre-launch team down.

Bottom Line on Multi-Provider Orchestration

Not recommended for Stylify at this stage. Your current setup — all Claude, tight context loops, single-provider coordination — is simpler, faster, and good enough for a pre-launch startup. The marginal resilience gain (maybe 1-2% better uptime) doesn't justify the operational complexity tax.

Revisit this in 12-18 months if: (a) Claude Code becomes a bottleneck for feature shipping, (b) Anthropic has multiple extended outages, or (c) you're managing a team where different tools have genuine performance advantages. Today, none of those are true.

The exception: For the caption generation API, the GPT-4o-mini fallback is already the right call. Keep it, test it, maintain it. The product layer is different from the tooling layer.

Conclusion & Action Summary

What This Means for Stylify Right Now

You are not dangerously concentrated on Anthropic. You have good bones: a working fallback, provider-agnostic coordination, and a documented risk process. But you have one active emergency (the Haiku model audit) and several low-to-medium gaps that are easily fixable.

Priority ranking for Jason:

  1. Immediate (this week): Have Stitch audit and fix the caption generation model string. This is likely a production break.
  2. Before launch (2-3 weeks): Test the OpenAI fallback end-to-end. Verify it works and quality is acceptable.
  3. Calendar: Set a reminder for April 1 to audit for Claude 3.x models (retirement April 19).
  4. Optional (6+ months): Consider automated model deprecation watcher if manual platform checks become unreliable.
  5. Not recommended (now): Multi-provider orchestration. Revisit in 12-18 months if conditions change.

The real lesson: Provider dependency risk comes from neglect (missing model retirements), not from choice (using Claude). Your platform check and handover system catch drift early. Keep that discipline, and you're fine.