/optimize - Performance and Efficiency
Created: March 27, 2026 | Modified: March 27, 2026
This is Part 6 of a 10-part series on cAgents. Previous: /org - Cross-Domain Orchestration | Next: /debug - Systematic Debugging
You've built the thing. It works. Then you check the Lighthouse score and it's a 43, or you look at your email analytics and nobody's clicking anything, or your site is taking four seconds to load on a phone. The problem is real but scattered - images, scripts, copy, config - and you're not sure which fix will actually move the needle.
/optimize is built for this situation. It detects performance and efficiency issues, measures a baseline before touching anything, applies fixes, and then verifies the improvement. If a fix makes things worse, it rolls back automatically. You end up with a before/after report showing exactly what changed and by how much.
The key difference from just running /run "fix the performance" is the measurement. /optimize doesn't guess - it proves.
When to Use This
Use /optimize when:
- A page loads slowly and you need to know what's actually causing it
- Bundle size has grown and you're not sure what to cut
- Database queries are taking longer than they should
- Email open rates or click-through rates are underperforming and you want data-driven improvements
- You've made changes and want to confirm nothing got slower
Use /run instead when the fix is obvious and you don't need before/after proof - a typo in a query, a known dependency to remove, a config value to change. /run is faster for clear, contained fixes.
Use /optimize when the problem is real but diffuse, you need to know which changes are worth making, or you need to show measurable improvement to someone else.
How It Works
/optimize runs a structured five-step loop:
- Detect - Scans for performance issues across the relevant domain (page weight, render-blocking resources, query plans, content quality signals, etc.)
- Measure baseline - Captures current metrics before any changes: load times, bundle sizes, Lighthouse scores, engagement rates, whatever is relevant to the target
- Apply fixes - Implements improvements in order of expected impact, working from highest to lowest
- Verify improvement - Re-measures after each fix to confirm it helped
- Rollback if worse - If a fix doesn't improve the metrics, it's reverted automatically before moving on
The result is a report that shows what was found, what was changed, what improved, and what was rolled back. You don't have to take the agent's word for it - the numbers are in the report.
This loop is what separates /optimize from other commands. /run executes your instructions. /optimize executes, measures, and proves.
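The control flow of that loop is simple enough to sketch. Here's a minimal illustration in Python, assuming hypothetical fix objects with an `expected_impact` attribute and a `measure()` callback -- this is the shape of the loop, not the actual cAgents implementation:

```python
# Illustrative sketch of the detect -> baseline -> fix -> verify -> rollback
# loop. Fix objects and measure() are hypothetical stand-ins, not cAgents APIs.

def optimize(fixes, measure, lower_is_better=True):
    """Apply fixes in impact order; revert any fix that doesn't help."""
    report = []
    baseline = measure()
    for fix in sorted(fixes, key=lambda f: f.expected_impact, reverse=True):
        fix.apply()
        after = measure()
        improved = after < baseline if lower_is_better else after > baseline
        if improved:
            report.append((fix.name, baseline, after, "kept"))
            baseline = after          # the improvement becomes the new baseline
        else:
            fix.rollback()            # revert before moving to the next fix
            report.append((fix.name, baseline, after, "rolled back"))
    return report
```

The important detail is that each fix is verified against the *updated* baseline, so a fix that only looks good because an earlier fix helped gets caught and reverted on its own merits.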
Walkthrough -- Optimizing Site Performance
The site you launched with /org is live. Traffic is coming in, but users are bouncing fast. You pull up PageSpeed Insights and see an LCP of 4.8 seconds and a performance score of 51. Time to find out what's actually slow.
/optimize "improve site performance - LCP is 4.8s, Lighthouse score 51"
The optimizer starts by scanning the site:
[detect] Scanning for performance issues...
→ 14 images without compression or modern formats
→ hero image (1.4MB PNG) loaded eagerly, above the fold
→ JavaScript bundle: 842KB uncompressed, 3 unused dependencies
→ render-blocking stylesheet from third-party font loading
→ no lazy loading on below-fold images
[baseline] Measuring current metrics...
→ LCP: 4.8s
→ TBT: 620ms
→ Lighthouse performance: 51
→ Page weight: 2.1MB
[fixing] Applying improvements by impact...
→ [1] Convert hero image to WebP, add width/height attrs
→ [2] Enable lazy loading on 13 below-fold images
→ [3] Remove 3 unused dependencies (moment.js, lodash, jquery)
→ [4] Switch font loading to font-display: swap
→ [5] Add resource hints (preconnect) for external assets
[verify] Re-measuring after fixes...
→ LCP: 1.9s ✓ (was 4.8s - 60% improvement)
→ TBT: 180ms ✓ (was 620ms - 71% improvement)
→ Lighthouse performance: 87 ✓ (was 51)
→ Page weight: 0.7MB ✓ (was 2.1MB - 67% reduction)
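The percentages in the verify step are just the delta over the baseline, for lower-is-better metrics:

```python
def improvement(before, after):
    """Percent improvement for a lower-is-better metric (LCP, TBT, page weight)."""
    return round((before - after) / before * 100)

print(improvement(4.8, 1.9))   # LCP: 60
print(improvement(620, 180))   # TBT: 71
print(improvement(2.1, 0.7))   # page weight: 67
```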
The before/after report is saved to the session's outputs/ subdirectory. You don't need to find it yourself - ask Claude "show me the optimization report" and it will pull it up. Every fix is listed with the metric it affected. The hero image conversion alone accounted for most of the LCP improvement - useful to know if you're deciding where to invest time manually in the future.
If the font-display swap had caused layout shift (CLS regression), it would have appeared in the rollback list instead of the fixes list.
Walkthrough -- Optimizing Email Engagement
You built the content with /team, and the cross-domain launch via /org brought in traffic. But the email list isn't converting - open rates are around 18% and click-through is under 2%. Industry average for your niche is 26% opens and 4% CTR.
/optimize "improve email engagement - open rate 18%, CTR 1.8%, target is 26% and 4%"
The optimizer audits the content signals it can analyze:
[detect] Scanning content for engagement issues...
→ 6 of 7 email subject lines use sentence case with no urgency signal
→ average subject line length: 61 characters (optimal: 35–50)
→ CTAs in 4 emails are generic ("Read more", "Click here")
→ blog post headlines scored low on specificity (no numbers, no outcomes)
→ no personalization tokens used in any emails
[baseline] Capturing current metrics...
→ email open rate: 18.2% (7 campaigns)
→ click-through rate: 1.8%
→ top-performing subject: "The 5 tools we use every day" - 31% open rate
→ lowest-performing: "Our latest blog post is live" - 11% open rate
[fixing] Applying improvements...
→ [1] Rewrite 6 subject lines: shorter, specific, outcome-focused
→ [2] Rewrite 4 CTAs: action-specific ("Get the template", "See the full setup")
→ [3] Rewrite 3 blog headlines: add specificity, expected outcome
→ [4] Add first-name personalization token to subject lines
[verify] Projected improvement based on best-performing baseline...
→ subject lines: avg length reduced 61 → 42 chars, specificity score +40%
→ CTAs: action specificity improved across 4 emails
→ A/B testing framework added for next campaign
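The detect signals here are lint-style heuristics. A minimal sketch of that kind of check, using the thresholds from the scan above (the function and its rules are illustrative, not the optimizer's actual scoring):

```python
# Illustrative email lint using the signals from the detect step:
# subject length outside 35-50 chars, no specificity (numbers), generic CTAs.
GENERIC_CTAS = {"read more", "click here"}

def lint_email(subject, cta):
    """Return a list of engagement issues for one email."""
    issues = []
    if not 35 <= len(subject) <= 50:
        issues.append(f"subject length {len(subject)} outside 35-50")
    if not any(ch.isdigit() for ch in subject):
        issues.append("no number/specificity signal in subject")
    if cta.strip().lower() in GENERIC_CTAS:
        issues.append(f"generic CTA: {cta!r}")
    return issues
```

Running the lowest performer from the baseline ("Our latest blog post is live" with a "Read more" CTA) through a check like this trips all three rules, which matches why it sat at an 11% open rate.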
The report includes a comparison table - original subject line vs. rewritten, with the signals that informed each change. You can review every edit before it goes anywhere. The optimizer doesn't push anything to your email platform; it rewrites the files in your content directory and flags what changed.
What /optimize gives you here is evidence-based rewrites grounded in your own best performers, plus the framework to measure the next campaign properly.
Walkthrough -- Improving Student Engagement Metrics
The /org-coordinated rollout is done -- the department approved the standards alignment, the librarian prepped the resources, and parents received the welcome packet. Unit 1 is complete. Canvas LMS analytics are in, and the picture is uneven: structured debates had 92% student participation, but guided reading activities hit only 41% completion. Research days landed somewhere in the middle at 68% on-task (per Maya's walk-around tallies). Something about the passive activities isn't landing.
/optimize "Student engagement is uneven across activity types. Debate
activities: 92% participation. Guided reading: 41% completion. Research
days: 68% on-task (per walk-around tally). Optimize the lesson plan
templates for Units 2-6 to improve engagement on the low-performing
activity types."
The optimizer starts by analyzing what separates the high-performing activities from the low ones:
[detect] Scanning activity structures for engagement patterns...
→ Debate activities are active: students take positions, respond to peers
→ Guided reading is passive: read-then-answer, individual, no collaboration
→ Research days lack structure: no milestones, students don't know
what "done" looks like for a research session
→ High-engagement activities share two traits: social interaction
and clear deliverables within the class period
[baseline] Current engagement metrics...
→ Debates: 92% participation
→ Guided reading: 41% completion
→ Research days: 68% on-task
[fixing] Applying improvements by expected impact...
→ [1] Restructure guided reading into jigsaw format: each student
reads a different section, then teaches their small group.
Adds social accountability and a clear deliverable (the teach-back).
→ [2] Add research day checklists with milestone deliverables:
"By end of class, have 3 sources identified, 1 primary source
quoted, and a thesis statement drafted."
→ [3] Add exit-ticket questions to every lesson plan as a lightweight
daily engagement signal Maya can track in Canvas.
[verify] Producing measurement framework...
→ Revised lesson plan templates saved for Units 2-6
→ Comparison rubric created: measure same metrics on Units 2-3
to establish before/after across the same activity types
→ Exit tickets provide daily data points instead of end-of-unit only
[rollback] Rejected: replacing independent reading with audio versions.
→ Reasoning: the reading skill itself is a learning objective, not
just a content delivery mechanism. Swapping to audio would hit
engagement targets but undermine a pedagogical goal. The optimizer
flagged this as "metric improvement at the cost of the actual purpose."
The engagement gap wasn't about the content -- it was about the activity structure. The debates worked because students had to do something with someone by a deadline. The reading activities had none of those three elements. The jigsaw restructure adds all three without changing the reading itself.
Live classroom engagement can't be measured from a terminal. What the optimizer produces instead is the revised templates and a measurement framework so Maya can compare Unit 2's numbers against Unit 1's using the same Canvas analytics. The exit tickets give her a daily signal instead of waiting for end-of-unit data.
Walkthrough -- Improving Food Cost Percentages
The /org-coordinated launch went smoothly -- operations, marketing, and staffing all hit their marks for opening week. Revenue is tracking to plan. But when David pulls the numbers, food costs are running at 38% of revenue. His Midtown location runs at 29%. That nine-point gap is eating his margin.
/optimize "Food cost percentage at Decatur is 38% vs. 29% target. Have POS
sales data, weekly inventory reports, and kitchen waste logs for 5 weeks.
Need to identify where the money is going and fix it."
The optimizer cross-references the three data sources to isolate the cost drivers:
[detect] Analyzing cost variance between locations...
→ Protein over-portioning on 4 dishes: new cooks are serving 20-30%
above target weights (generous, but expensive)
→ Plantain prep waste: 31% at Decatur vs. 18% at Midtown.
Inexperienced prep technique -- too much trimmed, usable
portions discarded
→ 2 low-selling dishes (egusi soup, pepper soup) have expensive
ingredients sitting in inventory and expiring weekly
[baseline] Current metrics...
→ Food cost: 38.2% of revenue
→ Waste rate: 14% (Midtown runs 8%)
→ Per-plate protein cost: $2.40 above target
→ Inventory spoilage: $340/week on low-sellers' ingredients
[fixing] Applying improvements by cost impact...
→ [1] Create portion-weight reference cards with photos for each
protein dish. Post at every station. Target: bring per-plate
protein cost down $2.40 → within $0.50 of target.
Projected impact: 4 percentage points of food cost.
→ [2] Add plantain prep technique module to training manual with
visual guides (cross-section photos showing correct trim lines).
Projected impact: 1.5 percentage points from reduced waste.
→ [3] Remove 2 low-sellers from menu. Replace with items using
shared ingredients already in regular inventory rotation.
Projected impact: eliminates $340/week spoilage (~1 percentage point).
[verify] Projected improvement...
→ Food cost after fixes: 31-32% (down from 38.2%)
→ Protein portioning alone: 38.2% → ~34% (largest single lever)
→ Waste reduction: 34% → ~32.5%
→ Menu swap: 32.5% → ~31.5%
→ Gap to Midtown's 29%: remaining 2-3 points attributed to new-kitchen
learning curve, expected to close over next 8 weeks as team
builds speed and consistency
[rollback] Rejected: raising prices on the over-portioned dishes.
→ Reasoning: current prices match the Midtown menu. Inconsistent
pricing between locations creates customer trust issues --
regulars who visit both spots will notice. The brand promise
is the same food at the same price. Fix the portions, not the prices.
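The projection in the verify step is straightforward arithmetic: start at the baseline and subtract each fix's projected percentage-point impact in order. A quick check of the numbers above (the per-fix impacts are the optimizer's projections, not measurements):

```python
baseline = 38.2   # current food cost, % of revenue
fixes = [
    ("portion reference cards", 4.0),   # projected percentage-point impact
    ("plantain prep training",  1.5),
    ("menu swap",               1.0),
]

cost = baseline
for name, points in fixes:
    cost -= points
    print(f"after {name}: {cost:.1f}%")
# lands at 31.7% -- inside the projected 31-32% range; the remaining
# gap to Midtown's 29% is learning curve, not process
```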
Three problems, three fixes, each with a projected dollar impact. The protein portioning is the biggest lever -- those reference cards with photos cost nothing to produce and address the single largest cost driver. The plantain prep module goes into the training manual that /run built during the Decatur setup. The menu swap is the smallest impact but also eliminates a recurring spoilage cost that compounds weekly.
The remaining 2-3 point gap to Midtown's 29% is expected. New kitchens are slower, and slower means more labor cost per plate. That gap closes with experience, not process changes. The optimizer identifies it but correctly doesn't try to "fix" it.
Walkthrough -- Improving Steam Page Conversion Rate
The /org-coordinated launch week went as planned -- press embargo, store page, trailer, and community events all landed in sequence. The game has had 45,000 Steam page visits. But only 1,800 people bought it -- a 4% conversion rate. For indie narrative games, 7-10% is the benchmark. Steam analytics show where visitors are dropping off: 60% don't scroll past the first screenshot, and of those who do scroll, 80% leave at the "About This Game" section.
/optimize "Steam page conversion rate is 4%, target is 7-10%. 45,000 visits,
1,800 sales. Steam analytics show: 60% of visitors don't scroll past the
first screenshot. Of those who scroll, 80% leave at the 'About This Game'
section without reaching the reviews."
The optimizer treats the Steam page as a conversion funnel and analyzes each drop-off point:
[detect] Scanning page content against drop-off data...
→ First screenshot is a landscape establishing shot -- atmospheric but
tells the visitor nothing about what they'd actually do in the game.
60% of visitors see this and leave.
→ "About This Game" section opens with 3 paragraphs of world lore
before mentioning gameplay mechanics. Visitors who scroll this
far are interested but still don't know what the game IS.
→ No animated content (GIFs or short clips) above the fold.
Competing narrative puzzle games average 2 GIFs in their first
5 screenshots.
[baseline] Current metrics...
→ Page conversion: 4.0% (1,800 / 45,000)
→ First-screenshot bounce: 60%
→ About-section exit: 80% of remaining visitors
[fixing] Applying improvements by funnel position...
→ [1] Replace first screenshot with one showing active puzzle-solving
gameplay -- a player mid-interaction, not a landscape.
Targets the 60% first-screenshot bounce.
→ [2] Restructure About section: lead with what you do (1 sentence),
then what makes it different (1 sentence), then the lore.
Mechanics before narrative, not the reverse.
→ [3] Add a 10-second gameplay GIF above the fold showing a puzzle
interaction from start to solution.
[verify] Producing comparison framework...
→ Revised page copy and screenshot order saved
→ Benchmarked new copy structure against 5 successful narrative
puzzle games' Steam pages (all lead with mechanics, not lore)
→ Measurement checklist for Sadie: compare conversion rate after
1 week with the new page layout
[rollback] Rejected: adding "overwhelmingly positive" early access quotes
to the page header.
→ The game was never in early access. Those quotes don't exist.
The optimizer caught its own hallucinated suggestion before it
reached the output. This is the rollback safety doing exactly
what it's supposed to -- a fix that would fabricate evidence
gets flagged and reverted, even when the system itself generated it.
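The baseline numbers compose into a funnel, and multiplying them through shows why the fixes target the top of the page:

```python
visits = 45_000
past_first_screenshot = visits * (1 - 0.60)                 # 40% scroll past
past_about_section = past_first_screenshot * (1 - 0.80)     # 20% of those survive
sales = 1_800

print(past_first_screenshot)        # 18,000 visitors scroll past screenshot 1
print(past_about_section)           # ~3,600 reach the reviews
print(sales / visits)               # the 4% overall conversion rate
print(sales / past_about_section)   # ~half of those who reach reviews buy
```

Roughly half of the visitors who make it past the About section end up buying -- the bottom of the funnel already works. The losses are almost entirely in the first screenshot and the first paragraphs, which is exactly where the three fixes land.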
The page wasn't converting because it answered the wrong question first. Visitors don't need to know the world's lore to decide whether to buy -- they need to know what the game feels like to play. The screenshot swap and About restructure both address the same underlying issue: showing the experience before the backstory.
The self-caught hallucination is worth noting. The optimizer generated a suggestion (fake review quotes), identified it as fabricated during verification, and rolled it back automatically. This is the same rollback mechanism that reverts a CSS change when it increases layout shift -- it works on content integrity, not just code metrics.
Walkthrough -- Optimizing Budget Surplus Allocation
The /org-coordinated final week is underway -- vendors confirmed, guests managed, venue contingency decided. With final numbers locked, Rosa is $4,200 under the $65,000 budget. The couple asks the obvious question: should they upgrade something or save the money? Rosa wants a recommendation grounded in impact, not instinct.
/optimize "Wedding is $4,200 under the $65K budget with 10 days to go.
Where should we spend the surplus for maximum guest experience impact?
Current allocations: florals $4,200, catering $18,500, photography $5,000,
lighting $3,200, music $2,800, transportation $3,500, decor $4,800,
bar $6,000, coordination $4,500, contingency $5,800 (now $4,200 available)."
The optimizer evaluates each category for impact-per-dollar based on the specific event profile:
[detect] Analyzing impact-per-dollar across budget categories...
→ Evening outdoor events: lighting is the single highest-impact
aesthetic element for both guest experience and photography
→ Destination weddings: guests drink more and stay later than
local weddings. Food availability after 9pm correlates strongly
with guest satisfaction and reduced complaints
→ Current contingency: $5,800 allocated, $4,200 now available.
Destination weddings average 12% higher unexpected costs than
local events (vendor travel, weather pivots, currency fluctuation)
[baseline] Current state...
→ All categories funded to contracted levels
→ Surplus: $4,200 available
→ No unfunded guest experience elements identified in original plan
[fixing] Ranked allocation recommendations...
→ [1] String lights over reception area: $1,800. Evening beach
receptions photograph dramatically better under string lights.
This is the single upgrade most likely to appear in every
guest's photos and the couple's album.
→ [2] Late-night snack station (tacos al pastor, churros): $900.
Served 9:30-11pm. Guests at destination weddings stay later
and food after the main service reduces taxi requests,
complaints, and early departures.
→ [3] Keep $1,500 in contingency. Destination weddings have
higher variance -- weather pivots, vendor no-shows, currency
fluctuation on final payments in pesos. The buffer has earned
its place.
[verify] Budget impact...
→ $1,800 + $900 = $2,700 allocated to upgrades
→ $1,500 retained as contingency (total contingency now $7,300)
→ Both upgrades target the 9pm-11pm window, which is currently
the weakest part of the reception timeline (no planned activities
between cake cutting at 9pm and last dance at 10:45pm)
[rollback] Rejected: upgrading bar package from mid-range to top-shelf.
→ Guest satisfaction research shows diminishing returns on alcohol
quality past mid-range. Most guests can't distinguish top-shelf
from mid-range after the first hour. Cost: $2,100 for an upgrade
that surveys consistently show guests don't notice. The string
lights are visible in every photo for the rest of their lives.
The optimizer turns "where should we spend extra money" into a ranked impact analysis. String lights are $1,800 that will appear in every photograph the couple looks at for decades. The late-night snacks solve a gap in the reception timeline that Rosa hadn't explicitly planned for. And the contingency buffer stays intact because destination weddings are unpredictable.
The bar rollback is the kind of recommendation that feels obvious once stated but is hard to arrive at when a couple is excited about surplus budget. More expensive doesn't mean more impactful -- the numbers show where the impact actually lives.
Walkthrough -- Improving Episode Completion Rate
The /org-coordinated launch got the show onto all platforms, the cross-promotion pitches landed coverage, and downloads are solid -- 2,400 on Episode 1. But the hosting analytics tell a different story about what happens after someone hits play: only 58% of listeners make it to the end. For narrative podcasts, 75-80% completion is the benchmark. The drop-off data is specific: most listeners bail between minutes 8 and 12 of each episode.
/optimize "Episode completion rate is 58%, target is 75-80%. 3 episodes live,
2,400 downloads on Ep 1, 1,900 on Ep 2, 1,600 on Ep 3. Hosting analytics
show most drop-offs happen between minutes 8-12 of each episode. Need to
identify what's causing mid-episode abandonment and fix it in future scripts."
The optimizer cross-references the drop-off timestamps against the script content at those exact minutes:
[detect] Analyzing scripts at the 8-12 minute drop-off window...
→ All 3 episodes have a 4-5 minute expository section in this window:
historical context delivered as dates, political background,
economic conditions -- no narrative tension, no characters acting
→ Pattern: the storytelling pauses for a history lecture. Listeners
came for narrative and got a textbook.
→ Episodes with highest completion (Ep 1, 62%) have the shortest
exposition block (3.5 min). Lowest (Ep 3, 54%) has the longest (5.2 min).
[baseline] Current metrics...
→ Average completion rate: 58%
→ Drop-off concentration: minutes 8-12
→ Exposition block length: 3.5-5.2 minutes (avg 4.3 min)
[fixing] Applying structural improvements to script templates...
→ [1] Interleave exposition with narrative action. Instead of "here's
the political background" then "here's what happened," weave them:
"As the governor signed the order, workers 200 miles away had no
idea their world was about to change." Facts delivered through
story, not before it.
→ [2] Add a cold-open teaser at minute 0 that previews the episode's
climax. Gives listeners a destination to stay for through the
context sections: "By the end of this day, 53 people would be
dead and the National Guard would be called."
→ [3] Hard cap: no more than 2 minutes of pure exposition without
a narrative beat. If context runs longer, break it with a
character moment or a present-day connection.
[verify] Producing revised templates...
→ Rewritten exposition sections from all 3 episodes saved as
reference examples for future scripts
→ Pacing checklist added to production workflow: "2-minute rule"
for exposition blocks flagged during script review
→ Comparison framework: measure completion rate on Episodes 4-6
against the 58% baseline
[rollback] Rejected: adding mid-episode ads as "engagement checkpoints."
→ Research suggests ads can reset listener attention by creating
a micro-break. But Jordan explicitly doesn't want ads in the
show -- it's a creative and editorial principle, not a monetization
decision. Artificial engagement tricks contradict the show's ethos.
The optimizer flagged the conflict and reverted.
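With only the episode extremes to go on (Ep 1 and Ep 3), any trend is rough, but a naive linear read -- purely illustrative, not a model of listener behavior -- puts a number on the cost of each exposition minute:

```python
# Exposition minutes vs. completion %, from the detect step.
# Two data points only, so treat the slope as a direction, not a prediction.
episodes = {
    "Ep 1": (3.5, 62),   # shortest exposition block, highest completion
    "Ep 3": (5.2, 54),   # longest exposition block, lowest completion
}
(x1, y1), (x2, y2) = episodes["Ep 1"], episodes["Ep 3"]
slope = (y2 - y1) / (x2 - x1)   # completion points per exposition minute
print(round(slope, 1))          # ~ -4.7: each extra minute costs ~5 points
```

That rough rate is what makes the 2-minute hard cap a defensible rule rather than an arbitrary one.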
The problem wasn't the content -- it was the structure. Listeners don't leave because the history is uninteresting. They leave because the storytelling stops. A 4-minute exposition dump at minute 8 asks the audience to trust that the narrative will resume. At 58% completion, most of them don't.
The interleaving fix is genuinely good podcast craft: every fact delivered through a character or a moment lands harder than the same fact delivered as context. The cold-open teaser gives listeners a reason to push through the slower sections -- they know something is coming. And the 2-minute rule is a production tool Jordan will use on every future script, not just a one-time fix.
The last five scenarios -- student engagement, food costs, conversion rates, budget allocation, and completion rates -- share a pattern. /optimize doesn't guess at what's wrong. It measures, identifies the specific cause, applies the fix with the highest expected impact first, and produces the framework to verify the improvement. When it encounters a fix that would technically improve the metric but violate a deeper goal -- pedagogical integrity, brand consistency, content authenticity, financial prudence, editorial principles -- it rolls back. The numbers matter, but so does understanding what the numbers are actually for.
If the metric doesn't improve, or if something breaks that the optimizer didn't anticipate, that's where /debug picks up.
Key Flags & Options
| Flag | What It Does |
|---|---|
| --type <metric> | Focus on a specific optimization type: performance, bundle, accessibility, etc. |
| --dry-run | Detect and baseline only - shows what would be fixed without changing anything |
| --rollback | Undo changes from the last optimize run |
Tips & Gotchas
- Letting /optimize change your files is a reasonable move. Every change is measured. If it regresses a metric, it's reverted. And if you need to undo everything after the fact, --rollback restores the previous state.
- /optimize doesn't push or publish. It makes changes in your local files, measures what it can measure locally, and produces a report. Deployment is a separate step. Don't assume your optimized site is live just because the run finished - check your deploy pipeline.
- Not everything can be measured immediately. For content changes - headlines, CTAs, email copy - the before/after is about the quality signals, not live campaign data. You still need to run the next campaign and compare. Build that comparison into your process.