Writing about aerospace and electronic systems, particularly with defense applications. Areas of interest include radar, sonar, space, satellites, unmanned platforms, hypersonic platforms, and artificial intelligence.
Retrofitting Failure: The Zumwalt-Class and the $32 Billion Learning Curve
How the Navy finally might turn a failed gun platform into a usable warship—if it keeps changing everything
BOTTOM LINE UP FRONT
The U.S. Navy is evaluating a proposal to retrofit AN/SPY-6
radar systems—originally manufactured for the cancelled
Constellation-class frigate program—onto all three operational
Zumwalt-class destroyers as part of the Zumwalt Enterprise Upgrade
Solution (ZEUS). Raytheon has received Navy funding to develop combat
management system modifications enabling SPY-6 integration, while both
contractors and Navy officials have expressed confidence in the
technical feasibility. The SPY-6(V)3 variant, dimensionally comparable
to the incumbent AN/SPY-3, could be installed without major structural
modifications; however, no final decision has yet been made. The backfit
represents one element of a broader strategic pivot to transform the
Zumwalts from their failed original concept as gun-armed littoral
platforms into long-range hypersonic strike assets aligned with the
wider Aegis fleet.
The Zumwalt Class in Transition
The Zumwalt-class destroyers represent one of the U.S. Navy's
most dramatic strategic reversals. Originally envisioned as a 32-ship
class optimized for naval surface fire support (NSFS) in shallow-water
operations, the platform's distinctive tumblehome hull and composite
deckhouse were engineered to achieve radar cross-section comparable to
that of a fishing boat—approximately fifty times more difficult to
detect than a conventional destroyer.1
However, rising costs for the Long-Range Land-Attack Projectile (LRLAP)
ammunition essential to the ship's core mission rendered the
155-millimeter Advanced Gun System economically unsustainable, and
procurement was cancelled around the time of the lead ship's commissioning.
With only three ships authorized and built—USS Zumwalt
(DDG-1000), USS Michael Monsoor (DDG-1001), and USS Lyndon B. Johnson
(DDG-1002)—the Navy has radically reoriented the class toward
extended-range strike warfare. Beginning in 2023, both AGS turrets were
removed from each destroyer and replaced with vertical launch system
(VLS) cells accommodating the Conventional Prompt Strike (CPS)
hypersonic missile system.2 USS
Zumwalt completed this conversion in late 2025, and now carries twelve
CPS missiles in four Advanced Payload Modules forward of the
superstructure.3 USS Lyndon B.
Johnson is undergoing similar modifications at Ingalls Shipbuilding in
Pascagoula, while USS Michael Monsoor is scheduled for conversion during
its next maintenance availability.
The CPS missile, jointly developed by the Army and Navy, achieves
Mach 5+ velocity and delivers a Common Hypersonic Glide Body (C-HGB)
across ranges exceeding 1,725 nautical miles—a dramatic capability
expansion compared to the AGS's notional 63-nautical-mile range.4,5
This transformation has effectively shifted the Zumwalt-class from a
littoral gun platform to a strategic-depth strike destroyer,
fundamentally altering the operational calculus for the ships' remaining
service life.
The Combat System Modernization: ZEUS
Recognizing that hypersonic strike capability alone would not
suffice for twenty-first-century fleet operations, the Navy initiated
the Zumwalt Enterprise Upgrade Solution (ZEUS)—a comprehensive combat
system modernization program first formally outlined in a Request for
Information (RFI) issued in November 2022.6
ZEUS encompasses far more than radar replacement alone. The program
includes integration of the Surface Electronic Warfare Improvement
Program (SEWIP), the undersea warfare combat system SQQ-89, and the
Cooperative Engagement Capability (CEC) datalink—measures designed to
align the Zumwalt-class more closely with the Aegis-equipped fleet
standard and enhance network-centric warfare integration.7,8
The radar upgrade component reflects a critical shortcoming in
the original Zumwalt design. The AN/SPY-3 multifunction radar, while
performing well in its X-band search and track role, was never intended
to shoulder the full burden of air defense alone. Zumwalt-class
destroyers were originally equipped with a dual-band radar architecture
pairing the SPY-3 with the AN/SPY-4 S-band volume search radar. However,
in June 2010, Pentagon acquisition officials elected to delete the
SPY-4 as a cost-reduction measure, requiring the SPY-3 to be
reprogrammed to perform both horizon search and volume search functions
simultaneously—a compromise that limits its capability to manage
large-scale air attacks while providing fire control for multiple
simultaneous engagements.9,10
The SPY-3 also lacks integration with modern ballistic missile defense
systems, a growing liability as the Navy faces advanced cruise-missile
and hypersonic threats.
The AN/SPY-6: A Generation Forward
The AN/SPY-6 represents the latest generation of Raytheon naval
radar technology. First delivered to the Navy in July 2020, the SPY-6 is
built on a modular, scalable architecture employing Radar Modular
Assemblies (RMAs)—self-contained radar modules, each approximately two
feet per side, that function as individual transmit/receive elements.11
This modular approach enables the Navy to field multiple variants
optimized for specific platforms and mission sets, ranging from the full
four-sided SPY-6(V)1 system aboard Flight III Arleigh Burke-class
destroyers (with 37 RMAs per face) to more compact configurations for
smaller combatants.
The SPY-6(V)3 configuration under consideration for Zumwalt-class
integration employs a three-sided phased array, each with nine RMAs,
providing volume search and track capabilities across extended detection
ranges and advanced electronic scanning performance characteristic of
modern AESA radar systems.11,12
The SPY-6(V)3 is already planned for installation on
Constellation-class frigates (for ships remaining under construction)
and serves as the primary air and missile defense radar aboard Gerald R.
Ford-class aircraft carriers beginning with USS John F. Kennedy
(CVN-79).11 This commonality across platform classes has significant implications for fleet logistics, training, and maintenance.
The SPY-6 system offers approximately 15 decibels improved
sensitivity compared to the SPY-1 radar architecture that equips the
Aegis fleet—equivalent to detecting targets half the size at twice the
distance—and provides simultaneous defense against ballistic missiles,
cruise missiles, air and surface threats, plus organic electronic
warfare capability.11
Integration with CEC enables true network-centric air defense, where
each ship's SPY-6 radar data is fused with information from surrounding
platforms to create a composite battlespace picture far superior to what
any single ship could achieve in isolation.
The Constellation-Class Cancellation: An Unexpected Opportunity
The Constellation-class frigate program, awarded to Fincantieri
Marinette Marine in April 2020, was conceived as a more affordable
complement to the DDG-51 Arleigh Burke-class destroyer. The design was
based on a scaled adaptation of Marinette's FREMM (Frigate European
Multi-Mission) platform, itself a derivative of the Italian FREMM design
with extensive Americanization to meet Navy survivability and
electromagnetic requirements.13,14
However, the program encountered cascading delays. As of April 2024,
the lead ship, USS Constellation (FFG-62), was only 10 percent complete,
with the Navy's FY2026 budget projecting delivery slipping from the
original 2026 target to April 2029—a delay of 36 months at an estimated
cost of $1.5 billion.14,15 The
Government Accountability Office identified fundamental design stability
issues, with the ship becoming significantly heavier than anticipated
and achieving far less than the promised cost advantage over the larger,
more capable DDG-51.
On 25 November 2025, Secretary of the Navy John C. Phelan
cancelled all but the first two ships in the Constellation-class program
as part of a comprehensive Navy fleet strategy review.16
At the time of cancellation, the lead frigate was reported 12 percent
complete. The Navy elected to complete the two ships under construction
(FFG-62 and FFG-63) to preserve Marinette Marine's industrial capacity
and maintain continuity of shipyard employment, but halted procurement
of the remaining four ships on contract. The Navy subsequently announced
a new frigate competition for a smaller, faster-to-build design based
on the U.S. Coast Guard's National Security Cutter (NSC) hullform,
designated FF(X)—an architecture explicitly not optimized for the SPY-6
radar due to size constraints.16,17
This cancellation decision created a windfall of surplus
long-lead-time items manufactured for the Constellation-class program.
According to John Tobin, Associate Director for International SPY Radar
Programs at Raytheon, SPY-6(V)3 radar arrays originally procured for the
cancelled frigates remain in inventory. Raytheon and Navy officials
have indicated that these systems could be repurposed and installed on
the Zumwalt-class at considerably lower total cost than procuring
entirely new radar suites.6 The decision to salvage these components represents pragmatic asset stewardship in an environment of fiscal constraint.
Technical and Programmatic Feasibility
Raytheon officials have expressed confidence in the technical
feasibility of the SPY-6 backfit. Jennifer Gauthier, Vice President of
Naval Systems & Sustainment at Raytheon, stated in an interview
conducted in Tokyo in May 2026 that "we are currently in discussions
with the U.S. Navy and nothing has been decided," while elaborating on
Raytheon's ongoing development efforts. Importantly, she confirmed that
Raytheon had received Navy funding for development work on the Zumwalt
combat management system specifically intended to enable SPY-6
integration, and that the company had established "the first certified,
classified software factory for Zumwalt" enabling rapid, secure software
uploads to the ships without the extended procurement and testing
cycles traditionally required for fleet updates.6
From a physical integration perspective, the SPY-6(V)3 is
dimensionally comparable to the incumbent SPY-3. Tobin noted that the
SPY-3 is "roughly comparable in size" to the SPY-6(V)3 configuration of
nine RMAs, suggesting that physical installation would not require
substantial deckhouse modifications or structural rework.6
This point is significant; the Zumwalt-class composite deckhouse is one
of the ship's most complex and costly structural elements, and any
extensive modification would substantially increase backfit cost and
risk schedule slippage.
The Navy has signaled its commitment to the modernization path
through concrete funding actions. On 20 April 2026, the Navy awarded
Raytheon a $213.4 million contract modification for continuation of
Zumwalt-class combat system integration, modernization, installation,
testing, and sustainment through 2027.8
This funding supports development activities intended to prepare the
ships for future upgrades and demonstrates sustained naval commitment to
keeping the Zumwalts at an acceptable combat readiness level throughout
their operational lifespans.
Strategic and Doctrinal Implications
The SPY-6 backfit should not be viewed in isolation, but rather
as a single element in a comprehensive effort to transform the
Zumwalt-class from an aberrant platform pursuing a failed operational
concept into an integrated member of the twenty-first-century fleet. The
combination of hypersonic strike capability (via CPS), improved air and
missile defense (via SPY-6), undersea warfare integration (via SQQ-89),
modern electronic warfare systems (via SEWIP), and network-centric
capability (via CEC) would position the Zumwalt-class as a formidable
multi-mission platform capable of fulfilling strike, air defense, and
information-warfare roles across the operational spectrum.
The three Zumwalt-class destroyers are expected to remain in
service for decades. Without modernization, they would become
increasingly obsolete, representing a diminishing return on the $32
billion invested in the class's research, development, and construction.
The ZEUS program, including the SPY-6 backfit, represents the most
cost-effective path to preserving their relevance and utility within the
constrained fiscal environment the Navy now inhabits.
Critically, the SPY-6 backfit would enhance the Navy's ability to
operate in contested environments. The improved detection range,
ballistic missile defense capability, and network-centric integration
afforded by the SPY-6 would significantly increase the Zumwalts'
survivability in scenarios involving near-peer competitors equipped with
advanced antiship cruise missiles and ballistic-missile threats. In the
context of potential Pacific operations against peer adversaries, every
incremental improvement in sensor capability and defensive integration
carries strategic weight.
Remaining Uncertainties and Next Steps
No final decision has yet been made regarding the SPY-6 backfit.
Navy officials and Raytheon representatives alike characterize the
current phase as one of active dialogue and development work, with no
commitment to proceed. Several factors will likely influence the Navy's
ultimate decision: the outcome of ongoing ZEUS integration testing and
combat management system development; the final cost estimate for the
backfit across three ships; schedule implications relative to other
competing modernization priorities; and the availability of repurposed
SPY-6(V)3 arrays from the cancelled Constellation-class program as the
Navy completes those two remaining frigates and assesses its actual
surplus inventory.
The Congressional Research Service and Government Accountability
Office will likely scrutinize any decision to proceed, particularly
given Congress's longstanding concerns over Zumwalt-class cost overruns
and program management. The Navy will need to make a compelling case
that the SPY-6 backfit represents a prudent investment in fleet
readiness rather than merely throwing additional resources at a
historically troubled program.
The most likely scenario involves a phased approach, with USS
Zumwalt receiving the initial SPY-6 installation during a future
maintenance availability, followed by USS Lyndon B. Johnson and
USS Michael Monsoor in subsequent modernization periods. This approach
would allow the Navy to validate integration, test operational
employment, and refine procedural and training requirements while
preserving the ability to adjust subsequent installations based on
lessons learned.
Conclusion
The potential backfit of AN/SPY-6 radar systems to the
Zumwalt-class destroyers represents a pragmatic response to both
technical shortcomings in the original design and fiscal realities that
preclude procuring entirely new sensor suites. By leveraging surplus
systems from a cancelled competitor program, the Navy can modernize
three aging platforms at a fraction of the cost of new-build radar
integration. The technical feasibility appears sound, contractor
development efforts are well underway with Navy financial support, and
senior Navy officials have expressed optimism regarding the
modernization path.
What began as a technology-demonstrator for naval gun fire
support has been progressively transformed—first into a littoral surface
fire support platform, then into a hypersonic strike destroyer, and now
potentially into a network-integrated multi-mission combatant capable
of holding its own within the modern fleet. The SPY-6 backfit, if
approved and executed successfully, would represent the final critical
piece of that transformation, converting the troubled Zumwalt-class from
a symbol of failed innovation into a capable platform suited to
contemporary naval warfare. Whether the Navy ultimately commits to the
backfit will reveal much about the service's willingness to invest in
the long-term modernization of existing platforms rather than
perpetually pursuing new starts.
A Final Irony
The arc of the Zumwalt-class offers a cautionary lesson in
defense acquisition. The original concept—a gun-armed littoral strike
platform with stealth features—was sound in theory but ultimately
unsustainable: the Long-Range Land-Attack Projectile proved economically
ruinous, the gun-centric mission concept lost political support, and
only three ships were ever built instead of the planned 32. What the
Navy is now contemplating—a long-range hypersonic strike destroyer with
SPY-6 air defense and network integration—bears almost no resemblance to
what was originally approved. Yet after fifteen years and more than $32
billion in sunk costs, after multiple complete mission redesigns, and
after stripping out systems and bolting on others, the Zumwalts may
finally achieve utility as capable twenty-first-century surface
combatants. The tragic irony is that three Flight III Arleigh
Burke-class destroyers, equipped with SPY-6 and conventional strike
missiles from their inception, would have cost considerably less and
delivered equivalent or superior capability a decade earlier. The Navy
has, in effect, paid dearly for a protracted learning experience
conducted aboard billion-dollar warships. If the SPY-6 backfit proceeds
and succeeds, the Zumwalts will vindicate themselves not through
faithfulness to their original design concept, but through the Navy's
willingness to change course fundamentally and repeatedly until three
troubled platforms finally become genuinely useful assets. That is a
hard-won but valuable lesson for an institution that struggles with
admitting error and altering course.
Verified Sources
1. Zumwalt-class destroyer. Wikipedia. Retrieved May 2026. https://en.wikipedia.org/wiki/Zumwalt-class_destroyer
2. Inaba, Yoshihiro. "Zumwalt-class destroyers may receive SPY-6 radars from frigates." Naval News, May 5-6, 2026. https://www.navalnews.com/naval-news/2026/05/zumwalt-class-destroyers-may-receive-spy-6-radars-from-frigates/
3. "USS Zumwalt to put to Sea in 2026 without main gun systems." Naval News, January 15, 2026. https://www.navalnews.com/naval-news/2026/01/uss-zumwalt-to-put-to-sea-in-2026-without-main-gun-systems/
4. "The Navy's Futuristic $8 Billion Stealth 'Battleship' Slips Out of Port with Brand New Mach 5 Hypersonic Weapons Canisters." National Security Journal, April 29, 2026. https://nationalsecurityjournal.org/the-navys-futuristic-8-billion-stealth-battleship-slips-out-of-port-with-brand-new-mach-5-hypersonic-weapons-canisters/
5. Lemoine, William. "First Look At Stealth Destroyer's Hypersonic Missile Launchers." The War Zone, January 16, 2025. https://www.twz.com/sea/first-look-at-stealth-destroyers-hypersonic-missile-launchers
6. Inaba, Yoshihiro. "Zumwalt-class destroyers may receive SPY-6 radars from frigates." Naval News, May 2026. (Primary source for Raytheon executive interviews and RFI timeline.) https://www.navalnews.com/naval-news/2026/05/zumwalt-class-destroyers-may-receive-spy-6-radars-from-frigates/
7. "Repurposing the US Navy's Zumwalt-class destroyers with hypersonic strike capability." Navy Lookout, August 21, 2025. https://www.navylookout.com/repurposing-the-us-navys-zumwalt-class-destroyers-with-hypersonic-strike-capability/
8. "U.S. Navy Considers Replacing Zumwalt-Class SPY-3 Radars with SPY-6 from Cancelled Frigate Program." The Defense News, May 1, 2026. https://www.thedefensenews.com/news-details/US-Navy-Considers-Replacing-Zumwalt-Class-SPY-3-Radars-with-SPY-6-from-Cancelled-Frigate-Program/
9. AN/SPY-3. Wikipedia. Retrieved May 2026. https://en.wikipedia.org/wiki/AN/SPY-3
10. "Dual Band Radar Swapped Out In New Carriers." Defense News, March 17, 2015. https://www.defensenews.com/naval/2015/03/17/dual-band-radar-swapped-out-in-new-carriers/
11. AN/SPY-6. Wikipedia. Retrieved May 2026. https://en.wikipedia.org/wiki/AN/SPY-6
12. "The Navy a Hypersonic Plan to Save the Stealth Zumwalt-Class Destroyers." National Security Journal, September 5, 2025. https://nationalsecurityjournal.org/the-navy-a-hypersonic-plan-to-save-the-stealth-zumwalt-class-destroyers/
13. Constellation-class frigate. Wikipedia. Retrieved May 2026. https://en.wikipedia.org/wiki/Constellation-class_frigate
15. Navy Constellation (FFG-62) and FF(X) Class Frigate Programs: Background and Issues for Congress. Congressional Research Service (R44972), March 16, 2026. https://www.congress.gov/crs-product/R44972
16. "The US Navy Just Scuttled the Constellation-Class Frigate Program." The National Interest, November 26, 2025. https://nationalinterest.org/blog/buzz/us-navy-just-scuttled-constellation-class-frigate-program-ps-112625
17. "U.S. Navy retains first six Constellation-class frigates in FY2026 budget to strengthen fleet coverage." Army Recognition, July 7, 2025. https://www.armyrecognition.com/news/navy-news/2025/us-navy-retains-first-six-constellation-class-frigates-in-fy2026-budget-to-strengthen-fleet-coverage
Use Checkpoints Every 5–7 Messages – Ask "Are we on track?" in complex conversations to catch wrong paths early. Saves 83%.
Edit Instead of Correcting – Click Edit on your message, fix it, regenerate. Don't stack "Actually, I meant…" messages.
Use New Chats for Different Topics – One topic per chat. Separate chats avoid re-reading irrelevant context. Saves 40% in multi-topic conversations.
Disable Tools by Default – Tools add 200–400 token overhead per exchange even when unused.
Restart Conversations Every 15–20 Messages – Long conversations accumulate re-read overhead. Saves 55%.
Search Before Asking – Use conversation search to find past solutions. Saves 67–75% if found.
Crop Screenshots Tightly – Crop to only the relevant portion. Full screenshot = 1,300 tokens; tight crop = 50 tokens.
Chain Tasks in One Message – "Analyze this data, then write a summary from your analysis" instead of separate messages. Saves 30%.
Use "Assume You Know" References – After establishing context, reference it: "Assume you know the CONVERGE-01 trial from earlier." Saves 75%.
Use Negative Constraints – "Explain without covering basics I already know" is clearer than restating what you know.
Outline Mode Before Full Detail – Ask for pseudocode/outline first, expand only necessary sections. Saves 20–40%.
Pre-Process Data Externally – Clean data before uploading (Excel, Python). Saves 60–75%.
Build Conditional Templates – Create reusable templates with [IF: condition] sections for different use cases. Saves 40–50%.
Project Status Summaries – At session end, ask Claude to write a status summary. Paste it next session instead of re-explaining. Saves 72%.
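The "Pre-Process Data Externally" tip above lends itself to a concrete sketch. The snippet below uses only the Python standard library to keep the columns you actually need and drop empty rows before a CSV ever reaches a chat; the file layout and column names are hypothetical, purely for illustration:

```python
import csv
import io

def slim_csv(raw_csv: str, keep_columns: list[str]) -> str:
    """Keep only the named columns and drop rows that are empty in all of them."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=keep_columns)
    writer.writeheader()
    for row in reader:
        trimmed = {col: (row.get(col) or "").strip() for col in keep_columns}
        if any(trimmed.values()):  # skip rows with nothing in the kept columns
            writer.writerow(trimmed)
    return out.getvalue()

raw = ("date,region,notes,revenue\n"
       "2025-01-01,West,long commentary,1200\n"
       ",,,\n"
       "2025-01-02,East,more text,900\n")
print(slim_csv(raw, ["date", "revenue"]))
```

Run it locally, then upload the slimmed output; the savings scale with how many columns and rows you discard.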
Implementation Roadmap
Week 1: Quick Wins (Save ~30%)
Technique 7: Right-size models
Technique 8: Write shorter prompts
Technique 18: Crop screenshots
Week 2: Process Changes (Additional 20%)
Technique 3: Batch tasks
Technique 14: Separate chats for topics
Technique 13: Edit instead of correcting
Week 3: Structural Setup (Additional 25%)
Technique 1: PDF → Markdown
Technique 2: Projects for shared files
Technique 4: Trim personal context
Week 4: Advanced Optimization (Additional 15%)
Technique 15: Tool management
Technique 24: Prompt templates with conditions
Technique 9: Restart long conversations
Expected total improvement: 70–80% token efficiency gain
The Core Principle
Token efficiency is a systems problem, not a single-query problem. Efficient workflows:
Build around one Project per major work area (IPCSG research, technical analysis, civic policy)
Use persistent templates and shared files across chats
Create continuity with status summaries and checkpoints
Batch related work together to leverage prompt caching
Individual tips help. But combining them into a system-level workflow is what really multiplies savings across months of work.
Quick Wins Summary

| Technique | Savings | Effort |
|-----------|---------|--------|
| Replace PDFs with .md | 85–90% | Low |
| Use Projects | 80% | Low |
| Batch tasks | 56% | Low |
| Right-size models | 50% | Medium |
| Trim context | 70% | Medium |
| Short prompts | 33% | Low |
| Compress outputs | 67% | Low |
| Checkpoints | 83% | Low |
| Batch with caching | 87.5% | Medium |
| Pre-process data | 60–75% | Low |
Start with the "Low Effort" column. You'll hit 50%+ savings in Week 1.
Techniques to Stop Hitting Claude's Limits: Details
Claude's token limits aren't arbitrary walls—they're guardrails that force discipline. Every token you waste on redundant uploads, verbose prompts, or context bloat is a token stolen from actual work. This article translates raw optimization techniques into a workflow that scales.
The Problem: How Users Burn Tokens
Most Claude users operate at 30–50% efficiency. A 200K token limit sounds generous until you realize:
A 10-page PDF = 15,000–30,000 tokens gone before you type anything
A 400-word prompt gets re-read 20+ times across a conversation
Three sequential messages force Claude to re-tokenize the entire history three times
A single bloated personal context file (20K words) loads into every session
Tools left enabled burn tokens on every exchange, even when unused
For teams, this compounds catastrophically. One poorly optimized workflow × 50 users × 20 chats/month = token hemorrhage that looks like a feature problem when it's actually a process problem.
Technique 1: Replace PDFs with Markdown via Google Docs
The Problem PDFs are opaque to token counting. A single page burns 1,500–3,000 tokens depending on layout complexity, images, and formatting. A 20-page technical document = 30,000–60,000 tokens before analysis begins.
The Solution
Paste PDF text into a Google Doc
Clean up formatting (remove headers, footers, duplicate spacing)
Download as .md
Upload the markdown file
Token Cost Comparison
PDF (20 pages): 30,000–60,000 tokens
Markdown equivalent: 3,000–5,000 tokens
Savings: 85–90%
Why It Works Markdown is plain text. Claude tokenizes it at roughly four characters per token (on the order of 1.3 tokens per word). PDFs include invisible rendering information, font metadata, and positioning data that all get tokenized. Google Docs' export strips that noise.
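Step 2 of the workflow above (cleaning up the pasted text) can be partly automated. Here is a minimal sketch, assuming the extracted text repeats its header/footer lines on every page; the threshold of three repeats is an arbitrary choice, not a rule:

```python
import re
from collections import Counter

def clean_extracted_text(text: str, min_repeats: int = 3) -> str:
    """Drop lines that recur across pages (headers/footers) and collapse
    runs of blank lines, leaving plain text ready to save as .md."""
    lines = text.splitlines()
    counts = Counter(line.strip() for line in lines if line.strip())
    kept = [ln for ln in lines
            if not (ln.strip() and counts[ln.strip()] >= min_repeats)]
    # collapse three or more consecutive newlines into a single blank line
    return re.sub(r"\n{3,}", "\n\n", "\n".join(kept)).strip()
```

A pass like this handles the mechanical cleanup; anything layout-dependent (tables, multi-column text) still deserves a quick manual check before export.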
When to Use This
Technical reports, whitepapers, research papers
Legal documents, contracts, policy briefs
Any document longer than 3 pages
Documents with complex formatting or images
When Not To
Documents requiring exact visual layout (posters, forms with specific spacing)
Scanned PDFs (use OCR first, then convert)
Single-page quick references (just copy-paste text directly)
Technique 2: Right-Size the Model for the Task
The Problem Opus costs 5x more per token than Haiku and 3x more than Sonnet. Using Opus for summarization or simple coding is like hiring a surgeon to check your blood pressure.
Model Economics

| Task | Right Choice | Why |
|------|--------------|-----|
| Summarize a document | Haiku | 90% accuracy, 1/5 cost |
| Write a simple script | Sonnet | Handles most coding, 1/3 Opus cost |
| Debug complex reasoning | Opus | Deep chains need depth |
| Brainstorm ideas | Haiku | Ideation doesn't need reasoning depth |
| Multi-step analysis | Opus | Benefit from extended reasoning |
| Customer service reply | Haiku | Template matching, not reasoning |
Decision Tree
Does this task require multi-step reasoning across 5+ inference steps? → Opus
Does it need deep technical knowledge but straightforward logic? → Sonnet
Is it straightforward task execution? → Haiku
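The three questions above reduce to a tiny routing function. A sketch follows; the tier names mirror the table, and the boolean inputs are judgment calls you make per task, not anything an API reports:

```python
def pick_model(needs_multistep_reasoning: bool,
               needs_deep_knowledge: bool) -> str:
    """Route a task to a model tier using the decision tree above."""
    if needs_multistep_reasoning:   # 5+ inference steps
        return "Opus"
    if needs_deep_knowledge:        # technical depth, straightforward logic
        return "Sonnet"
    return "Haiku"                  # straightforward execution
```

For example, `pick_model(False, True)` routes a tricky-but-linear coding task to Sonnet, keeping Opus for genuinely multi-step analysis.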
Token Budget Impact A team running 50 daily chats:
Technique 3: Batch Related Tasks in One Message
The Problem Every new message forces Claude to re-read the entire conversation history before responding. Three sequential messages = three full re-reads of context.
Example: The Inefficient Way
Message 1: "Can you summarize this report?"
[Claude responds, tokens consumed]
Message 2: "Now extract the key metrics"
[Claude re-reads entire conversation + new message]
[Claude responds, tokens consumed]
Message 3: "Format those metrics as a table"
[Claude re-reads entire conversation again]
[Claude responds, tokens consumed]
Token Cost: Each message re-reads full history. With a 20-message conversation, message 21 retokenizes all 20 previous exchanges.
The Efficient Way
Message 1: "Do three things:
1. Summarize this report in 3 sentences
2. Extract the top 5 metrics
3. Format those metrics as a table with columns: Metric, Value, Trend"
Specify output format for each (table, list, paragraph, JSON)
Set constraints (word counts, detail level) per task
Use one message to capture all context needed
When Not to Batch
Tasks require Claude's output from task #1 to inform task #2
Second task is fundamentally different in scope
You need to iterate on one task before moving to the next
Technique 4: Edit Instead of Stacking Corrections
The Problem Users write a message, realize mid-reply they misspoke, and send a follow-up: "Actually, I meant…" This creates bad history that Claude must re-read forever.
Example: Poor Practice
Message 1: "Analyze this dataset with regression analysis"
[Claude responds]
Message 2: "Wait, I said regression but I meant clustering"
[Claude re-reads both messages, applies fix]
Message 3: "Also, use k-means specifically"
[Conversation now has 3 messages for what should be 1]
The Right Way
Click the Edit button on your original message
Fix the prompt
Click Regenerate
Result: Original bad message disappears. Conversation history stays clean. No token waste on corrections.
Token Impact
Stack of 3 messages with corrections: 5,000 tokens (includes overhead of re-reading bad context)
Single edited message: 2,000 tokens
Savings: 60%
Technique 5: Use New Chats for New Topics
The Problem One chat drifts across 4 different topics (analyzing a dataset, then drafting an email, then brainstorming ideas, then debugging code). Claude must re-read everything above before every response.
Example: Conversation Bloat
Messages 1-5: Analyze Q3 sales data
Messages 6-10: Draft investor email (unrelated)
Messages 11-15: Brainstorm product features (unrelated)
Messages 16-20: Debug Python script (unrelated)
Message 21: New question about the Python script
[Claude must re-read all 20 previous messages, 80% of which are irrelevant]
Token Cost: Message 21 tokenizes 20 previous exchanges even though only 5 are relevant.
The Right Way
New topic = new chat
One chat = one focused problem
Token Impact
Bloated single chat (4 topics, 20 messages): Each new message re-reads ~8,000 tokens of irrelevant context
Four separate chats (5 messages each): Each new message re-reads ~2,000 tokens of relevant context
Savings across 20 total messages: 40,000 tokens (60% of original)
Bonus: Organization becomes much easier. Your chat history is searchable and scannable.
When Not to Shorten Prompts
Highly specialized domains where brevity creates ambiguity
Chats where you've established context already
Technique 7: Use Projects to Share Files Across Chats
The Problem You upload the same document to 5 different chats. That document gets tokenized in full for each chat. A 10K-token document = 50K tokens burned unnecessarily.
The Solution Upload shared documents to a Project once. All chats in that Project reference the same files. No redundant uploads.
Technique 8: Disable Tools and Connectors When Not In Use
The Problem Tools consume tokens on every exchange, even when inactive. Web search, calculator, file operations—if enabled, Claude considers them on every response.
Token Cost of Enabled Tools Enabling 3 tools (web search, code execution, file creation) adds ~200–400 tokens overhead per exchange, even when unused.
The Solution
Enable only the specific tool(s) needed for the current task
Disable when task completes
Tools to Keep Disabled Most of the Time
Web search (enable only when asking current events)
Code execution (enable only during debugging/testing)
File creation (enable for artifact generation, disable for Q&A)
Connectors (enable only when accessing Gmail/Calendar/Drive)
Tools Worth Keeping On
Within Projects that specifically need them
During focused work sessions where they're used consistently
Technique 9: Restart Conversations Every 15–20 Messages
The Problem At message 25, Claude re-reads all 24 previous messages before responding. By message 50, the context window overhead becomes significant.
Context Re-Read Cost
Message 10: Re-read ~4,000 tokens of context
Message 20: Re-read ~8,000 tokens of context
Message 30: Re-read ~12,000 tokens of context
Message 50: Re-read ~20,000 tokens of context
Solution: Refresh Every 15–20 Messages
At ~message 15, summarize key points in a new message: "Summary: We've analyzed X, decided on Y, next step is Z"
Start a new chat with that summary as context
New chat begins fresh without re-reading all history
Token Impact
Single 50-message conversation: ~100,000 tokens (with re-read overhead)
Two 25-message conversations: ~60,000 tokens (less re-read overhead)
Three 17-message conversations: ~45,000 tokens (minimal re-read overhead)
Savings: 55%
When to Restart
Task fundamentally shifts direction
Conversation length approaches 30+ messages
New day/session (fresh start feels cleaner)
When Not To
You need full context from all prior messages for current task
You're iterating on something that needs complete history
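The re-read arithmetic above can be sketched as a quick model. The flat ~400 tokens per message is an illustrative assumption, not a measured constant:

```python
# Illustrative model of context re-read overhead in a single chat.
# Assumes a flat ~400 tokens per message; real messages vary widely.
def reread_overhead(n_messages, tokens_per_message=400):
    # Message k re-reads the k-1 messages before it,
    # so total overhead grows quadratically with chat length.
    return sum((k - 1) * tokens_per_message for k in range(1, n_messages + 1))

one_long_chat = reread_overhead(50)          # 490,000 tokens of re-reads
two_shorter_chats = 2 * reread_overhead(25)  # 240,000 tokens of re-reads
```

Under this model, splitting one 50-message chat into two 25-message chats roughly halves the re-read overhead; the exact percentages depend on how large each message actually is.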
Technique 10: Crop Screenshots to Only Relevant Portions
The Problem Users upload full 1000×1000 pixel screenshots when a 200×300 pixel crop would work. Full screenshots tokenize at ~1,300 tokens; crops can drop below 100.
Token Cost of Screenshots
Full screenshot (1000×1000): ~1,300 tokens
Medium crop (400×400): ~200 tokens
Tight crop (200×200): ~50 tokens
Potential savings: 96%
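The figures above track what is, to my understanding, Anthropic's published rule of thumb for vision token cost—roughly (width × height) / 750 for images under the resize limit. A minimal sketch:

```python
def estimate_image_tokens(width_px, height_px):
    # Rough Claude vision cost model: tokens ~= (width * height) / 750.
    return round(width_px * height_px / 750)

full_screenshot = estimate_image_tokens(1000, 1000)  # ~1,333 tokens
medium_crop = estimate_image_tokens(400, 400)        # ~213 tokens
tight_crop = estimate_image_tokens(200, 200)         # ~53 tokens
```

Because cost scales with pixel area, halving both dimensions cuts the token cost by roughly four.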
Example: The Inefficient Way User pastes full desktop screenshot showing:
Entire taskbar
Application menu
Status bar
And the actual error dialog in bottom right
Claude tokenizes all of it.
The Efficient Way Crop to just the error dialog:
[Cropped to 250×150 pixels]
Claude gets the information without the noise.
Cropping Checklist
Remove any UI chrome (taskbars, menus) unless relevant
Remove whitespace margins
Crop to the minimal bounding box that includes the issue
Keep just enough context for understanding (one surrounding line/button)
Tools
Windows: Snip & Sketch (Win+Shift+S)
Mac: Cmd+Shift+4 (drag to select area)
Linux: Flameshot or built-in tool
Online: Snipping tools in browser
Technique 11: Build and Reuse Prompt Templates
The Problem Users rewrite similar prompts from scratch repeatedly. Each rewrite is slightly different, prevents caching, and burns mental energy.
Example: Inefficient Rewriting
Chat 1: "Write a technical analysis of the MQ-9B SeaGuardian focusing on
operational range, sensor capabilities, and integration with naval systems."
Chat 2 (weeks later): "Can you analyze the GA-ASI Gambit system? I want to
understand its operational capabilities, sensor suite, and how it fits into
the broader defense architecture."
Chat 3 (another week): "Technical overview of the V-22 Osprey: what it does,
what sensors it has, and how it works with other military systems."
Same structure, different words each time. Prevents caching.
Prompt Template Approach Create a template in a document:
# Technical System Analysis Template
Analyze [SYSTEM_NAME] and cover:
1. Operational range and endurance
2. Sensor suite and detection capabilities
3. Integration with broader force architecture
4. Notable operational history or incidents
5. Key limitations or known issues
Format as: summary section + detailed technical breakdown +
comparison table with similar systems.
Reuse for every similar analysis:
Chat 1: Analyze MQ-9B SeaGuardian [use template]
Chat 2: Analyze GA-ASI Gambit [use template]
Chat 3: Analyze V-22 Osprey [use template]
Token Benefit Prompt caching (available in Projects) means repeated prompts aren't fully re-tokenized.
Manual rewriting each time: Each prompt re-tokenized in full
Template reuse in Projects: First use tokenizes template, subsequent uses get cached hit (90% cost reduction)
The same template approach works for other recurring tasks, such as content drafting (outline, research questions, audience, tone).
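One low-tech way to make reuse mechanical is Python's stdlib `string.Template`; the template text below is a trimmed, illustrative version of the one above:

```python
from string import Template

# Trimmed illustrative version of the analysis template above.
ANALYSIS_TEMPLATE = Template(
    "Analyze $system and cover:\n"
    "1. Operational range and endurance\n"
    "2. Sensor suite and detection capabilities\n"
    "3. Integration with broader force architecture\n"
)

prompt = ANALYSIS_TEMPLATE.substitute(system="MQ-9B SeaGuardian")
```

Each reuse changes only the `$system` slot, so the surrounding prompt text stays byte-identical, which is exactly the property caching rewards.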
Technique 12: Keep Personal Context Under 2,000 Words
The Problem A 20,000-word personal context file loads into every single conversation. At roughly a token per word, that's 20K+ tokens of overhead before you type your first question.
Real Impact A 20K-word context file in a 200K limit:
10% of your token budget consumed by context alone
Every conversation starts 20K tokens in the hole
A 1-hour work session might be 50% context + 50% actual work
Example: Bloated Context
[USER PROFILE: 20,000 words covering]
- Entire work history (every job description)
- Complete family tree and relationships
- Full list of 50+ projects and their outcomes
- Every skill and certification
- All medical history and preferences
- Complete reading list and book summaries
- Full financial situation
- Detailed hobby list
[END: 20,000 tokens burned before starting]
The Trimmed Version
[USER PROFILE: 1,500 words covering]
- Current role: Retired Senior Engineer, radar systems
- Key expertise: Signal processing, C4ISR, AMASS
- Current projects: IPCSG advocacy, technical writing
- Key context for Claude: Prostate cancer patient-advocate,
uses pseudonym "Pseudo Publius" for civic policy work
- Preferences: HTML for newsletters, Markdown for analysis
[END: 1,500 tokens used for genuine context]
What to Keep in Context (1,500 words max)
Current professional role
3–5 core skills Claude needs to know about
1–2 active projects
Key preferences (format, tone, communication style)
Any ongoing work Claude should reference
What to Cut
Complete work history (mention only current role)
Family relationships unless directly relevant
Completed projects (list only active ones)
Medical details beyond "patient advocate in X field"
A trimmed file might close with sections like these (excerpt; earlier project entries omitted):
3. San Diego civic policy research (transit, water, governance)
## Preferences for Claude
- HTML output for IPCSG content (avoid .docx)
- Markdown for technical analysis
- Cite sources for health/policy content
- Flag when content needs fact-checking
## Key Context
- Patient with 11+ years prostate cancer history
- Uses "Pseudo Publius" pseudonym for civic policy writing
- Lives in San Diego, familiar with local transit/healthcare systems
- Enrolled in CONVERGE-01 actinium-225 PSMA trial at UCSD
Token Savings
20K context file: 20,000 tokens per conversation
2K context file: 2,000 tokens per conversation
10 conversations/week: 180,000 token savings per week
Monthly savings: 720,000 tokens (enough for 3-4 complex analysis projects)
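The savings figures above are straightforward arithmetic:

```python
# Arithmetic behind the savings claim above.
bloated_context = 20_000   # tokens loaded into every conversation
trimmed_context = 2_000
conversations_per_week = 10

weekly_savings = (bloated_context - trimmed_context) * conversations_per_week
monthly_savings = weekly_savings * 4
# weekly_savings: 180,000 tokens; monthly_savings: 720,000 tokens
```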
Technique 13: Leverage Conversation Search Before Asking
The Problem
You ask Claude a question you've already solved in a previous chat. Claude re-answers from scratch, consuming tokens for work already done.
Solution
Use the conversation search tool to find past relevant chats before messaging Claude. If you find the answer, you're done (zero tokens). If not, you have context for a more targeted question.
Example
Instead of: "How do I configure AMASS for multi-sensor fusion?"
Search first. If found in past chat, copy the answer. If not found, ask: "I've searched my past work—it's not there. Here's what I tried last time [specific detail]. What's the next step?"
Token Cost
Re-asking and re-answering: 2,000–3,000 tokens
Searching + targeted follow-up: 500 tokens
Savings: 75–83%
Technique 14: Use Structured Output Formats to Reduce Back-and-Forth
The Problem
You ask a question, Claude gives prose, you ask for it in table format, Claude reformats. Two exchanges for one deliverable.
Solution
Specify the exact output format upfront: "JSON with keys: name, value, unit" or "Markdown table: columns are X, Y, Z" or "CSV format" or "Numbered list with 1-sentence descriptions."
Example
Inefficient: "List the key features of the MQ-9B"
Claude responds with prose paragraph
You: "Can you make that a table?"
Claude reformats (2 exchanges, 4,000+ tokens)
Efficient: "List MQ-9B key features as a markdown table with columns: Feature, Specification, Operational Impact"
Claude responds with table in one go (1 exchange, 1,500 tokens)
Token Savings: 60%
Technique 15: Use Claude's "Drafts" or Internal Reasoning to Reduce Revisions
The Problem
You ask Claude to write something, it's close but needs tweaks, you ask for revision, Claude rewrites. One task becomes three messages.
Solution
In the initial prompt, ask Claude to "show your thinking first" or "provide a draft + notes on what could improve it." Claude self-critiques, reducing revision cycles.
Example
Inefficient: "Write a technical brief on the Constellation-class frigate cancellation"
Claude writes
You: "It's good but needs more detail on cost overruns"
Claude revises (2 exchanges minimum)
Efficient: "Write a technical brief on the Constellation-class frigate cancellation. Include: executive summary, cost breakdown, timeline of delays, political context. Flag any sections that feel weak or incomplete."
Claude writes with self-critique built in
You get a more complete product on first try (1 exchange)
Token Savings: 40–50% (fewer revision cycles)
Technique 16: Reuse Outputs as Inputs (Chaining Without Re-Prompting)
The Problem
You ask Claude to analyze data, then ask it to write a summary of that analysis. Claude re-reads both the original data and its analysis.
Solution
When Claude produces output you'll use as input for another task, say so upfront: "Analyze this data, then use your analysis to draft a one-paragraph summary."
This chains tasks in a single message, avoiding re-reads.
Example
Inefficient:
Message 1: "Analyze Q3 sales by region" (Claude analyzes)
Message 2: "Summarize that analysis for a board memo" (Claude re-reads data + analysis)
(2,500 + 2,500 = 5,000 tokens)
Efficient:
Message 1: "Analyze Q3 sales by region, then summarize findings in one paragraph for a board memo"
(3,500 tokens, single pass)
Token Savings: 30%
Technique 17: Specify Constraint Limits Upfront to Avoid Scope Creep
The Problem
You ask for "an analysis," Claude writes 2,000 words because no constraint was given. You then ask "can you make it shorter?" Claude re-writes. Wasted tokens.
Solution
Always specify: word count, depth level, or section count upfront.
Examples
"Summarize in 3 sentences"
"Brief analysis (under 500 words)"
"Outline only—no prose, just 5 bullet points per section"
"Executive summary format: 1 page max"
Token Impact
Unconstrained ask: 3,500 tokens → You request trim → 2,000 tokens for re-work (5,500 total)
Constrained ask: 1,500 tokens (right size on first try)
Savings: 73%
Technique 18: Cache Repeated Context by Using "Assume You Know" Statements
The Problem
Every time you chat, you re-explain your domain, your current project, or your constraints.
Solution
Once you've established context in a chat, use "assume you know" statements to avoid re-explaining:
"Assume you know the San Diego MTS budget structure from our earlier discussion"
"Assume you're familiar with the CONVERGE-01 trial protocol"
"Assume you know the PIRAN radiation belt software context"
This signals Claude to reference prior exchanges without restating everything.
Token Cost
Restating context every time: 800+ tokens per message
Using "assume" reference: 200 tokens per message
Savings: 75%
Technique 19: Use Negative Constraints (What NOT to Include)
The Problem
It's often easier to specify what you don't want than to enumerate everything you do. "Don't explain basic concepts I already know" trims output more reliably than "explain advanced concepts."
Solution
Frame prompts with what to exclude:
"Analyze this without explaining what a PSMA scan is"
"Write the technical section without introductory material"
"List only the novel findings—skip anything in standard literature"
Example
Inefficient: "Explain the latest prostate cancer biomarkers" (Claude might explain what biomarkers are, burning tokens on known info)
Efficient: "Explain novel prostate cancer biomarkers, assuming I know what biomarkers are and how standard testing works"
Token Savings: 20–30%
Technique 20: Compress Intermediate Outputs via Summarization Prompts
The Problem
You ask Claude to do a deep analysis (3,000 tokens), then ask questions about it. Claude must re-read the full analysis for each question.
Solution
After the analysis, immediately ask Claude to produce a "compressed summary for reference." You then use the summary for follow-ups, not the full analysis.
Example
Message 1: "Analyze the 50-page NTSB docket on the LaGuardia collision" (3,000 tokens)
Message 2: "Compress that into a 10-point summary I can reference for follow-ups" (500 tokens)
Messages 3+: Ask questions referencing the summary, not the original analysis (saves 2,000+ tokens per follow-up)
Token Impact
With compression: Original (3,000) + summary (500) + 5 follow-ups using summary (2,500) = 6,000 total
Without compression: Original (3,000) + 5 follow-ups re-reading full analysis (15,000) = 18,000 total
Savings: 67%
Technique 21: Use Pseudocode or Outline Mode for Complex Tasks
The Problem
You ask Claude to solve a complex problem in full detail. It writes long explanations. You ask for just the outline. It re-writes.
Solution
Ask for "pseudocode" or "outline mode" first, then expand only sections you need.
Example
Inefficient: "Help me design a system for analyzing satellite megaconstellation fragmentation" (Claude writes 2,000-word design doc)
Efficient:
Message 1: "Outline only: system architecture for analyzing satellite megaconstellation fragmentation" (Claude: 300-word outline)
Message 2: "Expand section 3 (data pipeline) to full technical detail"
(1,000 + 1,500 = 2,500 vs 2,000 + potential revisions)
Token Savings: 20–40% (you pay only for sections you need)
Technique 22: Pre-Process Data Externally Before Uploading
The Problem
You upload raw messy data (10K tokens), Claude cleans it, then you ask questions. Claude must re-read the messy + cleaned data.
Solution
Clean/process data before uploading. Use a spreadsheet tool, Python script, or other lightweight processing first.
Example
Inefficient: Upload 500-row CSV with duplicates, formatting issues, irrelevant columns (8,000 tokens) → Claude cleans and analyzes
Efficient: Clean in Excel/Python locally (2 min, no tokens) → Upload cleaned 200-row CSV (2,000 tokens) → Claude analyzes
Token Savings: 60–75%
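A minimal stdlib sketch of the local cleanup step—dropping irrelevant columns and the duplicate rows they expose (column names here are hypothetical):

```python
import csv
import io

def clean_csv(raw_text, keep_columns):
    # Keep only the relevant columns, then drop rows that become duplicates.
    reader = csv.DictReader(io.StringIO(raw_text))
    seen, rows = set(), []
    for row in reader:
        slim = tuple(row.get(col, "") for col in keep_columns)
        if slim not in seen:
            seen.add(slim)
            rows.append(dict(zip(keep_columns, slim)))
    return rows

raw = "region,sales,notes\nWest,100,q3 push\nWest,100,retry\nEast,50,ok\n"
cleaned = clean_csv(raw, ["region", "sales"])  # 2 rows survive instead of 3
```

For anything larger, the same pass in Excel or pandas takes a couple of minutes and costs zero tokens.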
Technique 23: Use Checkpoints: "Are We On Track?" Mid-Conversation
The Problem
You work through a complex analysis with Claude, go down the wrong path for 10 messages, then realize the approach is wrong. All 10 messages must be re-read going forward.
Solution
Every 5–7 messages in complex tasks, insert a checkpoint: "Summarize progress so far and confirm we're on the right track before continuing."
If wrong track, you catch it early. If right track, you've created a compressed summary for future reference.
Token Cost
Wrong path after 10 messages: Wasted 6,000 tokens, plus future re-reads (12,000 total over conversation)
Checkpoint at message 5: Catch early, save 10 messages of wasted work (10,000 tokens)
Savings: 83%
Technique 24: Leverage Templates with Conditional Sections
The Problem
(Similar to #11, but more sophisticated) You have a template, but different uses require different sections. You still re-write parts.
Solution
Build templates with conditional markers. Example:
# Technical Analysis Template
## Executive Summary (always)
[1 paragraph]
## [IF: System is military] Operational History
[relevant section]
## [IF: System has sensors] Sensor Capabilities
[relevant section]
## Key Metrics (always)
[data table]
## [IF: System is controversial] Safety/Incident History
[relevant section]
When reusing, you fill only the sections relevant to the specific system.
Token Benefit
Manual rewriting each time: 100% re-tokenization
Template with conditionals: Reusable frame (cached) + conditional sections only
Savings: 40–50% on repeated similar analyses
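The conditional markers can be resolved mechanically. A sketch, with section names taken from the template above and flag names invented for illustration:

```python
# Sections paired with the condition that decides whether they render.
SECTIONS = [
    ("Executive Summary",       lambda s: True),
    ("Operational History",     lambda s: s.get("military", False)),
    ("Sensor Capabilities",     lambda s: s.get("has_sensors", False)),
    ("Key Metrics",             lambda s: True),
    ("Safety/Incident History", lambda s: s.get("controversial", False)),
]

def render_outline(system):
    # Keep only the sections whose condition holds for this system.
    return [title for title, include in SECTIONS if include(system)]

outline = render_outline({"military": True, "has_sensors": True})
# Keeps: Executive Summary, Operational History, Sensor Capabilities, Key Metrics
```

Only the sections that survive the filter get filled in and tokenized; the stable frame stays cache-friendly.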
Technique 25: Use "Status Check" Outputs for Ongoing Projects
The Problem
You're working on a multi-week project. Each new chat, you brief Claude on what's been done. That briefing is always ~1,000 tokens.
Solution
At the end of each session, ask Claude to generate a "project status summary" (300–500 words). Start the next chat by pasting that summary instead of re-explaining.
Example
After session 1 on IPCSG newsletter research:
You: "Create a 300-word status summary for my next session: what we've covered, what's pending, open questions"
Claude: [Status summary] (800 tokens)
Next session:
You: "Here's the status from last session: [paste]. Continue with the next section on ADT cardiovascular risk"
Re-briefing approach: 1,000 × 10 sessions = 10,000 tokens
Status summary approach: 800 + 200×10 = 2,800 tokens
Savings: 72%
Technique 26: Batch Similar Queries to Use Prompt Caching
The Problem
You ask 10 different questions about the same system (MQ-9B). Each question re-reads the full context.
Solution
In Projects, ask all related questions about the same system in one session before moving to a new system. Prompt caching means the context gets tokenized once, reused for all questions.
Example
Chat 1: "Answer all questions about MQ-9B SeaGuardian" + [list 10 questions]
Questions about same context leverage caching
Chat 2 (different day, same project): Ask 10 questions about Gambit CCA
New context, but again leveraging caching within session
Token Benefit
Separate chats for each question: 10 questions × 2,000 tokens = 20,000 tokens
Batched in one chat with caching: 2,000 (context) + 500 (questions) = 2,500 tokens
Savings: 87.5%
Summary: All 26 Techniques by Impact
Highest Impact (40%+ savings each)
#1: Replace PDFs with markdown (85–90%)
#7: Use Projects to avoid redundant uploads (80%)
#3: Batch tasks (56%)
#12: Trim personal context (70% when compounded)
#20: Compress intermediate outputs (67%)
#26: Batch similar queries with caching (87.5%)
High Impact (strong savings in common scenarios)
#2: Right-size models (50% when applied systematically)
#6: Short prompts (33%)
#10: Crop screenshots (96% but narrow use case)
#15: Show thinking first (40–50%)
#17: Specify constraints upfront (73%)
#23: Checkpoints (83%)
Medium Impact (useful savings, more situational)
#4: Edit instead of stacking (60% but single-message impact)
#5: New chats for new topics (40% for multi-topic conversations)
#8: Disable tools (varies by tool usage)
#9: Restart conversations (55% but only if you hit 50+ messages)
#13: Search before asking (67% but only if found)
#14: Specify output format (60% but narrow use case)
#16: Chain tasks (30%)
#18: "Assume you know" statements (75% but only after context established)
#19: Negative constraints (20–30%)
#21: Pseudocode mode (20–40%)
#22: Pre-process data (60–75% but narrow use case)
#24: Conditional templates (40–50%)
#25: Status summaries (72%)
The Strategic Layer: What Most People Miss
Beyond these 26 techniques, there's one meta-insight:
Token efficiency is a systems problem, not a tips-and-tricks problem.
Most users treat Claude like a search engine: ask question, get answer, move on. That model inherently wastes tokens because there's no continuity.
Efficient users treat Claude like a long-term collaborator:
One Project per major body of work (IPCSG research, Naval analysis, San Diego civic work)
Conversations that build on each other (status summaries, checkpoints)
Clear handoffs between sessions (summary → next session → summary)
When you operate at the "system" level instead of the "single query" level, all 26 techniques compound. You're not just saving tokens on individual exchanges—you're building workflows that stay efficient across months.
That's the real win.
Integrated Workflow: Putting It All Together
Here's how these 26 techniques work together in practice:
Scenario: Research and Write a Technical Analysis
The Bloated Approach
1. Upload raw 15-page PDF (30,000 tokens)
2. Write 300-word prompt with every detail (prompt re-read 15+ times)
3. Send message → realize you need more info → "Actually, I meant…"
(correction stacking)
4. Ask 3 follow-ups in separate messages (context re-read x3)
5. Use Opus for summarization (wrong model)
6. Keep web search enabled (unused, 300 token overhead)
7. Two weeks later, use same PDF in new chat (30,000 tokens again)
The Efficient Approach reverses each step: markdown instead of the raw PDF, one concise prompt, batched follow-ups, a right-sized model, tools disabled, the document shared once via a Project, and a restart every 15–20 messages (Technique 9)
Expected Result After 4 Weeks: 70–80% improvement in token efficiency
The Mindset Shift
Token optimization isn't about deprivation—it's about clarity. When you're forced to communicate concisely, write better prompts, and focus on one task at a time, you get better results and use fewer tokens.
Every inefficient workflow pattern masks itself as "flexibility" or "exploratory thinking." In reality, it's just waste.
The 26 techniques above are the proven guardrails. Use them, and you'll rarely feel constrained by Claude's limits again. The constraint becomes a feature: it forces you to think like an engineer, not just an experimenter.
Applied consistently, your next Claude session can be several times more productive at a fraction of the token cost.