Saturday, May 2, 2026

Zumwalt-class destroyers may receive SPY-6 radars from frigates - Naval News


Retrofitting Failure: The Zumwalt-Class and the $32 Billion Learning Curve

BOTTOM LINE UP FRONT

The U.S. Navy is evaluating a proposal to retrofit AN/SPY-6 radar systems—originally manufactured for the cancelled Constellation-class frigate program—onto all three operational Zumwalt-class destroyers as part of the Zumwalt Enterprise Upgrade Solution (ZEUS). Raytheon has received Navy funding to develop combat management system modifications enabling SPY-6 integration, while both contractors and Navy officials have expressed confidence in the technical feasibility. The SPY-6(V)3 variant, dimensionally comparable to the incumbent AN/SPY-3, could be installed without major structural modifications; however, no final decision has yet been made. The backfit represents one element of a broader strategic pivot to transform the Zumwalts from their failed original concept as gun-armed littoral platforms into long-range hypersonic strike assets aligned with the wider Aegis fleet.

The Zumwalt Class in Transition

The Zumwalt-class destroyers represent one of the U.S. Navy's most dramatic strategic reversals. Originally envisioned as a 32-ship class optimized for naval surface fire support (NSFS) in shallow-water operations, the platform's distinctive tumblehome hull and composite deckhouse were engineered to achieve radar cross-section comparable to that of a fishing boat—approximately fifty times more difficult to detect than a conventional destroyer.1 However, rising costs for the Long-Range Land-Attack Projectile (LRLAP) ammunition essential to the ship's core mission rendered the 155-millimeter Advanced Gun System economically unsustainable, and procurement was cancelled well before the first ship's commissioning.

With only three ships authorized and built—USS Zumwalt (DDG-1000), USS Michael Monsoor (DDG-1001), and USS Lyndon B. Johnson (DDG-1002)—the Navy has radically reoriented the class toward extended-range strike warfare. Beginning in 2023, both AGS turrets have been removed from each destroyer in turn and replaced with vertical launch system (VLS) cells accommodating the Conventional Prompt Strike (CPS) hypersonic missile system.2 USS Zumwalt completed this conversion in late 2025 and now carries twelve CPS missiles in four Advanced Payload Modules forward of the superstructure.3 USS Lyndon B. Johnson is undergoing similar modifications at Ingalls Shipbuilding in Pascagoula, while USS Michael Monsoor is scheduled for conversion during its next maintenance availability.

The CPS missile, jointly developed by the Army and Navy, achieves Mach 5+ velocity and delivers a Common Hypersonic Glide Body (C-HGB) across ranges exceeding 1,725 nautical miles—a dramatic capability expansion compared to the AGS's notional 63-nautical-mile range.4,5 This transformation has effectively shifted the Zumwalt-class from a littoral gun platform to a strategic-depth strike destroyer, fundamentally altering the operational calculus for the ships' remaining service life.

The Combat System Modernization: ZEUS

Recognizing that hypersonic strike capability alone would not suffice for twenty-first-century fleet operations, the Navy initiated the Zumwalt Enterprise Upgrade Solution (ZEUS)—a comprehensive combat system modernization program first formally outlined in a Request for Information (RFI) issued in November 2022.6 ZEUS encompasses far more than radar replacement alone. The program includes integration of the Surface Electronic Warfare Improvement Program (SEWIP), the undersea warfare combat system SQQ-89, and the Cooperative Engagement Capability (CEC) datalink—measures designed to align the Zumwalt-class more closely with the Aegis-equipped fleet standard and enhance network-centric warfare integration.7,8

The radar upgrade component reflects a critical shortcoming in the original Zumwalt design. The AN/SPY-3 multifunction radar, while performing well in its X-band search and track role, was never intended to shoulder the full burden of air defense alone. Zumwalt-class destroyers were originally equipped with a dual-band radar architecture pairing the SPY-3 with the AN/SPY-4 S-band volume search radar. However, in June 2010, Pentagon acquisition officials elected to delete the SPY-4 as a cost-reduction measure, requiring the SPY-3 to be reprogrammed to perform both horizon search and volume search functions simultaneously—a compromise that limits its capability to manage large-scale air attacks while providing fire control for multiple simultaneous engagements.9,10 The SPY-3 also lacks integration with modern ballistic missile defense systems, a growing liability as the Navy faces advanced cruise-missile and hypersonic threats.

The AN/SPY-6: A Generation Forward

The AN/SPY-6 represents the latest generation of Raytheon naval radar technology. First delivered to the Navy in July 2020, the SPY-6 is built on a modular, scalable architecture employing Radar Modular Assemblies (RMAs)—self-contained radar modules, each approximately two feet per side, that function as individual transmit/receive elements.11 This modular approach enables the Navy to field multiple variants optimized for specific platforms and mission sets, ranging from the full four-sided SPY-6(V)1 system aboard Flight III Arleigh Burke-class destroyers (with 37 RMAs per face) to more compact configurations for smaller combatants.

The SPY-6(V)3 configuration under consideration for Zumwalt-class integration employs a three-sided phased array, each with nine RMAs, providing volume search and track capabilities across extended detection ranges and advanced electronic scanning performance characteristic of modern AESA radar systems.11,12 The SPY-6(V)3 is already planned for installation on Constellation-class frigates (for ships remaining under construction) and serves as the primary air and missile defense radar aboard Gerald R. Ford-class aircraft carriers beginning with USS John F. Kennedy (CVN-79).11 This commonality across platform classes has significant implications for fleet logistics, training, and maintenance.

The SPY-6 system offers approximately 15 decibels improved sensitivity compared to the SPY-1 radar architecture that equips the Aegis fleet—equivalent to detecting targets half the size at twice the distance—and provides simultaneous defense against ballistic missiles, cruise missiles, air and surface threats, plus organic electronic warfare capability.11 Integration with CEC enables true network-centric air defense, where each ship's SPY-6 radar data is fused with information from surrounding platforms to create a composite battlespace picture far superior to what any single ship could achieve in isolation.

The Constellation-Class Cancellation: An Unexpected Opportunity

The Constellation-class frigate program, awarded to Fincantieri Marinette Marine in April 2020, was conceived as a more affordable complement to the DDG-51 Arleigh Burke-class destroyer. The design was based on a scaled adaptation of Marinette's FREMM (Frigate European Multi-Mission) platform, itself a derivative of the Italian FREMM design with extensive Americanization to meet Navy survivability and electromagnetic requirements.13,14 However, the program encountered cascading delays. As of April 2024, the lead ship, USS Constellation (FFG-62), was only 10 percent complete, with the Navy's FY2026 budget projecting delivery slipping from the original 2026 target to April 2029—a delay of 36 months at an estimated cost of $1.5 billion.14,15 The Government Accountability Office identified fundamental design stability issues, with the ship becoming significantly heavier than anticipated and achieving far less than the promised cost advantage over the larger, more capable DDG-51.

On 25 November 2025, Secretary of the Navy John C. Phelan cancelled all but the first two ships in the Constellation-class program as part of a comprehensive Navy fleet strategy review.16 At the time of cancellation, the lead frigate was reported 12 percent complete. The Navy elected to complete the two ships under construction (FFG-62 and FFG-63) to preserve Marinette Marine's industrial capacity and maintain continuity of shipyard employment, but halted procurement of the remaining four ships on contract. The Navy subsequently announced a new frigate competition for a smaller, faster-to-build design based on the U.S. Coast Guard's National Security Cutter (NSC) hullform, designated FF(X)—an architecture explicitly not optimized for the SPY-6 radar due to size constraints.16,17

This cancellation decision created a windfall of surplus long-lead-time items manufactured for the Constellation-class program. According to John Tobin, Associate Director for International SPY Radar Programs at Raytheon, SPY-6(V)3 radar arrays originally procured for the cancelled frigates remain in inventory. Raytheon and Navy officials have indicated that these systems could be repurposed and installed on the Zumwalt-class at considerably lower total cost than procuring entirely new radar suites.6 The decision to salvage these components represents pragmatic asset stewardship in an environment of fiscal constraint.

Technical and Programmatic Feasibility

Raytheon officials have expressed confidence in the technical feasibility of the SPY-6 backfit. Jennifer Gauthier, Vice President of Naval Systems & Sustainment at Raytheon, stated in an interview conducted in Tokyo in May 2026 that "we are currently in discussions with the U.S. Navy and nothing has been decided," while elaborating on Raytheon's ongoing development efforts. Importantly, she confirmed that Raytheon had received Navy funding for development work on the Zumwalt combat management system specifically intended to enable SPY-6 integration, and that the company had established "the first certified, classified software factory for Zumwalt" enabling rapid, secure software uploads to the ships without the extended procurement and testing cycles traditionally required for fleet updates.6

From a physical integration perspective, the SPY-6(V)3 is dimensionally comparable to the incumbent SPY-3. Tobin noted that the SPY-3 is "roughly comparable in size" to the SPY-6(V)3 configuration of nine RMAs, suggesting that physical installation would not require substantial deckhouse modifications or structural rework.6 This point is significant; the Zumwalt-class composite deckhouse is one of the ship's most complex and costly structural elements, and any extensive modification would substantially increase backfit cost and risk schedule slippage.

The Navy has signaled its commitment to the modernization path through concrete funding actions. On 20 April 2026, the Navy awarded Raytheon a $213.4 million contract modification for continuation of Zumwalt-class combat system integration, modernization, installation, testing, and sustainment through 2027.8 This funding supports development activities intended to prepare the ships for future upgrades and demonstrates sustained naval commitment to keeping the Zumwalts at an acceptable combat readiness level throughout their operational lifespans.

Strategic and Doctrinal Implications

The SPY-6 backfit should not be viewed in isolation, but rather as a single element in a comprehensive effort to transform the Zumwalt-class from an aberrant platform pursuing a failed operational concept into an integrated member of the twenty-first-century fleet. The combination of hypersonic strike capability (via CPS), improved air and missile defense (via SPY-6), undersea warfare integration (via SQQ-89), modern electronic warfare systems (via SEWIP), and network-centric capability (via CEC) would position the Zumwalt-class as a formidable multi-mission platform capable of fulfilling strike, air defense, and information-warfare roles across the operational spectrum.

The three Zumwalt-class destroyers are expected to remain in service for decades. Without modernization, they would become increasingly obsolete, representing a diminishing return on the $32 billion invested in the class's research, development, and construction. The ZEUS program, including the SPY-6 backfit, represents the most cost-effective path to preserving their relevance and utility within the constrained fiscal environment the Navy now inhabits.

Critically, the SPY-6 backfit would enhance the Navy's ability to operate in contested environments. The improved detection range, ballistic missile defense capability, and network-centric integration afforded by the SPY-6 would significantly increase the Zumwalts' survivability in scenarios involving near-peer competitors equipped with advanced antiship cruise missiles and ballistic-missile threats. In the context of potential Pacific operations against peer adversaries, every incremental improvement in sensor capability and defensive integration carries strategic weight.

Remaining Uncertainties and Next Steps

No final decision has yet been made regarding the SPY-6 backfit. Navy officials and Raytheon representatives alike characterize the current phase as one of active dialogue and development work, with no commitment to proceed. Several factors will likely influence the Navy's ultimate decision: the outcome of ongoing ZEUS integration testing and combat management system development; the final cost estimate for the backfit across three ships; schedule implications relative to other competing modernization priorities; and the availability of repurposed SPY-6(V)3 arrays from the cancelled Constellation-class program as the Navy completes those two remaining frigates and assesses its actual surplus inventory.

The Congressional Research Service and Government Accountability Office will likely scrutinize any decision to proceed, particularly given Congress's longstanding concerns over Zumwalt-class cost overruns and program management. The Navy will need to make a compelling case that the SPY-6 backfit represents a prudent investment in fleet readiness rather than merely throwing additional resources at a historically troubled program.

The most likely scenario involves a phased approach, with USS Zumwalt receiving the initial SPY-6 installation during a future maintenance availability, followed by USS Lyndon B. Johnson and USS Michael Monsoor in subsequent modernization periods. This approach would allow the Navy to validate integration, test operational employment, and refine procedural and training requirements while preserving the ability to adjust subsequent installations based on lessons learned.

Conclusion

The potential backfit of AN/SPY-6 radar systems to the Zumwalt-class destroyers represents a pragmatic response to both technical shortcomings in the original design and fiscal realities that preclude procuring entirely new sensor suites. By leveraging surplus systems from a cancelled competitor program, the Navy can modernize three aging platforms at a fraction of the cost of new-build radar integration. The technical feasibility appears sound, contractor development efforts are well underway with Navy financial support, and senior Navy officials have expressed optimism regarding the modernization path.

What began as a technology-demonstrator for naval gun fire support has been progressively transformed—first into a littoral surface fire support platform, then into a hypersonic strike destroyer, and now potentially into a network-integrated multi-mission combatant capable of holding its own within the modern fleet. The SPY-6 backfit, if approved and executed successfully, would represent the final critical piece of that transformation, converting the troubled Zumwalt-class from a symbol of failed innovation into a capable platform suited to contemporary naval warfare. Whether the Navy ultimately commits to the backfit will reveal much about the service's willingness to invest in the long-term modernization of existing platforms rather than perpetually pursuing new starts.

A Final Irony

The arc of the Zumwalt-class offers a cautionary lesson in defense acquisition. The original concept—a gun-armed littoral strike platform with stealth features—was sound in theory but ultimately unsustainable: the Long-Range Land-Attack Projectile proved economically ruinous, the gun-centric mission concept lost political support, and only three ships were ever built instead of the planned 32. What the Navy is now contemplating—a long-range hypersonic strike destroyer with SPY-6 air defense and network integration—bears almost no resemblance to what was originally approved.

Yet after fifteen years and more than $32 billion in sunk costs, after multiple complete mission redesigns, and after stripping out systems and bolting on others, the Zumwalts may finally achieve utility as capable twenty-first-century surface combatants. The tragic irony is that three Flight III Arleigh Burke-class destroyers, equipped with SPY-6 and conventional strike missiles from their inception, would have cost considerably less and delivered equivalent or superior capability a decade earlier. The Navy has, in effect, paid dearly for a protracted learning experience conducted aboard billion-dollar warships.

If the SPY-6 backfit proceeds and succeeds, the Zumwalts will vindicate themselves not through faithfulness to their original design concept, but through the Navy's willingness to change course fundamentally and repeatedly until three troubled platforms finally become genuinely useful assets. That is a hard-won but valuable lesson for an institution that struggles with admitting error and altering course.

Verified Sources

1. Zumwalt-class destroyer. Wikipedia. Retrieved May 2026. https://en.wikipedia.org/wiki/Zumwalt-class_destroyer
2. Inaba, Yoshihiro. "Zumwalt-class destroyers may receive SPY-6 radars from frigates." Naval News, May 5-6, 2026. https://www.navalnews.com/naval-news/2026/05/zumwalt-class-destroyers-may-receive-spy-6-radars-from-frigates/
3. "USS Zumwalt to put to Sea in 2026 without main gun systems." Naval News, January 15, 2026. https://www.navalnews.com/naval-news/2026/01/uss-zumwalt-to-put-to-sea-in-2026-without-main-gun-systems/
4. "The Navy's Futuristic $8 Billion Stealth 'Battleship' Slips Out of Port with Brand New Mach 5 Hypersonic Weapons Canisters." National Security Journal, April 29, 2026. https://nationalsecurityjournal.org/the-navys-futuristic-8-billion-stealth-battleship-slips-out-of-port-with-brand-new-mach-5-hypersonic-weapons-canisters/
5. Lemoine, William. "First Look At Stealth Destroyer's Hypersonic Missile Launchers." The War Zone, January 16, 2025. https://www.twz.com/sea/first-look-at-stealth-destroyers-hypersonic-missile-launchers
6. Inaba, Yoshihiro. "Zumwalt-class destroyers may receive SPY-6 radars from frigates." Naval News, May 2026. (Primary source for Raytheon executive interviews and RFI timeline.) https://www.navalnews.com/naval-news/2026/05/zumwalt-class-destroyers-may-receive-spy-6-radars-from-frigates/
7. "Repurposing the US Navy's Zumwalt-class destroyers with hypersonic strike capability." Navy Lookout, August 21, 2025. https://www.navylookout.com/repurposing-the-us-navys-zumwalt-class-destroyers-with-hypersonic-strike-capability/
8. "U.S. Navy Considers Replacing Zumwalt-Class SPY-3 Radars with SPY-6 from Cancelled Frigate Program." The Defense News, May 1, 2026. https://www.thedefensenews.com/news-details/US-Navy-Considers-Replacing-Zumwalt-Class-SPY-3-Radars-with-SPY-6-from-Cancelled-Frigate-Program/
9. AN/SPY-3. Wikipedia. Retrieved May 2026. https://en.wikipedia.org/wiki/AN/SPY-3
10. "Dual Band Radar Swapped Out In New Carriers." Defense News, March 17, 2015. https://www.defensenews.com/naval/2015/03/17/dual-band-radar-swapped-out-in-new-carriers/
11. AN/SPY-6. Wikipedia. Retrieved May 2026. https://en.wikipedia.org/wiki/AN/SPY-6
12. "The Navy a Hypersonic Plan to Save the Stealth Zumwalt-Class Destroyers." National Security Journal, September 5, 2025. https://nationalsecurityjournal.org/the-navy-a-hypersonic-plan-to-save-the-stealth-zumwalt-class-destroyers/
13. Constellation-class frigate. Wikipedia. Retrieved May 2026. https://en.wikipedia.org/wiki/Constellation-class_frigate
14. LaGrone, Sam. "Navy Cancels Constellation-class Frigate Program." USNI News, November 25, 2025. https://news.usni.org/2025/11/25/navy-cancels-constellation-class-frigate-program-considering-new-small-surface-combatants
15. Navy Constellation (FFG-62) and FF(X) Class Frigate Programs: Background and Issues for Congress. Congressional Research Service (R44972), March 16, 2026. https://www.congress.gov/crs-product/R44972
16. "The US Navy Just Scuttled the Constellation-Class Frigate Program." The National Interest, November 26, 2025. https://nationalinterest.org/blog/buzz/us-navy-just-scuttled-constellation-class-frigate-program-ps-112625
17. "U.S. Navy retains first six Constellation-class frigates in FY2026 budget to strengthen fleet coverage." Army Recognition, July 7, 2025. https://www.armyrecognition.com/news/navy-news/2025/us-navy-retains-first-six-constellation-class-frigates-in-fy2026-budget-to-strengthen-fleet-coverage

 

The Claude Token Efficiency Playbook - Get More for your Money

 


25 Token Optimization Techniques: Quick Reference

High-Impact Strategies (40%+ savings)

  1. Replace PDFs with Markdown – Convert PDFs to Google Docs, export as .md. Saves 85–90% vs raw PDF.
  2. Use Projects for Shared Files – Upload once, reference across multiple chats. Saves 80%.
  3. Batch Tasks Into One Message – Ask 3 things at once instead of 3 messages. Saves 56%.
  4. Trim Personal Context to <2K Words – Bloated context files waste 10% of every conversation. Saves 70%.
  5. Compress Intermediate Outputs – After analysis, summarize into bullet points for reference in follow-ups. Saves 67%.
  6. Batch Similar Queries with Caching – Ask all related questions about one system in one chat. Saves 87.5%.

Medium-Impact Strategies (25–40% savings)

  7. Right-Size Models – Use Haiku for simple tasks, Sonnet for standard work, Opus only for deep reasoning. Saves 50% when applied systematically.
  8. Write Short Prompts (<30 words) – Brief, clear prompts reduce re-read overhead. Saves 33%.
  9. Specify Output Format Upfront – "JSON table with columns X, Y, Z" prevents reformatting requests. Saves 60%.
  10. Show Your Thinking First – Ask Claude to self-critique in initial response, reducing revision cycles. Saves 40–50%.
  11. Specify Constraints Upfront – "Under 500 words," "3 bullet points," "1-paragraph summary" prevents scope creep. Saves 73%.
  12. Use Checkpoints Every 5–7 Messages – Ask "Are we on track?" in complex conversations to catch wrong paths early. Saves 83%.
  13. Edit Instead of Correcting – Click Edit on your message, fix it, regenerate. Don't stack "Actually, I meant…" messages.
  14. Use New Chats for Different Topics – One topic per chat. Separate chats avoid re-reading irrelevant context. Saves 40% in multi-topic conversations.
  15. Disable Tools by Default – Tools add 200–400 token overhead per exchange even when unused.
  16. Restart Conversations Every 15–20 Messages – Long conversations accumulate re-read overhead. Saves 55%.
  17. Search Before Asking – Use conversation search to find past solutions. Saves 67–75% if found.
  18. Crop Screenshots Tightly – Crop to only the relevant portion. Full screenshot = 1,300 tokens; tight crop = 50 tokens.
  19. Chain Tasks in One Message – "Analyze this data, then write a summary from your analysis" instead of separate messages. Saves 30%.
  20. Use "Assume You Know" References – After establishing context, reference it: "Assume you know the CONVERGE-01 trial from earlier." Saves 75%.
  21. Use Negative Constraints – "Explain without covering basics I already know" is clearer than restating what you know.
  22. Outline Mode Before Full Detail – Ask for pseudocode/outline first, expand only necessary sections. Saves 20–40%.
  23. Pre-Process Data Externally – Clean data before uploading (Excel, Python). Saves 60–75%.
  24. Build Conditional Templates – Create reusable templates with [IF: condition] sections for different use cases. Saves 40–50%.
  25. Project Status Summaries – At session end, ask Claude to write a status summary. Paste it next session instead of re-explaining. Saves 72%.

Implementation Roadmap

Week 1: Quick Wins (Save ~30%)

  • Technique 7: Right-size models
  • Technique 8: Write shorter prompts
  • Technique 18: Crop screenshots

Week 2: Process Changes (Additional 20%)

  • Technique 3: Batch tasks
  • Technique 14: Separate chats for topics
  • Technique 13: Edit instead of correcting

Week 3: Structural Setup (Additional 25%)

  • Technique 1: PDF → Markdown
  • Technique 2: Projects for shared files
  • Technique 4: Trim personal context

Week 4: Advanced Optimization (Additional 15%)

  • Technique 15: Tool management
  • Technique 24: Prompt templates with conditions
  • Technique 16: Restart long conversations

Expected total improvement: 70–80% token efficiency gain


The Core Principle

Token efficiency is a systems problem, not a single-query problem. Efficient workflows:

  • Build around one Project per major work area (IPCSG research, technical analysis, civic policy)
  • Use persistent templates and shared files across chats
  • Create continuity with status summaries and checkpoints
  • Batch related work together to leverage prompt caching

Individual tips help. But combining them into a system-level workflow is what really multiplies savings across months of work.


Quick Wins Summary

| Technique | Savings | Effort |
|-----------|---------|--------|
| Replace PDFs with .md | 85–90% | Low |
| Use Projects | 80% | Low |
| Batch tasks | 56% | Low |
| Right-size models | 50% | Medium |
| Trim context | 70% | Medium |
| Short prompts | 33% | Low |
| Compress outputs | 67% | Low |
| Checkpoints | 83% | Low |
| Batch with caching | 87.5% | Medium |
| Pre-process data | 60–75% | Low |
Start with the "Low Effort" column. You'll hit 50%+ savings in Week 1.

Techniques to Stop Hitting Claude's Limits: Details

Claude's token limits aren't arbitrary walls—they're guardrails that force discipline. Every token you waste on redundant uploads, verbose prompts, or context bloat is a token stolen from actual work. This article translates raw optimization techniques into a workflow that scales.

The Problem: How Users Burn Tokens

Most Claude users operate at 30–50% efficiency. A 200K token limit sounds generous until you realize:


  • A 10-page PDF = 15,000–30,000 tokens gone before you type anything

  • A 400-word prompt gets re-read 20+ times across a conversation

  • Three sequential messages force Claude to re-tokenize the entire history three times

  • A single bloated personal context file (20K words) loads into every session

  • Tools left enabled burn tokens on every exchange, even when unused


For teams, this compounds catastrophically. One poorly optimized workflow × 50 users × 20 chats/month = token hemorrhage that looks like a feature problem when it's actually a process problem.



Technique 1: Replace PDFs with Markdown via Google Docs

The Problem: PDFs are opaque to token counting. A single page burns 1,500–3,000 tokens depending on layout complexity, images, and formatting. A 20-page technical document = 30,000–60,000 tokens before analysis begins.


The Solution


  1. Paste PDF text into a Google Doc

  2. Clean up formatting (remove headers, footers, duplicate spacing)

  3. Download as .md

  4. Upload the markdown file


Token Cost Comparison


  • PDF (20 pages): 30,000–60,000 tokens

  • Markdown equivalent: 3,000–5,000 tokens

  • Savings: 85–90%


Why It Works: Markdown is plain text, so nearly every token Claude processes carries actual document content. PDFs include invisible rendering information, font metadata, and positioning data that all get tokenized. Google Docs' export strips that noise.
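
The comparison above is simple arithmetic, sketched below. The per-page constants are the illustrative figures quoted in this section, not measured tokenizer output, and `upload_cost` is a hypothetical helper name:

```python
# Rough token-cost estimate for PDF vs. Markdown uploads, using the
# per-page ranges quoted above. Purely illustrative; real counts vary
# with layout complexity and the tokenizer's behavior.

PDF_TOKENS_PER_PAGE = (1_500, 3_000)  # rendering info, fonts, positioning included
MD_TOKENS_PER_PAGE = (150, 250)       # plain text only

def upload_cost(pages: int, per_page: tuple[int, int]) -> tuple[int, int]:
    """Return a (low, high) token estimate for a document of `pages` pages."""
    low, high = per_page
    return pages * low, pages * high

pdf_low, pdf_high = upload_cost(20, PDF_TOKENS_PER_PAGE)
md_low, md_high = upload_cost(20, MD_TOKENS_PER_PAGE)

print(f"PDF:      {pdf_low:,}-{pdf_high:,} tokens")   # → PDF:      30,000-60,000 tokens
print(f"Markdown: {md_low:,}-{md_high:,} tokens")     # → Markdown: 3,000-5,000 tokens
print(f"Savings:  ~{1 - (md_low + md_high) / (pdf_low + pdf_high):.0%}")  # midpoint savings
```

The midpoint savings comes out near 90%, consistent with the 85–90% range claimed earlier.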


When to Use This


  • Technical reports, whitepapers, research papers

  • Legal documents, contracts, policy briefs

  • Any document longer than 3 pages

  • Documents with complex formatting or images


When Not To


  • Documents requiring exact visual layout (posters, forms with specific spacing)

  • Scanned PDFs (use OCR first, then convert)

  • Single-page quick references (just copy-paste text directly)



Technique 2: Right-Size the Model for the Task

The Problem: Opus costs 5x more per token than Haiku and 3x more than Sonnet. Using Opus for summarization or simple coding is like hiring a surgeon to check your blood pressure.


Model Economics

| Task | Right Choice | Why |
|------|--------------|-----|
| Summarize a document | Haiku | 90% accuracy, 1/5 cost |
| Write a simple script | Sonnet | Handles most coding, 1/3 Opus cost |
| Debug complex reasoning | Opus | Deep chains need depth |
| Brainstorm ideas | Haiku | Ideation doesn't need reasoning depth |
| Multi-step analysis | Opus | Benefits from extended reasoning |
| Customer service reply | Haiku | Template matching, not reasoning |


Decision Tree


  • Does this task require multi-step reasoning across 5+ inference steps? → Opus

  • Does it need deep technical knowledge but straightforward logic? → Sonnet

  • Is it straightforward task execution? → Haiku


Token Budget Impact: A team running 50 daily chats:


  • All Opus: 50 × 3,000 tokens/chat = 150,000 tokens (expensive baseline)

  • Right-sized mix (60% Haiku, 30% Sonnet, 10% Opus): 50 × 1,500 tokens avg = 75,000 tokens

  • Daily savings: 75,000 tokens (50% of budget)
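
A cost-weighted sketch of that arithmetic follows. The 1/5 and 1/3 weights come from the cost ratios quoted above; `effective_tokens` is a hypothetical helper, and the exact savings depends on the mix you assume, so treat the output as illustrative rather than a benchmark:

```python
# Opus-equivalent daily budget for a 50-chat team under different model mixes.
# Weights treat one Opus token as the unit: Sonnet ~1/3, Haiku ~1/5 of Opus
# cost, per the ratios quoted in this section. All figures are illustrative.

CHATS_PER_DAY = 50
TOKENS_PER_CHAT = 3_000
WEIGHT = {"haiku": 1 / 5, "sonnet": 1 / 3, "opus": 1.0}

def effective_tokens(mix: dict[str, float]) -> float:
    """Opus-equivalent tokens/day for a model mix (fractions summing to 1)."""
    assert abs(sum(mix.values()) - 1.0) < 1e-9, "mix fractions must sum to 1"
    blended = sum(share * WEIGHT[model] for model, share in mix.items())
    return CHATS_PER_DAY * TOKENS_PER_CHAT * blended

all_opus = effective_tokens({"haiku": 0.0, "sonnet": 0.0, "opus": 1.0})
right_sized = effective_tokens({"haiku": 0.6, "sonnet": 0.3, "opus": 0.1})

print(f"All Opus:    {all_opus:,.0f}")     # → All Opus:    150,000
print(f"Right-sized: {right_sized:,.0f}")  # → Right-sized: 48,000
print(f"Savings:     {1 - right_sized / all_opus:.0%}")
```

With these weights the blended budget actually lands below the 75,000-token figure above; the point either way is that routing most traffic to cheaper models roughly halves effective spend.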



Technique 3: Batch Tasks Into Single Messages

The Problem: Every new message forces Claude to re-read the entire conversation history before responding. Three sequential messages = three full re-reads of context.


Example: The Inefficient Way


Message 1: "Can you summarize this report?"


[Claude responds, tokens consumed]


Message 2: "Now extract the key metrics"


[Claude re-reads entire conversation + new message]


[Claude responds, tokens consumed]


Message 3: "Format those metrics as a table"


[Claude re-reads entire conversation again]


[Claude responds, tokens consumed]


Token Cost: Each message re-reads full history. With a 20-message conversation, message 21 retokenizes all 20 previous exchanges.
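
That re-read growth can be modeled with a toy cost function. The 400-token average exchange size is an assumption for illustration, and `conversation_cost` is a hypothetical name:

```python
# Toy model of re-read overhead: each reply processes its own exchange plus
# a full re-read of all prior exchanges. Figures are illustrative only.

TOKENS_PER_EXCHANGE = 400  # assumed average size of one message + reply

def conversation_cost(n_messages: int, per_exchange: int = TOKENS_PER_EXCHANGE) -> int:
    """Total tokens processed across a conversation of n messages."""
    total = 0
    for i in range(n_messages):
        total += (i + 1) * per_exchange  # history so far + the new exchange
    return total

# Three sequential asks vs. one batched message covering all three
# (the batched message would be somewhat larger, but only one re-read occurs):
print(conversation_cost(3))  # → 2400
print(conversation_cost(1))  # → 400
```

Because each message re-reads everything before it, total cost grows quadratically, n(n+1)/2 exchanges' worth, which is why message 21 retokenizes all 20 previous exchanges.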


The Efficient Way


Message 1: "Do three things:


1. Summarize this report in 3 sentences


2. Extract the top 5 metrics


3. Format those metrics as a table with columns: Metric, Value, Trend"


[Claude responds with all three outputs]


Token Savings


  • Inefficient (3 messages): 8,000 tokens (context re-read overhead included)

  • Efficient (1 message): 3,500 tokens

  • Savings: 56%


How to Batch Effectively


  1. List all tasks upfront with numbers

  2. Specify output format for each (table, list, paragraph, JSON)

  3. Set constraints (word counts, detail level) per task

  4. Use one message to capture all context needed
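
The four steps above can be sketched as a small prompt-builder. This is plain string assembly with no API calls; `batch_prompt` is a hypothetical helper name:

```python
# Build one batched message from a list of (task, output_format) pairs,
# numbering each task and stating its format inline, per the steps above.

def batch_prompt(tasks: list[tuple[str, str]]) -> str:
    """Combine tasks into a single numbered message with per-task formats."""
    lines = [f"Do {len(tasks)} things:"]
    for i, (task, fmt) in enumerate(tasks, start=1):
        lines.append(f"{i}. {task} ({fmt})")
    return "\n".join(lines)

msg = batch_prompt([
    ("Summarize this report", "3 sentences"),
    ("Extract the top 5 metrics", "bulleted list"),
    ("Format those metrics", "table with columns: Metric, Value, Trend"),
])
print(msg)
```

Sending `msg` as one message yields all three outputs with a single history read instead of three.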


When Not to Batch


  • Tasks require Claude's output from task #1 to inform task #2

  • Second task is fundamentally different in scope

  • You need to iterate on one task before moving to the next



Technique 4: Edit Instead of Stacking Corrections

The Problem: Users write a message, realize mid-reply they misspoke, and send a follow-up: "Actually, I meant…" This creates bad history that Claude must re-read forever.


Example: Poor Practice


Message 1: "Analyze this dataset with regression analysis"


[Claude responds]


Message 2: "Wait, I said regression but I meant clustering"


[Claude re-reads both messages, applies fix]


Message 3: "Also, use k-means specifically"


[Conversation now has 3 messages for what should be 1]


The Right Way


  1. Click the Edit button on your original message

  2. Fix the prompt

  3. Click Regenerate


Result: Original bad message disappears. Conversation history stays clean. No token waste on corrections.


Token Impact


  • Stack of 3 messages with corrections: 5,000 tokens (includes overhead of re-reading bad context)

  • Single edited message: 2,000 tokens

  • Savings: 60%



Technique 5: Use New Chats for New Topics

The Problem One chat drifts across 4 different topics (analyzing a dataset, then drafting an email, then brainstorming ideas, then debugging code). Claude must re-read everything above before every response.


Example: Conversation Bloat


Messages 1-5: Analyze Q3 sales data


Messages 6-10: Draft investor email (unrelated)


Messages 11-15: Brainstorm product features (unrelated)


Messages 16-20: Debug Python script (unrelated)


Message 21: New question about the Python script


[Claude must re-read all 20 previous messages, 80% of which are irrelevant]


Token Cost: Message 21 tokenizes 20 previous exchanges even though only 5 are relevant.


The Right Way


  • New topic = new chat

  • One chat = one focused problem


Token Impact


  • Bloated single chat (4 topics, 20 messages): Each new message re-reads ~8,000 tokens of irrelevant context

  • Four separate chats (5 messages each): Each new message re-reads ~2,000 tokens of relevant context

  • Savings: up to ~6,000 tokens of irrelevant re-read overhead avoided per message (roughly 75%)


Bonus: Organization becomes much easier. Your chat history is searchable and scannable.



Technique 6: Write Short, Clear Prompts (Under 30 Words)

The Problem A 400-word prompt gets re-read dozens of times across a conversation. Each follow-up question forces re-tokenization of the entire prompt.


Example: Inefficient Prompt


"I'm working on a customer support dashboard for our SaaS platform. We need to


display metrics like average response time, customer satisfaction scores, and


ticket volume trends. The interface should be mobile-responsive and include


filters for date range, department, and customer segment. We're using React


and want it to match our existing design system which uses Tailwind CSS. Can


you help me build this?"


[64 words]


Every follow-up question re-reads the entire prompt.


Efficient Version


"Build a customer support dashboard in React: metrics (response time, 


satisfaction, volume), filters (date, dept, segment), mobile-responsive, Tailwind CSS."


[19 words]


Claude asks clarifying questions if needed. You provide details only for what's unclear.


Token Cost Comparison


  • Long prompt + 10 follow-ups: Prompt gets re-read 10+ times = 3,000+ tokens of re-reads alone

  • Short prompt + 10 clarifications: Clarifications cost ~200 tokens each, total ~2,000

  • Savings: 33%


Structure for Short Prompts


  1. Action verb (Build, Analyze, Compare, Draft)

  2. Deliverable (React component, SQL query, essay outline)

  3. Key constraints (3 bullets max)

  4. Format (JSON, markdown table, code block)


When to Break This Rule


  • First message to a new chat (more context helps)

  • Highly specialized domains where brevity creates ambiguity

  • Chats where you've established context already



Technique 7: Use Projects to Share Files Across Chats

The Problem You upload the same document to 5 different chats. That document gets tokenized in full for each chat. A 10K-token document costs 50K tokens across five chats, 40K of them pure waste.


Example: Inefficient Workflow


Chat 1: Upload quarterly_report.pdf → analyze revenue


[10,000 tokens to tokenize document]


Chat 2: Upload quarterly_report.pdf → analyze expenses


[10,000 tokens to tokenize same document again]


Chat 3: Upload quarterly_report.pdf → extract metrics


[10,000 tokens to tokenize same document again]


Total: 30,000 tokens for same document


Projects Solution


  1. Create a Project called "Q3 Analysis"

  2. Upload quarterly_report.pdf once to the project

  3. Every chat in that project references the document automatically

  4. Document tokenized once, referenced in every chat


Token Cost


  • Without Projects (5 chats with same document): 50,000 tokens

  • With Projects: 10,000 tokens (document tokenized once)

  • Savings: 80%


Bonus Features


  • Team collaboration (everyone in project sees same files)

  • Shared context (no redundant uploads)

  • Better organization (related chats grouped)

  • Prompt caching (reused prompts inside projects don't re-tokenize)


Project Structure Example


Project: "San Diego Transit Analysis"


├─ Files: SANDAG_2024_data.md, MTS_budget.csv, transit_report.pdf


└─ Chats:


    ├─ "Q1 ridership analysis"


    ├─ "Budget efficiency comparison"


    ├─ "Future capacity planning"


    └─ "Funding mechanisms"


All chats reference the same files. No redundant uploads.



Technique 8: Disable Tools and Connectors When Not In Use

The Problem Tools consume tokens on every exchange, even when inactive. Web search, calculator, file operations—if enabled, Claude considers them on every response.


Token Cost of Enabled Tools Enabling 3 tools (web search, code execution, file creation) adds ~200–400 tokens overhead per exchange, even when unused.


  • 20-message chat with tools enabled: 20 × 300 = 6,000 tokens overhead

  • Same chat with tools disabled: 0 tokens overhead

  • Savings: 6,000 tokens per 20-message chat


Best Practice


  1. Disable all tools by default

  2. Enable only the specific tool(s) needed for current task

  3. Disable when task completes


Tools to Keep Disabled Most of the Time


  • Web search (enable only when asking current events)

  • Code execution (enable only during debugging/testing)

  • File creation (enable for artifact generation, disable for Q&A)

  • Connectors (enable only when accessing Gmail/Calendar/Drive)


Tools Worth Keeping On


  • Within Projects that specifically need them

  • During focused work sessions where they're used consistently



Technique 9: Restart Conversations Every 15–20 Messages

The Problem At message 25, Claude re-reads all 24 previous messages before responding. By message 50, the context window overhead becomes significant.


Context Re-Read Cost


  • Message 10: Re-read ~4,000 tokens of context

  • Message 20: Re-read ~8,000 tokens of context

  • Message 30: Re-read ~12,000 tokens of context

  • Message 50: Re-read ~20,000 tokens of context


Solution: Refresh Every 15–20 Messages


  1. At ~message 15, summarize key points in a new message: "Summary: We've analyzed X, decided on Y, next step is Z"

  2. Start a new chat with that summary as context

  3. New chat begins fresh without re-reading all history


Token Impact


  • Single 50-message conversation: ~100,000 tokens (with re-read overhead)

  • Two 25-message conversations: ~60,000 tokens (less re-read overhead)

  • Three 17-message conversations: ~45,000 tokens (minimal re-read overhead)

  • Savings: 55%
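The re-read overhead grows quadratically with conversation length, which is why splitting helps so much. A quick sketch of the arithmetic, using an assumed ~100 tokens of new context per message:

```python
def cumulative_reread(n_messages, tokens_per_message):
    """Total re-read overhead for one chat: message i re-reads the i-1 messages before it."""
    return tokens_per_message * n_messages * (n_messages - 1) // 2

one_long_chat = cumulative_reread(50, 100)     # 122,500 tokens of re-reads
two_chats = 2 * cumulative_reread(25, 100)     # 60,000
three_chats = 3 * cumulative_reread(17, 100)   # 40,800
```

The exact per-message figure is an assumption, but the shape is the point: halving conversation length cuts re-read overhead by far more than half.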


When to Restart


  • Task fundamentally shifts direction

  • Conversation length approaches 30+ messages

  • New day/session (fresh start feels cleaner)


When Not To


  • You need full context from all prior messages for current task

  • You're iterating on something that needs complete history



Technique 10: Crop Screenshots to Only Relevant Portions

The Problem Users upload full 1000×1000 pixel screenshots when a 200×300 pixel crop would work. Full screenshots tokenize at ~1,300 tokens; crops can drop below 100.


Token Cost of Screenshots


  • Full screenshot (1000×1000): ~1,300 tokens

  • Medium crop (400×400): ~200 tokens

  • Tight crop (200×200): ~50 tokens

  • Potential savings: 96%
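These estimates follow the pixel-area heuristic Anthropic documents for vision inputs, roughly one token per 750 pixels of image area; treat the exact divisor as an approximation:

```python
def estimate_image_tokens(width_px, height_px):
    """Rough vision-input cost: about one token per 750 pixels of image area."""
    return round(width_px * height_px / 750)

full = estimate_image_tokens(1000, 1000)   # ~1,333 tokens
medium = estimate_image_tokens(400, 400)   # ~213
tight = estimate_image_tokens(200, 200)    # ~53
```

Because cost scales with area, cropping both dimensions in half cuts image tokens by roughly 75%.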


Example: The Inefficient Way User pastes full desktop screenshot showing:


  • Entire taskbar

  • Application menu

  • Status bar

  • And the actual error dialog in bottom right


Claude tokenizes all of it.


The Efficient Way Crop to just the error dialog:


[Cropped to 250×150 pixels]


Claude gets the information without the noise.


Cropping Checklist


  • [ ] Remove any UI chrome (taskbars, menus) unless relevant

  • [ ] Remove whitespace margins

  • [ ] Crop to the minimal bounding box that includes the issue

  • [ ] Keep just enough context for understanding (one surrounding line/button)


Tools


  • Windows: Snip & Sketch (Win+Shift+S)

  • Mac: Cmd+Shift+4 (drag to select area)

  • Linux: Flameshot or built-in tool

  • Online: Snipping tools in browser



Technique 11: Build and Reuse Prompt Templates

The Problem Users rewrite similar prompts from scratch repeatedly. Each rewrite is slightly different, prevents caching, and burns mental energy.


Example: Inefficient Rewriting


Chat 1: "Write a technical analysis of the MQ-9B SeaGuardian focusing on 


operational range, sensor capabilities, and integration with naval systems."


Chat 2 (weeks later): "Can you analyze the GA-ASI Gambit system? I want to 


understand its operational capabilities, sensor suite, and how it fits into 


the broader defense architecture."


Chat 3 (another week): "Technical overview of the V-22 Osprey: what it does, 


what sensors it has, and how it works with other military systems."


Same structure, different words each time. Prevents caching.


Prompt Template Approach Create a template in a document:


# Technical System Analysis Template


Analyze [SYSTEM_NAME] and cover:


1. Operational range and endurance


2. Sensor suite and detection capabilities


3. Integration with broader force architecture


4. Notable operational history or incidents


5. Key limitations or known issues


Format as: summary section + detailed technical breakdown + 


comparison table with similar systems.


Reuse for every similar analysis:


Chat 1: Analyze MQ-9B SeaGuardian [use template]


Chat 2: Analyze GA-ASI Gambit [use template]


Chat 3: Analyze V-22 Osprey [use template]


Token Benefit Prompt caching (available in Projects) means repeated prompts aren't fully re-tokenized.


  • Manual rewriting each time: Each prompt re-tokenized in full

  • Template reuse in Projects: First use tokenizes template, subsequent uses get cached hit (90% cost reduction)


Creating Your Prompt Library


  1. Identify 5–10 recurring tasks (analysis, drafting, coding, summarization)

  2. Write a template for each with [VARIABLE] placeholders

  3. Store templates in a Project-level document

  4. Reuse same template structure across similar tasks
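A template like the one above can be filled programmatically so the surrounding wording stays byte-identical across uses, which is exactly what lets caching hit. The template text here is an abbreviated illustration:

```python
TEMPLATE = """Analyze {system_name} and cover:
1. Operational range and endurance
2. Sensor suite and detection capabilities
3. Integration with broader force architecture
4. Notable operational history or incidents
5. Key limitations or known issues
Format as: summary section + detailed technical breakdown +
comparison table with similar systems."""

def fill_template(system_name):
    """Substitute the placeholder; everything else in the prompt stays byte-identical."""
    return TEMPLATE.format(system_name=system_name)

analysis_prompt = fill_template("MQ-9B SeaGuardian")
```

Hand-rewriting the prompt each time produces slight wording drift; the template guarantees the only thing that changes is the variable.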


Template Examples


  • Technical analysis (systems, weapons, platforms)

  • Policy briefs (problem, current approach, alternatives, recommendation)

  • Code review (architecture, security, performance, maintainability)

  • Content drafting (outline, research questions, audience, tone)



Technique 12: Keep Personal Context Under 2,000 Words

The Problem A 20,000-word personal context file loads into every single conversation. That's roughly 20,000 tokens of overhead (treating a word as about one token, a conservative estimate) before you type your first question.


Real Impact A 20K-word context file in a 200K limit:


  • 10% of your token budget consumed by context alone

  • Every conversation starts 20K tokens in the hole

  • A 1-hour work session might be 50% context + 50% actual work


Example: Bloated Context


[USER PROFILE: 20,000 words covering]


- Entire work history (every job description)


- Complete family tree and relationships


- Full list of 50+ projects and their outcomes


- Every skill and certification


- All medical history and preferences


- Complete reading list and book summaries


- Full financial situation


- Detailed hobby list


[END: 20,000 tokens burned before starting]


The Trimmed Version


[USER PROFILE: 1,500 words covering]


- Current role: Retired Senior Engineer, radar systems


- Key expertise: Signal processing, C4ISR, AMASS


- Current projects: IPCSG advocacy, technical writing


- Key context for Claude: Prostate cancer patient-advocate, 


  uses pseudonym "Pseudo Publius" for civic policy work


- Preferences: HTML for newsletters, Markdown for analysis


[END: 1,500 tokens used for genuine context]


What to Keep in Context (1,500 words max)


  • Current professional role

  • 3–5 core skills Claude needs to know about

  • 1–2 active projects

  • Key preferences (format, tone, communication style)

  • Any ongoing work Claude should reference


What to Cut


  • Complete work history (mention only current role)

  • Family relationships unless directly relevant

  • Completed projects (list only active ones)

  • Medical details beyond "patient advocate in X field"

  • Reading lists, hobby catalogs, exhaustive skill inventories

  • Historical context that isn't actively shaping current work


How to Structure Trimmed Context


# Claude Context (Keep Under 2,000 Words)


## Professional


- Role: Retired radar systems engineer, 20+ years


- Current focus: Technical writing, IPCSG patient advocacy


- Key expertise: Signal processing, SAR/GMTI, C4ISR systems


## Active Projects


1. IPCSG newsletter (prostate cancer research translation)


2. Naval Institute-style technical analysis (defense systems)


3. San Diego civic policy research (transit, water, governance)


## Preferences for Claude


- HTML output for IPCSG content (avoid .docx)


- Markdown for technical analysis


- Cite sources for health/policy content


- Flag when content needs fact-checking


## Key Context


- Patient with 11+ years prostate cancer history


- Uses "Pseudo Publius" pseudonym for civic policy writing


- Lives in San Diego, familiar with local transit/healthcare systems


- Enrolled in CONVERGE-01 actinium-225 PSMA trial at UCSD


Token Savings


  • 20K context file: 20,000 tokens per conversation

  • 2K context file: 2,000 tokens per conversation

  • 10 conversations/week: 180,000 token savings per week

  • Monthly savings: 720,000 tokens (enough for 3-4 complex analysis projects)



Technique 13: Leverage Conversation Search Before Asking

The Problem You ask Claude a question you've already solved in a previous chat. Claude re-answers from scratch, consuming tokens for work already done.

Solution Use the conversation search tool to find past relevant chats before messaging Claude. If you find the answer, you're done (zero tokens). If not, you have context for a more targeted question.

Example Instead of: "How do I configure AMASS for multi-sensor fusion?" Search first. If found in past chat, copy the answer. If not found, ask: "I've searched my past work—it's not there. Here's what I tried last time [specific detail]. What's the next step?"

Token Cost

  • Re-asking and re-answering: 2,000–3,000 tokens
  • Searching + targeted follow-up: 500 tokens
  • Savings: 67–75%

Technique 14: Use Structured Output Formats to Reduce Back-and-Forth

The Problem You ask a question, Claude gives prose, you ask for it in table format, Claude reformats. Two exchanges for one deliverable.

Solution Specify the exact output format upfront: "JSON with keys: name, value, unit" or "Markdown table: columns are X, Y, Z" or "CSV format" or "Numbered list with 1-sentence descriptions."

Example

  • Inefficient: "List the key features of the MQ-9B"

    • Claude responds with prose paragraph
    • You: "Can you make that a table?"
    • Claude reformats (2 exchanges, 4,000+ tokens)
  • Efficient: "List MQ-9B key features as a markdown table with columns: Feature, Specification, Operational Impact"

    • Claude responds with table in one go (1 exchange, 1,500 tokens)

Token Savings: 60%
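One way to enforce the format contract is to request machine-readable output and validate it on receipt. The prompt wording, keys, and sample reply below are illustrative:

```python
import json

PROMPT = ('List MQ-9B key features as JSON only: an array of objects with keys '
          '"feature", "specification", "operational_impact". No prose.')

def parse_features(reply):
    """Fail fast if the reply did not follow the requested schema."""
    features = json.loads(reply)
    required = {"feature", "specification", "operational_impact"}
    assert all(required <= set(item) for item in features), "schema not followed"
    return features

# Simulated reply for illustration; real values come from the model.
reply = ('[{"feature": "Endurance", "specification": "30+ hours", '
         '"operational_impact": "persistent maritime ISR"}]')
features = parse_features(reply)
```

Specifying the schema upfront turns a two-exchange reformat cycle into one validated pass.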


Technique 15: Use Claude's "Drafts" or Internal Reasoning to Reduce Revisions

The Problem You ask Claude to write something, it's close but needs tweaks, you ask for revision, Claude rewrites. One task becomes three messages.

Solution In the initial prompt, ask Claude to "show your thinking first" or "provide a draft + notes on what could improve it." Claude self-critiques, reducing revision cycles.

Example

  • Inefficient: "Write a technical brief on the Constellation-class frigate cancellation"

    • Claude writes
    • You: "It's good but needs more detail on cost overruns"
    • Claude revises (2 exchanges minimum)
  • Efficient: "Write a technical brief on the Constellation-class frigate cancellation. Include: executive summary, cost breakdown, timeline of delays, political context. Flag any sections that feel weak or incomplete."

    • Claude writes with self-critique built in
    • You get a more complete product on first try (1 exchange)

Token Savings: 40–50% (fewer revision cycles)


Technique 16: Reuse Outputs as Inputs (Chaining Without Re-Prompting)

The Problem You ask Claude to analyze data, then ask it to write a summary of that analysis. Claude re-reads both the original data and its analysis.

Solution When Claude produces output you'll use as input for another task, say so upfront: "Analyze this data, then use your analysis to draft a one-paragraph summary."

This chains tasks in a single message, avoiding re-reads.

Example

  • Inefficient:

    • Message 1: "Analyze Q3 sales by region" (Claude analyzes)
    • Message 2: "Summarize that analysis for a board memo" (Claude re-reads data + analysis)
    • (2,500 + 2,500 = 5,000 tokens)
  • Efficient:

    • Message 1: "Analyze Q3 sales by region, then summarize findings in one paragraph for a board memo"
    • (3,500 tokens, single pass)

Token Savings: 30%


Technique 17: Specify Constraint Limits Upfront to Avoid Scope Creep

The Problem You ask for "an analysis," Claude writes 2,000 words because no constraint was given. You then ask "can you make it shorter?" Claude re-writes. Wasted tokens.

Solution Always specify: word count, depth level, or section count upfront.

Examples

  • "Summarize in 3 sentences"
  • "Brief analysis (under 500 words)"
  • "Outline only—no prose, just 5 bullet points per section"
  • "Executive summary format: 1 page max"

Token Impact

  • Unconstrained ask: 3,500 tokens → You request trim → 2,000 tokens for re-work (5,500 total)
  • Constrained ask: 1,500 tokens (right size on first try)
  • Savings: 73%

Technique 18: Cache Repeated Context by Using "Assume You Know" Statements

The Problem Every time you chat, you re-explain your domain, your current project, or your constraints.

Solution Once you've established context in a chat, use "assume you know" statements to avoid re-explaining:

  • "Assume you know the San Diego MTS budget structure from our earlier discussion"
  • "Assume you're familiar with the CONVERGE-01 trial protocol"
  • "Assume you know the PIRAN radiation belt software context"

This signals Claude to reference prior exchanges without restating everything.

Token Cost

  • Restating context every time: 800+ tokens per message
  • Using "assume" reference: 200 tokens per message
  • Savings: 75%

Technique 19: Use Negative Constraints (What NOT to Include)

The Problem Specifying everything you want is often harder than specifying what to leave out. "Don't explain basic concepts I already know" trims output more reliably than trying to enumerate every advanced topic you do want covered.

Solution Frame prompts with what to exclude:

  • "Analyze this without explaining what a PSMA scan is"
  • "Write the technical section without introductory material"
  • "List only the novel findings—skip anything in standard literature"

Example

  • Inefficient: "Explain the latest prostate cancer biomarkers" (Claude might explain what biomarkers are, burning tokens on known info)
  • Efficient: "Explain novel prostate cancer biomarkers, assuming I know what biomarkers are and how standard testing works"

Token Savings: 20–30%


Technique 20: Compress Intermediate Outputs via Summarization Prompts

The Problem You ask Claude to do a deep analysis (3,000 tokens), then ask questions about it. Claude must re-read the full analysis for each question.

Solution After the analysis, immediately ask Claude to produce a "compressed summary for reference." You then use the summary for follow-ups, not the full analysis.

Example

  • Message 1: "Analyze the 50-page NTSB docket on the LaGuardia collision" (3,000 tokens)
  • Message 2: "Compress that into a 10-point summary I can reference for follow-ups" (500 tokens)
  • Messages 3+: Ask questions referencing the summary, not the original analysis (saves 2,000+ tokens per follow-up)

Token Impact

  • With compression: Original (3,000) + summary (500) + 5 follow-ups using summary (2,500) = 6,000 total
  • Without compression: Original (3,000) + 5 follow-ups re-reading full analysis (15,000) = 18,000 total
  • Savings: 67%

Technique 21: Use Pseudocode or Outline Mode for Complex Tasks

The Problem You ask Claude to solve a complex problem in full detail. It writes long explanations. You ask for just the outline. It re-writes.

Solution Ask for "pseudocode" or "outline mode" first, then expand only sections you need.

Example

  • Inefficient: "Help me design a system for analyzing satellite megaconstellation fragmentation" (Claude writes 2,000-word design doc)
  • Efficient:
    • Message 1: "Outline only: system architecture for analyzing satellite megaconstellation fragmentation" (Claude: 300-word outline)
    • Message 2: "Expand section 3 (data pipeline) to full technical detail"
    • (1,000 + 1,500 = 2,500 vs 2,000 + potential revisions)

Token Savings: 20–40% (you pay only for sections you need)


Technique 22: Pre-Process Data Externally Before Uploading

The Problem You upload raw messy data (10K tokens), Claude cleans it, then you ask questions. Claude must re-read the messy + cleaned data.

Solution Clean/process data before uploading. Use a spreadsheet tool, Python script, or other lightweight processing first.

Example

  • Inefficient: Upload 500-row CSV with duplicates, formatting issues, irrelevant columns (8,000 tokens) → Claude cleans and analyzes
  • Efficient: Clean in Excel/Python locally (2 min, no tokens) → Upload cleaned 200-row CSV (2,000 tokens) → Claude analyzes

Token Savings: 60–75%
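A stdlib-only sketch of that local cleanup step: drop duplicates and irrelevant columns before upload. Column names and data are made up:

```python
import csv
import io

def clean_csv(f_in, f_out, keep_columns):
    """Drop duplicate rows and irrelevant columns before uploading."""
    reader = csv.DictReader(f_in)
    writer = csv.DictWriter(f_out, fieldnames=keep_columns)
    writer.writeheader()
    seen = set()
    for row in reader:
        trimmed = tuple(row[col] for col in keep_columns)
        if trimmed not in seen:  # dedupe on the kept columns only
            seen.add(trimmed)
            writer.writerow(dict(zip(keep_columns, trimmed)))

raw = "id,name,notes\n1,alpha,keep\n1,alpha,dup\n2,beta,keep\n"
cleaned = io.StringIO()
clean_csv(io.StringIO(raw), cleaned, ["id", "name"])
```

Two minutes of local processing costs zero tokens; every row and column you delete here is a row and column Claude never has to tokenize.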


Technique 23: Use Checkpoints: "Are We On Track?" Mid-Conversation

The Problem You work through a complex analysis with Claude, go down the wrong path for 10 messages, then realize the approach is wrong. All 10 messages must be re-read going forward.

Solution Every 5–7 messages in complex tasks, insert a checkpoint: "Summarize progress so far and confirm we're on the right track before continuing."

If wrong track, you catch it early. If right track, you've created a compressed summary for future reference.

Token Cost

  • Wrong path after 10 messages: Wasted 6,000 tokens, plus future re-reads (12,000 total over conversation)
  • Checkpoint at message 5: Catch early, save 10 messages of wasted work (10,000 tokens)
  • Savings: 83%

Technique 24: Leverage Templates with Conditional Sections

The Problem (Similar to #11, but more sophisticated) You have a template, but different uses require different sections. You still re-write parts.

Solution Build templates with conditional markers. Example:

# Technical Analysis Template

## Executive Summary (always)
[1 paragraph]

## [IF: System is military] Operational History
[relevant section]

## [IF: System has sensors] Sensor Capabilities
[relevant section]

## Key Metrics (always)
[data table]

## [IF: System is controversial] Safety/Incident History
[relevant section]

When reusing, you fill only the sections relevant to the specific system.

Token Benefit

  • Manual rewriting each time: 100% re-tokenization
  • Template with conditionals: Reusable frame (cached) + conditional sections only
  • Savings: 40–50% on repeated similar analyses
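A conditional template is simple to render in code. The section names below mirror the example template; the flag names are assumptions:

```python
SECTIONS = [
    ("Executive Summary", None),                   # always included
    ("Operational History", "military"),           # only for military systems
    ("Sensor Capabilities", "has_sensors"),
    ("Key Metrics", None),                         # always included
    ("Safety/Incident History", "controversial"),
]

def render_outline(flags):
    """Keep a section if it is unconditional or its flag is set."""
    return [name for name, flag in SECTIONS if flag is None or flag in flags]

outline = render_outline({"military", "has_sensors"})
```

The fixed frame stays identical (and cacheable); only the flagged sections vary per system.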

25. Use "Status Check" Outputs for Ongoing Projects

The Problem You're working on a multi-week project. Each new chat, you brief Claude on what's been done. That briefing is always ~1,000 tokens.

Solution At the end of each session, ask Claude to generate a "project status summary" (500 words). Start next chat by pasting that summary instead of re-explaining.

Example After session 1 on IPCSG newsletter research:

  • You: "Create a 300-word status summary for my next session: what we've covered, what's pending, open questions"
  • Claude: [Status summary] (800 tokens)

Next session:

  • You: "Here's the status from last session: [paste]. Continue with the next section on ADT cardiovascular risk"
  • Claude: (Uses summary, no re-explanation needed)

Token Savings

  • Re-explaining each session: 1,000 tokens/session × 10 sessions = 10,000 tokens
  • Status summary approach: 800 + 200×10 = 2,800 tokens
  • Savings: 72%

Technique 26: Batch Similar Queries to Use Prompt Caching

The Problem You ask 10 different questions about the same system (MQ-9B). Each question re-reads the full context.

Solution In Projects, ask all related questions about the same system in one session before moving to a new system. Prompt caching means the context gets tokenized once, reused for all questions.

Example

  • Chat 1: "Answer all questions about MQ-9B SeaGuardian" + [list 10 questions]
    • Questions about same context leverage caching
  • Chat 2 (different day, same project): Ask 10 questions about Gambit CCA
    • New context, but again leveraging caching within session

Token Benefit

  • Separate chats for each question: 10 questions × 2,000 tokens = 20,000 tokens
  • Batched in one chat with caching: 2,000 (context) + 500 (questions) = 2,500 tokens
  • Savings: 87.5%
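For API users, the same idea is explicit: prompt caching is opted into per content block. This sketch builds the request shape for the Anthropic Messages API with `cache_control` on the shared system context; the model name is a placeholder and no network call is made:

```python
def build_cached_request(shared_context, questions):
    """Request body with the shared context marked cacheable, so repeated
    calls in the cache window reuse its tokenization."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return {
        "model": "claude-sonnet-4-5",  # placeholder model name
        "max_tokens": 1024,
        "system": [{
            "type": "text",
            "text": shared_context,
            "cache_control": {"type": "ephemeral"},  # mark block for prompt caching
        }],
        "messages": [{"role": "user", "content": numbered}],
    }

request = build_cached_request(
    "MQ-9B SeaGuardian background material goes here.",
    ["What is its endurance?", "What sensors does it carry?"],
)
```

Subsequent calls that reuse the identical cached prefix pay a small fraction of the input cost for it, which is the API-side analogue of batching questions inside one Project chat.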

Summary: All 26 Techniques by Impact

Highest Impact (40%+ savings each)

  • #1: Replace PDFs with markdown (85–90%)
  • #7: Use Projects to avoid redundant uploads (80%)
  • #3: Batch tasks (56%)
  • #12: Trim personal context (70% when compounded)
  • #20: Compress intermediate outputs (67%)
  • #26: Batch similar queries with caching (87.5%)

High Impact (25–40% savings)

  • #2: Right-size models (50% when applied systematically)
  • #6: Short prompts (33%)
  • #10: Crop screenshots (96% but narrow use case)
  • #15: Show thinking first (40–50%)
  • #17: Specify constraints upfront (73%)
  • #23: Checkpoints (83%)

Medium Impact (15–25% savings)

  • #4: Edit instead of stacking (60% but single-message impact)
  • #5: New chats for new topics (40% for multi-topic conversations)
  • #8: Disable tools (varies by tool usage)
  • #9: Restart conversations (55% but only if you hit 50+ messages)
  • #13: Search before asking (67% but only if found)
  • #14: Specify output format (60% but narrow use case)
  • #16: Chain tasks (30%)
  • #18: "Assume you know" statements (75% but only after context established)
  • #19: Negative constraints (20–30%)
  • #21: Pseudocode mode (20–40%)
  • #22: Pre-process data (60–75% but narrow use case)
  • #24: Conditional templates (40–50%)
  • #25: Status summaries (72%)

The Strategic Layer: What Most People Miss

Beyond these 26 techniques, there's one meta-insight:

Token efficiency is a systems problem, not a tips-and-tricks problem.

Most users treat Claude like a search engine: ask question, get answer, move on. That model inherently wastes tokens because there's no continuity.

Efficient users treat Claude like a long-term collaborator:

  • One Project per major body of work (IPCSG research, Naval analysis, San Diego civic work)
  • Persistent templates, reusable context, shared files
  • Conversations that build on each other (status summaries, checkpoints)
  • Clear handoffs between sessions (summary → next session → summary)

When you operate at the "system" level instead of the "single query" level, all 26 techniques compound. You're not just saving tokens on individual exchanges—you're building workflows that stay efficient across months.

That's the real win.


Integrated Workflow: Putting It All Together

Here's how the core techniques work together in practice:

Scenario: Research and Write a Technical Analysis

The Bloated Approach


1. Upload raw 15-page PDF (30,000 tokens)


2. Write 300-word prompt with every detail (prompt re-read 15+ times)


3. Send message → realize you need more info → "Actually, I meant…" 


   (correction stacking)


4. Ask 3 follow-ups in separate messages (context re-read x3)


5. Use Opus for summarization (wrong model)


6. Keep web search enabled (unused, 300 token overhead)


7. Two weeks later, use same PDF in new chat (30,000 tokens again)


8. Maintain 35-message conversation (10,000 tokens re-read overhead)


Total waste: ~95,000 tokens


The Optimized Approach


1. Project: "Technical Analysis" (upload PDF once as .md)


   [3,000 tokens vs 30,000]


2. Short prompt (25 words, edited before sending)


   "Analyze MQ-9B SeaGuardian: range, sensors, naval integration, 


    format as summary + technical breakdown + comparison table"


   [Prompt re-read cost: minimal]


3. Batch all questions into one message, use Sonnet (not Opus)


   [1/3 the cost, 90% as good]


4. Disable web search and tools (enable only if needed)


   [No overhead]


5. Reuse prompt template for future system analyses


   [Prompt caching reduces repeat cost 90%]


6. Keep conversation to 18 messages, then restart


   [Minimal re-read overhead]


Total usage: ~10,000 tokens


Savings: ~85,000 tokens (90% reduction)

Scenario: Customer Support Email + Dataset Analysis in One Session

Wrong: One chat with both tasks, tools enabled, Opus for both


Right:


Chat 1: "Draft customer support email"


- Task: Simple templating


- Model: Haiku


- Tools: Off


- Expected: 1-2 messages


Chat 2: In same Project, "Analyze Q3 customer data"


- Task: Statistical analysis


- Model: Sonnet


- Tools: Off (already have data)


- Expected: 3-4 messages


[Both reference same Project files, no redundant uploads]


[Different tasks, different chats, minimal re-reading]



The Economics: Real Savings

For Individual Users


  • Using all 26 techniques: ~60–70% token efficiency improvement

  • A 200K limit effectively becomes ~300K in actual work capacity

  • Cost savings: If paying per token, 30–40% reduction in bills


For Teams (10 users)


  • Typical: 500 chats/month, 50M tokens burned

  • Optimized: 500 chats/month, 15M tokens used

  • Monthly savings: 35M tokens

  • Billable value: Equivalent to ~$5,000–$10,000/month in unused capacity recovered


For Enterprises


  • Bloated workflows lead to:


  • Team members buying extra credits (hidden costs)

  • Unnecessary token quota expansions

  • Perceived "slowness" (actually just inefficiency)


  • Optimized workflow means:


  • Planned budgets actually cover work

  • Clear ROI on Claude investment

  • Scalability without proportional cost increase



Implementation: Start Here

You don't need to adopt all 26 techniques simultaneously. Phase in the core twelve first:


Week 1: Quick Wins (Saves ~30%)


  • Technique 2: Right-size models (Haiku for simple tasks)

  • Technique 6: Write shorter prompts

  • Technique 10: Crop screenshots


Week 2: Process Changes (Saves additional 20%)


  • Technique 3: Batch tasks into single messages

  • Technique 5: Use separate chats for different topics

  • Technique 4: Edit instead of stacking corrections


Week 3: Structural Optimization (Saves additional 25%)


  • Technique 1: Replace PDFs with markdown

  • Technique 7: Move files to Projects

  • Technique 12: Trim personal context


Week 4: Advanced Optimization (Saves additional 15%)


  • Technique 8: Disable tools by default

  • Technique 11: Build prompt templates

  • Technique 9: Restart conversations every 15–20 messages


Expected Result After 4 Weeks: 70–80% improvement in token efficiency



The Mindset Shift

Token optimization isn't about deprivation—it's about clarity. When you're forced to communicate concisely, write better prompts, and focus on one task at a time, you get better results and use fewer tokens.


Every inefficient workflow pattern masks itself as "flexibility" or "exploratory thinking." In reality, it's just waste.


The 26 techniques above are the proven guardrails. Use them, and you'll never feel constrained by Claude's limits again. The constraint becomes a feature: it forces you to think like an engineer, not just an experimenter.


Applied consistently, these techniques can make your next Claude session several times more productive at a fraction of the token cost.


