OpenAI Prompt Cache Retention 24H Cost Proof 2026
- Date: 2026-06-29 (UTC)
- Model: gpt-5.5-2026-04-23 (via EFFLOOW_OPENAI_LAB_MODEL)
- Endpoint: POST https://api.openai.com/v1/responses
- Scripts: scripts/openai-lab-run.py (small-prompt series) and scripts/openai-cache-probe.py (cache-key bursts)
- Safety boundary: prompt is a fictional, non-confidential support policy. No customer, credential, or private data.
What we measured
Within each of four series (29 calls in total) we sent the same prompt back-to-back and read usage.input_tokens_details.cached_tokens
from each response. A value > 0 means OpenAI served part of the prompt from cache (billed at the
cached-input rate, 1/10th of the normal input rate on GPT-5.5).
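The check described above can be sketched as follows, assuming the response body has already been parsed into a Python dict. The helper name `cached_tokens` is ours for illustration and is not part of any SDK; the sample payload is the usage object logged for series A, run 1.

```python
import json


def cached_tokens(usage: dict) -> int:
    """Return usage.input_tokens_details.cached_tokens, or 0 if the field is absent."""
    details = usage.get("input_tokens_details") or {}
    return int(details.get("cached_tokens", 0))


# Usage object as logged for series A, run 1.
run1_usage = json.loads(
    '{"input_tokens": 1323, "input_tokens_details": {"cached_tokens": 0}, '
    '"output_tokens": 120, "output_tokens_details": {"reasoning_tokens": 120}, '
    '"total_tokens": 1443}'
)

is_hit = cached_tokens(run1_usage) > 0
print(is_hit)  # False: this call was served without cache
```

Treating a missing field as 0 is a design choice for this sketch: it counts an absent value as a miss instead of raising an error.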
Results table
| Series | Prompt size | prompt_cache_key set | Calls | Cache hits (cached_tokens > 0) | Cached tokens on a hit |
|---|---|---|---|---|---|
| A (lab-run script) | ~1,323 input tokens | no | 7 | 0 / 7 | n/a |
| B (cache-key burst) | ~1,289 input tokens | yes | 8 | 0 / 8 | n/a |
| C (large prompt) | ~3,741 input tokens | yes (...bigprompt-v1) | 6 | 3 / 6 | 3,328 (~89% of input) |
| D (large prompt, repeat) | ~3,741 input tokens | yes (...bigprompt-v2) | 8 | 6 / 8 | 3,328 (~89% of input) |
Across series A and B (small prompt, ~1.3K tokens, just above the 1,024-token minimum for prompt caching) we got
zero cache hits in 15 identical back-to-back calls, with or without a prompt_cache_key.
In series C and D (large prompt, ~3.7K tokens) cache hits appeared: on a hit, 3,328 of 3,741 input
tokens (~89%) were served from cache. But hits were not on every call: 5 of 14 large-prompt
calls returned cached_tokens = 0. Two of those were the expected cold first call of each series;
the other three were warm-streak misses (series C runs 3 and 6, series D run 3). Cache routing is
best-effort, not guaranteed per call.
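The counts in this section can be re-derived from the results table. This is a minimal sketch: the dict layout and variable names are ours, and the only inputs are the per-series call and hit counts reported above.

```python
# (calls, hits) per series, copied from the results table.
series = {"A": (7, 0), "B": (8, 0), "C": (6, 3), "D": (8, 6)}


def misses(names):
    """Total calls with cached_tokens = 0 across the named series."""
    return sum(series[n][0] - series[n][1] for n in names)


small_calls = sum(series[n][0] for n in ("A", "B"))
large_calls = sum(series[n][0] for n in ("C", "D"))
large_misses = misses(("C", "D"))
cold_misses = 2  # one expected cold first call per large-prompt series
warm_misses = large_misses - cold_misses
hit_share = 3328 / 3741  # cached fraction of input tokens on a hit

print(small_calls, large_calls, large_misses, warm_misses, round(hit_share * 100))
# 15 14 5 3 89
```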
Cost math on a large-prompt hit (GPT-5.5 published rates: $5.00/M input, $0.50/M cached, $30/M output)
- Cold call input cost: 3,741 × $5.00/1e6 = $0.01871
- Cache-hit input cost: (413 uncached × $5.00/1e6) + (3,328 cached × $0.50/1e6) = $0.00207 + $0.00166 = $0.00373
- Input-cost reduction on a hit: ~80% ($0.01871 → $0.00373)
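The arithmetic above can be reproduced with a few lines of Python. This is a sketch under the rates quoted in this note ($5.00/M input, $0.50/M cached); the function name `input_cost` is ours, and output-token cost is deliberately left out because only the input side changes on a cache hit.

```python
INPUT_RATE = 5.00 / 1_000_000   # USD per uncached input token
CACHED_RATE = 0.50 / 1_000_000  # USD per cached input token


def input_cost(input_tokens: int, cached: int = 0) -> float:
    """Input-side cost in USD for one call, given how many tokens were cached."""
    uncached = input_tokens - cached
    return uncached * INPUT_RATE + cached * CACHED_RATE


cold = input_cost(3741)              # no cached tokens
hit = input_cost(3741, cached=3328)  # 413 uncached + 3,328 cached
reduction = 1 - hit / cold

print(f"cold={cold:.6f} hit={hit:.6f} reduction={reduction:.1%}")
```

Because the cached portion is capped at 3,328 of 3,741 tokens, the reduction lands near 80% instead of the 90% that the rate difference alone would suggest.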
Honest limitations
- This is a low-volume lab on a single org, not a production traffic test. Real cache-hit rates depend on sustained, steady traffic to the same prefix.
- We did not test 24-hour persistence directly. Confirming that a prefix is still cached 24 hours later would require a separate run a day apart; we measured mechanics and within-session hit behavior only.
- output_text is empty in five of the seven series A runs (1, 3, 5, 6, 7) and truncated in run 4, because the small max_output_tokens budget was mostly or entirely consumed by the model's reasoning tokens. We were measuring cached_tokens, not answer quality.
- A transient HTTP 500 occurred once and was retried successfully; OpenAI server errors are part of normal operation.
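The reasoning-token limitation can be checked against the usage objects in the series A run logs. The sketch below assumes that output_tokens includes reasoning_tokens, so the difference is the number of tokens left for visible text; the token pairs are copied from the seven logged runs.

```python
# (output_tokens, reasoning_tokens) for series A runs 1-7, from the run logs.
runs = [(120, 120), (113, 90), (80, 80), (80, 70), (80, 80), (80, 80), (80, 80)]

# Assumption: visible tokens = output_tokens - reasoning_tokens.
visible = [out - reasoning for out, reasoning in runs]
empty_runs = [i for i, v in enumerate(visible, start=1) if v == 0]

print(visible)     # [0, 23, 0, 10, 0, 0, 0]
print(empty_runs)  # [1, 3, 5, 6, 7]
```

The runs with zero visible tokens are the ones logged with [NO OUTPUT TEXT]; runs 2 and 4 had a small remainder and returned some text.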
Effloow Lab OpenAI API Run: openai-prompt-cache-retention-24h-cost-proof-2026
- Date: 2026-06-29T00:37:30.116153+00:00
- Model: gpt-5.5-2026-04-23
- Purpose: prompt-cache mechanics run 1: cold prefix, expect cached_tokens=0
- Request ID: dac60f0a-8355-4ef8-8280-cc0dbe658b72
- Usage: {"input_tokens": 1323, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 120, "output_tokens_details": {"reasoning_tokens": 120}, "total_tokens": 1443}
- Prompt SHA-256: 9c79f2484ec753eafc471deedeab657165c36e54e700c2b7d1149374c3ba154a
- Safety boundary: no confidential, customer, credential, or private data should be included in this run.
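The Prompt SHA-256 field lets a reader confirm that every run in a series sent a byte-identical prompt, which is a precondition for a prefix cache hit. A minimal sketch of such a fingerprint, assuming the digest is taken over the UTF-8 bytes of the full prompt string (the lab scripts' exact hashing is not shown in this note); the sample text is a hypothetical stand-in, so the digest will not match the logged value.

```python
import hashlib


def prompt_sha256(prompt: str) -> str:
    """Hex SHA-256 of the prompt's UTF-8 bytes (the encoding is an assumption)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


# Hypothetical stand-in text; the real prompt is only excerpted in this note.
prefix = "SUPPORT POLICY KNOWLEDGE BASE (stand-in text)"

same = prompt_sha256(prefix) == prompt_sha256(prefix)
changed = prompt_sha256(prefix) == prompt_sha256(prefix + " ")

print(same, changed)  # True False
```

Even a single trailing space changes the digest, so comparing digests across runs is a cheap way to rule out accidental prompt drift.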
Prompt Excerpt
SUPPORT POLICY KNOWLEDGE BASE — STABLE PREFIX FOR PROMPT CACHE TEST
You are a customer-support routing assistant for a fictional SaaS company called "Meridian Notes", a note-taking and task app. The following policy text is intentionally long and static so it can serve as a cacheable prompt prefix during an Effloow Lab measurement. None of this content is confidential. It describes only invented, illustrative policies for a fictional product.
SECTION 1 — PLANS AND BILLING
Meridian Notes offers three plans. The Free plan includes up to 200 notes, 1 device sync, and community support. The Pro plan costs a flat monthly fee and adds unlimited notes, sync across up to 10 devices, version history for 30 days, and email support with a two business-day response target. The Team plan adds shared workspaces, role-based permissions, an audit log, and priority support with a one business-day response target. Billing renews monthly on the calendar day the subscription started. Customers who upgrade mid-cycle are charged a prorated amount for the remainder of the current period. Customers who downgrade keep their current plan benefits until the end of the paid period, after which the lower plan limits apply. Refund requests inside the first 14 days of an initial purchase are honored in full. After 14 days, refunds are handled case by case and are not guaranteed. Failed payments trigger three retry attempts over seven days before the account is moved to a read-only state.
SECTION 2 — DATA, EXPORT, AND DELETION
Every customer can export their notes at any time as Markdown or JSON from the account settings page. Exports include note bodies, titles, tags, and timestamps, but not deleted items. When a customer deletes a note, it moves to a trash area for 30 days and is then permanently removed. When a customer deletes their entire account, all associated notes, attachments, and metadata are scheduled for permanent deletion within 30 days, except where a longer retention period is r
Output
[NO OUTPUT TEXT]
Limitations
This API run is a bounded lab check. It is not a production benchmark, user study, or proof that an external product works in a real customer environment.
Effloow Lab OpenAI API Run: openai-prompt-cache-retention-24h-cost-proof-2026
- Date: 2026-06-29T00:37:40.554973+00:00
- Model: gpt-5.5-2026-04-23
- Purpose: prompt-cache mechanics run 2: identical prefix moments later, expect cached_tokens>0
- Request ID: 06552e1f-e35d-4342-8f55-d84e094f5def
- Usage: {"input_tokens": 1323, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 113, "output_tokens_details": {"reasoning_tokens": 90}, "total_tokens": 1436}
- Prompt SHA-256: 9c79f2484ec753eafc471deedeab657165c36e54e700c2b7d1149374c3ba154a
- Safety boundary: no confidential, customer, credential, or private data should be included in this run.
Output
Queue: account-recovery.
Advise the customer to reset their password immediately.
Effloow Lab OpenAI API Run: openai-prompt-cache-retention-24h-cost-proof-2026
- Date: 2026-06-29T00:37:54.740651+00:00
- Model: gpt-5.5-2026-04-23
- Purpose: prompt-cache mechanics run 3
- Request ID: 5ccfcb30-524f-43d5-8e52-fd47fe3000f8
- Usage: {"input_tokens": 1323, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 80, "output_tokens_details": {"reasoning_tokens": 80}, "total_tokens": 1403}
- Prompt SHA-256: 9c79f2484ec753eafc471deedeab657165c36e54e700c2b7d1149374c3ba154a
- Safety boundary: no confidential, customer, credential, or private data should be included in this run.
Output
[NO OUTPUT TEXT]
Effloow Lab OpenAI API Run: openai-prompt-cache-retention-24h-cost-proof-2026
- Date: 2026-06-29T00:37:57.057888+00:00
- Model: gpt-5.5-2026-04-23
- Purpose: prompt-cache mechanics run 4
- Request ID: cb443693-0559-41f2-910d-9466a49e903d
- Usage: {"input_tokens": 1323, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 80, "output_tokens_details": {"reasoning_tokens": 70}, "total_tokens": 1403}
- Prompt SHA-256: 9c79f2484ec753eafc471deedeab657165c36e54e700c2b7d1149374c3ba154a
- Safety boundary: no confidential, customer, credential, or private data should be included in this run.
Output
account-recovery. Adv
Effloow Lab OpenAI API Run: openai-prompt-cache-retention-24h-cost-proof-2026
- Date: 2026-06-29T00:37:59.085169+00:00
- Model: gpt-5.5-2026-04-23
- Purpose: prompt-cache mechanics run 5
- Request ID: 16cedf77-da97-4cf8-8ef8-bd4c82a8c4b9
- Usage: {"input_tokens": 1323, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 80, "output_tokens_details": {"reasoning_tokens": 80}, "total_tokens": 1403}
- Prompt SHA-256: 9c79f2484ec753eafc471deedeab657165c36e54e700c2b7d1149374c3ba154a
- Safety boundary: no confidential, customer, credential, or private data should be included in this run.
Output
[NO OUTPUT TEXT]
Effloow Lab OpenAI API Run: openai-prompt-cache-retention-24h-cost-proof-2026
- Date: 2026-06-29T00:38:01.126564+00:00
- Model: gpt-5.5-2026-04-23
- Purpose: prompt-cache mechanics run 6
- Request ID: 69d5f118-b505-49aa-a167-d232613fe333
- Usage: {"input_tokens": 1323, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 80, "output_tokens_details": {"reasoning_tokens": 80}, "total_tokens": 1403}
- Prompt SHA-256: 9c79f2484ec753eafc471deedeab657165c36e54e700c2b7d1149374c3ba154a
- Safety boundary: no confidential, customer, credential, or private data should be included in this run.
Output
[NO OUTPUT TEXT]
Effloow Lab OpenAI API Run: openai-prompt-cache-retention-24h-cost-proof-2026
- Date: 2026-06-29T00:38:03.365506+00:00
- Model: gpt-5.5-2026-04-23
- Purpose: prompt-cache mechanics run 7
- Request ID: 827d48f1-33bd-416e-aeb7-21fdc314446e
- Usage: {"input_tokens": 1323, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 80, "output_tokens_details": {"reasoning_tokens": 80}, "total_tokens": 1403}
- Prompt SHA-256: 9c79f2484ec753eafc471deedeab657165c36e54e700c2b7d1149374c3ba154a
- Safety boundary: no confidential, customer, credential, or private data should be included in this run.
Output
[NO OUTPUT TEXT]
Read the article
This note supports the public article and records what was actually checked.