
The Freshness Moat: Freshness Is Truth Distribution (Signal Observatory Edition)

Freshness is not rewriting content. Freshness is verifiable state + disciplined signals + observability — so your updates trigger recrawl and you can prove propagation happened.

The web has a language problem:

It can publish text. It can’t publish state.

In a retrieval world, that’s fatal.

Because “freshness” is no longer a vibe. It’s a measurable distribution property: did the systems that matter re-fetch your updated truth?

We can verify some crawlers; for the rest, we measure behavior and propagation patterns.

The Old Web Can’t Express State

If your knowledge can’t emit deltas, you don’t have knowledge—you have archaeology.

Static pages are great for humans, but they do a terrible job at communicating “what changed” to machines. A crawler doesn’t want your new paragraph. It wants to know:

  • what changed
  • when it changed
  • which machine surfaces should be revalidated
  • whether the update is material

Without that, “freshness” becomes date theater: a constant churn of timestamps with no measurable recrawl.

Non-Negotiable Definitions (Use These Words)

A) Principle Pages vs. Volatile Pages

Principle Pages are stable: timeless architecture, frameworks, positioning, long-lived methods. They should rarely change.

Volatile Pages are stateful: model/tool versions, pricing, policies, compliance, SLAs, supported regions, legal constraints. They must be continuously validated.

Example: Argbe.tech can keep a principle page like “How we do GEO” stable, while volatile claims like supported markets (DACH, United States) must stay current.

B) Material Update

A material update changes one of:

  • main content meaning (not cosmetic edits)
  • structured data / JSON-LD
  • key links that function as evidence (anchors, registries, audit packets)
  • claim validity (pricing model, eligibility, scope, availability)

Google’s examples of “significant modification” include changes to main content, structured data, and links; changing a copyright date is not significant 1 .

Google also notes it only uses <lastmod> when it’s consistently and verifiably accurate 1 .

C) Propagation

Propagation is the time between shipping a material update and verified re-fetch of the updated URLs (including your machine surfaces).

D) Propagation Half-Life (PHL)

Propagation Half-Life (PHL) is a metric:

“How quickly do key fetchers re-fetch updated surfaces after an UpdateEvent?”

Diagram 1 — The Freshness Moat Triad (With Feedback Loop)

Verifiable State (what is true as-of now) → Signal Stack (how updates move crawlers) → Signal Observatory (prove propagation happened) → feedback loop back into Verifiable State.

Freshness Isn’t Rewriting: It’s Re-Verification

Here’s the operational reframing:

  • Evergreen principles should stay stable (you don’t rewrite your architecture every week).
  • Verification state must evolve (you continuously confirm what’s still true).

So instead of “update the blog post,” you ship a Verification & Updates module that publishes accountable state:

  • what is true now
  • what changed
  • when it was verified
  • which surfaces were revalidated

A Minimal “Verification & Updates” Module (Public)

This can be a page section or a dedicated route. The point is not aesthetics — it’s machine-legible state.

{
	"entity": "Argbe.tech",
	"as_of": "2026-01-27",
	"entity_version": "0.79.0",
	"volatile_claims": {
		"availability_regions": [
			"DACH",
			"United States"
		],
		"pricing_model": "Fixed weekly rate"
	},
	"materiality_rule": "Material update = meaning/structured-data/link/claim validity change; cosmetic edits do not count.",
	"surfaces": {
		"canonical_pages": [
			"/",
			"/contact",
			"/geo-seo"
		],
		"machine_surfaces": [
			"/entity.json",
			"/llms.txt",
			"/changes.json",
			"/sitemap.xml"
		]
	}
}

Those volatile claims are already present in your golden record:

  • Pricing model: "Fixed weekly rate"
  • Regions: ["DACH", "United States"]

The Signal Stack (What Actually Moves Crawlers)

Your Signal Stack is not an SEO hack. It’s a cost-reduction and distribution layer.

PHL drops when revalidation becomes cheap (ETag) and change notification becomes explicit (IndexNow + honest sitemap lastmod) 3 4 1 .

1) HTTP caching validators (ETag + Last-Modified)

Use conditional requests to make revalidation cheap:

  • ETag + If-None-Match
  • Last-Modified + If-Modified-Since

Google explicitly recommends considering both ETag and Last-Modified; when both exist, Google uses ETag (and recommends it for efficient revalidation) 3 .

Why this matters for GEO: it turns “checking freshness” into a low-cost operation, increasing the likelihood systems will re-fetch your machine surfaces.
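
A minimal sketch of what cheap revalidation looks like at the edge, assuming a Cloudflare-Worker-style fetch handler in TypeScript. loadEntityJson and the ETag scheme are illustrative, not a description of argbe.tech's actual stack:

// Sketch: answer conditional requests for /entity.json with a content-derived ETag.
// loadEntityJson() is a hypothetical loader; replace it with your real golden-record source.
async function loadEntityJson(): Promise<unknown> {
	return { entity: "Argbe.tech", as_of: "2026-01-27" };
}

async function handleEntityRequest(request: Request): Promise<Response> {
	const body = JSON.stringify(await loadEntityJson());
	const etag = `"entity-${await sha256Hex(body)}"`; // strong validator derived from the content itself

	// The crawler already holds this exact state: 304, no body, revalidation stays cheap.
	if (request.headers.get("If-None-Match") === etag) {
		return new Response(null, { status: 304, headers: { "ETag": etag } });
	}

	return new Response(body, {
		status: 200,
		headers: {
			"Content-Type": "application/json",
			"ETag": etag,
			"Cache-Control": "public, max-age=0, must-revalidate", // always revalidate, but cheaply
		},
	});
}

async function sha256Hex(text: string): Promise<string> {
	const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(text));
	return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, "0")).join("");
}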

2) Sitemap lastmod discipline (only significant modifications)

If you push “fake freshness” via trivial edits and stamp new dates, you’re training crawlers to ignore you. lastmod should reflect significant modification, not footer changes 1 .
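
One way to keep lastmod honest is to derive it from the ChangeFeed instead of stamping the build date. A minimal TypeScript sketch, assuming the /changes.json shape shown later in this article (observed_at, material, affected_urls); the function name is illustrative:

interface UpdateEvent {
	observed_at: string;     // ISO-8601 UTC timestamp when the new state became canonical
	material: boolean;       // materiality per your rules
	affected_urls: string[]; // canonical pages + machine surfaces to revalidate
}

// lastmod per URL = most recent *material* event that touched it; cosmetic edits never move it.
function lastmodFromChangeFeed(events: UpdateEvent[]): Map<string, string> {
	const lastmod = new Map<string, string>();
	for (const event of events) {
		if (!event.material) continue;
		for (const url of event.affected_urls) {
			const current = lastmod.get(url);
			// ISO-8601 UTC strings in the same format compare correctly as strings.
			if (!current || event.observed_at > current) lastmod.set(url, event.observed_at);
		}
	}
	return lastmod; // feed these values into your sitemap generator's <lastmod> fields
}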

3) Push notification where relevant (IndexNow)

For participating engines, IndexNow provides a push mechanism to notify about URL changes (added/updated/deleted) 4 .

Be precise: IndexNow does not guarantee crawling or indexing; it shortens propagation when the ecosystem responds 15 13 .
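
A minimal sketch of that push, using the public IndexNow JSON endpoint; the key below is a placeholder and must also be hosted as a key file at the keyLocation URL:

// Notify participating engines that these URLs changed. Receipt is not crawling, and not indexing.
async function pingIndexNow(changedUrls: string[]): Promise<void> {
	const response = await fetch("https://api.indexnow.org/indexnow", {
		method: "POST",
		headers: { "Content-Type": "application/json; charset=utf-8" },
		body: JSON.stringify({
			host: "argbe.tech",
			key: "YOUR-INDEXNOW-KEY",                                // placeholder; generate your own
			keyLocation: "https://argbe.tech/YOUR-INDEXNOW-KEY.txt", // key file must be publicly reachable
			urlList: changedUrls,                                    // absolute URLs that materially changed
		}),
	});
	console.log("IndexNow accepted:", response.status); // 200/202 means "received", nothing more
}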

4) Structured data timestamps (disciplined, not performative)

If a material update touches structured data, reflect it explicitly in JSON-LD:

  • use datePublished when something is first published (where appropriate)
  • bump dateModified only when the update is material (same discipline as sitemap lastmod)

Keep semantics clean: “Last verified” can update on validation cycles without implying the page materially changed; “Last updated” should only move on material changes 16 .
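
A minimal JSON-LD sketch of that discipline, assuming a WebPage node on /geo-seo; the dates are illustrative, with dateModified mirroring the last material UpdateEvent rather than the last deploy:

{
	"@context": "https://schema.org",
	"@type": "WebPage",
	"url": "https://argbe.tech/geo-seo",
	"datePublished": "2025-10-01",
	"dateModified": "2026-01-19"
}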

Anti-Pattern Box — Date-Churn / Fake lastmod / Cosmetic Updates

  • Date-churn freshness theater. What it looks like: updating “Last updated” every week. Why it fails: crawlers learn it’s noise. The fix: tie lastmod to material updates only 1 .
  • Cosmetic diffs. What it looks like: swapping adjectives, changing order. Why it fails: doesn’t change claim validity. The fix: publish state changes (deltas), not prose.
  • “We updated!” with no signals. What it looks like: edits shipped, but no machine surfaces touched. Why it fails: revalidation never triggers. The fix: ETag/Last-Modified + sitemap discipline + ChangeFeed.

/llms.txt as the Agent Routing Table

Treat /llms.txt as a pragmatic entrypoint (not magic): a place to declare where your machine surfaces live.

The llms.txt format is an emerging proposal for helping LLMs and agents find and use websites at inference time 5 .

The interlock: Golden Record + Fan-Out + ChangeFeed

/llms.txt should route to:

  • your golden record (/entity.json) (canonical entity truth)
  • your fan-out exports (packet endpoints)
  • your change feed (/changes.json) (what changed + when)

/llms.txt (routing table) → /entity.json (golden record), /changes.json (ChangeFeed), Fan-Out exports (e.g. /contact.json), and canonical pages (JSON-LD embedded).

Minimal example (/llms.txt):

# argbe.tech machine surfaces

Entity: https://argbe.tech/entity.json
Changes: https://argbe.tech/changes.json
Sitemap: https://argbe.tech/sitemap.xml

# Canonical pages (humans + embedded JSON-LD)
Pages:
- https://argbe.tech/contact
- https://argbe.tech/geo-seo

# Exports (Fan-Out)
Exports:
- https://argbe.tech/contact.json

ChangeFeed: Patch Notes for Truth

Agents and crawlers don’t want essays. They want deltas.

Your ChangeFeed is a public, append-only record of UpdateEvents:

  • what changed
  • when it changed
  • why it changed (reason category)
  • which surfaces were impacted
  • whether it was material

What a ChangeFeed entry contains (fields + semantics)

  • id: stable event id (monotonic or UUID)
  • observed_at: when the new state became canonical
  • material: boolean per your materiality rules
  • changed_paths: what fields changed in the golden record
  • affected_urls: canonical pages + machine surfaces that should be revalidated
  • evidence: optional URLs to proof artifacts (release tags, audit packets)

Example (/changes.json), intentionally compact:

{
	"version": "0.79.0",
	"as_of": "2026-01-27",
	"events": [
		{
			"id": "update-2026-01-19-001",
			"observed_at": "2026-01-19T09:10:00Z",
			"material": true,
			"changed_paths": [
				"pricing.model",
				"markets.regions",
				"meta.releases_repo"
			],
			"affected_urls": [
				"/",
				"/contact",
				"/entity.json",
				"/llms.txt",
				"/changes.json",
				"/sitemap.xml"
			],
			"evidence": [
				"https://github.com/argbe-tech/releases/releases"
			]
		}
	]
}

Metric Box — Propagation Half-Life (PHL)

PHL turns freshness into an operational KPI.

Define:

  • t0 = time you ship a material UpdateEvent
  • R(t) = fraction of “key fetchers” that have re-fetched the affected surfaces by time t

PHL is the time where R(t) crosses 50%.
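
A minimal TypeScript sketch of that computation, assuming you already know t0 and the first re-fetch time per key fetcher (null if it has not re-fetched yet); names are illustrative:

// PHL = hours from t0 until at least half of the key fetchers have re-fetched the affected surfaces.
function propagationHalfLife(t0: Date, firstRefetchByFetcher: Array<Date | null>): number | null {
	const total = firstRefetchByFetcher.length;
	const delaysHours = firstRefetchByFetcher
		.filter((t): t is Date => t !== null)
		.map((t) => (t.getTime() - t0.getTime()) / 3_600_000)
		.sort((a, b) => a - b);

	const needed = Math.ceil(total / 2);          // the 50% crossing
	if (delaysHours.length < needed) return null; // not enough re-fetches yet: PHL unknown
	return delaysHours[needed - 1];               // hours until R(t) reached 50%
}

A null result is itself a finding: the update has not yet propagated to half of your key fetchers.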

Practical targets (illustrative):

  • /entity.json + /changes.json: hours → 1 day (the machine truth should refresh fast)
  • money pages (/contact): 1–3 days (humans care; citations follow)
  • deep content: 3–14 days (depends on crawl budget and demand)

Your targets depend on volatility + crawl demand.

Your moat is not “updated weekly.” It’s: “PHL is low and provable.”

Signal Observatory (Cloudflare) = Freshness Without Theater

If you don’t measure propagation, freshness is performative.

What the Observatory measures

  • who re-fetches updated surfaces after UpdateEvents (as behavior buckets)
  • how fast (PHL)
  • whether /llms.txt and /changes.json are actually used
  • why universal “agent detection” fails (Cloudflare has documented examples of stealth/undeclared crawling behavior in the ecosystem) 12

Behavior buckets (realistic, non-magical)

You can’t identify every agent, but you can sort traffic into buckets:

  1. Verified crawlers (where verification exists)
    • Google documents crawler verification via reverse DNS + forward DNS checks 2 (a minimal check is sketched after this list).
  2. Declared bots (documented UAs + robots controls)
    • OpenAI documents crawlers and robots.txt controls (e.g., GPTBot, OAI-SearchBot) 6 .
    • Some agent requests can be authenticated (signed) for allowlisting, but we still treat observability as behavior-first 14 .
    • Anthropic documents Claude crawlers/modes (training vs user-initiated browsing) 8 .
    • Perplexity provides bot guidance including verification patterns 9 .
  3. Undeclared/stealth automation
    • treat as “unknown automation,” measured by behavior, not identity.
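
For the “verified crawlers” bucket, a minimal Node.js sketch of the reverse DNS + forward DNS check, shown here for Googlebot; other verifiable crawlers publish their own domains or IP lists:

// Sketch: bucket a request as "verified Googlebot" only if both DNS checks agree.
import { lookup, reverse } from "node:dns/promises";

async function isVerifiedGooglebot(clientIp: string): Promise<boolean> {
	try {
		const hostnames = await reverse(clientIp); // step 1: reverse DNS on the requesting IP
		const candidate = hostnames.find(
			(h) => h.endsWith(".googlebot.com") || h.endsWith(".google.com"),
		);
		if (!candidate) return false;

		const { address } = await lookup(candidate); // step 2: forward DNS on the claimed hostname
		return address === clientIp;                 // must resolve back to the original IP
	} catch {
		return false; // no PTR record or lookup failure: not verified
	}
}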

Baseline Observatory works with Cloudflare Analytics + origin logs (or just the edge metrics you already have): UA + request patterns + cache status + refetch timing.

Optional upgrade: Logpush (HTTP requests dataset) for richer, queryable event streams, including cache status and (where available/plan-enabled) bot fields. If Bot Management is enabled, bot score / verified-bot flags can improve classification and routing (e.g., bot score variables) 11 10 .

Observatory output (what we watch)

UpdateEvent → affected URLs → time-to-first refetch (verified crawlers) → refetch rate over 24/72h → PHL

The simplest “observability loop”

  1. Emit an UpdateEvent (ChangeFeed)
  2. Ensure headers/sitemap/IndexNow are correct
  3. Watch re-fetch patterns for:
    • /entity.json
    • /changes.json
    • /llms.txt
    • the affected canonical pages
  4. Compute PHL per bucket (see the sketch after this list)
  5. Fix what’s slow (signals, caching, discoverability, blockers)
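
For steps 3–4, a minimal TypeScript sketch that turns request logs into “first re-fetch per bucket” for one UpdateEvent; the LogRecord shape is hypothetical, so map your real Analytics/Logpush fields onto it:

interface LogRecord {
	url: string;     // request path, e.g. "/entity.json"
	bucket: string;  // "verified-crawler" | "declared-bot" | "unknown-automation" (your classification)
	fetchedAt: Date; // edge timestamp of the request
}

// First re-fetch of any affected URL after t0, per behavior bucket.
function firstRefetchPerBucket(t0: Date, affectedUrls: Set<string>, logs: LogRecord[]): Map<string, Date> {
	const first = new Map<string, Date>();
	for (const record of logs) {
		if (record.fetchedAt.getTime() <= t0.getTime() || !affectedUrls.has(record.url)) continue;
		const existing = first.get(record.bucket);
		if (!existing || record.fetchedAt.getTime() < existing.getTime()) first.set(record.bucket, record.fetchedAt);
	}
	return first; // feed these timestamps into the PHL computation sketched earlier
}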

Optional measurement hook: OpenAI documents that publishers can track referral traffic from ChatGPT using UTMs 7 . Use it as a reality check, not as the primary freshness metric.

Close: Freshness Is a Distribution Problem

Phase 2 made truth structured.

Phase 3 makes truth alive.

Competitors can mimic your prose.

They can’t mimic a system that ships verifiable deltas + disciplined signals + measured propagation.

If your update can’t be observed propagating, it didn’t happen.

Evidence Locker