web_fetch¶
A trusted clauz3 layer for HTTP GET, with URL-prefix policies, fetch-count
bounds, and exfiltration-style content guards on the URL itself.
Unlike the layers under examples/, this effect is real, not a mock:
fetch_url actually issues an HTTP request using urllib.request from
the standard library. The prover never runs the body — it records each
call as a fact and proves the contract. The body runs only under
clauz3 run, after a program is proved and approved.
Install¶
Effects¶
Import with from tools.web_fetch.trusted.effects import fetch_url:
fetch_url(url: str) -> str— issue an HTTP GET and return the response body as UTF-8 text (witherrors='replace'). Precondition:urlstarts withhttp://orhttps://. Recorded under thenetandfetchmarkers. 30-second timeout.
Contracts¶
Import with from tools.web_fetch.trusted import contracts as web:
web.no_guarantees()— explicit null contract.web.no_fetches()— the program issues no HTTP fetches.web.only_fetch_under(prefix)— every fetched URL starts withprefix.web.never_fetch_under(prefix)— no fetched URL starts withprefix.web.https_only()— every fetched URL uses HTTPS.web.fetches_at_most(count)— at mostcountfetches occur.web.no_url_contains(substring)— no fetched URL containssubstring(exfil guard).web.url_length_at_most(max_chars)— every fetched URL is at mostmax_charslong.
only_fetch_under / never_fetch_under are URL-prefix policies compiled to
str.startswith over the symbolic URL argument; use fully-qualified prefixes
("https://api.github.com/repos/") for tight scoping.
The exfil-guard angle¶
no_url_contains is the unusual contract here. Static substring guards on
agent-built URLs let you prove, before any request fires, that a fetched URL
cannot smuggle local data into the query string. For an agent that's been
given a local secret to compute against, this is the cheapest way to ensure
the secret doesn't leak via an outbound HTTP call.
Example¶
import clauz3
from tools.web_fetch.trusted import contracts as web
from tools.web_fetch.trusted.effects import fetch_url
@clauz3.guarantee(web.https_only())
@clauz3.guarantee(web.only_fetch_under("https://api.github.com/repos/"))
@clauz3.guarantee(web.fetches_at_most(5))
@clauz3.guarantee(web.no_url_contains("token="))
def main() -> None:
fetch_url("https://api.github.com/repos/normalform-ai/clauz3")
Tests¶
tests/cases/*_pass.py must prove; tests/cases/*_fail.py must not.
The test suite is proof-only — it never actually hits the network.
Source¶
src/clauz3/stdlib/web_fetch/tools/web_fetch/trusted/effects.py
"""Trusted web-fetch effect: a real (non-mock) HTTP GET.
Carries the ``net`` and ``fetch`` markers so the contracts in ``contracts.py``
can scope reachable URLs and bound the number of fetches.
The body uses ``urllib.request`` from the standard library — no third-party
HTTP dependency. The 30-second timeout is a defense against trivially-slow
responses; it is not a contract surface (the prover has no model of wall time).
"""
from urllib.request import Request, urlopen
import deal
DEFAULT_USER_AGENT = "clauz3-web-fetch/0.1"
DEFAULT_TIMEOUT_SECONDS = 30
@deal.pre(
lambda url: url.startswith("http://") or url.startswith("https://"),
message="url must start with http:// or https://",
)
@deal.has("net", "fetch", "global", "import", "trusted")
def fetch_url(url: str) -> str:
"""Issue an HTTP GET against ``url`` and return the response body as text.
Uses a fixed UTF-8 decoding (with ``errors='replace'``) and a 30-second
timeout. Per-call URL scheme is checked as a precondition; allowed host
prefixes, fetch count, and exfil-style URL-content guards are agent-stated
guarantees.
"""
req = Request(url, headers={"User-Agent": DEFAULT_USER_AGENT})
with urlopen(req, timeout=DEFAULT_TIMEOUT_SECONDS) as response: # noqa: S310
body = response.read()
return body.decode("utf-8", errors="replace")
src/clauz3/stdlib/web_fetch/tools/web_fetch/trusted/contracts.py
"""Web-fetch URL policies built from generic trusted effect facts.
Fetches are recorded under the ``net`` and ``fetch`` markers. These contracts
let an agent state which URLs may be fetched, how many fetches may occur, and
which substrings the URL itself must (not) contain — the last is the genuinely
novel exfil guard a typical HTTP client cannot statically prove.
"""
from clauz3.spec import ContractSpec, contract, effect
from clauz3.spec import no_guarantees as core_no_guarantees
Fetch = effect("fetch")
@contract
def no_guarantees() -> ContractSpec:
"""Make no guarantees about web-fetch effects."""
return core_no_guarantees()
@contract
def no_fetches() -> ContractSpec:
"""Guarantee the program issues no HTTP fetches."""
return Fetch.empty()
@contract
def only_fetch_under(prefix: str) -> ContractSpec:
"""Guarantee every fetched URL starts with ``prefix``.
Use fully-qualified prefixes (``"https://api.github.com/repos/"``) to
bound both scheme + host + path. A bare scheme like ``"https://"`` works
too but admits any host; combine with multiple ``only_fetch_under`` for
disjunctive host allowlists.
"""
return Fetch.all(lambda e: e.url.startswith(prefix))
@contract
def never_fetch_under(prefix: str) -> ContractSpec:
"""Guarantee no fetched URL starts with ``prefix``."""
return Fetch.all(lambda e: not e.url.startswith(prefix))
@contract
def https_only() -> ContractSpec:
"""Guarantee every fetched URL uses HTTPS."""
return Fetch.all(lambda e: e.url.startswith("https://"))
@contract
def fetches_at_most(count: int) -> ContractSpec:
"""Guarantee at most ``count`` fetches occur."""
return Fetch.count() <= count
@contract
def no_url_contains(substring: str) -> ContractSpec:
"""Guarantee no fetched URL contains ``substring`` as a substring.
Useful as an exfiltration guard: an agent asked to look up data must not
smuggle local secrets into the URL path or query string. For example,
``no_url_contains("BEGIN PRIVATE KEY")`` rejects any URL with a literal
key dumped into a query parameter.
"""
return Fetch.all(lambda e: substring not in e.url)
@contract
def url_length_at_most(max_chars: int) -> ContractSpec:
"""Guarantee every fetched URL is at most ``max_chars`` long.
Bounds URL size to prevent oversized query-string smuggling.
"""
return Fetch.all(lambda e: len(e.url) <= max_chars)
All cases¶
Browse the full set on the all-cases page.