Skip to content

web_fetch

A trusted clauz3 layer for HTTP GET, with URL-prefix policies, fetch-count bounds, and exfiltration-style content guards on the URL itself.

Unlike the layers under examples/, this effect is real, not a mock: fetch_url actually issues an HTTP request using urllib.request from the standard library. The prover never runs the body — it records each call as a fact and proves the contract. The body runs only under clauz3 run, after a program is proved and approved.

Install

clauz3 install stdlib:web_fetch

Effects

Import with from tools.web_fetch.trusted.effects import fetch_url:

  • fetch_url(url: str) -> str — issue an HTTP GET and return the response body as UTF-8 text (with errors='replace'). Precondition: url starts with http:// or https://. Recorded under the net and fetch markers. 30-second timeout.

Contracts

Import with from tools.web_fetch.trusted import contracts as web:

  • web.no_guarantees() — explicit null contract.
  • web.no_fetches() — the program issues no HTTP fetches.
  • web.only_fetch_under(prefix) — every fetched URL starts with prefix.
  • web.never_fetch_under(prefix) — no fetched URL starts with prefix.
  • web.https_only() — every fetched URL uses HTTPS.
  • web.fetches_at_most(count) — at most count fetches occur.
  • web.no_url_contains(substring) — no fetched URL contains substring (exfil guard).
  • web.url_length_at_most(max_chars) — every fetched URL is at most max_chars long.

only_fetch_under / never_fetch_under are URL-prefix policies compiled to str.startswith over the symbolic URL argument; use fully-qualified prefixes ("https://api.github.com/repos/") for tight scoping.

The exfil-guard angle

no_url_contains is the unusual contract here. Static substring guards on agent-built URLs let you prove, before any request fires, that a fetched URL cannot smuggle local data into the query string. For an agent that's been given a local secret to compute against, this is the cheapest way to ensure the secret doesn't leak via an outbound HTTP call.

Example

import clauz3
from tools.web_fetch.trusted import contracts as web
from tools.web_fetch.trusted.effects import fetch_url


@clauz3.guarantee(web.https_only())
@clauz3.guarantee(web.only_fetch_under("https://api.github.com/repos/"))
@clauz3.guarantee(web.fetches_at_most(5))
@clauz3.guarantee(web.no_url_contains("token="))
def main() -> None:
    fetch_url("https://api.github.com/repos/normalform-ai/clauz3")

Tests

clauz3 test stdlib:web_fetch

tests/cases/*_pass.py must prove; tests/cases/*_fail.py must not.

The test suite is proof-only — it never actually hits the network.

Source

src/clauz3/stdlib/web_fetch/tools/web_fetch/trusted/effects.py

"""Trusted web-fetch effect: a real (non-mock) HTTP GET.

Carries the ``net`` and ``fetch`` markers so the contracts in ``contracts.py``
can scope reachable URLs and bound the number of fetches.

The body uses ``urllib.request`` from the standard library — no third-party
HTTP dependency. The 30-second timeout is a defense against trivially-slow
responses; it is not a contract surface (the prover has no model of wall time).
"""

from urllib.request import Request, urlopen

import deal

DEFAULT_USER_AGENT = "clauz3-web-fetch/0.1"
DEFAULT_TIMEOUT_SECONDS = 30


@deal.pre(
    lambda url: url.startswith("http://") or url.startswith("https://"),
    message="url must start with http:// or https://",
)
@deal.has("net", "fetch", "global", "import", "trusted")
def fetch_url(url: str) -> str:
    """Issue an HTTP GET against ``url`` and return the response body as text.

    Uses a fixed UTF-8 decoding (with ``errors='replace'``) and a 30-second
    timeout. Per-call URL scheme is checked as a precondition; allowed host
    prefixes, fetch count, and exfil-style URL-content guards are agent-stated
    guarantees.
    """
    req = Request(url, headers={"User-Agent": DEFAULT_USER_AGENT})
    with urlopen(req, timeout=DEFAULT_TIMEOUT_SECONDS) as response:  # noqa: S310
        body = response.read()
    return body.decode("utf-8", errors="replace")

src/clauz3/stdlib/web_fetch/tools/web_fetch/trusted/contracts.py

"""Web-fetch URL policies built from generic trusted effect facts.

Fetches are recorded under the ``net`` and ``fetch`` markers. These contracts
let an agent state which URLs may be fetched, how many fetches may occur, and
which substrings the URL itself must (not) contain — the last is the genuinely
novel exfil guard a typical HTTP client cannot statically prove.
"""

from clauz3.spec import ContractSpec, contract, effect
from clauz3.spec import no_guarantees as core_no_guarantees

Fetch = effect("fetch")


@contract
def no_guarantees() -> ContractSpec:
    """Make no guarantees about web-fetch effects."""
    return core_no_guarantees()


@contract
def no_fetches() -> ContractSpec:
    """Guarantee the program issues no HTTP fetches."""
    return Fetch.empty()


@contract
def only_fetch_under(prefix: str) -> ContractSpec:
    """Guarantee every fetched URL starts with ``prefix``.

    Use fully-qualified prefixes (``"https://api.github.com/repos/"``) to
    bound both scheme + host + path. A bare scheme like ``"https://"`` works
    too but admits any host; combine with multiple ``only_fetch_under`` for
    disjunctive host allowlists.
    """
    return Fetch.all(lambda e: e.url.startswith(prefix))


@contract
def never_fetch_under(prefix: str) -> ContractSpec:
    """Guarantee no fetched URL starts with ``prefix``."""
    return Fetch.all(lambda e: not e.url.startswith(prefix))


@contract
def https_only() -> ContractSpec:
    """Guarantee every fetched URL uses HTTPS."""
    return Fetch.all(lambda e: e.url.startswith("https://"))


@contract
def fetches_at_most(count: int) -> ContractSpec:
    """Guarantee at most ``count`` fetches occur."""
    return Fetch.count() <= count


@contract
def no_url_contains(substring: str) -> ContractSpec:
    """Guarantee no fetched URL contains ``substring`` as a substring.

    Useful as an exfiltration guard: an agent asked to look up data must not
    smuggle local secrets into the URL path or query string. For example,
    ``no_url_contains("BEGIN PRIVATE KEY")`` rejects any URL with a literal
    key dumped into a query parameter.
    """
    return Fetch.all(lambda e: substring not in e.url)


@contract
def url_length_at_most(max_chars: int) -> ContractSpec:
    """Guarantee every fetched URL is at most ``max_chars`` long.

    Bounds URL size to prevent oversized query-string smuggling.
    """
    return Fetch.all(lambda e: len(e.url) <= max_chars)

All cases

Browse the full set on the all-cases page.