web_search¶
A trusted clauz3 layer for web search with query-length, query-count, and
substring-based privacy / exfil guards. The backend search service is
configurable; the stdlib does not pick a vendor.
Unlike the layers under examples/, this effect is real, not a mock:
web_search actually POSTs the query to a JSON endpoint. The prover never
runs the body — it records each call as a fact and proves the contract.
The body runs only under clauz3 run, after a program is proved and approved.
Install¶
Configuring the backend¶
Set two environment variables before running a proved program that calls
web_search:
CLAUZ3_SEARCH_URL— JSON endpoint that accepts{"query": str, "key": str}via POST and returns a JSON array of result URL strings.CLAUZ3_SEARCH_API_KEY— optional API key forwarded in the JSON body (empty if unset).
If CLAUZ3_SEARCH_URL is unset, the body raises RuntimeError. This means
test cases (which exercise only the prover, not the body) work without any
backend configured.
Effects¶
Import with from tools.web_search.trusted.effects import web_search:
web_search(query: str) -> list[str]— return a list of result URLs. Precondition:queryis non-empty. Recorded under thenetandsearchmarkers.
Contracts¶
Import with from tools.web_search.trusted import contracts as srch:
srch.no_guarantees()— explicit null contract.srch.no_searches()— the program issues no searches.srch.searches_at_most(count)— at mostcountsearches occur.srch.query_length_at_most(max_chars)— every query is at mostmax_charslong.srch.no_query_contains(substring)— no query containssubstring(privacy / exfil guard).
Example¶
import clauz3
from tools.web_search.trusted import contracts as srch
from tools.web_search.trusted.effects import web_search
@clauz3.guarantee(srch.searches_at_most(3))
@clauz3.guarantee(srch.query_length_at_most(100))
@clauz3.guarantee(srch.no_query_contains("BEGIN PRIVATE KEY"))
def main() -> None:
web_search("clauz3 static contracts")
Tests¶
tests/cases/*_pass.py must prove; tests/cases/*_fail.py must not.
The test suite is proof-only — it never hits the configured backend and
does not require CLAUZ3_SEARCH_URL to be set.
Source¶
src/clauz3/stdlib/web_search/tools/web_search/trusted/effects.py
"""Trusted web-search effect: a real (non-mock) search via a configurable backend.
The stdlib deliberately does not pick a search vendor. The body of
``web_search`` POSTs the query as JSON to a backend whose URL is taken from
the ``CLAUZ3_SEARCH_URL`` environment variable; an optional API key is read
from ``CLAUZ3_SEARCH_API_KEY``. The backend is expected to return a JSON
array of result URL strings (a single ``["https://...", ...]`` object).
This keeps the contract layer — which is the durable part of stdlib — vendor-
independent. The agent's guarantees about query length, query count, and
banned query substrings hold no matter which search service is on the other
end of the wire.
If ``CLAUZ3_SEARCH_URL`` is unset, the body raises ``RuntimeError`` at
``clauz3 run`` time. Proof never executes the body, so unset environment
during testing is fine.
"""
import json
import os
from urllib.request import Request, urlopen
import deal
DEFAULT_USER_AGENT = "clauz3-web-search/0.1"
DEFAULT_TIMEOUT_SECONDS = 30
@deal.pre(lambda query: len(query) > 0, message="query must be non-empty")
@deal.has("net", "search", "global", "import", "trusted")
def web_search(query: str) -> list[str]:
"""Search for ``query`` and return a list of result URLs as strings.
The backend is configured by the ``CLAUZ3_SEARCH_URL`` env var; an
optional API key in ``CLAUZ3_SEARCH_API_KEY`` is forwarded in the JSON
payload. The backend must return a JSON array of URL strings.
"""
endpoint = os.environ.get("CLAUZ3_SEARCH_URL")
if not endpoint:
raise RuntimeError(
"CLAUZ3_SEARCH_URL is not set; cannot perform a real web search. "
'Set it to a JSON endpoint that accepts {"query": str, ...} and '
"returns a JSON array of result URL strings."
)
api_key = os.environ.get("CLAUZ3_SEARCH_API_KEY", "")
payload = json.dumps({"query": query, "key": api_key}).encode("utf-8")
req = Request(
endpoint,
data=payload,
headers={
"Content-Type": "application/json",
"User-Agent": DEFAULT_USER_AGENT,
},
method="POST",
)
with urlopen(req, timeout=DEFAULT_TIMEOUT_SECONDS) as response: # noqa: S310
body = response.read()
parsed = json.loads(body.decode("utf-8", errors="replace"))
return [str(url) for url in parsed]
src/clauz3/stdlib/web_search/tools/web_search/trusted/contracts.py
"""Web-search query policies built from generic trusted effect facts.
Searches are recorded under the ``net`` and ``search`` markers. These
contracts let an agent state how many searches may occur, how long each
query may be, and which substrings the query must (not) contain — the last
is a privacy / exfil guard with the same shape as ``web_fetch``'s
``no_url_contains``.
"""
from clauz3.spec import ContractSpec, contract, effect
from clauz3.spec import no_guarantees as core_no_guarantees
Search = effect("search")
@contract
def no_guarantees() -> ContractSpec:
"""Make no guarantees about web-search effects."""
return core_no_guarantees()
@contract
def no_searches() -> ContractSpec:
"""Guarantee the program issues no searches."""
return Search.empty()
@contract
def searches_at_most(count: int) -> ContractSpec:
"""Guarantee at most ``count`` searches occur."""
return Search.count() <= count
@contract
def query_length_at_most(max_chars: int) -> ContractSpec:
"""Guarantee every search query is at most ``max_chars`` long.
Bounds query size to prevent oversized smuggling of local data into the
outbound search request.
"""
return Search.all(lambda e: len(e.query) <= max_chars)
@contract
def no_query_contains(substring: str) -> ContractSpec:
"""Guarantee no search query contains ``substring``.
Privacy / exfil guard with the same shape as ``web_fetch``'s
``no_url_contains``: an agent asked to look something up cannot smuggle
a local secret into the query string.
"""
return Search.all(lambda e: substring not in e.query)
All cases¶
Browse the full set on the all-cases page.