tools.browse

Fetch a URL and extract its readable content as markdown, so the model can read a full page once the search snippets suggest it is worth reading.

The tool

trafilatura handles both fetch and readable-content extraction in a single library. It strips boilerplate (nav, ads, footers), preserves headings/lists/code/tables, and outputs markdown — which is exactly the shape we want to feed back to the model.

Truncation is on by default. Page sizes are unpredictable: a paywall warning is fine to lose, but a 200KB longform article would otherwise blow the context budget without warning. We surface the truncation explicitly so the model knows there is more and can call again with a higher max_chars.
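As a rough illustration of explicit truncation, here is a minimal sketch. The helper name `truncate_with_notice`, the default of 8000 characters, and the wording of the notice are all assumptions made for this example, not the values the tool itself uses.

```python
def truncate_with_notice(text: str, max_chars: int = 8000) -> str:
    """Return text unchanged if it fits; otherwise cut it and append a visible notice."""
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    # The notice names the parameter so the model knows how to ask for more.
    notice = (
        f"\n\n[truncated: {omitted} more characters available; "
        "call again with a higher max_chars to read further]"
    )
    return text[:max_chars] + notice


long_page = "x" * 1000
print(truncate_with_notice(long_page, max_chars=400)[-120:])
```

Appending the notice after the cut means the returned string can be slightly longer than max_chars; a stricter variant could reserve room for the notice inside the budget.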

We deliberately catch fetch/extract failures and return them as readable strings rather than raising. That keeps the tool loop alive — the model sees the error, can recover (try a different URL, fall back to search snippets), and the user still gets a final answer.
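The error-handling pattern described above could look roughly like the sketch below. It is not the tool's actual body: `browse_sketch`, the injectable `fetch` and `extract` parameters, and the error wording are hypothetical, introduced so the pattern can be exercised without a network. The defaults call `trafilatura.fetch_url` and `trafilatura.extract`, assuming a trafilatura version recent enough to support `output_format="markdown"`.

```python
from typing import Callable, Optional


def _fetch_with_trafilatura(url: str) -> Optional[str]:
    # Imported lazily so the sketch loads even where trafilatura is not installed.
    import trafilatura
    return trafilatura.fetch_url(url)  # returns None when the download fails


def _extract_with_trafilatura(html: str) -> Optional[str]:
    import trafilatura
    # Returns None when no readable content is found.
    return trafilatura.extract(html, output_format="markdown")


def browse_sketch(
    url: str,
    fetch: Callable[[str], Optional[str]] = _fetch_with_trafilatura,
    extract: Callable[[str], Optional[str]] = _extract_with_trafilatura,
) -> str:
    """Always return a string: page markdown on success, a readable error otherwise."""
    try:
        html = fetch(url)
        if html is None:
            return f"Error: could not fetch {url}"
        markdown = extract(html)
        if not markdown:
            return f"Error: no readable content extracted from {url}"
        return markdown
    except Exception as exc:  # deliberately broad: the tool loop must not crash
        return f"Error: browsing {url} failed ({type(exc).__name__}: {exc})"
```

Because every path returns a string, the caller never needs its own try/except around the tool call, and the model sees the failure in the same channel as a successful result.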

The Tool pairs that function with the schema the model sees — the description is what tells it to reach for browse after a web_search result looks worth reading in full.

browse.schema['function']['name'], list(browse.schema['function']['parameters']['properties'])
('browse', ['url', 'max_chars'])
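For readers who have not seen the Tool class, the pairing checked above can be pictured with the sketch below. `SketchTool`, `_browse_stub`, the description strings, and the 8000 default are hypothetical stand-ins; only the schema layout (`function` / `name` / `parameters` / `properties`) and the two parameter names come from the cell above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class SketchTool:
    """Hypothetical stand-in for Tool: a callable plus the schema the model sees."""
    fn: Callable[..., str]
    schema: Dict[str, Any]

    def __call__(self, **kwargs: Any) -> str:
        return self.fn(**kwargs)


def _browse_stub(url: str, max_chars: int = 8000) -> str:
    # Placeholder body; the real function fetches and extracts the page.
    return f"(content of {url}, up to {max_chars} chars)"


browse_sketch_tool = SketchTool(
    fn=_browse_stub,
    schema={
        "type": "function",
        "function": {
            "name": "browse",
            # The description is what nudges the model toward this tool.
            "description": (
                "Fetch a URL and return its readable content as markdown. "
                "Use after a web_search result looks worth reading in full."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "Page to fetch."},
                    "max_chars": {
                        "type": "integer",
                        "description": "Truncate the returned content to this many characters.",
                    },
                },
                "required": ["url"],
            },
        },
    },
)

print(browse_sketch_tool.schema["function"]["name"])
print(list(browse_sketch_tool.schema["function"]["parameters"]["properties"]))
```

Keeping the function and its schema in one object makes it harder for the two to drift apart when a parameter is added or renamed.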

Smoke test (requires network access):

print(_browse("https://example.com", max_chars=400))
This domain is for use in documentation examples without needing permission. Avoid use in operations.

Learn more