core

Minimal IPython notebook integration for an LLM %%prompt cell magic. Inspired by fast.ai’s solveit.

Convention for the rest of this notebook: each exported function is followed by a bare-call cell whose rendered output doubles as its working example. Deeper invariants live in adjacent #| hide cells.

Imports

Foundational observations

An .ipynb is just JSON. Cells have sources, outputs, and (since nbformat 4.5) stable ids.

notebook_content = Path("99_small_demo.ipynb").read_text()
print(notebook_content[:250] + "...")
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6e33fbe2-4020-4c95-9d09-3be966a130aa",
   "metadata": {},
   "source": [
    "# Small demo\n",
    "> Small test subject used to understand and test notebook internals\n",
    "- skip_exec: true...

Cell outputs persist in the JSON on disk, so they double as a cache.

notebook_json = json.loads(notebook_content)
cell = notebook_json["cells"][3]
cell
{'cell_type': 'code',
 'execution_count': None,
 'id': '16d185be-4b09-4387-bf19-be8b03ccf817',
 'metadata': {},
 'outputs': [{'name': 'stdout',
   'output_type': 'stream',
   'text': ['Hello matey!\n']}],
 'source': ['def hello():\n', '    print("Hello matey!")\n', '\n', 'hello()']}

JupyterLab tells the kernel which cell is executing via parent_header.metadata.cellId. That’s enough to locate ourselves in the notebook without text matching.

ip = get_ipython()
ip.parent_header["metadata"]["cellId"]
'4e577b75-e6e8-4ca8-ae26-cacb869d4636'

System prompt

Kept short and honest about the actual environment. The model takes its tone, structure, and capabilities cues from here, so lies about features we don’t have (tool use, variable injection, etc.) leak into responses.

Locating the current notebook

Two-step strategy. JupyterLab ≥ 3.5 sets JPY_SESSION_NAME to the notebook’s path; use it to determine the current notebook’s path.

_current_notebook()
Path('/Users/pablo/src/nbdialog/nbs/00_core.ipynb')

Cells → chat messages

Prompt cells contribute one user message (the prompt body with the %%prompt line stripped) and, when they already have rendered output, one assistant message replaying that output. Code and markdown cells contribute their source as user, plus any captured outputs as a follow-up user message. The cell whose id matches up_to_id is the boundary — included for its prompt but never for its (stale) cached response, since that response is what the current call is producing.

_join(["hello\n", "world"])
'hello\nworld'
_is_prompt({"cell_type": "code", "source": "%%prompt\nhi"}), _is_prompt({"cell_type": "code", "source": "x = 1"})
(True, False)
_strip_magic("%%prompt -f\nhello\nworld")
'hello\nworld'
_output_text({"output_type": "stream", "text": "hi\n"}), _response_markdown({"outputs": [{"data": {"text/markdown": "**reply**"}}]})
('hi\n', '**reply**')

source

notebook_to_messages


def notebook_to_messages(
    cells, up_to_id, system:NoneType=None
):

Build chat messages from cells up to and including the cell with id up_to_id.

demo = [{"id": "a", "cell_type": "code", "source": "x = 1", "outputs": []},
        {"id": "b", "cell_type": "code", "source": "%%prompt\nwhat is x?", "outputs": []}]
notebook_to_messages(demo, "b")
[{'role': 'system',
  'content': 'You are an AI assistant embedded inside an IPython notebook through a `%%prompt` cell magic. The notebook is a linear sequence of markdown, code, and prompt cells executed in a single persistent kernel — state from code cells carries forward. When the user runs a prompt cell, every cell above it (sources and captured outputs) is sent to you as the conversation history; previous prompt cells appear as your prior assistant turns. The dialog *is* the notebook.\n\nBe concise, direct, and incremental. Match response length to the question. Do not pad, restate, or end with "let me know if...". Use fenced code blocks with language tags. Default to idiomatic Python — comprehensions, broadcasting, fastcore-style brevity. Short single-line docstrings; no inline comments unless a constraint is genuinely non-obvious.\n\nYour knowledge cutoff is January 2026.'},
 {'role': 'user', 'content': 'x = 1'},
 {'role': 'user', 'content': 'what is x?'}]

Sanity check on the demo notebook:

demo = json.loads(Path("99_small_demo.ipynb").read_text())["cells"]
last = demo[-1]["id"]
[(m["role"], m["content"][:60]) for m in notebook_to_messages(demo, last)]
[('system', 'You are an AI assistant embedded inside an IPython notebook '),
 ('user', '# Small demo\n> Small test subject used to understand and tes'),
 ('user', 'from nbdialog.core import *\nfrom nbdialog.providers.openai i'),
 ('user', 'write me hello world in python, like a pirate!'),
 ('assistant', '```python\nprint("Ahoy, world!")\n```'),
 ('user', 'def hello():\n    print("Hello matey!")\n\nhello()'),
 ('user', '# Output:\nHello matey!\n'),
 ('user', 'which is better?'),
 ('assistant',
  'Yours is better for the brief.\n\n```python\ndef hello():\n    p'),
 ('user', 'Give me top 5 news about bitcoin as of May 12 2026')]

Providers & tools

The magic doesn’t know or care which vendor answers the prompt — it only needs a Provider: an object whose complete(messages, tools=None) returns a parsed Turn. We encode that as a typing.Protocol so a provider can be any duck-typed object the user happens to have, without inheriting from anything in this package.

A provider is intentionally a single round-trip primitive — translate one model call. The tool-dispatch loop (call model → run tools → re-call) lives in run_completion below, so it isn’t duplicated per provider. Provider docs say the wire format is OpenAI Chat Completions; second-vendor providers translate at their own boundary.

Tools are paired (schema, function) values: an OpenAI-shaped JSON tool envelope plus the Python implementation. We deliberately keep the schema explicit and hand-written for now — generating it from type hints/docstrings is a nice future addition, not a hard requirement.

Both registries are module-level singletons because a notebook kernel is a single global session — there is nothing for a context manager or thread-local to scope. Users register both once at the top of their notebook:

from nbdialog.providers.openai import OpenAIProvider
from nbdialog.tools.search import web_search
set_provider(OpenAIProvider())
set_tools([web_search])

source

Provider


def Provider(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

Vendor adapter: messages in (OpenAI chat-completions format), parsed Turn out. The tool-call loop lives in run_completion, not here — providers translate one round-trip and nothing else.


source

Turn


def Turn(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

One round-trip result from a Provider. Carries everything a caller or loop needs — text, parsed tool calls, usage, and the model’s finish_reason so callers can detect truncation (length) or refusals without re-parsing the raw response.


source

ToolCall


def ToolCall(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

A parsed tool invocation from one model turn.


source

get_tools


def get_tools(
    
)->list:

Return the currently registered tools.


source

set_tools


def set_tools(
    tools:list
)->None:

Register the Tools available on every %%prompt.


source

get_provider


def get_provider(
    
)->Provider:

Return the active provider, or raise with the fix in the message.


source

set_provider


def set_provider(
    p:Provider
)->None:

Register p as the provider the %%prompt magic will call.


source

Tool


def Tool(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

A schema/function pair the model can call. schema is an OpenAI-shaped tool envelope.

Tracing the immediate loop

Opt-in transparency for %%prompt. A Trace records just the loop that produces the answer — the user’s prompt, each model turn, each tool call/result, timings, and token usage. It deliberately excludes the system prompt and the notebook history that got built into messages; those don’t change between runs of a given cell, and the interesting story is what the model did with them.

Rendering is a single collapsible disclosure block, designed to live in the cell’s stored outputs so it persists across notebook saves. Because we return HTML via _repr_html_, the Jupyter save flow handles persistence for us — the same mechanism that already caches the markdown answer.


source

Trace


def Trace(
    
):

Records the immediate loop (prompt, model turns, tool calls). Renders as a collapsible HTML block.

The tool-call loop

run_completion drives a Provider until it produces a final answer. The provider returns one parsed Turn per call; if that turn has tool calls, we append the assistant message (with its tool_calls) and one role: "tool" message per dispatched result, then call the provider again. max_tool_steps bounds the loop in case the model gets stuck in a tool-calling loop.

Pulled out of any individual provider because the loop is vendor-agnostic — every chat-completion provider that supports tools repeats the same dance. Keeping it here means a new provider only translates one round-trip; it doesn’t re-implement dispatch.


source

run_completion


def run_completion(
    provider:Provider, messages:list, tools:list=None, trace:Trace=None, max_tool_steps:int=8
)->Turn:

Drive provider in a tool-call loop and return the final answering Turn.

The %%prompt magic

When the user runs a %%prompt cell we parse the magic line, read this cell’s id from parent_header.metadata.cellId, and either replay the cached output (if one exists and --force wasn’t passed) or build the message list from the notebook so far and call the model. The cache lives in the .ipynb itself — outputs persist through Jupyter’s normal save flow, which is the whole point.

_parse_prompt_args("-f --trace")
Namespace(force=True, trace=True)

source

prompt


def prompt(
    line, cell
):

Send the notebook-so-far to the LLM and render its reply as markdown.