nburi, cellid = cellspec_from('/a/b/d#Y114sZmlsZQ==')
nburi, cellid
('/a/b/d', 'Y114sZmlsZQ==')
cellspec_from (s)
VSCode cell id format.
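As a rough illustration of the split shown above, here is a minimal sketch that separates the notebook path from the cell fragment. `split_cellspec` is a hypothetical name, and the real `cellspec_from` may well handle URI schemes or other details that this version ignores.

```python
def split_cellspec(spec: str) -> tuple[str, str]:
    """Split 'notebook-path#cell-fragment' into (path, fragment). Hypothetical helper."""
    # rpartition splits on the last '#', so a '#' inside the path is preserved.
    path, sep, fragment = spec.rpartition('#')
    if not sep:
        # No fragment present: the whole string is the notebook path.
        return spec, ''
    return path, fragment

print(split_cellspec('/a/b/d#Y114sZmlsZQ=='))
```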
IPython cell execution info and __nb__ updater.
Simple IPython event callback that captures cell id and source code of last run cell.
NOTE: in this notebook, __nb__ is NOT a valid notebook state yet. It doesn’t reflect markdown cells, deleted cells, or cell order. For a valid, nbformat-compliant notebook state (NB) that is updated in real time, see 21_nb_state.ipynb.
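The callback idea can be sketched with IPython's `pre_run_cell` and `post_run_cell` events. `LastCellTracker` below is a hypothetical stand-in, not the `CellExecInfo` class documented here, and the demo uses plain stand-in objects so it runs without a kernel. In a live session, `register` would receive `get_ipython()`.

```python
from types import SimpleNamespace

class LastCellTracker:
    "Hypothetical tracker: remembers the id and source of the most recent cell."
    def __init__(self):
        self.current = None   # info for the cell being executed
        self.last = None      # info for the cell that finished most recently

    def pre_run_cell(self, info):
        # `info` mirrors IPython's ExecutionInfo; cell_id may be absent, so use getattr.
        self.current = SimpleNamespace(
            cell_id=getattr(info, 'cell_id', None), source=info.raw_cell, result=None)

    def post_run_cell(self, result):
        # `result` mirrors IPython's ExecutionResult; .result holds the cell's value.
        if self.current is not None:
            self.current.result = result.result
        self.last = self.current

    def register(self, shell):
        # `shell` is assumed to expose IPython's event manager as `shell.events`.
        shell.events.register('pre_run_cell', self.pre_run_cell)
        shell.events.register('post_run_cell', self.post_run_cell)

# Simulate one cell run with stand-in objects instead of a live kernel.
tracker = LastCellTracker()
tracker.pre_run_cell(SimpleNamespace(raw_cell='1+3', cell_id='X43sZmlsZQ=='))
tracker.post_run_cell(SimpleNamespace(result=4))
print(tracker.last.cell_id, tracker.last.source, tracker.last.result)
```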
get_csi (start=False)
get_lastinfo ()
get_info ()
CellExecInfo (start=False)
17
__cellinfo__ stores information about current cell execution, interactiveshell.ExecutionInfo. __cellinfo__.exec_result stores the result of the cell execution, interactiveshell.ExecutionResult, only valid after the cell run.
X43sZmlsZQ==
26
26 26
26 None
23 23
__cellinfo__.source corresponds to In[-1] or _ih[-1].
__lastcellinfo__.exec_result.result corresponds to _ or Out[__lastcellinfo__.exec_result.execution_count].
But we don’t want the result of the last cell, we want the result of the current cell. For that, keep reading.
Caveat: the front-end is not required to send the cell id. See nbformat Cell ids, run_cell.
VSCode reports the nbformat of a newly created notebook as 4.4, but it does send the cell id, though not a well-formed one.
nbclassic does not set the cell ID even if the reported nbformat version is 4.5. Bridget creates one in this case.
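A fallback of this kind can be sketched as follows. `ensure_cell_id` is a hypothetical helper, not Bridget's actual code. The pattern reflects the nbformat 4.5 rule that cell ids are 1 to 64 characters drawn from letters, digits, `-` and `_`.

```python
import re
import uuid

# nbformat 4.5 cell ids: 1-64 characters drawn from letters, digits, '-' and '_'.
CELL_ID_RE = re.compile(r'[a-zA-Z0-9\-_]{1,64}')

def ensure_cell_id(cell_id=None):
    "Return the front-end's id when it sent one, otherwise make a fresh one."
    if cell_id:
        return cell_id  # keep whatever the front-end sent, well formed or not
    return uuid.uuid4().hex[:8]  # 8 hex characters, valid under the pattern above

print(ensure_cell_id('X43sZmlsZQ=='))                     # id supplied by the front-end
print(bool(CELL_ID_RE.fullmatch(ensure_cell_id(None))))   # generated ids are well formed
```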
Note: autoid here is different from the fasthtml fh_cfg['auto_id'] option. Here we’re trying to automatically set the id attribute of the wrapper element of a cell output in the front-end. If we can do so, we’ll be able to target specific cell outputs.
Inspect the “bbbb” output of the previous cell: the parent element should have the class “bridge” and an id.
VSCode wipes out the element when updating the cell, so we need to send the autoid again.
An attempt to provide an IPython display wrapper that automatically handles the display ID so that we can target specific cells. It is not working yet; for now it’s essentially just IPython display.
class DisplayId(DisplayHandle):
    def __init__(self, display_id=None):
        super().__init__(display_id or new_id())
        self._contents = None
        self._sc = to_xml(autoid(self.display_id)[0]) if bridge_cfg.auto_id else ''
    def display(self, obj='', **kwargs):
        self._contents = str(obj)
        IDISPLAY(HTML(self._contents + self._sc), display_id=self.display_id, **kwargs)
    def update(self, obj='', **kwargs):
        kwargs['update'] = True
        self.display(obj, **kwargs)
    def contents(self): return self._contents
display_pub hook for display_id and brd-mark
Tag cell outputs with bridge metadata to target them.
In particular, it turns every display message into a transient one if the message carries a session metadata id (brd_did), and it sets the display_id of each output to that same brd_did value. For HTML display objects, it also marks the DOM parent element in the front-end. With this (session-unique) tag, we can easily address specific outputs from Python.
This will be handy to target specific cell outputs when we can capture the notebook state down the road.
NOTE: display_pub hooks are thread dependent. Here we assume we only set the hook from the main thread.
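The transformation such a hook performs can be sketched as a pure function over the message dictionary. `tag_message` is a hypothetical name and covers only the HTML case with a `brd_did` under the `text/html` metadata; the real hook handles more situations. With ipykernel, a function of this shape could presumably be installed through `display_pub.register_hook`, which is the thread-local registration the note above refers to.

```python
import copy

def tag_message(msg: dict) -> dict:
    "Hypothetical hook body: tag an HTML display message that carries a brd_did."
    content = msg.get('content', {})
    did = content.get('metadata', {}).get('text/html', {}).get('brd_did')
    if not did:
        return msg  # nothing to tag; pass the message through untouched
    msg = copy.deepcopy(msg)  # avoid mutating the caller's message
    content = msg['content']
    # Make the message transient by giving it a display_id equal to brd_did.
    content.setdefault('transient', {})['display_id'] = did
    html = content.get('data', {}).get('text/html')
    if html is not None:
        # Append the marker element so the front-end can locate this output.
        content['data']['text/html'] = f'{html}<brd-mark id="{did}"></brd-mark>'
    return msg

msg = {
    'msg_type': 'display_data',
    'content': {
        'data': {'text/html': '<div>hi</div>'},
        'metadata': {'text/html': {'brd_did': 'b-1'}},
        'transient': {},
    },
}
out = tag_message(msg)
print(out['content']['transient'])
print(out['content']['data']['text/html'])
```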
get_bridged (start=False)
Bridged (start=False)
Augment display messages with bridge stuff.
{
    ...,
    'msg_type': 'display_data',
    'content': {
        'data': {
            'text/plain': '<IPython.core.display.HTML object>',
            'text/html': "<div>I'm marked!... MAAARKED!!</div>"
        },
        'metadata': {'text/html': {'brd_did': 'b8b568b9a-c02e1576-c3a3c120-167cedda'}},
        'transient': {}
    },
    'metadata': {}
}
{
    ...,
    'msg_type': 'display_data',
    'content': {
        'data': {
            'text/plain': '<IPython.core.display.HTML object>',
            'text/html': '<div>I\'m marked!... MAAARKED!!</div><brd-mark id="b8b568b9a-c02e1576-c3a3c120-167cedda"></brd-mark>'
        },
        'metadata': {'text/html': {'brd_did': 'b8b568b9a-c02e1576-c3a3c120-167cedda'}},
        'transient': {'display_id': 'b8b568b9a-c02e1576-c3a3c120-167cedda'}
    },
    'metadata': {}
}
did='b595555a9-c867f02c-47d4b361-005ec7c4'
21
At cell runtime, the current cell can be accessed as __nb__[__cellinfo__.cell_id]; after cell execution, it can be accessed as __nb__[__lastcellinfo__.cell_id].
Note the NBCell instance lifecycle:
- Before code execution: an instance is created with source and id
- After a display statement: the instance is updated with the display_data output
- After the cell run: the instance is updated with the execute_result output
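The three lifecycle stages can be illustrated with a small stand-in class. `CellRecord` and its methods are hypothetical and are not the real NBCell. The output dictionaries follow the nbformat shapes for `display_data` and `execute_result`.

```python
from dataclasses import dataclass, field

@dataclass
class CellRecord:
    "Hypothetical stand-in for NBCell: a code cell that accumulates outputs."
    id: str
    source: str
    outputs: list = field(default_factory=list)

    def add_display(self, data, metadata=None):
        # Called after a display statement inside the cell.
        self.outputs.append(
            {'output_type': 'display_data', 'data': data, 'metadata': metadata or {}})

    def add_result(self, data, execution_count, metadata=None):
        # Called once the cell has finished and produced a final value.
        self.outputs.append(
            {'output_type': 'execute_result', 'execution_count': execution_count,
             'data': data, 'metadata': metadata or {}})

# 1. Before execution: created with source and id.
cell = CellRecord(id='X43sZmlsZQ==', source="display('hi')\n1+3")
# 2. After the display statement.
cell.add_display({'text/plain': "'hi'"})
# 3. After the cell run.
cell.add_result({'text/plain': '4'}, execution_count=1)
print([o['output_type'] for o in cell.outputs])
```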
For convenience, Bridged stores in dhs the last display handles used.
{
    ...,
    'msg_type': 'update_display_data',
    'content': {
        'data': {
            'text/plain': '<IPython.core.display.HTML object>',
            'text/html': "<div>I'm doomed!... DOOOOOMED!!</div>"
        },
        'metadata': {},
        'transient': {'display_id': 'b5c00d851-9da4a95e-36473c24-3d04534d'}
    },
    'metadata': {}
}
{
    ...,
    'msg_type': 'update_display_data',
    'content': {
        'data': {
            'text/plain': '<IPython.core.display.HTML object>',
            'text/html': '<div>I\'m doomed!... DOOOOOMED!!</div><brd-mark id="b5c00d851-9da4a95e-36473c24-3d04534d"></brd-mark>'
        },
        'metadata': {},
        'transient': {'display_id': 'b5c00d851-9da4a95e-36473c24-3d04534d'}
    },
    'metadata': {}
}
i.e., display(..., display_id=True) or display(..., display_id="...")
{
    ...,
    'msg_type': 'display_data',
    'content': {
        'data': {
            'text/plain': '<IPython.core.display.HTML object>',
            'text/html': "<div>I'm marked!... MAAARKED!!</div>"
        },
        'metadata': {},
        'transient': {'display_id': '2307db4acc4fda0ba305ffdda518748a'}
    },
    'metadata': {}
}
{
    ...,
    'msg_type': 'display_data',
    'content': {
        'data': {
            'text/plain': '<IPython.core.display.HTML object>',
            'text/html': '<div>I\'m marked!... MAAARKED!!</div><brd-mark id="2307db4acc4fda0ba305ffdda518748a"></brd-mark>'
        },
        'metadata': {'brd_did': '2307db4acc4fda0ba305ffdda518748a'},
        'transient': {'display_id': '2307db4acc4fda0ba305ffdda518748a'}
    },
    'metadata': {}
}
{
    ...,
    'msg_type': 'update_display_data',
    'content': {
        'data': {'text/plain': '<IPython.core.display.HTML object>', 'text/html': "<div>I'm doomed!... DOOOOOMED!!</div>"},
        'metadata': {},
        'transient': {'display_id': 'c3d21633d341d2463f13ef40730e8c4a'}
    },
    'metadata': {}
}
{
    ...,
    'msg_type': 'update_display_data',
    'content': {
        'data': {
            'text/plain': '<IPython.core.display.HTML object>',
            'text/html': '<div>I\'m doomed!... DOOOOOMED!!</div><brd-mark id="c3d21633d341d2463f13ef40730e8c4a"></brd-mark>'
        },
        'metadata': {},
        'transient': {'display_id': 'c3d21633d341d2463f13ef40730e8c4a'}
    },
    'metadata': {}
}
ffff
When using transient display messages with the display function, displaying multiple objects in one call is ill-defined.
IPython display assigns the same display_id to each object. Front-ends, however, handle this differently.
VSCode displays all objects but treats only the last one as transient.
Lab/Notebook repeats the last object as many times as the number of objects sent.
We can sidestep the issue by using specific Bridge metadata.
Skipped
'Me too'
<IPython.core.display.JSON object>
Skip tagging all display objects.
Skip a specific display object.
If bridge_cfg.auto_id is True, there’s no need to use bridge metadata. Every supported displayed object (see _BRDD_MIMES) will receive an auto-generated display id.
Caveat: be aware that VSCode limits the number of transient display ids (1000 last time I checked); Jupyter, I believe, does not.
'i+1=1'
'i+1=2'
'i+1=3'
'i+1=4'
'i+1=5'
def show_msgs(brdd: Bridged):
    for msg in brdd.msgs.copy():
        d = msg.copy()
        # d['parent_header'] = {'...': '...'}
        # d['header'] = {'...': '...'}
        del d['parent_header'], d['header'], d['tracker'], d['msg_id']
        if not d['metadata']: del d['metadata']
        try: del d['content']['data']['text/plain']
        except: pass
        if h := d['content']['data'].get('text/html'): d['content']['data']['text/html'] = shorten(h, 'r', 120)
        cprint(d)
if DEBUG(): show_msgs(brdd)
Bridget’s goal is to control (at least) all HTML output. Bridge can now set the metadata of any display message that goes through display_pub. Bridge captures every display call, whether made directly or through the FastHTML bridge.
But there is another way to produce output that doesn’t follow the display_pub path: the automatic display of a cell’s final expression. That goes through a different code path, display_hook.
Here Bridget leverages IPython’s own capture mechanism to intercept cell results and redirect them to display, a path that Bridge already controls.
Bridge captures only cell outputs, not stdout/err (yet).
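The underlying trick, swapping sys.displayhook for the duration of a run, can be seen in isolation with plain Python. This is only a sketch of the general technique under that assumption, not Bridget's capturer. `capture_results` is a hypothetical name.

```python
import sys
from contextlib import contextmanager

@contextmanager
def capture_results():
    "Temporarily route expression results into a list instead of printing them."
    captured = []
    saved = sys.displayhook

    def hook(value):
        if value is not None:  # like the default hook, ignore None results
            captured.append(value)

    sys.displayhook = hook
    try:
        yield captured
    finally:
        sys.displayhook = saved  # always restore the original hook

with capture_results() as results:
    # 'single' mode makes Python send expression values to sys.displayhook.
    exec(compile('1 + 3\n', '<cell>', 'single'))

print(results)  # [4]
```

The captured value never reaches the terminal, which is why the capturer has to display it again explicitly afterwards.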
def _transform(lines):
    "Input transformer function"
    cpt = get_capturer()
    if not lines or cpt._capturing or cpt._debugging: return lines
    if DEBUG(): cpt._lines.append(lines)
    if lines[0].startswith('import debugpy'):
        cpt._debugging = True
        return lines
    elif lines[0].startswith('import debugpy;debugpy.listen('): return lines
    elif lines[0].startswith('import debugpy\ndebugpy.debug_this_thread()'): return lines
    elif lines[0].startswith('def __jupyter_exec_background__()'): return lines
    elif lines[0].startswith('import builtins') and lines[1].startswith('import ipykernel'): return lines
    elif lines[0].startswith('import os as _VSCODE_os') and lines[1].startswith('_VSCODE_fileList ='): return lines
    return ['get_capturer()(%r)\n' % ''.join(lines)]
_transform.has_side_effects = False
class OutputCapture:
    shell: InteractiveShell
    def __init__(self):
        super().__init__()
        self._active, self.shell = False, get_ipython() # type: ignore
        if DEBUG(): self._captures = deque(maxlen=100); self._lines = deque(maxlen=100)
        self._capturing, self._debugging, self.run_outputs = False, False, []
        self.displayhook = CapturingDisplayHook(shell=self.shell, outputs=self.run_outputs)
    @property
    def active(self): return self._active
    def start(self):
        if self._active: return
        self._active = True
        self.shell.user_ns['get_capturer'] = get_capturer
        if DEBUG(): self._captures = deque(maxlen=100)
        # shell.input_transformer_manager.line_transforms.append(_transform)
        self.shell.input_transformers_post.append(_transform)
    def stop(self):
        if not self._active: return
        self._active = False
        # try: shell.input_transformer_manager.line_transforms.remove(_transform)
        try: self.shell.input_transformers_post.remove(_transform)
        except (ValueError, NameError): pass
    def __del__(self): self.stop()
    @contextmanager
    def _capture(self):
        self.run_outputs.clear()
        try:
            save_display_hook, sys.displayhook = sys.displayhook, self.displayhook
            self._capturing = True
            yield CapturedIO(stdout=None, stderr=None, outputs=self.run_outputs)
        finally:
            self._capturing = False
            sys.displayhook = save_display_hook
    def __call__(self, cell):
        info: AD = self.shell.user_ns.get('__cellinfo__') # type: ignore
        with self._capture() as io:
            self.shell.run_cell(cell, cell_id=info.cell_id)
        if DEBUG(): self._captures.append([cell, io._outputs.copy()])
        if io._outputs:
            assert len(io._outputs) <= 1, "Only one output is supported"
            info.exec_result.result = io._outputs[-1]
            display(io.outputs[-1], metadata={'bridge': {'captured': True}})
__capturer__ = None
def get_capturer(start:bool=False):
    global __capturer__
    get_csi(True)
    if __capturer__ is None: __capturer__ = OutputCapture()
    if start: __capturer__.start()
    return __capturer__
__cellinfo__.result has a valid value only after the display(...) call. The cell with 1+3 captures the output and then displays it with display(...). That occurs after the cell is executed. So, __cellinfo__.result is None during the cell execution. It’s only possible to get the output after the cell has run.
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[90], line 1
----> 1 1/0
ZeroDivisionError: division by zero
['1+3\n', [{'data': {'text/plain': '4'}, 'metadata': {}}]]
---------
[ "info = __lastcellinfo__\ntest_eq(__nb__[__lastcellinfo__.cell_id].outputs[0].data, {'text/plain': '4'})\ntest_eq(len(brdd.dhs), 1)\nshow(DetailsJSON(__lastcellinfo__, openall=True))\n", [] ]
---------
[ "HTML('<div>asdf</div>')\n", [ { 'data': { 'text/plain': '<IPython.core.display.HTML object>', 'text/html': '<div>asdf</div>\n<brd-mark id="b3c829493-53426d78-6c9050ad-6e20e069"></brd-mark>' }, 'metadata': {} } ] ]
---------
[ 'if DEBUG(): \n output_data = get_capturer()._captures[-1][1][0][\'data\']\n test_eq(output_data[\'text/plain\'], \'<IPython.core.display.HTML object>\')\n html = output_data[\'text/html\']\n test_eq(f\'<brd-mark id="{brdd.dh.display_id}"\' in html, True) # type: ignore\ntest_eq(len(brdd.dhs), 3)\n', [] ]
---------
['print(10)\n17\n', [{'data': {'text/plain': '17'}, 'metadata': {}}]]
---------
['1/0\n', []]
---------
['get_bridged() .stop()\nget_csi().stop()\nget_capturer().stop()\n', []]
---------
{ 'msg_type': 'display_data', 'content': { 'data': {}, 'metadata': {'bridge': {'captured': True}}, 'transient': {'display_id': 'b34bd6418-582983a0-eaaadd4f-12e98d9e'} } }
{ 'msg_type': 'display_data', 'content': { 'data': { 'text/html': '<style>details ul { list-style-type:none; list-style-position: outside; padding-inline-start: 22px; margin: 0; } details…' }, 'metadata': {}, 'transient': {'display_id': 'bc4743314-1703b8af-012c6759-46fba1e2'} } }
{ 'msg_type': 'display_data', 'content': { 'data': {'text/html': '<div>asdf</div>\n<brd-mark id="b3c829493-53426d78-6c9050ad-6e20e069"></brd-mark>'}, 'metadata': {'bridge': {'captured': True}}, 'transient': {'display_id': 'b3c829493-53426d78-6c9050ad-6e20e069'} } }
{ 'msg_type': 'display_data', 'content': { 'data': {}, 'metadata': {'bridge': {'captured': True}}, 'transient': {'display_id': 'b4e7c9b47-1eeb4595-2b85a414-25b2e188'} } }
Output capture with AST hooks.
OutputCapture works correctly, but conflicts with the debugger abound because it alters the source code. Fortunately, IPython has another, much more powerful hook mechanism, ast_transformers, that is cleaner and more flexible.
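To see why an AST hook is less intrusive, consider this minimal sketch. It wraps the final expression of a cell in a call to a capture function, leaving the source text and line numbers untouched. `WrapLastExpr` and `__capture__` are hypothetical names, and this is not the CaptureTransformer documented below. In IPython, an object with a `visit` method like this one could be appended to the shell's `ast_transformers` list.

```python
import ast

class WrapLastExpr(ast.NodeTransformer):
    "Hypothetical transformer: wrap a cell's final expression in __capture__(...)."
    def visit_Module(self, node):
        if node.body and isinstance(node.body[-1], ast.Expr):
            last = node.body[-1]
            call = ast.Call(func=ast.Name(id='__capture__', ctx=ast.Load()),
                            args=[last.value], keywords=[])
            # Reuse the original statement's position so tracebacks still line up.
            node.body[-1] = ast.copy_location(ast.Expr(value=call), last)
            ast.fix_missing_locations(node)
        return node

results = []
ns = {'__capture__': results.append}  # the capture function just records the value
tree = WrapLastExpr().visit(ast.parse('x = 1\nx + 3'))
exec(compile(tree, '<cell>', 'exec'), ns)
print(results)  # [4]
```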
get_capturer (start=False)
CaptureTransformer (mode='direct')
A NodeVisitor subclass that walks the abstract syntax tree and allows modification of nodes.
The NodeTransformer will walk the AST and use the return value of the visitor methods to replace or remove the old node. If the return value of the visitor method is None, the node will be removed from its location, otherwise it is replaced with the return value. The return value may be the original node in which case no replacement takes place.
Here is an example transformer that rewrites all occurrences of name lookups (foo) to data['foo']:
class RewriteName(NodeTransformer):
    def visit_Name(self, node):
        return Subscript(
            value=Name(id='data', ctx=Load()),
            slice=Constant(value=node.id),
            ctx=node.ctx
        )
Keep in mind that if the node you’re operating on has child nodes you must either transform the child nodes yourself or call the generic_visit method for the node first.
For nodes that were part of a collection of statements (that applies to all statement nodes), the visitor may also return a list of nodes rather than just a single node.
Usually you use the transformer like this:
node = YourTransformer().visit(node)
Note that CaptureTransformer changes the semantics of IPython code execution: because it intercepts all cell outputs, it effectively disables the output caching system. If you have any use for the _, _<n>, _oh, or Out variables, CaptureTransformer has the same effect as setting InteractiveShell.cache_size to 0.
During cell execution, IPython replaces sys.displayhook with a custom DisplayHook instance responsible for displaying the result of the cell execution (among many other things). Bridget now handles cell outputs (and does what displayhook did before to show the cell result), so the shell displayhook always receives a result of None. Bridget must therefore replicate some of displayhook’s functionality to ensure that the output cache variables are updated correctly (see __call__); what to do about shell.history_manager is still an open question.
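As a rough picture of what updating the output cache variables involves, here is a simplified sketch that operates on a plain dictionary standing in for the user namespace. `update_output_cache` is a hypothetical helper; IPython's own logic has extra safeguards, such as not clobbering a `_` the user defined, which are omitted here.

```python
def update_output_cache(ns: dict, execution_count: int, result) -> None:
    "Hypothetical helper mimicking how the output cache variables are refreshed."
    if result is None:
        return  # None results are not cached
    # Shift the history: ___ <- __ <- _ <- result
    ns['___'], ns['__'], ns['_'] = ns.get('__', ''), ns.get('_', ''), result
    ns[f'_{execution_count}'] = result
    ns.setdefault('Out', {})[execution_count] = result
    ns['_oh'] = ns['Out']  # _oh and Out refer to the same dictionary

ns = {}
update_output_cache(ns, 1, 4)
update_output_cache(ns, 2, 'a')
print(ns['_'], ns['__'], ns['_1'], ns['Out'])
```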
{'application/javascript',
'application/json',
'application/pdf',
'image/jpeg',
'image/png',
'image/svg+xml',
'text/html',
'text/latex',
'text/markdown',
'text/plain'}
What does this module do?
transient calls with a display_id connected to the cell_id. This way, Bridget effectively knows what each cell output is (except stdout/err for now) and how to reference and modify it using standard IPython features.
Is this enough to make Bridget a functional notebook editor? Not really: we need real-time updates of the notebook structure (the notebook state, what would be saved to disk as .ipynb) to be able to navigate the notebook.
That unfortunately requires navigating the procellous waters of widgets and extensions. Good ol’ IPython and Jupyter are not designed to give the kernel knowledge about the notebook state in real time. We pythonistas deluded ourselves into thinking Jupyter is all about us, but a Jupyter notebook is really a JavaScript application that controls everything, including the model and the view. The kernel is a second-class citizen that knows next to nothing about the notebook, or even what a notebook is.
We’re going to fix that next.