This adds a new extra package dependency on `json-logging`
and an environment variable, which when set to "true" regardless
of case, will try to use the json-logging non-web app log formatter.
If the `json-logging` package is not installed but the environment
variable is True, something like this will be logged but it will not
crash the application:
```
$ ENABLE_JSON_LOGGING=true jupyter notebook
Unable to use json logging due to missing packages. Run "pip install notebook[json-logging]" to fix.
Traceback (most recent call last):
File "/home/osboxes/jupyter/notebook/notebook/notebookapp.py", line 144, in <module>
import json_logging
ModuleNotFoundError: No module named 'json_logging'
```
Initially I tried to add a new Bool config option to toggle this but
the problem is (from my limited understanding of traitlets and tornado)
is that `_log_formatter_cls` needs to be set early and trying to se the
log formatter later in `init_logging` is too late - or at least I couldn't
figure out a way to reset the log formatter dynamically (I tried calling
`_log_format_changed` [1] but that didn't work).
With this working you get logs like this:
```
{"written_at": "2020-10-07T21:10:51.606Z", "written_ts": 1602105051606265000, "msg": "404 GET /nbextensions/widgets/notebook/js/extension.js (127.0.0.1) 9.26ms referer=http://localhost:8888/notebooks/Untitled.ipynb", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "WARNING", "module": "log", "line_no": 49}
{"written_at": "2020-10-07T21:10:54.443Z", "written_ts": 1602105054443309000, "msg": "Starting buffering for f260ddd7-938c-42d0-ac3b-455bea76694f:49f30b53fc4b4ec6a8f2fb748a171613", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "INFO", "module": "kernelmanager", "line_no": 222}
{"written_at": "2020-10-07T21:10:54.446Z", "written_ts": 1602105054446264000, "msg": "Kernel shutdown: f260ddd7-938c-42d0-ac3b-455bea76694f", "type": "log", "logger": "NotebookApp", "thread": "MainThread", "level": "INFO", "module": "multikernelmanager", "line_no": 201}
```
An obvious improvement here would to be able to split the `msg` fields apart
so that we can log things like response status code, request method, request
URL, response_time_ms etc. That should be possible with the `JSONLogWebFormatter`
from `json-logging` but when I tried using that I was getting errors from
the library about a correlation id (which is based on a request header we
don't use). The `json-logging` library supports several web frameworks like
Flask but unfortunately does not have built in support for Tornado, but it does
support custom formatters so that might be a follow up option to improve on this.
[1] https://github.com/ipython/traitlets/blob/4.3.3/traitlets/config/application.py#L195
Closes #5798
pull/5799/head
parent
2ba296039a
commit
1929fb53b8
Loading…
Reference in new issue