Notebook Statistics

stats can show the following statistics:

  • summary of required notebook metadata (always shown)

  • list of other notebook metadata fields

  • list of additional notebook fields (outside metadata; should normally not be present)

  • cell statistics:

    • count per cell type (markdown, code, raw), and their total

    • size statistics for cell sources for all cells, and for markdown, code, and raw cells separately:

      • number of empty cells

      • total number of lines (non-empty), words, characters (non-whitespace).

    • count per cell metadata field, and

      • count per tag in tags metadata

    • count of attachments, and

      • count per attachment MIME type

    • count of outputs for code cells (note that there can be multiple outputs per code cell)

      • count of code cells without output

      • count per output type (execute_result, stream, display_data, error), and their total

      • count per stream (stdout, stderr)

      • count per error (by ename)

    • execution counts for code cells:

      • count executed

      • count executed in linear order

      • maximum execution count (max # in In[#])

      • count not executed

      • count not executed in linear order

    • list of additional cell fields (outside metadata; should normally not be present)

Use options to select reporting of specific statistics.
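
The cell type counts and source size statistics listed above can be approximated directly from the notebook JSON. The following is a minimal sketch, not the nbtb implementation: the function names source_text and cell_statistics are hypothetical, and it assumes the nbformat 4 layout, in which each cell has a cell_type and a source that is either a string or a list of strings. For brevity it sums the sizes over all cells only, without the per-type breakdown.

```python
from collections import Counter


def source_text(cell):
    """Return the cell source as one string (a string or a list of strings is allowed)."""
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def cell_statistics(notebook):
    """Count cells per type and sum source sizes over all cells (hypothetical helper)."""
    types = Counter()
    sizes = Counter()
    for cell in notebook.get("cells", []):
        types[cell["cell_type"]] += 1
        text = source_text(cell)
        if not text.strip():
            sizes["empty"] += 1
        # Non-empty lines, whitespace-separated words, non-whitespace characters.
        sizes["lines"] += sum(1 for line in text.splitlines() if line.strip())
        sizes["words"] += len(text.split())
        sizes["chars"] += sum(1 for ch in text if not ch.isspace())
    types["total"] = sum(types.values())
    return dict(types), dict(sizes)


# A tiny in-memory notebook, made up for illustration.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "\n", "Some text"]},
        {"cell_type": "code", "source": "x = 1\nprint(x)\n"},
        {"cell_type": "code", "source": ""},
    ]
}
types, sizes = cell_statistics(notebook)
print(types)
print(sizes)
```

To try this on a real file, load it with json.load and pass the resulting dict to cell_statistics.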

Options

The following options are supported by stats:

--all, --no-all       show all statistics (default: False)
-c, --cell-types, -C, --no-cell-types
                      count cell types (default: True)
-s, --sources, -S, --no-sources
                      statistics for cell sources (default: False)
-m, --metadata, -M, --no-metadata
                      show notebook metadata and count cell metadata
                      (default: False)
-t, --tags, -T, --no-tags
                      count individual cell tags (default: False)
-a, --attachments, -A, --no-attachments
                      count cell attachment MIME types (default: False)
-o, --outputs, -O, --no-outputs
                      count code cell outputs (default: False)
--streams, --no-streams
                      count code cell output stream names (default: False)
-e, --errors, -E, --no-errors
                      count code cell error names (default: False)
-x, --execution, -X, --no-execution
                      statistics for code execution (default: False)
--extra, --no-extra   report extra fields outside metadata (default: False)

JSON Output

See Write JSON Output for general information about JSON output.

stats produces the following members in the JSON output, where (*) refers to details below:

Name                        Value
"notebook_metadata"         object with required notebook metadata
"notebook_other_metadata"   object with 1 for each other notebook metadata field
"notebook_extra_fields"     object with 1 for each extra notebook field
"cell_types"                object with count per cell type (*)
"sources"                   object with empty/line/word/char/total source counts per cell type and totals (*)
"cell_metadata"             object with counts of cell metadata fields
"cell_attachments"          object with counts of cell attachments per type and totals
"code_execution"            object with counts for code cell execution (*)
"code_outputs"              object with counts of code cell outputs per type and totals (*)
"cell_extra"                object with count for each extra cell field

Details for "cell_types":

Name                  Value
"markdown"            count of markdown cells
"code"                count of code cells
"raw"                 count of raw cells
"total cell count"    count of all cells

Details for "sources":

Name                  Value
"CT source W"         count of W for cell type CT
"CT empty sources"    count of empty sources for cell type CT

Here, cell type CT is one of markdown, code, raw, or total, and W is one of chars, lines, or words.
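
The naming scheme for the "sources" members can be unpacked mechanically. The sketch below is illustrative only: tabulate_sources is a hypothetical helper, the sample values are copied from the "Cell sources" part of the --all example under Examples, and treating a missing name as a count of 0 is an assumption, not documented behavior.

```python
CELL_TYPES = ("markdown", "code", "raw", "total")
MEASURES = ("chars", "lines", "words")


def tabulate_sources(sources):
    """Turn a flat "sources" object into a nested {cell type: {measure: count}} dict."""
    table = {}
    for cell_type in CELL_TYPES:
        # Assumption: a name that is absent means a count of 0.
        row = {m: sources.get(f"{cell_type} source {m}", 0) for m in MEASURES}
        row["empty"] = sources.get(f"{cell_type} empty sources", 0)
        table[cell_type] = row
    return table


# Values taken from the --all example output for test.ipynb.
sources = {
    "code source chars": 561,
    "code source lines": 20,
    "code source words": 115,
    "markdown source chars": 454,
    "markdown source lines": 21,
    "markdown source words": 89,
    "total source chars": 1015,
    "total source lines": 41,
    "total source words": 204,
}
table = tabulate_sources(sources)
for cell_type, row in table.items():
    print(cell_type, row)
```

In this sample the markdown and code rows add up to the total row for each measure, since the notebook has no raw cells.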

Details for "code_execution":

Name                              Value
"executed"                        count of executed code cells
"executed in linear order"        count of code cells executed in linear order from beginning
"maximum In[#]"                   maximum execution count of executed code cells
"not executed"                    count of code cells not executed
"not executed in linear order"    count of code cells not executed in linear order from beginning
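
The execution counts can be illustrated with a short sketch. It is not taken from nbtb: execution_statistics is a hypothetical helper, and the reading of "linear order from beginning" used here (the k-th code cell has execution count k, counted until the first cell that breaks the sequence) is an assumption that may differ from what stats actually does.

```python
def execution_statistics(notebook):
    """Summarize execution counts of code cells (hypothetical helper)."""
    counts = [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    executed = [c for c in counts if c is not None]

    # Assumption: cells are "in linear order" while In[#] equals the 1-based
    # position among the code cells; counting stops at the first mismatch.
    linear = 0
    for position, count in enumerate(counts, start=1):
        if count != position:
            break
        linear += 1

    stats = {
        "executed": len(executed),
        "executed in linear order": linear,
        "not executed": len(counts) - len(executed),
        "not executed in linear order": len(counts) - linear,
    }
    if executed:
        stats["maximum In[#]"] = max(executed)
    return stats


# Made-up notebook: five code cells with In[1], In[2], In[5], no count, In[4].
notebook = {
    "cells": [
        {"cell_type": "code", "execution_count": n, "source": "", "outputs": []}
        for n in (1, 2, 5, None, 4)
    ]
}
stats = execution_statistics(notebook)
print(stats)
```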

Details for "code_outputs":

Name                    Value
"empty outputs"         count of code cells without output
"display_data"          count of display_data output
"error"                 count of error output (total)
"error E"               count of error output with exception E
"execute_result"        count of execute_result output
"stream"                count of stream output (total)
"stream S"              count of stream output on stream S
"total output count"    count of all output items (total over all code cells)

Note that each code cell can have zero or more output items in its outputs array.
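
Because a code cell can hold several output items, the counts are easiest to picture as a tally over items rather than over cells. The following is a minimal sketch under that reading, not the nbtb implementation: count_outputs is a hypothetical helper, and it relies on the nbformat 4 fields output_type, name (for stream output), and ename (for error output).

```python
from collections import Counter


def count_outputs(notebook):
    """Tally output items of code cells by type, stream name, and error name."""
    counts = Counter()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        outputs = cell.get("outputs", [])
        if not outputs:
            counts["empty outputs"] += 1
        for output in outputs:
            kind = output["output_type"]
            counts[kind] += 1
            if kind == "stream":
                counts[f"stream {output['name']}"] += 1
            elif kind == "error":
                counts[f"error {output['ename']}"] += 1
            counts["total output count"] += 1
    return dict(counts)


# Made-up notebook: one cell with two outputs, one with an error, one without output.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "text"},
        {"cell_type": "code", "outputs": [
            {"output_type": "stream", "name": "stdout", "text": "hi\n"},
            {"output_type": "execute_result", "data": {}, "metadata": {}},
        ]},
        {"cell_type": "code", "outputs": [
            {"output_type": "error", "ename": "ZeroDivisionError",
             "evalue": "division by zero", "traceback": []},
        ]},
        {"cell_type": "code", "outputs": []},
    ]
}
counts = count_outputs(notebook)
print(counts)
```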

Examples

Report required notebook metadata and cell types for two notebooks, and write JSON output to nbtb-stats-output.json:

$ nbtb stats short.ipynb test.ipynb --output-json nbtb-stats-output.json

::::::::::::::
short.ipynb
::::::::::::::
Notebook metadata:
         4.2 format version
     python3 kernel
  python 3.6.1 language
Cell types:
           1 code
           1 markdown
           2 total cell count

::::::::::::::
test.ipynb
::::::::::::::
Notebook metadata:
         4.2 format version
     python3 kernel
  python 3.6.1 language
Cell types:
          12 code
           7 markdown
          19 total cell count

Totals
======

Cell types:
          13 code
           8 markdown
          21 total cell count
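
The Totals block above is the sum of the per-notebook cell type counts. As a quick check, here is a sketch that adds up the counts shown for short.ipynb and test.ipynb; the dict layout is made up for illustration and is not the structure of the JSON file that nbtb writes.

```python
from collections import Counter

# Cell type counts copied from the example output above.
per_notebook = {
    "short.ipynb": {"code": 1, "markdown": 1, "total cell count": 2},
    "test.ipynb": {"code": 12, "markdown": 7, "total cell count": 19},
}

# Counter.update adds counts key by key.
totals = Counter()
for counts in per_notebook.values():
    totals.update(counts)

for name, count in sorted(totals.items()):
    print(f"{count:>12} {name}")
```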

All statistics for one notebook:

$ nbtb stats --all test.ipynb
Notebook metadata:
         4.2 format version
     python3 kernel
  python 3.6.1 language
Other notebook metadata fields:
           1 celltoolbar
           1 toc
Cell types:
          12 code
           7 markdown
          19 total cell count
Cell sources:
         561 code source chars
          20 code source lines
         115 code source words
         454 markdown source chars
          21 markdown source lines
          89 markdown source words
        1015 total source chars
          41 total source lines
         204 total source words
Cell metadata fields:
           6 collapsed
           2 scrolled
           1 tag Test
           3 tag YourTurn
           3 tags
Cell attachments:
           1 image/png
           1 total attachments count
           1 total count of cells with attachments
Code cell outputs:
           3 code cells without outputs
           1 display_data
           2 error
           1 error NameError
           1 error ZeroDivisionError
           4 execute_result
           5 stream
           1 stream stderr
           4 stream stdout
          12 total output count
Code cell execution:
          11 executed
           3 executed in linear order
          15 maximum In[#]
           1 not executed
           9 not executed in linear order