A Python utility that plots a histogram of the occurrences of
n-grams computed from a given string.
For example, to plot bigrams:
./ --text 'the quick brown fox and the quick brown hare' --n 2
[('the quick', 2), ('quick brown', 2), ('brown fox', 1), ('fox and', 1), ('and the', 1), ('brown hare', 1)]
bigram count
++++++++++++++++++++++++++++++++++++++++++++++++++ 2 the quick
++++++++++++++++++++++++++++++++++++++++++++++++++ 2 quick brown
+++++++++++++++++++++++++ 1 brown fox
+++++++++++++++++++++++++ 1 fox and
+++++++++++++++++++++++++ 1 and the
+++++++++++++++++++++++++ 1 brown hare
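The counts in the list above can be reproduced with a few lines of Python. This is a minimal sketch, not the utility's actual implementation: the function name `count_ngrams` and the whitespace tokenisation are assumptions made for illustration.

```python
from collections import Counter


def count_ngrams(text, n):
    """Count word n-grams in `text` (hypothetical helper, not the project's API)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    words = text.split()  # assumption: tokens are whitespace-separated words
    # Slide a window of n words across the text and join each window into one string.
    ngrams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    # most_common() sorts by count, keeping first-seen order among ties.
    return Counter(ngrams).most_common()


if __name__ == "__main__":
    print(count_ngrams("the quick brown fox and the quick brown hare", 2))
```

With the sample sentence and `n=2`, this yields the same list of pairs as the bigram output shown above. A text with fewer than `n` words simply produces an empty list.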
To plot trigrams:
./ --text 'the quick brown fox and the quick brown hare' --n 3
[('the quick brown', 2), ('quick brown fox', 1), ('brown fox and', 1), ('fox and the', 1), ('and the quick', 1), ('quick brown hare', 1)]
bigram count
++++++++++++++++++++++++++++++++++++++++++++++++++ 2 the quick brown
+++++++++++++++++++++++++ 1 quick brown fox
+++++++++++++++++++++++++ 1 brown fox and
+++++++++++++++++++++++++ 1 fox and the
+++++++++++++++++++++++++ 1 and the quick
+++++++++++++++++++++++++ 1 quick brown hare
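The bars in the sample plots appear to be scaled so that the most frequent n-gram gets 50 `+` characters and the others are proportional (a count of 1 against a maximum of 2 gives 25). A rough sketch of that rendering follows. The scaling rule is inferred from the sample output, and the name `render_histogram` and the rounding behaviour are assumptions.

```python
def render_histogram(counts, width=50):
    """Return one text bar per (ngram, count) pair.

    Assumption: the longest bar is `width` characters and the others are
    scaled proportionally, as inferred from the sample output.
    """
    if not counts:
        return []
    highest = max(count for _, count in counts)
    lines = []
    for ngram, count in counts:
        bar = "+" * round(width * count / highest)
        lines.append(f"{bar} {count} {ngram}")
    return lines


if __name__ == "__main__":
    sample = [("the quick brown", 2), ("quick brown fox", 1)]
    print("\n".join(render_histogram(sample)))
```

Scaling against the largest count keeps the plot within a fixed width regardless of how long the input text is.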
To pass in a file as input
./ --file /path/to/some/file
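The `--text`, `--file` and `--n` options used in these examples could be wired up with `argparse` roughly as follows. This is a sketch under assumptions, not the utility's actual code: the function names `build_parser` and `read_input`, the mutual exclusion of `--text` and `--file`, and the default of `n=2` are all guesses made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the options used in the examples (hypothetical)."""
    parser = argparse.ArgumentParser(description="Plot a histogram of n-gram counts.")
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--text", help="string to analyse")
    source.add_argument("--file", help="path to a file whose contents are analysed")
    # Assumption: bigrams are the default when --n is omitted.
    parser.add_argument("--n", type=int, default=2, help="size of each n-gram")
    return parser


def read_input(args):
    """Return the text to analyse, from --text or from the file given by --file."""
    if args.text is not None:
        return args.text
    with open(args.file, encoding="utf-8") as handle:
        return handle.read()


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--text", "the quick brown fox", "--n", "3"])
print(args.n, read_input(args))
```

Reading the file up front means the rest of the program only ever deals with a single string, whichever option supplied it.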
To run ./ in a Docker container:
make image
docker run --rm -it ngram /venv/runner --file /etc/motd
To supply a file from the Docker host:
docker run -it -v "$PWD:$PWD" -w "$PWD" ngram /venv/runner --file "$PWD/path/to/some/file"
Unit tests cover the n-gram parsing and counting logic and are invoked via pytest using a make target.
$ make test
. venv/bin/activate ; \
sh -c ' \
pytest -rxXs --tap-stream tests/*/*.py --color=auto --full-trace; \
pytest -rxXs --cov=ngram tests/*/*.py; \
ok 1 tests/integration/
ok 2 tests/integration/
ok 3 tests/unit/
ok 4 tests/unit/
ok 5 tests/unit/
ok 6 tests/unit/
ok 7 tests/unit/
=================================== test session starts ====================================
platform linux -- Python 3.7.3rc1, pytest-4.4.0, py-1.8.0, pluggy-0.9.0
rootdir: /home/unop/projects/bigram
plugins: tap-2.3, cov-2.6.1
collected 7 items
tests/integration/ .. [ 28%]
tests/unit/ ..... [100%]
===================================== warnings summary =====================================
--------- coverage: platform linux, python 3.7.3-candidate-1 ---------
Name Stmts Miss Cover
ngram/ 14 0 100%
ngram/ 13 0 100%
ngram/ 2 0 100%
TOTAL 29 0 100%
=========================== 7 passed, 1 warnings in 0.04 seconds =======================
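For orientation, a unit test for the counting logic might look like the sketch below. It is not taken from the project's test suite: the helper `count_ngrams` is a stand-in defined inline so the example runs on its own, and the project's `NgramParser` may expose a different interface.

```python
from collections import Counter


def count_ngrams(text, n):
    """Stand-in for the code under test (hypothetical, defined here for self-containment)."""
    words = text.split()
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))


def test_repeated_bigrams_are_counted():
    counts = count_ngrams("the quick brown fox and the quick brown hare", 2)
    assert counts["the quick"] == 2
    assert counts["brown hare"] == 1


def test_text_shorter_than_n_yields_no_ngrams():
    assert count_ngrams("too short", 3) == Counter()
```

pytest collects any function whose name starts with `test_`, so saving this in a file under the tests directory would be enough for it to be picked up.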
A Makefile is included to install the necessary Python dependencies into a venv, run tests, etc.
make venv # Setup a venv and install all prerequisites on a clean slate
make run # Run ./
make image # Create a docker image with a venv to run ./
make test # Run NgramParser unit tests
make clean # Clean up workspace and remove venv back to a clean slate
make all # Do a complete end-to-end and run all make targets
# i.e. make venv, make test, make run, make clean
make venv will attempt to install the following prerequisites (or similar equivalents) on a Debian-based system or container if they are found to be missing:
venv
curl