Welcome to mongodb_toolbox’s documentation!

The mongodb_toolbox package provides helpers for a few MongoDB read/write patterns.

mongodb_toolbox.toolbox.bulk_write(db, colname, ops, stats_callback=None)

Apply multiple data operations to a collection using the MongoDB bulk interface.

Parameters:
  • db (Database[RawBSONDocument]) – database object

  • colname (str) – name of collection to write items into

  • ops (list[Union[pymongo.operations.UpdateOne, pymongo.operations.InsertOne, pymongo.operations.DeleteOne]]) – list of mongodb operations

  • stats_callback (Callable[[...], None] | None) – callback to track statistics

Return type:

BulkWriteResult

class mongodb_toolbox.BulkWriter

Collects MongoDB operations and executes them through the bulk interface.

The actual bulk write happens when the number of pending operations reaches the configured threshold (bulk_size).

__init__(db, colname, bulk_size=100, stats_callback=None)

Build BulkWriter instance.

Parameters:
  • db (Database[RawBSONDocument]) – database object

  • colname (str) – name of collection to apply operations to

  • bulk_size (int) – number of operations to buffer before running them through the bulk interface

  • stats_callback (Callable[[...], None] | None) – callback to track statistics

Return type:

None

flush()

Run all pending operations.

Return type:

BulkWriteResult | None

insert_one(*args, **kwargs)

Add a new InsertOne operation to the list of pending operations.

Parameters:
  • *args (Any) – passed directly to the InsertOne constructor

  • **kwargs (Any) – passed directly to the InsertOne constructor

Return type:

BulkWriteResult | None

If the number of pending operations reaches the threshold, all of them are executed through the MongoDB bulk interface.

update_one(*args, **kwargs)

Add a new UpdateOne operation to the list of pending operations.

Parameters:
  • *args (Any) – passed directly to the UpdateOne constructor

  • **kwargs (Any) – passed directly to the UpdateOne constructor

Return type:

BulkWriteResult | None

If the number of pending operations reaches the threshold, all of them are executed through the MongoDB bulk interface.
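The buffering behaviour described for BulkWriter can be illustrated with a small, database-free sketch. It is not the library's implementation: the class name BufferedWriter, its add method, and the execute callback are hypothetical stand-ins chosen for this example. The sketch only shows the pattern of queueing operations, flushing automatically once bulk_size is reached, and returning None while nothing has been executed.

```python
from typing import Any, Callable, List, Optional


class BufferedWriter:
    """Hypothetical stand-in for the buffering pattern; not mongodb_toolbox.BulkWriter."""

    def __init__(self, execute: Callable[[List[Any]], Any], bulk_size: int = 100) -> None:
        self._execute = execute      # called with the pending batch on every flush
        self._bulk_size = bulk_size  # threshold that triggers an automatic flush
        self._ops: List[Any] = []    # operations queued since the last flush

    def add(self, op: Any) -> Optional[Any]:
        """Queue one operation; flush and return the result once the threshold is reached."""
        self._ops.append(op)
        if len(self._ops) >= self._bulk_size:
            return self.flush()
        return None

    def flush(self) -> Optional[Any]:
        """Run all pending operations, or return None if nothing is queued."""
        if not self._ops:
            return None
        batch, self._ops = self._ops, []
        return self._execute(batch)


# Demo with an in-memory recorder instead of a real bulk write.
batches: List[List[str]] = []


def record(batch: List[str]) -> int:
    batches.append(batch)
    return len(batch)


writer = BufferedWriter(execute=record, bulk_size=3)
results = [writer.add(f"op-{i}") for i in range(7)]
final = writer.flush()  # picks up the one operation left below the threshold
```

In this sketch the seventh operation stays in the buffer until flush() is called explicitly, which suggests calling flush() once after the last insert_one or update_one when using a threshold-based writer.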

mongodb_toolbox.toolbox.iterate_collection(db, colname, query, sort_field, chunk_len=1000, fields=None, infinite=False, limit=None, recent_id=None, no_items_sleep_time=5)

Iterate over items in a collection.

The function fetches a chunk of items at once, iterates over it, then fetches the next chunk.

Parameters:
  • db (Database[RawBSONDocument]) – database object

  • colname (str) – name of collection to iterate over

  • query (dict[str, Any]) –

  • sort_field (str) –

  • chunk_len (int) – number of items to fetch per chunk

  • fields (dict[str, int] | None) –

  • infinite (bool) –

  • limit (int | None) –

  • recent_id (int | None) –

  • no_items_sleep_time (int) –

Return type:

Iterable[Any]
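The chunk-by-chunk iteration described above can be sketched without a database. This is one plausible reading of the parameters, not the library's code: the sketch assumes that recent_id marks the last seen value of sort_field and that each chunk contains only items sorted after it. The names iterate_in_chunks, fetch_chunk, and fetch_from_memory are hypothetical, and the infinite and no_items_sleep_time options are left out.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Doc = Dict[str, Any]


def iterate_in_chunks(
    fetch_chunk: Callable[[Optional[Any], int], List[Doc]],
    sort_field: str,
    chunk_len: int = 1000,
    limit: Optional[int] = None,
    recent_id: Optional[Any] = None,
) -> Iterator[Doc]:
    """Yield documents chunk by chunk, resuming after the last seen sort_field value."""
    yielded = 0
    while limit is None or yielded < limit:
        chunk = fetch_chunk(recent_id, chunk_len)
        if not chunk:
            return  # nothing left to read
        for doc in chunk:
            yield doc
            yielded += 1
            recent_id = doc[sort_field]  # the next chunk starts after this value
            if limit is not None and yielded >= limit:
                return


# In-memory stand-in for a sorted, filtered, size-limited query. With pymongo
# this could plausibly be written as (assumption, not taken from the library):
#   db[colname].find({**query, sort_field: {"$gt": after}})
#              .sort(sort_field, 1).limit(size)
DATA: List[Doc] = [{"_id": i, "name": f"item-{i}"} for i in range(1, 8)]


def fetch_from_memory(after: Optional[Any], size: int) -> List[Doc]:
    rows = sorted(DATA, key=lambda d: d["_id"])
    if after is not None:
        rows = [d for d in rows if d["_id"] > after]
    return rows[:size]


all_ids = [d["_id"] for d in iterate_in_chunks(fetch_from_memory, "_id", chunk_len=3)]
limited_ids = [
    d["_id"]
    for d in iterate_in_chunks(fetch_from_memory, "_id", chunk_len=3, limit=4, recent_id=2)
]
```

Paging on the sort field rather than on an offset keeps each query cheap for large collections, provided the sort field is indexed and its values are unique.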

mongodb_toolbox.toolbox.bulk_insert_dup_retok(db, colname, ops, dup_key, stats_callback=None)

Parameters:
  • db (Database[RawBSONDocument]) – database object

  • colname (str) – name of collection to write items into

  • ops (list[pymongo.operations.InsertOne[Any]]) –

  • dup_key (str | list[str]) –

  • stats_callback (Callable[[...], None] | None) –

Return type:

list[Any]

mongodb_toolbox.toolbox.bulk_insert_dup(db, colname, ops, stats_callback=None)

Execute multiple insert operations, ignoring all duplicate key errors.

Parameters:
  • db (Database[RawBSONDocument]) – database object

  • colname (str) – name of collection to write items into

  • ops (list[pymongo.operations.InsertOne[Any]]) –

  • stats_callback (Callable[[...], None] | None) –

Return type:

None
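Ignoring duplicate key errors usually means telling them apart from other write errors. The sketch below is not taken from mongodb_toolbox: split_write_errors is a hypothetical helper, and sample_details is a hand-written example. It relies on two facts about MongoDB and pymongo: duplicate key violations carry server error code 11000, and pymongo's BulkWriteError exposes a details dictionary whose "writeErrors" list holds one entry per failed operation.

```python
from typing import Any, Dict, List, Tuple

DUPLICATE_KEY_CODE = 11000  # MongoDB server error code for duplicate key violations


def split_write_errors(
    details: Dict[str, Any],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split write errors into duplicate-key errors and all other errors."""
    dups: List[Dict[str, Any]] = []
    others: List[Dict[str, Any]] = []
    for err in details.get("writeErrors", []):
        if err.get("code") == DUPLICATE_KEY_CODE:
            dups.append(err)
        else:
            others.append(err)
    return dups, others


# Hand-written example shaped like BulkWriteError.details (illustrative only).
sample_details = {
    "writeErrors": [
        {"index": 1, "code": 11000, "errmsg": "E11000 duplicate key error"},
        {"index": 4, "code": 121, "errmsg": "Document failed validation"},
    ],
    "nInserted": 3,
}

dups, others = split_write_errors(sample_details)
```

A caller could then ignore the bulk write failure when others is empty and re-raise it otherwise. Unordered bulk writes (ordered=False in pymongo) let the server attempt the remaining inserts after a duplicate is hit.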
