clappform.utils

This module defines type aliases and a data structure used to configure gRPC call options.

clappform.utils.default_options(max_attemps=5, initial_backoff='0.1s', max_backoff='1s', backoff_multiplier=2, retryable_status_codes=None)[source]

Generates default gRPC channel options with retry configuration.

The returned list enables retries on the channel and carries a JSON service configuration whose retry policy is built from the parameters below.

Parameters:
  • max_attemps (int) – The maximum number of attempts for an RPC call, passed to gRPC as maxAttempts, which counts the original call as well as any retries. Default is 5.

  • initial_backoff (str) – The initial backoff duration between retry attempts, specified as a string with units (e.g., “0.1s”). Default is “0.1s”.

  • max_backoff (str) – The maximum backoff duration between retry attempts, specified as a string with units (e.g., “1s”). Default is “1s”.

  • backoff_multiplier (int) – The multiplier applied to the backoff duration for each retry attempt. Default is 2.

  • retryable_status_codes (Optional[list[str]]) – A list of gRPC status codes that are considered retryable. If not provided, defaults to [“UNAVAILABLE”].
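How the three backoff parameters interact can be sketched as follows. This is an illustration only: `backoff_ceilings` is a hypothetical helper, not part of clappform, and it takes the durations as plain seconds rather than strings such as “0.1s”. It assumes the usual gRPC retry behaviour, in which the wait before each retry is bounded by the initial backoff multiplied by the multiplier once per previous retry and capped at the maximum backoff. gRPC also applies random jitter, so real delays will differ from these ceilings.

```python
def backoff_ceilings(max_attempts=5, initial_backoff=0.1, max_backoff=1.0,
                     backoff_multiplier=2):
    """Return the upper bound, in seconds, on the wait before each retry.

    Hypothetical helper for illustration; gRPC adds jitter to these values.
    """
    # The first attempt is the original call, so only the rest are retries.
    retries = max_attempts - 1
    return [
        min(initial_backoff * backoff_multiplier ** n, max_backoff)
        for n in range(retries)
    ]


print(backoff_ceilings())
```

With the documented defaults the ceilings are 0.1, 0.2, 0.4 and 0.8 seconds, so the 1 second cap is never reached. Raising the multiplier or the number of attempts brings the cap into play.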

Returns:

A list of gRPC channel options as tuples, where each tuple consists of an option name and its value.

Return type:

GrpcChannelOptions

Example:

>>> options = default_options()
>>> print(options)
[('grpc.enable_retries', 1), ('grpc.service_config', '{"methodConfig": [{"name": [{}], "retryPolicy": {"maxAttempts": 5, "initialBackoff": "0.1s", "maxBackoff": "1s", "backoffMultiplier": 2, "retryableStatusCodes": ["UNAVAILABLE"]}}]}')]

Use these options when creating a gRPC channel so that calls failing with a retryable status code are retried according to the configured policy.
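The shape of the returned options can be reproduced with a short sketch. `build_retry_options` below is a hypothetical stand-in written from the documented parameters and example output; it is not the library's source, and the real implementation may differ. Its parameter is spelled `max_attempts`, unlike the library's `max_attemps`.

```python
import json


def build_retry_options(max_attempts=5, initial_backoff="0.1s", max_backoff="1s",
                        backoff_multiplier=2, retryable_status_codes=None):
    """Hypothetical sketch of building retry options for a gRPC channel."""
    if retryable_status_codes is None:
        retryable_status_codes = ["UNAVAILABLE"]

    service_config = {
        "methodConfig": [
            {
                # An empty name entry applies the policy to every method.
                "name": [{}],
                "retryPolicy": {
                    "maxAttempts": max_attempts,
                    "initialBackoff": initial_backoff,
                    "maxBackoff": max_backoff,
                    "backoffMultiplier": backoff_multiplier,
                    "retryableStatusCodes": retryable_status_codes,
                },
            }
        ]
    }

    # Channel options are (name, value) pairs; the config travels as a JSON string.
    return [
        ("grpc.enable_retries", 1),
        ("grpc.service_config", json.dumps(service_config)),
    ]


options = build_retry_options(retryable_status_codes=["UNAVAILABLE", "DEADLINE_EXCEEDED"])
```

A list of this shape can be passed as the `options` argument when creating a channel, for example with `grpc.insecure_channel(target, options=options)`.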

clappform.utils.insert_many_dataframe(collection, df, size=2500, encoding='utf-8')[source]

Yields InsertRequest objects for chunks of a DataFrame.

This function splits a pandas DataFrame into smaller chunks and yields InsertRequest objects containing JSON-encoded data from each chunk.

Parameters:
  • collection (str) – The name of the collection where data will be inserted.

  • df (pandas.DataFrame) – The DataFrame to be split into chunks and inserted.

  • size (int, optional) – The size of each chunk. Defaults to 2500.

  • encoding (str, optional) – The encoding to be used for JSON data. Defaults to “utf-8”.

Returns:

An iterator over InsertRequest objects containing the JSON-encoded data.

Return type:

Iterator[InsertRequest]

Raises:

ValueError – If the DataFrame is empty.
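The chunk-and-encode pattern described above can be sketched without pandas or gRPC. Everything in this block is an assumption made for illustration: `InsertRequestSketch` is a hypothetical stand-in for InsertRequest (the real message's field names may differ), `insert_many_records` is not a clappform function, and a list of row dictionaries, such as the result of `df.to_dict(orient="records")`, stands in for the DataFrame.

```python
import json
from dataclasses import dataclass
from typing import Iterator


@dataclass
class InsertRequestSketch:
    """Hypothetical stand-in for InsertRequest; real field names may differ."""
    collection: str
    data: bytes


def insert_many_records(collection: str, records: list, size: int = 2500,
                        encoding: str = "utf-8") -> Iterator[InsertRequestSketch]:
    """Yield one request per chunk of at most `size` rows."""
    # Because this is a generator, the check runs when iteration starts.
    if not records:
        raise ValueError("no rows to insert")

    for start in range(0, len(records), size):
        chunk = records[start:start + size]
        # Each chunk is serialized to JSON and encoded to bytes.
        yield InsertRequestSketch(collection, json.dumps(chunk).encode(encoding))


rows = [{"id": i, "name": f"item-{i}"} for i in range(5)]
requests = list(insert_many_records("inventory", rows, size=2))
```

Five rows with a chunk size of 2 produce three requests, the last holding a single row. Yielding requests lazily keeps only one encoded chunk in memory at a time, which suits a streaming RPC.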