Rapid

class rapid.rapid.Rapid(auth: Optional[RapidAuth] = None)

Bases: object

Rapid is the main class of the rAPId SDK. It wraps the rAPId API endpoints behind a simple programmatic interface.

Parameters:

auth (rapid.auth.RapidAuth, optional) – An instance of the rAPId auth class, which is used for authentication and authorization with the API. Defaults to None.
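As a minimal sketch of constructing the client, the helper below relies only on the import path and the auth keyword shown in the signature above. The helper name build_client is hypothetical, and the import is deferred into the function body so the snippet can be read without the SDK installed.

```python
from typing import Any, Optional


def build_client(auth: Optional[Any] = None) -> Any:
    """Return a Rapid client, optionally with an explicit RapidAuth instance.

    `build_client` is a hypothetical helper, not part of the SDK.
    """
    # Import path taken from the documented class name rapid.rapid.Rapid.
    from rapid.rapid import Rapid

    # Passing auth=None matches the documented default.
    return Rapid(auth=auth)
```

With the SDK installed, build_client() uses the default auth=None, while build_client(auth=my_auth) passes an existing rapid.auth.RapidAuth instance through unchanged.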

convert_dataframe_for_file_upload(df: DataFrame)

Converts a pandas DataFrame to a format that can be used for file uploads to the API.

Parameters:

df (DataFrame) – The pandas DataFrame to convert.

Returns:

A dictionary containing the converted DataFrame in a format suitable for file uploads to the API.

fetch_job_progress(_id: str)

Makes a GET request to the API to fetch the progress of a specific job.

Parameters:

_id (str) – The ID of the job to fetch the progress for.

Returns:

The JSON response from the API.

For more details on the response structure, see the API documentation: https://getrapid.link/api/docs#/Jobs/get_job_jobs__job_id__get
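To illustrate the call, here is a small sketch that fetches progress for several jobs at once. It uses only the documented fetch_job_progress(_id) method. The helper name collect_job_progress is hypothetical, and client stands for an already constructed Rapid instance.

```python
from typing import Any, Dict, Iterable


def collect_job_progress(client: Any, job_ids: Iterable[str]) -> Dict[str, Any]:
    """Map each job ID to the progress response returned by the API.

    `collect_job_progress` is a hypothetical helper, not part of the SDK.
    """
    # One GET request per job ID; responses are returned unmodified.
    return {job_id: client.fetch_job_progress(job_id) for job_id in job_ids}
```

The responses are returned as-is, so their fields should be read against the linked API documentation rather than assumed.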

generate_headers() → Dict

generate_info(df: DataFrame, domain: str, dataset: str)

Generates metadata information for a pandas DataFrame and a specified dataset in the API.

Parameters:
  • df (DataFrame) – The pandas DataFrame to generate metadata for.

  • domain (str) – The domain of the dataset to generate metadata for.

  • dataset (str) – The name of the dataset to generate metadata for.

Raises:

rapid.exceptions.DatasetInfoFailedException – If an error occurs while generating the metadata information.

Returns:

A dictionary containing the metadata information for the DataFrame and dataset.

generate_schema(df: DataFrame, domain: str, dataset: str, sensitivity: str)

Generates a schema for a pandas DataFrame and a specified dataset in the API.

Parameters:
  • df (DataFrame) – The pandas DataFrame to generate a schema for.

  • domain (str) – The domain of the dataset to generate a schema for.

  • dataset (str) – The name of the dataset to generate a schema for.

  • sensitivity (str) – The sensitivity level of the schema to generate.

Raises:

rapid.exceptions.SchemaGenerationFailedException – If an error occurs while generating the schema.

Returns:

A dictionary containing the generated schema for the DataFrame and dataset.
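Because generate_info and generate_schema take the same DataFrame, domain, and dataset, they can be called side by side. The sketch below does that using only the documented signatures. The helper name describe_dataframe and the keys of the returned dictionary are hypothetical.

```python
from typing import Any, Dict


def describe_dataframe(
    client: Any, df: Any, domain: str, dataset: str, sensitivity: str
) -> Dict[str, Any]:
    """Return the generated schema and metadata for one DataFrame.

    `describe_dataframe` and the "schema"/"info" keys are hypothetical.
    """
    # May raise rapid.exceptions.SchemaGenerationFailedException.
    schema = client.generate_schema(df, domain, dataset, sensitivity)
    # May raise rapid.exceptions.DatasetInfoFailedException.
    info = client.generate_info(df, domain, dataset)
    return {"schema": schema, "info": info}
```

Both exceptions are left to propagate so the caller can decide how to handle a failed generation.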

list_datasets()

Makes a POST request to the API to list the current datasets.

Returns:

The JSON response from the API.

For more details on the response structure, see the API documentation: https://getrapid.link/api/docs#/Datasets/list_all_datasets_datasets_post
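Since the response structure is defined by the API rather than by this class, a quick way to inspect it is to pretty-print whatever list_datasets() returns. The sketch below assumes only that the return value is JSON-serialisable. The helper name dump_datasets is hypothetical.

```python
import json
from typing import Any


def dump_datasets(client: Any) -> str:
    """Return the list_datasets() response as indented JSON text.

    `dump_datasets` is a hypothetical helper, not part of the SDK.
    """
    # sort_keys gives a stable layout that is easier to compare between runs.
    return json.dumps(client.list_datasets(), indent=2, sort_keys=True)
```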

upload_dataframe(domain, dataset, df: DataFrame, wait_to_complete: bool = True)

Uploads a pandas DataFrame to a specified dataset in the API.

Parameters:
  • domain (str) – The domain of the dataset to upload the DataFrame to.

  • dataset (str) – The name of the dataset to upload the DataFrame to.

  • df (DataFrame) – The pandas DataFrame to upload.

  • wait_to_complete (bool, optional) – Whether to wait for the upload job to complete before returning. Defaults to True.

Raises:
  • rapid.exceptions.DataFrameUploadValidationException – If the DataFrame’s schema is incorrect.

  • rapid.exceptions.DataFrameUploadFailedException – If an unexpected error occurs while uploading the DataFrame.

Returns:

If wait_to_complete is True, returns “Success” if the upload is successful. If wait_to_complete is False, returns the ID of the upload job if the upload is accepted.
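The two return modes can be combined: submit the upload without blocking, keep the job ID, then wait with a custom polling interval. The sketch below uses only the documented upload_dataframe and wait_for_job_outcome signatures. The helper name upload_then_wait and its poll_seconds parameter are hypothetical.

```python
from typing import Any


def upload_then_wait(
    client: Any, domain: str, dataset: str, df: Any, poll_seconds: int = 5
) -> str:
    """Submit an upload, wait for the job to finish, and return the job ID.

    `upload_then_wait` and `poll_seconds` are hypothetical names.
    """
    # With wait_to_complete=False the call returns the ID of the upload job.
    job_id = client.upload_dataframe(domain, dataset, df, wait_to_complete=False)
    # Returns None on success; raises rapid.exceptions.JobFailedException on failure.
    client.wait_for_job_outcome(job_id, interval=poll_seconds)
    return job_id
```

Keeping the job ID is useful for logging, or for calling fetch_job_progress later if the wait is interrupted.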

wait_for_job_outcome(_id: str, interval: int = 1)

Makes periodic requests to the API to wait for the outcome of a specific job.

Parameters:
  • _id (str) – The ID of the job to wait for the outcome of.

  • interval (int, optional) – The number of seconds to sleep between requests to the API. Defaults to 1.

Returns:

None if the job is successful.

Raises:

rapid.exceptions.JobFailedException – If the job fails.
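The polling behaviour described above can be sketched as a simple loop. This is not the SDK's implementation. The status key and the "SUCCESS" and "FAILED" values are assumptions to check against the Jobs API documentation, JobFailedError is a local stand-in for rapid.exceptions.JobFailedException, and poll_until_done is a hypothetical name.

```python
import time
from typing import Any, Callable, Dict


class JobFailedError(Exception):
    """Local stand-in for rapid.exceptions.JobFailedException."""


def poll_until_done(
    fetch_progress: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 1,
    status_key: str = "status",   # assumed field name
    success: str = "SUCCESS",     # assumed status value
    failure: str = "FAILED",      # assumed status value
    sleep: Callable[[float], Any] = time.sleep,
) -> None:
    """Poll fetch_progress(job_id) until the job succeeds or fails."""
    while True:
        progress = fetch_progress(job_id)
        status = progress.get(status_key)
        if status == success:
            return None
        if status == failure:
            raise JobFailedError(f"Job {job_id} failed: {progress}")
        # Any other status is treated as still in progress.
        sleep(interval)
```

Passing client.fetch_job_progress as fetch_progress ties the loop to a real client. The sleep parameter exists only so the loop can be exercised without real delays.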