Rapid
- class rapid.rapid.Rapid(auth: Optional[RapidAuth] = None)
Bases: object
The rAPId class is the main SDK class for the rAPId API. It acts as a wrapper for the various API endpoints, providing a simple and intuitive programmatic interface.
- Parameters:
auth (rapid.auth.RapidAuth, optional) – An instance of the rAPId auth class, which is used for authentication and authorization with the API. Defaults to None.
- convert_dataframe_for_file_upload(df: DataFrame)
Converts a pandas DataFrame to a format that can be used for file uploads to the API.
- Parameters:
df (DataFrame) – The pandas DataFrame to convert.
- Returns:
A dictionary containing the converted DataFrame in a format suitable for file uploads to the API.
- create_schema(schema: Schema)
Creates a new schema on the API.
- Parameters:
schema (rapid.items.schema.Schema) – The schema model to create.
- Raises:
rapid.exceptions.SchemaAlreadyExistsException – If you try to create a schema that already exists in rAPId.
rapid.exceptions.SchemaCreateFailedException – If an error occurs while trying to create the schema.
- download_dataframe(domain: str, dataset: str, version: Optional[int] = None, query: Query = Query(select_columns=None, filter=None, group_by_columns=None, aggregation_conditions=None, order_by_columns=None, limit=None)) -> DataFrame
Downloads data to a pandas DataFrame based on the domain, dataset and version passed.
- Parameters:
domain (str) – The domain of the dataset to download the DataFrame from.
dataset (str) – The dataset from the domain to download the DataFrame from.
version (int, optional) – Version of the dataset to download.
query (rapid.items.query.Query, optional) – An optional query to apply when downloading data. Defaults to an empty query.
- Raises:
rapid.exceptions.DatasetNotFoundException – If the specified domain, dataset and version do not exist in the rAPId instance.
- Returns:
A pandas DataFrame of the data.
- Return type:
DataFrame
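As an illustration of the version parameter, the following minimal sketch downloads several versions of one dataset. It assumes client is an already-constructed Rapid instance; the helper name download_versions is hypothetical and not part of the SDK.

```python
from typing import Any, Dict, Iterable


def download_versions(
    client: Any, domain: str, dataset: str, versions: Iterable[int]
) -> Dict[int, Any]:
    """Download each requested version of a dataset into its own DataFrame.

    `client` is assumed to be a rapid.rapid.Rapid instance (hypothetical helper).
    """
    frames: Dict[int, Any] = {}
    for version in versions:
        # download_dataframe is documented to raise DatasetNotFoundException
        # when the domain/dataset/version combination does not exist.
        frames[version] = client.download_dataframe(domain, dataset, version=version)
    return frames
```

The query argument is left at its default here, so each call returns the dataset unfiltered.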
- fetch_job_progress(_id: str)
Makes a GET request to the API to fetch the progress of a specific job.
- Parameters:
_id (str) – The ID of the job to fetch the progress for.
- Returns:
The API’s JSON response.
For more details on the response structure, see the API documentation: https://getrapid.link/api/docs#/Jobs/get_job_jobs__job_id__get
- generate_headers() -> Dict
- generate_info(df: DataFrame, domain: str, dataset: str)
Generates metadata information for a pandas DataFrame and a specified dataset in the API.
- Parameters:
df (DataFrame) – The pandas DataFrame to generate metadata for.
domain (str) – The domain of the dataset to generate metadata for.
dataset (str) – The name of the dataset to generate metadata for.
- Raises:
rapid.exceptions.DatasetInfoFailedException – If an error occurs while generating the metadata information.
- Returns:
A dictionary containing the metadata information for the DataFrame and dataset.
- generate_schema(df: DataFrame, domain: str, dataset: str, sensitivity: str) -> Schema
Generates a schema for a pandas DataFrame and a specified dataset in the API.
- Parameters:
df (DataFrame) – The pandas DataFrame to generate a schema for.
domain (str) – The domain of the dataset to generate a schema for.
dataset (str) – The name of the dataset to generate a schema for.
sensitivity (str) – The sensitivity level of the schema to generate.
- Raises:
rapid.exceptions.SchemaGenerationFailedException – If an error occurs while generating the schema.
- Returns:
A Schema instance generated for the DataFrame and dataset.
- Return type:
rapid.items.schema.Schema
- list_datasets()
Makes a POST request to the API to list the current datasets.
- Returns:
The API’s JSON response.
For more details on the response structure, see the API documentation: https://getrapid.link/api/docs#/Datasets/list_all_datasets_datasets_post
- update_schema(schema: Schema)
Uploads an updated schema to the API.
- Parameters:
schema (rapid.items.schema.Schema) – The new schema model that will be used for the update.
- Raises:
rapid.exceptions.SchemaUpdateFailedException – If an error occurs while trying to update the schema.
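The schema methods generate_schema, create_schema and update_schema can plausibly be combined into a "create, or update if it already exists" workflow. The sketch below is one possible arrangement, not a documented SDK recipe: ensure_schema is a hypothetical helper, client is assumed to be a Rapid instance, and the fallback exception class exists only so the sketch runs without the SDK installed.

```python
from typing import Any

try:
    from rapid.exceptions import SchemaAlreadyExistsException
except ImportError:
    # Stand-in so the sketch can run without the SDK installed (assumption).
    class SchemaAlreadyExistsException(Exception):
        pass


def ensure_schema(
    client: Any, df: Any, domain: str, dataset: str, sensitivity: str
) -> Any:
    """Generate a schema from `df`, create it, and update it if it already exists."""
    schema = client.generate_schema(df, domain, dataset, sensitivity)
    try:
        client.create_schema(schema)
    except SchemaAlreadyExistsException:
        # The schema is already registered, so push the generated one as an update.
        client.update_schema(schema)
    return schema
```

Whether overwriting an existing schema with a freshly generated one is appropriate depends on the dataset; a stricter caller might re-raise instead of updating.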
- upload_dataframe(domain: str, dataset: str, df: DataFrame, wait_to_complete: bool = True)
Uploads a pandas DataFrame to a specified dataset in the API.
- Parameters:
domain (str) – The domain of the dataset to upload the DataFrame to.
dataset (str) – The name of the dataset to upload the DataFrame to.
df (DataFrame) – The pandas DataFrame to upload.
wait_to_complete (bool, optional) – Whether to wait for the upload job to complete before returning. Defaults to True.
- Raises:
rapid.exceptions.DataFrameUploadValidationException – If the DataFrame’s schema is incorrect.
rapid.exceptions.DataFrameUploadFailedException – If an unexpected error occurs while uploading the DataFrame.
- Returns:
If wait_to_complete is True, returns “Success” if the upload is successful. If wait_to_complete is False, returns the ID of the upload job if the upload is accepted.
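To make the two return modes concrete, here is a minimal sketch that submits an upload without blocking and waits for the job separately. It assumes client is a Rapid instance; upload_then_wait is a hypothetical helper name, and the interval of 5 seconds is an arbitrary choice.

```python
from typing import Any


def upload_then_wait(
    client: Any, domain: str, dataset: str, df: Any, interval: int = 5
) -> str:
    """Submit an upload without blocking, then wait for the job explicitly."""
    # With wait_to_complete=False the call returns the upload job ID.
    job_id = client.upload_dataframe(domain, dataset, df, wait_to_complete=False)

    # Other work could happen here before blocking on the job outcome.

    # Documented to return None on success and raise JobFailedException on failure.
    client.wait_for_job_outcome(job_id, interval=interval)
    return job_id
```

Keeping the job ID lets the caller log it or check progress later with fetch_job_progress.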
- wait_for_job_outcome(_id: str, interval: int = 1)
Makes periodic requests to the API to wait for the outcome of a specific job.
- Parameters:
_id (str) – The ID of the job to wait for the outcome of.
interval (int, optional) – The number of seconds to sleep between requests to the API. Defaults to 1.
- Returns:
None if the job is successful.
- Raises:
rapid.exceptions.JobFailedException – If the job fails.
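For callers who want an upper bound on waiting time, a hand-rolled polling loop over fetch_job_progress is one option. The sketch below is not how wait_for_job_outcome is implemented. The helper poll_job, the "status" key and the "SUCCESS" / "FAILED" values are all assumptions about the response shape; check the linked API documentation for the actual structure.

```python
import time
from typing import Any, Callable, Dict


def poll_job(
    client: Any,
    job_id: str,
    interval: float = 1,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a job until it finishes or `max_attempts` polls have been made."""
    for _ in range(max_attempts):
        progress = client.fetch_job_progress(job_id)
        status = progress.get("status")  # assumed key, not confirmed by this page
        if status == "SUCCESS":  # assumed value
            return progress
        if status == "FAILED":  # assumed value
            raise RuntimeError(f"Job {job_id} failed: {progress}")
        sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish after {max_attempts} polls")
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays.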