DatasetApi¶
- class DatasetApi[source]¶
Bases: supervisely.api.module_api.UpdateableModule, supervisely.api.module_api.RemoveableModuleApi
API for working with Dataset. DatasetApi object is immutable.
- Parameters
- api : Api
API connection to the server.
- Usage example
import os

from dotenv import load_dotenv

import supervisely as sly

# Load secrets and create API object from .env file (recommended)
# Learn more here: https://developer.supervisely.com/getting-started/basics-of-authentication
if sly.is_development():
    load_dotenv(os.path.expanduser("~/supervisely.env"))
api = sly.Api.from_env()

# Pass values into the API constructor (optional, not recommended)
# api = sly.Api(server_address="https://app.supervise.ly", token="4r47N...xaTatb")

project_id = 1951
ds = api.dataset.get_list(project_id)
Methods
Copies given Dataset in destination Project by ID.
Copy given Datasets to the destination Project by IDs.
Create Dataset with given name in the given Project.
Checks if an entity with the given parent_id and name exists.
Generates a free name for an entity with the given parent_id and name.
Get Dataset information by ID.
Return Dataset information by name or None if Dataset does not exist.
Returns a list of datasets in the given project, or a list of nested datasets in the dataset with the specified parent_id.
List all available datasets from all available teams for the user that match the specified filtering criteria.
Get list of all or limited quantity entities from the Supervisely server.
This generator function retrieves a list of all or a limited quantity of entities from the Supervisely server, yielding batches of entities as they are retrieved.
Checks if Dataset with given name already exists in the Project, if not creates Dataset with the given name.
Returns a tree of all datasets in the project as a dictionary, where the keys are the DatasetInfo objects and the values are dictionaries containing the children of the dataset.
NamedTuple DatasetInfo information about Dataset.
NamedTuple name - DatasetInfo.
Moves given Dataset in destination Project by ID.
Moves given Datasets to the destination Project by IDs.
Moves dataset with specified ID to the dataset with specified destination ID.
Remove an entity with the specified ID from the Supervisely server.
Remove entities with given IDs from the Supervisely server.
!!! WARNING !!! Be careful, this method deletes data from the database, recovery is not possible.
Yields tuples of (path, dataset) for all datasets in the project.
Attributes
MAX_WAIT_ATTEMPTS
Maximum number of attempts that will be made to wait for a certain condition to be met.
WAIT_ATTEMPT_TIMEOUT_SEC
Number of seconds for intervals between attempts.
- InfoType¶
alias of supervisely.api.module_api.DatasetInfo
- copy(dst_project_id, id, new_name=None, change_name_if_conflict=False, with_annotations=False)[source]¶
Copies given Dataset in destination Project by ID.
- Parameters
- dst_project_id : int
Destination Project ID in Supervisely.
- id : int
ID of copied Dataset.
- new_name : str, optional
New Dataset name.
- change_name_if_conflict : bool, optional
Checks if given name already exists and adds suffix to the end of the name.
- with_annotations : bool, optional
If True copies Dataset with annotations, otherwise copies just items from Dataset without annotations.
- Returns
Information about Dataset. See
info_sequence
- Return type
DatasetInfo
- Usage example
import os

import supervisely as sly

os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

dst_proj_id = 1982
ds = api.dataset.get_list(dst_proj_id)
print(len(ds))
# 0

new_ds = api.dataset.copy(dst_proj_id, id=2540, new_name="banana", with_annotations=True)
ds = api.dataset.get_list(dst_proj_id)
print(len(ds))
# 1
- copy_batch(dst_project_id, ids, new_names=None, change_name_if_conflict=False, with_annotations=False)[source]¶
Copy given Datasets to the destination Project by IDs.
- Parameters
- dst_project_id : int
Destination Project ID in Supervisely.
- ids : List[int]
IDs of copied Datasets.
- new_names : List[str], optional
New Datasets names.
- change_name_if_conflict : bool, optional
Checks if given name already exists and adds suffix to the end of the name.
- with_annotations : bool, optional
If True copies Datasets with annotations, otherwise copies just items from Datasets without annotations.
- Raises
RuntimeError, if the "ids" and "new_names" lists cannot be matched: len(ids) != len(new_names)
- Returns
Information about Datasets. See
info_sequence
- Return type
List[DatasetInfo]
- Usage example
import os

import supervisely as sly

os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

dst_proj_id = 1980
ds = api.dataset.get_list(dst_proj_id)
print(len(ds))
# 0

ds_ids = [2532, 2557]
ds_names = ["lemon_test", "kiwi_test"]

copied_datasets = api.dataset.copy_batch(dst_proj_id, ids=ds_ids, new_names=ds_names, with_annotations=True)
ds = api.dataset.get_list(dst_proj_id)
print(len(ds))
# 2
- create(project_id, name, description='', change_name_if_conflict=False, parent_id=None)[source]¶
Create Dataset with given name in the given Project.
- Parameters
- project_id : int
Project ID in Supervisely where Dataset will be created.
- name : str
Dataset Name.
- description : str, optional
Dataset description.
- change_name_if_conflict : bool, optional
Checks if given name already exists and adds suffix to the end of the name.
- parent_id : Optional[int]
Parent Dataset ID. If set to None, then the Dataset will be created at the top level of the Project, otherwise the Dataset will be created in the specified Dataset.
- Returns
Information about Dataset. See
info_sequence
- Return type
DatasetInfo
- Usage example
import os

import supervisely as sly

project_id = 116482

os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

ds_info = api.dataset.get_list(project_id)
print(len(ds_info))
# 1

new_ds = api.dataset.create(project_id, 'new_ds')
new_ds_info = api.dataset.get_list(project_id)
print(len(new_ds_info))
# 2
- exists(parent_id, name)¶
Checks if an entity with the given parent_id and name exists
- Parameters
- Returns
Returns True if entity exists, and False if not
- Return type
- Usage example
import os

import supervisely as sly

# You can connect to API directly
address = 'https://app.supervise.ly/'
token = 'Your Supervisely API Token'
api = sly.Api(address, token)

# Or you can use API from environment
os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

name = "IMG_0315.jpeg"
dataset_id = 55832
exists = api.image.exists(dataset_id, name)
print(exists)
# True
- get_free_name(parent_id, name)¶
Generates a free name for an entity with the given parent_id and name. Adds an increasing suffix to original name until a unique name is found.
- Parameters
- Returns
Returns free name.
- Return type
- Usage example
import os

import supervisely as sly

# You can connect to API directly
address = 'https://app.supervise.ly/'
token = 'Your Supervisely API Token'
api = sly.Api(address, token)

# Or you can use API from environment
os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

name = "IMG_0315.jpeg"
dataset_id = 55832
free_name = api.image.get_free_name(dataset_id, name)
print(free_name)
# IMG_0315_001.jpeg
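The suffixing rule can be illustrated without a server. The sketch below is a hypothetical local re-implementation of the naming logic only; the exact suffix format used server-side may differ:

```python
import os


def free_name(existing, name):
    # Illustrative only: append an increasing numeric suffix ("_001", "_002", ...)
    # before the extension until the candidate name is not taken.
    if name not in existing:
        return name
    stem, ext = os.path.splitext(name)
    i = 1
    while True:
        candidate = f"{stem}_{i:03d}{ext}"
        if candidate not in existing:
            return candidate
        i += 1


print(free_name({"IMG_0315.jpeg"}, "IMG_0315.jpeg"))
# IMG_0315_001.jpeg
```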
- get_info_by_id(id, raise_error=False)[source]¶
Get Dataset information by ID.
- Parameters
- id : int
Dataset ID in Supervisely.
- Returns
Information about Dataset. See
info_sequence
- Return type
DatasetInfo
- Usage example
import os

import supervisely as sly

dataset_id = 384126

os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

ds_info = api.dataset.get_info_by_id(dataset_id)
- get_info_by_name(project_id, name, fields=None, parent_id=None)[source]¶
Return Dataset information by name or None if Dataset does not exist. If parent_id is not None, the search will be performed in the specified Dataset. Otherwise the search will be performed at the top level of the Project.
- Parameters
- Returns
Information about Dataset. See
info_sequence
- Return type
Union[DatasetInfo, None]
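Because this method returns None rather than raising when nothing matches, callers should guard the result. A minimal local sketch of that lookup contract, with a plain namedtuple standing in for the real DatasetInfo:

```python
from collections import namedtuple

# Stand-in for supervisely's DatasetInfo (fields abbreviated for illustration).
DatasetInfo = namedtuple("DatasetInfo", ["id", "name", "project_id"])

datasets = [
    DatasetInfo(2532, "lemons", 1951),
    DatasetInfo(2557, "kiwi", 1951),
]


def get_info_by_name(infos, name):
    # Mirrors the API contract: first match by name, or None if absent.
    return next((info for info in infos if info.name == name), None)


print(get_info_by_name(datasets, "lemons").id)
# 2532
print(get_info_by_name(datasets, "mango"))
# None
```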
- get_list(project_id, filters=None, recursive=False, parent_id=None)[source]¶
Returns a list of datasets in the given project, or a list of nested datasets in the dataset with the specified parent_id. To get a list of all datasets, including nested ones, set the recursive parameter to True. Otherwise, the method returns only top-level datasets.
- Parameters
- project_id : int
Project ID in which the Datasets are located.
- filters : List[dict], optional
List of params to sort output Datasets.
- parent_id : Optional[int]
Parent Dataset ID. If set to None, the search will be performed at the top level of the Project, otherwise the search will be performed in the specified Dataset.
- recursive : bool, optional
If True, returns all Datasets from the given Project including nested Datasets.
- Returns
List of all Datasets with information for the given Project. See
info_sequence
- Return type
List[DatasetInfo]
- Usage example
import os

import supervisely as sly

project_id = 1951

os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

ds = api.dataset.get_list(project_id)
print(ds)
# Output: [
#     DatasetInfo(id=2532,
#                 name="lemons",
#                 description="",
#                 size="861069",
#                 project_id=1951,
#                 images_count=6,
#                 items_count=6,
#                 created_at="2021-03-02T10:04:33.973Z",
#                 updated_at="2021-03-10T09:31:50.341Z",
#                 reference_image_url="http://app.supervise.ly/z6ut6j8bnaz1vj8aebbgs4-public/images/original/...jpg"),
#     DatasetInfo(id=2557,
#                 name="kiwi",
#                 description="",
#                 size="861069",
#                 project_id=1951,
#                 images_count=6,
#                 items_count=6,
#                 created_at="2021-03-10T09:31:33.701Z",
#                 updated_at="2021-03-10T09:31:44.196Z",
#                 reference_image_url="http://app.supervise.ly/h5un6l2bnaz1vj8a9qgms4-public/images/original/...jpg")
# ]
- get_list_all(filters=None, sort=None, sort_order=None, per_page=None, page='all')[source]¶
List all available datasets from all available teams for the user that match the specified filtering criteria.
- Parameters
- filters : List[Dict[str, str]], optional
List of parameters for filtering the available Datasets. Every dict must contain the keys:
- 'field': takes values 'id', 'projectId', 'workspaceId', 'groupId', 'createdAt', 'updatedAt'
- 'operator': takes values '=', 'eq', '!=', 'not', 'in', '!in', '>', 'gt', '>=', 'gte', '<', 'lt', '<=', 'lte'
- 'value': takes values according to the meaning of 'field', or null
- sort : str, optional
Specifies by which parameter to sort the project list. Takes values ‘id’, ‘name’, ‘size’, ‘createdAt’, ‘updatedAt’
- sort_order : str, optional
Determines the sort direction of the returned list.
- per_page : int, optional
Number of first items found to be returned. ‘None’ will return the first page with a default size of 20000 datasets.
- page : Union[int, Literal["all"]], optional
Page number, used to retrieve the following items if the number of them found is more than per_page. The default value is ‘all’, which retrieves all available datasets. ‘None’ will return the first page with datasets, the amount of which is set in param ‘per_page’.
- Returns
Search response information and ‘DatasetInfo’ of all datasets that are searched by a given criterion.
- Return type
- Usage example
import os

import supervisely as sly

os.environ['SERVER_ADDRESS'] = 'https://app.supervisely.com'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

filter_1 = {
    "field": "updatedAt",
    "operator": "<",
    "value": "2023-12-03T14:53:00.952Z"
}
filter_2 = {
    "field": "updatedAt",
    "operator": ">",
    "value": "2023-04-03T14:53:00.952Z"
}
filters = [filter_1, filter_2]
datasets = api.dataset.get_list_all(filters)
print(datasets)
# Output:
# {
#     "total": 2,
#     "perPage": 20000,
#     "pagesCount": 1,
#     "entities": [
#         DatasetInfo(id=16,
#                     name='ds1',
#                     description=None,
#                     size='861069',
#                     project_id=22,
#                     images_count=None,
#                     items_count=None,
#                     created_at='2020-04-03T13:43:24.000Z',
#                     updated_at='2020-04-03T14:53:00.952Z',
#                     reference_image_url=None,
#                     team_id=2,
#                     workspace_id=2),
#         DatasetInfo(id=17,
#                     name='ds1',
#                     description=None,
#                     size='1177212',
#                     project_id=23,
#                     images_count=None,
#                     items_count=None,
#                     created_at='2020-04-03T13:43:24.000Z',
#                     updated_at='2020-04-03T14:53:00.952Z',
#                     reference_image_url=None,
#                     team_id=2,
#                     workspace_id=2)
#     ]
# }
- get_list_all_pages(method, data, progress_cb=None, convert_json_info_cb=None, limit=None, return_first_response=False)¶
Get list of all or limited quantity entities from the Supervisely server.
- Parameters
- method : str
Request method name
- data : dict
Dictionary with request body info
- progress_cb : Progress, optional
Function for tracking download progress.
- convert_json_info_cb : Callable, optional
Function for converting JSON info.
- limit : int, optional
Number of entities to retrieve.
- return_first_response : bool, optional
If True, returns only the first response.
- get_list_all_pages_generator(method, data, progress_cb=None, convert_json_info_cb=None, limit=None, return_first_response=False)¶
This generator function retrieves a list of all or a limited quantity of entities from the Supervisely server, yielding batches of entities as they are retrieved.
- Parameters
- method : str
Request method name
- data : dict
Dictionary with request body info
- progress_cb : Progress, optional
Function for tracking download progress.
- convert_json_info_cb : Callable, optional
Function for converting JSON info.
- limit : int, optional
Number of entities to retrieve.
- return_first_response : bool, optional
If True, returns only the first response.
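The paging behavior can be sketched locally. Here fetch_page is a hypothetical stand-in for a server request; only the batching and limit handling mirror the documented behavior:

```python
def paged(fetch_page, per_page, limit=None):
    """Yield batches from a page-based fetch function until exhausted.

    fetch_page(page, per_page) -> list of entities (empty when past the end).
    """
    fetched = 0
    page = 1
    while True:
        batch = fetch_page(page, per_page)
        if not batch:
            return
        if limit is not None:
            # Trim the last batch so no more than `limit` entities are yielded.
            batch = batch[: limit - fetched]
        if batch:
            yield batch
        fetched += len(batch)
        if limit is not None and fetched >= limit:
            return
        page += 1


# Fake "server" holding 7 entities, served 3 per page.
entities = list(range(7))
fake_fetch = lambda page, per_page: entities[(page - 1) * per_page : page * per_page]

print(list(paged(fake_fetch, per_page=3)))
# [[0, 1, 2], [3, 4, 5], [6]]
print(list(paged(fake_fetch, per_page=3, limit=4)))
# [[0, 1, 2], [3]]
```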
- get_or_create(project_id, name, description='', parent_id=None)[source]¶
Checks if a Dataset with the given name already exists in the Project; if not, creates a Dataset with that name. If parent_id is specified, the search will be performed in the specified Dataset, otherwise the search will be performed at the top level of the Project.
- Parameters
- project_id : int
Project ID in Supervisely.
- name : str
Dataset name.
- description : str, optional
Dataset description.
- parent_id : Union[int, None]
Parent Dataset ID. If set to None, then the Dataset will be created at the top level of the Project, otherwise the Dataset will be created in a specified Dataset.
- Returns
Information about Dataset. See
info_sequence
- Return type
DatasetInfo
- Usage example
import os

import supervisely as sly

project_id = 116482

os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

ds_info = api.dataset.get_list(project_id)
print(len(ds_info))
# 1

api.dataset.get_or_create(project_id, 'ds1')
ds_info = api.dataset.get_list(project_id)
print(len(ds_info))
# 1

api.dataset.get_or_create(project_id, 'new_ds')
ds_info = api.dataset.get_list(project_id)
print(len(ds_info))
# 2
- get_tree(project_id)[source]¶
Returns a tree of all datasets in the project as a dictionary, where the keys are the DatasetInfo objects and the values are dictionaries containing the children of the dataset. Recommended to use with the dataset_tree method to iterate over the tree.
- Parameters
- project_id : int
Project ID for which the tree is built.
- Returns
Dictionary of datasets and their children.
- Return type
Dict[DatasetInfo, Dict]
- Usage example
import supervisely as sly

api = sly.Api.from_env()

project_id = 123
dataset_tree = api.dataset.get_tree(project_id)
print(dataset_tree)
# Output:
# {
#     DatasetInfo(id=2532, name="lemons", description="", ...): {
#         DatasetInfo(id=2557, name="kiwi", description="", ...): {}
#     }
# }
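The nested dictionary returned by get_tree can be walked with a short recursive generator. This is an illustrative local sketch (plain strings stand in for the DatasetInfo keys), producing the same (path, dataset) pairs that the tree method yields:

```python
def walk(tree, parents=None):
    # Depth-first traversal of the {info: {child_info: {...}}} structure.
    parents = parents or []
    for info, children in tree.items():
        yield parents, info
        yield from walk(children, parents + [info])


# Plain strings stand in for DatasetInfo objects.
dataset_tree = {"lemons": {"kiwi": {}}, "cucumber": {}}

for path, name in walk(dataset_tree):
    print(path, name)
# [] lemons
# ['lemons'] kiwi
# [] cucumber
```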
- static info_sequence()[source]¶
NamedTuple DatasetInfo information about Dataset.
- Example
DatasetInfo(id=452984,
            name='ds0',
            description='',
            size='3997776',
            project_id=118909,
            images_count=11,
            items_count=11,
            created_at='2021-03-03T15:54:08.802Z',
            updated_at='2021-03-16T09:31:37.063Z',
            reference_image_url='https://app.supervise.ly/h5un6l2bnaz1vj8a9qgms4-public/images/original/K/q/jf/...png',
            team_id=1,
            workspace_id=2)
- move(dst_project_id, id, new_name=None, change_name_if_conflict=False, with_annotations=False)[source]¶
Moves given Dataset in destination Project by ID.
- Parameters
- dst_project_id : int
Destination Project ID in Supervisely.
- id : int
ID of moved Dataset.
- new_name : str, optional
New Dataset name.
- change_name_if_conflict : bool, optional
Checks if given name already exists and adds suffix to the end of the name.
- with_annotations : bool, optional
If True moves Dataset with annotations, otherwise moves just items from Dataset without annotations.
- Returns
Information about Dataset. See
info_sequence
- Return type
DatasetInfo
- Usage example
import os

import supervisely as sly

os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

dst_proj_id = 1985
ds = api.dataset.get_list(dst_proj_id)
print(len(ds))
# 0

new_ds = api.dataset.move(dst_proj_id, id=2550, new_name="cucumber", with_annotations=True)
ds = api.dataset.get_list(dst_proj_id)
print(len(ds))
# 1
- move_batch(dst_project_id, ids, new_names=None, change_name_if_conflict=False, with_annotations=False)[source]¶
Moves given Datasets to the destination Project by IDs.
- Parameters
- dst_project_id : int
Destination Project ID in Supervisely.
- ids : List[int]
IDs of moved Datasets.
- new_names : List[str], optional
New Datasets names.
- change_name_if_conflict : bool, optional
Checks if given name already exists and adds suffix to the end of the name.
- with_annotations : bool, optional
If True moves Datasets with annotations, otherwise moves just items from Datasets without annotations.
- Raises
RuntimeError, if the "ids" and "new_names" lists cannot be matched: len(ids) != len(new_names)
- Returns
Information about Datasets. See
info_sequence
- Return type
List[DatasetInfo]
- Usage example
import os

import supervisely as sly

os.environ['SERVER_ADDRESS'] = 'https://app.supervise.ly'
os.environ['API_TOKEN'] = 'Your Supervisely API Token'
api = sly.Api.from_env()

dst_proj_id = 1978
ds = api.dataset.get_list(dst_proj_id)
print(len(ds))
# 0

ds_ids = [2545, 2560]
ds_names = ["banana_test", "mango_test"]

moved_datasets = api.dataset.move_batch(dst_proj_id, ids=ds_ids, new_names=ds_names, with_annotations=True)
ds = api.dataset.get_list(dst_proj_id)
print(len(ds))
# 2
- move_to_dataset(dataset_id, destination_dataset_id)[source]¶
Moves dataset with specified ID to the dataset with specified destination ID.
- Parameters
- Usage example
import supervisely as sly

api = sly.Api.from_env()

dataset_id = 123
destination_dataset_id = 456
api.dataset.move_to_dataset(dataset_id, destination_dataset_id)
- Return type
- remove(id)¶
Remove an entity with the specified ID from the Supervisely server.
- Parameters
- id : int
Entity ID in Supervisely
- remove_batch(ids, progress_cb=None)¶
Remove entities with given IDs from the Supervisely server.
- Parameters
- ids : List[int]
IDs of entities in Supervisely.
- progress_cb : Callable
Function for tracking removal progress.
- remove_permanently(ids, batch_size=50, progress_cb=None)[source]¶
!!! WARNING !!! Be careful, this method deletes data from the database, recovery is not possible.
Permanently deletes datasets with the given IDs from the Supervisely server. All dataset IDs must belong to the same team, so sort the IDs accordingly before calling this method.
- Parameters
- ids : Union[int, List]
IDs of datasets in Supervisely.
- batch_size : int, optional
The number of entities deleted by a single API call. The value must be in the range 1-50 inclusive; values outside this range are automatically clamped to the boundary values.
- progress_cb : Callable, optional
Function for tracking deletion progress.
- Returns
A list of response content in JSON format for each API call.
- Return type
List[dict]
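The documented batch_size clamping and ID batching can be sketched locally (illustrative only; the real method also issues the delete API calls):

```python
def batches(ids, batch_size=50):
    # Clamp batch_size to the documented 1-50 range, then split the IDs
    # into consecutive chunks, one chunk per API call.
    batch_size = max(1, min(50, batch_size))
    return [ids[i : i + batch_size] for i in range(0, len(ids), batch_size)]


print(batches([1, 2, 3, 4, 5], batch_size=2))
# [[1, 2], [3, 4], [5]]
print(batches([1, 2, 3], batch_size=999))
# [[1, 2, 3]]
```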
- tree(project_id)[source]¶
Yields tuples of (path, dataset) for all datasets in the project. The path of a dataset is the list of its parents, e.g. ["ds1", "ds2", "ds3"]. For root datasets, the path is an empty list.
- Parameters
- project_id : int
Project ID in which the Dataset is located.
- Returns
Generator of tuples of (path, dataset).
- Return type
Generator[Tuple[List[str], DatasetInfo], None, None]
- Usage example
from typing import List

import supervisely as sly

api = sly.Api.from_env()

project_id = 123
for parents, dataset in api.dataset.tree(project_id):
    parents: List[str]
    dataset: sly.DatasetInfo
    print(parents, dataset.name)

# Output:
# [] ds1
# ["ds1"] ds2
# ["ds1", "ds2"] ds3
- update(id, name=None, description=None)¶
Updates the name and/or the description of the Dataset with the given ID.