fs

Classes

`BaseRestrictedUnpickler`(file, *[, ...])	Base unpickler that enforces an allowlist of modules and classes.

Functions

`archive_directory`(dir_, tar_path[, split, ...])	Create tar archive from directory and optionally split it into parts of specified size.
`change_directory_at_index`(path, dir_name, ...)	Change directory name in path by index.
`clean_dir`(dir_[, ignore_errors])	Recursively delete a directory tree, but save root directory.
`copy_dir_recursively`(src_dir, dst_dir[, ...])
`copy_file`(src, dst)	Copy file from one path to another, if destination directory doesn't exist it will be created.
`copy_file_async`(src, dst[, progress_cb, ...])	Asynchronously copy file from one path to another, if destination directory doesn't exist it will be created.
`decode_uint32_le`(data)	Decode little-endian uint32 bytes into a list of integers.
`dir_empty`(dir)	Check whether directory is empty or not.
`dir_exists`(dir)	Check whether directory exists or not.
`dirs_filter`(input_path, check_function)	Generator that yields paths to directories that meet the requirements of the check_function.
`dirs_with_marker`(input_path, markers[, ...])	Generator that yields paths to directories that contain marker files.
`download`(url, save_path[, cache, progress, ...])	Load image from url to host by target path.
`encode_uint32_le`(values)	Encode a sequence of non-negative integers as little-endian uint32 bytes.
`ensure_base_path`(path)	Recursively create parent directory for target path.
`file_exists`(path)	Check whether file exists or not.
`get_directory_size`(dir_path)	Get the size of a directory.
`get_file_ext`(path)	Extracts file extension from a given path.
`get_file_hash`(path)	Get hash from target file.
`get_file_hash_async`(path)	Get hash from target file asynchronously.
`get_file_hash_chunked`(path[, chunk_size])	Get hash from target file by reading it in chunks.
`get_file_hash_chunked_async`(path[, chunk_size])	Asynchronously get hash from target file by reading it in chunks.
`get_file_name`(path)	Extracts file name from a given path.
`get_file_name_with_ext`(path)	Extracts file name with ext from a given path.
`get_file_offsets_batch_generator`(archive_path)	Extracts offset information for files from TAR archives and returns a generator that yields the information in batches.
`get_file_size`(path)	Get the size of a file.
`get_subdirs`(dir_path[, recursive])	Get list containing the names of the directories in the given directory.
`get_subdirs_tree`(dir_path)	Returns a dictionary representing the directory tree.
`global_to_relative`(global_path, base_dir)	Converts global path to relative path.
`hardlink_or_copy_file`(src, dst)	Creates a hard link pointing to src named dst.
`hardlink_or_copy_tree`(src, dst)	Recursively creates hard links under dst that point to the files in src.
`is_archive`(file_path)	Checks if the file is an archive by its mimetype using list of the most common archive mimetypes.
`is_on_agent`(remote_path)	Check if remote_path is on an agent (i.e. starts with 'agent://<agent-id>/').
`list_dir_recursively`(dir[, include_subdirs, ...])	Recursively walks through directory and returns list with all file paths, and optionally subdirectory paths.
`list_files`(dir[, valid_extensions, ...])	Returns list with file paths presented in given directory.
`list_files_recursively`(dir[, ...])	Recursively walks through directory and returns list with all file paths.
`list_files_recursively_async`(dir_path[, ...])	Recursively list files in the directory asynchronously.
`log_tree`(dir_path, logger[, level])	Gets the tree for the target directory and writes it to the log.
`mkdir`(dir[, remove_content_if_exists])	Creates a leaf directory and all intermediate ones.
`parse_agent_id_and_path`(remote_path)	Return agent id and path in agent folder from remote_path.
`remove_dir`(dir_)	Recursively delete a directory tree.
`remove_junk_from_dir`(dir)	Cleans the given directory from junk files and dirs (e.g. .DS_Store, __MACOSX, Thumbs.db, etc.).
`save_blob_offsets_pkl`(blob_file_path, output_dir)	Processes blob file locally and creates a pickle file with offset information.
`silent_remove`(file_path)	Remove file which may not exist.
`str_is_url`(string)	Check if string is a valid URL.
`string_to_byte_size`(string)	Returns integer representation of byte size from string representation.
`subdirs_tree`(dir_path[, ignore, ignore_content])	Generator that yields directories in the directory tree, starting from the level below the root directory and then going down the tree.
`touch`(path)	Sets access and modification times for a file.
`touch_async`(path)	Sets access and modification times for a file asynchronously.
`tree`(dir_path)	Get tree for target directory.
`unpack_archive`(archive_path, target_dir[, ...])	Unpacks archive to the target directory, removes junk files and directories.
`unpack_archive_async`(archive_path, target_dir)	Unpacks archive to the target directory, removes junk files and directories.

Description

File system utilities for Supervisely.

class BaseRestrictedUnpickler(file, *, fix_imports=True, encoding='ASCII', errors='strict', buffers=())[source]¶

Bases: Unpickler

Base unpickler that enforces an allowlist of modules and classes.

Any class not covered by the allowlist raises pickle.UnpicklingError instead of being imported and instantiated.

Subclasses should configure the allowlist via two class-level attributes:

_ALLOWED — exact {module: {class_name, ...}} whitelist. Takes priority over _ALLOWED_MODULE_PREFIXES.
_ALLOWED_MODULE_PREFIXES — tuple of module-name prefixes. Every class whose module starts with one of these prefixes is allowed.

Both attributes can be combined: exact entries in _ALLOWED are checked first; if no match is found the prefix list is tried next.
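
The lookup order described above can be sketched with the standard library alone. This is a minimal illustration, not the Supervisely implementation: `AllowlistUnpickler`, `OrderedDictOnlyUnpickler` and `restricted_loads` are hypothetical names, and only the two attribute names and the `pickle.UnpicklingError` behaviour are taken from the description.

```python
import io
import pickle
from collections import OrderedDict


class AllowlistUnpickler(pickle.Unpickler):
    """Hypothetical stand-in for the base class described above."""

    # Exact {module: {class_name, ...}} entries, checked first.
    _ALLOWED = {}
    # Module-name prefixes, tried only when no exact entry matches.
    _ALLOWED_MODULE_PREFIXES = ()

    def find_class(self, module, name):
        if name in self._ALLOWED.get(module, ()):
            return super().find_class(module, name)
        if any(module.startswith(p) for p in self._ALLOWED_MODULE_PREFIXES):
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"{module}.{name} is not allowed")


class OrderedDictOnlyUnpickler(AllowlistUnpickler):
    # Subclasses only configure the allowlist.
    _ALLOWED = {"collections": {"OrderedDict"}}


def restricted_loads(data, unpickler_cls=OrderedDictOnlyUnpickler):
    return unpickler_cls(io.BytesIO(data)).load()


# An allowed class round-trips as usual.
allowed = restricted_loads(pickle.dumps(OrderedDict(a=1)))

# Anything else (here builtins.complex) is rejected before it is imported.
try:
    restricted_loads(pickle.dumps(complex(1, 2)))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Overriding `find_class` is the hook the Python documentation recommends for restricting which globals a pickle may load.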

archive_directory(dir_, tar_path, split=None, chunk_size_mb=50)[source]¶

Create tar archive from directory and optionally split it into parts of specified size. When the archive is split into parts, the file is read in chunks whose size you can adjust with chunk_size_mb. Be careful with this parameter, as it can affect the performance of the function. When splitting, if the split size is less than the chunk size, the chunk size will be reduced to fit the split size.

Parameters:

dir_ : str: Target directory path.
tar_path : str¶: Path for output tar archive.
split : Union[int, str]¶: Split archive into parts of specified size (in bytes) or size with suffix (e.g. ‘1Kb’ = 1024, ‘1Mb’ = 1024 * 1024). Default is None.
chunk_size_mb : int¶: Size of the chunk to read from the file. Default is 50Mb.

Returns:

None or list of archive parts if split is not None

Return type:

Union[None, List[str]]

Usage Example:

from supervisely.io.fs import archive_directory

# If split is not needed.
archive_directory('/home/admin/work/projects/examples', '/home/admin/work/examples.tar')

# If split is specified.
archive_parts_paths = archive_directory('/home/admin/work/projects/examples', '/home/admin/work/examples/archive.tar', split=1000000)
print(archive_parts_paths) # ['/home/admin/work/examples/archive.tar.001', '/home/admin/work/examples/archive.tar.002']
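
The relationship between the split size and the read chunk size can be illustrated with a small standalone sketch. It is an assumption about how such splitting is typically done, not the code behind `archive_directory`; `split_file` is a hypothetical helper, and the `.001`-style suffixes simply mirror the example output above.

```python
import os
import tempfile


def split_file(path, split_size, chunk_size=50 * 1024 * 1024):
    """Split `path` into numbered parts of at most `split_size` bytes."""
    # A chunk larger than a part would overshoot it, so clamp the chunk size.
    chunk_size = min(chunk_size, split_size)
    parts = []
    with open(path, "rb") as src:
        index = 1
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # source exhausted, no more parts
            part_path = f"{path}.{index:03d}"
            written = 0
            with open(part_path, "wb") as dst:
                while chunk:
                    dst.write(chunk)
                    written += len(chunk)
                    if written >= split_size:
                        break  # this part is full
                    # Never read more than the part can still hold.
                    chunk = src.read(min(chunk_size, split_size - written))
            parts.append(part_path)
            index += 1
    return parts


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "archive.tar")
    with open(archive, "wb") as f:
        f.write(b"x" * 2500)
    # The 4096-byte chunk is clamped to the 1000-byte split size.
    part_sizes = [os.path.getsize(p) for p in split_file(archive, 1000, chunk_size=4096)]
```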

change_directory_at_index(path, dir_name, dir_index)[source]¶

Change directory name in path by index. If you use counting from the end, keep in mind that if the path ends with a file, the file will be assigned to the last index.

Parameters:

path : str¶: The original path
dir_name : str¶: Directory name
dir_index : int¶: Index of the directory we want to change, negative values count from the end

Returns:

New path

Return type:

str

Raises:

IndexError – If the directory index is out of bounds for the given path

Usage Example:

import supervisely as sly

input_path = 'head/dir_1/file.txt'
new_path = sly.io.fs.change_directory_at_index(input_path, 'dir_2', -2)
print(new_path)
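
The indexing rule can be pictured as replacing one element in the list of path components. The sketch below is a simplified model under that assumption (forward-slash separators only); `replace_path_component` is a hypothetical helper, not the library function.

```python
def replace_path_component(path, new_name, index):
    """Replace the path component at `index` (negative counts from the end)."""
    parts = path.split("/")
    # A trailing file name is just the last component, so it occupies index -1.
    # An out-of-range index raises IndexError from the list assignment.
    parts[index] = new_name
    return "/".join(parts)


renamed = replace_path_component("head/dir_1/file.txt", "dir_2", -2)
```

In this model `-1` addresses `file.txt`, which is why `-2` is needed to reach `dir_1`.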

clean_dir(dir_, ignore_errors=True)[source]¶

Recursively delete a directory tree, but save root directory.

Parameters:

dir_ : str: Target directory path.
ignore_errors : bool: Ignore possible errors while removing directory content.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import clean_dir

clean_dir('/home/admin/work/projects/examples')
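
To make the difference from a plain recursive delete concrete, here is a standard-library sketch that empties a directory while keeping the directory itself. `empty_directory` is a hypothetical helper written for illustration; how `clean_dir` handles errors internally is an assumption.

```python
import os
import shutil
import tempfile


def empty_directory(dir_path, ignore_errors=True):
    """Delete everything inside dir_path but keep dir_path itself."""
    with os.scandir(dir_path) as entries:
        children = list(entries)
    for entry in children:
        try:
            if entry.is_dir(follow_symlinks=False):
                shutil.rmtree(entry.path)
            else:
                os.remove(entry.path)
        except OSError:
            if not ignore_errors:
                raise


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "ds1", "img"))
    with open(os.path.join(root, "meta.json"), "w") as f:
        f.write("{}")
    empty_directory(root)
    root_survived = os.path.isdir(root)
    remaining = os.listdir(root)
```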

copy_file(src, dst)[source]¶

Copy file from one path to another, if destination directory doesn’t exist it will be created.

Parameters:

src : str¶: Source file path.
dst : str¶: Destination file path.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import copy_file

copy_file('/home/admin/work/projects/example/1.png', '/home/admin/work/tests/2.png')

async copy_file_async(src, dst, progress_cb=None, progress_cb_type='size')[source]¶

Asynchronously copy file from one path to another, if destination directory doesn’t exist it will be created.

Parameters:

src : str¶: Source file path.
dst : str¶: Destination file path.
progress_cb : Union[tqdm, Callable], optional¶: Function for tracking copy progress.
progress_cb_type : Literal["number", "size"], optional¶: Type of progress callback. Can be “number” or “size”. Default is “size”.

Returns:

None

Return type:

None

Usage Example:

import supervisely as sly
from supervisely._utils import run_coroutine

coroutine = sly.fs.copy_file_async('/home/admin/work/projects/example/1.png', '/home/admin/work/tests/2.png')
run_coroutine(coroutine)

decode_uint32_le(data)[source]¶

Decode little-endian uint32 bytes into a list of integers.

Parameters:

data : bytes¶: Little-endian uint32 byte buffer (trailing bytes that do not form a full uint32 are ignored).

Returns:

List of decoded integers.

Return type:

List[int]

dir_empty(dir)[source]¶

Check whether directory is empty or not.

Parameters:

dir : str¶: Target directory path.

Returns:

True if directory is empty, False otherwise.

Return type:

bool

Usage Example:

from supervisely.io.fs import dir_empty

dir_empty('/home/admin/work/projects/examples') # False

dir_exists(dir)[source]¶

Check whether directory exists or not.

Parameters:

dir : str¶: Target directory path.

Returns:

True if directory exists, False otherwise.

Return type:

bool

Usage Example:

from supervisely.io.fs import dir_exists

dir_exists('/home/admin/work/projects/examples') # True
dir_exists('/home/admin/work/not_exist_dir') # False

dirs_filter(input_path, check_function)[source]¶

Generator that yields paths to directories that meet the requirements of the check_function.

Parameters:

input_path : str¶: path to the directory in which the search will be performed
check_function : Callable¶: function to check that directory meets the requirements and returns bool

Usage Example:

import os

import supervisely as sly

input_path = '/home/admin/work/projects/examples'

# Prepare the check function.
def check_function(directory) -> bool:
    images_dir = os.path.join(directory, "images")
    annotations_dir = os.path.join(directory, "annotations")
    return os.path.isdir(images_dir) and os.path.isdir(annotations_dir)

for directory in sly.fs.dirs_filter(input_path, check_function):
    # Now you can be sure that the directory meets the requirements.
    # Do something with it.
    print(directory)

dirs_with_marker(input_path, markers, check_function=None, ignore_case=False)[source]¶

Generator that yields paths to directories that contain marker files. If the check_function is specified, a marked directory is yielded only if the check_function returns True. The check_function must take a single argument, the path to the marked directory, and return True or False.

Parameters:

input_path : str¶: path to the directory in which the search will be performed
markers : Union[str, List[str]]¶: single marker or list of markers (e.g. ‘config.json’ or [‘config.json’, ‘config.yaml’])
check_function : Callable¶: function to check that directory meets the requirements and returns bool
ignore_case : bool¶: ignore case when searching for markers

Usage Example:

import os

import supervisely as sly

input_path = '/home/admin/work/projects/examples'

# You can pass a string if you have only one marker.
# markers = 'config.json'

# Or a list of strings if you have several markers.
# There's no need to pass one marker in different cases, you can use ignore_case=True for this.
markers = ['config.json', 'config.yaml']


# Check function is optional, if you don't need the directories to meet any requirements,
# you can omit it.

def check_function(dir_path):
    test_file_path = os.path.join(dir_path, 'test.txt')
    return os.path.exists(test_file_path)

for directory in sly.fs.dirs_with_marker(input_path, markers, check_function, ignore_case=True):
    # Now you can be sure that the directory contains the markers and meets the requirements.
    # Do something with it.
    print(directory)
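
The effect of ignore_case can be seen in a standalone sketch built on `os.walk`. `find_dirs_with_marker` is a hypothetical stand-in, and it assumes that a directory qualifies as soon as it contains any one of the markers, which the description above does not state explicitly.

```python
import os
import tempfile


def find_dirs_with_marker(root, markers, check_function=None, ignore_case=False):
    """Yield directories under root that contain at least one marker file."""
    if isinstance(markers, str):
        markers = [markers]
    wanted = {m.lower() if ignore_case else m for m in markers}
    for dir_path, _subdirs, files in os.walk(root):
        names = {f.lower() if ignore_case else f for f in files}
        if wanted & names and (check_function is None or check_function(dir_path)):
            yield dir_path


with tempfile.TemporaryDirectory() as root:
    for sub, file_name in [("a", "config.json"), ("b", "CONFIG.YAML"), ("c", "readme.txt")]:
        os.makedirs(os.path.join(root, sub))
        open(os.path.join(root, sub, file_name), "w").close()
    markers = ["config.json", "config.yaml"]
    # Case-sensitive search only matches the exact file name.
    exact = sorted(os.path.basename(d) for d in find_dirs_with_marker(root, markers))
    # Case-insensitive search also picks up CONFIG.YAML.
    relaxed = sorted(
        os.path.basename(d) for d in find_dirs_with_marker(root, markers, ignore_case=True)
    )
```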

download(url, save_path, cache=None, progress=None, headers=None, timeout=None)[source]¶

Download an image from a URL and save it on the host at the target path.

Parameters:

url : str: URL of the file to download.
save_path : str: The path where the file is saved.
cache : FileCache, optional: An instance of FileCache class that provides caching functionality for the downloaded content. If None, caching is disabled.
progress : Progress, optional: Function for tracking download progress.
headers : Dict, optional: A dictionary of HTTP headers to include in the request.
timeout : int, optional: The maximum number of seconds to wait for a response from the server. If the server does not respond within the timeout period, a TimeoutError is raised.

Returns:

Full path to downloaded image

Return type:

str

Usage Example:

from supervisely.io.fs import download

img_link = 'https://m.media-amazon.com/images/M/MV5BMTYwOTEwNjAzMl5BMl5BanBnXkFtZTcwODc5MTUwMw@@._V1_.jpg'
im_path = download(img_link, '/home/admin/work/projects/examples/avatar.jpeg')
print(im_path)
# Output:
# /home/admin/work/projects/examples/avatar.jpeg

# if you need to specify some headers
headers = {'User-Agent': 'Mozilla/5.0'}
im_path = download(img_link, '/home/admin/work/projects/examples/avatar.jpeg', headers=headers)
print(im_path)
# Output:
# /home/admin/work/projects/examples/avatar.jpeg

encode_uint32_le(values)[source]¶

Encode a sequence of non-negative integers as little-endian uint32 bytes.

Parameters:

values : Union[List[int], np.ndarray]¶: Sequence of integers in range [0, 2**32 - 1].

Returns:

Little-endian uint32 byte buffer (4 bytes per value), empty bytes for empty input.

Return type:

bytes
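
The byte layout used by encode_uint32_le and decode_uint32_le can be reproduced with the `struct` module. This is a sketch of the format only: `encode_u32_le` and `decode_u32_le` are hypothetical helpers, and raising `struct.error` for out-of-range values is the behaviour of `struct`, not necessarily of the library.

```python
import struct


def encode_u32_le(values):
    """Pack integers as consecutive little-endian uint32 values."""
    values = [int(v) for v in values]
    # "<" selects little-endian, "I" is an unsigned 32-bit integer.
    return struct.pack(f"<{len(values)}I", *values)


def decode_u32_le(data):
    """Unpack little-endian uint32 values, ignoring incomplete trailing bytes."""
    count = len(data) // 4
    return list(struct.unpack(f"<{count}I", data[: count * 4]))


encoded = encode_u32_le([1, 256, 2**32 - 1])
decoded = decode_u32_le(encoded + b"\xff")  # one stray trailing byte
```

Each value occupies exactly 4 bytes with the least significant byte first, so `1` becomes `01 00 00 00` and `256` becomes `00 01 00 00`.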

ensure_base_path(path)[source]¶

Recursively create parent directory for target path.

Parameters:

path : str¶: Target dir path.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import ensure_base_path

ensure_base_path('/home/admin/work/projects/example')

file_exists(path)[source]¶

Check whether file exists or not.

Parameters:

path : str: Target file path.

Returns:

True if file exists, False otherwise.

Return type:

bool

Usage Example:

from supervisely.io.fs import file_exists

file_exists('/home/admin/work/projects/examples/1.jpeg') # True
file_exists('/home/admin/work/projects/examples/not_exist_file.jpeg') # False

get_directory_size(dir_path)[source]¶

Get the size of a directory.

Parameters:

dir_path : str: Target directory path.

Returns:

Directory size in bytes

Return type:

int

Usage Example:

from supervisely.io.fs import get_directory_size

dir_size = get_directory_size('/home/admin/work/projects/examples') # 8574563

get_file_ext(path)[source]¶

Extracts file extension from a given path.

Parameters:

path : str¶: Path to file.

Returns:

File extension without name

Return type:

str

Usage Example:

import supervisely as sly

file_ext = sly.fs.get_file_ext("/home/admin/work/projects/lemons_annotated/ds1/img/IMG_0748.jpeg")
print(file_ext)
# Output: .jpeg

get_file_hash(path)[source]¶

Get hash from target file.

Parameters:

path : str¶: Target file path.

Returns:

File hash

Return type:

str

Usage Example:

from supervisely.io.fs import get_file_hash

hash = get_file_hash('/home/admin/work/projects/examples/1.jpeg') # rKLYA/p/P64dzidaQ/G7itxIz3ZCVnyUhEE9fSMGxU4=

async get_file_hash_async(path)[source]¶

Get hash from target file asynchronously.

Parameters:

path : str¶: Target file path.

Returns:

File hash

Return type:

str

Usage Example:

import supervisely as sly
from supervisely._utils import run_coroutine

coroutine = sly.fs.get_file_hash_async('/home/admin/work/projects/examples/1.jpeg')
hash = run_coroutine(coroutine)

get_file_hash_chunked(path, chunk_size=1048576)[source]¶

Get hash from target file by reading it in chunks.

Parameters:

path : str¶: Target file path.
chunk_size : int, optional¶: Number of bytes to read per iteration. Default is 1 MB.

Returns:

File hash as a base64 encoded string.

Return type:

str

Usage Example:

from supervisely.io.fs import get_file_hash_chunked

file_hash = get_file_hash_chunked('/home/admin/work/projects/examples/1.jpeg')
print(file_hash)  # Example output: rKLYA/p/P64dzidaQ/G7itxIz3ZCVnyUhEE9fSMGxU4=
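
Reading in chunks keeps memory use bounded for large files while producing the same digest as hashing the whole file at once. The sketch below assumes SHA-256, which matches the 44-character base64 output shown above but is not confirmed by this page; `sha256_base64_chunked` is a hypothetical helper.

```python
import base64
import hashlib
import os
import tempfile


def sha256_base64_chunked(path, chunk_size=1024 * 1024):
    """Hash a file chunk by chunk and return the digest as base64 text."""
    digest = hashlib.sha256()  # assumption: the library may use another algorithm
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "1.jpeg")
    payload = b"hello" * 1000
    with open(sample, "wb") as f:
        f.write(payload)
    chunked = sha256_base64_chunked(sample, chunk_size=7)
    whole = base64.b64encode(hashlib.sha256(payload).digest()).decode("utf-8")
```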

async get_file_hash_chunked_async(path, chunk_size=1048576)[source]¶

Asynchronously get hash from target file by reading it in chunks.

Parameters:

path : str¶: Target file path.
chunk_size : int, optional¶: Number of bytes to read per iteration. Default is 1 MB.

Returns:

File hash as a base64 encoded string.

Return type:

str

get_file_name(path)[source]¶

Extracts file name from a given path.

Parameters:

path : str¶: Path to file.

Returns:

File name without extension

Return type:

str

Usage Example:

import supervisely as sly

file_name = sly.fs.get_file_name("/home/admin/work/projects/lemons_annotated/ds1/img/IMG_0748.jpeg")

print(file_name)
# Output: IMG_0748

get_file_name_with_ext(path)[source]¶

Extracts file name with ext from a given path.

Parameters:

path : str¶: Path to file.

Returns:

File name with extension

Return type:

str

Usage Example:

import supervisely as sly

file_name_ext = sly.fs.get_file_name_with_ext("/home/admin/work/projects/lemons_annotated/ds1/img/IMG_0748.jpeg")
print(file_name_ext)
# Output: IMG_0748.jpeg

get_file_offsets_batch_generator(archive_path, team_file_id=None, filter_func=None, output_format='dicts', batch_size=10000)[source]¶

Extracts offset information for files from TAR archives and returns a generator that yields the information in batches.

team_file_id may be None if it’s not possible to obtain the ID at this moment. You can set the team_file_id later when uploading the file to Supervisely.

Parameters:

archive_path : str¶: Local path to the archive
team_file_id : Optional[int]¶: ID of file in Team Files. Default is None. team_file_id may be None if it’s not possible to obtain the ID at this moment. You can set the team_file_id later when uploading the file to Supervisely.
filter_func : Callable, optional¶: Function to filter files. The function should take a filename as input and return True if the file should be included.
output_format : Literal["dicts", "objects"]¶: Format of the output. Default is dicts. objects - returns a list of BlobImageInfo objects. dicts - returns a list of dictionaries.
batch_size : int, optional: Number of files in each yielded batch. Default is 10000.

Returns:

Generator yielding batches of file information in the specified format.

Return type:

Generator[Union[List[Dict], List[BlobImageInfo]], None, None]

Raises:

ValueError – If the archive type is not supported or contains compressed files

Usage Example:

import supervisely as sly

archive_path = '/home/admin/work/projects/examples.tar'
file_infos = sly.fs.get_file_offsets_batch_generator(archive_path)
for batch in file_infos:
    print(batch)

# Output:
# [
#     {
#         "title": "image1.jpg",
#         "teamFileId": None,
#         "sourceBlob": {
#             "offsetStart": 0,
#             "offsetEnd": 123456
#         }
#     },
#     {
#         "title": "image2.jpg",
#         "teamFileId": None,
#         "sourceBlob": {
#             "offsetStart": 123456,
#             "offsetEnd": 234567
#         }
#     }
# ]
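
How byte offsets can be read from an uncompressed TAR archive is sketched below with the standard `tarfile` module. This is an illustration under assumptions, not the library code: `tar_offsets_in_batches` is a hypothetical helper, the dictionaries are flattened compared with the output above, and the offsets are assumed to delimit each file's payload (`TarInfo.offset_data` up to `offset_data + size`).

```python
import io
import os
import tarfile
import tempfile


def tar_offsets_in_batches(archive_path, batch_size=10000):
    """Yield lists of {title, offsetStart, offsetEnd} for files in a plain tar."""
    batch = []
    # Mode "r:" accepts only uncompressed archives; offsets into a compressed
    # stream would be meaningless. tarfile raises ReadError in that case.
    with tarfile.open(archive_path, "r:") as tar:
        for member in tar:
            if not member.isfile():
                continue
            batch.append(
                {
                    "title": os.path.basename(member.name),
                    "offsetStart": member.offset_data,
                    "offsetEnd": member.offset_data + member.size,
                }
            )
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # final, possibly smaller batch


with tempfile.TemporaryDirectory() as tmp:
    archive_path = os.path.join(tmp, "examples.tar")
    files = {"image1.jpg": b"first", "image2.jpg": b"second!", "image3.jpg": b"3"}
    with tarfile.open(archive_path, "w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    batches = list(tar_offsets_in_batches(archive_path, batch_size=2))
    with open(archive_path, "rb") as f:
        raw = f.read()
    # Slicing the raw archive by the offsets recovers each file's bytes.
    payloads = {
        item["title"]: raw[item["offsetStart"]:item["offsetEnd"]]
        for batch in batches
        for item in batch
    }
```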

get_file_size(path)[source]¶

Get the size of a file.

Parameters:

path : str¶: File path.

Returns:

File size in bytes

Return type:

int

Usage Example:

from supervisely.io.fs import get_file_size

file_size = get_file_size('/home/admin/work/projects/examples/1.jpeg') # 161665

get_subdirs(dir_path, recursive=False)[source]¶

Get list containing the names of the directories in the given directory.

Parameters:

dir_path : str¶: Target directory path.
recursive : bool¶: If True, all found subdirectories will be included in the result list.

Returns:

List containing directories names.

Return type:

list

Usage Example:

from supervisely.io.fs import get_subdirs

subdirs = get_subdirs('/home/admin/work/projects/examples')
print(subdirs)
# Output: ['tests', 'users', 'ds1']

get_subdirs_tree(dir_path)[source]¶

Returns a dictionary representing the directory tree. It will have only directories and subdirectories (not files).

Parameters:

dir_path : str¶: Target directory path.

Returns:

Dictionary representing the directory tree.

Return type:

Dict[str, Union[str, Dict]]

Usage Example:

from supervisely.io.fs import get_subdirs_tree

tree = get_subdirs_tree('/home/admin/work/projects/examples')
print(tree)
# Output: {'examples': {'tests': {}, 'users': {}, 'ds1': {}}}

global_to_relative(global_path, base_dir)[source]¶

Converts global path to relative path.

Parameters:

global_path : str¶: Global path.
base_dir : str¶: Base directory path.

Returns:

Relative path.

Return type:

str

Usage Example:

from supervisely.io.fs import global_to_relative

relative_path = global_to_relative('/home/admin/work/projects/examples/1.jpeg', '/home/admin/work/projects')
print(relative_path)
# Output: examples/1.jpeg

hardlink_or_copy_file(src, dst)[source]¶

Creates a hard link pointing to src named dst. If the link cannot be created, the file will be copied.

Parameters:

src : str¶: Source file path.
dst : str¶: Destination file path.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import hardlink_or_copy_file

hardlink_or_copy_file('/home/admin/work/projects/example/1.png', '/home/admin/work/tests/link.txt')
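
The fallback behaviour can be sketched in a few lines of standard-library code. `link_or_copy` is a hypothetical helper, and catching `OSError` as the signal to copy is an assumption about when a link "cannot be created" (for example, across file systems).

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Try to hard-link dst to src; copy the file if linking fails."""
    try:
        os.link(src, dst)
    except OSError:
        # Hard links fail across devices or on file systems without support.
        shutil.copy2(src, dst)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "1.png")
    dst = os.path.join(tmp, "link.png")
    with open(src, "wb") as f:
        f.write(b"data")
    link_or_copy(src, dst)
    with open(dst, "rb") as f:
        linked_content = f.read()
```

Either way the destination ends up with the same content; a hard link just avoids duplicating the data on disk.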

hardlink_or_copy_tree(src, dst)[source]¶

Recursively creates hard links under dst that point to the files in src. If a link cannot be created, the file will be copied.

Parameters:

src : str¶: Source dir path.
dst : str¶: Destination dir path.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import hardlink_or_copy_tree

hardlink_or_copy_tree('/home/admin/work/projects/examples', '/home/admin/work/tests/links')

is_archive(file_path)[source]¶

Checks if the file is an archive by its mimetype using list of the most common archive mimetypes.

Parameters:

file_path : str: path to the local file

Returns:

True if the file is an archive, False otherwise

Return type:

bool

is_on_agent(remote_path)[source]¶

Check if remote_path is on an agent (i.e. starts with ‘agent://<agent-id>/’).

Parameters:

remote_path : str¶: path to check

Returns:

True if remote_path starts with ‘agent://<agent-id>/’ and False otherwise

Return type:

bool

list_dir_recursively(dir, include_subdirs=False, use_global_paths=False)[source]¶

Recursively walks through directory and returns list with all file paths, and optionally subdirectory paths.

Parameters:

dir : str: Path to directory.
include_subdirs : bool¶: If True, subdirectory paths will be included in the result list.
use_global_paths : bool¶: If True, absolute paths will be returned instead of relative ones.

Returns:

List containing file paths, and optionally subdirectory paths.

Return type:

List[str]

Usage Example:

import supervisely as sly

list_dir = sly.fs.list_dir_recursively("/home/admin/work/projects/lemons_annotated/")
print(list_dir)
# Output: ['meta.json', 'ds1/ann/IMG_0748.jpeg.json', 'ds1/ann/IMG_4451.jpeg.json', 'ds1/img/IMG_0748.jpeg', 'ds1/img/IMG_4451.jpeg']

list_files(dir, valid_extensions=None, filter_fn=None, ignore_valid_extensions_case=False)[source]¶

Returns list with file paths presented in given directory. Can be filtered by valid extensions and filter function. Also can be case insensitive for valid extensions.

Parameters:

dir : str: Target dir path.
valid_extensions : List[str]¶: List with valid file extensions.
filter_fn : Callable, optional¶: Function with a single argument. Argument is a file path. Function determines whether to keep a given file path. Must return True or False.
ignore_valid_extensions_case : bool¶: If True, validation of file extensions will be case insensitive.

Returns:

List with file paths

Return type:

List[str]

Usage Example:

import supervisely as sly

list_files = sly.fs.list_files("/home/admin/work/projects/lemons_annotated/ds1/img/")
print(list_files)
# Output: ['/home/admin/work/projects/lemons_annotated/ds1/img/IMG_0748.jpeg', '/home/admin/work/projects/lemons_annotated/ds1/img/IMG_4451.jpeg']

list_files_recursively(dir, valid_extensions=None, filter_fn=None, ignore_valid_extensions_case=False)[source]¶

Recursively walks through directory and returns list with all file paths. Can be filtered by valid extensions and filter function.

Parameters:

dir : str: Target dir path.
valid_extensions : List[str], optional¶: List with valid file extensions.
filter_fn : Callable, optional¶: Function with a single argument. Argument is a file path. Function determines whether to keep a given file path. Must return True or False.
ignore_valid_extensions_case : bool¶: If True, validation of file extensions will be case insensitive.

Returns:

List with file paths

Return type:

List[str]

Usage Example:

import supervisely as sly

list_files = sly.fs.list_files_recursively("/home/admin/work/projects/lemons_annotated/ds1/img/")
print(list_files)
# Output: ['/home/admin/work/projects/lemons_annotated/ds1/img/IMG_0748.jpeg', '/home/admin/work/projects/lemons_annotated/ds1/img/IMG_4451.jpeg']

async list_files_recursively_async(dir_path, valid_extensions=None, filter_fn=None, ignore_valid_extensions_case=False)[source]¶

Recursively list files in the directory asynchronously. Returns list with all file paths. Can be filtered by valid extensions and filter function.

Parameters:

dir_path : str¶: Target directory path.
valid_extensions : Optional[List[str]]¶: List of valid extensions. Default is None.
filter_fn : Optional[Callable[[str], bool]]¶: Filter function. Default is None.
ignore_valid_extensions_case : bool¶: Ignore case when checking valid extensions. Default is False.

Returns:

List of file paths

Return type:

List[str]

Usage Example:

import supervisely as sly
from supervisely._utils import run_coroutine

dir_path = '/home/admin/work/projects/examples'

coroutine = sly.fs.list_files_recursively_async(dir_path)
files = run_coroutine(coroutine)

log_tree(dir_path, logger, level='info')[source]¶

Gets the tree for the target directory and writes it to the log.

Parameters:

dir_path : str¶: Target directory path.
logger : logger¶: Logger to display data.
level : str, optional: Log level used for the message. Default is 'info'.

Returns:

None

Return type:

None

Usage Example:

import supervisely as sly
from supervisely.io.fs import log_tree

logger = sly.logger
log_tree('/home/admin/work/projects/examples', logger)

mkdir(dir, remove_content_if_exists=False)[source]¶

Creates a leaf directory and all intermediate ones.

Parameters:

dir : str: Target dir path.
remove_content_if_exists : bool: Remove directory content if it exists.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import mkdir

mkdir('/home/admin/work/projects/example')

parse_agent_id_and_path(remote_path)[source]¶

Return agent id and path in agent folder from remote_path.

Parameters:

remote_path : str¶: path to parse

Returns:

agent id and path in agent folder

Return type:

Tuple[int, str]

Raises:

ValueError – if remote_path doesn’t start with ‘agent://<agent-id>/’

Usage Example:

import os
from dotenv import load_dotenv

import supervisely as sly

# Load secrets and create API object from .env file (recommended)
# Learn more here: https://developer.supervisely.com/getting-started/basics-of-authentication
if sly.is_development():
    load_dotenv(os.path.expanduser("~/supervisely.env"))

api = sly.Api.from_env()

# Parse agent id and path in agent folder from remote_path
remote_path = "agent://1/agent_folder/subfolder/file.txt"
agent_id, path_in_agent_folder = sly.fs.parse_agent_id_and_path(remote_path)
print(agent_id)  # 1
print(path_in_agent_folder)  # /agent_folder/subfolder/file.txt
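
The 'agent://<agent-id>/' convention checked by is_on_agent and parsed here can be modelled with a regular expression. This is a sketch, not the library code: `looks_like_agent_path` and `split_agent_path` are hypothetical helpers, and a purely numeric agent id is assumed from the documented `Tuple[int, str]` return type.

```python
import re

# "agent://", a numeric id, then the path starting at the next slash.
_AGENT_PATH = re.compile(r"^agent://(\d+)(/.*)$")


def looks_like_agent_path(remote_path):
    return _AGENT_PATH.match(remote_path) is not None


def split_agent_path(remote_path):
    match = _AGENT_PATH.match(remote_path)
    if match is None:
        raise ValueError(f"Not an agent path: {remote_path!r}")
    return int(match.group(1)), match.group(2)


agent_id, path_in_agent_folder = split_agent_path("agent://1/agent_folder/subfolder/file.txt")
```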

remove_dir(dir_)[source]¶

Recursively delete a directory tree.

Parameters:

dir_ : str: Target directory path.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import remove_dir

remove_dir('/home/admin/work/projects/examples')

remove_junk_from_dir(dir)[source]¶

Cleans the given directory from junk files and dirs (e.g. .DS_Store, __MACOSX, Thumbs.db, etc.).

Parameters:

dir : str¶: Path to directory.

Returns:

List of global paths to removed files and dirs.

Return type:

List[str]

Usage Example:

import supervisely as sly

input_dir = "/home/admin/work/projects/lemons_annotated/"
sly.fs.remove_junk_from_dir(input_dir)

save_blob_offsets_pkl(blob_file_path, output_dir, team_file_id=None, filter_func=None, batch_size=10000, replace=False)[source]¶

Processes blob file locally and creates a pickle file with offset information.

Parameters:

blob_file_path : str¶: Path to the local blob file
output_dir : str¶: Path to the output directory
team_file_id : Optional[int]¶: ID of file in Team Files. Default is None. team_file_id may be None if it’s not possible to obtain the ID at this moment. You can set the team_file_id later when uploading the file to Supervisely.
filter_func : Callable, optional¶: Function to filter files. The function should take a filename as input and return True if the file should be included.
batch_size : int, optional¶: Number of files to process in each batch, defaults to 10000
replace : bool¶: If True, overwrite the existing file if it exists. If False, skip processing if the file already exists and return its path. Default is False.

Returns:

Path to the output pickle file

Return type:

str

Usage Example:

import supervisely as sly

archive_path = '/path/to/examples.tar'
output_dir = '/path/to/output'
sly.fs.save_blob_offsets_pkl(archive_path, output_dir)

silent_remove(file_path)[source]¶

Remove file which may not exist.

Parameters:

file_path : str¶: File path.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import silent_remove

silent_remove('/home/admin/work/projects/examples/1.jpeg')

str_is_url(string)[source]¶

Check if string is a valid URL.

Parameters:

string : str¶: string to check

Returns:

True if string is a valid URL, False otherwise

Return type:

bool

Usage Example:

import supervisely as sly

url = "https://example.com/image.jpg"
is_url = sly.fs.str_is_url(url)
print(is_url)  # True

string_to_byte_size(string)[source]¶

Returns integer representation of byte size from string representation.

If input is integer, returns the same integer for convenience.

Parameters:

string : Union[str, int]¶: String representation of byte size (e.g. 1.5Kb, 2Mb, 3.7Gb, 4.2Tb) or integer.

Returns:

Integer representation of byte size (or the same integer if input is integer).

Return type:

int

Raises:

ValueError – If input string is invalid.

Usage Example:

from supervisely.io.fs import string_to_byte_size

string_size = "1.5M"
size = string_to_byte_size(string_size)
print(size)  # 1572864
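
The conversion uses binary multiples (1 Kb = 1024 bytes), which a short sketch makes explicit. `parse_byte_size` is a hypothetical helper; the exact set of suffixes and spellings accepted by string_to_byte_size is an assumption, chosen here to cover both "1.5M" and "1.5Mb".

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
# A number, an optional K/M/G/T prefix, and an optional trailing "b".
_SIZE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([kmgt]?)b?\s*$", re.IGNORECASE)


def parse_byte_size(value):
    """Convert '1.5Mb'-style text to bytes; integers pass through unchanged."""
    if isinstance(value, int):
        return value
    match = _SIZE.match(value)
    if match is None:
        raise ValueError(f"Invalid byte size: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.lower()])


size = parse_byte_size("1.5M")  # 1.5 * 1024 * 1024
```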

subdirs_tree(dir_path, ignore=None, ignore_content=None)[source]¶

Generator that yields directories in the directory tree, starting from the level below the root directory and then going down the tree. If ignore is specified, it will ignore paths which end with the specified directory names. All subdirectories of ignored directories will still be yielded.

Parameters:

dir_path : str¶: Target directory path.
ignore : List[str]¶: List of directories to ignore. Note that the function will still yield subdirectories of ignored directories. It only ignores paths which end with the specified directory names.
ignore_content : List[str]¶: List of directories whose subdirectories should be ignored.

Returns:

Generator that yields directories in the directory tree.

Return type:

Generator[str, None, None]
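
The difference between ignore and ignore_content is easier to see on a small tree. The following is a sketch of the described behaviour using `os.walk`, not the library's code. It assumes that paths are yielded relative to the root and that a directory listed in ignore_content is itself still yielded; `subdirs_tree_sketch` is a hypothetical name.

```python
import os
import tempfile
from typing import Generator, List, Optional


def subdirs_tree_sketch(
    dir_path: str,
    ignore: Optional[List[str]] = None,
    ignore_content: Optional[List[str]] = None,
) -> Generator[str, None, None]:
    """Hypothetical re-creation of the documented ignore / ignore_content rules."""
    ignored = set(ignore or [])
    pruned = set(ignore_content or [])
    for current, dirnames, _files in os.walk(dir_path):
        descend = []
        for name in sorted(dirnames):
            if name not in ignored:
                # Assumption: yield paths relative to the root directory.
                yield os.path.relpath(os.path.join(current, name), dir_path)
            if name not in pruned:
                descend.append(name)  # keep walking below this directory
        dirnames[:] = descend  # in-place edit tells os.walk where to descend


with tempfile.TemporaryDirectory() as root:
    for sub in ("ds1/ann", "ds1/img", "cache/tmp"):
        os.makedirs(os.path.join(root, *sub.split("/")))

    everything = sorted(subdirs_tree_sketch(root))
    without_ds1 = sorted(subdirs_tree_sketch(root, ignore=["ds1"]))
    shallow_cache = sorted(subdirs_tree_sketch(root, ignore_content=["cache"]))
```

With ignore=["ds1"] the entry for ds1 disappears while ds1/ann and ds1/img remain; with ignore_content=["cache"] the entry for cache remains while cache/tmp disappears.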

touch(path)[source]¶

Sets access and modification times for a file.

Parameters:

path : str¶: Target file path.

Returns:

None

Return type:

None

Usage Example:

from supervisely.io.fs import touch

touch('/home/admin/work/projects/examples/1.jpeg')
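
As a rough illustration of what setting access and modification times involves, the sketch below uses `os.utime`. It assumes Unix `touch` semantics, where a missing file (and its parent directory) is created first; that is an assumption for the example, not a statement about the library. `touch_sketch` is a hypothetical name.

```python
import os
import tempfile


def touch_sketch(path: str) -> None:
    """Hypothetical helper: create the file if needed and set its times to now."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)  # assumption: create parents
    with open(path, "a"):      # append mode creates the file without truncating it
        os.utime(path, None)   # None means "use the current time" for atime and mtime


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "examples", "1.jpeg")
    touch_sketch(target)                       # file and parent directory are created
    created = os.path.isfile(target)

    os.utime(target, (1_000_000, 1_000_000))   # pretend the file is decades old
    touch_sketch(target)                       # timestamps are reset to the current time
    refreshed = os.path.getmtime(target) > 1_000_000
```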

async touch_async(path)[source]¶

Sets access and modification times for a file asynchronously.

Parameters:

path : str¶: Target file path.

Returns:

None

Return type:

None

Usage Example:

import supervisely as sly
from supervisely._utils import run_coroutine

coroutine = sly.fs.touch_async('/home/admin/work/projects/examples/1.jpeg')
run_coroutine(coroutine)

tree(dir_path)[source]¶

Get tree for target directory.

Parameters:

dir_path : str¶: Target directory path.

Returns:

Tree with directory files and subdirectories

Return type:

str

Usage Example:

from supervisely.io.fs import tree

dir_tree = tree('/home/admin/work/projects/examples')
print(dir_tree)
# Output: /home/admin/work/projects/examples
# ├── [4.0K]  1
# │   ├── [165K]  crop.jpeg
# │   ├── [169K]  fliplr.jpeg
# │   ├── [169K]  flipud.jpeg
# │   ├── [166K]  relative_crop.jpeg
# │   ├── [167K]  resize.jpeg
# │   ├── [169K]  rotate.jpeg
# │   ├── [171K]  scale.jpeg
# │   └── [168K]  translate.jpeg
# ├── [ 15K]  123.jpeg
# ├── [158K]  1.jpeg
# ├── [188K]  1.txt
# ├── [1.3M]  1.zip
# ├── [4.0K]  2
# ├── [ 92K]  acura.png
# ├── [1.2M]  acura_PNG122.png
# ├── [198K]  aston_martin_PNG55.png
# ├── [4.0K]  ds1
# │   ├── [4.0K]  ann
# │   │   ├── [4.3K]  IMG_0748.jpeg.json
# │   │   ├── [ 151]  IMG_0777.jpeg.json
# │   │   ├── [ 151]  IMG_0888.jpeg.json
# │   │   ├── [3.7K]  IMG_1836.jpeg.json
# │   │   ├── [8.1K]  IMG_2084.jpeg.json
# │   │   ├── [5.5K]  IMG_3861.jpeg.json
# │   │   ├── [6.0K]  IMG_4451.jpeg.json
# │   │   └── [5.0K]  IMG_8144.jpeg.json
# │   └── [4.0K]  img
# │       ├── [152K]  IMG_0748.jpeg
# │       ├── [210K]  IMG_0777.jpeg
# │       ├── [210K]  IMG_0888.jpeg
# │       ├── [137K]  IMG_1836.jpeg
# │       ├── [139K]  IMG_2084.jpeg
# │       ├── [145K]  IMG_3861.jpeg
# │       ├── [133K]  IMG_4451.jpeg
# │       └── [136K]  IMG_8144.jpeg
# ├── [152K]  example.jpeg
# ├── [2.4K]  example.json
# ├── [153K]  flip.jpeg
# ├── [ 65K]  hash1.jpeg
# ├── [ 336]  meta.json
# └── [5.4K]  q.jpeg
# 5 directories, 37 files

unpack_archive(archive_path, target_dir, remove_junk=True, is_split=False, chunk_size_mb=50)[source]¶

Unpacks an archive to the target directory and, by default, removes junk files and directories. Works with tar and zip archives. To extract a split archive, set is_split to True and pass the path to the first part in archive_path. All parts must be in the same directory and follow the naming format archive_name.tar.001, archive_name.tar.002, etc. The chunk_size_mb parameter sets how much data is read at a time while the file is unpacked from parts; change it with care, because it can affect performance.

Parameters:

archive_path : str¶: Path to the archive.
target_dir : str¶: Path to the target directory.
remove_junk : bool¶: Remove junk files and directories. Default is True.
is_split : bool¶: Determines if the source archive is split into parts. If True, archive_path must be the path to the first part. Default is False.
chunk_size_mb : int¶: Size of the chunk to read from the file, in megabytes. Default is 50.

Returns:

None

Return type:

None

Usage Example:

import supervisely as sly

archive_path = '/home/admin/work/examples.tar'
target_dir = '/home/admin/work/projects'
sly.fs.unpack_archive(archive_path, target_dir)
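
The split-archive naming scheme and the role of the chunk size can be sketched with the standard library. The example below is not the library's implementation: `combine_parts` is a hypothetical helper that joins archive_name.tar.001, archive_name.tar.002, etc. into one file, reading a fixed-size chunk at a time, and the demo builds its own small tar archive in a temporary directory.

```python
import glob
import os
import tarfile
import tempfile


def combine_parts(first_part: str, combined_path: str, chunk_size_mb: int = 50) -> list:
    """Hypothetical helper: join name.tar.001, name.tar.002, ... into one file."""
    base, _ = os.path.splitext(first_part)  # strip the ".001" suffix
    parts = sorted(glob.glob(glob.escape(base) + ".[0-9][0-9][0-9]"))
    chunk_size = chunk_size_mb * 1024 * 1024
    with open(combined_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Read fixed-size chunks so a large part is never held in memory whole.
                while chunk := src.read(chunk_size):
                    out.write(chunk)
    return parts


with tempfile.TemporaryDirectory() as tmp:
    original = bytes(range(256)) * 64  # 16 KiB of sample data
    data_path = os.path.join(tmp, "data.bin")
    with open(data_path, "wb") as f:
        f.write(original)

    tar_path = os.path.join(tmp, "examples.tar")
    with tarfile.open(tar_path, "w") as tar:
        tar.add(data_path, arcname="data.bin")

    # Split the archive into 8 KiB parts: examples.tar.001, examples.tar.002, ...
    with open(tar_path, "rb") as src:
        index = 1
        while piece := src.read(8192):
            with open(f"{tar_path}.{index:03d}", "wb") as part:
                part.write(piece)
            index += 1

    restored = os.path.join(tmp, "restored.tar")
    parts = combine_parts(tar_path + ".001", restored, chunk_size_mb=1)
    with tarfile.open(restored) as tar:
        names = tar.getnames()
        payload = tar.extractfile("data.bin").read()
```

A larger chunk means fewer read calls but more memory per read, which is the trade-off behind the advice to change the chunk size with care.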

async unpack_archive_async(archive_path, target_dir, remove_junk=True, is_split=False, chunk_size_mb=50)[source]¶

Unpacks archive to the target directory asynchronously. Accepts the same parameters as unpack_archive.

Parameters:

archive_path : str¶: Path to the archive.
target_dir : str¶: Path to the target directory.
remove_junk : bool¶: Remove junk files and directories. Default is True.
is_split : bool¶: Determines if the source archive is split into parts. If True, archive_path must be the path to the first part. Default is False.
chunk_size_mb : int¶: Size of the chunk to read from the file, in megabytes. Default is 50.

Returns:

None

Return type:

None

Usage Example:

import supervisely as sly
from supervisely._utils import run_coroutine

archive_path = '/home/admin/work/examples.tar'
target_dir = '/home/admin/work/projects'

coroutine = sly.fs.unpack_archive_async(archive_path, target_dir)
run_coroutine(coroutine)