aztk package

aztk.client module

class aztk.client.CoreClient

Bases: object

The base AZTK client that all other clients inherit from.

This client should not be used directly. Only software-specific clients should be used.

class aztk.client.base.BaseOperations(context)

Bases: object

Base operations that all other operations have as an attribute.

batch_client: azure.batch.batch_service_client.BatchServiceClient – Client used to interact with the Azure Batch service.

blob_client: azure.storage.blob.BlockBlobService – Client used to interact with the Azure Storage Blob service.

secrets_configuration: aztk.models.SecretsConfiguration – Model that holds AZTK secrets used to authenticate with Azure and the clusters.

get_cluster_configuration(id: str) → aztk.models.cluster_configuration.ClusterConfiguration

Get the configuration of the given cluster

Parameters:	id (`str`) – the id of the cluster
Returns:	Object representing the cluster’s configuration
Return type:	`aztk.models.ClusterConfiguration`

get_cluster_data(id: str) → aztk.internal.cluster_data.cluster_data.ClusterData

Gets the ClusterData object to manage data related to the given cluster

Parameters:	id (`str`) – the id of the cluster to get
Returns:	Object used to manage the data and storage functions for a cluster
Return type:	`aztk.models.ClusterData`

ssh_into_node(id, node_id, username, ssh_key=None, password=None, port_forward_list=None, internal=False)

Open an ssh tunnel to a node

Parameters:

id (str) – the id of the cluster the node is in
node_id (str) – the id of the node to open the ssh tunnel to
username (str) – the username to authenticate the ssh session
ssh_key (str, optional) – ssh public key to create the user with, must use ssh_key or password. Defaults to None.
password (str, optional) – password for the user, must use ssh_key or password. Defaults to None.
port_forward_list (List[PortForwardingSpecification], optional) – list of PortForwardingSpecifications. The defined ports will be forwarded to the client. Defaults to None.
internal (bool, optional) – if True, this will connect to the node using its internal IP. Only use this if running within the same VNET as the cluster. Defaults to False.

Returns:

None

create_user_on_node(id, node_id, username, ssh_key=None, password=None)

Create a user on a node

Parameters:

id (`str`) – id of the cluster to create the user on.
node_id (`str`) – id of the node in the cluster to create the user on.
username (`str`) – name of the user to create.
ssh_key (`str`, optional) – ssh public key to create the user with, must use ssh_key or password.
password (`str`, optional) – password for the user, must use ssh_key or password.

Returns:	`None`
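The parameter descriptions above require that a caller supply either ssh_key or password. A minimal sketch of a pre-flight check a caller might run before create_user_on_node follows. The helper name `check_credentials` is hypothetical and not part of aztk, and preferring the key when both are given is an assumption of this sketch.

```python
from typing import Optional


def check_credentials(ssh_key: Optional[str] = None, password: Optional[str] = None) -> str:
    """Return which credential would be used, or raise if neither is given.

    Hypothetical helper for illustration; not part of aztk.
    """
    if ssh_key is None and password is None:
        raise ValueError("must use ssh_key or password")
    # Assumption of this sketch: prefer the key when both are supplied.
    return "ssh_key" if ssh_key is not None else "password"
```

Calling `check_credentials(password="secret")` returns `"password"`, while calling it with no arguments raises `ValueError`.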

create_user_on_cluster(id, nodes, username, ssh_pub_key=None, password=None)

Create a user on every node in the cluster

Parameters:

id (str) – id of the cluster to create the user on.
nodes (List[ComputeNode]) – list of nodes to create the user on.
username (str) – name of the user to create.
ssh_pub_key (str, optional) – ssh public key to create the user with, must use ssh_pub_key or password. Defaults to None.
password (str, optional) – password for the user, must use ssh_pub_key or password. Defaults to None.

Returns:

None

generate_user_on_node(id, node_id)

Create a user with an autogenerated username and ssh_key on the given node.

Parameters:

id (`str`) – the id of the cluster to generate the user on.
node_id (`str`) – the id of the node in the cluster to generate the user on.

Returns:	A tuple of the form (username: `str`, ssh_key: `Cryptodome.PublicKey.RSA`)
Return type:	`tuple`

generate_user_on_cluster(id, nodes)

Create a user with an autogenerated username and ssh_key on every given node in the cluster.

Parameters:

id (`str`) – the id of the cluster to generate the user on.
nodes (`List[ComputeNode]`) – list of nodes in the cluster to generate the user on.

Returns:	A tuple of the form (username: `str`, ssh_key: `Cryptodome.PublicKey.RSA`)
Return type:	`tuple`

delete_user_on_node(id: str, node_id: str, username: str) → str

Delete a user on a node

Parameters:

id (`str`) – the id of the cluster to delete the user on.
node_id (`str`) – the id of the node in the cluster to delete the user on.
username (`str`) – the name of the user to delete.

Returns:	`None`

delete_user_on_cluster(username, id, nodes)

Delete a user on every node in the cluster

Parameters:

username (`str`) – the name of the user to delete.
id (`str`) – the id of the cluster to delete the user on.
nodes (`List[ComputeNode]`) – list of nodes in the cluster to delete the user on.

Returns:	`None`

node_run(id, node_id, command, internal, container_name=None, timeout=None, block=True)

Run a bash command on the given node

Parameters:

id (`str`) – the id of the cluster to run the command on.
node_id (`str`) – the id of the node in the cluster to run the command on.
command (`str`) – the bash command to execute on the node.
internal (`bool`) – if True, this will connect to the node using its internal IP. Only use this if running within the same VNET as the cluster.
container_name (`str`, optional) – the name of the container to run the command in. If None, the command will run on the host VM. Defaults to None.
timeout (`int`, optional) – The timeout in seconds for establishing a connection to the node. Defaults to None.
block (`bool`, optional) – If True, the command blocks until execution is complete. Defaults to True.

Returns:	object containing the output of the run command
Return type:	`aztk.models.NodeOutput`

get_remote_login_settings(id: str, node_id: str)

Get the remote login information for a node in a cluster

Parameters:

id (`str`) – the id of the cluster the node is in
node_id (`str`) – the id of the node in the cluster

Returns:	Object that contains the ip address and port combination to login to a node
Return type:	`aztk.models.RemoteLogin`

run(id, command, internal, container_name=None, timeout=None)

Run a bash command on every node in the cluster

Parameters:

id (`str`) – the id of the cluster to run the command on.
command (`str`) – the bash command to execute on each node.
internal (`bool`) – if True, this will connect to the node using its internal IP. Only use this if running within the same VNET as the cluster.
container_name (`str`, optional) – the name of the container to run the command in. If None, the command will run on the host VM. Defaults to None.
timeout (`int`, optional) – The timeout in seconds for establishing a connection to the node. Defaults to None.

Returns:	list of NodeOutput objects containing the output of the run command
Return type:	`List[aztk.models.NodeOutput]`
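Because run returns one NodeOutput per node, a caller usually has to separate the nodes that succeeded from those that did not. The sketch below illustrates that step offline. The `NodeOutput` dataclass is a local stand-in whose field names (`id`, `output`, `error`) are assumptions, and the sample list simulates a return value rather than calling a live cluster.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class NodeOutput:
    """Local stand-in for aztk.models.NodeOutput; field names are assumptions."""
    id: str
    output: Optional[str] = None
    error: Optional[Exception] = None


def split_results(results: List[NodeOutput]) -> Tuple[List[NodeOutput], List[NodeOutput]]:
    """Partition per-node results into (succeeded, failed)."""
    succeeded = [r for r in results if r.error is None]
    failed = [r for r in results if r.error is not None]
    return succeeded, failed


# Simulated return value of run(); a real call needs a provisioned cluster.
results = [
    NodeOutput(id="node-1", output="ok"),
    NodeOutput(id="node-2", error=TimeoutError("connection timed out")),
]
succeeded, failed = split_results(results)
```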

get_application_log(id: str, application_name: str, tail=False, current_bytes: int = 0)

Get the log for a running or completed application

Parameters:

id (`str`) – the id of the cluster the application was submitted to.
application_name (`str`) – the name of the application to get the log of.
tail (`bool`, optional) – If True, get the remaining bytes after current_bytes. Otherwise, the whole log will be retrieved. Only use this if streaming the log as it is being written. Defaults to False.
current_bytes (`int`) – Specifies the last seen byte, so only the bytes after current_bytes are retrieved. Only useful if streaming the log as it is being written. Only used if tail is True. Defaults to 0.

Returns:	a model representing the output of the application.
Return type:	`aztk.models.ApplicationLog`
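The interaction between tail and current_bytes is easiest to see as a polling loop: each call passes back the last seen byte and appends only what is new. The following is a minimal offline sketch under stated assumptions. `FakeOperations` is a hypothetical in-memory substitute for a client, and the `log`, `total_bytes`, and `finished` fields of the stand-in `ApplicationLog` are assumptions rather than the documented model.

```python
from dataclasses import dataclass


@dataclass
class ApplicationLog:
    """Local stand-in for aztk.models.ApplicationLog; field names are assumptions."""
    log: str
    total_bytes: int
    finished: bool


class FakeOperations:
    """Hypothetical in-memory substitute so the loop can run without Azure."""

    def __init__(self, content: str, chunk: int):
        self._content = content
        self._chunk = chunk
        self._written = 0  # how much the simulated application has written

    def get_application_log(self, id, application_name, tail=False, current_bytes=0):
        # Every call simulates the application writing one more chunk.
        self._written = min(len(self._content), self._written + self._chunk)
        start = current_bytes if tail else 0
        return ApplicationLog(
            log=self._content[start:self._written],
            total_bytes=self._written,
            finished=self._written == len(self._content),
        )


def stream_log(ops, cluster_id: str, application_name: str) -> str:
    """Collect a log incrementally by feeding the last seen byte back in."""
    current_bytes = 0
    pieces = []
    while True:
        result = ops.get_application_log(
            cluster_id, application_name, tail=True, current_bytes=current_bytes
        )
        pieces.append(result.log)
        current_bytes = result.total_bytes
        if result.finished:
            return "".join(pieces)


expected = "line 1\nline 2\nline 3\n"
streamed = stream_log(FakeOperations(expected, chunk=5), "my-cluster", "my-app")
```

A real loop would also sleep between polls and stop on an application state reported by the service; both are left out here to keep the sketch short.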

create_task_table(id: str)

Create an Azure Storage table to track tasks

Parameters:	id (`str`) – the id of the cluster

list_task_table_entries(id)

List tasks in a storage table

Parameters:	id (`str`) – the id of the cluster
Returns:	a list of models representing all entries in the Task table
Return type:	`[aztk.models.Task]`

get_task_from_table(id, task_id)

Get a task from the cluster’s storage table

Parameters:

id (`str`) – the id of the cluster
task_id (`str`) – the id of the task to get

Returns:	the task with id task_id from the cluster’s storage table
Return type:	`aztk.models.Task`

insert_task_into_task_table(id, task)

Insert a task into the table

Parameters:

id (`str`) – the id of the cluster
task (`aztk.models.Task`) – the task to insert into the table

Returns:	a model representing an entry in the Task table
Return type:	`aztk.models.Task`

update_task_in_task_table(id, task)

Update a task in the table

Parameters:

id (`str`) – the id of the cluster
task (`aztk.models.Task`) – the task to update in the table

Returns:	a model representing an entry in the Task table
Return type:	`aztk.models.Task`

delete_task_table(id)

Delete the table that tracks tasks

Parameters:	id (`str`) – the id of the cluster
Returns:	if True, the deletion was successful
Return type:	`bool`

list_tasks(id)

List tasks in a storage table

Parameters:	id (`str`) – the id of the cluster
Returns:	a list of models representing all entries in the Task table
Return type:	`[aztk.models.Task]`

get_recent_job(id)

Get the most recently run job in an Azure Batch job schedule

Parameters:	id (`str`) – the id of the job schedule
Returns:	the most recently run job on the job schedule
Return type:	`[azure.batch.models.Job]`

get_task_state(id: str, task_name: str)

Get the status of a submitted task

Parameters:

id (`str`) – the name of the cluster the task was submitted to
task_name (`str`) – the name of the task to get

Returns:	the status state of the task
Return type:	`str`
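Since get_task_state returns the state as a string, waiting for a task amounts to polling until a terminal state appears. The sketch below shows that loop against a scripted stand-in. `FakeOperations` is hypothetical, and the state names `"completed"` and `"failed"` are assumptions of this sketch rather than values confirmed by the reference above.

```python
import time


class FakeOperations:
    """Hypothetical stand-in that reports a scripted sequence of task states."""

    def __init__(self, states):
        self._states = iter(states)

    def get_task_state(self, id, task_name):
        return next(self._states)


def wait_for_task(ops, cluster_id, task_name, poll_seconds=0.0,
                  terminal=("completed", "failed")):
    """Poll get_task_state until a terminal state is seen, then return it.

    The names in `terminal` are assumptions made for illustration.
    """
    while True:
        state = ops.get_task_state(cluster_id, task_name)
        if state in terminal:
            return state
        time.sleep(poll_seconds)


ops = FakeOperations(["pending", "running", "completed"])
final_state = wait_for_task(ops, "my-cluster", "my-task")
```

In real use, `poll_seconds` would be set to a few seconds to avoid hammering the service, and the loop would typically carry an overall deadline.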

list_batch_tasks(id: str)

List the tasks submitted to a cluster

Parameters:	id (`str`) – the name of the cluster the task was submitted to
Returns:	list of aztk tasks
Return type:	`[aztk.models.Task]`

get_batch_task(id: str, task_id: str)

Get a task submitted to a cluster

Parameters:

id (`str`) – the name of the cluster the task was submitted to
task_id (`str`) – the id of the task to get

Returns:	aztk Task representing the Batch Task
Return type:	`aztk.models.Task`

class aztk.client.cluster.CoreClusterOperations(context)

Bases: aztk.client.base.base_operations.BaseOperations

create(cluster_configuration: aztk.models.cluster_configuration.ClusterConfiguration, software_metadata_key: str, start_task, vm_image_model)

Create a cluster.

Parameters:

cluster_configuration (`aztk.models.ClusterConfiguration`) – Configuration for the cluster to be created
software_metadata_key (`str`) – the key for the primary software that will be run on the cluster
start_task (`azure.batch.models.StartTask`) – Batch StartTask definition to configure the Batch Pool
vm_image_model (`azure.batch.models.VirtualMachineConfiguration`) – Configuration of the virtual machine image and settings

Returns:	A Cluster object representing the state and configuration of the cluster.
Return type:	`aztk.models.Cluster`

get(id: str)

Get the state and configuration of a cluster

Parameters:	id (`str`) – the id of the cluster to get.
Returns:	A Cluster object representing the state and configuration of the cluster.
Return type:	`aztk.models.Cluster`

copy(id, source_path, destination_path=None, container_name=None, internal=False, get=False, timeout=None)

Copy files to or from every node in a cluster.

Parameters:

id (`str`) – the id of the cluster to copy files with.
source_path (`str`) – the path of the file to copy from.
destination_path (`str`, optional) – the local directory path where the output should be written. If None, a SpooledTemporaryFile will be returned in the NodeOutput object, else the file will be written to this path. Defaults to None.
container_name (`str`, optional) – the name of the container to copy to or from. If None, the copy operation will occur on the host VM. Defaults to None.
internal (`bool`, optional) – if True, this will connect to the node using its internal IP. Only use this if running within the same VNET as the cluster. Defaults to False.
get (`bool`, optional) – If True, the files are downloaded from every node in the cluster. Else, the file is copied from the client to the node. Defaults to False.
timeout (`int`, optional) – The timeout in seconds for establishing a connection to the node. Defaults to None.

Returns:	A list of NodeOutput objects representing the output of the copy command.
Return type:	`List[aztk.models.NodeOutput]`

delete(id: str, keep_logs: bool = False)

Delete a cluster.

Parameters:

id (`str`) – the id of the cluster to delete
keep_logs (`bool`) – If True, the logs related to this cluster in Azure Storage are not deleted. Defaults to False.

Returns:	True if the deletion process was successful.
Return type:	`bool`

list(software_metadata_key)

List clusters running the specified software.

Parameters:	software_metadata_key (`str`) – the key of the primary software running on the cluster. This filters out non-aztk clusters and aztk clusters running other software.
Returns:	list of clusters running the software defined by software_metadata_key
Return type:	`List[aztk.models.Cluster]`
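The filtering that software_metadata_key performs can be pictured as keeping only the pools whose metadata carries that key. The sketch below is an illustration only. The dict shape used for `pools` and the helper name `filter_clusters` are assumptions, not the Azure Batch model or the aztk implementation.

```python
def filter_clusters(pools, software_metadata_key):
    """Keep only pools whose metadata contains the given software key.

    `pools` is a list of dicts with a "metadata" mapping; this shape is
    an assumption made for illustration.
    """
    return [p for p in pools if software_metadata_key in p.get("metadata", {})]


# Hypothetical sample data: one matching cluster, one non-aztk pool,
# and one aztk cluster running different software.
pools = [
    {"id": "spark-1", "metadata": {"spark": "2.3.0"}},
    {"id": "plain-batch-pool", "metadata": {}},
    {"id": "other-1", "metadata": {"other-software": "1.0"}},
]
matching = [p["id"] for p in filter_clusters(pools, "spark")]
```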

wait(id, task_name)

Wait until the task has completed

Parameters:

id (`str`) – the id of the job the task was submitted to
task_name (`str`) – the name of the task to wait for

Returns:	`None`

class aztk.client.job.CoreJobOperations(context)

Bases: aztk.client.base.base_operations.BaseOperations

submit(job_configuration, start_task, job_manager_task, autoscale_formula, software_metadata_key: str, vm_image_model, application_metadata)

Submit a job

Jobs are a cluster definition and one or many application definitions which run on the cluster. The job’s cluster will be allocated and configured, then the applications will be executed with their output stored in Azure Storage. When all applications have completed, the cluster will be automatically deleted.

Parameters:

job_configuration (`aztk.models.JobConfiguration`) – Model defining the job’s configuration.
start_task (`azure.batch.models.StartTask`) – Batch StartTask definition to configure the Batch Pool
job_manager_task (`azure.batch.models.JobManagerTask`) – Batch JobManagerTask definition to schedule the defined applications on the cluster.
autoscale_formula (`str`) – formula that defines the number of nodes allocated to the cluster.
software_metadata_key (`str`) – the key of the primary software running on the cluster.
vm_image_model (`azure.batch.models.VirtualMachineConfiguration`) – Configuration of the virtual machine image and settings
application_metadata (`List[str]`) – list of the names of all applications that will be run as a part of the job

Returns:	Model representing the Azure Batch JobSchedule state.
Return type:	`azure.batch.models.CloudJobSchedule`

aztk.error module

Contains all errors used in AZTK. All errors should inherit from AztkError.

exception aztk.error.AztkError

Bases: Exception

exception aztk.error.AztkAttributeError

Bases: aztk.error.AztkError

exception aztk.error.ClusterNotReadyError

Bases: aztk.error.AztkError

exception aztk.error.AzureApiInitError

Bases: aztk.error.AztkError

exception aztk.error.InvalidPluginConfigurationError

Bases: aztk.error.AztkError

exception aztk.error.InvalidModelError(message: str, model=None)

Bases: aztk.error.AztkError

exception aztk.error.MissingRequiredAttributeError(message: str, model=None)

Bases: aztk.error.InvalidModelError

exception aztk.error.InvalidPluginReferenceError(message: str, model=None)

Bases: aztk.error.InvalidModelError

exception aztk.error.InvalidModelFieldError(message: str, model=None, field=None)

Bases: aztk.error.InvalidModelError
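Because every error inherits from AztkError, a single except clause on the base class catches any of them. The sketch below re-declares part of the hierarchy locally so it runs without aztk installed. The class names and bases mirror the listing above, while the constructor bodies, the stored `model` and `field` attributes, and the `validate_size` function are assumptions made for illustration.

```python
class AztkError(Exception):
    """Local mirror of aztk.error.AztkError."""


class InvalidModelError(AztkError):
    def __init__(self, message: str, model=None):
        super().__init__(message)
        self.model = model  # assumption: the model is kept on the instance


class InvalidModelFieldError(InvalidModelError):
    def __init__(self, message: str, model=None, field=None):
        super().__init__(message, model)
        self.field = field  # assumption: the field name is kept on the instance


def validate_size(size: int) -> None:
    """Hypothetical validator that raises the most specific error."""
    if size < 0:
        raise InvalidModelFieldError("size must be non-negative", field="size")


try:
    validate_size(-1)
except AztkError as error:
    # Catching the base class still yields the specific subclass instance.
    caught = type(error).__name__
    caught_field = getattr(error, "field", None)
```

Catching the narrowest class that the caller can actually handle, and falling back to AztkError only at the outer boundary, keeps unrelated failures from being swallowed.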