aztk package

aztk.client module

class aztk.client.CoreClient[source]

Bases: object

The base AZTK client that all other clients inherit from.

This client should not be used directly. Only software-specific clients should be used.

class aztk.client.base.BaseOperations(context)[source]

Bases: object

Base operations that all other operations have as an attribute

batch_client

azure.batch.batch_service_client.BatchServiceClient – Client used to interact with the Azure Batch service.

blob_client

azure.storage.blob.BlockBlobService – Client used to interact with the Azure Storage Blob service.

secrets_configuration

aztk.models.SecretsConfiguration – Model that holds AZTK secrets used to authenticate with Azure and the clusters.

get_cluster_configuration(id: str) → aztk.models.cluster_configuration.ClusterConfiguration[source]

Get the configuration of a cluster

Parameters:
  • id (str) – the id of the cluster to get the configuration of
Returns:

Object representing the cluster’s configuration

Return type:

aztk.models.ClusterConfiguration

get_cluster_data(id: str) → aztk.internal.cluster_data.cluster_data.ClusterData[source]

Gets the ClusterData object to manage data related to the given cluster

Parameters: id (str) – the id of the cluster to get
Returns: Object used to manage the data and storage functions for a cluster
Return type: aztk.internal.cluster_data.cluster_data.ClusterData

ssh_into_node(id, node_id, username, ssh_key=None, password=None, port_forward_list=None, internal=False)[source]

Open an ssh tunnel to a node

Parameters:
  • id (str) – the id of the cluster the node is in
  • node_id (str) – the id of the node to open the ssh tunnel to
  • username (str) – the username to authenticate the ssh session
  • ssh_key (str, optional) – ssh public key to create the user with, must use ssh_key or password. Defaults to None.
  • password (str, optional) – password for the user, must use ssh_key or password. Defaults to None.
  • port_forward_list (List[PortForwardingSpecification], optional) – list of PortForwardingSpecifications. The defined ports will be forwarded to the client. Defaults to None.
  • internal (bool, optional) – if True, this will connect to the node using its internal IP. Only use this if running within the same VNET as the cluster. Defaults to False.
Returns:

None

create_user_on_node(id, node_id, username, ssh_key=None, password=None)[source]

Create a user on a node

Parameters:
  • id (str) – id of the cluster to create the user on.
  • node_id (str) – id of the node in the cluster to create the user on.
  • username (str) – name of the user to create.
  • ssh_key (str, optional) – ssh public key to create the user with, must use ssh_key or password.
  • password (str, optional) – password for the user, must use ssh_key or password.
Returns:

None

create_user_on_cluster(id, nodes, username, ssh_pub_key=None, password=None)[source]

Create a user on every node in the cluster

Parameters:
  • id (str) – id of the cluster to create the user on.
  • nodes (List[ComputeNode]) – list of nodes to create the user on
  • username (str) – name of the user to create.
  • ssh_pub_key (str, optional) – ssh public key to create the user with, must use ssh_pub_key or password. Defaults to None.
  • password (str, optional) – password for the user, must use ssh_pub_key or password. Defaults to None.
Returns:

None

generate_user_on_node(id, node_id)[source]

Create a user with an autogenerated username and ssh_key on the given node.

Parameters:
  • id (str) – the id of the cluster to generate the user on.
  • node_id (str) – the id of the node in the cluster to generate the user on.
Returns:

A tuple of the form (username: str, ssh_key: Cryptodome.PublicKey.RSA)

Return type:

tuple

generate_user_on_cluster(id, nodes)[source]

Create a user with an autogenerated username and ssh_key on the cluster

Parameters:
  • id (str) – the id of the cluster to generate the user on.
  • nodes (List[ComputeNode]) – list of nodes in the cluster to generate the user on.
Returns:

A tuple of the form (username: str, ssh_key: Cryptodome.PublicKey.RSA)

Return type:

tuple

delete_user_on_node(id: str, node_id: str, username: str) → str[source]

Delete a user on a node

Parameters:
  • id (str) – the id of the cluster to delete the user on.
  • node_id (str) – the id of the node in the cluster to delete the user on.
  • username (str) – the name of the user to delete.
Returns:

None

delete_user_on_cluster(username, id, nodes)[source]

Delete a user on every node in the cluster

Parameters:
  • username (str) – the name of the user to delete.
  • id (str) – the id of the cluster to delete the user on.
  • nodes (List[ComputeNode]) – list of nodes in the cluster to delete the user on.
Returns:

None
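
The user operations above pair naturally: a user generated for a short-lived session should be deleted once the session ends, even if the work fails. The following is a minimal sketch of that pattern, not part of AZTK. It assumes an operations object exposing generate_user_on_node and delete_user_on_node with the signatures documented above; FakeOperations and temporary_user are hypothetical names, and the fake only records calls so the example runs without Azure.

```python
from contextlib import contextmanager


class FakeOperations:
    """Hypothetical stand-in that records calls instead of contacting Azure."""

    def __init__(self):
        self.calls = []

    def generate_user_on_node(self, id, node_id):
        self.calls.append(("generate", id, node_id))
        # The documented method returns (username, ssh_key); these values
        # are placeholders, not real credentials.
        return "aztk-temp-user", "placeholder-key"

    def delete_user_on_node(self, id, node_id, username):
        self.calls.append(("delete", id, node_id, username))


@contextmanager
def temporary_user(operations, cluster_id, node_id):
    """Yield (username, ssh_key) and always delete the user afterwards."""
    username, ssh_key = operations.generate_user_on_node(cluster_id, node_id)
    try:
        yield username, ssh_key
    finally:
        operations.delete_user_on_node(cluster_id, node_id, username)


ops = FakeOperations()
try:
    with temporary_user(ops, "my-cluster", "node-1") as (username, _key):
        raise RuntimeError("simulated failure while the user exists")
except RuntimeError:
    pass

# The delete call is recorded even though the body raised.
print(ops.calls)
```

With a real client, the same context manager could wrap whichever operations object the software-specific client exposes.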

node_run(id, node_id, command, internal, container_name=None, timeout=None, block=True)[source]

Run a bash command on the given node

Parameters:
  • id (str) – the id of the cluster to run the command on.
  • node_id (str) – the id of the node in the cluster to run the command on.
  • command (str) – the bash command to execute on the node.
  • internal (bool) – if True, this will connect to the node using its internal IP. Only use this if running within the same VNET as the cluster.
  • container_name (str, optional) – the name of the container to run the command in. If None, the command will run on the host VM. Defaults to None.
  • timeout (int, optional) – The timeout in seconds for establishing a connection to the node. Defaults to None.
  • block (bool, optional) – If True, the command blocks until execution is complete. Defaults to True.
Returns:

object containing the output of the run command

Return type:

aztk.models.NodeOutput

get_remote_login_settings(id: str, node_id: str)[source]

Get the remote login information for a node in a cluster

Parameters:
  • id (str) – the id of the cluster the node is in
  • node_id (str) – the id of the node in the cluster
Returns:

Object that contains the ip address and port combination to login to a node

Return type:

aztk.models.RemoteLogin

run(id, command, internal, container_name=None, timeout=None)[source]

Run a bash command on every node in the cluster

Parameters:
  • id (str) – the id of the cluster to run the command on.
  • command (str) – the bash command to execute on the node.
  • internal (bool) – if True, this will connect to the node using its internal IP. Only use this if running within the same VNET as the cluster.
  • container_name (str, optional) – the name of the container to run the command in. If None, the command will run on the host VM. Defaults to None.
  • timeout (int, optional) – The timeout in seconds for establishing a connection to the node. Defaults to None.
Returns:

list of NodeOutput objects containing the output of the run command

Return type:

List[aztk.models.NodeOutput]
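
To illustrate the relationship between a per-node command and a cluster-wide one, the sketch below fans a command out to several nodes and collects one result per node. It is an assumption-laden illustration, not AZTK's implementation: FakeNodeOutput, fake_node_run, and run_on_nodes are hypothetical names, and the fields on FakeNodeOutput are assumed rather than taken from aztk.models.NodeOutput.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeNodeOutput:
    """Hypothetical stand-in for aztk.models.NodeOutput; fields are assumed."""
    node_id: str
    output: Optional[str] = None
    error: Optional[Exception] = None


def fake_node_run(cluster_id, node_id, command):
    """Pretend to run `command` on one node; 'node-bad' simulates a failure."""
    if node_id == "node-bad":
        raise ConnectionError("could not reach " + node_id)
    return FakeNodeOutput(node_id, output=f"{node_id}: ran {command!r}")


def run_on_nodes(cluster_id, node_ids, command, node_run=fake_node_run):
    """Run `command` on every node, capturing per-node errors instead of raising."""

    def run_one(node_id):
        try:
            return node_run(cluster_id, node_id, command)
        except Exception as exc:
            return FakeNodeOutput(node_id, error=exc)

    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves the order of node_ids in the results.
        return list(pool.map(run_one, node_ids))


results = run_on_nodes("my-cluster", ["node-1", "node-bad", "node-2"], "hostname")
for result in results:
    print(result.node_id, "error" if result.error else "ok")
```

Capturing the exception per node means one unreachable node does not hide the output of the others.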

get_application_log(id: str, application_name: str, tail=False, current_bytes: int = 0)[source]

Get the log for a running or completed application

Parameters:
  • id (str) – the id of the cluster the application was submitted to.
  • application_name (str) – the name of the application to get the log for.
  • tail (bool, optional) – If True, get the remaining bytes after current_bytes. Otherwise, the whole log will be retrieved. Only use this if streaming the log as it is being written. Defaults to False.
  • current_bytes (int, optional) – Specifies the last seen byte, so only the bytes after current_bytes are retrieved. Only useful if streaming the log as it is being written. Only used if tail is True. Defaults to 0.
Returns:

a model representing the output of the application.

Return type:

aztk.models.ApplicationLog
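
The tail and current_bytes parameters are meant to be used together when following a log as it is written. A rough sketch of such a loop follows. The call to get_application_log uses the documented signature, but everything else is assumed for illustration: FakeApplicationLog and its fields (log, total_bytes, finished), FakeOperations, and stream_log are hypothetical and may not match the real aztk.models.ApplicationLog.

```python
from dataclasses import dataclass


@dataclass
class FakeApplicationLog:
    """Hypothetical stand-in for aztk.models.ApplicationLog; fields are assumed."""
    log: str
    total_bytes: int
    finished: bool


class FakeOperations:
    """Serves a fixed log in growing slices to imitate an application still writing."""

    def __init__(self, full_log, chunk):
        self._full = full_log.encode("utf-8")
        self._visible = 0
        self._chunk = chunk

    def get_application_log(self, id, application_name, tail=False, current_bytes=0):
        # Each call "writes" another chunk of the log.
        self._visible = min(len(self._full), self._visible + self._chunk)
        start = current_bytes if tail else 0
        data = self._full[start:self._visible]
        return FakeApplicationLog(
            log=data.decode("utf-8"),
            total_bytes=self._visible,
            finished=self._visible == len(self._full),
        )


def stream_log(operations, cluster_id, application_name):
    """Yield new log text until the application reports that it has finished."""
    current_bytes = 0
    while True:
        result = operations.get_application_log(
            cluster_id, application_name, tail=True, current_bytes=current_bytes
        )
        if result.log:
            yield result.log
        # Remember how far we have read so the next call returns only new bytes.
        current_bytes = result.total_bytes
        if result.finished:
            break


ops = FakeOperations("line 1\nline 2\nline 3\n", chunk=8)
pieces = list(stream_log(ops, "my-cluster", "my-app"))
print("".join(pieces), end="")
```

A real polling loop would normally sleep between calls; the fake advances on every call, so no delay is needed here.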

create_task_table(id: str)[source]

Create an Azure Storage table to track tasks

Parameters: id (str) – the id of the cluster

list_task_table_entries(id)[source]

List tasks in a storage table

Parameters: id (str) – the id of the cluster
Returns: a list of models representing all entries in the Task table
Return type: List[aztk.models.Task]

get_task_from_table(id, task_id)[source]

Get a task from the cluster’s storage table

Parameters:
  • id (str) – the id of the cluster
  • task_id (str) – the id of the task to get
Returns:

the task with id task_id from the cluster’s storage table

Return type:

aztk.models.Task

insert_task_into_task_table(id, task)[source]

Insert a task into the table

Parameters:
  • id (str) – the id of the cluster
  • task – the task to insert into the table
Returns:

a model representing an entry in the Task table

Return type:

aztk.models.Task

update_task_in_task_table(id, task)[source]

Update a task in the table

Parameters:
  • id (str) – the id of the cluster
  • task – the task to update in the table
Returns:

a model representing an entry in the Task table

Return type:

aztk.models.Task

delete_task_table(id)[source]

Delete the table that tracks tasks

Parameters: id (str) – the id of the cluster
Returns: True if the deletion was successful
Return type: bool

list_tasks(id)[source]

List tasks in a storage table

Parameters: id (str) – the id of the cluster
Returns: a list of models representing all entries in the Task table
Return type: List[aztk.models.Task]

get_recent_job(id)[source]

Get the most recently run job in an Azure Batch job schedule

Parameters: id (str) – the id of the job schedule
Returns: the most recently run job on the job schedule
Return type: azure.batch.models.Job

get_task_state(id: str, task_name: str)[source]

Get the status of a submitted task

Parameters:
  • id (str) – the id of the cluster the task was submitted to
  • task_name (str) – the name of the task to get
Returns:

the state of the task

Return type:

str

list_batch_tasks(id: str)[source]

List the tasks submitted to a cluster

Parameters: id (str) – the id of the cluster the tasks were submitted to
Returns: list of aztk tasks
Return type: List[aztk.models.Task]

get_batch_task(id: str, task_id: str)[source]

Get a task submitted to a cluster

Parameters:
  • id (str) – the id of the cluster the task was submitted to
  • task_id (str) – the id of the task to get
Returns:

aztk Task representing the Batch Task

Return type:

aztk.models.Task

class aztk.client.cluster.CoreClusterOperations(context)[source]

Bases: aztk.client.base.base_operations.BaseOperations

create(cluster_configuration: aztk.models.cluster_configuration.ClusterConfiguration, software_metadata_key: str, start_task, vm_image_model)[source]

Create a cluster.

Parameters:
  • cluster_configuration (aztk.models.ClusterConfiguration) – Configuration for the cluster to be created
  • software_metadata_key (str) – the key for the primary software that will be run on the cluster
  • start_task (azure.batch.models.StartTask) – Batch StartTask definition to configure the Batch Pool
  • vm_image_model (azure.batch.models.VirtualMachineConfiguration) – Configuration of the virtual machine image and settings
Returns:

A Cluster object representing the state and configuration of the cluster.

Return type:

aztk.models.Cluster

get(id: str)[source]

Get the state and configuration of a cluster

Parameters: id (str) – the id of the cluster to get.
Returns: A Cluster object representing the state and configuration of the cluster.
Return type: aztk.models.Cluster

copy(id, source_path, destination_path=None, container_name=None, internal=False, get=False, timeout=None)[source]

Copy files to or from every node in a cluster.

Parameters:
  • id (str) – the id of the cluster to copy files with.
  • source_path (str) – the path of the file to copy from.
  • destination_path (str, optional) – the local directory path where the output should be written. If None, a SpooledTemporaryFile will be returned in the NodeOutput object, else the file will be written to this path. Defaults to None.
  • container_name (str, optional) – the name of the container to copy to or from. If None, the copy operation will occur on the host VM. Defaults to None.
  • internal (bool, optional) – if True, this will connect to the node using its internal IP. Only use this if running within the same VNET as the cluster. Defaults to False.
  • get (bool, optional) – If True, the file is downloaded from every node in the cluster. Otherwise, the file is copied from the client to every node. Defaults to False.
  • timeout (int, optional) – The timeout in seconds for establishing a connection to the node. Defaults to None.
Returns:

A list of NodeOutput objects representing the output of the copy command.

Return type:

List[aztk.models.NodeOutput]

delete(id: str, keep_logs: bool = False)[source]

Delete a cluster.

Parameters:
  • id (str) – the id of the cluster to delete
  • keep_logs (bool) – If True, the logs related to this cluster in Azure Storage are not deleted. Defaults to False.
Returns:

True if the deletion was successful

Return type:

bool

list(software_metadata_key)[source]

List clusters running the specified software.

Parameters: software_metadata_key (str) – the key of the primary software running on the cluster. This filters out non-aztk clusters and aztk clusters running other software.
Returns: list of clusters running the software defined by software_metadata_key
Return type: List[aztk.models.Cluster]

wait(id, task_name)[source]

Wait until the task has completed

Parameters:
  • id (str) – the id of the job the task was submitted to
  • task_name (str) – the name of the task to wait for
Returns:

None
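
Waiting for a task is commonly built as a polling loop over its state. The sketch below shows that general technique and is not AZTK's implementation of wait: wait_for_task is a hypothetical helper, the state names ("active", "running", "completed") are assumed for illustration, and unlike the documented wait, which returns None, this helper returns the final state so the example can show it.

```python
import time


def wait_for_task(get_task_state, id, task_name, poll_interval=0.0, timeout=None,
                  terminal_states=("completed",)):
    """Poll get_task_state until it returns a terminal state or the timeout expires."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        state = get_task_state(id, task_name)
        if state in terminal_states:
            return state
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_name!r} still {state!r} after {timeout}s")
        time.sleep(poll_interval)


# Hypothetical state sequence standing in for a real task's progress.
states = iter(["active", "running", "running", "completed"])
polls = []


def fake_get_task_state(id, task_name):
    """Return the next scripted state and record that a poll happened."""
    state = next(states)
    polls.append(state)
    return state


final_state = wait_for_task(fake_get_task_state, "my-job", "my-task")
print(final_state, len(polls))
```

Passing the state lookup in as a callable keeps the loop independent of any particular client; with a real operations object, its get_task_state method could be supplied instead of the fake.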

class aztk.client.job.CoreJobOperations(context)[source]

Bases: aztk.client.base.base_operations.BaseOperations

submit(job_configuration, start_task, job_manager_task, autoscale_formula, software_metadata_key: str, vm_image_model, application_metadata)[source]

Submit a job

A job is a cluster definition plus one or more application definitions that run on the cluster. The job’s cluster will be allocated and configured, then the applications will be executed with their output stored in Azure Storage. When all applications have completed, the cluster will be automatically deleted.

Parameters:
  • job_configuration (aztk.models.JobConfiguration) – Model defining the job’s configuration.
  • start_task (azure.batch.models.StartTask) – Batch StartTask definition to configure the Batch Pool
  • job_manager_task (azure.batch.models.JobManagerTask) – Batch JobManagerTask definition to schedule the defined applications on the cluster.
  • autoscale_formula (str) – formula that defines the number of nodes allocated to the cluster.
  • software_metadata_key (str) – the key of the primary software running on the cluster.
  • vm_image_model (azure.batch.models.VirtualMachineConfiguration) – Configuration of the virtual machine image and settings
  • application_metadata (List[str]) – list of the names of all applications that will be run as a part of the job
Returns:

Model representing the Azure Batch JobSchedule state.

Return type:

azure.batch.models.CloudJobSchedule

aztk.error module

Contains all errors used in AZTK. All errors should inherit from AztkError.

exception aztk.error.AztkError[source]

Bases: Exception

exception aztk.error.AztkAttributeError[source]

Bases: aztk.error.AztkError

exception aztk.error.ClusterNotReadyError[source]

Bases: aztk.error.AztkError

exception aztk.error.AzureApiInitError[source]

Bases: aztk.error.AztkError

exception aztk.error.InvalidPluginConfigurationError[source]

Bases: aztk.error.AztkError

exception aztk.error.InvalidModelError(message: str, model=None)[source]

Bases: aztk.error.AztkError

exception aztk.error.MissingRequiredAttributeError(message: str, model=None)[source]

Bases: aztk.error.InvalidModelError

exception aztk.error.InvalidPluginReferenceError(message: str, model=None)[source]

Bases: aztk.error.InvalidModelError

exception aztk.error.InvalidModelFieldError(message: str, model=None, field=None)[source]

Bases: aztk.error.InvalidModelError
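
Because every exception above derives from AztkError, and the model-related ones derive from InvalidModelError, callers can choose how broadly to catch. The sketch below mirrors part of the documented hierarchy with local stand-in classes so that it runs without the package installed. The class bodies and the classify helper are assumptions for illustration; with AZTK installed, the classes would be imported from aztk.error instead.

```python
class AztkError(Exception):
    """Local stand-in for aztk.error.AztkError."""


class ClusterNotReadyError(AztkError):
    """Local stand-in for aztk.error.ClusterNotReadyError."""


class InvalidModelError(AztkError):
    """Local stand-in; the constructor follows the documented (message, model=None)."""

    def __init__(self, message: str, model=None):
        super().__init__(message)
        self.message = message
        self.model = model


class MissingRequiredAttributeError(InvalidModelError):
    """Local stand-in for aztk.error.MissingRequiredAttributeError."""


def classify(error: Exception) -> str:
    """Show which except clause handles a given error, most specific first."""
    try:
        raise error
    except InvalidModelError:
        return "invalid model"
    except AztkError:
        return "aztk error"
    except Exception:
        return "unrelated error"


print(classify(MissingRequiredAttributeError("vm_size is required")))
print(classify(ClusterNotReadyError("cluster is still starting")))
print(classify(ValueError("not an AZTK error")))
```

Ordering the except clauses from most to least specific matters: placing AztkError first would also swallow the model errors.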