class Aws::SageMaker::Types::HyperParameterTrainingJobDefinition

Defines the training jobs launched by a hyperparameter tuning job.

@note When making an API call, you may pass HyperParameterTrainingJobDefinition data as a hash:

    {
      definition_name: "HyperParameterTrainingJobDefinitionName",
      tuning_objective: {
        type: "Maximize", # required, accepts Maximize, Minimize
        metric_name: "MetricName", # required
      },
      hyper_parameter_ranges: {
        integer_parameter_ranges: [
          {
            name: "ParameterKey", # required
            min_value: "ParameterValue", # required
            max_value: "ParameterValue", # required
            scaling_type: "Auto", # accepts Auto, Linear, Logarithmic, ReverseLogarithmic
          },
        ],
        continuous_parameter_ranges: [
          {
            name: "ParameterKey", # required
            min_value: "ParameterValue", # required
            max_value: "ParameterValue", # required
            scaling_type: "Auto", # accepts Auto, Linear, Logarithmic, ReverseLogarithmic
          },
        ],
        categorical_parameter_ranges: [
          {
            name: "ParameterKey", # required
            values: ["ParameterValue"], # required
          },
        ],
      },
      static_hyper_parameters: {
        "HyperParameterKey" => "HyperParameterValue",
      },
      algorithm_specification: { # required
        training_image: "AlgorithmImage",
        training_input_mode: "Pipe", # required, accepts Pipe, File
        algorithm_name: "ArnOrName",
        metric_definitions: [
          {
            name: "MetricName", # required
            regex: "MetricRegex", # required
          },
        ],
      },
      role_arn: "RoleArn", # required
      input_data_config: [
        {
          channel_name: "ChannelName", # required
          data_source: { # required
            s3_data_source: {
              s3_data_type: "ManifestFile", # required, accepts ManifestFile, S3Prefix, AugmentedManifestFile
              s3_uri: "S3Uri", # required
              s3_data_distribution_type: "FullyReplicated", # accepts FullyReplicated, ShardedByS3Key
              attribute_names: ["AttributeName"],
            },
            file_system_data_source: {
              file_system_id: "FileSystemId", # required
              file_system_access_mode: "rw", # required, accepts rw, ro
              file_system_type: "EFS", # required, accepts EFS, FSxLustre
              directory_path: "DirectoryPath", # required
            },
          },
          content_type: "ContentType",
          compression_type: "None", # accepts None, Gzip
          record_wrapper_type: "None", # accepts None, RecordIO
          input_mode: "Pipe", # accepts Pipe, File
          shuffle_config: {
            seed: 1, # required
          },
        },
      ],
      vpc_config: {
        security_group_ids: ["SecurityGroupId"], # required
        subnets: ["SubnetId"], # required
      },
      output_data_config: { # required
        kms_key_id: "KmsKeyId",
        s3_output_path: "S3Uri", # required
      },
      resource_config: { # required
        instance_type: "ml.m4.xlarge", # required, accepts ml.m4.xlarge, ml.m4.2xlarge, ml.m4.4xlarge, ml.m4.10xlarge, ml.m4.16xlarge, ml.g4dn.xlarge, ml.g4dn.2xlarge, ml.g4dn.4xlarge, ml.g4dn.8xlarge, ml.g4dn.12xlarge, ml.g4dn.16xlarge, ml.m5.large, ml.m5.xlarge, ml.m5.2xlarge, ml.m5.4xlarge, ml.m5.12xlarge, ml.m5.24xlarge, ml.c4.xlarge, ml.c4.2xlarge, ml.c4.4xlarge, ml.c4.8xlarge, ml.p2.xlarge, ml.p2.8xlarge, ml.p2.16xlarge, ml.p3.2xlarge, ml.p3.8xlarge, ml.p3.16xlarge, ml.p3dn.24xlarge, ml.p4d.24xlarge, ml.c5.xlarge, ml.c5.2xlarge, ml.c5.4xlarge, ml.c5.9xlarge, ml.c5.18xlarge, ml.c5n.xlarge, ml.c5n.2xlarge, ml.c5n.4xlarge, ml.c5n.9xlarge, ml.c5n.18xlarge
        instance_count: 1, # required
        volume_size_in_gb: 1, # required
        volume_kms_key_id: "KmsKeyId",
      },
      stopping_condition: { # required
        max_runtime_in_seconds: 1,
        max_wait_time_in_seconds: 1,
      },
      enable_network_isolation: false,
      enable_inter_container_traffic_encryption: false,
      enable_managed_spot_training: false,
      checkpoint_config: {
        s3_uri: "S3Uri", # required
        local_path: "DirectoryPath",
      },
      retry_strategy: {
        maximum_retry_attempts: 1, # required
      },
    }
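
Most keys in the hash above are optional. As a minimal sketch, the following builds a definition containing only the keys marked `required`. Every value (account ID, image URI, role name, bucket) is a hypothetical placeholder, and the commented-out client call is shown only to indicate where such a hash is typically passed.

```ruby
# Minimal definition: only the keys marked "required" in the full hash.
# All ARNs, URIs, and names below are hypothetical placeholders.
definition = {
  algorithm_specification: {
    training_image: "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-algo:latest",
    training_input_mode: "File", # required, accepts Pipe, File
  },
  role_arn: "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
  output_data_config: {
    s3_output_path: "s3://example-bucket/tuning-output",
  },
  resource_config: {
    instance_type: "ml.m5.xlarge",
    instance_count: 1,
    volume_size_in_gb: 10,
  },
  stopping_condition: {
    max_runtime_in_seconds: 3600,
  },
}

# The hash would then be passed as part of a tuning job request, for example:
#   client.create_hyper_parameter_tuning_job(
#     training_job_definition: definition,
#     # ... other request parameters ...
#   )
```

Optional sections such as `vpc_config` or `checkpoint_config` can be merged into this hash as needed.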

@!attribute [rw] definition_name

The job definition name.
@return [String]

@!attribute [rw] tuning_objective

Defines the objective metric for a hyperparameter tuning job.
Hyperparameter tuning uses the value of this metric to evaluate the
training jobs it launches, and returns the training job that results
in either the highest or lowest value for this metric, depending on
the value you specify for the `Type` parameter.
@return [Types::HyperParameterTuningJobObjective]
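
To illustrate how the objective `type` determines which training job is considered best, here is a small local sketch. The `best_training_job` helper and the `jobs` array are hypothetical and are not part of the SDK; the actual selection is performed by the service.

```ruby
# Picks the job with the highest or lowest metric value, mirroring the
# Maximize / Minimize objective types. `jobs` is a hypothetical array of
# plain hashes, not an SDK response type.
def best_training_job(jobs, objective_type)
  case objective_type
  when "Maximize" then jobs.max_by { |job| job[:metric_value] }
  when "Minimize" then jobs.min_by { |job| job[:metric_value] }
  else raise ArgumentError, "unknown objective type: #{objective_type}"
  end
end

jobs = [
  { name: "job-a", metric_value: 0.81 },
  { name: "job-b", metric_value: 0.93 },
  { name: "job-c", metric_value: 0.77 },
]

best_for_max = best_training_job(jobs, "Maximize")[:name] # => "job-b"
best_for_min = best_training_job(jobs, "Minimize")[:name] # => "job-c"
```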

@!attribute [rw] hyper_parameter_ranges

Specifies ranges of integer, continuous, and categorical
hyperparameters that a hyperparameter tuning job searches. The
hyperparameter tuning job launches training jobs with hyperparameter
values within these ranges to find the combination of values that
results in the training job with the best performance as measured by
the objective metric of the hyperparameter tuning job.

<note markdown="1"> You can specify a maximum of 20 hyperparameters that a
hyperparameter tuning job can search over. Every possible value of a
categorical parameter range counts against this limit.

 </note>
@return [Types::ParameterRanges]
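
The limit described in the note can be checked locally before submitting a request. The sketch below assumes one reading of that note: each integer or continuous range counts once, and each value of a categorical range counts once. The helper name and the sample ranges are hypothetical.

```ruby
MAX_SEARCHABLE_HYPERPARAMETERS = 20 # limit stated in the note above

# Counts entries against the limit. Assumption: one per integer or
# continuous range, plus one per categorical value.
def searchable_hyperparameter_count(ranges)
  integer     = ranges.fetch(:integer_parameter_ranges, []).size
  continuous  = ranges.fetch(:continuous_parameter_ranges, []).size
  categorical = ranges.fetch(:categorical_parameter_ranges, [])
                      .sum { |range| range[:values].size }
  integer + continuous + categorical
end

ranges = {
  integer_parameter_ranges: [
    { name: "max_depth", min_value: "2", max_value: "10" },
  ],
  continuous_parameter_ranges: [
    { name: "eta", min_value: "0.01", max_value: "0.5", scaling_type: "Logarithmic" },
  ],
  categorical_parameter_ranges: [
    { name: "booster", values: %w[gbtree gblinear dart] },
  ],
}

count = searchable_hyperparameter_count(ranges) # 1 + 1 + 3
within_limit = count <= MAX_SEARCHABLE_HYPERPARAMETERS
```

Note that `min_value` and `max_value` are passed as strings, matching the hash shown at the top of this page.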

@!attribute [rw] static_hyper_parameters

Specifies the values of hyperparameters that do not change for the
tuning job.
@return [Hash<String,String>]

@!attribute [rw] algorithm_specification

The HyperParameterAlgorithmSpecification object that specifies the
resource algorithm to use for the training jobs that the tuning job
launches.
@return [Types::HyperParameterAlgorithmSpecification]
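
Each entry in `metric_definitions` pairs a metric name with a regular expression. As a rough local illustration of how such a pattern might capture a value from a training log line, consider the sketch below. The metric name, pattern, and log line are hypothetical, and Ruby's `Regexp` is used only for demonstration; the service performs the real matching and its regex dialect may differ.

```ruby
# Hypothetical metric definition in the shape used by metric_definitions.
metric_definition = {
  name: "validation:accuracy",
  regex: "val_acc=([0-9\\.]+)",
}

# Hypothetical log line emitted by a training container.
log_line = "epoch 3 val_acc=0.912 loss=0.31"

# Apply the pattern locally and read the first capture group as a number.
match = Regexp.new(metric_definition[:regex]).match(log_line)
metric_value = match && match[1].to_f
```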

@!attribute [rw] role_arn

The Amazon Resource Name (ARN) of the IAM role associated with the
training jobs that the tuning job launches.
@return [String]

@!attribute [rw] input_data_config

An array of Channel objects that specify the input for the training
jobs that the tuning job launches.
@return [Array<Types::Channel>]

@!attribute [rw] vpc_config

The VpcConfig object that specifies the VPC that you want the
training jobs that this hyperparameter tuning job launches to
connect to. Control access to and from your training container by
configuring the VPC. For more information, see [Protect Training
Jobs by Using an Amazon Virtual Private Cloud][1].

[1]: https://docs.aws.amazon.com/sagemaker/latest/dg/train-vpc.html
@return [Types::VpcConfig]

@!attribute [rw] output_data_config

Specifies the path to the Amazon S3 bucket where you store model
artifacts from the training jobs that the tuning job launches.
@return [Types::OutputDataConfig]

@!attribute [rw] resource_config

The resources, including the compute instances and storage volumes,
to use for the training jobs that the tuning job launches.

Storage volumes store model artifacts and incremental states.
Training algorithms might also use storage volumes for scratch
space. If you want Amazon SageMaker to use the storage volume to
store the training data, choose `File` as the `TrainingInputMode` in
the algorithm specification. For distributed training algorithms,
specify an instance count greater than 1.
@return [Types::ResourceConfig]

@!attribute [rw] stopping_condition

Specifies a limit to how long a model hyperparameter training job
can run. It also specifies how long a managed spot training job has
to complete. When the job reaches the time limit, Amazon SageMaker
ends the training job. Use this limit to cap model training costs.
@return [Types::StoppingCondition]

@!attribute [rw] enable_network_isolation

Isolates the training container. No inbound or outbound network
calls can be made, except for calls between peers within a training
cluster for distributed training. If network isolation is used for
training jobs that are configured to use a VPC, Amazon SageMaker
downloads and uploads customer data and model artifacts through the
specified VPC, but the training container does not have network
access.
@return [Boolean]

@!attribute [rw] enable_inter_container_traffic_encryption

To encrypt all communications between ML compute instances in
distributed training, set this to `true`. Encryption provides greater
security for distributed training, but training might take longer.
How long it takes depends on the amount of communication between
compute instances, especially if you use a deep learning algorithm
in distributed training.
@return [Boolean]

@!attribute [rw] enable_managed_spot_training

A Boolean indicating whether managed spot training is enabled
(`true`) or not (`false`).
@return [Boolean]

@!attribute [rw] checkpoint_config

Contains information about the output location for managed spot
training checkpoint data.
@return [Types::CheckpointConfig]
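
The settings related to managed spot training are usually configured together. The fragment below is a sketch of how `enable_managed_spot_training`, `stopping_condition`, and `checkpoint_config` might be combined. The bucket name and local path are placeholders, and the consistency check rests on the assumption that the wait time should be at least as long as the runtime limit; the helper is hypothetical and not part of the SDK.

```ruby
# Hypothetical fragment to merge into a training job definition hash.
spot_settings = {
  enable_managed_spot_training: true,
  stopping_condition: {
    max_runtime_in_seconds: 3600,   # limit on how long the job may run
    max_wait_time_in_seconds: 7200, # how long the spot job has to complete
  },
  checkpoint_config: {
    s3_uri: "s3://example-bucket/checkpoints", # placeholder bucket
    local_path: "/opt/ml/checkpoints",         # placeholder path
  },
}

# Assumption: when spot training is enabled, the wait time should be
# greater than or equal to the runtime limit.
def spot_wait_time_consistent?(settings)
  return true unless settings[:enable_managed_spot_training]

  condition = settings.fetch(:stopping_condition, {})
  runtime = condition[:max_runtime_in_seconds]
  wait    = condition[:max_wait_time_in_seconds]
  !runtime.nil? && !wait.nil? && wait >= runtime
end

consistent = spot_wait_time_consistent?(spot_settings)
```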

@!attribute [rw] retry_strategy

The number of times to retry the job when the job fails due to an
`InternalServerError`.
@return [Types::RetryStrategy]

@see docs.aws.amazon.com/goto/WebAPI/sagemaker-2017-07-24/HyperParameterTrainingJobDefinition AWS API Documentation

Constants

SENSITIVE