class Aws::SageMaker::Types::Channel

A channel is a named input source that training algorithms can consume.

@note When making an API call, you may pass Channel data as a hash:

    {
      channel_name: "ChannelName", # required
      data_source: { # required
        s3_data_source: {
          s3_data_type: "ManifestFile", # required, accepts ManifestFile, S3Prefix, AugmentedManifestFile
          s3_uri: "S3Uri", # required
          s3_data_distribution_type: "FullyReplicated", # accepts FullyReplicated, ShardedByS3Key
          attribute_names: ["AttributeName"],
        },
        file_system_data_source: {
          file_system_id: "FileSystemId", # required
          file_system_access_mode: "rw", # required, accepts rw, ro
          file_system_type: "EFS", # required, accepts EFS, FSxLustre
          directory_path: "DirectoryPath", # required
        },
      },
      content_type: "ContentType",
      compression_type: "None", # accepts None, Gzip
      record_wrapper_type: "None", # accepts None, RecordIO
      input_mode: "Pipe", # accepts Pipe, File
      shuffle_config: {
        seed: 1, # required
      },
    }
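The listing above shows every accepted key at once. As a minimal sketch, a channel backed by a single S3 prefix could be assembled as a plain Ruby hash. The helper name `s3_prefix_channel`, the channel name, and the bucket URI are hypothetical, and supplying only one of `s3_data_source` or `file_system_data_source` is an assumption about typical usage rather than something this page states.

```ruby
# Hypothetical helper (not part of the SDK): builds a Channel hash
# that reads its data from an S3 prefix.
def s3_prefix_channel(name:, s3_uri:, content_type: nil)
  channel = {
    channel_name: name,
    data_source: {
      s3_data_source: {
        s3_data_type: "S3Prefix",
        s3_uri: s3_uri,
        s3_data_distribution_type: "FullyReplicated"
      }
    }
  }
  # Optional keys are added only when a value was supplied.
  channel[:content_type] = content_type if content_type
  channel
end

train_channel = s3_prefix_channel(
  name: "train",
  s3_uri: "s3://example-bucket/train/", # placeholder URI
  content_type: "text/csv"
)
```

A hash like `train_channel` would typically be passed as one element of the `input_data_config` array of a training job request.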

@!attribute [rw] channel_name

The name of the channel.
@return [String]

@!attribute [rw] data_source

The location of the channel data.
@return [Types::DataSource]

@!attribute [rw] content_type

The MIME type of the data.
@return [String]

@!attribute [rw] compression_type

If training data is compressed, the compression type. The default
value is `None`. `CompressionType` is used only in Pipe input mode.
In File mode, leave this field unset or set it to `None`.
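The rule above can be checked on the client side before a request is sent. The following is a sketch only: `compression_valid?` is a hypothetical helper, not an SDK method, and it assumes the caller already knows which input mode applies to the channel.

```ruby
# Hypothetical client-side check (not part of the SDK).
# A compression type other than "None" is accepted only in Pipe mode.
def compression_valid?(compression_type, effective_input_mode)
  compression_type.nil? ||
    compression_type == "None" ||
    effective_input_mode == "Pipe"
end

compression_valid?("Gzip", "Pipe") # => true
compression_valid?("Gzip", "File") # => false
compression_valid?(nil, "File")    # => true
```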
@return [String]

@!attribute [rw] record_wrapper_type

Specify RecordIO as the value when input data is in raw format but
the training algorithm requires the RecordIO format. In this case,
Amazon SageMaker wraps each individual S3 object in a RecordIO
record. If the input data is already in RecordIO format, you don't
need to set this attribute. For more information, see [Create a
Dataset Using RecordIO][1].

In File mode, leave this field unset or set it to `None`.

[1]: https://mxnet.apache.org/api/architecture/note_data_loading#data-format
@return [String]

@!attribute [rw] input_mode

(Optional) The input mode to use for the data channel in a training
job. If you don't set a value for `InputMode`, Amazon SageMaker
uses the value set for `TrainingInputMode`. Use this parameter to
override the `TrainingInputMode` setting in an `AlgorithmSpecification`
request when you have a channel that needs a different input mode
from the training job's general setting. To download the data from
Amazon Simple Storage Service (Amazon S3) to the provisioned ML
storage volume, and mount the directory to a Docker volume, use
`File` input mode. To stream data directly from Amazon S3 to the
container, choose `Pipe` input mode.

To use a model for incremental training, choose `File` input mode.
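The fallback described above (a channel-level value overrides the job-level `TrainingInputMode`) can be sketched as follows. `effective_input_mode` is a hypothetical helper that mirrors the documented behavior on the client side; the service performs the actual resolution, and the channel names are placeholders.

```ruby
# Hypothetical helper (not part of the SDK): returns the channel's own
# input mode if set, otherwise the training job's general setting.
def effective_input_mode(channel, algorithm_specification)
  channel[:input_mode] || algorithm_specification.fetch(:training_input_mode)
end

algorithm_specification = { training_input_mode: "File" }
channels = [
  { channel_name: "train", input_mode: "Pipe" }, # overrides the job setting
  { channel_name: "validation" }                 # inherits the job setting
]

modes = channels.to_h do |channel|
  [channel[:channel_name], effective_input_mode(channel, algorithm_specification)]
end
# "train" resolves to "Pipe"; "validation" falls back to "File".
```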
@return [String]

@!attribute [rw] shuffle_config

A configuration for a shuffle option for input data in a channel. If
you use `S3Prefix` for `S3DataType`, this shuffles the results of
the S3 key prefix matches. If you use `ManifestFile`, the order of
the S3 object references in the `ManifestFile` is shuffled. If you
use `AugmentedManifestFile`, the order of the JSON lines in the
`AugmentedManifestFile` is shuffled. The shuffling order is
determined using the `Seed` value.

For Pipe input mode, shuffling is done at the start of every epoch.
With large datasets, this ensures that the order of the training data
is different for each epoch, which helps reduce bias and possible
overfitting. In a multi-node training job, when `ShuffleConfig` is
combined with `S3DataDistributionType` of `ShardedByS3Key`, the data
is shuffled across nodes so that the content sent to a particular
node on the first epoch might be sent to a different node on the
second epoch.
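To illustrate what it means for the order to be determined by the `Seed` value, here is a small sketch of a seeded, per-epoch shuffle. It is an illustration only: the algorithm Amazon SageMaker uses is not described here, and deriving each epoch's generator from `seed + epoch` is an assumption of this example. The object keys are placeholders.

```ruby
# Illustration only: not SageMaker's actual shuffling algorithm.
# Assumption: each epoch uses a generator seeded with seed + epoch,
# so a given (seed, epoch) pair always yields the same order.
def shuffled_for_epoch(keys, seed, epoch)
  keys.shuffle(random: Random.new(seed + epoch))
end

keys = %w[part-0 part-1 part-2 part-3 part-4]

first_run  = shuffled_for_epoch(keys, 1, 0)
second_run = shuffled_for_epoch(keys, 1, 0)
# first_run and second_run are identical because the seed and epoch match.
```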
@return [Types::ShuffleConfig]

@see docs.aws.amazon.com/goto/WebAPI/sagemaker-2017-07-24/Channel AWS API Documentation

Constants

SENSITIVE