class Aws::SageMaker::Types::Channel
A channel is a named input source that training algorithms can consume.
@note When making an API call, you may pass Channel
  data as a hash:

      {
        channel_name: "ChannelName", # required
        data_source: { # required
          s3_data_source: {
            s3_data_type: "ManifestFile", # required, accepts ManifestFile, S3Prefix, AugmentedManifestFile
            s3_uri: "S3Uri", # required
            s3_data_distribution_type: "FullyReplicated", # accepts FullyReplicated, ShardedByS3Key
            attribute_names: ["AttributeName"],
          },
          file_system_data_source: {
            file_system_id: "FileSystemId", # required
            file_system_access_mode: "rw", # required, accepts rw, ro
            file_system_type: "EFS", # required, accepts EFS, FSxLustre
            directory_path: "DirectoryPath", # required
          },
        },
        content_type: "ContentType",
        compression_type: "None", # accepts None, Gzip
        record_wrapper_type: "None", # accepts None, RecordIO
        input_mode: "Pipe", # accepts Pipe, File
        shuffle_config: {
          seed: 1, # required
        },
      }
@!attribute [rw] channel_name
The name of the channel. @return [String]
@!attribute [rw] data_source
The location of the channel data. @return [Types::DataSource]
@!attribute [rw] content_type
The MIME type of the data. @return [String]
@!attribute [rw] compression_type
If training data is compressed, the compression type. The default value is `None`. `CompressionType` is used only in Pipe input mode. In File mode, leave this field unset or set it to `None`. @return [String]
@!attribute [rw] record_wrapper_type
Specify `RecordIO` as the value when input data is in raw format but the training algorithm requires the RecordIO format. In this case, Amazon SageMaker wraps each individual S3 object in a RecordIO record. If the input data is already in RecordIO format, you don't need to set this attribute. For more information, see [Create a Dataset Using RecordIO][1]. In File mode, leave this field unset or set it to `None`. [1]: https://mxnet.apache.org/api/architecture/note_data_loading#data-format @return [String]
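The File mode guidance for `compression_type` and `record_wrapper_type` can be checked on the client before a request is built. The following is a minimal sketch under stated assumptions: `file_mode_conflicts` is a hypothetical helper, not part of the AWS SDK, and the channel name and S3 URI are placeholders.

```ruby
# Hypothetical helper (not part of the AWS SDK): lists the settings on a
# channel hash that the documentation says to leave unset or "None" when the
# channel explicitly uses File input mode.
def file_mode_conflicts(channel)
  return [] unless channel[:input_mode] == "File"

  [:compression_type, :record_wrapper_type].reject do |key|
    value = channel[key]
    value.nil? || value == "None"
  end
end

# Placeholder channel; the name and S3 URI are illustrative only.
channel = {
  channel_name: "train",
  data_source: {
    s3_data_source: {
      s3_data_type: "S3Prefix",
      s3_uri: "s3://example-bucket/train/",
    },
  },
  input_mode: "File",
  compression_type: "Gzip",
  record_wrapper_type: "None",
}

conflicts = file_mode_conflicts(channel)
puts "Settings to review in File mode: #{conflicts.inspect}"
```

The helper only inspects channels that set `input_mode` explicitly. A channel that omits it inherits the job-level mode, so a fuller check would need that value as well.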
@!attribute [rw] input_mode
(Optional) The input mode to use for the data channel in a training job. If you don't set a value for `InputMode`, Amazon SageMaker uses the value set for `TrainingInputMode`. Use this parameter to override the `TrainingInputMode` setting in an `AlgorithmSpecification` request when you have a channel that needs a different input mode from the training job's general setting. To download the data from Amazon Simple Storage Service (Amazon S3) to the provisioned ML storage volume, and mount the directory to a Docker volume, use `File` input mode. To stream data directly from Amazon S3 to the container, choose `Pipe` input mode. To use a model for incremental training, choose `File` input mode. @return [String]
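The per-channel override described above can be sketched with plain hashes. This is an illustration only: `effective_input_mode` is a hypothetical helper that mirrors the documented fallback on the client side (the service applies the actual rule), and the channel names and S3 URIs are placeholders.

```ruby
# Hypothetical helper: a channel without its own input_mode falls back to the
# job-level TrainingInputMode, as the attribute description states.
def effective_input_mode(channel, training_input_mode)
  channel[:input_mode] || training_input_mode
end

# Job-level setting, as it would appear in AlgorithmSpecification.
training_input_mode = "File"

# Placeholder channels; names and S3 URIs are illustrative only.
input_data_config = [
  {
    channel_name: "train",
    data_source: {
      s3_data_source: { s3_data_type: "S3Prefix", s3_uri: "s3://example-bucket/train/" },
    },
    input_mode: "Pipe", # overrides the job-level mode for this channel
  },
  {
    channel_name: "validation",
    data_source: {
      s3_data_source: { s3_data_type: "S3Prefix", s3_uri: "s3://example-bucket/validation/" },
    },
    # no input_mode: inherits training_input_mode
  },
]

modes = input_data_config.map do |channel|
  [channel[:channel_name], effective_input_mode(channel, training_input_mode)]
end.to_h

puts modes.inspect
```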
@!attribute [rw] shuffle_config
A configuration for a shuffle option for input data in a channel. If you use `S3Prefix` for `S3DataType`, this shuffles the results of the S3 key prefix matches. If you use `ManifestFile`, the order of the S3 object references in the `ManifestFile` is shuffled. If you use `AugmentedManifestFile`, the order of the JSON lines in the `AugmentedManifestFile` is shuffled. The shuffling order is determined using the `Seed` value. For Pipe input mode, shuffling is done at the start of every epoch. With large datasets, this ensures that the order of the training data is different for each epoch, which helps reduce bias and possible overfitting. In a multi-node training job, when `ShuffleConfig` is combined with an `S3DataDistributionType` of `ShardedByS3Key`, the data is shuffled across nodes so that the content sent to a particular node on the first epoch might be sent to a different node on the second epoch. @return [Types::ShuffleConfig]
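The idea that a seed determines the shuffling order can be illustrated locally. The sketch below is not SageMaker's shuffling algorithm; it uses Ruby's own `Random` and `Array#shuffle` to show that a fixed seed gives a reproducible order. `seeded_order`, the object keys, and the S3 URI are hypothetical.

```ruby
# Placeholder channel with a shuffle configuration; only seed is required.
channel = {
  channel_name: "train",
  data_source: {
    s3_data_source: { s3_data_type: "S3Prefix", s3_uri: "s3://example-bucket/train/" },
  },
  input_mode: "Pipe",
  shuffle_config: { seed: 1 },
}

# Hypothetical helper: a local stand-in for seed-determined ordering.
# It does NOT reproduce the order that SageMaker would generate.
def seeded_order(keys, seed)
  keys.shuffle(random: Random.new(seed))
end

# Illustrative object keys, as if matched by the S3 prefix.
object_keys = %w[part-0000 part-0001 part-0002 part-0003 part-0004]

seed = channel[:shuffle_config][:seed]
first_run = seeded_order(object_keys, seed)
second_run = seeded_order(object_keys, seed)

puts "Same order for the same seed: #{first_run == second_run}"
```

Shuffling changes only the order of the items, so each run still contains every key exactly once.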
@see docs.aws.amazon.com/goto/WebAPI/sagemaker-2017-07-24/Channel AWS API Documentation
Constants
- SENSITIVE