class Aws::SageMaker::Types::DatasetDefinition
Configuration for Dataset Definition inputs. A Dataset Definition input must specify exactly one of the `AthenaDatasetDefinition` or `RedshiftDatasetDefinition` types.
@note When making an API call, you may pass DatasetDefinition
  data as a hash:

      {
        athena_dataset_definition: {
          catalog: "AthenaCatalog", # required
          database: "AthenaDatabase", # required
          query_string: "AthenaQueryString", # required
          work_group: "AthenaWorkGroup",
          output_s3_uri: "S3Uri", # required
          kms_key_id: "KmsKeyId",
          output_format: "PARQUET", # required, accepts PARQUET, ORC, AVRO, JSON, TEXTFILE
          output_compression: "GZIP", # accepts GZIP, SNAPPY, ZLIB
        },
        redshift_dataset_definition: {
          cluster_id: "RedshiftClusterId", # required
          database: "RedshiftDatabase", # required
          db_user: "RedshiftUserName", # required
          query_string: "RedshiftQueryString", # required
          cluster_role_arn: "RoleArn", # required
          output_s3_uri: "S3Uri", # required
          kms_key_id: "KmsKeyId",
          output_format: "PARQUET", # required, accepts PARQUET, CSV
          output_compression: "None", # accepts None, GZIP, BZIP2, ZSTD, SNAPPY
        },
        local_path: "ProcessingLocalPath",
        data_distribution_type: "FullyReplicated", # accepts FullyReplicated, ShardedByS3Key
        input_mode: "Pipe", # accepts Pipe, File
      }
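The "exactly one of" rule can be checked locally before the hash is sent. The following is a minimal sketch, not part of the SDK: `validate_dataset_definition!` is a hypothetical helper, and the bucket name, query, and paths are placeholder values.

```ruby
# Hypothetical helper (not part of aws-sdk-sagemaker): checks that a
# DatasetDefinition hash names exactly one source definition.
def validate_dataset_definition!(definition)
  source_keys = %i[athena_dataset_definition redshift_dataset_definition]
  present = source_keys.select { |key| definition[key] }
  return definition if present.size == 1

  raise ArgumentError,
        "expected exactly one of #{source_keys.join(', ')}, got #{present.size}"
end

# Placeholder values for illustration only.
athena_input = {
  athena_dataset_definition: {
    catalog: "AthenaCatalog",
    database: "AthenaDatabase",
    query_string: "SELECT 1",
    output_s3_uri: "s3://example-bucket/athena-output/",
    output_format: "PARQUET"
  },
  local_path: "/opt/ml/processing/input/athena",
  input_mode: "File"
}

# Returns the hash unchanged when it is valid, raises ArgumentError otherwise.
validate_dataset_definition!(athena_input)
```

A hash that sets both `athena_dataset_definition` and `redshift_dataset_definition`, or neither, raises `ArgumentError` in this sketch instead of failing later at the service.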
@!attribute [rw] athena_dataset_definition
Configuration for Athena Dataset Definition input.
@return [Types::AthenaDatasetDefinition]
@!attribute [rw] redshift_dataset_definition
Configuration for Redshift Dataset Definition input.
@return [Types::RedshiftDatasetDefinition]
@!attribute [rw] local_path
The local path where you want Amazon SageMaker to download the Dataset Definition inputs to run a processing job. `LocalPath` is an absolute path to the input data. This is a required parameter when `AppManaged` is `False` (default).
@return [String]
@!attribute [rw] data_distribution_type
Whether the generated dataset is `FullyReplicated` or `ShardedByS3Key` (default).
@return [String]
@!attribute [rw] input_mode
Whether to use `File` or `Pipe` input mode. In `File` (default) mode, Amazon SageMaker copies the data from the input source onto the local Amazon Elastic Block Store (Amazon EBS) volumes before starting your training algorithm. This is the most commonly used input mode. In `Pipe` mode, Amazon SageMaker streams input data from the source directly to your algorithm without using the EBS volume.
@return [String]
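As a rough sketch of how these attributes fit together, the code below wraps a DatasetDefinition hash in a processing input of the kind supplied in `processing_inputs` to `Aws::SageMaker::Client#create_processing_job`. The helper name `build_processing_input`, the input name, and every value are assumptions for illustration. The absolute-path check mirrors the `local_path` description above and runs locally; it is not something the SDK does.

```ruby
require "pathname"

# Hypothetical helper (not part of aws-sdk-sagemaker): wraps a
# DatasetDefinition hash in a ProcessingInput-shaped hash.
def build_processing_input(input_name, dataset_definition, app_managed: false)
  local_path = dataset_definition[:local_path]
  # local_path is required when AppManaged is false, and must be absolute.
  unless app_managed || (local_path && Pathname.new(local_path).absolute?)
    raise ArgumentError,
          "local_path must be an absolute path when app_managed is false"
  end

  {
    input_name: input_name,
    app_managed: app_managed,
    dataset_definition: dataset_definition
  }
end

# Placeholder values for illustration only.
redshift_definition = {
  redshift_dataset_definition: {
    cluster_id: "example-cluster",
    database: "dev",
    db_user: "example_user",
    query_string: "SELECT * FROM example_table",
    cluster_role_arn: "arn:aws:iam::123456789012:role/ExampleRole",
    output_s3_uri: "s3://example-bucket/redshift-output/",
    output_format: "CSV"
  },
  local_path: "/opt/ml/processing/input/redshift",
  data_distribution_type: "FullyReplicated",
  input_mode: "File"
}

processing_input = build_processing_input("redshift-input", redshift_definition)
# processing_input can then be placed in the processing_inputs array of a
# create_processing_job request.
```

Keeping the path check next to the hash construction surfaces a relative or missing `local_path` before any request is made.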
@see docs.aws.amazon.com/goto/WebAPI/sagemaker-2017-07-24/DatasetDefinition AWS API Documentation
Constants
- SENSITIVE