machinelearning_create_data_source_from_s3 {paws.machine.learning}    R Documentation

Creates a DataSource object

Description

Creates a DataSource object. A DataSource references data that can be used to perform create_ml_model, create_evaluation, or create_batch_prediction operations.

See https://www.paws-r-sdk.com/docs/machinelearning_create_data_source_from_s3/ for full documentation.

Usage

machinelearning_create_data_source_from_s3(
  DataSourceId,
  DataSourceName = NULL,
  DataSpec,
  ComputeStatistics = NULL
)

Arguments

DataSourceId

[required] A user-supplied identifier that uniquely identifies the DataSource.

DataSourceName

A user-supplied name or description of the DataSource.

DataSpec

[required] The data specification of a DataSource:

  • DataLocationS3 - The Amazon S3 location of the observation data.

  • DataSchemaLocationS3 - The Amazon S3 location of the DataSchema.

  • DataSchema - A JSON string representing the schema. This is not required if DataSchemaLocationS3 is specified.

  • DataRearrangement - A JSON string that represents the splitting and rearrangement requirements for the DataSource.

    Sample - "{\"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}"
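
As a minimal sketch of how the DataSpec fields above might be assembled in R, the following builds the sample DataRearrangement string and places it in a named list. The bucket name and object keys are hypothetical placeholders, and passing DataSpec as a named list is assumed from the usual paws convention for structured arguments.

```r
# Build the DataRearrangement JSON string for a 10%-60% split.
# sprintf() is used so no JSON package is required.
rearrangement <- sprintf(
  "{\"splitting\":{\"percentBegin\":%d,\"percentEnd\":%d}}",
  10L, 60L
)

# Hypothetical S3 locations, for illustration only.
data_spec <- list(
  DataLocationS3 = "s3://example-bucket/observations.csv",
  DataSchemaLocationS3 = "s3://example-bucket/observations.csv.schema",
  DataRearrangement = rearrangement
)
```

Here DataSchema is omitted because the schema is supplied through DataSchemaLocationS3.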

ComputeStatistics

Whether to compute statistics for the DataSource. The statistics are generated from the observation data referenced by the DataSource, and Amazon ML uses them internally during MLModel training. Set this parameter to TRUE if the DataSource will be used for MLModel training.
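
A call that prepares a DataSource for training might look like the sketch below. The wrapper name create_training_source and the identifier shown in the usage comment are hypothetical. The client is assumed to come from paws::machinelearning(), which exposes this operation as create_data_source_from_s3.

```r
# Hypothetical helper: creates a DataSource intended for MLModel training,
# so ComputeStatistics is always set to TRUE.
create_training_source <- function(svc, id, data_spec, name = NULL) {
  svc$create_data_source_from_s3(
    DataSourceId = id,
    DataSourceName = name,
    DataSpec = data_spec,
    ComputeStatistics = TRUE
  )
}

# Usage, assuming the paws package is installed and AWS credentials
# are configured (the ID and S3 paths are placeholders):
# svc <- paws::machinelearning()
# create_training_source(
#   svc,
#   id = "example-ds-001",
#   data_spec = list(
#     DataLocationS3 = "s3://example-bucket/observations.csv",
#     DataSchemaLocationS3 = "s3://example-bucket/observations.csv.schema"
#   )
# )
```

Taking the client as an argument keeps the helper independent of how credentials and regions are configured.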


[Package paws.machine.learning version 0.7.0 Index]