module Sequel::Impala::DatabaseMethods

Public Instance Methods

compute_stats(table_name)

Run COMPUTE STATS for the given table.

# File lib/sequel/adapters/shared/impala.rb
def compute_stats(table_name)
  run(compute_stats_sql(table_name))
end
create_join_table(hash, options=OPTS)

Do not use a composite primary key, foreign keys, or an index when creating a join table, as Impala doesn't support those.

# File lib/sequel/adapters/shared/impala.rb
def create_join_table(hash, options=OPTS)
  keys = hash.keys.sort_by(&:to_s)
  create_table(join_table_name(hash, options), options) do
    keys.each do |key|
      Integer key
    end
  end
end
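
As a rough illustration of the resulting table shape, the following plain-Ruby sketch builds the kind of DDL described above: one integer column per key, sorted by name, with no keys or indexes. `sketch_join_table_sql` is a hypothetical helper, the backtick quoting is simplified, and the table name is passed in explicitly because the adapter derives it separately via `join_table_name`.

```ruby
# Hypothetical helper, not part of the adapter: builds DDL for a join table
# whose columns are plain integers, sorted by column name.
def sketch_join_table_sql(table_name, hash)
  keys = hash.keys.sort_by(&:to_s)
  columns = keys.map { |key| "`#{key}` int" }
  "CREATE TABLE `#{table_name}` (#{columns.join(', ')})"
end

puts sketch_join_table_sql(:categories_items, :item_id => :items, :category_id => :categories)
```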
create_schema(schema, options=OPTS)

Create a database/schema in Impala.

Options:

:if_not_exists

Don't raise an error if the schema already exists.

:location

Set the file system location to store the data for tables in the created schema.

Examples:

create_schema(:s)
# CREATE SCHEMA `s`

create_schema(:s, :if_not_exists=>true)
# CREATE SCHEMA IF NOT EXISTS `s`

create_schema(:s, :location=>'/a/b')
# CREATE SCHEMA `s` LOCATION '/a/b'

# File lib/sequel/adapters/shared/impala.rb
def create_schema(schema, options=OPTS)
  run(create_schema_sql(schema, options))
end
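
The mapping from options to SQL in the examples above can be sketched in plain Ruby, without a database connection. This is a minimal illustration rather than the adapter's code: `sketch_create_schema_sql`, `quote_ident`, and `quote_string` are hypothetical stand-ins for the adapter's own quoting and literalization, and the escaping rules shown are assumptions.

```ruby
# Hypothetical stand-ins for the adapter's identifier quoting and string
# literalization; the real escaping rules may differ.
def quote_ident(name)
  "`#{name.to_s.gsub('`', '``')}`"
end

def quote_string(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

# Assemble CREATE SCHEMA, adding each optional clause only when requested.
def sketch_create_schema_sql(schema, options = {})
  sql = "CREATE SCHEMA "
  sql += "IF NOT EXISTS " if options[:if_not_exists]
  sql += quote_ident(schema)
  sql += " LOCATION #{quote_string(options[:location])}" if options[:location]
  sql
end

puts sketch_create_schema_sql(:s)
puts sketch_create_schema_sql(:s, :if_not_exists => true)
puts sketch_create_schema_sql(:s, :location => '/a/b')
```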
database_type()

Set the database_type for this database to :impala.

# File lib/sequel/adapters/shared/impala.rb
def database_type
  :impala
end
describe(table, opts=OPTS)

Return the DESCRIBE output for the table, showing table columns, types, and comments. If the :formatted option is given, use DESCRIBE FORMATTED and return a lot more information about the table. Both of these return arrays of hashes.

Examples:

describe(:t)
# DESCRIBE `t`

describe(:t, :formatted=>true)
# DESCRIBE FORMATTED `t`

# File lib/sequel/adapters/shared/impala.rb
def describe(table, opts=OPTS)
  if ds = opts[:dataset]
    ds = ds.naked
  else
    ds = dataset.clone
    ds.identifier_input_method = identifier_input_method
  end
  ds.identifier_output_method = nil
  ds.with_sql("DESCRIBE #{'FORMATTED ' if opts[:formatted]} ?", table).all
end
drop_schema(schema, options=OPTS)

Drop a database/schema from Impala.

Options:

:if_exists

Don't raise an error if the schema doesn't exist.

Examples:

drop_schema(:s)
# DROP SCHEMA `s`

drop_schema(:s, :if_exists=>true)
# DROP SCHEMA IF EXISTS `s`

# File lib/sequel/adapters/shared/impala.rb
def drop_schema(schema, options=OPTS)
  run(drop_schema_sql(schema, options))
end
implicit_qualify(table)

Implicitly qualify the table if using the :search_path option. This will look at all of the tables and views in the search path schemas, and if an unqualified table is used and appears in one of those schemas, it will be implicitly qualified with that schema name.

# File lib/sequel/adapters/shared/impala.rb
def implicit_qualify(table)
  return table unless opts[:search_path]

  case table
  when Symbol
    s, t, a = Sequel.split_symbol(table)
    # Already schema-qualified, so leave it alone.
    return table if s
    t = implicit_qualify(t)
    a ? Sequel.as(t, a) : t
  when String
    if schema = search_path_table_schemas[table]
      Sequel.qualify(schema, table)
    else
      # The cached table list may be stale, so rebuild it once and retry.
      invalidate_table_schemas
      if schema = search_path_table_schemas[table]
        Sequel.qualify(schema, table)
      else
        Sequel.identifier(table)
      end
    end
  when SQL::Identifier
    implicit_qualify(table.value.to_s)
  when SQL::AliasedExpression
    SQL::AliasedExpression.new(implicit_qualify(table.expression), table.alias)
  else
    table
  end
end
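
The lookup behind the qualification can be sketched without Sequel. In the sketch below, `build_table_schemas` and `sketch_qualify` are hypothetical helpers, and `sample` is made-up data standing in for the per-schema SHOW TABLES results. The sketch returns plain strings where the adapter returns Sequel expression objects, and it omits the adapter's one-time cache refresh on a miss.

```ruby
# Map each table name to the first schema in the search path that contains it.
def build_table_schemas(search_path, tables_by_schema)
  search_path = search_path.split(',') if search_path.is_a?(String)
  table_schemas = {}
  # Walk the path backwards so schemas listed earlier overwrite later ones.
  search_path.reverse_each do |schema|
    tables_by_schema.fetch(schema.to_s, []).each do |table|
      table_schemas[table.to_s] = schema.to_s
    end
  end
  table_schemas
end

# Qualify the table when it is known; otherwise return it unchanged.
def sketch_qualify(table, table_schemas)
  schema = table_schemas[table.to_s]
  schema ? "#{schema}.#{table}" : table.to_s
end

sample = { 'a' => %w[users orders], 'b' => %w[orders items] }
lookup = build_table_schemas('a,b', sample)
puts sketch_qualify(:orders, lookup)
puts sketch_qualify(:items, lookup)
puts sketch_qualify(:missing, lookup)
```

Because the path is walked in reverse, a table present in both schemas resolves to the schema listed first.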
invalidate_table_schemas()

Clear the cached table-to-schema mapping used by implicit_qualify, so that it is rebuilt on next use.

# File lib/sequel/adapters/shared/impala.rb
def invalidate_table_schemas
  @search_path_table_schemas = nil
end
load_data(path, table, options=OPTS)

Load data from HDFS into Impala.

Options:

:overwrite

Overwrite the existing table instead of appending to it.

Examples:

load_data('/user/foo', :bar)
# LOAD DATA INPATH '/user/foo' INTO TABLE `bar`

load_data('/user/foo', :bar, :overwrite=>true)
# LOAD DATA INPATH '/user/foo' OVERWRITE INTO TABLE `bar`

# File lib/sequel/adapters/shared/impala.rb
def load_data(path, table, options=OPTS)
  run(load_data_sql(path, table, options))
end
refresh(table_name)

Run REFRESH for the given table.

# File lib/sequel/adapters/shared/impala.rb
def refresh(table_name)
  run(refresh_sql(table_name))
end
serial_primary_key_options()

Don't use PRIMARY KEY or AUTOINCREMENT on Impala, as Impala doesn't support either.

# File lib/sequel/adapters/shared/impala.rb
def serial_primary_key_options
  {:type=>Integer}
end
set(opts)

Set options on the current database connection, running one SET statement for each key/value pair.

# File lib/sequel/adapters/shared/impala.rb
def set(opts)
  set_sql(opts).each do |sql|
    run(sql)
  end
end
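
A small sketch of the statements this produces, mirroring the private `set_sql` helper listed later in this page. `sketch_set_sql` is a hypothetical name, and the option names are only example inputs. Note that keys and values are interpolated as-is, with no quoting or escaping, so they should come from trusted input.

```ruby
# One SET statement per key/value pair, in hash order.
def sketch_set_sql(opts)
  opts.map { |k, v| "SET #{k}=#{v}" }
end

puts sketch_set_sql(:mem_limit => '2g', :num_nodes => 1)
```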
supports_create_table_if_not_exists?()

Impala supports CREATE TABLE IF NOT EXISTS.

# File lib/sequel/adapters/shared/impala.rb
def supports_create_table_if_not_exists?
  true
end
supports_foreign_key_parsing?()

Impala does not support foreign keys.

# File lib/sequel/adapters/shared/impala.rb
def supports_foreign_key_parsing?
  false
end
supports_index_parsing?()

Impala does not support indexes.

# File lib/sequel/adapters/shared/impala.rb
def supports_index_parsing?
  false
end
tables(opts=OPTS)

Check that the tables returned by the JDBC driver are actually valid tables and not views. The Hive2 JDBC driver returns views when listing tables and nothing when listing views.

# File lib/sequel/adapters/shared/impala.rb
def tables(opts=OPTS)
  _tables(opts).select{|t| is_valid_table?(t, opts)}
end
transaction(opts=OPTS) { |c| ... }

Impala doesn't support transactions, so instead of issuing a transaction, just check out a connection. This ensures the same connection is used for the transaction block, but as Impala doesn't support transactions, you can't roll back.

# File lib/sequel/adapters/shared/impala.rb
def transaction(opts=OPTS)
  synchronize(opts[:server]) do |c|
    yield c
  end
end
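
The practical consequence is that work done before an error inside the block stays in place. The following sketch illustrates this with a stand-in class. `SketchDB` is hypothetical and uses a Mutex in place of the adapter's connection checkout; it is not how the adapter is implemented.

```ruby
# Hypothetical stand-in: "transaction" only serializes access and yields,
# so nothing is undone when the block raises.
class SketchDB
  attr_reader :rows

  def initialize
    @rows = []
    @lock = Mutex.new
  end

  def transaction
    @lock.synchronize { yield self }
  end
end

db = SketchDB.new
begin
  db.transaction do |c|
    c.rows << :inserted
    raise "boom"
  end
rescue RuntimeError
  # The error propagates, but the earlier write is not rolled back.
end
p db.rows
```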
values(v)

Creates a dataset that uses the VALUES clause:

DB.values([[1, 2], [3, 4]])
# VALUES ((1, 2), (3, 4))

# File lib/sequel/adapters/shared/impala.rb
def values(v)
  @default_dataset.clone(:values=>v)
end
views(opts=OPTS)

Determine the available views by listing all tables via JDBC (which includes both tables and views) and removing all valid tables.

# File lib/sequel/adapters/shared/impala.rb
def views(opts=OPTS)
  _tables(opts).reject{|t| is_valid_table?(t, opts)}
end
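
Since `tables` and `views` apply the same predicate to the same listing, the two results are complementary. A minimal sketch of that split, using made-up names and a hypothetical `sketch_tables_and_views` helper:

```ruby
# Split one combined listing into tables and views with a single predicate.
def sketch_tables_and_views(names, &valid_table)
  tables = names.select(&valid_table)
  views = names.reject(&valid_table)
  [tables, views]
end

# Made-up sample data: which names are real tables and which are views.
sample_kinds = { :orders => :table, :orders_view => :view, :items => :table }
tables, views = sketch_tables_and_views(sample_kinds.keys) { |n| sample_kinds[n] == :table }
p tables
p views
```

In the adapter, each predicate call runs a DESCRIBE FORMATTED query (see is_valid_table?), so both methods issue one query per listed name.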

Private Instance Methods

_tables(opts)
# File lib/sequel/adapters/shared/impala.rb
def _tables(opts)
  m = output_identifier_meth
  metadata_dataset.with_sql("SHOW TABLES#{" IN #{quote_identifier(opts[:schema])}" if opts[:schema]}").
    select_map(:name).map do |table|
      m.call(table)
    end
end
alter_table_add_column_sql(table, op)

Impala uses ADD COLUMNS instead of ADD COLUMN. As its use of ADD COLUMNS implies, it supports adding multiple columns at once, but this adapter doesn't offer an API for that.

# File lib/sequel/adapters/shared/impala.rb
def alter_table_add_column_sql(table, op)
  "ADD COLUMNS (#{column_definition_sql(op)})"
end
alter_table_rename_column_sql(table, op)

Impala uses CHANGE instead of having separate RENAME syntax for renaming columns. As CHANGE requires a type, look up the type from the database schema.

# File lib/sequel/adapters/shared/impala.rb
def alter_table_rename_column_sql(table, op)
  old_name = op[:name]
  opts = schema(table).find{|x| x.first == old_name}
  opts = opts ? opts.last : {}
  unless opts[:db_type]
    raise Error, "cannot determine database type to use for CHANGE COLUMN operation"
  end
  new_col = op.merge(:type=>opts[:db_type], :name=>op[:new_name])
  "CHANGE #{quote_identifier(old_name)} #{column_definition_sql(new_col)}"
end
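
The type lookup can be sketched on its own. Here `sketch_rename_column_sql` is a hypothetical helper, `schema_rows` is made-up data in the shape of `[column_name, {:db_type => ...}]` pairs, and the backtick quoting and the exact text of the column definition are simplifying assumptions.

```ruby
# Find the existing column's type, then reuse it in the CHANGE clause.
def sketch_rename_column_sql(schema_rows, old_name, new_name)
  entry = schema_rows.find { |name, _| name == old_name }
  db_type = entry && entry.last[:db_type]
  raise ArgumentError, "cannot determine database type for #{old_name}" unless db_type
  "CHANGE `#{old_name}` `#{new_name}` #{db_type}"
end

schema_rows = [[:id, { :db_type => 'int' }], [:name, { :db_type => 'string' }]]
puts sketch_rename_column_sql(schema_rows, :name, :full_name)
```

A column that is missing from the schema, or that has no recorded type, raises instead of emitting a CHANGE clause without a type.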
alter_table_set_column_type_sql(table, op)
# File lib/sequel/adapters/shared/impala.rb
def alter_table_set_column_type_sql(table, op)
  "CHANGE #{quote_identifier(op[:name])} #{column_definition_sql(op)}"
end
column_definition_comment_sql(sql, column)

Add COMMENT when defining the column, if :comment is present.

# File lib/sequel/adapters/shared/impala.rb
def column_definition_comment_sql(sql, column)
  sql << " COMMENT #{literal(column[:comment])}" if column[:comment]
end
column_definition_order()
# File lib/sequel/adapters/shared/impala.rb
def column_definition_order
  [:comment]
end
compute_stats_sql(table_name)
# File lib/sequel/adapters/shared/impala.rb
def compute_stats_sql(table_name)
  "COMPUTE STATS #{quote_schema_table(table_name)}"
end
create_schema_sql(schema, options)
# File lib/sequel/adapters/shared/impala.rb
def create_schema_sql(schema, options)
  "CREATE SCHEMA #{'IF NOT EXISTS ' if options[:if_not_exists]}#{quote_identifier(schema)}#{" LOCATION #{literal(options[:location])}" if options[:location]}"
end
create_table_as_sql(name, sql, options)

Support using table parameters for CREATE TABLE AS, necessary for creating parquet files from datasets.

# File lib/sequel/adapters/shared/impala.rb
def create_table_as_sql(name, sql, options)
  "#{create_table_prefix_sql(name, options)}#{create_table_parameters_sql(options)} AS #{sql}"
end
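
To show where the table parameters land in the statement, here is a reduced sketch covering only the :stored_as and :location options. `sketch_table_parameters` and `sketch_create_table_as` are hypothetical names, and the quoting is simplified compared with the adapter's `literal` and `quote_schema_table`.

```ruby
# Build the optional clauses that follow the table name.
def sketch_table_parameters(options)
  sql = String.new
  sql << " STORED AS #{options[:stored_as]}" if options[:stored_as]
  sql << " LOCATION '#{options[:location]}'" if options[:location]
  sql
end

# The parameters sit between the table name and the AS clause.
def sketch_create_table_as(name, select_sql, options = {})
  "CREATE TABLE `#{name}`#{sketch_table_parameters(options)} AS #{select_sql}"
end

puts sketch_create_table_as(:t2, "SELECT * FROM `t1`", :stored_as => :parquet)
```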
create_table_parameters_sql(options)
# File lib/sequel/adapters/shared/impala.rb
def create_table_parameters_sql(options)
  sql = ""
  sql << " COMMENT #{literal(options[:comment])}" if options[:comment]
  if options[:field_term] || options[:line_term]
    sql << " ROW FORMAT DELIMITED"
    if options[:field_term]
      sql << " FIELDS TERMINATED BY #{literal(options[:field_term])}"
      sql << " ESCAPED BY #{literal(options[:field_escape])}" if options[:field_escape]
    end
    if options[:line_term]
      sql << " LINES TERMINATED BY #{literal(options[:line_term])}"
    end
  end
  sql << " STORED AS #{options[:stored_as]}" if options[:stored_as]
  sql << " LOCATION #{literal(options[:location])}" if options[:location]
  sql
end
create_table_prefix_sql(name, options)
# File lib/sequel/adapters/shared/impala.rb
def create_table_prefix_sql(name, options)
  "CREATE #{'EXTERNAL ' if options[:external]}TABLE#{' IF NOT EXISTS' if options[:if_not_exists]} #{quote_schema_table(name)}"
end
create_table_sql(name, generator, options)
Calls superclass method
# File lib/sequel/adapters/shared/impala.rb
def create_table_sql(name, generator, options)
  sql = super
  sql += create_table_parameters_sql(options)
  sql
end
drop_schema_sql(schema, options)
# File lib/sequel/adapters/shared/impala.rb
def drop_schema_sql(schema, options)
  "DROP SCHEMA #{'IF EXISTS ' if options[:if_exists]}#{quote_identifier(schema)}"
end
identifier_input_method_default()

Impala folds identifiers to lowercase, quoted or not, and is actually case insensitive, so don't use an identifier input or output method.

# File lib/sequel/adapters/shared/impala.rb
def identifier_input_method_default
  nil
end
identifier_output_method_default()
# File lib/sequel/adapters/shared/impala.rb
def identifier_output_method_default
  nil
end
is_valid_table?(t, opts=OPTS)

Use the "Table Type:" row of the DESCRIBE FORMATTED output to differentiate tables from views: anything whose type mentions VIEW is not a valid table. Returns nil if no such row is found.

# File lib/sequel/adapters/shared/impala.rb
def is_valid_table?(t, opts=OPTS)
  t = [opts[:schema], t].map(&:to_s).join('__').to_sym if opts[:schema]
  rows = describe(t, :formatted=>true)
  if row = rows.find{|r| r[:name].to_s.strip == 'Table Type:'}
    row[:type].to_s.strip !~ /VIEW/
  end
end
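
The row check in the method body can be exercised on its own. In this sketch, `sketch_valid_table?` is a hypothetical name and the rows are illustrative samples shaped like DESCRIBE FORMATTED output (padded strings under :name and :type); the exact values a given Impala version returns are an assumption.

```ruby
# True for tables, false for views, nil when no 'Table Type:' row exists.
def sketch_valid_table?(rows)
  row = rows.find { |r| r[:name].to_s.strip == 'Table Type:' }
  row && (row[:type].to_s.strip !~ /VIEW/)
end

# Illustrative sample rows, not captured from a real server.
table_rows = [{ :name => 'Table Type:         ', :type => 'MANAGED_TABLE       ' }]
view_rows  = [{ :name => 'Table Type:         ', :type => 'VIRTUAL_VIEW        ' }]

p sketch_valid_table?(table_rows)
p sketch_valid_table?(view_rows)
p sketch_valid_table?([])
```

Because a missing row yields nil, `tables` treats such names as non-tables and `views` includes them.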
load_data_sql(path, table, options)
# File lib/sequel/adapters/shared/impala.rb
def load_data_sql(path, table, options)
  "LOAD DATA INPATH #{literal(path)}#{' OVERWRITE' if options[:overwrite]} INTO TABLE #{literal(table)}"
end
metadata_dataset()

Metadata queries on JDBC use uppercase keys, so set the identifier output method to downcase so that metadata queries work correctly.

# File lib/sequel/adapters/shared/impala.rb
def metadata_dataset
  @metadata_dataset ||= begin
    ds = dataset
    ds.identifier_input_method = identifier_input_method_default
    ds.identifier_output_method = :downcase
    ds
  end
end
quote_identifiers_default()
# File lib/sequel/adapters/shared/impala.rb
def quote_identifiers_default
  true
end
refresh_sql(table_name)
# File lib/sequel/adapters/shared/impala.rb
def refresh_sql(table_name)
  "REFRESH #{quote_schema_table(table_name)}"
end
search_path_table_schemas()
# File lib/sequel/adapters/shared/impala.rb
def search_path_table_schemas
  @search_path_table_schemas ||= begin
    search_path = opts[:search_path]
    search_path = search_path.split(',') if search_path.is_a?(String)
    table_schemas = {}
    # Iterate in reverse so schemas earlier in the search path take precedence.
    search_path.reverse_each do |schema|
      _tables(:schema=>schema).each do |table|
        table_schemas[table.to_s] = schema.to_s
      end
    end
    table_schemas
  end
end
set_sql(opts)
# File lib/sequel/adapters/shared/impala.rb
def set_sql(opts)
  opts.map { |k, v| "SET #{k}=#{v}" }
end
type_literal_generic_bignum(column)

Impala doesn't like the word “biginteger”, so use bigint instead.

# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_bignum(column)
  :bigint
end
type_literal_generic_bignum_symbol(column)

Impala doesn't like the word “biginteger”, so use bigint instead.

# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_bignum_symbol(column)
  :bigint
end
type_literal_generic_date(column)

Impala doesn't support date columns yet, so use timestamp until date is natively supported.

# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_date(column)
  :timestamp
end
type_literal_generic_float(column)

Impala uses double instead of “double precision” for floating point values.

# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_float(column)
  :double
end
type_literal_generic_integer(column)

Impala doesn't like the word “integer”, so use int instead.

# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_integer(column)
  :int
end
type_literal_generic_numeric(column)

Impala uses decimal instead of numeric for arbitrary precision numeric values.

# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_numeric(column)
  column[:size] ? "decimal(#{Array(column[:size]).join(', ')})" : :decimal
end
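
The :size option may be a single precision or a [precision, scale] pair. This standalone copy of the expression shows how each form is rendered. `sketch_numeric_type` is a hypothetical name used only for illustration.

```ruby
# Array() wraps a bare integer so both forms can be joined the same way.
def sketch_numeric_type(column)
  column[:size] ? "decimal(#{Array(column[:size]).join(', ')})" : :decimal
end

p sketch_numeric_type({})
p sketch_numeric_type(:size => 10)
p sketch_numeric_type(:size => [10, 2])
```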
type_literal_generic_string(column)

Use char or varchar if given a size, otherwise use string. Using a size is not recommended, as Impala doesn't implicitly cast string values to char or varchar, and doesn't implicitly cast between different sizes of varchar.

# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_string(column)
  if size = column[:size]
    "#{'var' unless column[:fixed]}char(#{size})"
  else
    :string
  end
end
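
A standalone copy of this branch logic shows the three possible results. `sketch_string_type` is a hypothetical name, and the hashes passed in stand for the column options the adapter would receive.

```ruby
# No size gives string; a size gives varchar, or char when :fixed is set.
def sketch_string_type(column)
  if (size = column[:size])
    "#{'var' unless column[:fixed]}char(#{size})"
  else
    :string
  end
end

p sketch_string_type({})
p sketch_string_type(:size => 10)
p sketch_string_type(:size => 10, :fixed => true)
```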