module Sequel::Impala::DatabaseMethods
Public Instance Methods
# File lib/sequel/adapters/shared/impala.rb
def compute_stats(table_name)
  run(compute_stats_sql(table_name))
end
Do not use a composite primary key, foreign keys, or an index when creating a join table, as Impala doesn't support those.
# File lib/sequel/adapters/shared/impala.rb
def create_join_table(hash, options=OPTS)
  keys = hash.keys.sort_by(&:to_s)
  create_table(join_table_name(hash, options), options) do
    keys.each do |key|
      Integer key
    end
  end
end
Create a database/schema in Impala.
Options:
- :if_not_exists: Don't raise an error if the schema already exists.
- :location: Set the file system location to store the data for tables in the created schema.
Examples:
create_schema(:s) # CREATE SCHEMA `s`
create_schema(:s, :if_not_exists=>true) # CREATE SCHEMA IF NOT EXISTS `s`
create_schema(:s, :location=>'/a/b') # CREATE SCHEMA `s` LOCATION '/a/b'
# File lib/sequel/adapters/shared/impala.rb
def create_schema(schema, options=OPTS)
  run(create_schema_sql(schema, options))
end
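The way the two options shape the generated statement can be sketched without a database connection. The following is a minimal, standalone approximation: sketch_create_schema_sql and its two quoting helpers are hypothetical names introduced here, and the helpers are naive stand-ins for the adapter's quote_identifier and literal, not Sequel's real implementations.

```ruby
# Naive stand-ins for the adapter's quoting helpers (assumptions for
# illustration only, not Sequel's implementations).
def sketch_quote_identifier(name)
  "`#{name}`"
end

def sketch_literal(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

# Assemble CREATE SCHEMA the same way the options are described above:
# IF NOT EXISTS goes before the name, LOCATION after it.
def sketch_create_schema_sql(schema, options = {})
  sql = "CREATE SCHEMA "
  sql += "IF NOT EXISTS " if options[:if_not_exists]
  sql += sketch_quote_identifier(schema)
  sql += " LOCATION #{sketch_literal(options[:location])}" if options[:location]
  sql
end

puts sketch_create_schema_sql(:s)
puts sketch_create_schema_sql(:s, :if_not_exists => true)
puts sketch_create_schema_sql(:s, :location => '/a/b')
```

Both options may be combined, in which case IF NOT EXISTS and LOCATION appear in the same statement.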
Set the database_type for this database to :impala.
# File lib/sequel/adapters/shared/impala.rb
def database_type
  :impala
end
Return the DESCRIBE output for the table, showing table columns, types, and comments. If the :formatted option is given, use DESCRIBE FORMATTED and return a lot more information about the table. Both of these return arrays of hashes.
Examples:
describe(:t) # DESCRIBE `t`
describe(:t, :formatted=>true) # DESCRIBE FORMATTED `t`
# File lib/sequel/adapters/shared/impala.rb
def describe(table, opts=OPTS)
  if ds = opts[:dataset]
    ds = ds.naked
  else
    ds = dataset.clone
    ds.identifier_input_method = identifier_input_method
  end
  ds.identifier_output_method = nil
  ds.with_sql("DESCRIBE #{'FORMATTED ' if opts[:formatted]} ?", table).all
end
Drop a database/schema from Impala.
Options:
- :if_exists: Don't raise an error if the schema doesn't exist.
Examples:
drop_schema(:s) # DROP SCHEMA `s`
drop_schema(:s, :if_exists=>true) # DROP SCHEMA IF EXISTS `s`
# File lib/sequel/adapters/shared/impala.rb
def drop_schema(schema, options=OPTS)
  run(drop_schema_sql(schema, options))
end
Implicitly qualify the table if using the :search_path option. This looks at all of the tables and views in the search path schemas; if an unqualified table is used and appears in one of those schemas, it is implicitly qualified with that schema's name.
# File lib/sequel/adapters/shared/impala.rb
def implicit_qualify(table)
  return table unless opts[:search_path]

  case table
  when Symbol
    s, t, a = Sequel.split_symbol(table)
    if s
      return table
    end
    t = implicit_qualify(t)
    a ? Sequel.as(t, a) : t
  when String
    if schema = search_path_table_schemas[table]
      Sequel.qualify(schema, table)
    else
      invalidate_table_schemas
      if schema = search_path_table_schemas[table]
        Sequel.qualify(schema, table)
      else
        Sequel.identifier(table)
      end
    end
  when SQL::Identifier
    implicit_qualify(table.value.to_s)
  when SQL::AliasedExpression
    SQL::AliasedExpression.new(implicit_qualify(table.expression), table.alias)
  else
    table
  end
end
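The String branch above looks the table up in a cached table-to-schema map, and on a miss it invalidates the cache and looks once more before giving up. A minimal, database-free model of that lookup-then-refresh pattern might look like the following. SearchPathSketch is a hypothetical class written for illustration; it returns plain strings instead of Sequel qualified identifiers, and the loader block stands in for the SHOW TABLES queries.

```ruby
# Hypothetical model of the cached lookup with a single refresh on a miss.
class SearchPathSketch
  # The loader block must return a Hash of table name => schema name.
  def initialize(&loader)
    @loader = loader
    @cache = nil
  end

  # Load the map lazily and keep it until it is invalidated.
  def table_schemas
    @cache ||= @loader.call
  end

  def invalidate
    @cache = nil
  end

  # Qualify the table if a schema is known; refresh the cache once on a miss.
  def qualify(table)
    schema = table_schemas[table]
    unless schema
      invalidate
      schema = table_schemas[table]
    end
    schema ? "#{schema}.#{table}" : table
  end
end

catalog = { "t1" => "s1" }
sketch = SearchPathSketch.new { catalog.dup }
puts sketch.qualify("t1")    # found in the cached map
catalog["t2"] = "s2"         # table created after the cache was filled
puts sketch.qualify("t2")    # found after the one-time refresh
puts sketch.qualify("nope")  # still unknown, returned unqualified
```

One consequence of this design is that every lookup of an unknown table triggers a reload of the map, so misses cost more than hits.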
# File lib/sequel/adapters/shared/impala.rb
def invalidate_table_schemas
  @search_path_table_schemas = nil
end
Load data from HDFS into Impala.
Options:
- :overwrite: Overwrite the existing table instead of appending to it.
Examples:
load_data('/user/foo', :bar) # LOAD DATA INPATH '/user/foo' INTO TABLE `bar`
load_data('/user/foo', :bar, :overwrite=>true) # LOAD DATA INPATH '/user/foo' OVERWRITE INTO TABLE `bar`
# File lib/sequel/adapters/shared/impala.rb
def load_data(path, table, options=OPTS)
  run(load_data_sql(path, table, options))
end
# File lib/sequel/adapters/shared/impala.rb
def refresh(table_name)
  run(refresh_sql(table_name))
end
Sets options in the current database connection, running one SET statement for each key/value pair.
# File lib/sequel/adapters/shared/impala.rb
def set(opts)
  set_sql(opts).each do |sql|
    run(sql)
  end
end
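As a rough illustration of the statements this produces, here is a standalone sketch that mirrors the mapping done by set_sql in the private methods. sketch_set_sql is a hypothetical name, and the option names passed to it are only examples. Note that keys and values are interpolated as they are, so no quoting or escaping is applied.

```ruby
# Hypothetical standalone mirror of set_sql: one SET statement per pair.
# Keys and values are interpolated verbatim, without quoting.
def sketch_set_sql(opts)
  opts.map { |k, v| "SET #{k}=#{v}" }
end

# The option names below are example inputs, not a list of supported options.
sketch_set_sql(:mem_limit => '2g', :num_nodes => 1).each { |sql| puts sql }
```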
Impala supports CREATE TABLE IF NOT EXISTS.
# File lib/sequel/adapters/shared/impala.rb
def supports_create_table_if_not_exists?
  true
end
Impala does not support foreign keys.
# File lib/sequel/adapters/shared/impala.rb
def supports_foreign_key_parsing?
  false
end
Impala does not support indexes.
# File lib/sequel/adapters/shared/impala.rb
def supports_index_parsing?
  false
end
Check that the tables returned by the JDBC driver are actually valid tables and not views. The Hive2 JDBC driver returns views when listing tables and nothing when listing views.
# File lib/sequel/adapters/shared/impala.rb
def tables(opts=OPTS)
  _tables(opts).select{|t| is_valid_table?(t, opts)}
end
Impala doesn't support transactions, so instead of issuing a transaction, just check out a connection. This ensures the same connection is used for the transaction block, but as Impala doesn't support transactions, you can't roll back.
# File lib/sequel/adapters/shared/impala.rb
def transaction(opts=OPTS)
  synchronize(opts[:server]) do |c|
    yield c
  end
end
Creates a dataset that uses the VALUES clause:
DB.values([[1, 2], [3, 4]]) # VALUES ((1, 2), (3, 4))
# File lib/sequel/adapters/shared/impala.rb
def values(v)
  @default_dataset.clone(:values=>v)
end
Determine the available views by listing all tables via JDBC (which includes both tables and views) and removing all valid tables.
# File lib/sequel/adapters/shared/impala.rb
def views(opts=OPTS)
  _tables(opts).reject{|t| is_valid_table?(t, opts)}
end
Private Instance Methods
# File lib/sequel/adapters/shared/impala.rb
def _tables(opts)
  m = output_identifier_meth
  metadata_dataset.with_sql("SHOW TABLES#{" IN #{quote_identifier(opts[:schema])}" if opts[:schema]}").
    select_map(:name).map do |table|
    m.call(table)
  end
end
Impala uses ADD COLUMNS instead of ADD COLUMN. As its use of ADD COLUMNS implies, it supports adding multiple columns at once, but this adapter doesn't offer an API for that.
# File lib/sequel/adapters/shared/impala.rb
def alter_table_add_column_sql(table, op)
  "ADD COLUMNS (#{column_definition_sql(op)})"
end
Impala uses CHANGE instead of having separate RENAME syntax for renaming columns. As CHANGE requires a type, look up the type from the database schema.
# File lib/sequel/adapters/shared/impala.rb
def alter_table_rename_column_sql(table, op)
  old_name = op[:name]
  opts = schema(table).find{|x| x.first == old_name}
  opts = opts ? opts.last : {}
  unless opts[:db_type]
    raise Error, "cannot determine database type to use for CHANGE COLUMN operation"
  end
  new_col = op.merge(:type=>opts[:db_type], :name=>op[:new_name])
  "CHANGE #{quote_identifier(old_name)} #{column_definition_sql(new_col)}"
end
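The type lookup that the rename relies on can be illustrated with plain data. The sketch below assumes schema rows shaped as [column_name, info_hash] pairs, which is the shape the method above reads via first and last. sketch_rename_column_sql is a hypothetical helper, and it emits only the name and type, whereas the real column_definition_sql may add further clauses.

```ruby
# Hypothetical, simplified model of the rename: find the existing column's
# database type in the schema rows and reuse it in the CHANGE clause.
def sketch_rename_column_sql(schema_rows, old_name, new_name)
  _, info = schema_rows.find { |name, _| name == old_name }
  db_type = info && info[:db_type]
  # Without a known type there is nothing valid to put after the new name.
  raise ArgumentError, "cannot determine database type for #{old_name}" unless db_type
  "CHANGE `#{old_name}` `#{new_name}` #{db_type}"
end

# Example schema rows; the column names and types are made up.
rows = [[:id, { :db_type => "int" }], [:name, { :db_type => "string" }]]
puts sketch_rename_column_sql(rows, :name, :full_name)
```

Renaming a column that is missing from the schema rows raises instead of emitting a CHANGE clause without a type.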
# File lib/sequel/adapters/shared/impala.rb
def alter_table_set_column_type_sql(table, op)
  "CHANGE #{quote_identifier(op[:name])} #{column_definition_sql(op)}"
end
Add COMMENT when defining the column, if :comment is present.
# File lib/sequel/adapters/shared/impala.rb
def column_definition_comment_sql(sql, column)
  sql << " COMMENT #{literal(column[:comment])}" if column[:comment]
end
# File lib/sequel/adapters/shared/impala.rb
def column_definition_order
  [:comment]
end
# File lib/sequel/adapters/shared/impala.rb
def compute_stats_sql(table_name)
  "COMPUTE STATS #{quote_schema_table(table_name)}"
end
# File lib/sequel/adapters/shared/impala.rb
def create_schema_sql(schema, options)
  "CREATE SCHEMA #{'IF NOT EXISTS ' if options[:if_not_exists]}#{quote_identifier(schema)}#{" LOCATION #{literal(options[:location])}" if options[:location]}"
end
Support using table parameters for CREATE TABLE AS, necessary for creating parquet files from datasets.
# File lib/sequel/adapters/shared/impala.rb
def create_table_as_sql(name, sql, options)
  "#{create_table_prefix_sql(name, options)}#{create_table_parameters_sql(options)} AS #{sql}"
end
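To make the ordering concrete, the sketch below assembles a CREATE TABLE ... AS statement in the same order as the method above: the table prefix, then the table parameters, then AS and the query. sketch_create_table_as_sql is a hypothetical helper that handles only a subset of the options (:external, :if_not_exists, :stored_as, :location) and uses naive quoting, so treat it as an approximation rather than the adapter's output.

```ruby
# Hypothetical, reduced model of CREATE TABLE AS with table parameters.
# Quoting here is naive and only suitable for illustration.
def sketch_create_table_as_sql(name, select_sql, options = {})
  sql = "CREATE "
  sql += "EXTERNAL " if options[:external]
  sql += "TABLE"
  sql += " IF NOT EXISTS" if options[:if_not_exists]
  sql += " `#{name}`"
  # Table parameters come after the name and before AS.
  sql += " STORED AS #{options[:stored_as]}" if options[:stored_as]
  sql += " LOCATION '#{options[:location]}'" if options[:location]
  "#{sql} AS #{select_sql}"
end

puts sketch_create_table_as_sql(:t_parquet, "SELECT * FROM `t`", :stored_as => :parquet)
```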
# File lib/sequel/adapters/shared/impala.rb
def create_table_parameters_sql(options)
  sql = ""
  sql << " COMMENT #{literal(options[:comment])}" if options[:comment]
  if options[:field_term] || options[:line_term]
    sql << " ROW FORMAT DELIMITED"
    if options[:field_term]
      sql << " FIELDS TERMINATED BY #{literal(options[:field_term])}"
      sql << " ESCAPED BY #{literal(options[:field_escape])}" if options[:field_escape]
    end
    if options[:line_term]
      sql << " LINES TERMINATED BY #{literal(options[:line_term])}"
    end
  end
  sql << " STORED AS #{options[:stored_as]}" if options[:stored_as]
  sql << " LOCATION #{literal(options[:location])}" if options[:location]
  sql
end
# File lib/sequel/adapters/shared/impala.rb
def create_table_prefix_sql(name, options)
  "CREATE #{'EXTERNAL ' if options[:external]}TABLE#{' IF NOT EXISTS' if options[:if_not_exists]} #{quote_schema_table(name)}"
end
# File lib/sequel/adapters/shared/impala.rb
def create_table_sql(name, generator, options)
  sql = super
  sql += create_table_parameters_sql(options)
  sql
end
# File lib/sequel/adapters/shared/impala.rb
def drop_schema_sql(schema, options)
  "DROP SCHEMA #{'IF EXISTS ' if options[:if_exists]}#{quote_identifier(schema)}"
end
Impala folds identifiers to lowercase, quoted or not, and is actually case insensitive, so don't use an identifier input or output method.
# File lib/sequel/adapters/shared/impala.rb
def identifier_input_method_default
  nil
end
# File lib/sequel/adapters/shared/impala.rb
def identifier_output_method_default
  nil
end
Use the Table Type row of the DESCRIBE FORMATTED output to differentiate tables from views: the object is treated as a table unless its type contains VIEW.
# File lib/sequel/adapters/shared/impala.rb
def is_valid_table?(t, opts=OPTS)
  t = [opts[:schema], t].map(&:to_s).join('__').to_sym if opts[:schema]
  rows = describe(t, :formatted=>true)
  if row = rows.find{|r| r[:name].to_s.strip == 'Table Type:'}
    row[:type].to_s.strip !~ /VIEW/
  end
end
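The check in the method body can be exercised against hand-written rows. In the sketch below, sketch_valid_table? is a hypothetical helper that applies the same test to an array of hashes with :name and :type keys. The sample rows, including their padding and type values, are illustrative assumptions rather than captured DESCRIBE FORMATTED output.

```ruby
# Hypothetical helper applying the same Table Type test to plain hashes.
def sketch_valid_table?(rows)
  row = rows.find { |r| r[:name].to_s.strip == 'Table Type:' }
  # Like the original, this yields nil (falsy) when no Table Type row exists.
  row[:type].to_s.strip !~ /VIEW/ if row
end

# Illustrative rows; real output has many more rows and columns.
table_rows = [{ :name => 'Table Type:         ', :type => 'MANAGED_TABLE       ' }]
view_rows  = [{ :name => 'Table Type:         ', :type => 'VIRTUAL_VIEW        ' }]

p sketch_valid_table?(table_rows)
p sketch_valid_table?(view_rows)
p sketch_valid_table?([])
```

Because a missing Table Type row gives a falsy result, such an object would be excluded by tables and kept by views, which both filter on this predicate.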
# File lib/sequel/adapters/shared/impala.rb
def load_data_sql(path, table, options)
  "LOAD DATA INPATH #{literal(path)}#{' OVERWRITE' if options[:overwrite]} INTO TABLE #{literal(table)}"
end
Metadata queries on JDBC use uppercase keys, so set the identifier output method to downcase so that metadata queries work correctly.
# File lib/sequel/adapters/shared/impala.rb
def metadata_dataset
  @metadata_dataset ||= (
    ds = dataset
    ds.identifier_input_method = identifier_input_method_default
    ds.identifier_output_method = :downcase
    ds
  )
end
# File lib/sequel/adapters/shared/impala.rb
def quote_identifiers_default
  true
end
# File lib/sequel/adapters/shared/impala.rb
def refresh_sql(table_name)
  "REFRESH #{quote_schema_table(table_name)}"
end
# File lib/sequel/adapters/shared/impala.rb
def search_path_table_schemas
  @search_path_table_schemas ||= begin
    search_path = opts[:search_path]
    search_path = search_path.split(',') if search_path.is_a?(String)
    table_schemas = {}
    search_path.reverse_each do |schema|
      _tables(:schema=>schema).each do |table|
        table_schemas[table.to_s] = schema.to_s
      end
    end
    table_schemas
  end
end
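Iterating the search path in reverse means that schemas listed earlier overwrite entries from schemas listed later, so the first schema in the path wins when a table name appears in several schemas. The sketch below demonstrates this with an in-memory catalog. sketch_table_schemas and the catalog hash are hypothetical stand-ins for the method above and for the SHOW TABLES queries it runs.

```ruby
# Hypothetical model: catalog maps schema name => list of table names.
def sketch_table_schemas(search_path, catalog)
  search_path = search_path.split(',') if search_path.is_a?(String)
  table_schemas = {}
  # Walk the path backwards so that earlier schemas overwrite later ones.
  search_path.reverse_each do |schema|
    catalog.fetch(schema, []).each do |table|
      table_schemas[table.to_s] = schema.to_s
    end
  end
  table_schemas
end

catalog = { "a" => ["t", "x"], "b" => ["t", "y"] }
map = sketch_table_schemas("a,b", catalog)
puts map["t"]  # "a": both schemas contain t, and a comes first in the path
puts map["y"]  # "b": only b contains y
```

Note that a String search path is split on commas only, so surrounding whitespace in the string would become part of the schema names.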
# File lib/sequel/adapters/shared/impala.rb
def set_sql(opts)
  opts.map { |k, v| "SET #{k}=#{v}" }
end
Impala doesn't like the word “biginteger”.
# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_bignum(column)
  :bigint
end
Impala doesn't like the word “biginteger”.
# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_bignum_symbol(column)
  :bigint
end
Impala doesn't support date columns yet, so use timestamp until date is natively supported.
# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_date(column)
  :timestamp
end
Impala uses double instead of “double precision” for floating point values.
# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_float(column)
  :double
end
Impala doesn't like the word “integer”.
# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_integer(column)
  :int
end
Impala uses decimal instead of numeric for arbitrary precision numeric values.
# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_numeric(column)
  column[:size] ? "decimal(#{Array(column[:size]).join(', ')})" : :decimal
end
Use char or varchar if given a size, otherwise use string. Using a size is not recommended, as Impala doesn't implicitly cast string values to char or varchar, and doesn't implicitly cast between different sizes of varchar.
# File lib/sequel/adapters/shared/impala.rb
def type_literal_generic_string(column)
  if size = column[:size]
    "#{'var' unless column[:fixed]}char(#{size})"
  else
    :string
  end
end
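How the :size and :fixed column options affect the numeric and string type literals can be checked in isolation. The two helpers below are hypothetical standalone copies of the expressions used in type_literal_generic_numeric and type_literal_generic_string, taking a plain hash in place of a column definition.

```ruby
# Hypothetical standalone copy of the numeric type expression.
# :size may be a single precision or a [precision, scale] pair.
def sketch_numeric_type(column)
  column[:size] ? "decimal(#{Array(column[:size]).join(', ')})" : :decimal
end

# Hypothetical standalone copy of the string type expression.
# With a size: char when :fixed is set, varchar otherwise.
def sketch_string_type(column)
  if (size = column[:size])
    "#{'var' unless column[:fixed]}char(#{size})"
  else
    :string
  end
end

p sketch_numeric_type({})
p sketch_numeric_type(:size => [10, 2])
p sketch_string_type({})
p sketch_string_type(:size => 20)
p sketch_string_type(:size => 20, :fixed => true)
```

Without a size, both helpers return a symbol rather than a string, matching the return values in the listings above.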