class Bones::Algorithm
This class holds one algorithm, which includes a species, a name, and the source C-code.
The algorithm class holds all sorts of information on var- iables. This information is only available after calling the ‘populate’ method, which populates a lists of varia- bles of all sorts: a regular list, a specialized hash, and lists of input/output array variables.
Constants
- ACCELERATED
Constant to set the name of the algorithm’s accelerated version
- ORIGINAL
Constant to set the name of the algorithm’s original version
Attributes
Public Class Methods
This method initializes the class. It gives the new algorithm a name, species and source code. At initiali- zation, this method checks if the name starts with a digit. This is not allowed, so an underscore is added prior to the digit.
# File lib/bones/algorithm.rb 25 def initialize(name, filename, id, species, code) 26 name = '_'+name if name =~ /^\d/ 27 @filename = filename 28 @basename = name 29 @name = (name+'_'+id).gsub(/\W/,'') 30 @id = id 31 @original_name = @name+ORIGINAL 32 @accelerated_name = @name+ACCELERATED 33 @species = species 34 begin 35 @code = C::Statement.parse(code).preprocess 36 rescue 37 @code = C::Statement.parse('{'+code+'}').preprocess 38 end 39 @hash = {} 40 @lists = {:host_name => [],:host_definition => [], :argument_name => [], :argument_definition => [], :golden_name => []} 41 @arrays = Variablelist.new() 42 @constants = Variablelist.new() 43 @merge_factor = 0 44 @register_caching_enabled = 1 45 @function_code = '' 46 @function_name = '' 47 48 # Set the initial hash 49 @hash = {:algorithm_id => @id, 50 :algorithm_name => @name, 51 :algorithm_basename => @basename, 52 :algorithm_filename => @filename} 53 end
Public Instance Methods
This method is used to generate verification code. This verification code contains a copy of the original code. It also provides a verification which compares the output of the original code with the output of the generated code. The verification code prints warnings if the outputs are not equal, else it prints a success message.
# File lib/bones/algorithm.rb 512 def generate_replacement_code(options, skeleton, verify_code, prefix, timer_start, timer_stop) 513 replacement = C::NodeArray.new 514 replacement.push(C::ExpressionStatement.parse(@accelerated_name+'('+@lists[:host_name]+');')) 515 original_definition = '' 516 verify_definitions = [] 517 if options[:verify] 518 guesses = @arrays.map { |array| array.guess } 519 if guesses.include?(true) 520 puts WARNING+'Verification not supported for this class' 521 else 522 523 # Generate the replacement code and the original function 524 @arrays.each do |array| 525 replacement.insert(0,C::ExpressionStatement.parse("memcpy(#{array.golden_name},#{array.name},#{array.size.join('*')}*sizeof(#{array.type_name}));")) 526 replacement.insert(0,C::Declaration.parse(array.definition.gsub!(/\b#{array.name}\b/,array.golden_name)+array.initialization)) 527 end 528 replacement.push(C::ExpressionStatement.parse(@original_name+'('+@lists[:golden_name]+');')) 529 original_definition = "void #{@original_name}(#{@lists[:host_definition]})" 530 body = "#{timer_start}#{NL} // Original code#{NL}#{@code}#{NL}#{timer_stop}" 531 verify_code.push(prefix+original_definition+' {'+NL+body+'}'+NL+NL) 532 @arrays.select(OUTPUT).each do |array| 533 replacement.push(C::ExpressionStatement.parse(("bones_verify_results_#{array.name}_#{@id}(#{array.name}#{array.flatten},#{array.golden_name}#{array.flatten},#{@hash[:argument_name]});").remove_extras)) 534 end 535 @arrays.each do |array| 536 replacement.push(C::ExpressionStatement.parse("free(#{array.golden_name});")) if array.dynamic? 537 end 538 539 # Generate the verification function itself 540 @arrays.select(OUTPUT).each_with_index do |array,num_array| 541 minihash = @hash["out#{num_array}".to_sym] 542 minihash[:name] = minihash[:name]+'_'+@id 543 minihash[:argument_definition] = @hash[:argument_definition] 544 instantiated_skeleton = search_and_replace(minihash,skeleton) 545 verify_definitions.push(instantiated_skeleton.scan(/#{START_DEFINITION}(.+)#{END_DEFINITION}/m).join.strip.remove_extras) 546 verify_code.push(instantiated_skeleton.remove_extras.gsub!(/#{START_DEFINITION}(.+)#{END_DEFINITION}/m,'')) 547 end 548 end 549 end 550 return replacement, original_definition, verify_definitions.join(NL) 551 end
Helper function to create a the special code which is required for OpenCL function calls to be able to use kernel arguments.
# File lib/bones/algorithm.rb 360 def opencl_arguments(list,kernel_id) 361 return '' if list == '' 362 argument_string = '' 363 list.split(', ').each_with_index do |variable,id| 364 argument_string += 'clSetKernelArg(bones_kernel_'+@name+'_'+kernel_id.to_s+',bones_num_args+'+id.to_s+',sizeof('+variable.strip+'),(void*)&'+variable.strip+');'+NL+INDENT 365 end 366 return argument_string 367 end
This method performs the code transformations according to the transformation settings as provided as an argument to the function. It calls the various code transformation functions as implemented for the CAST class. The resulting modified code is finally stored in the search-and-replace hash. This method assumes that the populate method has already been called, such that the hash contains the dimensions needed to create the global ID definitions.
# File lib/bones/algorithm.rb 79 def perform_transformations(transformation_settings) 80 complexity = 0 81 82 # Save the original code (with flattened arrays) in the hash as well 83 new_code = @code.clone 84 @arrays.each do |array| 85 new_code.transform_flatten(array) 86 end 87 @hash[:algorithm_code0] = new_code.to_s 88 89 # Loop over the number of transformation 'blocks' 90 transformation_settings.split(' ').each_with_index do |transformation,num_transformation| 91 new_code = @code.clone 92 extra_indent = '' 93 94 # Replace existing loops in the code (always do this) 95 array = @arrays.representative 96 array.species.dimensions.each_with_index do |dimension,num_dimension| 97 index = (array.species.reverse?) ? num_dimension : array.species.dimensions.length-num_dimension-1 98 index_reverse = !(array.species.reverse?) ? num_dimension : array.species.dimensions.length-num_dimension-1 99 100 # Calculate the loop start and end conditions 101 from = array.species.from_at(index) 102 to = array.species.to_at(index) 103 104 # Process the existing code and update the hash 105 if from != to 106 new_code, loop_variable_name = new_code.remove_loop(from,to) 107 new_variable_name = GLOBAL_ID+'_'+index_reverse.to_s 108 new_code.replace_variable(loop_variable_name,new_variable_name) 109 update_hash(loop_variable_name) 110 end 111 end 112 113 # Shuffle the indices of the first input(s) (conditionally do this) 114 shuffle_arrays = [] 115 if transformation[0,1] == '2' 116 shuffle_arrays.push(@arrays.select(INPUT)[0]) 117 elsif transformation[0,1] == '3' 118 shuffle_arrays.push(@arrays.select(INPUT)[0]) 119 shuffle_arrays.push(@arrays.select(INPUT)[1]) 120 end 121 new_code.transform_shuffle(shuffle_arrays) 122 123 # Use the local on-chip memory (conditionally do this) 124 if transformation[0,1] == '1' 125 local_memory_arrays = [@arrays.select(INPUT)[0]] 126 new_code.transform_use_local_memory(local_memory_arrays) 127 end 128 129 # Flatten the arrays to 1D (always do this) 130 @arrays.each do |array| 131 new_code.transform_flatten(array) 132 end 133 134 # Perform array substitution a.k.a. register caching (conditionally do this) 135 if @register_caching_enabled == 1 136 @arrays.outputs.each do |array| 137 if array.species.element? 138 if @arrays.inputs.include?(array) 139 new_code.transform_substitution(array,true) 140 else 141 new_code.transform_substitution(array,false) 142 end 143 extra_indent = INDENT 144 end 145 end 146 end 147 148 # Perform transformations for reduction operations (conditionally do this) 149 if transformation[1,1].to_i >= 1 150 input = @arrays.select(INPUT)[0] 151 @arrays.select(OUTPUT).each do |output| 152 if output.species.shared? 153 new_code = new_code.transform_reduction(input,output,transformation[1,1].to_i) 154 end 155 end 156 end 157 158 # Perform thread-merging (experimental) 159 # TODO: Solve the problem related to constants (e.g chunk/example1.c) 160 if @merge_factor == 0 161 if transformation[0,1] == '4' && @hash[:parallelism].to_i >= 1024*1024 162 @merge_factor = 4 163 else 164 @merge_factor = 1 165 end 166 end 167 if @merge_factor > 1 168 #puts @hash[:parallelism] 169 if new_code.has_conditional_statements? 170 puts MESSAGE+'Not coarsening ('+@merge_factor.to_s+'x) because of conditional statements in kernel body.' 171 # TODO: Fix this temporary hack for multiple loops with mismatching bounds 172 elsif ((@hash[:parallelism].to_i % @merge_factor) != 0) || (@hash[:parallelism].to_i == 4192256) 173 puts MESSAGE+'Not coarsening ('+@merge_factor.to_s+'x) because of mismatching amount of parallelism ('+@hash[:parallelism]+').' 174 else 175 puts MESSAGE+'Coarsening threads by a factor '+@merge_factor.to_s+'.' 176 177 # Update the hash 178 @hash[:ids] = @hash[:ids].split(NL).map { |line| 179 C::parse(line).transform_merge_threads(@merge_factor,[GLOBAL_ID]+@constants.map{ |c| c.name }).to_s.split(NL).each_with_index.map do |id,index| 180 id.gsub(/\b#{GLOBAL_ID}\b/,"(#{GLOBAL_ID}+gridDim.x*blockDim.x*#{index})") 181 end 182 }.join(NL+INDENT*2) 183 @hash[:parallelism] = (@hash[:parallelism].to_i / @merge_factor).to_s 184 185 # Transform the code 186 excludes = (@constants+@arrays).map { |c| c.name } 187 new_code.transform_merge_threads(@merge_factor,excludes) 188 end 189 end 190 191 # Obtain the complexity in terms of operations for the resulting code 192 complexity += new_code.get_complexity 193 194 # Store the resulting code in the hash 195 resulting_code = new_code.strip_brackets.to_s 196 @hash[('algorithm_code'+(num_transformation+1).to_s).to_sym] = (transformation[1,1].to_i >= 1) ? resulting_code : extra_indent+INDENT+resulting_code.gsub!(NL,NL+INDENT) 197 end 198 199 @hash[:complexity] = complexity.to_s 200 end
Method to generate performance modeling code. This method is still under construction and will not be called yet. TODO: Complete this method
# File lib/bones/algorithm.rb 556 def performance_model_code(model_dir) 557 558 # Load the profile database 559 profiles = Array.new 560 File.read(File.join(model_dir,'profile.txt')).each do |line| 561 profiles.push(line.split(',')) 562 end 563 564 # Iterate over all the profiles 565 result = C::NodeArray.new 566 profiles.each do |profile| 567 568 # Fill the hash with profile information and species information 569 mini_hash = { 570 :name => profile[0].strip, 571 :comp => profile[1].strip, 572 :coal => profile[2].strip, 573 :unco => profile[3].strip, 574 :copy => profile[4].strip, 575 :f => @hash[:complexity], 576 :w => @hash[:parallelism], 577 :c => @species.all_structures.map { |s| simplify('4*('+s.dimensions.map { |d| sum(d) }.join('*')+')') }.join(' + '), 578 :m => '1', 579 :u => '0', 580 :o => '8' 581 } 582 583 # Load the skeleton for the performance model and set the values according to the hash 584 model_skeleton = File.read(File.join(model_dir,'model.c')) 585 search_and_replace!(mini_hash,model_skeleton) 586 result.push(C::Block.parse(model_skeleton)) 587 end 588 return result 589 end
This method creates the search-and-replace hash based on information provided by the algorithm. It is called from the ‘populate’ method of this class.
List of possible hash keys:¶ ↑
algorithm_id
_name _basename _filename _code*
(in*|out*)_type
_name _devicename _devicepointer _dimensions _dimension*_to _from _sum _to _from _parameters _parameter*_to _from _sum _ids _localids _flatindex
(in|out)_names
_devicenames _devicedefinitions _devicedefinitionsopencl
names devicenames devicedefinitions devicedefinitionsopencl
parallelism factors ids verifyids
argument_name argument_definition kernel_argument_list
# File lib/bones/algorithm.rb 248 def populate_hash 249 @hash[:argument_name] = @lists[:argument_name] 250 @hash[:argument_definition] = @lists[:argument_definition] 251 252 # Obtain the necessary data for the hash per array 253 parallelisms = [] 254 DIRECTIONS.each do |direction| 255 arrays = @arrays.select(direction) 256 arrays.each_with_index do |array,num_array| 257 hashid = "#{direction}#{num_array}".to_sym 258 259 # Gather the name and type data 260 minihash = {:type => array.type_name, 261 :name => array.name, 262 :devicepointer => array.device_pointer, 263 :devicename => array.device_name, 264 :flatindex => array.flatindex} 265 266 # Gather the dimensions data 267 dimensions = array.species.dimensions 268 dimensions.each_with_index do |dimension,num_dimension| 269 minihash["dimension#{num_dimension}".to_sym] = {:sum => simplify(sum(dimension)), 270 :from => simplify(from(dimension)), 271 :to => simplify(to(dimension))} 272 end 273 minihash[:dimensions] = simplify(dimensions.map { |d| sum(d) }.join('*')) 274 minihash[:from] = dimensions.map { |d| from(d) }.zip(array.factors.drop(1).reverse).map { |e| simplify(e.join('')) }.join('+') 275 minihash[:to ] = dimensions.map { |d| to(d) }.zip(array.factors.drop(1).reverse).map { |e| simplify(e.join('')) }.join('+') 276 277 # Gather the parameter data 278 if array.species.has_parameter? 279 parameters = array.species.parameters 280 parameters.each_with_index do |parameter,num_parameter| 281 minihash["parameter#{num_parameter}".to_sym] = {:sum => simplify(sum(parameter)), 282 :from => simplify(from(parameter)), 283 :to => simplify(to(parameter))} 284 end 285 minihash[:parameters] = simplify(parameters.map { |p| sum(p) }.join('*')) 286 end 287 288 # Store the data into the hash 289 @hash[hashid] = minihash 290 291 # Gather information regarding the parallelism 292 if array.species.chunk? 293 dim_div = simplify(minihash[:dimensions]+'/'+minihash[:parameters]) 294 parallelisms.push([dim_div,hashid,0]) 295 elsif array.species.element? || array.species.neighbourhood? 296 parallelisms.push([minihash[:dimensions],hashid,1]) 297 end 298 299 # Populate the global ID definitions hash, create the proper indices (and store as '{in/out}*_ids' in the hash) 300 ids, localids, verifyids, factors = [], [], [], [''] 301 dimensions = array.species.dimensions.clone 302 dimensions.each_with_index do |dimension,num_dimension| 303 index = (array.species.reverse?) ? num_dimension : array.species.dimensions.length-num_dimension-1 304 index_reverse = !(array.species.reverse?) ? num_dimension : array.species.dimensions.length-num_dimension-1 305 306 # Generate the index expressions 307 divider = (array.species.chunk?) ? '/'+sum(array.species.parameters[index]) : '' 308 dimensions_hash = (index == dimensions.length-1) ? '1' : dimensions.drop(index+1).map { |d| sum(d) }.join('*') 309 dimensions_hash = simplify(dimensions_hash) 310 dimensions_division = (dimensions_hash == '1') ? '' : '/('+dimensions_hash+')' 311 minihash = {:dimensions1 => "#{GLOBAL_ID}#{dimensions_division}", 312 :dimensions2 => "#{LOCAL_ID }#{dimensions_division}", 313 :modulo => (index_reverse != dimensions.length-1) ? '%('+simplify(sum(dimension)+divider)+')' : '', 314 :offset => simplify(from(dimension))} 315 expr_global = search_and_replace(minihash,"((<dimensions1>)<modulo>)+<offset>") 316 expr_local = search_and_replace(minihash,"((<dimensions2>)<modulo>)+<offset>") 317 318 # Selectively push the ID definitions to the result array 319 from = array.species.from_at(index) 320 to = array.species.to_at(index) 321 verifyids.push("const int #{GLOBAL_ID}_#{index_reverse} = "+expr_global+';') 322 if from != to 323 ids.push("const int #{GLOBAL_ID}_#{index_reverse} = "+expr_global+';') 324 localids.push("const int #{LOCAL_ID }_#{index_reverse} = "+expr_local+';') 325 factors.push(array.factors[index_reverse]) 326 end 327 end 328 329 # Store the results in the hash 330 @hash[hashid][:ids] = ids.join(NL+INDENT*2) 331 @hash[hashid][:localids] = localids.join(NL+INDENT*2) 332 @hash[hashid][:verifyids] = verifyids.join(NL+INDENT*2) 333 @hash[hashid][:factors] = factors.last 334 end 335 336 # Create lists of array names and definitions 337 @hash["#{direction}_devicedefinitions".to_sym] = arrays.map { |a| a.device_definition }.uniq.join(', ') 338 @hash["#{direction}_devicedefinitionsopencl".to_sym] = arrays.map { |a| '__global '+a.device_definition }.uniq.join(', ') 339 @hash["#{direction}_devicenames".to_sym] = arrays.map { |a| a.device_name }.uniq.join(', ') 340 @hash["#{direction}_names".to_sym] = arrays.map { |a| a.name }.uniq.join(', ') 341 end 342 @hash[:devicedefinitions] = @arrays.map { |a| a.device_definition }.uniq.join(', ') 343 @hash[:devicedefinitionsopencl] = @arrays.map { |a| '__global '+a.device_definition }.uniq.join(', ') 344 @hash[:devicenames] = @arrays.map { |a| a.device_name }.uniq.join(', ') 345 @hash[:names] = @arrays.map { |a| a.name }.uniq.join(', ') 346 347 # Set the parallelism for the complete species, first sort them according to priorities and then find the maximum 348 # TODO: Remove the 'reverse' statement and get the 'ids' part working correctly for chunks 349 # TODO: How to find the maximum of symbolic expressions? 350 parallelisms = parallelisms.reverse.sort_by { |p| p[2] } 351 parallelism = parallelisms.reverse.max_by { |p| p[0].to_i } 352 @hash[:parallelism] = parallelism[0] 353 @hash[:ids] = @hash[parallelism[1]][:ids] 354 @hash[:factors] = @hash[parallelism[1]][:factors] 355 @arrays.set_representative(parallelism[1]) 356 end
Method to populate 5 lists with variable information. Below are listed the names of the four lists with an example value:
- host_name
-
Example: ‘array’
- host_definition
-
Example: ‘int array[10]’
- argument_name
-
Example: ‘threshold’
- argument_definition
-
Example: ‘float threshold’
- golden_name
-
Example: ‘golden_array’
# File lib/bones/algorithm.rb 490 def populate_lists 491 @constants.each do |variable| 492 @lists[:host_name] .push(variable.name) 493 @lists[:host_definition] .push(variable.definition) 494 @lists[:argument_name] .push(variable.name) 495 @lists[:argument_definition].push(variable.definition) 496 @lists[:golden_name] .push(variable.name) 497 end 498 @arrays.each do |variable| 499 @lists[:host_name] .push(variable.name) 500 @lists[:host_definition] .push(variable.definition) 501 @lists[:golden_name] .push(variable.golden_name) 502 end 503 @lists.each { |name,list| @lists[name] = list.join(', ') } 504 end
Method to create a list of variables for the current algorithm. These variables should hold two conditions: 1) they are not local to the algorithm’s code, and 2), they are used in the algorithm’s code.
The method gets a lists of undefined variables in the algorithm’s code and subsequently searches the original code for the definition of this variable.
# File lib/bones/algorithm.rb 400 def populate_variables(original_code,defines) 401 @code.undefined_variables.each do |name| 402 type = @function_code.variable_type(name) 403 raise_error('Variable '+name+' not declared in original code') if !type 404 size = original_code.size(name) 405 direction = @code.direction(name) 406 size.map! { |s| simplify(replace_defines(s,defines)) } 407 variable = Variable.new(name,type,size,direction,@id,@species.shared?) 408 (variable.dimensions > 0) ? @arrays.push(variable) : @constants.push(variable) 409 end 410 raise_error('No input nor output arrays detected, make sure they are properly defined') if arrays.empty? 411 412 DIRECTIONS.each do |direction| 413 species = @species.structures(direction) 414 if direction == INPUT && @species.shared? 415 arrays = @arrays.inputs_only 416 else 417 arrays = @arrays.select(direction) 418 end 419 if !arrays.empty? 420 421 # Check if the amount of input/ouput arrays is equal to the amount of input/output species 422 if species.length < arrays.length 423 array_names = arrays.map { |a| a.name }.join('","') 424 raise_error(direction.capitalize+'put array count mismatch (expected '+species.length.to_s+', found '+arrays.length.to_s+' ["'+array_names+'"])') 425 end 426 427 # Set the species for the arrays (distinguish between arrays with and without a name) 428 species.each do |structure| 429 430 # Loop over all found arrays and match it with a species 431 array = nil 432 arrays.each do |free_array| 433 if !free_array.species 434 if structure.has_arrayname? 435 if structure.name == free_array.name 436 array = free_array 437 break 438 end 439 else 440 array = free_array 441 break 442 end 443 end 444 end 445 446 # Still haven't found anything, assign the species to an array of equal name 447 if !array 448 arrays.each do |free_array| 449 array = free_array if structure.name == free_array.name 450 end 451 end 452 453 # Still haven't found anything, raise an error 454 if !array 455 raise_error("Could not find a matching array in C-code for a species with name '#{species.first.name}'") 456 end 457 458 # Process the assignment 459 array.species = structure 460 raise_error("Species of '#{array.species.name}' is mismatched with array '#{array.name}'") if array.species.name != array.name 461 462 # Check if the array size was set, if not, it will be set to the species' size 463 if array.size.empty? 464 array.size = array.species.dimensions.map { |d| sum(d) } 465 array.guess = true 466 puts WARNING+'Could not determine size for array "'+array.name+'" automatically, assuming: '+array.size.inspect+'.' 467 end 468 469 # Set the multiplication factors (for later) 470 array.set_factors 471 end 472 end 473 end 474 475 # Sort the arrays according to the alphabet 476 if @arrays.length > 1 477 @arrays.sort_by(['chunk','neighbourhood','element','shared','full']) 478 end 479 end
This method sets the code and name for the function in which the algorithm is found. This is done based on the original code, which is given as input to this method. The method does not return any value, instead, it sets two class variables (@function_code and @function_name).
# File lib/bones/algorithm.rb 60 def set_function(full_code) 61 full_code.get_functions.each do |function| 62 if function.node_exists?(@code) 63 @function_code = function 64 @function_name = function.name 65 end 66 end 67 raise_error("Incorrect code found in body of #{@name}, something wrong with the classification?") if @function_code == "" 68 end
This method updates the hash after loops are removed from the code. It takes as an argument a loop variable, which it removes from both the ‘:argument_name’ and ‘:argument_ definition’ hash entries.
# File lib/bones/algorithm.rb 373 def update_hash(loop_variable) 374 names = @hash[:argument_name].split(', ') 375 definitions = @hash[:argument_definition].split(', ') 376 # TODO: The following two lines give problems with correlation-k4 377 names.delete(loop_variable.to_s) 378 definitions.each { |definition| definitions.delete(definition) if definition =~ /\b#{loop_variable}\b/ } 379 @hash[:argument_name] = names.join(', ') 380 @hash[:argument_definition] = definitions.join(', ') 381 382 # Now, generate the special code which is required for OpenCL function calls to be able to use kernel arguments. 383 @hash[:kernel_argument_list] = opencl_arguments([@hash[:devicenames],@hash[:argument_name]].join(', ').remove_extras,0) 384 @hash[:kernel_argument_list_in] = opencl_arguments(@hash[:in_devicenames],0) 385 @hash[:kernel_argument_list_out] = opencl_arguments(@hash[:out_devicenames],0) 386 @hash[:kernel_argument_list_constants] = opencl_arguments(@hash[:argument_name],0) 387 388 # Add declarations for the loop variables for the original code in the hash 389 @hash[:algorithm_code0] = INDENT+"int #{loop_variable};"+NL+@hash[:algorithm_code0] 390 end