class Ai::Nlp::Languages
Class to handle multiple languages
Public Class Methods
Initialisation
# File lib/ai/nlp/languages.rb, line 14
def initialize
  @n_gram = NGram.new
end
Public Instance Methods
Returns the currently known languages.
@return An array of Language
# File lib/ai/nlp/languages.rb, line 21
def all
  @languages = Language.all
end
Creates a new language.
@param string name The language name.
@param string input The initial data set.
@return The created language.
# File lib/ai/nlp/languages.rb, line 40
def create_one(name, input)
  language = Language.new(name: name)
  language.update(map: @n_gram.calculate(input).to_h)
end
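The `.to_h` call in create_one suggests that `@n_gram.calculate` returns a list of pairs, which is then stored as a lookup table on the language. A minimal sketch of that conversion, assuming the pairs have the shape `[n_gram, value]` (the sample data and the `build_map` helper are hypothetical and not part of the library):

```ruby
# Hypothetical helper: turns a list of [n_gram, value] pairs into a Hash,
# mirroring the `.to_h` call made in create_one.
def build_map(pairs)
  pairs.to_h
end

# Assumed output shape of NGram#calculate (illustrative data only).
pairs = [["th", 0], ["he", 1], ["in", 2]]
map = build_map(pairs)

puts map["he"]         # value stored for a known n-gram
puts map["zz"].inspect # unknown n-grams look up as nil
```

Because unknown n-grams look up as nil, the scoring methods can skip them with a simple truthiness check.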
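Ranks the available languages by how closely they match the input data set.
@param string input The data set.
@return The matching languages, sorted by ascending score, or an empty array if no language is known.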
# File lib/ai/nlp/languages.rb, line 28
def guess(input)
  all
  return [] if @languages.empty?

  hash = @languages.map { |language| [language, score(input, language)] }.to_h
  sort(hash)
end
Private Instance Methods
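Adds the absolute difference between input_gram_freq and pos to the running total, but only when input_gram_freq is present (not nil).
@param integer input_gram_freq The input gram frequency
@param integer pos The position in the max_compare
@param integer point The current calculated points
@return The updated points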
# File lib/ai/nlp/languages.rb, line 89
def add_frequency(input_gram_freq, pos, point)
  point += (input_gram_freq - pos).abs if input_gram_freq
  point
end
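To make the nil guard concrete, here is the same method lifted out of the class as a standalone function, called once with a value and once with nil. The numbers are arbitrary illustration values:

```ruby
# Standalone copy of add_frequency, taken out of the class for illustration.
def add_frequency(input_gram_freq, pos, point)
  point += (input_gram_freq - pos).abs if input_gram_freq
  point
end

puts add_frequency(3, 1, 10)   # prints 12: value present, adds |3 - 1| = 2
puts add_frequency(nil, 1, 10) # prints 10: value missing, total unchanged
```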
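Calculates the score of an input against a language's n-gram map.
@param integer max_compare The number of input n-grams to compare
@param ngram The n-gram map of the language
@param input_gram The n-grams calculated from the input
@return The score (points)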
# File lib/ai/nlp/languages.rb, line 74
def calculate_point(max_compare, ngram, input_gram)
  point = 0
  (0..max_compare).each do |pos|
    position = input_gram[pos]
    next unless position

    point = add_frequency(ngram[position[0]], pos, point)
  end
  point
end
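The loop can be traced by hand on a small input. The sketch below copies calculate_point and add_frequency as standalone functions and runs them on made-up data. It assumes that input_gram is an array of `[n_gram, value]` pairs and that ngram is a Hash keyed by n-gram; both shapes are inferred from the indexing in the source, not confirmed by it.

```ruby
# Standalone copies of the two private methods, for illustration only.
def add_frequency(input_gram_freq, pos, point)
  point += (input_gram_freq - pos).abs if input_gram_freq
  point
end

def calculate_point(max_compare, ngram, input_gram)
  point = 0
  (0..max_compare).each do |pos|
    position = input_gram[pos]
    next unless position # also skips the index just past the end of the array

    point = add_frequency(ngram[position[0]], pos, point)
  end
  point
end

# Hypothetical data: three input n-grams, two of them known to the language.
input_gram = [["th", 5], ["he", 4], ["zz", 1]]
ngram = { "th" => 0, "he" => 3 }
max_compare = [input_gram.size, 400].min # same cap as in score

# pos 0: "th" -> |0 - 0| = 0
# pos 1: "he" -> |3 - 1| = 2
# pos 2: "zz" -> not in the map, nothing added
puts calculate_point(max_compare, ngram, input_gram) # prints 2
```

The range `0..max_compare` is inclusive, so the last iteration reads one index past the end of input_gram. The `next unless position` guard makes that iteration a no-op.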
# File lib/ai/nlp/languages.rb, line 56
def reject(sorted_languages, hash)
  sorted_languages.reject { |language| hash[language].zero? }
end
Compares a string of characters against a language, using at most the 400 most commonly used groups of letters.
@param string input The data set to compare
@param Language language The Language to compare to
@return The score (points)
# File lib/ai/nlp/languages.rb, line 65
def score(input, language)
  input_gram = @n_gram.calculate(input)
  ngram = language.map
  calculate_point([input_gram.size, 400].min, ngram, input_gram)
end
Sorts the language hash.
@param hash hash The language hash
@return The sorted list of languages
# File lib/ai/nlp/languages.rb, line 51
def sort(hash)
  sorted_languages = @languages.sort_by { |language| hash[language] }
  reject(sorted_languages, hash)
end
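Taken together, sort and reject order the languages by ascending score and then drop every language whose score is zero. A small sketch of that behaviour follows. The `Language` struct, the renamed helpers, and the scores are hypothetical stand-ins, and the language list is passed as an argument instead of being read from `@languages`:

```ruby
# Hypothetical stand-in for the real Language class.
Language = Struct.new(:name)

# Same logic as reject: drop languages whose score is zero.
def reject_zero_scores(sorted_languages, hash)
  sorted_languages.reject { |language| hash[language].zero? }
end

# Same logic as sort, with the language list passed in explicitly.
def sort_languages(languages, hash)
  sorted = languages.sort_by { |language| hash[language] }
  reject_zero_scores(sorted, hash)
end

english = Language.new("english")
french  = Language.new("french")
klingon = Language.new("klingon")

# Made-up scores: lower is ranked first, zero is discarded.
scores = { english => 12, french => 40, klingon => 0 }

ranked = sort_languages([french, klingon, english], scores)
puts ranked.map(&:name).join(", ") # prints "english, french"
```

Note that a zero score removes a language from the result, so the first element of the returned list is the language with the lowest non-zero score.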