module Namesplit

Cleans the string sent by the user, which is returned later on.

Extraction based on first names.

List of first names.

For the simplest split cases.

Adds the titleize method to normalize certain name formats.

Constants

FIRST_NAMES
VERSION

Public Class Methods

split(full_name)

Public: This method drives the split of a full name into last name and first names step by step. The available strategies are tried one after another until one of them succeeds.

# File lib/namesplit.rb, line 15
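def self.split(full_name)
  # Start from an empty result; quality 0 means nothing could be split.
  @result = OpenStruct.new(quality: 0)
  return @result if full_name.nil?
  clean(full_name)

  # Try each strategy in decreasing order of confidence and stop at the
  # first one that fills in first_names.
  with_first_names
  with_uppercasing if @result.first_names.nil?
  with_space if @result.first_names.nil?

  clean_output
end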
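
The fallback chain described above can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the SplitSketch module, its three strategy methods and the three-entry name list are hypothetical stand-ins, not the gem's real code or its FIRST_NAMES data. Only the quality scores (0.9, 0.8, 0.3) and the order of the strategies are taken from the listings on this page.

```ruby
# Illustrative sketch only; none of these names exist in Namesplit.
module SplitSketch
  KNOWN_FIRST_NAMES = %w[Jean Marie Pierre].freeze

  # Strategy 1: words found in the first-name list are first names.
  def self.by_first_name(words)
    first, last = words.partition { |w| KNOWN_FIRST_NAMES.include?(w.capitalize) }
    return nil if first.empty?

    { first_names: first.join(" "), last_name: last.join(" "), quality: 0.9 }
  end

  # Strategy 2: fully uppercase words are the last name.
  def self.by_uppercase(words)
    last, first = words.partition { |w| w == w.upcase }
    return nil if last.empty? || first.empty?

    { first_names: first.join(" "), last_name: last.join(" "), quality: 0.8 }
  end

  # Strategy 3: the first word is the first name, the rest is the last name.
  def self.by_space(words)
    { first_names: words.first, last_name: words.drop(1).join(" "), quality: 0.3 }
  end

  # Try the strategies in order and keep the first non-nil answer.
  def self.split(full_name)
    words = full_name.to_s.split(" ")
    %i[by_first_name by_uppercase by_space].each do |strategy|
      result = public_send(strategy, words)
      return result if result
    end
    nil
  end
end
```

With this sketch, SplitSketch.split("Jean DUPONT") is resolved by the name list, SplitSketch.split("Zoltan KOVACS") falls through to the uppercase rule, and SplitSketch.split("zoltan kovacs") ends at the positional rule with the lowest quality score.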
titleize(target)

Public: Converts a string to title case.

# File lib/namesplit/titleize.rb, line 6
def self.titleize(target)
  # Each word is processed one at a time, after splitting the phrase on
  # spaces.
  words = target.to_s.split(" ").map do |word|
    # Hyphens and inner apostrophes are handled so that every fragment is
    # capitalized.
    word.split("-").map! do |sub_word|
      sub_word.split("'").map! do |subsub_word|
        titleize_word(subsub_word)
      end.join("'")
    end.join("-")
  end

  words.join(" ")
end
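
To make the effect of the nested splits concrete, here is a minimal sketch of the same idea. It is an approximation, not the gem's implementation: titleize_sketch is a hypothetical name, and it relies on String#capitalize where the listing above calls titleize_word.

```ruby
# Sketch: title-case every fragment delimited by spaces, hyphens and
# straight apostrophes. String#capitalize stands in for titleize_word.
def titleize_sketch(text)
  text.to_s.split(" ").map do |word|
    word.split("-").map do |part|
      part.split("'").map(&:capitalize).join("'")
    end.join("-")
  end.join(" ")
end

puts titleize_sketch("jean-pierre D'ALEMBERT")
```

Like the listing above, the sketch only splits on the straight apostrophe ('), so a typographic apostrophe (’) is not treated as a separator.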
with_first_names()
# File lib/namesplit/first_names.rb, line 4
def self.with_first_names
  words = @full_name.split(" ")

  # Keep every word whose titleized form appears in the first-name list.
  first_names = words.select { |word| FIRST_NAMES.include?(titleize(word)) }
  return if first_names.empty?

  # Every remaining word is treated as part of the last name.
  @result.first_names = first_names.join(" ")
  @result.last_name = (words - first_names).join(" ")
  @result.quality = 0.9
end

Private Class Methods

clean(full_name)

Private: Strips punctuation from the string.

# File lib/namesplit/clean.rb, line 8
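def self.clean(full_name)
  # Build a new string so the caller's argument is never mutated.
  @full_name = full_name.to_s.gsub(/[!.,?()]|\s-\s/, " ")
  # Collapse runs of whitespace into a single space.
  @full_name = @full_name.gsub(/\s+/, " ")

  @result.full_name = titleize(@full_name)
end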
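
The following sketch isolates the punctuation rule so its effect on typical input is easier to see. clean_sketch is a hypothetical helper written for this illustration; it uses a character class that is assumed to match the same characters as the alternation in the listing above.

```ruby
# Sketch: replace ! . , ? ( ) and space-surrounded hyphens with a space,
# then collapse whitespace. Hyphens inside a word are left alone.
def clean_sketch(text)
  text.to_s.gsub(/[!.,?()]|\s-\s/, " ").gsub(/\s+/, " ")
end

puts clean_sketch("Jean-Pierre - Dupont")
puts clean_sketch("Dupont, Jean (M.)")
```

A trailing space can remain after cleaning, as in the second call. That is harmless for the rest of the pipeline, because the later steps split on spaces.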
clean_output()
# File lib/namesplit/clean.rb, line 16
def self.clean_output
  @result.first_names = titleize(@result.first_names)
  @result.first_name = @result.first_names.split(" ")[0]
  @result.last_name = titleize(@result.last_name)

  @result
end
titleize_word(word)

Private: Lowercases a word and capitalizes its first letter.

Example:

titleize_word("BOÎTE") # => "Boîte"
# File lib/namesplit/titleize.rb, line 29
def self.titleize_word(word)
  # Explicit map for accented capitals, so the result does not depend on
  # String#downcase handling non-ASCII characters.
  accents = {
    "É" => "é", "È" => "è", "Ê" => "ê", "Ë" => "ë", "À" => "à",
    "Â" => "â", "Ï" => "ï", "Î" => "î", "Ô" => "ô", "Ù" => "ù",
    "Û" => "û", "Ü" => "ü", "Ç" => "ç", "Ö" => "ö", "Ÿ" => "ÿ"
  }

  word.chars.map.with_index do |char, index|
    if index == 0
      char.upcase
    else
      # Use the accent map first, then fall back to a plain downcase.
      accents[char] || char.downcase
    end
  end.join
end
uppercase_percentage()

Private: Returns the proportion (between 0 and 1) of ASCII letters in the phrase that are uppercase. Accented letters and punctuation are ignored.

# File lib/namesplit/simple_split.rb, line 46
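def self.uppercase_percentage
  # Only ASCII letters are counted; accents and punctuation are dropped.
  name = @full_name.gsub(/[^A-Za-z]/, "")
  # Avoid 0 / 0.0 (NaN) when the name contains no ASCII letters.
  return 0.0 if name.empty?

  name.gsub(/[^A-Z]/, "").size / name.size.to_f
end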
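
As a worked example of this ratio, the sketch below computes it for a few inputs. uppercase_ratio is a hypothetical standalone helper, and the guard for input without ASCII letters is an assumption added for the sketch.

```ruby
# Sketch: share of ASCII capitals among the ASCII letters of a string.
def uppercase_ratio(text)
  letters = text.gsub(/[^A-Za-z]/, "")
  return 0.0 if letters.empty? # assumption: treat "no letters" as 0.0

  letters.count("A-Z") / letters.size.to_f
end

puts uppercase_ratio("Jean DUPONT") # 7 capitals out of 10 letters
puts uppercase_ratio("JEAN DUPONT") # every letter is a capital
```

"Jean DUPONT" contains ten ASCII letters, seven of them uppercase, so the ratio is 0.7 and stays below the 0.95 cutoff used by with_uppercasing. "JEAN DUPONT" gives 1.0, which is above the cutoff.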
with_space()

Private: Used when the string is made up of only two elements and neither of them is recognized as a first name or a last name.

# File lib/namesplit/simple_split.rb, line 9
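def self.with_space
  words = @full_name.split(" ")
  # The first word is taken as the first name. shift removes only that
  # occurrence, so a repeated word ("Martin Martin") keeps its last name.
  @result.first_names = words.shift

  @result.last_name = words.join(" ")
  @result.quality = 0.3
end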
with_uppercasing()

Private: Detects the person's last name from the use of uppercase letters. To avoid problems with “-” or accented capitals such as “À”, the decision is based on uppercase ratios rather than on an exact match.

# File lib/namesplit/simple_split.rb, line 22
def self.with_uppercasing
  words = @full_name.split(" ")
  # When almost everything is uppercase, case carries no information.
  return if uppercase_percentage > 0.95

  last_name = ""
  last_index = 0
  words.each.with_index do |word, index|
    # Share of ASCII capitals among the characters of the word.
    ratio = word.gsub(/[^A-Z]/, "").size / word.size.to_f

    next if ratio < 0.7
    # Only keep words adjacent to the previous uppercase word.
    next if last_name != "" && last_index != index - 1
    last_name << " " + word
    last_index = index
  end

  return if last_name == ""
  last_name.strip!
  last_name.split(" ").each { |w| words.delete(w) }
  @result.last_name = last_name
  @result.first_names = words.join(" ")
  @result.quality = 0.8
end
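
The two rules at work here, a per-word threshold of 0.7 and the requirement that the uppercase words be adjacent, can be sketched in isolation. split_by_uppercase is a hypothetical helper, not part of Namesplit. It leaves out the global 0.95 check and removes the last-name words by index rather than by value.

```ruby
# Sketch: the first contiguous run of mostly-uppercase words is the last
# name; every other word is a first name. Returns nil when no word qualifies.
def split_by_uppercase(full_name, threshold: 0.7)
  words = full_name.split(" ")
  run = [] # indexes of the contiguous uppercase run

  words.each_with_index do |word, index|
    ratio = word.count("A-Z") / word.size.to_f
    next if ratio < threshold
    next unless run.empty? || run.last == index - 1

    run << index
  end
  return nil if run.empty?

  rest = (0...words.size).to_a - run
  [words.values_at(*rest).join(" "), words.values_at(*run).join(" ")]
end

first_names, last_name = split_by_uppercase("Jean-Pierre DE LA FONTAINE")
puts "#{first_names} | #{last_name}"
```

In "MARTIN Jean DUPONT" only "MARTIN" is kept as the last name, because "DUPONT" is not adjacent to the first uppercase word. The listing above makes the same choice through its last_index check.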