module Gnomika
Module containing all classes and functions of the application
Public Class Methods
Gets information about all categories from the website.
@return An Array of Category objects
# File lib/gnomikologikon/web_processing.rb, line 13
def self.fetch_category_info
  response = HTTParty.get('https://www.gnomikologikon.gr/categ.php', {
    headers: {"User-Agent" => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"},
  })
  doc = Nokogiri::HTML.parse(response.body)
  # Each "big" category is stored in a table with class "authrst"
  category_tables = doc.xpath("//table[@class='authrst']")
  categories = []
  category_tables.each do |table|
    # Get category name. Category names are stored in td elements with class "authrsh"
    category_name = table.xpath("tr/td[@class='authrsh']").text
    # Get the subcategories of each category
    subcategories = []
    # Subcategories of each category are a elements in a list
    subcategory_elements = table.xpath("tr//ul//li//a")
    subcategory_elements.each do |element|
      subcategory_name = element.content
      # Need to prefix category URLs with the website URL
      subcategory_url = "https://www.gnomikologikon.gr/#{element[:href]}"
      subcategories << Subcategory.new(subcategory_name,subcategory_url)
    end
    categories << Category.new(category_name,subcategories: subcategories)
  end
  categories
end
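The return value is a nested structure: each Category carries a name and a list of Subcategory objects, and each Subcategory carries a name and an absolute URL. The sketch below illustrates that shape only. `SketchCategory`, `SketchSubcategory`, `build_category`, and the sample hrefs are hypothetical stand-ins (the gem's real classes are defined elsewhere and may differ), and `URI.join` is used here as one possible way to build the absolute URL, whereas the documented source uses string interpolation.

```ruby
require "uri"

# Hypothetical stand-ins for the gem's Category and Subcategory classes.
SketchSubcategory = Struct.new(:name, :url)
SketchCategory = Struct.new(:name, :subcategories)

BASE_URL = "https://www.gnomikologikon.gr/"

# Build one category from a name and a list of [name, relative href] pairs.
def build_category(name, links)
  subcategories = links.map do |sub_name, href|
    # Resolve the relative href against the site root
    SketchSubcategory.new(sub_name, URI.join(BASE_URL, href).to_s)
  end
  SketchCategory.new(name, subcategories)
end

# The hrefs below are made-up sample values, not real pages.
category = build_category("Example", [["First", "a.php?id=1"], ["Second", "b.php?id=2"]])
category.subcategories.each { |sub| puts "#{sub.name}: #{sub.url}" }
```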
Get all quotes for the given subcategories.
@yield Runs the given block after each subcategory is processed
@param subcategories Array of subcategories
@return Hash matching each subcategory to an Array of quotes
# File lib/gnomikologikon/web_processing.rb, line 49
def self.get_quotes_for_categories(subcategories)
  quotes = {}
  subcategories.each do |subcategory|
    response = HTTParty.get(subcategory.url, {
      headers: {"User-Agent" => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"},
    })
    quotes[subcategory] = get_quotes_from_html(response.body)
    yield if block_given?
    # Throttle the connection because processing is fast
    sleep 1
  end
  quotes
end
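The pattern here is a per-item progress hook combined with a fixed pause between requests. A minimal sketch of that pattern in isolation, with no network access: `process_with_progress`, its `fetcher:` lambda, and its `delay:` parameter are hypothetical names introduced for illustration and are not part of the gem.

```ruby
# Hypothetical helper: process each item, run the caller's block after each
# one, and pause between items. `fetcher` stands in for the HTTP request.
def process_with_progress(items, fetcher:, delay: 0)
  results = {}
  items.each do |item|
    results[item] = fetcher.call(item)
    yield if block_given? # progress hook, e.g. advance a progress bar
    sleep delay           # throttle so requests are not sent back to back
  end
  results
end

done = 0
results = process_with_progress(%w[a b c], fetcher: ->(item) { item.upcase }) { done += 1 }
puts "processed #{done} of #{results.size}"
```

Passing the delay as a parameter keeps the sketch fast to run. The documented method hard-codes a one-second pause instead.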
Fetch all quotes from the given HTML page.
@param page_body Body of HTML page to extract quotes from
@return Array of quotes
# File lib/gnomikologikon/web_processing.rb, line 67
def self.get_quotes_from_html(page_body)
  doc = Nokogiri::HTML.parse(page_body)
  quotes_tables = doc.xpath("//table[@class='quotes']//td[@class='quote']")
  quotes = []
  quotes_tables.each do |quote|
    # Get quote contents
    content = quote.at_xpath("./text()").text
    # Check if there is an explanation
    explanation = quote.xpath("./table[@class='expth']//td")
    unless explanation.empty?
      # If an explanation exists, there are two td elements
      # Remove pavla
      explanation = explanation.reject{|element| element["class"] == "pavla"}
      # Keep the explanation td element
      explanation = explanation[0]
      content << "\n(#{explanation.text})"
    end
    # Check if there is a comment
    comment = quote.xpath("./p[contains(@class, 'comnt')]")
    unless comment.nil?
      # Do not add comment if it does not contain text
      content << "\n#{comment.text}" unless comment.text.empty?
    end
    # HTML p elements with class auth0-auth4 contain quote author and additional information (e.g. book)
    # Get the text of all auth p elements and combine them in a string
    author = quote.xpath(".//p[contains(@class, 'auth')]")
    author = author.map{|el| el.text}.join(' ')
    quotes << Quote.new(content,author)
  end
  quotes
end
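The quote text is assembled in a fixed order: the quote itself, then an optional explanation in parentheses on a new line, then an optional comment on its own line. The sketch below shows only that string assembly, without any HTML parsing. `assemble_quote_text` and the sample strings are hypothetical and do not appear in the gem.

```ruby
# Hypothetical helper mirroring the order in which the text is assembled:
# quote text, then the explanation in parentheses, then the comment.
def assemble_quote_text(content, explanation: nil, comment: nil)
  text = content.dup # copy so the caller's string is not mutated
  text << "\n(#{explanation})" unless explanation.nil? || explanation.empty?
  text << "\n#{comment}" unless comment.nil? || comment.empty?
  text
end

# Sample strings are made up for illustration.
puts assemble_quote_text("A sample quote.", explanation: "a sample explanation")
```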
# File lib/gnomika.rb, line 12
def self.main
  # Parse command line arguments
  options = ArgParser.parse(ARGV)
  # Get available categories
  available_categories = Gnomika.fetch_category_info
  # Prompt user to select a category
  selected_category = select_category(available_categories)
  # Prompt user to select subcategories
  selected_subcategories = select_subcategories(selected_category)
  # Create a progress bar
  progressbar = ProgressBar.create(total: selected_subcategories.length, title: "Download Progress")
  # Get the quotes and display progress
  quotes = Gnomika.get_quotes_for_categories(selected_subcategories){ progressbar.increment }
  # Create quote files with the options given as parameters
  begin
    write_files(options, quotes)
  rescue StandardError => e
    STDERR.puts e.message
  end
end
Prompt the user to select a category.
@param categories Array of Category objects
@return The Category object the user selected
# File lib/gnomikologikon/ui.rb, line 6
def self.select_category(categories)
  # Index starting from 1
  categories.each_with_index { |category, index| puts "#{index+1}. #{category.name}" }
  ok = false
  num = -1
  until ok
    print "Select a category: "
    STDOUT.flush
    input = STDIN.gets
    begin
      # Remove 1 because display indexes starting from 1.
      num = Integer(input) - 1
      break unless num < 0 || num > categories.length-1
      # Input was number but out of bounds
      puts "Invalid selection!"
    rescue ArgumentError
      # Input could not be converted to integer
      puts "Invalid selection!"
    end
  end
  # Return Category object chosen by the user
  categories[num]
end
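The core of the prompt is converting 1-based menu input into a 0-based array index and rejecting anything out of range. A minimal sketch of that validation as a pure function follows. `parse_menu_choice` is a hypothetical name, and passing base 10 to `Integer` is a choice made for this sketch, not something the documented source does.

```ruby
# Hypothetical helper: convert 1-based menu input into a 0-based index,
# or return nil when the input is not a valid choice.
def parse_menu_choice(input, count)
  # to_s turns nil (end of input) into "", which Integer rejects
  index = Integer(input.to_s.strip, 10) - 1
  index.between?(0, count - 1) ? index : nil
rescue ArgumentError
  nil
end

p parse_menu_choice("2\n", 3)
p parse_menu_choice("7", 3)
p parse_menu_choice("abc", 3)
```

Returning `nil` for every invalid case lets the calling loop print a single error message and ask again.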
Prompt the user to select subcategories. A selection can be a range (e.g. 1-2), a list of single subcategories (e.g. 1,2,3), or a mix of both (e.g. 1,2-4).
@param category The Category that will be shown
@return Array of selected Subcategory objects
# File lib/gnomikologikon/ui.rb, line 35
def self.select_subcategories(category)
  subcategories = category.subcategories
  # Index starting from 1
  subcategories.each_with_index { |subcategory, index| puts "#{index+1}. #{subcategory.name}" }
  puts "You can select multiple categories separated by ',' or a range of categories with '-'."
  # Contains indexes of all subcategories specified by the user
  selected_indexes = []
  ok = false
  until ok
    print "Select subcategories: "
    # On each loop we assume input is ok until an error is encountered
    ok = true
    # Empty the indexes array, in case of leftovers from previous loop
    selected_indexes = []
    input = STDIN.gets
    # Split the input
    input = input.split(",")
    # Check each selection
    input.each do |selection|
      begin
        # We need to remove 1 from each because during selection category numbers started from 1
        selected_indexes += selection_to_array(selection, subcategories.length).map{|it| it - 1}
      rescue => error
        ok = false
        puts error.message
        break
      end
    end
  end
  # Remove any duplicate indexes
  selected_indexes.uniq!
  # Create an array with the corresponding Subcategory objects and return it
  selected_indexes.map { |index| subcategories[index]}
end
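As a worked example of the selection syntax, the input `1,2-4,3` against a menu of five entries expands to the 0-based indexes 0, 1, 2, 3, with the duplicate removed. The sketch below condenses the split, expand, shift, and de-duplicate steps into one function. `expand_selection` is a hypothetical name, and the sketch is a compact illustration rather than the gem's own code.

```ruby
# Hypothetical helper: expand input such as "1,2-4" into 0-based indexes.
# `max` is the highest number shown in the menu.
def expand_selection(input, max)
  indexes = input.split(",").flat_map do |part|
    # "2-4" yields [2, 4]; a single number such as "3" yields [3]
    first, last = part.split("-", 2).map { |n| Integer(n.strip, 10) }
    last ||= first
    if first < 1 || last > max || first > last
      raise ArgumentError, "Invalid selection! (#{part.strip})"
    end
    # Menu numbers start at 1, array indexes at 0
    (first..last).map { |n| n - 1 }
  end
  indexes.uniq
end

p expand_selection("1,2-4,3", 5) # => [0, 1, 2, 3]
```

Malformed parts such as `a`, `3-`, or `6` (with `max` of 5) raise `ArgumentError`, which a calling loop can rescue to prompt again.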
Writes the given quotes to files according to the given options.
@param options GnomikaOptions object containing file output options
@param quotes Hash matching each subcategory to an Array with quotes
# File lib/gnomikologikon/file_writer.rb, line 7
def self.write_files(options, quotes)
  # If no custom directory is specified, use the current directory
  output_directory = Dir.pwd
  if options.custom_output_dir_set
    custom_dir = options.custom_output_dir_value
    begin
      unless Dir.exist? custom_dir
        Dir.mkdir custom_dir
      end
      output_directory = custom_dir
    rescue StandardError => e
      raise e
    end
  end
  file_writer = FileWriter.new(output_directory, options.single_file, single_file_name: options.single_file_name)
  quotes.each_pair do |subcategory,subcategory_quotes|
    file_writer.write_quotes(subcategory.name,subcategory_quotes)
  end
  # Must generate strfiles for fortune command to work
  file_writer.generate_strfiles
end
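The directory handling reduces to one rule: use the current directory unless a custom one is given, and create the custom one if it is missing. A minimal sketch of that rule follows. `resolve_output_dir` is a hypothetical name, and the sketch swaps in `FileUtils.mkdir_p`, which also creates missing parent directories, where the documented source calls `Dir.mkdir`.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: return the directory to write into, creating the
# custom directory when one is given and it does not exist yet.
def resolve_output_dir(custom_dir = nil)
  return Dir.pwd if custom_dir.nil?

  FileUtils.mkdir_p(custom_dir) # no-op if the directory already exists
  custom_dir
end

# Demonstrate inside a temporary directory that is removed afterwards.
Dir.mktmpdir do |tmp|
  target = File.join(tmp, "quotes", "out")
  puts resolve_output_dir(target)
  puts Dir.exist?(target)
end
```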
Private Class Methods
Converts the given selection into an array of indexes. Raises an ArgumentError if the selection is invalid; the exception includes an error message. This function must be used with a single selection (e.g. 3 or 3-5), not with a list of many selections (e.g. 1,2,3…).
@param selection String of the selection
@param max_available_index Used w
@return Array of indexes included in the selection
# File lib/gnomikologikon/ui.rb, line 80
def self.selection_to_array(selection, max_available_index)
  # Check if selection is a range
  if selection.include?("-")
    # Try to process it as a range
    range_start, range_end = selection.split("-")
    begin
      range_start = Integer(range_start)
      range_end = Integer(range_end)
      # Check if range is correct. Start must be smaller or equal than end and end must be smaller or equal
      # to max_available index
      if range_start > range_end || range_end > max_available_index || range_start < 1
        raise ArgumentError
      end
      return (range_start..range_end).to_a
    rescue ArgumentError
      raise ArgumentError.new "Invalid range! (#{selection.strip})"
    end
  else
    # Assume selection is an integer
    begin
      number = Integer(selection)
      # Check limits
      if number < 1 || number > max_available_index
        raise ArgumentError
      end
      return [number]
    rescue
      raise ArgumentError.new "Invalid selection! (#{selection.strip})"
    end
  end
end