class NetworkUtils::UrlInfo
Simple class to get URL info (validation/existence, headers, content-type). Allows you to get all of this without actually downloading huge files such as CSVs, images, videos, etc.
Public Class Methods
Initialise a UrlInfo for a particular URL
@param [String] url the URL you want to get info about
@param [Integer] request_timeout Max time to wait for headers from the server (seconds)
# File lib/network_utils/url_info.rb, line 29
def initialize(url, request_timeout = 10)
  @url = String.new(url.to_s).force_encoding('UTF-8')
  @request_timeout = request_timeout
end
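The normalisation step in the constructor can be tried in isolation. The sketch below is illustrative only: `normalise_url` is a hypothetical helper, not part of NetworkUtils, and the example URL is made up. It shows that the input is copied rather than mutated and that the bytes are relabelled as UTF-8, not transcoded.

```ruby
# Hypothetical helper mirroring the constructor's normalisation step.
def normalise_url(url)
  # String.new copies the input, so a frozen or shared string is never mutated;
  # force_encoding only relabels the bytes as UTF-8, it does not transcode them.
  String.new(url.to_s).force_encoding('UTF-8')
end

# A frozen, binary-tagged input (made-up URL) survives untouched.
original = 'http://example.com/données.csv'.b.freeze
normalised = normalise_url(original)

puts original.encoding
puts normalised.encoding
puts normalised.valid_encoding?
```

Because `url.to_s` is called first, passing `nil` yields an empty string rather than an error.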
Public Instance Methods
A shortcut method to get the Content-Type of the remote resource
@return [Array<String>, nil] media types from the remote resource's Content-Type header, or nil if the header is unavailable
# File lib/network_utils/url_info.rb, line 72
def content_type
  headers&.fetch('content-type', nil)
         &.split(/,\s/)
         &.map { |ct| ct.split(/;\s/).first }
end
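To see what the two `split` calls do to a header value, here is a minimal standalone sketch. `parse_content_types` is a hypothetical name, and `raw` stands in for the value of the 'content-type' header; no request is made.

```ruby
# Hypothetical standalone version of the parsing step.
def parse_content_types(raw)
  return nil if raw.nil?

  raw.split(/,\s/)                        # several values may be comma-joined
     .map { |ct| ct.split(/;\s/).first }  # drop parameters such as "; charset=utf-8"
end

p parse_content_types('text/html; charset=utf-8')
p parse_content_types('text/csv, application/json; charset=utf-8')
```

Both patterns require whitespace after the separator, so a value written as `text/html;charset=utf-8` (no space) would keep its parameter attached.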
A method to get the remote resource's HTTP headers. Caches the result and returns the memoised version.
@return [Hash, nil] remote resource HTTP headers list or nil
# File lib/network_utils/url_info.rb, line 82
def headers
  return nil if @url.to_s.empty?
  return nil unless (encoded_url = encode(@url))

  Timeout.timeout(@request_timeout + CODE_TIMEOUT_EXTRA) do
    response = HTTParty.head(encoded_url, timeout: @request_timeout)

    raise response.response if response.response.is_a?(Net::HTTPServerError) ||
                               response.response.is_a?(Net::HTTPClientError)

    @headers ||= response.headers
  end
rescue SocketError, ThreadError, Errno::ENETUNREACH, Errno::ECONNREFUSED,
       Errno::EADDRNOTAVAIL, Timeout::Error, TypeError,
       Net::HTTPServerError, Net::HTTPClientError, Net::OpenTimeout
  nil
end
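The general pattern here (a time-boxed lookup, a cached result, and network failures reported as nil) can be sketched without any HTTP dependency. This is one possible arrangement under stated assumptions, not the library's code: `CachedHeaders` is a hypothetical class, the HEAD request is replaced by a block so the sketch runs offline, and the cache is checked before the lookup.

```ruby
require 'timeout'
require 'socket' # defines SocketError

# Hypothetical class, not part of NetworkUtils: the block passed to `new`
# stands in for the real HEAD request.
class CachedHeaders
  def initialize(timeout_seconds, &fetcher)
    @timeout_seconds = timeout_seconds
    @fetcher = fetcher
  end

  def headers
    # Reuse the cached hash if an earlier call succeeded.
    return @headers if @headers

    @headers = Timeout.timeout(@timeout_seconds) { @fetcher.call }
  rescue SocketError, Timeout::Error, Errno::ECONNREFUSED
    nil # failures are reported as "no headers" instead of being raised
  end
end

lookup = CachedHeaders.new(1) { { 'content-type' => 'text/csv' } }
p lookup.headers
```

In this sketch a failed lookup leaves the cache empty, so the next call tries again.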
Check the Content-Type of the resource
@param [String, Symbol, Array] type the prefix (before “/”) or full Content-Type content
@return [Boolean] true if Content-Type matches something from the types list
# File lib/network_utils/url_info.rb, line 38
def is?(type)
  return false if type.to_s.empty?

  expected_types = Array.wrap(type).map(&:to_s)
  content_type && expected_types.select do |t|
    content_type.select { |ct| ct.start_with?(t) }
  end.any?
end
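The matching rule described above (a type matches when some Content-Type value starts with it) can be written as a small pure function. This is a sketch of the documented intent, not the library's implementation: `content_type_matches?` is a hypothetical name, and `Array()` is used as a plain-Ruby stand-in for ActiveSupport's `Array.wrap`.

```ruby
# Hypothetical helper: `content_types` is an array such as ["image/png"],
# `type` is a String, a Symbol, or an Array of either.
def content_type_matches?(content_types, type)
  return false if content_types.nil? || type.to_s.empty?

  expected = Array(type).map(&:to_s) # stand-in for ActiveSupport's Array.wrap
  # True when at least one expected type is a prefix of at least one value.
  expected.any? { |t| content_types.any? { |ct| ct.start_with?(t) } }
end

p content_type_matches?(['image/png'], :image)
p content_type_matches?(['image/png'], 'text')
p content_type_matches?(['text/csv'], %w[image text/csv])
```

Using `any?` with a block on both levels keeps the result a strict Boolean.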
A shortcut method to get the remote resource size
@return [Integer] remote resource size (bytes), 0 if there's nothing
# File lib/network_utils/url_info.rb, line 65
def size
  headers&.fetch('content-length', 0).to_i
end
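The "0 if there's nothing" behaviour follows from how `&.` and `to_i` combine. The sketch below isolates that expression; `content_length` is a hypothetical helper that takes the headers as an argument, with a plain Hash standing in for the real headers object.

```ruby
# Hypothetical helper isolating the expression used by `size`.
def content_length(headers)
  # When headers is nil, `&.fetch` yields nil and nil.to_i is 0;
  # when the key is missing, fetch returns the default 0.
  headers&.fetch('content-length', 0).to_i
end

p content_length(nil)
p content_length({})
p content_length({ 'content-length' => '2048' })
```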
Check offline URL validity
@return [Boolean] true if the URL is valid from the point of view of the standard
# File lib/network_utils/url_info.rb, line 50
def valid?
  @url.match?(UrlRegex.get(mode: :validation))
end
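For readers who want to experiment with an offline format check without the `UrlRegex` dependency, a rough approximation can be built on the standard library's `URI`. This is a swapped-in technique with different rules from the regex used above, and `looks_like_http_url?` is a hypothetical name.

```ruby
require 'uri'

# Hypothetical stdlib-only check; it is NOT equivalent to UrlRegex's
# validation pattern and only accepts http/https URLs with a host.
def looks_like_http_url?(url)
  uri = URI.parse(url.to_s)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

p looks_like_http_url?('https://example.com/data.csv')
p looks_like_http_url?('not a url')
p looks_like_http_url?('ftp://example.com/file')
```

Like `valid?`, this never touches the network, so it says nothing about whether the resource exists.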
Check online URL validity (& format validity as well)
@return [Boolean] true if the URL is valid from the point of view of the standard & exists (has headers)
# File lib/network_utils/url_info.rb, line 58
def valid_online?
  valid? && headers
end
Private Instance Methods
# File lib/network_utils/url_info.rb, line 101
def encode(url)
  Addressable::URI.encode(url)
end