class CHECKING::YOU
I'm not trying to be an exact clone of `shared-mime-info`, but I think its “Recommended checking order” is pretty sane: specifications.freedesktop.org/shared-mime-info-spec/latest/
In addition to the above, CYO() supports IETF-style Media Type strings like “application/xhtml+xml” and supports `stat`-less testing of `.extname`-style Strings.
Constants
- CLASS_NEEDLEMAKER
The following two `proc`s handle classwide-memoization and instance-level assignment for values that may be Enumerable but often refer to only a single Object.
For example, most `Postfix`es (file extensions) will only ever belong to a single CYO Object, but a handful represent possibly-multiple types, like how `.doc` can be an MSWord file or WordPad RTF.
These assignment procs take a storage haystack, a needle to store, and the CYO receiver to which the needle refers. They will set `haystack => CYO` if that needle is unique and unset, or they will convert an existing single `haystack => CYO` assignment to `haystack => Set[existingCYO, newCYO]`.
This is an admittedly-annoying complexity-for-performance tradeoff with the goal of allocating as few spurious containers as possible instead of explicitly initializing a Set for every needle when most of them would wastefully be a Set of just a single thing.
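The single-value-then-Set promotion described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual proc: the method name `store_needle` is hypothetical, and plain Symbols stand in for CYO objects.

```ruby
require 'set'

# Hypothetical stand-in for the needlemaker idea: store a bare receiver for
# the first assignment, and only allocate a Set once a second, different
# receiver shows up for the same needle.
def store_needle(haystack, needle, receiver)
  existing = haystack[needle]
  case existing
  when nil
    # First receiver for this needle: no container is allocated.
    haystack[needle] = receiver
  when Set
    # Already promoted: just add to the existing Set.
    existing.add(receiver)
  else
    # Second distinct receiver: promote the bare value to a Set.
    haystack[needle] = Set[existing, receiver] unless existing.equal?(receiver)
  end
  haystack
end

postfixes = {}
store_needle(postfixes, "txt", :text_plain)
store_needle(postfixes, "doc", :msword)
store_needle(postfixes, "doc", :rtf)
postfixes["txt"]  # a bare Symbol, no Set allocated
postfixes["doc"]  # a Set holding both receivers
```

Readers of such a haystack have to handle `nil`, a bare object, and a `Set`, which is the complexity the text above trades for fewer allocations.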
- INSTANCE_NEEDLEMAKER
This is the instance-level version of the above, e.g. a CYO with only one Postfix will assign `cyo.:@postfixes = Postfix`, and a CYO with many Postfixes will assign e.g. `cyo.:@postfixes = Set[post, fix, es, …]`.
- LEGENDARY_HEAVY_GLOW
Extract the heaviest member(s) from an Enumerable of weighted keys.
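As a rough sketch of what "heaviest member(s)" could mean, assume a Hash whose keys respond to a `weight` message. The `WeightedGlob` Struct, the `heaviest` method, and the return shape (a bare value for one winner, an Array for ties) are all hypothetical and chosen only for illustration.

```ruby
# Hypothetical weighted key; the real library's key type may differ.
WeightedGlob = Struct.new(:pattern, :weight)

# Return the value(s) whose key has the greatest weight, or nil when there
# is nothing to choose from.
def heaviest(weighted, by)
  return nil if weighted.nil? || weighted.empty?
  top = weighted.each_key.map { |key| key.public_send(by) }.max
  winners = weighted.select { |key, _| key.public_send(by) == top }.values
  winners.size == 1 ? winners.first : winners
end

matches = {
  WeightedGlob.new("*.tar.gz", 60) => :gzip_tarball,
  WeightedGlob.new("*.gz", 50)     => :gzip,
}
heaviest(matches, :weight)  # => :gzip_tarball
heaviest(nil, :weight)      # => nil
```

Returning `nil` for an empty or missing input lets a caller chain a fallback with `||`.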
- StickAround
Provide case-optional String-like keys for Postfixes, Globs, etc.
From Ruby's `Hash` docs: “Two objects refer to the same hash key when their hash value is identical and the two objects are eql? to each other.” I tried to subclass String and just override `:eql?` and `:hash` for case-insensitive lookups, but it turns out not to be that easy due to MRI's C comparison functions for String, Symbol, etc.
It was super-confusing because I could call e.g. `'DOC'.eql? 'doc'` manually and get `true`, but it would always fail to work when used as a `Hash` key, when calling `uniq`, or in a `Set`:
    irb(main):049:1* Lol = Class.new(String).tap {
    irb(main):050:1* _1.define_method(:hash) do; self.downcase!.hash; end;
    irb(main):051:1* _1.define_method(:eql?) do |lol|; self.casecmp?(lol); end;
    irb(main):052:1* _1.alias_method(:==, :eql?)
    irb(main):053:0> }
    irb(main):054:0> fart = Lol.new("abcdefg")
    irb(main):055:0> butt = Lol.new("abcdefgh")
    irb(main):056:0> fart == butt
    => true
    irb(main):057:0> fart.eql? butt
    => true
    irb(main):058:0> fart.hash
    => 1243221847611081438
    irb(main):059:0> butt.hash
    => 1243221847611081438
    irb(main):060:0> fart => "smella"
    => nil
    irb(main):061:0> fart => "smella"
    => "smella"
I'm not the first to run into this, as I found when searching for `“rb_str_hash_cmp”`: kate.io/blog/strange-hash-instances-in-ruby/
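One way to sidestep those C comparison paths is to normalize the stored value itself, so that the stock content-based `String#hash` and `String#eql?` already agree for differently-cased input. A minimal sketch of that idea, assuming a hypothetical `CaseFoldedKey` class that is not part of this library:

```ruby
# Hypothetical String subclass whose stored value is always downcased.
# No comparison methods are overridden; equality is by (folded) content.
class CaseFoldedKey < String
  def initialize(str)
    super(str.downcase)
  end
end

postfixes = { CaseFoldedKey.new("DOC") => :msword }
postfixes[CaseFoldedKey.new("doc")]  # => :msword
postfixes[CaseFoldedKey.new("Doc")]  # => :msword
```

This sketch discards the original casing entirely, which is simpler than a truly case-optional key.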
To work around this I will explicitly `downcase` the actual String subclass' value and just let the hashes collide for differently-cased values, then `eql?` will decide. This is still slower than the all-C String code but is the fastest method I've found to achieve this without doubling my Object allocations by wrapping each String in a Struct.
- TEST_EXTANT_PATHNAME
Test a Pathname representing an extant file whose contents and metadata we can use. This is separated into a lambda due to the complexity, since the entry-point might be given a String that could represent a Media Type, a hypothetical path, an extant path, or even raw stream contents. It could be given a Pathname representing either a hypothetical or extant file. It could be given an IO/Stream object. Several input possibilities will end up calling this lambda.
Some of this complexity is my fault, since I'm doing a lot of variable juggling to avoid as many new-Object-allocations as possible in the name of performance since this library is the very core-est core of DistorteD; things like assigning Hash values to single CYO objects the first time that key is stored then replacing that value with a Set iff that key needs to reference any additional CYO.
- `::from_xattr` can return `nil` or a single `CYO` depending on filesystem extended attributes. It is very very unlikely that most people will ever use this, but I think it's cool 8)
- `::from_postfix` can return `nil`, `CYO`, or `Set` since I decided to store Postfixes separately from freeform globs since file-extension matches are the vast majority of globs. Postfixes avoid needing to be weighted since they all represent the same final pathname component and should never result in multiple conflicting Postfix key matches. A single Postfix key can represent multiple CYOs, though; hence the possible `Set`.
- `::from_glob` can return `nil` or `Hash` since even a single match will include the weighted key.
- `::from_content` can return `nil` or `Hash` based on a `libmagic`-style match of file/stream contents. Many common types can be determined from the first four bytes alone, but we support matching arbitrarily-long sequences against arbitrarily-big byte range boundaries. These keys will also be weighted, even for a single match.
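To illustrate the `libmagic`-style idea of matching a byte sequence anywhere within a range of starting offsets, here is a small sketch. The method name `sequence_in_range?` and its signature are hypothetical and do not reflect how `::from_content` is actually implemented.

```ruby
# Hypothetical helper: true if `needle` begins at any byte offset in
# `offsets` within `haystack`. Both Strings are expected to be binary.
def sequence_in_range?(haystack, needle, offsets)
  offsets.any? do |start|
    haystack.byteslice(start, needle.bytesize) == needle
  end
end

# The well-known eight-byte PNG signature, as a binary String.
png_header = "\x89PNG\r\n\x1A\n".b

sequence_in_range?(png_header, "\x89PNG".b, 0..0)  # => true
sequence_in_range?(png_header, "PNG".b, 0..4)      # => true (found at offset 1)
sequence_in_range?(png_header, "GIF8".b, 0..4)     # => false
```

A fixed-offset magic number is just the special case where the range covers a single offset.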
Public Class Methods
    # File lib/checking-you-out.rb, line 19
    def self.OUT(unknown_identifier, so_deep: true)
      case unknown_identifier
      when ::Pathname
        TEST_EXTANT_PATHNAME.call(unknown_identifier)
      when ::String
        case
        when unknown_identifier.count(-?/) == 1 then
          # TODO: Additional String validation here.
          ::CHECKING::YOU::OUT::from_ietf_media_type(unknown_identifier)
        when unknown_identifier.start_with?(-?.) && unknown_identifier.count(-?.) == 1 then
          ::CHECKING::YOU::OUT::from_pathname(unknown_identifier)
        else
          if File::exist?(File::expand_path(unknown_identifier)) and so_deep then
            TEST_EXTANT_PATHNAME.call(Pathname.new(File::expand_path(unknown_identifier)))
          else
            LEGENDARY_HEAVY_GLOW.call(::CHECKING::YOU::OUT::from_glob(unknown_identifier), :weight) || ::CHECKING::YOU::OUT::from_postfix(unknown_identifier)
          end
        end
      when ::CHECKING::YOU::IN
        unknown_identifier.out
      end
    end
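The String branch of the method above decides what kind of identifier it was given using only character counts. The following sketch mirrors just that decision with a hypothetical `classify_identifier` method that returns a Symbol instead of calling into the library. It leaves out the `File::exist?` check and the `so_deep` option.

```ruby
# Hypothetical mirror of the String dispatch: exactly one "/" is treated as
# an IETF Media Type, a leading "." with no other "." is treated as a bare
# extname, and everything else falls through to path/glob handling.
def classify_identifier(str)
  if str.count("/") == 1
    :media_type
  elsif str.start_with?(".") && str.count(".") == 1
    :extname
  else
    :path_or_glob
  end
end

classify_identifier("application/xhtml+xml")  # => :media_type
classify_identifier(".doc")                   # => :extname
classify_identifier("archive.tar.gz")         # => :path_or_glob
```

Under this rule, a relative path with a single slash such as `docs/readme` is also classified as a Media Type, which is presumably what the `TODO` about additional String validation refers to.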