class Infoboxer::Tree::Template
Represents a MediaWiki template.
[Template](en.wikipedia.org/wiki/Wikipedia:Templates) is basically a thing with a name, some variables and their values. When pages are displayed in a browser, the wiki engine renders templates into something different; yet when extracting information with Infoboxer, you are working with the original templates.
Working with them takes some practice and understanding, yet allows you to do very powerful things. There are many kinds of templates, from purely formatting-related ones (typically no more than small bells and whistles for page appearance, which should be rendered as text) to very information-heavy ones, like [infoboxes](en.wikipedia.org/wiki/Help:Infobox), from which Infoboxer borrows its name!
Basically, to extract information from a template you list its {#variables}, and then use the {#fetch} method (and its variants {#fetch_hash} and {#fetch_date}) to extract their values.
### On variables naming
MediaWiki templates can contain named and unnamed variables. Example:
```
{{birth date and age|1953|2|19|df=y}}
```
This is a template with the name “birth date and age”, three unnamed variables with values “1953”, “2” and “19”, and one named variable with the name “df” and value “y”.
For consistency, Infoboxer treats unnamed variables exactly the same way MediaWiki does: they are considered to have numeric names, which _start from 1_ and are _stored as strings_. So, for the template shown above, the following is correct:
```ruby
template.fetch('1').text == '1953'
template.fetch('2').text == '2'
template.fetch('3').text == '19'
template.fetch('df').text == 'y'
```
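The numbering rule can be sketched without Infoboxer at all. The helper below is hypothetical (it is not part of the library, and it only handles flat templates with no nesting or links inside values); it simply shows how positional arguments end up with the string names `'1'`, `'2'`, `'3'`, while `key=value` arguments keep their own names.

```ruby
# Hypothetical helper, not part of Infoboxer: names the arguments of a
# flat template call (no nested templates or links inside values).
def name_template_arguments(source)
  body = source.delete_prefix('{{').delete_suffix('}}')
  template_name, *arguments = body.split('|')

  position = 0
  variables = arguments.map do |argument|
    if argument.include?('=')
      # Named variable: everything before the first "=" is the name.
      name, value = argument.split('=', 2)
      [name.strip, value.strip]
    else
      # Unnamed variable: gets a numeric name, counted from 1, kept as a string.
      position += 1
      [position.to_s, argument.strip]
    end
  end.to_h

  [template_name.strip, variables]
end

template_name, variables = name_template_arguments('{{birth date and age|1953|2|19|df=y}}')
puts template_name
variables.each { |name, value| puts "#{name.inspect} => #{value.inspect}" }
```

Note that the keys are strings, not integers, which is why the examples on this page pass `'1'` rather than `1`.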
Note also that _named variables with simple text values_ are duplicated in the template node's {Node#params}, so the following is also correct:

```ruby
template.params[:df] == 'y'
template.params.has_key?('1') == false
```
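A minimal sketch of the "simple text value" rule, modeled on the `extract_params` listing further down this page. The `SketchText`, `SketchLink` and `SketchVar` structs are hypothetical stand-ins for Infoboxer's node classes, introduced only so the snippet runs on its own.

```ruby
# Hypothetical stand-ins for Infoboxer node classes.
SketchText = Struct.new(:raw_text)
SketchLink = Struct.new(:target)
SketchVar  = Struct.new(:name, :children)

# Keeps only variables whose content is exactly one plain-text node,
# and keys the result by symbol (as the extract_params listing does).
def simple_text_params(vars)
  vars
    .select { |v| v.children.size == 1 && v.children.first.is_a?(SketchText) }
    .map { |v| [v.name.to_sym, v.children.first.raw_text] }
    .to_h
end

vars = [
  SketchVar.new('df', [SketchText.new('y')]),
  # Mixed content (text plus a link) is not a "simple text value".
  SketchVar.new('place', [SketchText.new('born in '), SketchLink.new('Some page')])
]

params = simple_text_params(vars)
puts params.inspect
```

Variables with richer content stay reachable through {#fetch}; only the plain-text ones are copied into the params hash.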
For more advanced topics, like subclassing templates by names and converting them to inline text, please read {Templates} module's documentation.
Attributes
Public Class Methods
Calls superclass method Infoboxer::Tree::Compound::new
    # File lib/infoboxer/tree/template.rb, line 116
    def initialize(name, variables = Nodes[])
      super(variables, **extract_params(variables))
      @name = name
    end
Public Instance Methods
@private Internal, used by {Parser}.
    # File lib/infoboxer/tree/template.rb, line 241
    def empty?
      false
    end
Fetches template variable(s) by name(s) or patterns.
Usage:
```ruby
argentina.infobox.fetch('leader_title_1') # => one Var node
argentina.infobox.fetch('leader_title_1', 'leader_name_1') # => two Var nodes
argentina.infobox.fetch(/leader_title_\d+/) # => several Var nodes
```
@return [Nodes<Var>]
    # File lib/infoboxer/tree/template.rb, line 170
    def fetch(*patterns)
      Nodes[*patterns.map { |p| variables.find(name: p) }.flatten]
    end
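To illustrate how one method can accept both exact names and patterns, here is a sketch that matches each pattern against variable names with `===` (string equality for a `String`, a match test for a `Regexp`). This is an assumption about the matching semantics, not the library's code: the real method delegates to `variables.find(name: p)`. `FetchVar` and the sample values are placeholders.

```ruby
# Hypothetical stand-in for a template variable.
FetchVar = Struct.new(:name, :text)

# For each pattern, collect the variables whose name matches it.
# String === name is plain equality; Regexp === name is a match test.
def fetch_sketch(variables, *patterns)
  patterns.flat_map do |pattern|
    variables.select { |var| pattern === var.name }
  end
end

variables = [
  FetchVar.new('leader_title_1', 'title one'),
  FetchVar.new('leader_name_1', 'name one'),
  FetchVar.new('leader_title_2', 'title two'),
  FetchVar.new('capital', 'some city')
]

one     = fetch_sketch(variables, 'leader_title_1')
two     = fetch_sketch(variables, 'leader_title_1', 'leader_name_1')
several = fetch_sketch(variables, /leader_title_\d+/)

puts one.map(&:text).inspect
puts two.map(&:name).inspect
puts several.map(&:name).inspect
```

Results come back grouped by pattern, in the order the patterns were given.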
Fetches date by list of variable names containing date components.
_(Experimental, subject to change or enhancement.)_
Explanation: if you have a template like

```
{{birth date and age|1953|2|19|df=y}}
```

…there is a short way to obtain a date from it:

```ruby
template.fetch_date('1', '2', '3') # => Date.new(1953,2,19)
```
@return [Date]
    # File lib/infoboxer/tree/template.rb, line 195
    def fetch_date(*patterns)
      components = fetch(*patterns)
      components.pop while components.last.nil? && !components.empty?

      if components.empty?
        nil
      else
        Date.new(*components.map { |v| v.to_s.to_i })
      end
    end
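The date-building step can be tried in isolation. The sketch below is an illustration under simplifying assumptions: a plain `Hash` of name to text stands in for the template's variables, and `date_from_components` is a hypothetical name. It follows the same idea as the listing: drop missing trailing components, convert the rest to integers, and pass them to `Date.new`.

```ruby
require 'date'

# Hypothetical helper: `variables` is a plain Hash standing in for
# a template's variables (name => text).
def date_from_components(variables, *names)
  components = names.map { |name| variables[name] }

  # Drop missing components from the end (e.g. no day, or no month and day).
  components.pop while !components.empty? && components.last.nil?
  return nil if components.empty?

  Date.new(*components.map { |value| value.to_s.to_i })
end

variables = { '1' => '1953', '2' => '2', '3' => '19', 'df' => 'y' }

full      = date_from_components(variables, '1', '2', '3')
year_only = date_from_components({ '1' => '1953' }, '1', '2', '3')
nothing   = date_from_components({}, '1', '2', '3')

puts full
puts year_only
puts nothing.inspect
```

When only the year is present, `Date.new` fills in January 1, so a partial date still yields a `Date` rather than `nil`.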
Fetches a hash `{name => variable}`, by the same patterns as {#fetch}.
@return [Hash<String => Var>]
    # File lib/infoboxer/tree/template.rb, line 177
    def fetch_hash(*patterns)
      fetch(*patterns).map { |v| [v.name, v] }.to_h
    end
Wikilink name of this template's source.
    # File lib/infoboxer/tree/template.rb, line 234
    def link
      # FIXME: super-naive for now, doesn't thinks about subpages and stuff.
      "Template:#{name}"
    end
    # File lib/infoboxer/tree/template.rb, line 154
    def named_variables
      variables.select(&:named?)
    end
    # File lib/infoboxer/tree/template.rb, line 121
    def text
      res = unnamed_variables.map(&:text).join('|')
      res.empty? ? '' : "{#{name}:#{res}}"
    end
Represents the entire template as a hash of `String => String`, where keys are variable names and values are text representations of the variables' contents.
@return [Hash{String => String}]
    # File lib/infoboxer/tree/template.rb, line 141
    def to_h
      variables.map { |var| [var.name, var.text] }.to_h
    end
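As a quick illustration of the resulting shape, the sketch below applies the same `map ... to_h` pattern to a hypothetical `HashVar` struct standing in for template variables. It is not the library's code path, only the same transformation on plain Ruby objects.

```ruby
# Hypothetical stand-in for a template variable.
HashVar = Struct.new(:name, :text)

# Same transformation as in the listing: name => text pairs into a Hash.
def variables_to_h(variables)
  variables.map { |var| [var.name, var.text] }.to_h
end

variables = [
  HashVar.new('1', '1953'),
  HashVar.new('2', '2'),
  HashVar.new('3', '19'),
  HashVar.new('df', 'y')
]

flat = variables_to_h(variables)
puts flat.inspect
```

Since the result is an ordinary `Hash`, unnamed variables are looked up with string keys such as `flat['1']`. If two variables shared a name, Ruby's `Array#to_h` would keep the value of the later one.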
See {Node#to_tree}
    # File lib/infoboxer/tree/template.rb, line 131
    def to_tree(level = 0)
      ' ' * level + "<#{descr}>\n" +
        variables.map { |var| var.to_tree(level + 1) }.join
    end
Returns the list of template variables with numeric names (which are treated as “unnamed” variables by MediaWiki templates, see {Template class docs} for explanation).
@return [Nodes<Var>]
    # File lib/infoboxer/tree/template.rb, line 150
    def unnamed_variables
      variables.reject(&:named?)
    end
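The split between named and unnamed variables can be sketched as follows. The `named?` predicate here is an assumption made for illustration (a name counts as "unnamed" when it consists only of digits); the actual `Var#named?` implementation is not shown on this page. `PartVar` is a hypothetical stand-in.

```ruby
# Hypothetical stand-in for a template variable.
PartVar = Struct.new(:name, :text) do
  # Assumption: a variable is "named" unless its name is all digits.
  def named?
    !name.match?(/\A\d+\z/)
  end
end

variables = [
  PartVar.new('1', '1953'),
  PartVar.new('2', '2'),
  PartVar.new('3', '19'),
  PartVar.new('df', 'y')
]

unnamed = variables.reject(&:named?)
named   = variables.select(&:named?)

puts unnamed.map(&:name).inspect
puts named.map(&:name).inspect

# Joining the unnamed values with "|", as the `text` listing on this page does.
puts unnamed.map(&:text).join('|')
```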
    # File lib/infoboxer/tree/template.rb, line 126
    def unwrap
      unnamed_variables.flat_map(&:children).unwrap
    end
Protected Instance Methods
    # File lib/infoboxer/tree/template.rb, line 247
    def _eq(other)
      other.name == name && other.variables == variables
    end
    # File lib/infoboxer/tree/template.rb, line 251
    def clean_class
      "Template[#{name}]"
    end
    # File lib/infoboxer/tree/template.rb, line 255
    def extract_params(vars)
      vars
        .select { |v| v.children.count == 1 && v.children.first.is_a?(Text) }
        .map { |v| [v.name.to_sym, v.children.first.raw_text] }.to_h
    end
    # File lib/infoboxer/tree/template.rb, line 261
    def inspect_variables(depth)
      variables.to_a[0..1].map { |name, var| "#{name}: #{var.inspect(depth + 1)}" }.join(', ') +
        (variables.count > 2 ? ', ...' : '')
    end