module ParsingNesting::Tree
Public Class Methods
Parses the string with Parslet (whose output is JSON-like nested hashes and arrays) and transforms that output into an actual abstract syntax tree made up of more semantic Ruby objects, Nodes. The top-level node will always be a List.
Call to_query on the resulting Node to transform it into a Solr query, optionally passing in Solr params to be used as LocalParams in nested dismax queries.
Our approach here works, but as we have to put in special cases it starts getting messy. Ideally we might want to actually transform the object graph (abstract syntax tree) instead of trying to handle special cases in to_query. For instance, transform the object graph for a problematic pure-negative clause into the corresponding object graph without that problem: (-a AND -b) ==> (NOT (a OR b)). Likewise, transform (NOT NOT a) to (a). That would probably be more robust. Instead we handle special cases in to_query, which means the special cases tend to multiply and need to be handled at multiple levels. But it's working for now.
The negate method was an experiment in transforming the parse tree in place, but it isn't being used. It's left as a signpost.
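The graph-rewriting alternative described above could look roughly like the following. This is only a minimal sketch, not code from this module: the Term, Not, And and Or structs and the simplify method are hypothetical stand-ins for the real Node classes, which carry more state.

```ruby
# Hypothetical stand-in node types (assumptions for illustration only);
# the real Node classes in ParsingNesting::Tree are richer than this.
Term = Struct.new(:value)
Not  = Struct.new(:operand)
And  = Struct.new(:children)
Or   = Struct.new(:children)

# Rewrite a tree bottom-up, applying two rules:
#   NOT NOT a       ==> a
#   (-a AND -b ...) ==> NOT (a OR b ...)
def simplify(node)
  case node
  when Not
    inner = simplify(node.operand)
    # A double negation cancels out.
    inner.is_a?(Not) ? inner.operand : Not.new(inner)
  when And
    kids = node.children.map { |c| simplify(c) }
    if !kids.empty? && kids.all? { |c| c.is_a?(Not) }
      # Pure-negative conjunction: rewrite using De Morgan's law.
      Not.new(Or.new(kids.map(&:operand)))
    else
      And.new(kids)
    end
  when Or
    Or.new(node.children.map { |c| simplify(c) })
  else
    node
  end
end

a = Term.new("a")
b = Term.new("b")

double_negation = simplify(Not.new(Not.new(a)))
pure_negative   = simplify(And.new([Not.new(a), Not.new(b)]))
```

With a pass like this run before query generation, to_query would only ever see trees that are already free of the problematic shapes, so each special case would be handled in one place.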
# File lib/parsing_nesting/tree.rb, line 24
def self.parse(string, query_parser = 'dismax')
  to_node_tree(ParsingNesting::Grammar.new.parse(string), query_parser)
end
Theoretically Parslet's Transform could be used for this, but the manner in which I'm parsing to a Parslet labelled hash isn't exactly what Parslet Transform is set up to work with, and I couldn't figure it out. It's easy enough to do 'manually'.
# File lib/parsing_nesting/tree.rb, line 32
def self.to_node_tree(tree, query_parser)
  if tree.is_a? Array
    # at one point I was normalizing top-level lists of one item to just
    # be that item, no list wrapper. But having the list wrapper
    # at the top level is actually useful for Solr output.
    List.new(tree.collect { |i| to_node_tree(i, query_parser) }, query_parser)
  elsif tree.is_a? Hash
    if list = tree[:list]
      List.new(list.collect { |i| to_node_tree(i, query_parser) }, query_parser)
    elsif tree.has_key?(:and_list)
      AndList.new(tree[:and_list].collect { |i| to_node_tree(i, query_parser) }, query_parser)
    elsif tree.has_key?(:or_list)
      OrList.new(tree[:or_list].collect { |i| to_node_tree(i, query_parser) }, query_parser)
    elsif not_payload = tree[:not_expression]
      NotExpression.new(to_node_tree(not_payload, query_parser))
    elsif tree.has_key?(:mandatory)
      MandatoryClause.new(to_node_tree(tree[:mandatory], query_parser))
    elsif tree.has_key?(:excluded)
      ExcludedClause.new(to_node_tree(tree[:excluded], query_parser))
    elsif phrase = tree[:phrase]
      Phrase.new(phrase)
    elsif tree.has_key?(:token)
      Term.new(tree[:token].to_s)
    end
  end
end
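To make the key-based dispatch easier to follow, here is a simplified restatement that builds plain arrays such as [:term, "a"] instead of the real Node classes. The sketch_node_tree name and the hand-written input are assumptions for illustration; real Parslet output holds Parslet::Slice values rather than plain strings, and its exact shape depends on the grammar.

```ruby
# Simplified restatement of the dispatch in to_node_tree (a sketch, not
# the library code). It emits tagged arrays instead of Node objects.
def sketch_node_tree(tree)
  case tree
  when Array
    # A bare array is wrapped in a list, as in the original.
    [:list, tree.map { |i| sketch_node_tree(i) }]
  when Hash
    if tree.key?(:list)
      [:list, tree[:list].map { |i| sketch_node_tree(i) }]
    elsif tree.key?(:and_list)
      [:and, tree[:and_list].map { |i| sketch_node_tree(i) }]
    elsif tree.key?(:or_list)
      [:or, tree[:or_list].map { |i| sketch_node_tree(i) }]
    elsif tree.key?(:not_expression)
      [:not, sketch_node_tree(tree[:not_expression])]
    elsif tree.key?(:mandatory)
      [:mandatory, sketch_node_tree(tree[:mandatory])]
    elsif tree.key?(:excluded)
      [:excluded, sketch_node_tree(tree[:excluded])]
    elsif tree.key?(:phrase)
      [:phrase, tree[:phrase].to_s]
    elsif tree.key?(:token)
      [:term, tree[:token].to_s]
    end
  end
end

# Hand-written input using the same keys the original checks for
# (assumed shape, not captured from a real parse).
parsed = [
  { token: "a" },
  { excluded: { token: "b" } },
  { or_list: [{ token: "c" }, { phrase: "d e" }] }
]

result = sketch_node_tree(parsed)
```

Each labelled hash maps to exactly one node type, and container keys recurse into their children, which is why a hand-rolled walk is enough here.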
Public Instance Methods
# File lib/parsing_nesting/tree.rb, line 125
def bs_escape(val, char = '"')
  # crazy double escaping to actually get a single backslash
  # in there without triggering regexp capture reference
  val.gsub(char, '\\\\' + char)
end
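A short illustration of what the escaping above produces. The method body is copied here only so the snippet runs on its own; the sample inputs are made up.

```ruby
# Copy of bs_escape so this snippet is self-contained.
def bs_escape(val, char = '"')
  # '\\\\' is two backslash characters; gsub reads that pair in a
  # replacement string as one literal backslash.
  val.gsub(char, '\\\\' + char)
end

escaped_double = bs_escape('say "hi"')
escaped_single = bs_escape("it's", "'")

puts escaped_double  # prints: say \"hi\"
puts escaped_single  # prints: it\'s
```

The second call shows why the doubling matters: a lone backslash followed by a single quote in a gsub replacement string would be read as a back-reference to the text after the match, not as a literal backslash and quote.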
Pass in nil as the 2nd argument if you DON'T want to embed “!dismax” in your local params. Used by to_single_query_params.
# File lib/parsing_nesting/tree.rb, line 107
def build_local_params(hash = {}, force_deftype = "dismax")
  # we insist on dismax for our embedded queries, or whatever
  # other defType supplied in 2nd argument.
  hash = hash.dup
  if force_deftype
    hash[:defType] = force_deftype
    hash.delete("defType") # avoid weird collision with hard to debug results
  end

  if !hash.empty?
    defType = hash.delete(:defType) || hash.delete("defType")
    "{!" + (defType ? "#{defType} " : "") + hash.collect { |k, v| "#{k}=#{v.to_s.include?(" ") ? "'" + v + "'" : v}" }.join(" ") + "}"
  else
    # no local params!
    ""
  end
end
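The following sketch shows the strings this method builds for a few inputs. The method body is copied so the snippet runs standalone; the field names title_t and author_t are made-up examples, not values taken from this library.

```ruby
# Copy of build_local_params so this snippet is self-contained.
def build_local_params(hash = {}, force_deftype = "dismax")
  hash = hash.dup
  if force_deftype
    hash[:defType] = force_deftype
    hash.delete("defType")
  end

  if !hash.empty?
    defType = hash.delete(:defType) || hash.delete("defType")
    "{!" + (defType ? "#{defType} " : "") + hash.collect { |k, v| "#{k}=#{v.to_s.include?(" ") ? "'" + v + "'" : v}" }.join(" ") + "}"
  else
    ""
  end
end

# Default: the defType is forced to dismax and placed first.
forced = build_local_params({ qf: "title_t" })
# → "{!dismax qf=title_t}"

# Values containing a space are wrapped in single quotes.
quoted = build_local_params({ qf: "title_t author_t" })
# → "{!dismax qf='title_t author_t'}"

# nil as the 2nd argument: no defType is embedded.
unforced = build_local_params({ qf: "title_t" }, nil)
# → "{!qf=title_t}"

# Nothing to emit at all: an empty string, not "{!}".
none = build_local_params({}, nil)
# → ""
```

Note that the caller's hash is duplicated before being modified, so the params passed in are left untouched.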