title: QuickSearch tagline: full index search based on lunr date: 2020-11-08 00:00:00 +100 description: >
QuickSearch is based on the search engine Lunr, fully integrated with the J1 Template. Lunr is designed to be lightweight yet full-featured to provide a great search experience. No need for complex external, server-sided search engines or commercial services on the Internet like Google.
categories: [ Roundtrip ] tags: [ Introduction, Module, Lunr, QuickSearch ]
toc: true scrollbar: false fam_menu_id: page_ctrl_simple
permalink: /pages/public/learn/roundtrip/quicksearch/ regenerate: false
resources: [ lunr, rouge, lightbox, clipboard ] resource_options:
- toccer: collapseDepth: 3 - attic: padding_top: 400 padding_bottom: 50 opacity: 0.5 slides: - url: /assets/images/modules/attics/banner/lunr-banner-1280x800.jpg alt: Lunr
// Page Initializer // ============================================================================= // Enable the Liquid Preprocessor :page-liquid:
// Set (local) page attributes here // —————————————————————————– // :page–attr: <attr-value>
// Load Liquid procedures // —————————————————————————– {% capture load_attributes %}themes/{{site.template.name}}/procedures/global/attributes_loader.proc{%endcapture%}
// Load page attributes // —————————————————————————– {% include {{load_attributes}} scope=“all” %}
// Page content // ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Include sub-documents // —————————————————————————–
QuickSearch is based on the search engine Lunr and is fully integrated with the J1 Template. Lunr is designed to be lightweight yet full-featured to provide a great search experience. There is no need for complex external, server-side search engines or commercial Internet services like Google.
Searching a website with QuickSearch differs from using search engines like Google or Microsoft Bing. Those search platforms offer a simple interface to the public, but behind it they use complex algorithms and many artificial intelligence (AI) methods to derive meaningful results from the handful of words given for a search.
Nevertheless, QuickSearch, the J1 implementation of Lunr, is as simple to use as a Google search but offers additional features for more specific searches, if wanted. QuickSearch provides an easy-to-use query language for better results.
Core concepts
Understanding some of the concepts and terminology that QuickSearch (Lunr) uses helps you write more powerful queries and get more relevant search results.
Indexing documents
QuickSearch searches all documents of the website generated by J1, but only for this site. The advantage: searches need no internet access. They are based on a prebuilt, local full-text index of the site that the browser loads on a page request. The index for a site is generated by the (Jekyll) plugin `lunr_index.rb` located in the `_plugins` folder.
The full-text index is always generated by Jekyll at build time:
.Index creation at buildtime
Startup the site ..
Configuration file: …
Incremental build: enabled
Generating...
J1 QuickSearch: creating search index ...
J1 QuickSearch: finished, index ready.
....
Or, if you're running a website in development mode, the index gets refreshed for all files added or modified.
.Index creation if files are added or modified
site: Regenerating: n file(s) changed at …
site: …
site: J1 QuickSearch: creating search index …
site: J1 QuickSearch: finished, index ready.
...
Documents
The searchable data in an index is organized as documents that contain the text, the words (terms), that you want to be able to search on. A document is a data set (JSON object) with a set of fields that are processed to create the result list for a search.
A document data set might look like this:
- source, json, role=“noclip”
{
  "title": "Web in a Day",
  "tagline": "meet & greet jekyll",
  "url": "/pages/public/learn/kickstarter/web_in_a_day/meet_and_greet/",
  "date": "2018-05-01 00:00:00 +0000",
  "tags": [ "Introduction" ],
  "categories": [ "Jekyll", "Knowledge", "Tutorial" ],
  "description": "Web in a Day is the first in a series of tutorials ..."
}
In this document, there are several fields, like `title`, `tagline`, or `description`, that could be used for full-text searches. But additional fields are available, like `tags` or `categories` that can be used for more specific searches based on `identifiers`.
NOTE: The document content is collected by the (intrinsic) field `body`. To limit the size of the index data loaded by the browser, the `body` field is removed from a document. The `body` field is not available as an explicit field for searches, but the content is fully searchable.
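How content can stay searchable after the `body` field is dropped may be easier to see in code. The following is a minimal sketch, not the actual logic of `lunr_index.rb` or Lunr: the function `buildIndex` and the sample documents are hypothetical, and the index is a plain term-to-URL map.

```javascript
// Hypothetical sketch: index every word of a document (including its body),
// but keep only the small metadata fields in the stored document.
function buildIndex(documents) {
  const invertedIndex = new Map(); // term -> Set of document URLs
  const store = {};                // URL -> document without its body

  for (const doc of documents) {
    const { body, ...meta } = doc; // drop the body from the stored copy
    store[doc.url] = meta;

    const text = [doc.title, doc.description, body].join(" ");
    for (const term of text.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!invertedIndex.has(term)) invertedIndex.set(term, new Set());
      invertedIndex.get(term).add(doc.url);
    }
  }
  return { invertedIndex, store };
}

// Sample data, invented for illustration only.
const sampleDocs = [
  { url: "/a/", title: "Web in a Day", description: "A tutorial", body: "Liquid templates in Jekyll" },
  { url: "/b/", title: "QuickSearch", description: "Full index search", body: "Search with Lunr" },
];
const { invertedIndex, store } = buildIndex(sampleDocs);
```

In this sketch, a word that occurs only in the body (such as `liquid`) still leads to the document's URL, although the stored document no longer carries a `body` field.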
For simple full-text searches as well as more specific searches, the QuickSearch core engine Lunr offers a query language, a DSL (domain-specific language). Find more about *QuickSearch|Lunr DSL* queries in the section <<Searching>>.
Scoring
The relevance (the `score`) of a document is calculated based on an algorithm called BM25, along with other factors. You don’t need to worry too much about the details of how this technique works. To summarize: the more often a search term occurs in a single document, the more that term increases that document’s score, but the more often a search term occurs in the overall collection of documents, the less that term increases a document’s score. In other words, rare words count more and increase the score.
Scoring information generated by the BM25 algorithm is added to the (local) search index. This allows a very fast calculation of the relevance of documents for queries.
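The effect of term frequency and rarity can be illustrated with a textbook BM25 formula. This is a sketch under assumptions, not Lunr's exact implementation: the function names, the sample documents, and the parameter values `k1 = 1.2` and `b = 0.75` are chosen for illustration.

```javascript
// Split text into lower-case word tokens.
function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Textbook BM25: returns one score per document for the given query.
function bm25Scores(docs, query, k1 = 1.2, b = 0.75) {
  const tokenized = docs.map(tokenize);
  const avgLen = tokenized.reduce((sum, t) => sum + t.length, 0) / tokenized.length;
  const terms = tokenize(query);

  return tokenized.map((tokens) => {
    let score = 0;
    for (const term of terms) {
      const tf = tokens.filter((t) => t === term).length; // frequency in this document
      if (tf === 0) continue;
      const df = tokenized.filter((t) => t.includes(term)).length; // documents containing the term
      const idf = Math.log(1 + (tokenized.length - df + 0.5) / (df + 0.5)); // rare terms weigh more
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + b * (tokens.length / avgLen)));
    }
    return score;
  });
}

// Sample data, invented for illustration only.
const docs = [
  "Jekyll is a static site generator",
  "Jekyll themes and Jekyll layouts",
  "Jekyll collections explained",
];
const scores = bm25Scores(docs, "jekyll generator");
```

Here `jekyll` occurs in every document and contributes little, while `generator` occurs in only one, so the first document receives the highest score.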
Imagine your website contains documents about Jekyll. The term `Jekyll` may occur very frequently throughout the entire website because it is used quite often in the content. Finding a document that mentions the term `Jekyll` therefore isn’t very significant for a search.

However, if you’re searching for `Jekyll Generator`, only some documents of the website contain the word `Generator`. That raises the score (relevance) of documents containing both words and moves them higher up in the search results.
Matching and scoring are used by all search engines, and J1 QuickSearch is no exception. QuickSearch sorts search results in a way you already know from commercial internet search engines like Google: the top results are the more relevant ones.

Searching
To access QuickSearch, a magnifier button is available in the `Quicklinks` area in the menu bar at the top-right of every page.
.Search button (magnifier) in the quick access area
lightbox::quicksearch-icon[ 800, {data-quicksearch-icon} ]
A mouse click on the magnifier button opens the search input and disables all other navigation to focus on what you intend to do: searching.
.Input bar for a QuickSearch
lightbox::quicksearch-input[ 800, {data-quicksearch-input} ]
Search queries look like simple text. But the search engine under the hood of QuickSearch always transforms the given search string (text) into a search query. This supports a special syntax, the DSL, for defining more complex queries with better-scored results.
As always: start simple!
Simple searches
The simplest way to run a search is to pass the text (words, terms) you want to search for:
- source, text
jekyll
The above will return all documents that match the term `jekyll`. Searches for multiple terms (words) are also supported. If a document matches *at least* one of the search terms, it will show in the results. The search terms are combined by a logical `OR`.
- source, text
jekyll tutorial
The above example will match documents that contain either `jekyll` OR `tutorial`. Documents that contain both get a higher score and are returned first.
NOTE: Unlike a Google search, where terms are combined by a logical `AND`, QuickSearch combines the terms by an `OR`.
To combine search terms in a QuickSearch query by a logical AND, prefix each term with a plus sign (`+`). This marks the term as required in the QuickSearch query (DSL):
- source, text
+jekyll +tutorial
Wildcards
QuickSearch supports wildcards when performing searches. A wildcard is represented as an asterisk (`*`) and can appear anywhere in a search term. For example, the following will match all documents with words beginning with `Jek`:
- source, text
jek*
NOTE: Language grammar rules are not relevant for searches. To simplify an index, all words (terms) are transformed to lower case. As a result, the word `Jekyll` is the same as `jekyll` from a search engine's perspective. Language variations of words like `Jekyll's` or plurals like `Generators` are reduced to their base form. For searches, you don't need to care about grammar rules, only about the spelling. If you're unsure about the spelling of a word, use wildcards.
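The normalization and wildcard matching described above can be sketched as follows. This is an illustration only: `normalize`, `wildcardToRegExp`, and `matchesWildcard` are hypothetical helpers, and the suffix stripping is a crude stand-in for the real stemming Lunr performs.

```javascript
// Lower-case a word and strip a possessive or plural suffix.
// Assumption: a very rough substitute for a real stemmer.
function normalize(word) {
  return word.toLowerCase().replace(/('s|s)$/, "");
}

// Turn a wildcard term such as "jek*" into an anchored regular expression.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .toLowerCase()
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")) // escape regex metacharacters
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Check whether a word from a document matches a wildcard search term.
function matchesWildcard(pattern, word) {
  return wildcardToRegExp(pattern).test(normalize(word));
}

const hits = ["Jekyll's", "Generators", "jekyll"].filter((w) => matchesWildcard("jek*", w));
```

In this sketch, `Jekyll's` and `jekyll` normalize to the same term and both match `jek*`, while `Generators` does not.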
Fields
By default, Lunr will search *all fields* in a document for the given query terms, but it is possible to restrict a term to a specific field. The following example searches for the term `jekyll` in the field `title`:
- source, text
title:jekyll
The search term is prefixed with the name of the field, followed by a colon (`:`). The field must be one of the fields defined when building the index. Unrecognized fields will lead to an error.
Field-based searches can be combined with all other term modifiers and wildcards, as well as other terms. For example, to search for documents that have a word beginning with `jek` in the title AND a word matching the wildcard `coll*` anywhere in the document, the following query can be used:
- source, text
+title:jek* +coll*
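How such a query string breaks down into its parts can be sketched with a small parser. This is not Lunr's query parser: `parseQuery` and the shape of the returned objects are assumptions made for illustration.

```javascript
// Hypothetical sketch: split a query into clauses and read the
// presence marker (+ or -), the optional field prefix, and the term.
function parseQuery(query) {
  return query
    .trim()
    .split(/\s+/)
    .filter(Boolean)
    .map((token) => {
      let presence = "optional";
      if (token.startsWith("+")) presence = "required";
      else if (token.startsWith("-")) presence = "prohibited";
      const rest = presence === "optional" ? token : token.slice(1);

      const colon = rest.indexOf(":");
      const field = colon === -1 ? null : rest.slice(0, colon); // null means: all fields
      const term = (colon === -1 ? rest : rest.slice(colon + 1)).toLowerCase();

      return { presence, field, term, wildcard: term.includes("*") };
    });
}

const clauses = parseQuery("+title:jek* +coll*");
```

For the query above, the sketch yields two required clauses: `jek*` restricted to the field `title`, and `coll*` without a field restriction.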
Available fields
Besides the document body, an intrinsic field used to create the full-text index from the document content, some more specific fields are available for searches.
.Available fields (all documents)
- cols=“3a,3a,6a, options=”header“, width=”100%“, role=”rtable mt-3“
-
|===============================================================================
|Name |Value |Description and example(s)

|`title`
|`string`
|The headline of a document (article, post).

Example(s): QuickSearch

- source, text
title:QuickSearch

|`tagline`
|`string`
|The subtitle of a document (article, post).

Example(s): full index search

|`tags`
|`string`
|Tags describe the content of a document.

Example(s): Roundtrip, QuickSearch

|`categories`
|`string`
|Categories describe the group of documents a document belongs to.

Example(s): Search

|`description`
|`string`
|The description is given by the author of a document. It gives a brief summary of what the document is all about.

Example(s): QuickSearch is based on the search engine Lunr, fully integrated with J1 Template …

|===============================================================================
////
Boosts
In multi-term searches, a single term may be more important than others. For these cases, Lunr supports term-level boosts. Any document that matches a boosted term will get a higher relevance score and appear higher up in the results. A boost is applied by appending a caret (`^`) and then a positive integer to a term.
- source, javascript
idx.search('foo^10 bar')
The above example weights the term “foo” 10 times higher than the term “bar”. The boost value can be any positive integer, and different terms can have different boosts:
- source, javascript
idx.search('foo^10 bar^5 baz')
Fuzzy Matches
Lunr supports fuzzy matching search terms in documents, which can be helpful if the spelling of a term is unclear, or to increase the number of search results that are returned. The amount of fuzziness to allow when searching can also be controlled. Fuzziness is applied by appending a tilde (`~`) and then a positive integer to a term. The following search matches all documents that have a word within 1 edit distance of “foo”:
- source, javascript
idx.search('foo~1')
An edit distance of 1 allows words to match if either adding, removing, changing or transposing a character in the word would lead to a match. For example “boo” requires a single edit (replacing “f” with “b”) and would match, but “boot” would not as it also requires an additional “t” at the end.
////
Term presence
By default, Lunr combines multiple terms together in a search with a logical OR. That is, a search for `jekyll collections` will match documents that contain `jekyll` or contain `collections` or contain both. This behavior is controllable at the term level, i.e., the presence of each term in matching documents can be specified.
By default, each term is optional in a matching document, though a document must have at least one matching term. It is possible to specify that a term must be present in matching documents or that it must be absent in matching documents.
To indicate that a term must be present in matching documents (required), prefix the term with a plus sign (`+`). To indicate that a term must be absent (not wanted), prefix the term with a minus sign (`-`).
The example below searches for documents that must contain `jekyll` and must not contain the word `collection`:
- source, text
+jekyll -collection
To simulate a logical AND search of documents that contain the word `jekyll` AND the word `collection`, mark both terms as required:
- source, text
+jekyll +collection
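The presence rules can be expressed as a small matching function. This is a simplified sketch, not Lunr's implementation: `matchesQuery` is a hypothetical helper that works on exact, lower-cased terms and ignores fields, wildcards, and scoring.

```javascript
// Decide whether a document (given as a list of its terms) matches a query
// that uses + (required), - (prohibited), and plain (optional) terms.
function matchesQuery(query, docTerms) {
  const terms = new Set(docTerms.map((t) => t.toLowerCase()));
  const required = [];
  const prohibited = [];
  const optional = [];

  for (const token of query.toLowerCase().split(/\s+/).filter(Boolean)) {
    if (token.startsWith("+")) required.push(token.slice(1));
    else if (token.startsWith("-")) prohibited.push(token.slice(1));
    else optional.push(token);
  }

  if (prohibited.some((t) => terms.has(t))) return false;  // an unwanted term is present
  if (!required.every((t) => terms.has(t))) return false;  // a required term is missing
  if (required.length > 0 || optional.length === 0) return true;
  return optional.some((t) => terms.has(t));               // plain terms: logical OR
}

// Sample data, invented for illustration only.
const docA = ["jekyll", "collection"];
const docB = ["jekyll", "tutorial"];
const onlyB = [docA, docB].filter((d) => matchesQuery("+jekyll -collection", d));
```

With these sample documents, `+jekyll -collection` keeps only the second document, `+jekyll +collection` keeps only the first, and the plain query `jekyll collection` matches both.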
What's next
You've explored some of the possibilities J1 offers for websites. But J1 can do much, much more for your project. This was the last stop of the roundtrip. More details on the most common elements of Bootstrap can be found in the previewer for a theme. Have a look at the {previewer-theme}[Theme previewer].

To make things real for your new site, go for *Web in a day*. This tutorial guides you through all the steps of building a website: your site, using Jekyll and the template system J1. It's really a pleasant journey to learn what modern static webs can offer today.

Start your journey from here: {url-j1-kickstarter--web-in-a-day}[Web in a day, {browser-window--new}].