nLab Searching the nLab

Contents

Searching the nLab

There are several methods of searching the nLab:

  1. The built-in search. This is via the search box at the top of every page. The distinguishing characteristics of this search are:

    1. It uses regular expressions.
    2. It searches the source of each page.
  2. Formula Search: nLab Formula Search provides an instance of the MathWebSearch engine for the nLab. To use it, enter a formula query (LaTeX with query variables of the form ?a, ?b, …) into the central input box, or select one of the examples; wait until the MathML has been rendered (query variables appear in red); then hit search. Clicking on one of the hits gives a URL that takes you to the corresponding nLab page and highlights the matching formula (in gray).

  3. An external search engine.

    Most search engines allow restricting the search to a single site, such as to ncatlab.org/nlab.

    The distinguishing characteristics of an external search are:

    1. Usually the search matches plain alphanumeric text (words and phrases) rather than patterns.

    2. It searches the rendered version of each page, instead of the source code.
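
As a minimal sketch of the site restriction described above: many search engines accept a `site:` operator in the query string. The Ruby snippet below builds such a query URL. The helper name `site_search_url`, the Google endpoint, and the `site:` operator are assumptions chosen for illustration; other engines may use a different syntax.

```ruby
require "uri"

# Hypothetical helper: build a search URL restricted to a single site.
# The endpoint and the `site:` operator are assumptions about the external
# search engine, not something defined by the nLab.
def site_search_url(terms, site: "ncatlab.org/nlab",
                    endpoint: "https://www.google.com/search")
  # encode_www_form percent-encodes the query so it is safe inside a URL.
  query = URI.encode_www_form(q: "site:#{site} #{terms}")
  "#{endpoint}?#{query}"
end

puts site_search_url("Kan extension")
```

Pasting the printed URL into a browser should run the restricted search, provided the engine honours the `site:` operator.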

MathWebSearch (MWS) is a content-based search engine for mathematical formulae. It indexes MathML formulae, using a technique derived from automated theorem proving: Substitution Tree Indexing.

As the project is still under development, the authors would be happy to hear your feedback. If you have any comments, you can leave them here. If you have a bug report or a feature request, please open an issue on the GitHub repository.

Regular Expressions

Regular expressions are a powerful way of extending search capabilities to take into account that one often wants to search for more than just a set phrase. In a regular expression, certain characters are declared to be “special” and have a particular interpretation (somewhat like TeX with its special catcodes). A special character can always be “escaped” to interpret it as an ordinary character. Thus . means “match any single character” but \. means “match a period”.
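
The difference between a special character and its escaped form can be checked directly in Ruby. The following is a small sketch; the sample strings are made up for illustration.

```ruby
text = "3.14 and 3x14"

# Unescaped: "." is special and matches any single character,
# so both "3.14" and "3x14" are found.
any_char = text.scan(/3.14/)

# Escaped: "\." matches only a literal period, so only "3.14" is found.
literal = text.scan(/3\.14/)

# Regexp.escape puts a backslash in front of every special character,
# which is convenient when searching for a fixed phrase.
escaped = Regexp.escape("f(x).")

puts any_char.inspect
puts literal.inspect
puts escaped
```

When searching for a set phrase that happens to contain special characters, escaping each of them by hand has the same effect as `Regexp.escape`.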

As Instiki is written in Ruby, it uses the Ruby version of regular expressions (each language has its own version; the differences are usually minor). The following is based on the list at ruby-doc, condensed slightly to those aspects likely to be of use here:

Special characters
The special characters are: ., |, (, ), [, \, ^, {, +, $, *, and ?. To match one of these characters, precede it with a backslash. All other characters ordinarily just match themselves unless they are made somehow special by one of the special characters.
\b, \B
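Match word boundaries and nonword boundaries respectively. Thus cat matches in category, cat, and scat, but cat\b matches only in cat and scat: in category, the cat is followed by another word character, so there is no word boundary after it.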
[]
This matches against a single character in a list. The characters |, (, ), [, ^, $, *, ? are treated as regular characters in such a list. You can specify a range using -: thus, a-z. To include a ] or - it must come at the start of the list. A ^ at the start negates the list.
\d, \s, \w
These match, respectively, digits, whitespace characters, and word characters (letters, digits, and the underscore).
\D, \S, \W
These are the negations of the lowercase versions.
. (period)
Matches any character except a newline.
()
Parentheses group pieces of the regular expression. This is important for the following modifications. In these, xx stands for a sub-expression which can be a single character, a [], or a ().
xx*
Matches zero or more occurrences of xx. Thus ab* matches a, ab, abb, and so forth. Similarly, (cat)* matches the empty string, cat, catcat, catcatcat, and so forth. This will try to match as much as possible; use xx*? to make it match as little as possible.
xx+
Matches one or more occurrences of xx. Thus ab+ matches ab, abb, and so forth, but not a. This will try to match as much as possible; use xx+? to make it match as little as possible.
xx{mm,nn}
Matches at least mm and at most nn occurrences of xx. This will try to match as much as possible; use xx{mm,nn}? to make it match as little as possible.
xx?
Matches zero or one occurrence of xx.
x_1|x_2
Matches either x_1 or x_2.
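
The constructs listed above can be tried out in plain Ruby before using them in the search box. The snippet below is an illustrative sketch only; the sample strings and variable names are made up, and the nLab search applies such patterns to page source rather than to short strings like these.

```ruby
# Word boundaries: cat\b needs a word boundary right after "cat".
words      = %w[cat scat category]
unbounded  = words.grep(/cat/)    # all three words contain "cat"
bounded    = words.grep(/cat\b/)  # "category" no longer matches

# Character lists, ranges, negation, and \d.
sample     = "nLab 2023"
lowercase  = sample.scan(/[a-z]/)    # single lowercase letters
others     = sample.scan(/[^a-z ]/)  # anything except lowercase letters and spaces
digit_runs = sample.scan(/\d+/)      # maximal runs of digits

# Greedy versus lazy repetition of "." (any character).
tagged = "<a><b>"
greedy = tagged[/<.*>/]   # matches as much as possible
lazy   = tagged[/<.*?>/]  # matches as little as possible

# Bounded repetition, greedy and lazy.
at_most_three = "aaaa"[/a{2,3}/]
at_least_two  = "aaaa"[/a{2,3}?/]

# Optional pieces and alternation.
optional_u  = %w[color colour].map { |w| w[/colou?r/] }
alternative = "a functor"[/functor|monad/]

puts bounded.inspect, greedy, lazy, at_most_three, at_least_two
```

Here `String#[]` with a regular expression returns the first match (or `nil`), while `scan` returns every match, which makes the greedy and lazy variants easy to compare side by side.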
category: meta

Last revised on July 3, 2023 at 06:37:07. See the history of this page for a list of all contributions to it.