liqe

Lightweight and performant Lucene-like parser, serializer and search engine.


Motivation


I originally built Liqe to enable Roarr log filtering via the CLI. I have since been polishing this project as a hobby and intellectual exercise. I've seen it adopted by various CLI and web applications that require advanced search. To my knowledge, it is currently the most complete Lucene-like syntax parser and serializer in JavaScript, as well as a compatible in-memory search engine.

Liqe use cases include:

parsing search queries
serializing parsed queries
searching JSON documents using the Liqe query language (LQL)

Note that the Liqe AST is treated as a public API, i.e., one could implement their own search mechanism that uses the Liqe query language (LQL).
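As a rough illustration of a custom search mechanism built on the AST, the sketch below turns one field comparison into a parameterized SQL-like condition. `SimpleTag` is a hand-written, reduced stand-in for the `Tag` node shown in the Serializer section, the operator spellings are assumed to mirror the query syntax, and `tagToSql` is a hypothetical helper that is not part of Liqe.

```ts
// Assumed operator spellings, mirroring the query syntax (`height:>100`, etc.).
type ComparisonOperator = ':' | ':=' | ':>' | ':>=' | ':<' | ':<=';

// Reduced stand-in for a parsed `Tag` node; the real node carries more fields.
type SimpleTag = {
  type: 'Tag';
  field: { name: string };
  operator: { operator: ComparisonOperator };
  expression: { value: string | number };
};

const sqlOperators: Record<Exclude<ComparisonOperator, ':'>, string> = {
  ':=': '=',
  ':>': '>',
  ':>=': '>=',
  ':<': '<',
  ':<=': '<=',
};

// Hypothetical helper: converts one tag into a SQL fragment plus bound values.
const tagToSql = (tag: SimpleTag): { sql: string; values: Array<string | number> } => {
  // Quote the column name and escape embedded double quotes.
  const column = `"${tag.field.name.replace(/"/g, '""')}"`;
  const { value } = tag.expression;
  const { operator } = tag.operator;

  if (operator === ':') {
    // Design choice of this sketch: a plain `field:value` is a substring match for strings.
    return typeof value === 'string'
      ? { sql: `${column} LIKE ?`, values: [`%${value}%`] }
      : { sql: `${column} = ?`, values: [value] };
  }

  return { sql: `${column} ${sqlOperators[operator]} ?`, values: [value] };
};
```

A complete translator would also have to walk logical, unary, range and regex nodes; this sketch covers only the single-tag case.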

Usage


```ts
import {
  filter,
  highlight,
  parse,
  test,
} from 'liqe';

const persons = [
  {
    height: 180,
    name: 'John Morton',
  },
  {
    height: 175,
    name: 'David Barker',
  },
  {
    height: 170,
    name: 'Thomas Castro',
  },
];
```

Filter a collection:

```ts
filter(parse('height:>170'), persons);
// [
//   {
//     height: 180,
//     name: 'John Morton',
//   },
//   {
//     height: 175,
//     name: 'David Barker',
//   },
// ]
```

Test a single object:

```ts
test(parse('name:John'), persons[0]);
// true
test(parse('name:David'), persons[0]);
// false
```

Highlight matching fields and substrings:

```ts
// `highlight` takes a parsed query and the object to inspect.
highlight(parse('name:john'), persons[0]);
// [
//   {
//     path: 'name',
//     query: /(John)/,
//   }
// ]
highlight(parse('height:180'), persons[0]);
// [
//   {
//     path: 'height',
//   }
// ]
```
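To show how such a result might be consumed, here is a minimal sketch that wraps the matched part of a field in `<mark>` tags. `HighlightResult` copies only the two properties visible in the output above, and `markField` is a hypothetical helper that handles top-level fields only; neither is part of Liqe.

```ts
// Shape of one highlight entry, limited to the properties shown above.
type HighlightResult = {
  path: string;
  query?: RegExp;
};

// Hypothetical helper: marks the matched substring of a top-level field.
const markField = (
  subject: Record<string, unknown>,
  { path, query }: HighlightResult,
): string => {
  const value = String(subject[path]);

  // No `query` means the entire field value matched.
  if (!query) {
    return `<mark>${value}</mark>`;
  }

  // `$1` refers to the capture group in the returned regex.
  return value.replace(query, '<mark>$1</mark>');
};

markField({ height: 180, name: 'John Morton' }, { path: 'name', query: /(John)/ });
// '<mark>John</mark> Morton'
```

Nested paths such as `name.first` would need a path resolver, which this sketch leaves out.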

Query Syntax


Liqe uses Liqe Query Language (LQL), which is heavily inspired by Lucene but extends it in various ways that allow a more powerful search experience.

Liqe syntax cheat sheet


```rb
# search for "foo" term anywhere in the document (case insensitive)
foo

# search for "foo" term anywhere in the document (case sensitive)
'foo'
"foo"

# search for "foo" term in `name` field
name:foo

# search for "foo" term in `full name` field
'full name':foo
"full name":foo

# search for "foo" term in `first` field, member of `name`, i.e.
# matches {name: {first: 'foo'}}
name.first:foo

# search using regex
name:/foo/
name:/foo/o

# search using wildcard
name:foo*bar

# boolean search
member:true
member:false

# null search
member:null

# search for height =, >, >=, <, <=
height:=100
height:>100
height:>=100
height:<100
height:<=100

# search for height in range (inclusive, exclusive)
height:[100 TO 200]
height:{100 TO 200}

# boolean operators
name:foo AND height:=100
name:foo OR name:bar

# unary operators
NOT foo
-foo
NOT foo:bar
-foo:bar
name:foo AND NOT (bio:bar OR bio:baz)

# implicit AND boolean operator
name:foo height:=100

# grouping
name:foo AND (bio:bar OR bio:baz)
```

Keyword matching


Search for word "foo" in any field (case insensitive).

```rb
foo
```

Search for word "foo" in the name field.

```rb
name:foo
```

Search for name field values matching /foo/i regex.

```rb
name:/foo/i
```

Search for name field values matching f*o wildcard pattern.

```rb
name:f*o
```

Search for phrase "foo bar" in the name field (case sensitive).

```rb
name:"foo bar"
```

Number matching


Search for value equal to 100 in the height field.

```rb
height:=100
```

Search for value greater than 100 in the height field.

```rb
height:>100
```

Search for value greater than or equal to 100 in the height field.

```rb
height:>=100
```

Range matching


Search for value greater than or equal to 100 and less than or equal to 200 in the height field.

```rb
height:[100 TO 200]
```

Search for value greater than 100 and less than 200 in the height field.

```rb
height:{100 TO 200}
```
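The difference between the two bracket styles only shows at the bounds. The sketch below restates the two descriptions above as a predicate; `inRange` is a hypothetical helper written for illustration and is not how Liqe is implemented.

```ts
// `[min TO max]` includes both bounds; `{min TO max}` excludes both.
const inRange = (
  value: number,
  min: number,
  max: number,
  inclusive: boolean,
): boolean => {
  return inclusive
    ? value >= min && value <= max
    : value > min && value < max;
};

inRange(100, 100, 200, true);
// true, as with height:[100 TO 200]
inRange(100, 100, 200, false);
// false, as with height:{100 TO 200}
```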

Wildcard matching


Search for any word that starts with "foo" in the name field.

```rb
name:foo*
```

Search for any word that starts with "foo" and ends with "bar" in the name field.

```rb
name:foo*bar
```
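One common way to evaluate such a pattern is to compile it into a regular expression. The sketch below assumes `*` stands for any run of characters and that the pattern is matched against a single word; `wildcardToRegex` is a hypothetical helper, and Liqe's own matching rules may differ.

```ts
// Hypothetical helper: converts a `*` wildcard pattern into an anchored RegExp.
const wildcardToRegex = (pattern: string): RegExp => {
  const source = pattern
    .split('*')
    // Escape regex metacharacters in the literal parts of the pattern.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    // Each `*` becomes "any run of characters".
    .join('.*');

  return new RegExp(`^${source}$`);
};

wildcardToRegex('foo*bar').test('foobazbar');
// true
wildcardToRegex('foo*bar').test('foobarx');
// false
```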

Boolean operators


Search for phrase "foo bar" in the name field AND the phrase "quick fox" in the bio field.

```rb
name:"foo bar" AND bio:"quick fox"
```

Search for either the phrase "foo bar" in the name field AND the phrase "quick fox" in the bio field, or the word "fox" in the name field.

```rb
(name:"foo bar" AND bio:"quick fox") OR name:fox
```

Serializer


The serializer converts Liqe tokens back to the original search query.

```ts
import {
  parse,
  serialize,
} from 'liqe';

const tokens = parse('foo:bar');

// {
//   expression: {
//     location: {
//       start: 4,
//     },
//     quoted: false,
//     type: 'LiteralExpression',
//     value: 'bar',
//   },
//   field: {
//     location: {
//       start: 0,
//     },
//     name: 'foo',
//     path: ['foo'],
//     quoted: false,
//     type: 'Field',
//   },
//   location: {
//     start: 0,
//   },
//   operator: {
//     location: {
//       start: 3,
//     },
//     operator: ':',
//     type: 'ComparisonOperator',
//   },
//   type: 'Tag',
// }

serialize(tokens);
// 'foo:bar'
```

AST


```ts
import {
  type BooleanOperatorToken,
  type ComparisonOperatorToken,
  type EmptyExpression,
  type FieldToken,
  type ImplicitBooleanOperatorToken,
  type ImplicitFieldToken,
  type LiteralExpressionToken,
  type LogicalExpressionToken,
  type RangeExpressionToken,
  type RegexExpressionToken,
  type TagToken,
  type UnaryOperatorToken,
} from 'liqe';
```

The import above lists 12 AST token types that describe a parsed Liqe query.

If you are building a serializer, then you must implement all of them for complete coverage of all possible query inputs. Refer to the built-in serializer for an example.
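One way to make sure no token type is forgotten is an exhaustive `switch` over the `type` discriminant. The sketch below uses three hand-written, heavily reduced stand-ins rather than the real Liqe types, and `serializeNode` is a hypothetical function, so treat it as an outline of the pattern only.

```ts
// Reduced stand-ins for three token types; the real ones carry more fields.
type LiteralNode = { type: 'LiteralExpression'; value: string; quoted: boolean };
type FieldNode = { type: 'Field'; name: string };
type TagNode = { type: 'Tag'; field: FieldNode; expression: LiteralNode };
type AstNode = LiteralNode | FieldNode | TagNode;

const serializeNode = (node: AstNode): string => {
  switch (node.type) {
    case 'LiteralExpression':
      // Quoted literals are emitted with double quotes in this sketch.
      return node.quoted ? JSON.stringify(node.value) : node.value;
    case 'Field':
      return node.name;
    case 'Tag':
      return `${serializeNode(node.field)}:${serializeNode(node.expression)}`;
    default: {
      // Exhaustiveness check: adding a member to `AstNode` without a
      // matching case turns this assignment into a compile-time error.
      const unhandled: never = node;
      throw new Error(`Unhandled node: ${JSON.stringify(unhandled)}`);
    }
  }
};
```

With the full set of token types in the union, the compiler flags any case that a custom serializer has not yet covered.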

Utilities


```ts
import {
  isSafeUnquotedExpression,
} from 'liqe';

/**
 * Determines if an expression requires quotes.
 * Use this if you need to programmatically manipulate the AST
 * before using a serializer to convert the query back to text.
 */
isSafeUnquotedExpression(expression: string): boolean;
```

Compatibility with Lucene


The following Lucene abilities are not supported:


Recipes


Handling syntax errors


In case of a syntax error, Liqe throws SyntaxError.

```ts
import {
  parse,
  SyntaxError,
} from 'liqe';

try {
  parse('foo bar');
} catch (error) {
  if (error instanceof SyntaxError) {
    console.error({
      // Syntax error at line 1 column 5
      message: error.message,
      // 4
      offset: error.offset,
      // 1
      line: error.line,
      // 5
      column: error.column,
    });
  } else {
    // Not a query syntax problem, so let it propagate.
    throw error;
  }
}
```
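In a search box, it is often more convenient to get a result value than to handle an exception at every call site. The sketch below wraps any parser function in that style; `tryParse` and `ParseResult` are hypothetical names, and the parser is passed in as an argument so the sketch does not depend on Liqe. Unlike the example above, it catches every `Error`; real code would likely narrow to Liqe's `SyntaxError`.

```ts
type ParseResult<T> =
  | { ok: true; ast: T }
  | { ok: false; message: string };

// Hypothetical wrapper: `parser` would typically be Liqe's `parse`.
const tryParse = <T>(
  parser: (query: string) => T,
  query: string,
): ParseResult<T> => {
  try {
    return { ok: true, ast: parser(query) };
  } catch (error) {
    if (error instanceof Error) {
      // Report the failure as data so the caller can show it next to the input.
      return { ok: false, message: error.message };
    }

    // Anything that is not an Error is unexpected, so rethrow it.
    throw error;
  }
};
```

A caller would then write something like `tryParse(parse, userInput)` and branch on `ok`.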

Highlighting matches


Consider using the [highlight-words](https://github.com/tricinel/highlight-words) package to highlight Liqe matches.

Development


Compiling Parser


If you are going to modify the parser, then use npm run watch to run the compiler in watch mode.

Benchmarking Changes


Before making any changes, capture the current benchmark on your machine using npm run benchmark. Run benchmark again after making any changes. Before committing changes, ensure that performance is not negatively impacted.


Tutorials