liqe

Lightweight and performant Lucene-like parser, serializer and search engine.

Motivation
Usage
Query Syntax
Serializer
AST
Utilities
Compatibility with Lucene
Recipes
- Handling syntax errors
- Highlighting matches
Development
Tutorials

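Motivation

I originally built Liqe to enable Roarr log filtering via the CLI. Since then, I have kept polishing the project as a hobby and an intellectual exercise. I have seen it adopted by various CLI and web applications that require advanced search. To my knowledge, it is currently the most complete Lucene-like syntax parser and serializer in JavaScript, as well as a compatible in-memory search engine.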

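Liqe use cases include:

- parsing search queries
- serializing parsed queries
- searching JSON documents using the Liqe query language (LQL)

Note that the Liqe AST is treated as a public API, i.e., you can implement your own search mechanism that uses the Liqe query language (LQL).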

Usage

import {
  filter,
  highlight,
  parse,
  test,
} from 'liqe';

const persons = [
  {
    height: 180,
    name: 'John Morton',
  },
  {
    height: 175,
    name: 'David Barker',
  },
  {
    height: 170,
    name: 'Thomas Castro',
  },
];

Filter a collection:

filter(parse('height:>170'), persons);
// [
//   {
//     height: 180,
//     name: 'John Morton',
//   },
//   {
//     height: 175,
//     name: 'David Barker',
//   },
// ]

Test a single object:

test(parse('name:John'), persons[0]);
// true
test(parse('name:David'), persons[0]);
// false

Highlight matching fields and substrings:

highlight(parse('name:john'), persons[0]);
// [
//   {
//     path: 'name',
//     query: /(John)/,
//   }
// ]
highlight(parse('height:180'), persons[0]);
// [
//   {
//     path: 'height',
//   }
// ]
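
The highlight results can then be applied to the document to mark what matched. The following is a minimal sketch rather than part of Liqe: applyHighlights is a hypothetical helper, the <mark> wrapper is an arbitrary choice, and it assumes that each entry has the shape shown above (a path plus an optional query) and that path names a top-level field.

```javascript
// Hypothetical helper, not part of Liqe: wraps matched text in <mark>.
// Assumes each highlight looks like { path, query? } as in the output above,
// and that `path` is a top-level key of the document.
const applyHighlights = (highlights, subject) => {
  const output = {};

  for (const { path, query } of highlights) {
    const value = String(subject[path]);

    // Without `query` the whole field matched, so mark the entire value.
    output[path] = query
      ? value.replace(query, '<mark>$1</mark>')
      : `<mark>${value}</mark>`;
  }

  return output;
};

const person = { height: 180, name: 'John Morton' };

// Hand-written copies of the two highlight results shown above.
const marked = applyHighlights(
  [{ path: 'name', query: /(John)/ }, { path: 'height' }],
  person,
);

console.log(marked.name); // <mark>John</mark> Morton
console.log(marked.height); // <mark>180</mark>
```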

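Query Syntax

Liqe uses the Liqe Query Language (LQL), which is heavily inspired by Lucene but extends it in several ways to allow a more powerful search experience.

Liqe syntax cheat sheet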

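# search for the term "foo" anywhere in the document (case insensitive)
foo

# search for the term "foo" anywhere in the document (case sensitive)
'foo'
"foo"

# search for the term "foo" in the `name` field
name:foo

# search for the term "foo" in the `full name` field
'full name':foo
"full name":foo

# search for the term "foo" in the `first` member of `name`, i.e.
# matches {name: {first: 'foo'}}
name.first:foo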

# search using a regular expression
name:/foo/
name:/foo/o

# search using wildcards
name:foo*bar
name:foo?bar

# boolean search
member:true
member:false

# null search
member:null

# search for height =, >, >=, <, <=
height:=100
height:>100
height:>=100
height:<100
height:<=100

# search for height in a range (inclusive, exclusive)
height:[100 TO 200]
height:{100 TO 200}

# boolean operators
name:foo AND height:=100
name:foo OR name:bar

# unary operators
NOT foo
-foo
NOT foo:bar
-foo:bar
name:foo AND NOT (bio:bar OR bio:baz)

# implicit AND boolean operator
name:foo height:=100

# grouping
name:foo AND (bio:bar OR bio:baz)

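Keyword matching

Search for the word "foo" in any field (case insensitive).

foo

Search for the word "foo" in the name field.

name:foo

Search for name field values matching the /foo/i regular expression.

name:/foo/i

Search for name field values matching the f*o wildcard pattern.

name:f*o

Search for name field values matching the f?o wildcard pattern.

name:f?o

Search for the phrase "foo bar" in the name field (case sensitive).

name:"foo bar"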

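Number matching

Search for a value equal to 100 in the height field.

height:=100

Search for a value greater than 100 in the height field.

height:>100

Search for a value greater than or equal to 100 in the height field.

height:>=100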
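
The meaning of the comparison operators can be sketched as a lookup table of predicates. This is an illustration only, not Liqe's implementation; the comparators table and the way operators are keyed are assumptions made for the example.

```javascript
// Hypothetical lookup table, not Liqe's implementation: one predicate
// per comparison operator listed in the cheat sheet.
const comparators = {
  ':=': (actual, expected) => actual === expected,
  ':>': (actual, expected) => actual > expected,
  ':>=': (actual, expected) => actual >= expected,
  ':<': (actual, expected) => actual < expected,
  ':<=': (actual, expected) => actual <= expected,
};

const heights = [180, 175, 170];

// Mirrors the height:>170 query from the usage example.
const tallerThan170 = heights.filter((height) => comparators[':>'](height, 170));

console.log(tallerThan170); // [ 180, 175 ]
```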

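Range matching

Search for a value greater than or equal to 100 and less than or equal to 200 in the height field.

height:[100 TO 200]

Search for a value greater than 100 and less than 200 in the height field.

height:{100 TO 200}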
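
The difference between the two bracket styles only shows at the boundaries. A minimal sketch, assuming a hypothetical inRange predicate (not a Liqe function) where the inclusive flag stands for [min TO max] and its absence for {min TO max}:

```javascript
// Hypothetical predicate illustrating the two range forms; not Liqe's code.
// `inclusive` is true for [min TO max] and false for {min TO max}.
const inRange = (value, min, max, inclusive) => {
  return inclusive
    ? value >= min && value <= max
    : value > min && value < max;
};

const heights = [100, 150, 200];

const inclusiveMatches = heights.filter((height) => inRange(height, 100, 200, true));
const exclusiveMatches = heights.filter((height) => inRange(height, 100, 200, false));

console.log(inclusiveMatches); // [ 100, 150, 200 ]
console.log(exclusiveMatches); // [ 150 ]
```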

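Wildcard matching

Search for any word that starts with "foo" in the name field.

name:foo*

Search for any word that starts with "foo" and ends with "bar" in the name field.

name:foo*bar

Search for any word in the name field that starts with "foo", followed by a single arbitrary character.

name:foo?

Search for any word in the name field that starts with "foo", followed by a single arbitrary character, and immediately ends with "bar".

name:foo?bar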
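
One common way to evaluate such patterns is to translate them into a regular expression. The sketch below is not Liqe's implementation: wildcardToRegExp is a hypothetical helper, and it assumes that * matches any run of characters (including none), that ? matches exactly one character, and that the pattern must cover the whole value. Liqe's exact semantics may differ on these points.

```javascript
// Hypothetical helper, not Liqe's implementation.
// Assumptions: `*` matches any run of characters (including none),
// `?` matches exactly one character, and the whole value must match.
const wildcardToRegExp = (pattern) => {
  const source = pattern
    // Escape regex metacharacters, leaving the two wildcards untouched.
    .replace(/[.+^${}()|[\]\\]/g, '\\$&')
    .replace(/\*/g, '.*')
    .replace(/\?/g, '.');

  return new RegExp(`^${source}$`);
};

console.log(wildcardToRegExp('foo*bar').test('foo-and-bar')); // true
console.log(wildcardToRegExp('foo?bar').test('fooXbar')); // true
console.log(wildcardToRegExp('foo?bar').test('fooXYbar')); // false
console.log(wildcardToRegExp('foo?').test('foo')); // false
```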

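Boolean operators

Search for the phrase "foo bar" in the name field AND the phrase "quick fox" in the bio field.

name:"foo bar" AND bio:"quick fox"

Search for either the phrase "foo bar" in the name field AND the phrase "quick fox" in the bio field, or the word "fox" in the name field.

(name:"foo bar" AND bio:"quick fox") OR name:fox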
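
How grouping and boolean operators combine can be sketched with a small recursive evaluator. The node shapes used here (Match, And, Or) are simplified and hypothetical, chosen only for this illustration; they are not Liqe's AST, and the substring test stands in for Liqe's real matching rules.

```javascript
// Hypothetical, simplified node shapes for illustration only;
// these are NOT the Liqe AST tokens.
const evaluate = (node, subject) => {
  switch (node.type) {
    case 'Match':
      // Substring test as a stand-in for real term matching.
      return String(subject[node.field]).includes(node.value);
    case 'And':
      return evaluate(node.left, subject) && evaluate(node.right, subject);
    case 'Or':
      return evaluate(node.left, subject) || evaluate(node.right, subject);
    default:
      throw new Error(`Unexpected node type: ${node.type}`);
  }
};

// (name:"foo bar" AND bio:"quick fox") OR name:fox
const query = {
  type: 'Or',
  left: {
    type: 'And',
    left: { type: 'Match', field: 'name', value: 'foo bar' },
    right: { type: 'Match', field: 'bio', value: 'quick fox' },
  },
  right: { type: 'Match', field: 'name', value: 'fox' },
};

console.log(evaluate(query, { name: 'foo bar', bio: 'the quick fox' })); // true
console.log(evaluate(query, { name: 'fox', bio: '' })); // true
console.log(evaluate(query, { name: 'foo bar', bio: 'slow dog' })); // false
```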

Serializer

The serializer converts Liqe tokens back to the original search query.

import {
  parse,
  serialize,
} from 'liqe';

const tokens = parse('foo:bar');

// {
//   expression: {
//     location: {
//       start: 4,
//     },
//     quoted: false,
//     type: 'LiteralExpression',
//     value: 'bar',
//   },
//   field: {
//     location: {
//       start: 0,
//     },
//     name: 'foo',
//     path: ['foo'],
//     quoted: false,
//     type: 'Field',
//   },
//   location: {
//     start: 0,
//   },
//   operator: {
//     location: {
//       start: 3,
//     },
//     operator: ':',
//     type: 'ComparisonOperator',
//   },
//   type: 'Tag',
// }

serialize(tokens);
// 'foo:bar'

AST

import {
  type BooleanOperatorToken,
  type ComparisonOperatorToken,
  type EmptyExpression,
  type FieldToken,
  type ImplicitBooleanOperatorToken,
  type ImplicitFieldToken,
  type LiteralExpressionToken,
  type LogicalExpressionToken,
  type RangeExpressionToken,
  type RegexExpressionToken,
  type TagToken,
  type UnaryOperatorToken,
} from 'liqe';

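The 12 token types imported above describe a parsed Liqe query.

If you are building a serializer, you must implement all of them to fully cover every possible query input. Refer to the built-in serializer for an example.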
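
To give a feel for what a custom serializer involves, here is a partial sketch that handles only three token types. serializeSubset and makeTag are hypothetical names. The Tag, Field and LiteralExpression shapes follow the parse output shown in the Serializer section; the LogicalExpression shape (left, operator, right) is an assumption made for this example, and a complete serializer would need a case for every token type.

```javascript
// Partial serializer sketch, not Liqe's built-in serializer.
// 'Tag' and 'LiteralExpression' follow the parse output shown earlier;
// the 'LogicalExpression' shape (left / operator / right) is an assumption.
const serializeSubset = (token) => {
  switch (token.type) {
    case 'Tag':
      return token.field.name + token.operator.operator + serializeSubset(token.expression);
    case 'LiteralExpression':
      return token.quoted ? JSON.stringify(token.value) : String(token.value);
    case 'LogicalExpression':
      return `${serializeSubset(token.left)} ${token.operator.operator} ${serializeSubset(token.right)}`;
    default:
      // A complete serializer must not reach this branch.
      throw new Error(`Unsupported token type: ${token.type}`);
  }
};

// Hypothetical factory that builds a Tag by hand (locations omitted).
const makeTag = (name, value) => ({
  expression: { quoted: false, type: 'LiteralExpression', value },
  field: { name, path: [name], quoted: false, type: 'Field' },
  operator: { operator: ':', type: 'ComparisonOperator' },
  type: 'Tag',
});

const combined = {
  left: makeTag('foo', 'bar'),
  operator: { operator: 'AND', type: 'BooleanOperator' },
  right: makeTag('baz', 'qux'),
  type: 'LogicalExpression',
};

console.log(serializeSubset(makeTag('foo', 'bar'))); // foo:bar
console.log(serializeSubset(combined)); // foo:bar AND baz:qux
```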

Utilities

import {
  isSafeUnquotedExpression,
} from 'liqe';

/**
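 * Determines if an expression requires quotes.
 * Use this if you need to programmatically manipulate the AST
 * before using a serializer to convert the query back to text.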
 */
isSafeUnquotedExpression(expression: string): boolean;

Compatibility with Lucene

The following Lucene abilities are not supported:

Recipes

Handling syntax errors

In case of a syntax error, Liqe throws SyntaxError.

import {
  parse,
  SyntaxError,
} from 'liqe';

try {
  parse('foo bar');
} catch (error) {
  if (error instanceof SyntaxError) {
    console.error({
      // Syntax error at line 1 column 5
      message: error.message,
      // 4
      offset: error.offset,
      // 1
      line: error.line,
      // 5
      column: error.column,
    });
  } else {
    throw error;
  }
}