A Lucene query parser generating ElasticSearch queries and more!
luqum - A lucene query parser in Python, using PLY
#########################################################
|pypi-version| |readthedocs| |travis| |coveralls|
|logo|
"luqum" (as in LUcene QUery Manipolator) is a tool to parse queries
written in the `Lucene Query DSL`_ and build an abstract syntax tree
to inspect, analyze or otherwise manipulate search queries.

It enables enriching the Lucene Query DSL meanings
(for example to support nested object searches or have particular treatments on some fields),
and transform lucene DSL queries to native `ElasticSearch JSON DSL`_.
Thanks to luqum, your users may continue to write queries like:
``author.last_name:Smith OR author:(age:[25 TO 34] AND first_name:John)``
and you will be able to leverage ElasticSearch query DSL,
and control the precise meaning of each search term.
Luqum is dual licensed under Apache2.0 and LGPLv3.
Compatible with Python 3.6+
Installation: ``pip install luqum``

Dependencies: `PLY`_ >= 3.11

Full documentation: http://luqum.readthedocs.org/en/latest/
.. _`Lucene Query DSL`: https://lucene.apache.org/core/3_6_0/queryparsersyntax.html
.. _`ElasticSearch JSON DSL`: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
.. _`PLY`: http://www.dabeaz.com/ply/
.. |logo| image:: https://raw.githubusercontent.com/jurismarches/luqum/master/luqum-logo.png
.. |pypi-version| image:: https://img.shields.io/pypi/v/luqum.svg
    :target: https://pypi.python.org/pypi/luqum
    :alt: Latest PyPI version
.. |travis| image:: http://img.shields.io/travis/jurismarches/luqum/master.svg?style=flat
    :target: https://travis-ci.org/jurismarches/luqum
.. |coveralls| image:: http://img.shields.io/coveralls/jurismarches/luqum/master.svg?style=flat
    :target: https://coveralls.io/r/jurismarches/luqum
.. |readthedocs| image:: https://readthedocs.org/projects/luqum/badge/?version=latest
    :target: http://luqum.readthedocs.org/en/latest/?badge=latest
    :alt: Documentation Status
Changelog for luqum
###################
The format is based on `Keep a Changelog`_
and this project tries to adhere to `Semantic Versioning`_.

.. _`Keep a Changelog`: http://keepachangelog.com/en/1.0.0/
.. _`Semantic Versioning`: http://semver.org/spec/v2.0.0.html
Add support for unbounded ranges

Support is added for open ranges, i.e. inequality operators in front of a term.
In tree form, ``<`` is named ``To`` and ``>`` is named ``From``.

Additionally, a ``TreeTransformer`` is added to convert these open ranges
to more traditional ``Range`` objects.

To properly support escaping, some adjustments were made to how escape
sequences work. After careful evaluation of how Apache Lucene handles
escape sequences, it appears that any character can be escaped, even
if the result is an unknown escape sequence: the escaped character is
always yielded. This makes support for operations such as ``<\=foo``
a lot less complicated.

There is no support for open ranges in the ``ElasticsearchQueryBuilder``.
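
The escaping rule described above (an escaped character is always yielded, whatever it is) can be illustrated with a few lines of plain Python. This is only a sketch of the rule, not luqum's actual implementation, and the ``unescape`` helper is a hypothetical name introduced for the example.

```python
import re


def unescape(text: str) -> str:
    """Drop each escaping backslash and keep the character that follows it."""
    # "\\(.)" matches a backslash followed by any single character;
    # the replacement keeps only that character.
    return re.sub(r"\\(.)", r"\1", text, flags=re.DOTALL)


print(unescape(r"<\=foo"))  # <=foo
print(unescape(r"\qux"))    # qux (unknown escape sequence, character kept)
print(unescape(r"a\\b"))    # a\b (an escaped backslash yields a backslash)
```

Because no escape sequence needs special handling, a term such as ``<\=foo`` simply becomes the literal text ``<=foo`` rather than an open range.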
Add support for Lucene and Elasticsearch Boolean operations (#71, thanks to @linefeedse):
Set E element as ``ElasticsearchQueryBuilder``'s attributes (#75, thanks to @qcoumes):
This allows overriding elements such as ``EMust``, ``EWord``, ..., without the need to override ``ElasticsearchQueryBuilder``'s methods.
Explicit support for Python 3.9 and Python 3.10 (#76)
Add a thread safe parse function (#82)
Run tests with GitHub Actions
Update all libraries for dev:
``auto_name`` function, as it was not practical as is.
`elasticsearch named queries`__.

__ https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-queries-and-filters
``/foo/`` (no transformation to Elasticsearch DSL yet)
``auto_head_tail`` util (use it if you build your tree programmatically and want a printable representation)
``clone_item`` method and a setter for children.
This should help with making transformation patterns easier.
``visitor.TreeVisitor`` and ``visitor.TreeTransformer`` classes to help in processing trees
``utils.LuceneTreeVisitor``, ``utils.LuceneTreeVisitorV2`` and ``utils.LuceneTreeTransformer`` are marked as deprecated with a warning (but still work).
``IllegalCharacterError`` on illegal character found instead of printing and skipping
``ParseError`` to ``ParseSyntaxError``, and kept ``ParseError`` as a parent exception
``multi_match`` query in ``ElasticsearchQueryBuilder``.
``ElasticsearchQueryBuilder``'s ``field_options`` parameter can accept ``match_type`` instead of ``type`` to change request type.
This is now the preferred way over ``type``, which may more easily conflict with request parameters.
`multi-fields`__)

__ https://www.elastic.co/guide/en/elasticsearch/reference/6.3/multi-fields.html
`special characters escaping`_
``_iter_wildcards`` and ``split_wildcards`` to have a finer-grained search of wildcards in terms.

.. _`special characters escaping`: https://lucene.apache.org/core/3_6_0/queryparsersyntax.html#Escaping%20Special%20Characters
``luqum.utils.LuceneTreeTransformer`` when removing node
``luqum.elasticsearch.visitor.ElasticsearchQueryBuilder``
``zero_terms_query`` to ``match_phrase`` was a mistake (introduced in 0.7.0).
0.7.0 introduced the ``match`` query for queries with single words on analyzed fields,
whereas before it was using ``match_phrase``.
Although this is more coherent,
this may cause difficulties on edge cases
as this may lead to results different from the previous release.
This behaviour can be disabled using a new ``match_word_as_phrase`` parameter
to ``luqum.elasticsearch.visitor.ElasticsearchQueryBuilder``.
Note that this parameter may be removed in a future release
(``field_options`` might be used instead on a per field basis).
`elastic named queries`__)
``luqum.elasticsearch.schema``
``field_options`` on ``luqum.elasticsearch.visitor.ElasticsearchQueryBuilder``
allows adding parameters to field queries.
It also makes it possible to control the type of query for match queries.
``match``, and not ``match_phrase``
``match_phrase`` has the ``zero_terms_query`` field, as for ``match``

__ https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-named-queries-and-filters.html
a minor release
(Note that 0.2 version was skipped)
``TreeVisitorV2`` easier to use
This was the initial release of Luqum.