A Lucene query parser generating ElasticSearch queries and more!
luqum - A lucene query parser in Python, using PLY
#########################################################
|pypi-version| |readthedocs| |travis| |coveralls|
|logo|

"luqum" (as in LUcene QUery Manipulator) is a tool to parse queries
written in the `Lucene Query DSL`_ and build an abstract syntax tree
to inspect, analyze or otherwise manipulate search queries.

It lets you enrich the meaning of the Lucene Query DSL
(for example to support nested object searches or to treat some fields in a particular way),
and transform Lucene DSL queries into the native `ElasticSearch JSON DSL`_.
Thanks to luqum, your users may continue to write queries like
``author.last_name:Smith OR author:(age:[25 TO 34] AND first_name:John)``
and you will be able to leverage the ElasticSearch query DSL
and control the precise meaning of each search term.
Luqum is dual licensed under Apache2.0 and LGPLv3.
Compatible with Python 3.6+.

Installation::

    pip install luqum

Dependencies: PLY_ >= 3.11

Full documentation: http://luqum.readthedocs.org/en/latest/

.. _Lucene Query DSL: https://lucene.apache.org/core/3_6_0/queryparsersyntax.html
.. _ElasticSearch JSON DSL: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
.. _PLY: http://www.dabeaz.com/ply/
.. |logo| image:: https://raw.githubusercontent.com/jurismarches/luqum/master/luqum-logo.png
.. |pypi-version| image:: https://img.shields.io/pypi/v/luqum.svg
    :target: https://pypi.python.org/pypi/luqum
    :alt: Latest PyPI version
.. |travis| image:: http://img.shields.io/travis/jurismarches/luqum/master.svg?style=flat
    :target: https://travis-ci.org/jurismarches/luqum
.. |coveralls| image:: http://img.shields.io/coveralls/jurismarches/luqum/master.svg?style=flat
    :target: https://coveralls.io/r/jurismarches/luqum
.. |readthedocs| image:: https://readthedocs.org/projects/luqum/badge/?version=latest
    :target: http://luqum.readthedocs.org/en/latest/?badge=latest
    :alt: Documentation Status

Changelog for luqum
###################

The format is based on `Keep a Changelog`_
and this project tries to adhere to `Semantic Versioning`_.

.. _Keep a Changelog: http://keepachangelog.com/en/1.0.0/
.. _Semantic Versioning: http://semver.org/spec/v2.0.0.html
Add support for unbounded ranges
Support is added for open ranges, i.e. inequality operators in front of a term. In tree form, ``<`` is named ``To`` and ``>`` is named ``From``.
A ``TreeTransformer`` is also added to convert these open ranges to more traditional ``Range`` objects.
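
To make the conversion described above concrete, here is a minimal standalone sketch. The ``From``, ``To`` and ``Range`` classes below are hypothetical stand-ins written for this illustration, not luqum's own tree classes, and using ``*`` for the missing bound is an assumption borrowed from common Lucene range syntax.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical stand-ins for tree nodes; these are NOT luqum's own classes.
@dataclass(frozen=True)
class From:
    """Open lower bound, e.g. >5 (or >=5 when include is True)."""
    value: str
    include: bool = False


@dataclass(frozen=True)
class To:
    """Open upper bound, e.g. <5 (or <=5 when include is True)."""
    value: str
    include: bool = False


@dataclass(frozen=True)
class Range:
    """Bounded range, printed in Lucene bracket syntax."""
    low: str
    high: str
    include_low: bool = True
    include_high: bool = True

    def __str__(self) -> str:
        left = "[" if self.include_low else "{"
        right = "]" if self.include_high else "}"
        return f"{left}{self.low} TO {self.high}{right}"


def to_range(node: Union[From, To]) -> Range:
    """Rewrite an open range as a bounded one, using * for the missing bound."""
    if isinstance(node, From):
        return Range(node.value, "*", include_low=node.include)
    if isinstance(node, To):
        return Range("*", node.value, include_high=node.include)
    raise TypeError(f"not an open range: {node!r}")


print(to_range(From("5")))               # exclusive lower bound
print(to_range(To("10", include=True)))  # inclusive upper bound
```

In this sketch an exclusive ``>5`` becomes ``{5 TO *]`` and an inclusive ``<=10`` becomes ``[* TO 10]``.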
To properly support escaping, some adjustments were made to how escape
sequences work. After careful evaluation of how Apache Lucene handles
escape sequences, it appears that arbitrary characters can be escaped, even
if they result in unknown escape sequences: the escaped character is
always yielded. This makes support for operations such as ``<\=foo`` a lot
less complicated.
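
The rule stated above (the character following a backslash is always yielded) can be sketched in a few lines. The ``unescape`` helper is hypothetical and is not luqum's implementation; in particular, keeping a trailing lone backslash as-is is an assumption of this sketch.

```python
def unescape(text: str) -> str:
    """Resolve backslash escapes: the character after a backslash is always yielded."""
    out = []
    chars = iter(text)
    for char in chars:
        if char == "\\":
            # Take the next character literally, whatever it is.
            # A trailing lone backslash is kept as-is (an assumption).
            out.append(next(chars, "\\"))
        else:
            out.append(char)
    return "".join(out)


print(unescape(r"\=foo"))  # the term after "<" in <\=foo
print(unescape(r"\q"))     # an unknown escape still yields its character
```

With this rule, the term in ``<\=foo`` unescapes to ``=foo``, so the query can be read as "less than ``=foo``" rather than as a ``<=`` operator.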
There is no support for open ranges in the ``ElasticsearchQueryBuilder``.
Add support for Lucene and Elasticsearch Boolean operations (#71, thanks to @linefeedse):
Set E element as ``ElasticsearchQueryBuilder``'s attributes (#75, thanks to @qcoumes):
This allows overriding elements such as ``EMust``, ``EWord``, etc. without having to override ``ElasticsearchQueryBuilder``'s methods.
Explicit support for Python 3.9 and Python 3.10 (#76)
Add a thread safe parse function (#82)
Run tests with github actions
Update all libraries for dev:
``auto_name`` function, as it was not practical as is.

`elasticsearch named queries`__.

__ https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-queries-and-filters

``/foo/`` (no transformation to Elasticsearch DSL yet)

``auto_head_tail`` util
(use it if you build your tree programmatically and want a printable representation)

``clone_item`` method and a setter for ``children``.
This should help make transformation patterns easier.

``visitor.TreeVisitor`` and ``visitor.TreeTransformer`` classes to help in processing trees

``utils.LuceneTreeVisitor``, ``utils.LuceneTreeVisitorV2`` and ``utils.LuceneTreeTransformer``
are deprecated and emit a warning (but still work).

``IllegalCharacterError`` on illegal character found instead of printing and skipping

``ParseError`` to ``ParseSyntaxError``, and kept ``ParseError`` as a parent exception

``multi_match`` query in ``ElasticsearchQueryBuilder``.

``ElasticsearchQueryBuilder``'s ``field_options`` parameter
can accept ``match_type`` instead of ``type`` to change request type.
This is now the preferred way over ``type``,
which may more easily conflict with request parameters.

`multi-fields`__)

__ https://www.elastic.co/guide/en/elasticsearch/reference/6.3/multi-fields.html

`special characters escaping`_

``iter_wildcards`` and ``split_wildcards`` to have a finer grained search of wildcard in terms

.. _special characters escaping: https://lucene.apache.org/core/3_6_0/queryparsersyntax.html#Escaping%20Special%20Characters

``luqum.utils.LuceneTreeTransformer`` when removing node

``luqum.elasticsearch.visitor.ElasticsearchQueryBuilder``

``zero_terms_query`` to ``match_phrase`` was a mistake (introduced in 0.7.0).

0.7.0 introduced the ``match`` query for queries with single words on analyzed fields,
whereas before it was using ``match_phrase``.
Although this is more coherent,
it may cause difficulties in edge cases,
as it may lead to results different from the previous release.
This behaviour can be disabled using a new ``match_word_as_phrase`` parameter
to ``luqum.elasticsearch.visitor.ElasticsearchQueryBuilder``.
Note that this parameter may be removed in a future release
(``field_options`` might be used instead on a per-field basis).
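
As a rough illustration of the difference between the two query types, the sketch below builds the Elasticsearch query body for a single word in either form. The ``single_word_query`` helper is hypothetical and only shows the general shape of ``match`` versus ``match_phrase``; the JSON that luqum actually generates may carry additional parameters.

```python
def single_word_query(field: str, word: str, match_word_as_phrase: bool = False) -> dict:
    """Hypothetical helper: Elasticsearch query body for one word on an analyzed field."""
    # The flag name mirrors the parameter mentioned above; the logic is illustrative only.
    query_type = "match_phrase" if match_word_as_phrase else "match"
    return {query_type: {field: {"query": word}}}


print(single_word_query("title", "foo"))
print(single_word_query("title", "foo", match_word_as_phrase=True))
```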
`elastic named queries`__)

``luqum.elasticsearch.schema``

``field_options`` on ``luqum.elasticsearch.visitor.ElasticsearchQueryBuilder``
allows adding parameters to field queries.
It also allows controlling the type of query for match queries.

``match``, and not ``match_phrase``

``match_phrase`` has the ``zero_terms_query`` field, as for ``match``

__ https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-named-queries-and-filters.html

a minor release
(Note that 0.2 version was skipped)
``TreeVisitorV2`` easier to use

This was the initial release of Luqum.