Project: baron

Full Syntax Tree for python to make writing refactoring code a realist task

Project Details

Latest version
0.10.1
Home Page
https://github.com/PyCQA/baron
PyPI Page
https://pypi.org/project/baron/

Project Popularity

PageRank
0.002412473808432872
Number of downloads
56043

Introduction

Baron is a Full Syntax Tree (FST) library for Python. By opposition to an AST which drops some syntax information in the process of its creation (like empty lines, comments, formatting), a FST keeps everything and guarantees the operation fst_to_code(code_to_fst(source_code)) == source_code.

Roadmap

Current roadmap is as boring as needed:

  • bug fixs
  • new small features (walker pattern, maybe code generation) and performance improvement.

Installation

pip install baron

Basic Usage

from baron import parse, dumps

fst = parse(source_code_string)
source_code_string == dumps(fst)

Unless you want to do low level things, use RedBaron instead of using Baron directly. Think of Baron as the "bytecode of python source code" and RedBaron as some sort of usable layer on top of it.

If you don't know what Baron is or don't understand yet why it might be useful for you, read the « Why is this important? » section.

Documentation

Baron documentation is available on Read The Docs.

Contributing

If you want to implement new grammar elements for newer python versions, here are the documented steps for that: https://github.com/PyCQA/baron/blob/master/add_new_grammar.md

Also note that reviewing most grammar modifications takes several hours of advanced focusing (we can't really afford bugs here) so don't despair if you PR seems to be hanging around, sorry for that :/

And thanks in advance for your work!

Financial support

Baron and RedBaron are a very advanced piece of engineering that requires a lot of time of concentration to work on. Until the end of 2018, the development has been a full volunteer work mostly done by Bram, but now, to reach the next level and bring those projects to the stability and quality you expect, we need your support.

You can join our contributors and sponsors on our transparent OpenCollective, every contribution will count and will be mainly used to work on the projects stability and quality but also on continuing, on the side, the R&D side of those projects.

Our supporters

badge with number of supporters at tier I like this, keep going! badge with number of supporters at tier it looks cool! badge with number of supporters at tier Oh god, that saved me so much time!

Why is this important?

The usage of a FST might not be obvious at first sight so let's consider a series of problems to illustrate it. Let's say that you want to write a program that will:

  • rename a variable in a source file... without clashing with things that are not a variable (example: stuff inside a string)
  • inline a function/method
  • extract a function/method from a series of line of code
  • split a class into several classes
  • split a file into several modules
  • convert your whole code base from one ORM to another
  • do custom refactoring operation not implemented by IDE/rope
  • implement the class browser of smalltalk for python (the whole one where you can edit the code of the methods, not just showing code)

It is very likely that you will end up with the awkward feeling of writing clumpsy weak code that is very likely to break because you didn't thought about all the annoying special cases and the formatting keeps bothering you. You may end up playing with ast.py until you realize that it removes too much information to be suitable for those situations. You will probably ditch this task as simply too complicated and really not worth the effort. You are missing a good abstraction that will take care of all of the code structure and formatting for you so you can concentrate on your task.

The FST tries to be this abstraction. With it you can now work on a tree which represents your code with its formatting. Moreover, since it is the exact representation of your code, modifying it and converting it back to a string will give you back your code only modified where you have modified the tree.

Said in another way, what I'm trying to achieve with Baron is a paradigm change in which writing code that will modify code is now a realist task that is worth the price (I'm not saying a simple task, but a realistic one: it's still a complex task).

Other

Having a FST (or at least a good abstraction build on it) also makes it easier to do code generation and code analysis while those two operations are already quite feasible (using ast.py and a templating engine for example).

Some technical details

Baron produces a FST in the form of JSON (and by JSON I mean Python lists and dicts that can be dumped into JSON) for maximum interoperability.

Baron FST is quite similar to Python AST with some modifications to be more intuitive to humans, since Python AST has been made for CPython interpreter.

Since playing directly with JSON is a bit raw I'm going to build an abstraction on top of it that will looks like BeautifulSoup/jQuery.

State of the project

Currently, Baron has been tested on the top 100 projects and the FST converts back exactly into the original source code. So, it can be considered quite stable, but it is far away from having been battle tested.

Since the project is very young and no one is already using it except my project, I'm open to changes of the FST nodes but I will quickly become conservative once it gets some adoption and will probably accept to modify it only once or twice in the future with clear indications on how to migrate.

Baron is supporting python 2 grammar and up to python 3.7 grammar.

Tests

Run either py.test tests/ or nosetests in the baron directory.

Community

You can reach us on irc.freenode.net#baron or irc.freenode.net##python-code-quality.

Code of Conduct

As a member of PyCQA, Baron follows its Code of Conduct.

Misc

Old blog post announcing the project. Not that much up to date.

Changelog

0.10.1 (2021-12-08)

  • bug fix: in "a." the "." part was incorrectly recognized as a float, by bram

0.10 (2021-12-08)

  • bug fix: baron is now able to parse "class A(b, c=d): pass" by bram
  • some project cleaned and integration of tox with good pratices like flake8 and check-manifest
  • bug fix for missing edge case in inner formatting by EhsanKia https://github.com/PyCQA/baron/pull/156
  • complet support for float with underscores in them by tamentis https://github.com/PyCQA/baron/pull/157
  • bug fix for failure of parsing of "{**a}" by wavenator https://github.com/PyCQA/baron/pull/161

0.9 (2019-02-01)

First version of full python 3.7 grammar support.

  • BREAKING CHANGE: annotations are now member of {def,list,dict}_argument to flatten the data structure
  • add support for ... in from import by bram
  • add support for return annotation by bram
  • add support for exec function by bram
  • add support for variable annotation https://github.com/PyCQA/baron/pull/145 by scottbelden and additional work by bram
  • add support for *var expressions in tuple assignment by bram
  • add support for raise from https://github.com/PyCQA/baron/pull/120 by odcinek with additional work by bram
  • add support for arglist usage in class definition inheritence by bram
  • bug fix by https://github.com/PyCQA/baron/pull/126/commits/91e839a228293698cc755a7f28afeca2669cb66e kyleatmakrs

0.8 (2018-10-29)

  • add typed parameters support https://github.com/PyCQA/baron/pull/140 by Scott Belden and and additional work by bram

0.7 (2018-08-21)

  • fix line continuation https://github.com/PyCQA/baron/pull/92 by ibizaman
  • handle corrupt cache file situation https://github.com/PyCQA/baron/pull/76 by ryu2
  • fix special crashing edge case in indentation marker https://github.com/PyCQA/bar by Ahuge
  • fixed incorrect tokenization case "d*e-1". Fixes #85 https://github.com/PyCQA/baron/pull/107 by boxed
  • fix endl handling inside groupings by kyleatmakrs (extracted from https://github.com/PyCQA/baron/pull/126)

Python 3:

  • python 3 parsing extracted from https://github.com/PyCQA/baron/pull/126
  • support ellipsis https://github.com/PyCQA/baron/pull/121 by odcinek
  • support matrix operator https://github.com/PyCQA/baron/pull/117 by odcinek
  • support f-strings https://github.com/PyCQA/baron/pull/110 by odcinek
  • support numeric literals https://github.com/PyCQA/baron/pull/111 by odcinek
  • support nonlocal statement https://github.com/PyCQA/baron/pull/112 by odcinek
  • support keyword only markers https://github.com/PyCQA/baron/pull/108 by boxed
  • support yield from statement https://github.com/PyCQA/baron/pull/113 by odcinek and additional work by bram
  • support async/await statements https://github.com/PyCQA/baron/pull/114 by odcinek and additional work by bram

0.6.6 (2017-06-12)

  • fix situation where a deindented comment between a if and elif/else broken parsing, see https://github.com/PyCQA/baron/issues/87
  • around 35-40% to 75% parsing speed improvment on big files by duncf https://github.com/PyCQA/baron/pull/99

0.6.5 (2017-01-26)

  • fix previous regression fix was broken

0.6.4 (2017-01-14)

  • fix regression in case a comment follow the ":" of a if/def/other

0.6.3 (2017-01-02)

  • group formatting at start of file or preceded by space with comment

0.6.2 (2016-03-18)

  • fix race condition when generating parser cache file
  • make all user-facing errors inherit from the same BaronError class
  • fix: dotted_name and float_exponant_complex were missing from nodes_rendering_order

0.6.1 (2015-01-31)

  • fix: the string was having a greedy behavior on grouping the string tokens surrounding it (for string chains), this ends up creating an inconsistancy in the way string was grouped in general
  • fix: better number parsing handling, everything isn't fixed yet
  • make all (expected) errors inherit from the same BaronError class
  • fix: parsing fails correctly if a quoted string is not closed

0.6 (2014-12-11)

  • FST structure modification: def_argument_tuple is no more and all arguments now have a coherent structure:
    • def_argument node name attribute has been renamed to target, like in assign
    • target attribute now points to a dict, not to a string
    • old name -> string are now target -> name_node
    • def_argument_tuple is now a def_argument where target points to a tuple
    • this specific tuple will only has name and comma and tuple members (no more def_argument for name)
  • new node: long, before int and long where merged but that was causing problems

0.5 (2014-11-10)

  • rename "funcdef" node to "def" node to be way more intuitive.

0.4 (2014-09-29)

  • new rendering type in the nodes_rendering_order dictionary: string. This remove an ambiguity where a key could be pointing to a dict or a string, thus forcing third party tools to do guessing.

0.3.1 (2014-09-04)

  • setup.py wasn't working if wheel wasn't used because the CHANGELOG file wasn't included in the MANIFEST.in

0.3 (2014-08-21)

  • path becomes a simple list and is easier to deal with
  • bounding box allows you to know the left most and right most position of a node see https://baron.readthedocs.io/en/latest/#bounding-box
  • redbaron is classified as supporting python3 https://github.com/PyCQA/baron/pull/51
  • ensure than when a key is a string, it's empty value is an empty string and not None to avoid breaking libs that use introspection to guess the type of the key
  • key renaming in the FST: "delimiteur" -> "delimiter"
  • name_as_name and dotted_as_name node don't have the "as" key anymore as it was useless (it can be deduce from the state of the "target" key)
  • dotted_name node doesn't exist anymore, its existance was unjustified. In import, from_import and decorator node, it has been replaced from a key to a dict (with only a list inside of it) to a simple list.
  • dumps now accept a strict boolean argument to check the validity of the FST on dumping, but this isn't that much a public feature and should probably be changed of API in the futur
  • name_as_name and dotted_as_name empty value for target is now an empty string and not None since this is a string type key
  • boundingbox now includes the newlines at the end of a node
  • all raised exceptions inherit from a common base exception to ease try/catch constructions
  • Position's left and right functions become properties and thus attributes
  • Position objects can be compared to other Position objects or any iterables
  • make_position and make_bounding_box functions are deleted in favor of always using the corresponding class' constructor

0.2 (2014-06-11)

  • Baron now provides documentation on https://baron.readthedocs.io
  • feature: baron now run in python3 (but doesn't implement the full python3 grammar yet) by Pierre Penninckx https://github.com/ibizaman
  • feature: drop the usage of ast.py to find print_function, this allow any version of python to parse any other version of python also by Pierre Penninckx
  • fix: rare bug where a comment end up being confused as an indentation level
  • 2 new helpers: show_file and show_node, see https://baron.readthedocs.io/en/latest/#show-file and https://baron.readthedocs.io/en/latest/#show-node
  • new dictionary that provides the informations on how to render a FST node: nodes_rendering_order see https://baron.readthedocs.io/en/latest/#rendering-the-fst
  • new utilities to find a node, see https://baron.readthedocs.io/en/latest/#locate-a-node
  • new generic class that provide templates to work on the FST see https://baron.readthedocs.io/en/latest/#rendering-the-fst

0.1.3 (2014-04-13)

  • set sugar syntaxic notation wasn't handled by the dumper (apparently no one use this on pypi top 100)

0.1.2 (2014-04-08)

  • baron.dumps now accept a single FST node, it was only working with a list of FST nodes
  • don't add a endl node at the end if not present in the input string
  • de-uniformise call_arguments and function_arguments node, this is just creating more problems that anything else
  • fix https://github.com/PyCQA/redbaron/issues/4
  • fix the fact that baron can't parse "{1,}" (but "{1}" is working)

0.1.1 (2014-03-23)

  • It appears that I don't know how to write MANIFEST.in correctly

0.1 (2014-03-22)

  • Init