Productivity-centric Python Big Data Framework
Ibis is a Python library that provides a lightweight, universal interface for data wrangling. It helps Python users explore and transform data of any size, stored anywhere.
Ibis has three primary components:
Ibis aims to be a future-proof solution to interacting with data using Python and can accomplish this goal through its main features:
f-strings.
Ibis provides one syntax for multiple query engines and dataframe APIs, so you can avoid learning new flavors of SQL or other framework-specific code. Learn the syntax once and use it anywhere.
Ibis acts as a universal frontend to the following systems:
The list of supported backends is continuously growing, and anyone can get involved in adding new ones. Learn more about contributing to Ibis in our contributing documentation at https://github.com/ibis-project/ibis/blob/master/docs/CONTRIBUTING.md
Install Ibis from PyPI with:
pip install 'ibis-framework[duckdb]'
Or from conda-forge with:
conda install ibis-framework -c conda-forge
(It’s a common mistake to pip install ibis. If you try to use Ibis and get errors early on, try uninstalling ibis and installing ibis-framework.)
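If you are unsure which of the two similarly named packages ended up in your environment, a small check like the one below can help. This is a minimal sketch using only the standard library; the helper names `installed_version` and `diagnose_ibis_install` are hypothetical and not part of Ibis.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def diagnose_ibis_install():
    """Report which of the two similarly named PyPI distributions is installed."""
    framework = installed_version("ibis-framework")
    unrelated = installed_version("ibis")
    if unrelated is not None:
        # The unrelated `ibis` distribution is present and may shadow the real one.
        return "conflict: run `pip uninstall ibis`, then `pip install ibis-framework`"
    if framework is None:
        return "missing: run `pip install 'ibis-framework[duckdb]'`"
    return f"ok: ibis-framework {framework} is installed"


print(diagnose_ibis_install())
```

The check only looks at installed distribution metadata, so it works even when `import ibis` itself fails.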
To get started with Ibis, we suggest the DuckDB backend (which is included by default in the conda-forge package). The DuckDB backend is performant and fully featured.
To use Ibis with other backends, include the backend name in brackets for PyPI:
pip install 'ibis-framework[postgres]'
Or, when installing from conda-forge, use ibis-$BACKEND, where $BACKEND is the specific backend you want to use:
conda install ibis-postgres -c conda-forge
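The naming pattern for the two installers can be summarized in a few lines of Python. This is only an illustrative sketch: `install_commands` is a hypothetical helper, and it does not check that a matching extra or conda-forge package actually exists for the backend name you pass in.

```python
def install_commands(backend):
    """Build the PyPI and conda-forge install commands for one backend name."""
    return {
        # PyPI: the backend is an "extra" of the ibis-framework package.
        "pip": f"pip install 'ibis-framework[{backend}]'",
        # conda-forge: each backend has its own ibis-$BACKEND package.
        "conda": f"conda install ibis-{backend} -c conda-forge",
    }


for tool, command in install_commands("postgres").items():
    print(f"{tool}: {command}")
```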
We provide a number of tutorial and example notebooks in the ibis-examples repository. The easiest way to try these out is through the online interactive notebook environment provided here:
You can also get started analyzing any dataset, anywhere with just a few lines of Ibis code. Here’s an example of how to use Ibis with a SQLite database.
Download the SQLite database from the ibis-tutorial-data GCS (Google Cloud Storage) bucket, then connect to it using Ibis.
curl -LsS -o geography.db 'https://storage.googleapis.com/ibis-tutorial-data/geography.db'
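If curl is not available, the same file can be fetched from Python. The following is a minimal sketch using only the standard library; `download_database` is a hypothetical helper name, and it skips the download when the destination already exists.

```python
import urllib.request
from pathlib import Path

# Same URL as in the curl command above.
DB_URL = "https://storage.googleapis.com/ibis-tutorial-data/geography.db"


def download_database(url=DB_URL, dest="geography.db"):
    """Download the tutorial database unless the destination already exists."""
    path = Path(dest)
    if not path.exists():
        # urlretrieve follows redirects and writes the response body to disk.
        urllib.request.urlretrieve(url, str(path))
    return path
```

Calling `download_database()` with no arguments writes geography.db to the current directory, which is the path the connection step expects.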
Connect to the database and show the available tables:
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> con = ibis.sqlite.connect("geography.db")
>>> con.tables
Tables
------
- countries
- gdp
- independence
Choose the countries table and preview its first few rows:
>>> countries = con.tables.countries
>>> countries.head()
┏━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ iso_alpha2 ┃ iso_alpha3 ┃ iso_numeric ┃ fips   ┃ name                 ┃ capital          ┃ area_km2 ┃ population ┃ continent ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ string     │ string     │ int32       │ string │ string               │ string           │ float64  │ int32      │ string    │
├────────────┼────────────┼─────────────┼────────┼──────────────────────┼──────────────────┼──────────┼────────────┼───────────┤
│ AD         │ AND        │          20 │ AN     │ Andorra              │ Andorra la Vella │    468.0 │      84000 │ EU        │
│ AE         │ ARE        │         784 │ AE     │ United Arab Emirates │ Abu Dhabi        │  82880.0 │    4975593 │ AS        │
│ AF         │ AFG        │           4 │ AF     │ Afghanistan          │ Kabul            │ 647500.0 │   29121286 │ AS        │
│ AG         │ ATG        │          28 │ AC     │ Antigua and Barbuda  │ St. Johns        │    443.0 │      86754 │ NA        │
│ AI         │ AIA        │         660 │ AV     │ Anguilla             │ The Valley       │    102.0 │      13254 │ NA        │
└────────────┴────────────┴─────────────┴────────┴──────────────────────┴──────────────────┴──────────┴────────────┴───────────┘
Show the 5 least populous countries in Asia:
>>> (
...     countries.filter(_.continent == "AS")
...     .select("name", "population")
...     .order_by(_.population)
...     .limit(5)
... )
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ name                           ┃ population ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ string                         │ int32      │
├────────────────────────────────┼────────────┤
│ Cocos [Keeling] Islands        │        628 │
│ British Indian Ocean Territory │       4000 │
│ Brunei                         │     395027 │
│ Maldives                       │     395650 │
│ Macao                          │     449198 │
└────────────────────────────────┴────────────┘
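For readers who think in SQL, the filter, select, order_by, and limit chain above corresponds roughly to the query in the sketch below. The SQL here is hand-written for illustration and is not the output of Ibis's compiler. To stay self-contained, the sketch uses the standard-library sqlite3 module and an in-memory table holding only the five rows from the countries preview, so its result differs from the full-database output shown above.

```python
import sqlite3

# Sample rows copied from the `countries` preview (name, population, continent).
rows = [
    ("Andorra", 84000, "EU"),
    ("United Arab Emirates", 4975593, "AS"),
    ("Afghanistan", 29121286, "AS"),
    ("Antigua and Barbuda", 86754, "NA"),
    ("Anguilla", 13254, "NA"),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE countries (name TEXT, population INTEGER, continent TEXT)")
con.executemany("INSERT INTO countries VALUES (?, ?, ?)", rows)

# Hand-written equivalent of filter -> select -> order_by -> limit.
query = """
    SELECT name, population
    FROM countries
    WHERE continent = 'AS'
    ORDER BY population
    LIMIT 5
"""
result = con.execute(query).fetchall()
for name, population in result:
    print(f"{name}: {population}")
```

With Ibis, the same logic is written once as a Python expression and can be run against whichever supported backend holds the data.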
Ibis is an open source project and welcomes contributions from anyone in the community.
Join our community by interacting on GitHub or chatting with us on Zulip.
For more information visit https://ibis-project.org/.