Contains MLTable loading and authoring APIs for the mltable package.
MLTable is a Python package that provides fast, flexible data loading functions designed to make accessing tabular data easy and intuitive. MLTable abstracts the schema definition for tabular data so that the table is easier to materialize into a pandas DataFrame. MLTable works with delimited text files, Parquet files, Delta Lake tables, and JSON Lines files, stored either in a cloud object store or on local disk.
Here are a few things that mltable does well:
Flexible sampling and filtering functionality on large data
Robust IO tools for loading data from flat files (CSV and other delimited formats), Parquet files, Delta Lake tables, and JSON Lines files
Capturing and defining schema contained in flat files
Fast materialization of data into a pandas DataFrame
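As a rough illustration of the loading, sampling, and materialization steps listed above, here is a minimal sketch. The helper name preview_csv and its parameters are hypothetical; the mltable calls used (from_delimited_files, take_random_sample, take, to_pandas_dataframe) are taken from the package's public API as commonly documented, so verify them against the version you have installed.

```python
def preview_csv(csv_path, n_rows=5, sample_probability=None, seed=42):
    """Load a delimited file with mltable and return a small pandas DataFrame.

    Hypothetical helper for illustration only; not part of the mltable package.
    """
    # Imported lazily so this module still loads where mltable is not installed.
    import mltable

    # Describe the data source; the data itself is read when it is materialized.
    tbl = mltable.from_delimited_files(paths=[{"file": csv_path}])

    # Optionally keep each row with the given probability.
    if sample_probability is not None:
        tbl = tbl.take_random_sample(probability=sample_probability, seed=seed)

    # Keep at most the first n_rows rows, then materialize into pandas.
    tbl = tbl.take(n_rows)
    return tbl.to_pandas_dataframe()
```

Calling preview_csv("./data.csv", n_rows=10) would be expected to return up to ten rows as a DataFrame, assuming the file exists and mltable is installed.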
You can install the MLTable package via pip:
pip install mltable
Note that the MLTable package is pre-installed on AzureML compute instances.
The official documentation is hosted under "Working with tables".
An MLTable artifact's metadata file is named MLTable and adheres to the AzureML MLTable schema.
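To make the metadata file more concrete, the sketch below writes a small MLTable file by hand using only the standard library. The helper name write_mltable_file, the ./data.csv path, and the particular read_delimited settings are assumptions chosen for illustration; consult the AzureML MLTable schema for the authoritative set of keys and values.

```python
from pathlib import Path

# Illustrative MLTable contents: one CSV file parsed as comma-delimited text.
# The keys follow the MLTable schema as commonly documented; treat them as
# assumptions and check them against the schema before relying on them.
MLTABLE_YAML = """\
type: mltable
paths:
  - file: ./data.csv
transformations:
  - read_delimited:
      delimiter: ","
      header: all_files_same_headers
"""


def write_mltable_file(folder):
    """Write the illustrative metadata to <folder>/MLTable and return its path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    # The metadata file is named exactly "MLTable", with no extension.
    target = folder / "MLTable"
    target.write_text(MLTABLE_YAML, encoding="utf-8")
    return target
```

The folder containing this file, together with the data it points to, is what makes up the MLTable artifact.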
MLTable.save() supports cloud storage. Please find more details here.
from_delta_lake supports pulling the latest version by default
support_multi_line issue for MLTable.from_delimited_files