A Python package for common user tasks (e.g. copy examples, fetch data, ...)
A utility package that includes:
pyct.cmd: Makes various commands available to other packages. (There is currently no sophisticated plugin system, just a try import/except in the other packages.) The same commands are available from within Python. It can either add new subcommands to an existing argparse-based command, if the module already has one, or create the entire command if the module has none. Currently, there are commands for copying examples and fetching data.
pyct.build: Provides various commands to help package building, primarily as a convenience for project maintainers.
To install pyct with the dependencies required for pyct.cmd: pip install pyct[cmd] or conda install -c pyviz pyct.
An example of how to use pyct.cmd in a project: https://github.com/holoviz/geoviews/blob/main/geoviews/main.py
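The "try import/except" approach mentioned earlier can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code of pyct or of the linked project: the prog name mypackage, the helper names load_optional and build_parser, and the stand-in registration loop are all hypothetical, and only the subcommand names come from this README.

```python
import argparse
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def build_parser(prog, helper_module="pyct.cmd"):
    """Build a command with one built-in subcommand plus optional extras."""
    parser = argparse.ArgumentParser(prog=prog)
    subparsers = parser.add_subparsers(dest="command")
    # A subcommand the hypothetical package already had.
    subparsers.add_parser("version", help="show the package version")

    if load_optional(helper_module) is not None:
        # Stand-in registration (assumption): a real helper module would
        # supply these commands itself; here they are only declared.
        for name in ("examples", "copy-examples", "fetch-data"):
            sub = subparsers.add_parser(name)
            sub.add_argument("--path", default=None)
    return parser
```

If the optional module cannot be imported, the package keeps working with its own subcommands only, which is the point of guarding the import.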
Once added, users can copy the examples of a package and download the required data with the examples command:
$ datashader examples --help
usage: datashader examples [-h] [--path PATH] [-v] [--force] [--use-test-data]

optional arguments:
  -h, --help       show this help message and exit
  --path PATH      location to place examples and data
  -v, --verbose
  --force          if PATH already exists, force overwrite existing examples
                   if older than source examples. ALSO force any existing data
                   files to be replaced
  --use-test-data  Use data's test files, if any, instead of fetching full
                   data. If test file not in '.data_stubs', fall back to
                   fetching full data.
To copy the examples of e.g. datashader without downloading the data, there is a copy-examples command:
usage: datashader copy-examples [-h] [--path PATH] [-v] [--force]

optional arguments:
  -h, --help     show this help message and exit
  --path PATH    where to copy examples
  -v, --verbose
  --force        if PATH already exists, force overwrite existing files if
                 older than source files
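One way to read the --force description above ("overwrite existing files if older than source files") is as a comparison of modification times. The sketch below illustrates that reading only; copy_tree_if_newer is a hypothetical helper, not pyct's implementation, and the exact rule pyct applies when PATH exists without --force is not stated here.

```python
import os
import shutil


def copy_tree_if_newer(src, dst, force=False):
    """Copy files from src to dst; return the list of files written.

    Existing files are left alone unless force is set and the source
    file has a newer modification time than the destination file.
    """
    written = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for fname in files:
            s = os.path.join(root, fname)
            d = os.path.join(target_dir, fname)
            if os.path.exists(d):
                # Skip unless forced AND the source is strictly newer.
                if not force or os.path.getmtime(s) <= os.path.getmtime(d):
                    continue
            shutil.copy2(s, d)  # copy2 preserves the modification time
            written.append(d)
    return written
```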
To download only the data, use the fetch-data command:
usage: datashader fetch-data [-h] [--path PATH] [--datasets DATASETS] [-v]
                             [--force] [--use-test-data]

optional arguments:
  -h, --help           show this help message and exit
  --path PATH          where to put data
  --datasets DATASETS  *name* of datasets file; must exist either in path
                       specified by --path or in package/examples/
  -v, --verbose
  --force              Force any existing data files to be replaced
  --use-test-data      Use data's test files, if any, instead of fetching full
                       data. If test file not in '.data_stubs', fall back to
                       fetching full data.
A different 'datasets' file can be specified:
$ cat earthsim-examples/test.yml
---
data:
  - url: http://s3.amazonaws.com/datashader-data/Chesapeake_and_Delaware_Bays.zip
    title: 'Depth data for the Chesapeake and Delaware Bay region of the USA'
    files:
      - Chesapeake_and_Delaware_Bays.3dm
$ earthsim fetch-data --path earthsim-examples --datasets test.yml
Downloading data defined in /tmp/earthsim-examples/test.yml to /tmp/earthsim-examples/data
Skipping Depth data for the Chesapeake and Delaware Bay region of the USA
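The "Skipping" line in the output above suggests that an entry is not downloaded again when its files are already present. A minimal sketch of that decision follows, assuming the datasets file has already been parsed into a Python dict with the same shape as the YAML shown; plan_downloads is a hypothetical helper, not part of pyct, and the exact conditions pyct checks are an assumption.

```python
import os


def plan_downloads(datasets, data_dir, force=False):
    """Split dataset entries into (to_fetch, to_skip) lists of titles."""
    to_fetch, to_skip = [], []
    for entry in datasets["data"]:
        # An entry counts as present only if every listed file exists.
        present = all(
            os.path.exists(os.path.join(data_dir, f)) for f in entry["files"]
        )
        if present and not force:
            to_skip.append(entry["title"])
        else:
            to_fetch.append(entry["title"])
    return to_fetch, to_skip


# The structure a YAML loader would produce for the test.yml shown above.
datasets = {
    "data": [
        {
            "url": "http://s3.amazonaws.com/datashader-data/"
                   "Chesapeake_and_Delaware_Bays.zip",
            "title": "Depth data for the Chesapeake and Delaware Bay "
                     "region of the USA",
            "files": ["Chesapeake_and_Delaware_Bays.3dm"],
        }
    ]
}
```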
Smaller files can be used instead of large ones by passing the --use-test-data flag and placing a small file with the same name in examples/data/.data_stubs:
$ tree examples/data -a
examples/data
├── .data_stubs
│   └── nyc_taxi_wide.parq
└── diamonds.csv
$ cat examples/datasets.yml
data:
  - url: http://s3.amazonaws.com/datashader-data/nyc_taxi_wide.parq
    title: 'NYC Taxi Data'
    files:
      - nyc_taxi_wide.parq
  - url: http://s3.amazonaws.com/datashader-data/maccdc2012_graph.zip
    title: 'National CyberWatch Mid-Atlantic Collegiate Cyber Defense Competition'
    files:
      - maccdc2012_nodes.parq
      - maccdc2012_edges.parq
      - maccdc2012_full_nodes.parq
      - maccdc2012_full_edges.parq
$ pyviz fetch-data --path=examples --use-test-data
Fetching data defined in /tmp/pyviz/examples/datasets.yml and placing in /tmp/pyviz/examples/data
Copying test data file '/tmp/pyviz/examples/data/.data_stubs/nyc_taxi_wide.parq' to '/tmp/pyviz/examples/data/nyc_taxi_wide.parq'
No test file found for: /tmp/pyviz/examples/data/.data_stubs/maccdc2012_nodes.parq. Using regular file instead
Downloading National CyberWatch Mid-Atlantic Collegiate Cyber Defense Competition 1 of 1
[################################] 59/59 - 00:00:00
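The fallback visible in the log above (copy the stub if one exists, otherwise fetch the full file) can be sketched as below. This is an illustration of the described behaviour only: place_test_data and its fetch_full callback are hypothetical names, and the per-file granularity is an assumption, since the log suggests pyct may decide per dataset entry.

```python
import os
import shutil


def place_test_data(filenames, data_dir, fetch_full):
    """Use stubs from data_dir/.data_stubs where available.

    fetch_full is a caller-supplied function (hypothetical here) that
    obtains the full-size file when no stub with that name exists.
    Returns (copied, fetched) lists of file names.
    """
    stubs_dir = os.path.join(data_dir, ".data_stubs")
    copied, fetched = [], []
    for name in filenames:
        stub = os.path.join(stubs_dir, name)
        if os.path.isfile(stub):
            shutil.copyfile(stub, os.path.join(data_dir, name))
            copied.append(name)
        else:
            fetch_full(name)
            fetched.append(name)
    return copied, fetched
```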
To clean up any potential test files masquerading as real data, use clean-data:
usage: pyviz clean-data [-h] [--path PATH]
optional arguments:
-h, --help show this help message and exit
--path PATH where to clean data
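How clean-data recognises a test file is not spelled out here. One plausible criterion, shown purely as an assumption, is to remove any data file that is byte-for-byte identical to the stub of the same name; remove_stub_copies is a hypothetical helper, not pyct's implementation.

```python
import filecmp
import os


def remove_stub_copies(data_dir):
    """Delete files in data_dir that are identical to their stub.

    Assumption: a 'test file masquerading as real data' is a file whose
    contents match data_dir/.data_stubs/<same name>. Returns the names
    of the files removed.
    """
    stubs_dir = os.path.join(data_dir, ".data_stubs")
    removed = []
    if not os.path.isdir(stubs_dir):
        return removed
    for name in sorted(os.listdir(stubs_dir)):
        stub = os.path.join(stubs_dir, name)
        candidate = os.path.join(data_dir, name)
        # shallow=False forces a content comparison, not just a stat check.
        if os.path.isfile(candidate) and filecmp.cmp(stub, candidate, shallow=False):
            os.remove(candidate)
            removed.append(name)
    return removed
```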
pyct.build currently provides a way to package examples with a project, by copying an examples folder into the package directory whenever setup.py is run. The way this works is likely to change in the near future, but it is provided here as a first step towards unifying and simplifying the maintenance of a number of pyviz projects.
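The copying step described above amounts to something like the following sketch. It is not pyct.build's code: bundle_examples is a hypothetical helper, the examples folder is assumed to sit at the project root, and dirs_exist_ok requires Python 3.8 or later.

```python
import os
import shutil


def bundle_examples(project_root, package_name):
    """Copy <root>/examples into <root>/<package>/examples; return the target."""
    source = os.path.join(project_root, "examples")
    target = os.path.join(project_root, package_name, "examples")
    if not os.path.isdir(source):
        raise FileNotFoundError(source)
    # dirs_exist_ok lets repeated builds refresh an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

A project's setup.py could call such a helper before invoking setup(), so the examples ship inside the installed package.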
pyct also provides a way to check the package versions in the current environment, using pyct report [packages] on the command line, or import pyct; pyct.report(packages) from Python.
The Python function can be particularly useful for e.g. Jupyter notebook users, since it is usually the packages in the current kernel that matter (not those in the environment from which the Jupyter notebook server or lab was launched).
Note that packages above can include the name of any Python package (returning its __version__), along with the special cases python or conda (returning the version of the command-line tool) or system (returning the OS version).
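To make the lookup rules above concrete, here is a rough standard-library sketch of the same idea. It is not pyct.report itself: report_versions is a hypothetical function, its return format is invented for illustration, and the conda case is left out because it would require calling an external command.

```python
import importlib
import platform


def report_versions(*packages):
    """Return a {name: version string} dict for the given names."""
    versions = {}
    for name in packages:
        if name == "python":
            versions[name] = platform.python_version()
        elif name == "system":
            versions[name] = platform.platform()
        else:
            try:
                module = importlib.import_module(name)
                # Not every package defines __version__.
                versions[name] = getattr(module, "__version__", "unknown")
            except ImportError:
                versions[name] = "not installed"
    return versions
```

Because the lookup runs in the calling interpreter, calling it from a notebook cell reflects the kernel's environment, which is the case the text above highlights.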