Item Generator
This library aims to be a generic tool for generating STAC-like JSON documents. You can use it to generate fully STAC-compliant JSON, or to generate content which contains all the information needed to construct a valid STAC item.
This library works on the premise that you can build a processing chain for each of your datasets by chaining together different processors to extract the relevant information. The core facet extraction chain works on an atomic basis, where input plugins provide a single object for inspection and output a single JSON object. Item IDs are generated based on selected facets. It is then up to your downstream processing to aggregate this information together.
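The chaining idea above can be sketched in a few lines of plain Python. This is only an illustration, not the library's API: the processor functions, `run_chain`, `generate_item_id`, the use of an MD5 hash and the file name `example.nc` are all hypothetical names and choices made for this example.

```python
import hashlib
from typing import Callable, Dict, List

# Hypothetical processor type: takes one input path, returns a dict of facets.
Processor = Callable[[str], Dict[str, str]]


def filename_processor(path: str) -> Dict[str, str]:
    """Hypothetical processor: derive facets from the file name."""
    name = path.rsplit("/", 1)[-1]
    extension = name.rsplit(".", 1)[1] if "." in name else ""
    return {"filename": name, "extension": extension}


def parent_directory_processor(path: str) -> Dict[str, str]:
    """Hypothetical processor: use the containing directory as a facet."""
    parts = path.rstrip("/").split("/")
    parent = parts[-2] if len(parts) > 1 else ""
    return {"parent_directory": parent}


def run_chain(path: str, processors: List[Processor]) -> Dict[str, str]:
    """Run each processor on a single object and merge the facets it returns."""
    facets: Dict[str, str] = {}
    for processor in processors:
        facets.update(processor(path))
    return facets


def generate_item_id(facets: Dict[str, str], id_facets: List[str]) -> str:
    """Build an ID from the selected facets (the hashing scheme is an assumption)."""
    key = ".".join(facets[name] for name in id_facets)
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# "example.nc" is a made-up file name under the directory used in the sample configuration.
path = "/badc/faam/data/2005/b069-jan-05/example.nc"
facets = run_chain(path, [filename_processor, parent_directory_processor])
item_id = generate_item_id(facets, ["parent_directory"])
```

Because the ID here depends only on the selected facets, every file in the same directory yields the same item ID, which is what lets downstream processing group the per-file documents into one item.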
Datastores such as Elasticsearch can make use of upserts, which merge JSON documents that share an ID during indexing.
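To show what merging by upsert means for the documents this library produces, here is a minimal in-memory stand-in. It is not the Elasticsearch client; `upsert`, `deep_merge` and the document contents are hypothetical, and the recursive merge of nested objects is a simplified approximation of how a partial-document update behaves.

```python
from typing import Any, Dict


def deep_merge(target: Dict[str, Any], source: Dict[str, Any]) -> None:
    """Merge source into target in place; nested dicts are merged, other values replaced."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_merge(target[key], value)
        else:
            target[key] = value


def upsert(store: Dict[str, Dict[str, Any]], doc_id: str, partial: Dict[str, Any]) -> None:
    """Insert the document if the ID is new, otherwise merge it into the existing one."""
    existing = store.setdefault(doc_id, {})
    deep_merge(existing, partial)


# Two documents generated from different assets that resolve to the same item ID.
store: Dict[str, Dict[str, Any]] = {}
upsert(store, "hypothetical-item-id", {"properties": {"platform": "faam"}})
upsert(store, "hypothetical-item-id", {"properties": {"year": "2005"}})
```

After both calls the store holds a single document whose `properties` contain both facets, which is the aggregation step left to downstream processing.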
Read the Orientation guide as an introduction to the framework.
Installation
At present, not all the required libraries are available via package managers. To install, clone
the repository and install the dependencies from requirements.txt:
$ git clone https://github.com/cedadev/item-generator
$ cd item-generator
$ pip install -r requirements.txt
$ pip install .
Configuration
Configuration takes the form of a YAML-formatted file.
Option | Description |
---|---|
 | The Python import path to the extractor class. If not specified, the class installed with the entry point is used. |
Sample configuration
item_descriptions:
  root_directory: /etc/item-generator/item_descriptions/descriptions
inputs:
  - name: file_system
    path: /badc/faam/data/2005/b069-jan-05
outputs:
  - name: standard_out
    namespace: assets
  - name: standard_out
    namespace: facets
Configuration for the extraction pipelines is done separately. This could be stored in a different
repository to manage versions and additions from multiple sources. You could then clone or download
this repository and reference it using the item_descriptions.root_directory option.
These pipeline files are in the form of item description files.
These YAML files specify the processors to use to extract your desired facets.
Note
The item-generator outputs two things:
1. An item, including facets
2. An item ID to be applied to the asset.
These are separated using the namespace argument on the output plugin.
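As a rough illustration of how a namespace can separate the two kinds of output, the sketch below routes each result to the outputs configured for its namespace. The `dispatch` function, the shape of the `results` dict and the assumption that the item goes to the `facets` namespace while the item ID goes to `assets` are all hypothetical; only the output names and namespaces come from the sample configuration.

```python
from typing import Any, Callable, Dict, List

# The outputs section of the sample configuration, written as Python data.
outputs_config = [
    {"name": "standard_out", "namespace": "assets"},
    {"name": "standard_out", "namespace": "facets"},
]


def dispatch(
    results: Dict[str, Any],
    outputs: List[Dict[str, str]],
    write: Callable[[Dict[str, Any]], None] = print,
) -> List[str]:
    """Send each namespaced result to the outputs configured for that namespace."""
    delivered = []
    for output in outputs:
        namespace = output["namespace"]
        if namespace in results:
            write({"output": output["name"], "namespace": namespace, "body": results[namespace]})
            delivered.append(namespace)
    return delivered


# Hypothetical results for one asset: the item with its facets, and the ID for the asset.
results = {
    "facets": {"id": "hypothetical-item-id", "facets": {"parent_directory": "b069-jan-05"}},
    "assets": {"item_id": "hypothetical-item-id"},
}
delivered = dispatch(results, outputs_config)
```

Keeping the namespace on the output rather than on the result means the same output plugin can be listed twice, as in the sample configuration, and receive only the part intended for it.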
Usage
The tool is called using the asset_scanner command:
usage: asset_scanner [-h] conf

Run the asset scanner as configured

positional arguments:
  conf        Path to a yaml configuration file

optional arguments:
  -h, --help  show this help message and exit
Example:
$ asset_scanner conf/conf.yml
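For readers who want to call the same entry point from a script or a test, the command-line interface shown above can be approximated with the standard argparse module. This is a sketch that reproduces the documented usage message, not the library's own source; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented usage: asset_scanner [-h] conf."""
    parser = argparse.ArgumentParser(
        prog="asset_scanner",
        description="Run the asset scanner as configured",
    )
    parser.add_argument("conf", help="Path to a yaml configuration file")
    return parser


# Parse the same arguments as the example invocation.
args = build_parser().parse_args(["conf/conf.yml"])
```

The parsed `args.conf` value is the path that would then be opened and read as the YAML configuration.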
User guide: