extraction_methods.plugins.assets.backends package

Submodules

extraction_methods.plugins.assets.backends.elasticsearch module

Elasticsearch Assets Backend

class extraction_methods.plugins.assets.backends.elasticsearch.ElasticsearchAssets(*args: Any, **kwargs: Any)

Bases: Backend

Method: elasticsearch_assets

Description:

Using an ID, generate a summary of information for higher-level entities.

Configuration Options:

- ``index``: Name of the index holding the STAC entities.
- ``id_term``: Term used for aggregating the STAC entities.
- ``client_kwargs``: Connection parameters passed to
  `elasticsearch.Elasticsearch <https://elasticsearch-py.readthedocs.io/en/7.10.0/api.html>`_.
- ``bbox``: List of terms whose aggregate bounding box should be returned.
- ``min``: List of terms whose aggregate minimum should be returned.
- ``max``: List of terms whose aggregate maximum should be returned.
- ``sum``: List of terms whose aggregate sum should be returned.
- ``list``: List of terms whose aggregated values should be returned as a list.
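The aggregation options listed above correspond closely to Elasticsearch's built-in aggregations. The following is a minimal sketch, not the backend's actual query builder, of how such a request body could be assembled. The function name, the ``<term>_<kind>`` naming scheme, and the choice of ``geo_bounds`` for ``bbox`` are assumptions made for illustration.

```python
from typing import Any, Iterable


def build_aggregation_body(
    id_term: str,
    entity_id: str,
    minimum: Iterable[str] = (),
    maximum: Iterable[str] = (),
    total: Iterable[str] = (),
    listing: Iterable[str] = (),
    bbox: Iterable[str] = (),
) -> dict[str, Any]:
    """Build a search body that aggregates all documents sharing one ID.

    Hypothetical helper: the naming scheme for aggregations is an assumption.
    """
    aggs: dict[str, Any] = {}
    for term in minimum:
        aggs[f"{term}_min"] = {"min": {"field": term}}
    for term in maximum:
        aggs[f"{term}_max"] = {"max": {"field": term}}
    for term in total:
        aggs[f"{term}_sum"] = {"sum": {"field": term}}
    for term in listing:
        # A terms aggregation returns the distinct values as buckets.
        aggs[f"{term}_list"] = {"terms": {"field": term}}
    for term in bbox:
        # geo_bounds computes the bounding box of a geo field.
        aggs[f"{term}_bbox"] = {"geo_bounds": {"field": term}}

    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "query": {"term": {id_term: entity_id}},
        "aggs": aggs,
    }


example_body = build_aggregation_body(
    "item_id", "abc123", minimum=["start_datetime"], bbox=["geometry"]
)
```

Setting ``size`` to 0 keeps the response small, since only the aggregation results are of interest in a summary.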

Configuration Example:

.. code-block:: yaml

    - name: elasticsearch
      inputs:
        index: ceda-index
        id_term: item_id
        client_kwargs:
          hosts: ['host1:9200', 'host2:9200']
        fields:
          - roles

input_class

alias of ElasticsearchAssetsInput

run(body: dict[str, Any]) → Any

Run the backend.

Parameters:

body (dict) – current generated properties

Returns:

updated body dict

Return type:

dict

class extraction_methods.plugins.assets.backends.elasticsearch.ElasticsearchAssetsInput(*, exists_key: str = '$', exists_delimiter: str = '.', index: str, client_kwargs: dict[str, Any] = {}, request_timeout: int = 60, regex: str, search_field: str, href_term: str = 'path', extra_fields: list[KeyOutputKey] = [])

Bases: Input

Model for Elasticsearch Assets Backend Input.

client_kwargs: dict[str, Any]
extra_fields: list[KeyOutputKey]
href_term: str
index: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'client_kwargs': FieldInfo(annotation=dict[str, Any], required=False, default={}, description='Elasticsearch connection kwargs.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'extra_fields': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='term for method to output to.'), 'href_term': FieldInfo(annotation=str, required=False, default='path', description='term to use for href.'), 'index': FieldInfo(annotation=str, required=True, description='Elasticsearch index to search on.'), 'regex': FieldInfo(annotation=str, required=True, description='Regex to test against.'), 'request_timeout': FieldInfo(annotation=int, required=False, default=60, description='Request timeout for search.'), 'search_field': FieldInfo(annotation=str, required=True, description='Term to search for regex on.')}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo.

This replaces Model.__fields__ from Pydantic V1.

regex: str
request_timeout: int
search_field: str
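The field descriptions above suggest that the backend searches ``index`` for documents whose ``search_field`` matches ``regex``. The following is a minimal sketch of how those inputs could be turned into search arguments. It is not the backend's implementation: the dataclass is a hypothetical stand-in for the Pydantic model, and the use of an Elasticsearch ``regexp`` query is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AssetsSearchInputs:
    """Hypothetical stand-in mirroring the documented input fields."""

    index: str
    regex: str
    search_field: str
    href_term: str = "path"
    request_timeout: int = 60
    client_kwargs: dict[str, Any] = field(default_factory=dict)


def build_search_kwargs(inputs: AssetsSearchInputs) -> dict[str, Any]:
    """Assemble keyword arguments for a regexp search (assumed query type)."""
    return {
        "index": inputs.index,
        "body": {
            "query": {
                "regexp": {inputs.search_field: {"value": inputs.regex}}
            }
        },
        "request_timeout": inputs.request_timeout,
    }


inputs = AssetsSearchInputs(
    index="ceda-index",
    regex="/data/example/.*",  # hypothetical pattern
    search_field="path",       # hypothetical field name
)
search_kwargs = build_search_kwargs(inputs)
```

With the 7.x Python client, a dictionary of this shape could be unpacked into ``Elasticsearch(**inputs.client_kwargs).search(**search_kwargs)``, and each hit's ``href_term`` value could then serve as the asset's href.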

extraction_methods.plugins.assets.backends.intake_esm module

extraction_methods.plugins.assets.backends.regex module

Regex Assets Backend

class extraction_methods.plugins.assets.backends.regex.RegexAssets(*args: Any, **kwargs: Any)

Bases: Backend

Method: regex_assets

Description:

Takes a regular expression and yields a dictionary for each matching path.

Configuration Options:

- ``input_term``: The regular expression to match against the path.

Example configuration:

.. code-block:: yaml

    - method: regex
      inputs:
        input_term: ^(?:[^_]*_){2}(?P<datetime>\d*)
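To see what the example pattern captures, the sketch below applies it to a path with Python's ``re`` module. It assumes the named group is meant to match digits (``\d*``). The helper name and the sample file name are hypothetical, and the backend's own matching logic is not shown in this reference.

```python
import re

# Pattern from the example configuration, assuming the digit class ``\d``.
PATTERN = re.compile(r"^(?:[^_]*_){2}(?P<datetime>\d*)")


def extract_terms(path: str) -> dict[str, str]:
    """Return the named groups captured from ``path``, or an empty dict."""
    match = PATTERN.search(path)
    return match.groupdict() if match else {}


# Hypothetical file name: two underscore-delimited prefixes, then a timestamp.
terms = extract_terms("sensor_level2_20200101_v1.nc")
print(terms)  # {'datetime': '20200101'}
```

The non-capturing group ``(?:[^_]*_){2}`` skips the first two underscore-delimited segments, so the named group starts at the third segment.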

input_class

alias of RegexAssetsInput

run(body: dict[str, Any]) → Any

Run the backend.

Parameters:

body (dict) – current generated properties

Returns:

updated body dict

Return type:

dict

class extraction_methods.plugins.assets.backends.regex.RegexAssetsInput(*, exists_key: str = '$', exists_delimiter: str = '.', input_term: str = '$uri')

Bases: Input

Model for Regex Assets Backend Input.

input_term: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'input_term': FieldInfo(annotation=str, required=False, default='$uri', description='term for method to run on.')}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo.

This replaces Model.__fields__ from Pydantic V1.
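The ``exists_key`` and ``exists_delimiter`` fields, together with the ``$uri`` default for ``input_term``, suggest that a term prefixed with ``$`` refers to a previously extracted value, with ``.`` separating nested keys. The sketch below illustrates one way such a lookup could work. The function and its behaviour are assumptions for illustration, not the library's resolution logic.

```python
from typing import Any


def resolve_term(
    term: str,
    body: dict[str, Any],
    exists_key: str = "$",
    exists_delimiter: str = ".",
) -> Any:
    """Resolve ``term`` against ``body`` if it starts with ``exists_key``.

    Hypothetical helper: terms without the prefix are returned unchanged.
    """
    if not term.startswith(exists_key):
        return term

    value: Any = body
    # Walk down the nested dictionaries one key at a time.
    for part in term[len(exists_key):].split(exists_delimiter):
        value = value[part]
    return value


# Hypothetical body of previously extracted properties.
body = {
    "uri": "/data/sensor_level2_20200101_v1.nc",
    "properties": {"platform": "sentinel"},
}

uri = resolve_term("$uri", body)
platform = resolve_term("$properties.platform", body)
literal = resolve_term("plain-value", body)
```

In this sketch a missing key raises ``KeyError``; a real implementation might prefer to return a default or report the unresolved term.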

Module contents