extraction_methods.plugins package
Subpackages
Submodules
extraction_methods.plugins.bbox module
Bounding Box Method
- class extraction_methods.plugins.bbox.BboxExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: bbox
- Description:
- Converts coordinate values to an RFC 7946, section 5 formatted bbox.
 - Configuration Options:
   - ``west``: ``REQUIRED`` Most westerly coordinate
   - ``south``: ``REQUIRED`` Most southerly coordinate
   - ``east``: ``REQUIRED`` Most easterly coordinate
   - ``north``: ``REQUIRED`` Most northerly coordinate
 - Example Configuration:
   .. code-block:: yaml

      - method: bbox
        inputs:
          west: 0
          south: 0
          east: $east_variable
          north: $north_variable
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
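The conversion described above can be illustrated with a minimal sketch. It is not the plugin's implementation: the helper name `to_bbox` and the `bbox` output key are assumptions made for illustration. The only fixed point is the RFC 7946, section 5 ordering of a 2D bbox, which is `[west, south, east, north]`.

```python
from typing import Any


def to_bbox(west: float, south: float, east: float, north: float) -> list[float]:
    """Return a 2D bbox in RFC 7946 section 5 order: [west, south, east, north]."""
    return [float(west), float(south), float(east), float(north)]


# Hypothetical body holding previously extracted terms.
body: dict[str, Any] = {"east_variable": 10.5, "north_variable": 20.0}

# "bbox" as the output key is an assumption, not taken from the plugin.
body["bbox"] = to_bbox(0, 0, body["east_variable"], body["north_variable"])
print(body["bbox"])  # [0.0, 0.0, 10.5, 20.0]
```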
 
 
- class extraction_methods.plugins.bbox.BboxInput(*, exists_key: str = '$', exists_delimiter: str = '.', west: float | str, south: float | str, east: float | str, north: float | str)
- Bases: Input
- Model for BBox Method Input.
 - east: float | str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'east': FieldInfo(annotation=Union[float, str], required=True, description='east coordinate.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'north': FieldInfo(annotation=Union[float, str], required=True, description='north coordinate.'), 'south': FieldInfo(annotation=Union[float, str], required=True, description='south coordinate.'), 'west': FieldInfo(annotation=Union[float, str], required=True, description='west coordinate.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - north: float | str
 - south: float | str
 - west: float | str
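Every input model carries `exists_key` (default `'$'`) and `exists_delimiter` (default `'.'`), which mark references to previously extracted terms such as `$east_variable`. The sketch below shows one way such references could be resolved against the body. The function `resolve_term` is hypothetical and the nested lookup behaviour is an assumption based only on the field descriptions.

```python
from typing import Any


def resolve_term(
    value: Any,
    body: dict[str, Any],
    exists_key: str = "$",
    exists_delimiter: str = ".",
) -> Any:
    """Replace an ``exists_key``-prefixed string with the matching value from ``body``."""
    if not (isinstance(value, str) and value.startswith(exists_key)):
        return value  # literal values pass through unchanged

    current: Any = body
    # Walk nested dictionaries, one delimiter-separated part at a time.
    for part in value[len(exists_key):].split(exists_delimiter):
        current = current[part]  # raises KeyError if the term is missing
    return current


body = {"east_variable": 10.5, "extent": {"north": 20.0}}
print(resolve_term("$east_variable", body))  # 10.5
print(resolve_term("$extent.north", body))   # 20.0
print(resolve_term(0, body))                 # 0
```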
 
extraction_methods.plugins.ceda_observation module
CEDA Observation Method
- class extraction_methods.plugins.ceda_observation.CEDAObservationExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: ceda_observation
- Description:
- Returns a CEDA observation record for the input_term.
 - Configuration Options:
   - ``input_term``: ``REQUIRED`` term for method to run on
 - Example Configuration:
   .. code-block:: yaml

      - method: ceda_observation
        inputs:
          input_term: $url
 - input_class
- alias of - CEDAObservationInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
 
 
- class extraction_methods.plugins.ceda_observation.CEDAObservationInput(*, exists_key: str = '$', exists_delimiter: str = '.', input_term: str = '$uri', request_timeout: int = 15, output_key: str = 'uuid')
- Bases: Input
- Model for CEDA Observation Method Input.
 - input_term: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'input_term': FieldInfo(annotation=str, required=False, default='$uri', description='term for method to run on.'), 'output_key': FieldInfo(annotation=str, required=False, default='uuid', description='key to output to.'), 'request_timeout': FieldInfo(annotation=int, required=False, default=15, description='request time out.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 - request_timeout: int
 
extraction_methods.plugins.ceda_vocabulary module
CEDA Vocabulary Method
- class extraction_methods.plugins.ceda_vocabulary.CEDAVocabularyExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: ceda_vocabulary
- Description:
- Validates and sorts properties into vocabs and generates the general vocab for specified properties.
 - Configuration Options:
   - ``url``: ``REQUIRED`` url of vocabulary server
   - ``namespace``: ``REQUIRED`` namespace of vocab for terms
   - ``terms``: Terms to be validated
   - ``strict``: Boolean on whether values should be validated
   - ``request_timeout``: request time out
 - Example configuration:
   .. code-block:: yaml

      - method: ceda_vocabulary
        inputs:
          url: vocab.ceda.ac.uk
          namespace: cmip6
          strict: False
          terms:
            - start_time
            - model
 
 - input_class
- alias of - CEDAVocabularyInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
 
 
- class extraction_methods.plugins.ceda_vocabulary.CEDAVocabularyInput(*, exists_key: str = '$', exists_delimiter: str = '.', url: str, namespace: str, strict: bool = False, terms: list[str] = [], request_timeout: int = 15)
- Bases: Input
- Model for CEDA Vocab Method Input.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'namespace': FieldInfo(annotation=str, required=True, description='Namespace for vocab terms.'), 'request_timeout': FieldInfo(annotation=int, required=False, default=15, description='request time out.'), 'strict': FieldInfo(annotation=bool, required=False, default=False, description='True if values should be validated.'), 'terms': FieldInfo(annotation=list[str], required=False, default=[], description='terms to be validated.'), 'url': FieldInfo(annotation=str, required=True, description='URL of vocabulary server.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - namespace: str
 - request_timeout: int
 - strict: bool
 - terms: list[str]
 - url: str
 
extraction_methods.plugins.controlled_vocabulary module
Controlled Vocabulary Method
- class extraction_methods.plugins.controlled_vocabulary.ControlledVocabularyExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: controlled_vocabulary
- Description:
- Compare properties to a controlled vocabulary defined by a pydantic.BaseModel.
 - Configuration Options:
   - ``model``: pydantic.BaseModel subclass to be imported at run-time, e.g. `package.module.class_name`
   - ``strict``: If True, raise ValidationError, otherwise simply log ValidationError messages
 - Example Configuration:
   - name: controlled_vocabulary
     inputs:
       model: my_cv.collections.CMIP5
       strict: False
 - input_class
- alias of - ControlledVocabularyInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
 
 
- class extraction_methods.plugins.controlled_vocabulary.ControlledVocabularyInput(*, exists_key: str = '$', exists_delimiter: str = '.', model: str, strict: bool = False)
- Bases: Input
- Model for Controlled Vocabulary Method Input.
 - model: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'model': FieldInfo(annotation=str, required=True, description='pydantic.BaseModel subclass to be imported at run-time, e.g. `package.module.class_name`.'), 'strict': FieldInfo(annotation=bool, required=False, default=False, description='If True, raise ValidationError, otherwise simply log ValidationError messages.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - strict: bool
 
extraction_methods.plugins.datetime_bound_to_centroid module
Datetime Bound to Centroid Method
- class extraction_methods.plugins.datetime_bound_to_centroid.DatetimeBoundToCentroidExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: datetime_bound_to_centroid
- Description:
- Accepts start and end datetime bounds and converts them to their centroid (midpoint) datetime.
 - Configuration Options:
   - ``start_datetime``: Start datetime bound
   - ``start_format``: Format of the start datetime
   - ``end_datetime``: End datetime bound
   - ``end_format``: Format of the end datetime
   - ``output_key``: Term for method to output to
   - ``output_format``: Format of the output datetime
 - Example Configuration:
   .. code-block:: yaml

      - method: datetime_bound_to_centroid
        inputs:
          start_datetime: $start_date
          end_datetime: 2022-02-02
          end_format: "%Y-%m-%d"
          output_key: polygon
 - input_class
- alias of - DatetimeBoundToCentroidInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
 
 - strip_time(datetime_str: str, datetime_format: str) datetime
- strip datetime from value. - Parameters:
- datetime_str (str) – string to convert to datetime 
- datetime_format (str) – format of datetime string 
 
- Returns:
- datetime object 
- Return type:
- datetime 
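As a rough illustration of turning two datetime bounds into a centroid, the sketch below parses both bounds, takes the midpoint, and formats it. The function `datetime_centroid` is hypothetical. Its default formats mirror the defaults of `DatetimeBoundToCentroidInput`, and using the simple midpoint as the centroid is an assumption.

```python
from datetime import datetime


def datetime_centroid(
    start_datetime: str,
    end_datetime: str,
    start_format: str = "%Y-%m-%dT%H:%M:%S",
    end_format: str = "%Y-%m-%dT%H:%M:%S",
    output_format: str = "%Y-%m-%dT%H:%M:%SZ",
) -> str:
    """Return the midpoint between two datetime strings as a formatted string."""
    start = datetime.strptime(start_datetime, start_format)
    end = datetime.strptime(end_datetime, end_format)
    # Half of the interval added to the start gives the midpoint.
    centroid = start + (end - start) / 2
    return centroid.strftime(output_format)


print(datetime_centroid("2022-01-31T00:00:00", "2022-02-02", end_format="%Y-%m-%d"))
# 2022-02-01T00:00:00Z
```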
 
 
- class extraction_methods.plugins.datetime_bound_to_centroid.DatetimeBoundToCentroidInput(*, exists_key: str = '$', exists_delimiter: str = '.', start_datetime: str = '$start_datetime', start_format: str = '%Y-%m-%dT%H:%M:%S', end_datetime: str = '$end_datetime', end_format: str = '%Y-%m-%dT%H:%M:%S', output_key: str = 'datetime', output_format: str = '%Y-%m-%dT%H:%M:%SZ')
- Bases: Input
- Model for Datetime Bound to Centroid Method Input.
 - end_datetime: str
 - end_format: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'end_datetime': FieldInfo(annotation=str, required=False, default='$end_datetime', description='End datetime bound.'), 'end_format': FieldInfo(annotation=str, required=False, default='%Y-%m-%dT%H:%M:%S', description='Format of end datetime.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'output_format': FieldInfo(annotation=str, required=False, default='%Y-%m-%dT%H:%M:%SZ', description='format of output.'), 'output_key': FieldInfo(annotation=str, required=False, default='datetime', description='key to output to.'), 'start_datetime': FieldInfo(annotation=str, required=False, default='$start_datetime', description='Start datetime bound.'), 'start_format': FieldInfo(annotation=str, required=False, default='%Y-%m-%dT%H:%M:%S', description='Format for start datetime.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_format: str
 - output_key: str
 - start_datetime: str
 - start_format: str
 
extraction_methods.plugins.default module
Default Method
- class extraction_methods.plugins.default.DefaultExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: default
- Description:
- Takes a set of default facets.
 - Configuration Options:
   - ``defaults``: Dictionary of defaults to be added
 - Example configuration:
   .. code-block:: yaml

      - method: default
        inputs:
          defaults:
            mip_era: CMIP6
 
 - input_class
- alias of - DefaultInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
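A minimal sketch of adding defaults to the body follows. The helper `apply_defaults` is hypothetical, and the precedence shown here (values already in the body win over defaults) is an assumption, since the documentation above does not state which side takes priority.

```python
from typing import Any


def apply_defaults(body: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
    """Merge defaults into body; existing body values are kept (assumed precedence)."""
    return {**defaults, **body}


body = {"model": "UKESM1-0-LL"}
print(apply_defaults(body, {"mip_era": "CMIP6", "model": "unknown"}))
# {'mip_era': 'CMIP6', 'model': 'UKESM1-0-LL'}
```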
 
 
- class extraction_methods.plugins.default.DefaultInput(*, exists_key: str = '$', exists_delimiter: str = '.', defaults: dict[str, Any])
- Bases: Input
- Model for Default Method Input.
 - defaults: dict[str, Any]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'defaults': FieldInfo(annotation=dict[str, Any], required=True, description='Defaults to be added.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 
extraction_methods.plugins.dict_aggregator module
Dictionary Aggregator Method
- class extraction_methods.plugins.dict_aggregator.DictAggregatorExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: dict_aggregator
- Description:
- Aggregate information within dictionary.
 - Configuration Options:
   - ``min``: list of terms for which the minimum of their aggregate should be returned
   - ``max``: list of terms for which the maximum of their aggregate should be returned
   - ``sum``: list of terms for which the sum of their aggregate should be returned
   - ``list``: list of terms for which a list of their aggregate should be returned
   - ``mean``: list of terms for which the mean of their aggregate should be returned
 - Configuration Example:
   .. code-block:: yaml

      - method: dict_aggregator
        inputs:
          min:
            - start_time
          max:
            - end_time
          sum:
            - size
          list:
            - term1
            - term2
 
 
 - input_class
- alias of - DictAggregatorInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
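The aggregation options can be pictured with the sketch below, which reduces one term across every entry of an assets dictionary. It is an illustration under assumptions: the function `aggregate` and the shape of `assets` are hypothetical, plain term names stand in for the `KeyOutputKey` objects used by the real input model, and `mean` is left out for brevity.

```python
from typing import Any, Callable


def aggregate(
    assets: dict[str, dict[str, Any]],
    config: dict[str, list[str]],
) -> dict[str, Any]:
    """Aggregate terms across assets according to a {operation: [terms]} config."""
    operations: dict[str, Callable[[list[Any]], Any]] = {
        "min": min,
        "max": max,
        "sum": sum,
        "list": list,
    }
    result: dict[str, Any] = {}
    for name, terms in config.items():
        for term in terms:
            # Collect the term from every asset that defines it.
            values = [asset[term] for asset in assets.values() if term in asset]
            if values:  # skip terms that no asset provides
                result[term] = operations[name](values)
    return result


assets = {
    "a": {"start_time": "2019-01-01", "end_time": "2020-06-01", "size": 10},
    "b": {"start_time": "2020-01-01", "end_time": "2021-01-01", "size": 5},
}
config = {"min": ["start_time"], "max": ["end_time"], "sum": ["size"]}
print(aggregate(assets, config))
# {'start_time': '2019-01-01', 'end_time': '2021-01-01', 'size': 15}
```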
 
 
- class extraction_methods.plugins.dict_aggregator.DictAggregatorInput(*, exists_key: str = '$', exists_delimiter: str = '.', input_term: str | dict[str, Any] = '$assets', min: list[KeyOutputKey] = [], max: list[KeyOutputKey] = [], sum: list[KeyOutputKey] = [], mean: list[KeyOutputKey] = [], bucket: list[KeyOutputKey] = [])
- Bases: Input
- Model for Dictionary Aggregator Method Input.
 - bucket: list[KeyOutputKey]
 - input_term: str | dict[str, Any]
 - max: list[KeyOutputKey]
 - mean: list[KeyOutputKey]
 - min: list[KeyOutputKey]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'bucket': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the list of their aggregate should be returned.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'input_term': FieldInfo(annotation=Union[str, dict[str, Any]], required=False, default='$assets', description='term for method to run on.'), 'max': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the maximum of their aggregate should be returned.'), 'mean': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the mean of their summed aggregate should be returned.'), 'min': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the minimum of their aggregate should be returned.'), 'sum': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the sum of their aggregate should be returned.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - sum: list[KeyOutputKey]
 
extraction_methods.plugins.elasticsearch_aggregation module
Elasticsearch Aggregation Method
- class extraction_methods.plugins.elasticsearch_aggregation.ElasticsearchAggregationExtract(**kwargs: Any)
- Bases: ExtractionMethod
- Method: elasticsearch_aggregation
- Description:
- Using an ID, generates a summary of information for higher-level entities.
 - Configuration Options:
   - ``index``: Name of the index holding the STAC entities
   - ``id_term``: Term used for aggregating the STAC entities
   - ``client_kwargs``: Session parameters passed to `elasticsearch.Elasticsearch <https://elasticsearch-py.readthedocs.io/en/7.10.0/api.html>`_
   - ``bbox``: list of terms for which their aggregate bbox should be returned
   - ``min``: list of terms for which the minimum of their aggregate should be returned
   - ``max``: list of terms for which the maximum of their aggregate should be returned
   - ``sum``: list of terms for which the sum of their aggregate should be returned
   - ``list``: list of terms for which a list of their aggregate should be returned
 - Configuration Example:
   .. code-block:: yaml

      - method: elasticsearch_aggregation
        inputs:
          index: ceda-index
          id_term: item_id
          client_kwargs:
            hosts: ['host1:9200', 'host2:9200']
          bbox:
            - bbox
          min:
            - start_time
          max:
            - end_time
          sum:
            - size
          list:
            - term1
            - term2
 
 
 - base_query() dict[str, Any]
- Base query to filter the results to a single collection. - Returns:
- base query 
- Return type:
- dict 
 
 - static basic_aggregation(agg_type: str, facet: KeyOutputKey) dict[str, Any]
- Query to retrieve a basic aggregation of the given type (for example, the minimum value) from docs. - Parameters:
- agg_type (str) – type of aggregation 
- facet (KeyOutputKey) – facet to aggregate 
 
- Returns:
- basic aggregation query 
- Return type:
- dict 
 
 - construct_query() dict[str, Any]
- Function to create the initial elasticsearch query. - Returns:
- aggregation query 
- Return type:
- dict 
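To make the query-building methods above more concrete, here is a sketch of an Elasticsearch request body that filters on one ID and adds metric aggregations. The standalone functions below are hypothetical stand-ins, not the class methods themselves. They take plain field names instead of `KeyOutputKey`, and the aggregation naming scheme (`<field>_<type>`) is an assumption.

```python
from typing import Any


def basic_aggregation(agg_type: str, field: str) -> dict[str, Any]:
    """Build a single metric aggregation, e.g. {"min": {"field": "start_time"}}."""
    return {f"{field}_{agg_type}": {agg_type: {"field": field}}}


def construct_query(
    id_term: str,
    item_id: str,
    aggregations: dict[str, list[str]],
) -> dict[str, Any]:
    """Build a search body limited to one ID, returning only aggregations."""
    query: dict[str, Any] = {
        "query": {"bool": {"must": [{"term": {id_term: {"value": item_id}}}]}},
        "size": 0,  # no hits needed, only the aggregation results
        "aggs": {},
    }
    for agg_type, fields in aggregations.items():
        for field in fields:
            query["aggs"].update(basic_aggregation(agg_type, field))
    return query


query = construct_query("item_id", "abc123", {"min": ["start_time"], "max": ["end_time"]})
print(query["aggs"])
# {'start_time_min': {'min': {'field': 'start_time'}}, 'end_time_max': {'max': {'field': 'end_time'}}}
```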
 
 - extract_facet(aggregations: dict[str, Any], facet: KeyOutputKey) Any
- Function to extract the given facets from the aggregation. - Parameters:
- aggregations (dict) – aggregations returned by the query 
- facet – facet to be extracted 
 
- Returns:
- extracted facet 
- Return type:
- Any 
 
 - extract_facet_lists(query: dict[str, Any], aggregations: dict[str, Any], facets: list[KeyOutputKey]) dict[str, Any]
- Function to extract the lists of given facets from the aggregation. - Parameters:
- query (dict) – attribute dictionary to update 
- aggregations (dict) – current generated properties 
- facets (list) – facets to be extracted 
 
- Returns:
- extracted list facets 
- Return type:
- dict 
 
 - extract_first_facet(properties: dict[str, Any], facet: KeyOutputKey) Any
- Function to extract the given default facets from the first hit. - Parameters:
- properties (dict) – properties from first record 
- facet – current facet to be extracted 
 
- Returns:
- extracted facet 
- Return type:
- Any 
 
 - extract_metadata(query: dict[str, Any], result: dict[str, Any]) dict[str, Any]
- Function to extract the required metadata from the returned query result. - Parameters:
- query (dict) – previous query 
- result (dict) – results from previous query 
 
- Returns:
- metadata 
- Return type:
- dict 
 
 - static facet_composite_aggregation(facet: KeyOutputKey) dict[str, Any]
- Generate the composite aggregation for the facet. - Parameters:
- facet (KeyOutputKey) – facet to aggregate 
- Returns:
- composite aggregation query 
- Return type:
- dict 
 
 - run(body: dict[str, Any]) dict[str, Any]
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
 
 
- class extraction_methods.plugins.elasticsearch_aggregation.ElasticsearchAggregationInput(*, exists_key: str = '$', exists_delimiter: str = '.', index: str, id_term: str, client_kwargs: dict[str, Any] = {}, search_query: dict[str, Any] = {'bool': {'must': [{'term': {'path': {'value': '$uri'}}}], 'must_not': [{'term': {'categories.keyword': {'value': 'hidden'}}}]}}, geo_bound: list[KeyOutputKey] = [], first: list[KeyOutputKey] = [], min: list[KeyOutputKey] = [], max: list[KeyOutputKey] = [], sum: list[KeyOutputKey] = [], mean: list[KeyOutputKey] = [], bucket: list[KeyOutputKey] = [], request_tiemout: int = 15, allow_multiple: bool = True, output_key: str = 'label')
- Bases: Input
- Model for Elasticsearch Aggregation Input.
 - allow_multiple: bool
 - bucket: list[KeyOutputKey]
 - client_kwargs: dict[str, Any]
 - first: list[KeyOutputKey]
 - geo_bound: list[KeyOutputKey]
 - id_term: str
 - index: str
 - max: list[KeyOutputKey]
 - mean: list[KeyOutputKey]
 - min: list[KeyOutputKey]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'allow_multiple': FieldInfo(annotation=bool, required=False, default=True, description='True if multiple labels are allowed.'), 'bucket': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the list of their aggregate should be returned.'), 'client_kwargs': FieldInfo(annotation=dict[str, Any], required=False, default={}, description='Parameters passed to elasticsearch client.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'first': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description="list of terms for which the first record's value should be returned."), 'geo_bound': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the minimum of their aggregate should be returned.'), 'id_term': FieldInfo(annotation=str, required=True, description='Term used for agregating the STAC entities.'), 'index': FieldInfo(annotation=str, required=True, description='Name of the index holding the STAC entities.'), 'max': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the maximum of their aggregate should be returned.'), 'mean': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the mean of their summed aggregate should be returned.'), 'min': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the minimum of their aggregate should be returned.'), 'output_key': FieldInfo(annotation=str, required=False, default='label', description='key to output to.'), 'request_tiemout': FieldInfo(annotation=int, required=False, default=15, 
description='Time out for search.'), 'search_query': FieldInfo(annotation=dict[str, Any], required=False, default={'bool': {'must_not': [{'term': {'categories.keyword': {'value': 'hidden'}}}], 'must': [{'term': {'path': {'value': '$uri'}}}]}}, description='Session parameters passed to elasticsearch client.'), 'sum': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of terms for which the sum of their aggregate should be returned.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 - request_tiemout: int
 - search_query: dict[str, Any]
 - sum: list[KeyOutputKey]
 
extraction_methods.plugins.facet_map module
Facet Map Method
- class extraction_methods.plugins.facet_map.FacetMapExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: facet_map
- Description:
- In some cases, you may wish to map the header attributes to different facets. This method takes a map and converts the facet labels into those specified.
 - Configuration Options:
   - ``term_map``: Dictionary of terms to map.
 - Example Configuration:
   .. code-block:: yaml

      - method: facet_map
        inputs:
          term_map:
            old_key: new_key
            time_coverage_start: start_time
 
 - input_class
- alias of - FacetMapInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
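The relabelling described above amounts to renaming dictionary keys. A minimal sketch follows, with the hypothetical helper `map_facets` standing in for the method's internals. Keys without an entry in the map are assumed to be kept unchanged.

```python
from typing import Any


def map_facets(body: dict[str, Any], term_map: dict[str, str]) -> dict[str, Any]:
    """Rename keys of body according to term_map; unmapped keys are kept as-is."""
    return {term_map.get(key, key): value for key, value in body.items()}


body = {"time_coverage_start": "2020-01-01", "model": "UKESM1-0-LL"}
print(map_facets(body, {"time_coverage_start": "start_time"}))
# {'start_time': '2020-01-01', 'model': 'UKESM1-0-LL'}
```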
 
 
- class extraction_methods.plugins.facet_map.FacetMapInput(*, exists_key: str = '$', exists_delimiter: str = '.', term_map: dict[str, str] = {})
- Bases: Input
- Model for Facet Map Input.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'term_map': FieldInfo(annotation=dict[str, str], required=False, default={}, description='Dictionary of terms to be mapped.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - term_map: dict[str, str]
 
extraction_methods.plugins.facet_prefix module
Facet Prefix Method
- class extraction_methods.plugins.facet_prefix.FacetPrefixExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: facet_prefix
- Description:
- In some cases, you may wish to add a prefix to some or all of the facets based on the vocabulary they’re from.
 - Configuration Options:
   - ``prefix``: Prefix to be added.
   - ``keys``: List of keys that require prefix.
 - Example Configuration:
   .. code-block:: yaml

      - method: facet_prefix
        inputs:
          prefix: cmip6
          keys:
            - start_time
            - model
 
 - input_class
- alias of - FacetPrefixInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
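As an illustration of prefixing selected facets, the sketch below renames only the listed keys. The helper `prefix_facets` is hypothetical, and the `:` separator between prefix and key is an assumption; the documentation above does not say how the two are joined.

```python
from typing import Any


def prefix_facets(
    body: dict[str, Any],
    prefix: str,
    keys: list[str],
    separator: str = ":",  # assumed separator, not taken from the plugin
) -> dict[str, Any]:
    """Add prefix to the listed keys of body; other keys are left untouched."""
    return {
        (f"{prefix}{separator}{key}" if key in keys else key): value
        for key, value in body.items()
    }


body = {"start_time": "2020-01-01", "model": "UKESM1-0-LL", "size": 10}
print(prefix_facets(body, "cmip6", ["start_time", "model"]))
# {'cmip6:start_time': '2020-01-01', 'cmip6:model': 'UKESM1-0-LL', 'size': 10}
```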
 
 
- class extraction_methods.plugins.facet_prefix.FacetPrefixInput(*, exists_key: str = '$', exists_delimiter: str = '.', prefix: str, keys: list[str])
- Bases: Input
- Model for Facet Prefix Input.
 - keys: list[str]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'keys': FieldInfo(annotation=list[str], required=True, description='list of keys that require prefix.'), 'prefix': FieldInfo(annotation=str, required=True, description='Prefix to be added.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - prefix: str
 
extraction_methods.plugins.general_function module
General Function Method
- class extraction_methods.plugins.general_function.Function(*, name: str, args: list[Any] = [], kwargs: dict[str, Any] = {})
- Bases: BaseModel
- Model for Function.
 - args: list[Any]
 - kwargs: dict[str, Any]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'args': FieldInfo(annotation=list[Any], required=False, default=[], description='list of arguments for function.'), 'kwargs': FieldInfo(annotation=dict[str, Any], required=False, default={}, description='dictionary of key word arguments for function.'), 'name': FieldInfo(annotation=str, required=True, description='Name of function.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - name: str
 
- class extraction_methods.plugins.general_function.GeneralFunctionExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: general_function
- Description:
- Accepts a dictionary. String values are popped from the dictionary and are put back into the dictionary with the key specified.
 - Configuration Options:
   - ``function``: ``REQUIRED`` Function to be run, given as ``name``, ``args``, and ``kwargs``.
   - ``delimiter``: Optional text delimiter to put between module/function names. ``Default`` "."
   - ``output_key``: Optional name of the key you would like to output to, else the response will be merged.
 - Example Configuration:
   .. code-block:: yaml

      - method: general_function
        inputs:
          function:
            name: import.path.to.the.function
            args:
              - hello
              - world
            kwargs:
              hello: world
              foo: bar
 
 
 - input_class
- alias of - GeneralFunctionInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
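Running a function named by an import path can be sketched with the standard library's `importlib`. This is an illustration only: `call_function` is a hypothetical helper, and splitting the name on the last delimiter into module and function is an assumption about how `name` and `delimiter` interact.

```python
import importlib
from typing import Any, Optional


def call_function(
    name: str,
    args: Optional[list[Any]] = None,
    kwargs: Optional[dict[str, Any]] = None,
    delimiter: str = ".",
) -> Any:
    """Import the function named by a delimited path and call it."""
    # Everything before the last delimiter is the module, the rest is the function.
    module_path, _, function_name = name.rpartition(delimiter)
    module = importlib.import_module(module_path.replace(delimiter, "."))
    function = getattr(module, function_name)
    return function(*(args or []), **(kwargs or {}))


print(call_function("math.pow", args=[2, 3]))  # 8.0
print(call_function("json:dumps", args=[{"b": 1, "a": 2}], kwargs={"sort_keys": True}, delimiter=":"))
# {"a": 2, "b": 1}
```

A name with no delimiter has no module part, so `importlib.import_module` raises `ValueError` in this sketch; a fuller version would validate the name first.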
 
 
- class extraction_methods.plugins.general_function.GeneralFunctionInput(*, exists_key: str = '$', exists_delimiter: str = '.', function: Function, delimiter: str = '.', output_key: str = '')
- Bases: Input
- Model for General Function Input.
 - delimiter: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'delimiter': FieldInfo(annotation=str, required=False, default='.', description='text delimiter to put between module/function names.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'function': FieldInfo(annotation=Function, required=True, description='Function to be run name maybe seperatated my delimieter.'), 'output_key': FieldInfo(annotation=str, required=False, default='', description='key to output to, else response will be merged with body.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 
extraction_methods.plugins.geometry module
Geometry Method
- class extraction_methods.plugins.geometry.GeometryExtract(*args: Any, **kwargs: Any)
- Bases: ExtractionMethod
- Method: geometry
- Description:
- Accepts a dictionary of coordinate values and converts to RFC 7946 formatted geometry.
 - Configuration Options:
   - ``type``: ``REQUIRED`` Type of geometry to be produced.
   - ``coordinates``: ``REQUIRED`` list of coordinates to convert to geometry. Ordering is respected.
   - ``output_key``: key to output to.
 - Example Configuration:
   .. code-block:: yaml

      - name: geometry
        inputs:
          type: LineString
          coordinates:
            - - 0
              - 0
            - - $lon_2
              - $lat_2
 
 
 - get_coordinates(coordinate_type: str, coordinates: list[Any]) list[Any]
- Get coordinates - Parameters:
- coordinate_type (str) – type of coordinates 
- coordinates (list) – list of coordinates 
 
- Returns:
- coordinates 
- Return type:
- list 
 
 - input_class
- alias of - GeometryInput
 - line(coordinates: list[list[str | float]]) list[list[float]]
- Get line coordinates - Parameters:
- coordinates (list) – list of coordinates 
- Returns:
- coordinates 
- Return type:
- list 
 
 - multi(coordinate_type: str, coordinates: list[Any]) list[Any]
- Get coordinates for a multi-part geometry - Parameters:
- coordinate_type (str) – type of coordinates 
- coordinates (list) – list of coordinates 
 
- Returns:
- coordinates 
- Return type:
- list 
 
 - point(coordinates: list[str | float]) list[float]
- Get point coordinates - Parameters:
- coordinates (list) – list of coordinates 
- Returns:
- coordinates 
- Return type:
- list 
 
 - polygon(coordinates: list[list[str | float]]) list[list[list[float]]]
- Get polygon coordinates - Parameters:
- coordinates (list) – list of coordinates 
- Returns:
- coordinates 
- Return type:
- list 
 
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
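The conversion described above can be sketched in a few lines. This is a minimal illustration rather than the plugin's implementation: the helper names ``to_floats`` and ``build_geometry`` are hypothetical, only three geometry types are handled, and a polygon is assumed to consist of a single exterior ring.

```python
from typing import Any


def to_floats(point: list[Any]) -> list[float]:
    """Cast one [x, y] pair to floats; values may arrive as strings."""
    return [float(value) for value in point]


def build_geometry(geometry_type: str, coordinates: list[Any]) -> dict[str, Any]:
    """Hypothetical helper: build an RFC 7946 geometry object."""
    if geometry_type == "Point":
        coords: Any = to_floats(coordinates)
    elif geometry_type == "LineString":
        coords = [to_floats(point) for point in coordinates]
    elif geometry_type == "Polygon":
        # RFC 7946 polygons are a list of linear rings.
        # Assumption: the input describes one exterior ring only.
        coords = [[to_floats(point) for point in coordinates]]
    else:
        raise ValueError(f"Unsupported geometry type in this sketch: {geometry_type}")
    return {"type": geometry_type, "coordinates": coords}


line = build_geometry("LineString", [[0, 0], ["10.5", "20"]])
print(line)
```

Casting every value to ``float`` mirrors the ``str | float`` inputs and ``float`` outputs shown in the method signatures above.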
 
 
- class extraction_methods.plugins.geometry.GeometryInput(*, exists_key: str = '$', exists_delimiter: str = '.', type: Literal['Point', 'LineString', 'Polygon', 'MultiPointString', 'MultiLineString', 'MultiPolygon'], coordinates: list[Any], output_key: str = 'geometry')
- Bases: - Input- Model for Geometry Input. - coordinates: list[Any]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'coordinates': FieldInfo(annotation=list[Any], required=True, description='list of coordinates to convert to geometry. Ordering is respected.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'output_key': FieldInfo(annotation=str, required=False, default='geometry', description='key to output to.'), 'type': FieldInfo(annotation=Literal[str, str, str, str, str, str], required=True, description='Type of geometry to be produced.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 - type: Literal['Point', 'LineString', 'Polygon', 'MultiPointString', 'MultiLineString', 'MultiPolygon']
 
extraction_methods.plugins.geometry_to_bbox module
Geometry to Bounding Box Method
- class extraction_methods.plugins.geometry_to_bbox.GeometryToBboxExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - geometry_to_bbox- Description:
- Accepts a geometry with a type and a list of coordinates and converts it to an RFC 7946, section 5 formatted bbox. 
 - Configuration Options:
   - ``geometry``: ``REQUIRED`` Geometry to be converted to bbox.
   - ``output_key``: Key to output to.
 - Example Configuration:
   .. code-block:: yaml

      - method: geometry_to_bbox
        inputs:
          geometry:
            type: point
            coordinates:
              - 20
              - 0
 
 
 - get_bbox(coordinate_type: str, coordinates: list[Any]) list[float]
- Get bbox from geometry - Parameters:
- coordinate_type (str) – type of coordinates 
- coordinates (list) – list of coordinates 
 
- Returns:
- bounding box of coordinates 
- Return type:
- list 
 
 - input_class
- alias of - GeometryToBboxInput
 - line(coordinates: list[list[float]]) list[float]
- Get line bbox - Parameters:
- coordinates (list) – list of coordinates 
- Returns:
- bounding box of coordinates 
- Return type:
- list 
 
 - multi(coordinate_type: str, coordinates: list[Any]) list[float]
- Get bbox for a multi-part geometry - Parameters:
- coordinate_type (str) – type of coordinates 
- coordinates (list) – list of coordinates 
 
- Returns:
- bounding box of coordinates 
- Return type:
- list 
 
 - point(coordinates: list[float]) list[float]
- Get point bbox - Parameters:
- coordinates (list) – list of coordinates 
- Returns:
- bounding box of coordinates 
- Return type:
- list 
 
 - polygon(coordinates: list[list[float]]) list[float]
- Get polygon bbox - Parameters:
- coordinates (list) – list of coordinates 
- Returns:
- bounding box of coordinates 
- Return type:
- list 
 
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
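As a rough sketch of the idea, a bounding box in RFC 7946 order (west, south, east, north) can be derived by collecting every coordinate pair and taking the extremes. The function names below are hypothetical, and the sketch does not handle geometries that cross the antimeridian.

```python
from typing import Any


def flatten_points(coordinates: Any) -> list[list[float]]:
    """Collect every [x, y] pair from arbitrarily nested coordinate lists."""
    if coordinates and isinstance(coordinates[0], (int, float)):
        return [[float(coordinates[0]), float(coordinates[1])]]
    points: list[list[float]] = []
    for item in coordinates:
        points.extend(flatten_points(item))
    return points


def geometry_to_bbox(geometry: dict[str, Any]) -> list[float]:
    """Hypothetical helper: return [west, south, east, north]."""
    points = flatten_points(geometry["coordinates"])
    xs = [point[0] for point in points]
    ys = [point[1] for point in points]
    return [min(xs), min(ys), max(xs), max(ys)]


point_bbox = geometry_to_bbox({"type": "Point", "coordinates": [20, 0]})
line_bbox = geometry_to_bbox({"type": "LineString", "coordinates": [[0, 0], [10, -5]]})
print(point_bbox, line_bbox)
```

For a point, the west and east values (and the south and north values) coincide, so the box collapses to the point itself.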
 
 
- class extraction_methods.plugins.geometry_to_bbox.GeometryToBboxInput(*, exists_key: str = '$', exists_delimiter: str = '.', geometry: dict[str, Any] = '$geometry', output_key: str = 'bbox')
- Bases: - Input- Model for Geometry to Bounding Box Input. - geometry: dict[str, Any]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'geometry': FieldInfo(annotation=dict[str, Any], required=False, default='$geometry', description='geometry to be converted to bbox.'), 'output_key': FieldInfo(annotation=str, required=False, default='bbox', description='key to output to.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 
extraction_methods.plugins.hash module
Hash Method
- class extraction_methods.plugins.hash.HashExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - hash- Description:
- Hashes input string. 
 - Configuration Options:
   - ``hash_str``: String to be hashed.
   - ``output_key``: Key to output to.
 - Example configuration:
   .. code-block:: yaml

      - method: hash
        inputs:
          hash_str: $model
          output_key: hashed_terms
 
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
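A minimal sketch of hashing a term and writing it to ``output_key`` follows. The hash algorithm used by the plugin is not stated here, so SHA-256 is an assumption, as are the helper name ``hash_term`` and the sample value.

```python
import hashlib
from typing import Any


def hash_term(body: dict[str, Any], hash_str: str, output_key: str) -> dict[str, Any]:
    """Hypothetical helper: hash a string and store the hex digest in the body."""
    # Assumption: SHA-256; the plugin may use a different algorithm.
    digest = hashlib.sha256(hash_str.encode("utf-8")).hexdigest()
    return {**body, output_key: digest}


result = hash_term({"model": "example-model"}, "example-model", "hashed_terms")
print(result["hashed_terms"])
```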
 
 
- class extraction_methods.plugins.hash.HashInput(*, exists_key: str = '$', exists_delimiter: str = '.', hash_str: str, output_key: str)
- Bases: - Input- Model for Hash Input. - hash_str: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'hash_str': FieldInfo(annotation=str, required=True, description='string to be hashed.'), 'output_key': FieldInfo(annotation=str, required=True, description='key to output to.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 
extraction_methods.plugins.iso19115 module
ISO 19115 Method
- class extraction_methods.plugins.iso19115.ISO19115Extract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - iso19115- Description:
- Takes a URL and calls out to it to retrieve the ISO 19115 record. 
 - Configuration Options:
   - ``url``: ``REQUIRED`` URL to record store.
   - ``dates``: List of ``key`` and ``output_key`` entries for the date terms to retrieve from the response.
 - Example configuration:
   .. code-block:: yaml

      - method: iso19115
        inputs:
          url: $url
          dates:
            - key: './/gml:beginPosition'
              output_key: start_datetime
 
 - input_class
- alias of - ISO19115Input
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
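The date lookup can be illustrated with the standard library. This sketch parses an inline XML string instead of fetching a record from ``url``, so it stays offline; the sample record, the ``extract_dates`` helper and the namespace map are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the record uses the GML 3.2 namespace.
NAMESPACES = {"gml": "http://www.opengis.net/gml/3.2"}

RECORD = (
    '<record xmlns:gml="http://www.opengis.net/gml/3.2">'
    "<gml:TimePeriod>"
    "<gml:beginPosition>2020-01-01T00:00:00</gml:beginPosition>"
    "<gml:endPosition>2020-12-31T23:59:59</gml:endPosition>"
    "</gml:TimePeriod>"
    "</record>"
)


def extract_dates(xml_text: str, dates: list[dict[str, str]]) -> dict[str, str]:
    """Hypothetical helper: map each XPath ``key`` to its ``output_key``."""
    root = ET.fromstring(xml_text)
    extracted: dict[str, str] = {}
    for entry in dates:
        element = root.find(entry["key"], NAMESPACES)
        if element is not None and element.text:
            extracted[entry["output_key"]] = element.text.strip()
    return extracted


found = extract_dates(
    RECORD,
    [{"key": ".//gml:beginPosition", "output_key": "start_datetime"}],
)
print(found)
```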
 
 
- class extraction_methods.plugins.iso19115.ISO19115Input(*, exists_key: str = '$', exists_delimiter: str = '.', url: str, dates: list[KeyOutputKey], request_timeout: int = 15)
- Bases: - Input- Model for ISO19115 Date Input. - dates: list[KeyOutputKey]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'dates': FieldInfo(annotation=list[KeyOutputKey], required=True, description='list of dates to extract.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'request_timeout': FieldInfo(annotation=int, required=False, default=15, description='request time out.'), 'url': FieldInfo(annotation=str, required=True, description='Url for record store.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - request_timeout: int
 - url: str
 
extraction_methods.plugins.iso_date module
ISO Date Method
- class extraction_methods.plugins.iso_date.DateTerm(*, input_term: str, format: str = '%Y-%m-%dT%H:%M:%SZ', output_key: str = 'datetime')
- Bases: - BaseModel- Model for Date terms with format. - format: str
 - input_term: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'format': FieldInfo(annotation=str, required=False, default='%Y-%m-%dT%H:%M:%SZ', description='Format of the date.'), 'input_term': FieldInfo(annotation=str, required=True, description='Term to run method on.'), 'output_key': FieldInfo(annotation=str, required=False, default='datetime', description='Key to output to.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 
- class extraction_methods.plugins.iso_date.ISODateExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - iso_date- Description:
- Takes the source dict and the key to access the date and converts the date to ISO 8601 format, e.g.
   - YYYY-MM-DDTHH:MM:SS.ffffff, if microsecond is not 0
   - YYYY-MM-DDTHH:MM:SS, if microsecond is 0
  If the date format cannot be parsed, it is removed from the source dict with an error logged. 
 - Configuration Options:
   - ``date_terms``: ``REQUIRED`` List of keys to the date values. Using a list allows processing of multiple dates.
   - ``format``: Optional format string. Default behaviour uses `dateutil.parser.parse <https://dateutil.readthedocs.io/en/stable/parser.html#dateutil.parser.parse>`_. If a format string is supplied, this will change to use `datetime.datetime.strptime <https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime>`_.
 - Example Configuration:
   .. code-block:: yaml

      - method: iso_date
        inputs:
          date_terms:
            - input_term: $datetime
              output_key: date
              format: "%Y-%m-%dT%H:%M:%S"
            - input_term: 2012-12-12
              format: "%Y-%m-%d"
 
 
 - input_class
- alias of - ISODateInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
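The format-string path can be sketched with the standard library alone. The ``to_iso`` helper is hypothetical, and the ``dateutil`` fallback described above is deliberately left out so the example has no third-party dependency.

```python
from __future__ import annotations

from datetime import datetime


def to_iso(value: str, fmt: str) -> str | None:
    """Hypothetical helper: parse with strptime and return ISO 8601, or None on failure."""
    try:
        return datetime.strptime(value, fmt).isoformat()
    except ValueError:
        # Assumption: an unparseable date is dropped by the caller.
        return None


date_only = to_iso("2012-12-12", "%Y-%m-%d")
with_micro = to_iso("2020-01-02T03:04:05.250000", "%Y-%m-%dT%H:%M:%S.%f")
unparsed = to_iso("not a date", "%Y-%m-%d")
print(date_only, with_micro, unparsed)
```

``datetime.isoformat()`` omits the fractional part when the microsecond is 0, which matches the two output shapes listed in the description.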
 
 
- class extraction_methods.plugins.iso_date.ISODateInput(*, exists_key: str = '$', exists_delimiter: str = '.', date_terms: list[DateTerm] = [])
- Bases: - Input- Model for ISO Date Input. - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'date_terms': FieldInfo(annotation=list[DateTerm], required=False, default=[], description='List of date terms.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 
extraction_methods.plugins.json_file module
JSON File Method
- class extraction_methods.plugins.json_file.JsonFileExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - json_file- Description:
- Takes a list of keys to extract from the JSON file. 
 - Configuration Options:
   - ``path``: Path to directory or single JSON file.
   - ``properties``: List of properties to extract.
 - Example configuration:
   .. code-block:: yaml

      - method: json_file
        inputs:
          path: /path/to/file.json
          properties:
            - key: MIP_ERA
              output_key: mip_era
 
 - extract_terms(path: Path) dict[str, Any]
- Extract terms from JSON file(s) at path. - Parameters:
- path (Path) – path to file 
- Returns:
- extracted terms 
- Return type:
- dict 
 
 - find_and_extract() dict[str, Any]
- Find and extract from JSON files. - Returns:
- extracted terms 
- Return type:
- dict 
 
 - input_class
- alias of - JsonFileInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
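A small sketch of the key-to-output-key mapping for a single JSON file is shown below. The ``extract_properties`` helper and the sample file contents are assumptions, and the directory case is not covered.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def extract_properties(path: Path, properties: list[dict[str, str]]) -> dict[str, Any]:
    """Hypothetical helper: copy selected top-level keys to their output keys."""
    with path.open(encoding="utf-8") as handle:
        document = json.load(handle)
    return {
        entry["output_key"]: document[entry["key"]]
        for entry in properties
        if entry["key"] in document
    }


with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "file.json"
    json_path.write_text(json.dumps({"MIP_ERA": "CMIP6", "other": 1}), encoding="utf-8")
    extracted = extract_properties(json_path, [{"key": "MIP_ERA", "output_key": "mip_era"}])

print(extracted)
```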
 
 
- class extraction_methods.plugins.json_file.JsonFileInput(*, exists_key: str = '$', exists_delimiter: str = '.', path: str, properties: list[KeyOutputKey])
- Bases: - Input- Model for JSON File Input. - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'path': FieldInfo(annotation=str, required=True, description='Path to directory of JSON files or single JSON file.'), 'properties': FieldInfo(annotation=list[KeyOutputKey], required=True, description='list of properties to extract.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - path: str
 - properties: list[KeyOutputKey]
 
extraction_methods.plugins.lambda module
Lambda Method
- class extraction_methods.plugins.lambda.LambdaExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - lambda- Description:
- Accepts a dictionary. String values are popped from the dictionary and are put back into the dictionary with the ``key`` specified.
 - Configuration Options:
   - ``function``: ``REQUIRED`` Lambda function to be run.
   - ``output_key``: Optional name of the key to output to; otherwise the response will be merged.
   - ``args``: Optional list of arguments for the function. Use $ for previously extracted terms.
   - ``kwargs``: Optional dictionary of keyword arguments for the function. Use $ for previously extracted terms.
 - Example Configuration:
   .. code-block:: yaml

      - method: lambda
        inputs:
          function: 'lambda x: x * x'
          args:
            - hello
            - $world
          kwargs:
            hello: world
            goodbye: all
 
 - input_class
- alias of - LambdaInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
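One plausible way to run a configured lambda with ``$`` references resolved against the body is sketched below. The ``resolve`` and ``run_lambda`` helpers are hypothetical, and the use of ``eval`` is an assumption about how a lambda string might be turned into a callable; it should only ever see trusted configuration.

```python
from typing import Any


def resolve(value: Any, body: dict[str, Any], exists_key: str = "$") -> Any:
    """Replace '$name' with body['name']; leave other values untouched."""
    if isinstance(value, str) and value.startswith(exists_key):
        return body[value[len(exists_key):]]
    return value


def run_lambda(
    body: dict[str, Any],
    function: str,
    args: list[Any],
    kwargs: dict[str, Any],
    output_key: str,
) -> dict[str, Any]:
    """Hypothetical helper: evaluate the lambda string and store its result."""
    func = eval(function)  # assumption: trusted configuration only
    resolved_args = [resolve(arg, body) for arg in args]
    resolved_kwargs = {key: resolve(val, body) for key, val in kwargs.items()}
    return {**body, output_key: func(*resolved_args, **resolved_kwargs)}


result = run_lambda(
    {"width": 3},
    "lambda x, scale=1: x * x * scale",
    ["$width"],
    {"scale": 2},
    "area",
)
print(result)
```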
 
 
- class extraction_methods.plugins.lambda.LambdaInput(*, exists_key: str = '$', exists_delimiter: str = '.', function: str, args: list[Any] = [], kwargs: dict[str, Any] = {}, output_key: str = 'label')
- Bases: - Input- Model for Lambda Input. - args: list[Any]
 - function: str
 - kwargs: dict[str, Any]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'args': FieldInfo(annotation=list[Any], required=False, default=[], description='list of arguments for function.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'function': FieldInfo(annotation=str, required=True, description='lambda function to be run.'), 'kwargs': FieldInfo(annotation=dict[str, Any], required=False, default={}, description='dictionary of key word arguments for function.'), 'output_key': FieldInfo(annotation=str, required=False, default='label', description='key to output to.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 
extraction_methods.plugins.netcdf module
NetCDF Method
- class extraction_methods.plugins.netcdf.NetCDFExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - netcdf- Description: Processes XML documents to extract metadata
 - Configuration Options:
   - ``extraction_keys``: List of keys to retrieve from the document.
   - ``filter_expr``: Regex to match against files to limit the attempts to known files.
   - ``namespaces``: Map of namespaces.
 - Extraction Keys: Extraction keys should be a map with the following names:
   - ``output_key``: Name of the outputted attribute.
   - ``key``: Access key to extract the required data. Passed to xml.etree.ElementTree.find() and also supports XPath formatted accessors.
   - ``attribute``: Allows you to select from the element attribute. In the absence of this value, the default behaviour is to access the text value of the key. In some cases, you might want to access an attribute of the element.
 - Example configuration:
   .. code-block:: yaml

      - method: xml
        inputs:
          filter_expr: '.manifest$'
          extraction_keys:
            - name: start_datetime
              key: './/gml:beginPosition'
              attribute: start
 
 - input_class
- alias of - NetCDFInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
 
 
- class extraction_methods.plugins.netcdf.NetCDFInput(*, exists_key: str = '$', exists_delimiter: str = '.', input_term: str = '$uri', variable_id: str = '$uri', variable_attributes: list[KeyOutputKey] = [], global_attributes: list[KeyOutputKey] = [], cf_attributes: list[KeyOutputKey] = [], rio_attributes: list[KeyOutputKey] = [])
- Bases: - Input- Model for NetCDF Input. - cf_attributes: list[KeyOutputKey]
 - global_attributes: list[KeyOutputKey]
 - input_term: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'cf_attributes': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of cf attributes to extract.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'global_attributes': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of global attributes to extract.'), 'input_term': FieldInfo(annotation=str, required=False, default='$uri', description='term for method to run on.'), 'rio_attributes': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of rio attributes to extract.'), 'variable_attributes': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of variable attributes to extract.'), 'variable_id': FieldInfo(annotation=str, required=False, default='$uri', description='lambda function to be run.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - rio_attributes: list[KeyOutputKey]
 - variable_attributes: list[KeyOutputKey]
 - variable_id: str
 
extraction_methods.plugins.open_zip module
Open Zip Method
- class extraction_methods.plugins.open_zip.ZipExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - open_zip- Description:
- Open a zip file and read inner files 
 - Configuration Options:
   - ``input_term``: Term for the method to run on.
   - ``inner_files``: List of inner zipped files to be read.
   - ``output_key``: Key to output to.
 - Example configuration:
   .. code-block:: yaml

      - method: open_zip
        inputs:
          input_term: /path/to/a/file
          inner_files:
            - key: hello.txt
              output_key: world
 
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
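Reading named inner files from an archive can be sketched with ``zipfile``. The example builds a small archive in memory so it is self-contained; the ``read_inner_files`` helper and the UTF-8 decoding are assumptions.

```python
import io
import zipfile
from typing import BinaryIO


def read_inner_files(archive: BinaryIO, inner_files: list[dict[str, str]]) -> dict[str, str]:
    """Hypothetical helper: read each inner file and store it under its output key."""
    extracted: dict[str, str] = {}
    with zipfile.ZipFile(archive) as zip_file:
        for entry in inner_files:
            with zip_file.open(entry["key"]) as inner:
                # Assumption: inner files are UTF-8 text.
                extracted[entry["output_key"]] = inner.read().decode("utf-8")
    return extracted


# Build an in-memory archive to stand in for /path/to/a/file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as writer:
    writer.writestr("hello.txt", "hi there")
buffer.seek(0)

contents = read_inner_files(buffer, [{"key": "hello.txt", "output_key": "world"}])
print(contents)
```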
 
 
- class extraction_methods.plugins.open_zip.ZipInput(*, exists_key: str = '$', exists_delimiter: str = '.', input_term: str = '$uri', inner_files: list[KeyOutputKey] = [], output_key: str = '')
- Bases: - Input- Model for Zip Input. - check_root_read() Self
 - inner_files: list[KeyOutputKey]
 - input_term: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'inner_files': FieldInfo(annotation=list[KeyOutputKey], required=False, default=[], description='list of inner zipped files to be read.'), 'input_term': FieldInfo(annotation=str, required=False, default='$uri', description='term for method to run on.'), 'output_key': FieldInfo(annotation=str, required=False, default='', description='key to output to.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 
extraction_methods.plugins.path_parts module
Path Parts Method
- class extraction_methods.plugins.path_parts.PathPartsExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - path_parts- Description:
- Extracts the parts of a given path, skipping ``skip`` number of top-level parts.
 - Configuration Options:
   - ``path``: Path for the method to run on. ``default: $uri``
   - ``skip``: The number of path parts to skip. ``default: 0``
 - Example configuration:
   .. code-block:: yaml

      - method: path_parts
        inputs:
          path: $uri
          skip: 2
 - input_class
- alias of - PathPartsInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
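The skipping behaviour can be illustrated with ``pathlib``. The ``path_parts`` function and the sample path are hypothetical, and the sketch assumes the root separator is not counted as a part; how the plugin names the resulting keys is not covered here.

```python
from pathlib import PurePosixPath


def path_parts(path: str, skip: int = 0) -> list[str]:
    """Hypothetical helper: split a path and drop the first ``skip`` parts."""
    # Assumption: the leading "/" is not counted as a part.
    parts = [part for part in PurePosixPath(path).parts if part != "/"]
    return parts[skip:]


remaining = path_parts("/archive/project/data/CMIP/file.nc", skip=2)
print(remaining)
```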
 
 
- class extraction_methods.plugins.path_parts.PathPartsInput(*, exists_key: str = '$', exists_delimiter: str = '.', path: str = '$uri', skip: int = 0)
- Bases: - Input- Model for Path Parts Input. - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'path': FieldInfo(annotation=str, required=False, default='$uri', description='path for method to run on.'), 'skip': FieldInfo(annotation=int, required=False, default=0, description='number of path parts to skip.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - path: str
 - skip: int
 
extraction_methods.plugins.regex module
Regex Method
- class extraction_methods.plugins.regex.RegexExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - regex- Description:
- Takes an input string and a regex with named capture groups and returns a dictionary of the values extracted using the named capture groups. 
 - Configuration Options:
   - ``input_term``: Term for the regex to be run on.
   - ``regex``: ``REQUIRED`` The regular expression to match against.
 - Example configuration:
   .. code-block:: yaml

      - method: regex
        inputs:
          regex: ^(?:[^_]*_){2}(?P<datetime>\d*)
 - input_class
- alias of - RegexInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
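Named capture groups map directly onto a dictionary through ``re.Match.groupdict()``. The sketch below uses a hypothetical ``regex_extract`` helper and a made-up file name; whether the plugin uses ``search`` or ``match`` is an assumption.

```python
import re


def regex_extract(input_term: str, regex: str) -> dict[str, str]:
    """Hypothetical helper: return the named groups, or an empty dict if nothing matches."""
    match = re.search(regex, input_term)
    return match.groupdict() if match else {}


# Skip two underscore-delimited fields, then capture the digits that follow.
extracted = regex_extract(
    "sensor_tile_20200131_v1.nc",
    r"^(?:[^_]*_){2}(?P<datetime>\d*)",
)
print(extracted)
```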
 
 
- class extraction_methods.plugins.regex.RegexInput(*, exists_key: str = '$', exists_delimiter: str = '.', input_term: str = '$uri', regex: str)
- Bases: - Input- Model for Regex Input. - input_term: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'input_term': FieldInfo(annotation=str, required=False, default='$uri', description='term for method to run on.'), 'regex': FieldInfo(annotation=str, required=True, description='The regular expression to match against.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - regex: str
 
extraction_methods.plugins.regex_label module
Regex Label Method
- class extraction_methods.plugins.regex_label.RegexLabelExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - regex_label- Description:
- Adds label if full match of regex. 
 - Configuration Options:
   - ``input_term``: Term for the method to run on.
   - ``label``: ``REQUIRED`` Label to add if regex passes.
   - ``regex``: ``REQUIRED`` Regex to test against.
   - ``allow_multiple``: True if multiple labels are allowed.
   - ``output_key``: Term for method to output to.
 - Example configuration:
   .. code-block:: yaml

      - method: regex_label
        inputs:
          label: metadata
          regex: README
          allow_multiple: true
 - input_class
- alias of - RegexLabelInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
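A sketch of full-match labelling follows. The ``regex_label`` function is hypothetical, and the handling of ``allow_multiple`` is an assumption: here labels accumulate in a list when it is true and a single label overwrites the key when it is false.

```python
import re
from typing import Any


def regex_label(
    body: dict[str, Any],
    input_term: str,
    regex: str,
    label: str,
    allow_multiple: bool = True,
    output_key: str = "label",
) -> dict[str, Any]:
    """Hypothetical helper: add ``label`` only when the whole term matches."""
    if not re.fullmatch(regex, input_term):
        return body
    updated = dict(body)
    if allow_multiple:
        existing = updated.get(output_key, [])
        labels = [existing] if isinstance(existing, str) else list(existing)
        if label not in labels:
            labels.append(label)
        updated[output_key] = labels
    else:
        updated[output_key] = label
    return updated


first = regex_label({}, "README", "README", "metadata")
second = regex_label(first, "README", "READ.*", "docs")
unmatched = regex_label({}, "README.md", "README", "metadata")
print(first, second, unmatched)
```

Because ``re.fullmatch`` is used, ``README.md`` does not receive the label from the pattern ``README``.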
 
 
- class extraction_methods.plugins.regex_label.RegexLabelInput(*, exists_key: str = '$', exists_delimiter: str = '.', input_term: str = '$uri', label: str, regex: str, allow_multiple: bool = True, output_key: str = 'label')
- Bases: - Input- Model for Regex Label Input. - allow_multiple: bool
 - input_term: str
 - label: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'allow_multiple': FieldInfo(annotation=bool, required=False, default=True, description='True if multiple labels are allowed.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'input_term': FieldInfo(annotation=str, required=False, default='$uri', description='term for method to run on.'), 'label': FieldInfo(annotation=str, required=True, description='Label to add if regex passes.'), 'output_key': FieldInfo(annotation=str, required=False, default='label', description='Term for method to output to.'), 'regex': FieldInfo(annotation=str, required=True, description='Regex to test against.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 - regex: str
 
extraction_methods.plugins.regex_rename module
Regex Rename Method
- class extraction_methods.plugins.regex_rename.RegexOutputKey(*, exists_key: str = '$', exists_delimiter: str = '.', regex: str, output_key: str)
- Bases: - Input- Model for Regex. - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'output_key': FieldInfo(annotation=str, required=True, description='Term for method to output to.'), 'regex': FieldInfo(annotation=str, required=True, description='Regex to test against.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 - regex: str
 
- class extraction_methods.plugins.regex_rename.RegexRenameExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - regex_rename- Description:
- Takes a list of regex and output key combinations. Any existing properties that fully match a regex are renamed to the output key. Later regexes take precedence. 
 - Configuration Options:
   - ``regex_swaps``: Regex and output key combinations.
 - Example configuration:
   .. code-block:: yaml

      - method: regex_rename
        inputs:
          regex_swaps:
            - regex: README
              output_key: metadata
 
 
 - add(body: dict[str, Any], key_parts: list[str], value: Any) dict[str, Any]
- Rename terms - Parameters:
- body (dict) – current body 
- key_parts (list) – key parts separated by delimiter 
- value (Any) – value to add 
 
- Returns:
- updated body 
- Return type:
- dict 
 
 - find(body: dict[str, Any], key_parts: list[str]) tuple[dict[str, Any], Any]
- Rename terms - Parameters:
- body (dict) – current body 
- key_parts (list) – key parts separated by delimiter 
 
- Returns:
- updated body and the value found 
- Return type:
- tuple 
 
 - input_class
- alias of - RegexRenameInput
 - matching_keys(keys: KeysView[str], key_regex: str) list[str]
- Find all keys that match regex - Parameters:
- keys (KeysView) – dictionary keys to test 
- key_regex (str) – regex to test against 
 
- Returns:
- matching keys 
- Return type:
- list 
 
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
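The renaming rule can be sketched for flat (non-nested) keys as follows. The ``regex_rename`` function is hypothetical; nested keys addressed through ``delimiter`` are not handled in this sketch.

```python
import re
from typing import Any


def regex_rename(body: dict[str, Any], regex_swaps: list[dict[str, str]]) -> dict[str, Any]:
    """Hypothetical helper: rename top-level keys that fully match each regex."""
    renamed = dict(body)
    for swap in regex_swaps:
        # Snapshot the matching keys first so the dict is not mutated while iterating.
        matching = [key for key in renamed if re.fullmatch(swap["regex"], key)]
        for key in matching:
            renamed[swap["output_key"]] = renamed.pop(key)
    return renamed


swapped = regex_rename(
    {"README": "text", "size": 10},
    [{"regex": "README", "output_key": "metadata"}],
)
print(swapped)
```

Applying the swaps in order means a later entry can overwrite a key written by an earlier one, which is one way to read "later regexes take precedence".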
 
 
- class extraction_methods.plugins.regex_rename.RegexRenameInput(*, exists_key: str = '$', exists_delimiter: str = '.', regex_swaps: list[RegexOutputKey], delimiter: str = '')
- Bases: - Input- Model for Regex Rename Input. - delimiter: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'delimiter': FieldInfo(annotation=str, required=False, default='', description='delimiter for nested term.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'regex_swaps': FieldInfo(annotation=list[RegexOutputKey], required=True, description='Regex and output key combinations.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - regex_swaps: list[RegexOutputKey]
 
extraction_methods.plugins.regex_type_cast module
Regex Type Cast Method
- class extraction_methods.plugins.regex_type_cast.RegexCastType(*, exists_key: str = '$', exists_delimiter: str = '.', regex: str, cast_type: str)
- Bases: - Input- Model for Regex Cast Type. - cast_type: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'cast_type': FieldInfo(annotation=str, required=True, description='Python type to cast to.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'regex': FieldInfo(annotation=str, required=True, description='Regex to test against.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - regex: str
 
- class extraction_methods.plugins.regex_type_cast.RegexTypeCastExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - regex_type_cast- Description:
- Takes a list of regex and cast type combinations. Any existing property whose key fully matches a regex is cast to the associated type. 
 - Configuration Options:
 - ``regex_casts``: Regex and cast type combinations.
 - Example configuration:

   .. code-block:: yaml

       - method: regex_type_cast
         inputs:
           regex_casts:
             - regex: cloud_cover
               cast_type: int
 
 
 - input_class
- alias of - RegexTypeCastInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
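
As a rough illustration of the casting behaviour described for ``regex_type_cast``, the sketch below converts the values of matching top-level keys. The helper ``cast_matching_keys`` and the ``CASTS`` lookup table are assumptions made for this example; how the plugin actually resolves a ``cast_type`` string to a Python type is not stated in this reference.

```python
import re
from typing import Any, Callable

# Assumed mapping from cast_type strings to callables (illustrative only).
CASTS: dict[str, Callable[[Any], Any]] = {
    "int": int,
    "float": float,
    "str": str,
}


def cast_matching_keys(
    body: dict[str, Any], casts: list[dict[str, str]]
) -> dict[str, Any]:
    """Cast values whose key fully matches a regex (hypothetical helper)."""
    result = dict(body)
    for cast in casts:
        pattern = re.compile(cast["regex"])
        convert = CASTS[cast["cast_type"]]
        for key in list(result):
            if pattern.fullmatch(key):
                result[key] = convert(result[key])
    return result


body = {"cloud_cover": "42", "platform": "sentinel"}
cast_body = cast_matching_keys(
    body, [{"regex": "cloud_cover", "cast_type": "int"}]
)
print(cast_body)
```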
 
 
- class extraction_methods.plugins.regex_type_cast.RegexTypeCastInput(*, exists_key: str = '$', exists_delimiter: str = '.', regex_casts: list[RegexCastType])
- Bases: - Input- Model for Regex Cast Type Input. - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'regex_casts': FieldInfo(annotation=list[RegexCastType], required=True, description='Regex and cast type combinations.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - regex_casts: list[RegexCastType]
 
extraction_methods.plugins.remove module
Remove Method
- class extraction_methods.plugins.remove.RemoveExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - remove- Description:
- Removes keys from the body. 
 - Configuration Options:
 - ``keys``: ``REQUIRED`` list of keys to remove.
 - ``delimiter``: delimiter for nested key.
 - Example Configuration:

   .. code-block:: yaml

       - method: remove
         inputs:
           keys:
             - hello
             - world
 - input_class
- alias of - RemoveInput
 - matching_keys(keys: KeysView[str], key_regex: str) list[str]
- Find all keys that match regex - Parameters:
- keys (KeysView) – dictionary keys to test 
- key_regex (str) – regex to test against 
 
- Returns:
- matching keys 
- Return type:
- list 
 
 - remove_key(body: dict[str, Any], key_parts: list[str]) dict[str, Any]
- Remove nested terms - Parameters:
- body (dict) – current body 
- key_parts (list) – key parts separated by delimiter 
 
- Returns:
- updated body 
- Return type:
- dict 
 
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
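
The nested removal performed by ``remove_key`` can be pictured with the sketch below. It is an assumption-laden illustration rather than the plugin's code: the function ``remove_nested_key`` is hypothetical, it splits each key on the default ``.`` delimiter, and it ignores the regex matching that ``matching_keys`` suggests the real method also applies.

```python
from typing import Any


def remove_nested_key(body: dict[str, Any], key_parts: list[str]) -> dict[str, Any]:
    """Remove a possibly nested key in place (hypothetical helper)."""
    head, *rest = key_parts
    if head not in body:
        # Nothing to remove along this path.
        return body
    if not rest:
        del body[head]
    elif isinstance(body[head], dict):
        # Descend one level and remove the remaining path there.
        remove_nested_key(body[head], rest)
    return body


body = {"hello": 1, "world": {"keep": 2, "drop": 3}}
for key in ["hello", "world.drop", "missing.key"]:
    remove_nested_key(body, key.split("."))
print(body)
```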
 
 
- class extraction_methods.plugins.remove.RemoveInput(*, exists_key: str = '$', exists_delimiter: str = '.', keys: list[str], delimiter: str = '.')
- Bases: - Input- Model for Remove Input. - delimiter: str
 - keys: list[str]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'delimiter': FieldInfo(annotation=str, required=False, default='.', description='delimiter for nested term.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'keys': FieldInfo(annotation=list[str], required=True, description='list of keys to remove.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 
extraction_methods.plugins.stac_extension module
STAC Extension Method
- class extraction_methods.plugins.stac_extension.STACExtension(*, url: str, prefix: str, properties: list[str])
- Bases: - BaseModel- Model for STAC Extension. - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'prefix': FieldInfo(annotation=str, required=True, description='Extension prefix.'), 'properties': FieldInfo(annotation=list[str], required=True, description='Extension properties.'), 'url': FieldInfo(annotation=str, required=True, description='Extension URL.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - prefix: str
 - properties: list[str]
 - url: str
 
- class extraction_methods.plugins.stac_extension.STACExtensionExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - stac_extension- Description:
- Accepts a list of extensions which contain url, prefix and list of properties. 
 - Configuration Options:
 - ``extensions``: ``REQUIRED`` List of extensions.
 - Example Configuration:

   .. code-block:: yaml

       - method: stac_extension
         inputs:
           extensions:
             - url: hello.com/v1.0.0/world.json
               prefix: hello
               properties:
                 - foo
                 - bar
 
 
 
 - input_class
- alias of - STACExtensionInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
 
 
- class extraction_methods.plugins.stac_extension.STACExtensionInput(*, exists_key: str = '$', exists_delimiter: str = '.', extensions: list[STACExtension])
- Bases: - Input- Model for STAC Extension Input. - extensions: list[STACExtension]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'extensions': FieldInfo(annotation=list[STACExtension], required=True, description='List of extensions.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 
extraction_methods.plugins.string_template module
String Template Method
- class extraction_methods.plugins.string_template.StringTemplateExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - string_template- Description:
- Accepts a template and output_key. Existing terms are substituted into the template and the result is written to output_key. 
 - Configuration Options:
 - ``template``: ``REQUIRED`` Template to follow.
 - ``descructive``: True if terms should be removed after templating.
 - ``output_key``: ``REQUIRED`` key to output to.
 - Example Configuration:

   .. code-block:: yaml

       - method: string_template
         inputs:
           template: "{hello}/{goodbye}/{hello}/bonjour.html"
           output_key: manifest_url
 - input_class
- alias of - StringTemplateInput
 - run(body: dict[str, Any]) Any
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
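
The templating step can be approximated with ``str.format``. The sketch below is a hedged illustration: the helper ``apply_template`` and its ``destructive`` argument are names chosen for this example (the model field is spelled ``descructive``), and it assumes that the destructive mode removes exactly the terms named in the template.

```python
from string import Formatter
from typing import Any


def apply_template(
    body: dict[str, Any],
    template: str,
    output_key: str,
    destructive: bool = False,
) -> dict[str, Any]:
    """Fill a template from existing terms (hypothetical helper)."""
    result = dict(body)
    result[output_key] = template.format(**body)
    if destructive:
        # Formatter.parse yields (literal, field_name, spec, conversion).
        for _, field_name, _, _ in Formatter().parse(template):
            if field_name and field_name != output_key:
                result.pop(field_name, None)
    return result


body = {"hello": "a", "goodbye": "b", "other": 1}
template = "{hello}/{goodbye}/{hello}/bonjour.html"
kept = apply_template(body, template, "manifest_url")
dropped = apply_template(body, template, "manifest_url", destructive=True)
print(kept["manifest_url"])
print(dropped)
```

A term may appear more than once in the template, as ``hello`` does here; ``pop`` with a default keeps the second removal from raising.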
 
 
- class extraction_methods.plugins.string_template.StringTemplateInput(*, exists_key: str = '$', exists_delimiter: str = '.', template: str, descructive: bool = False, output_key: str)
- Bases: - Input- Model for String Template Input. - descructive: bool
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'descructive': FieldInfo(annotation=bool, required=False, default=False, description='True if terms should be removed after templating.'), 'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'output_key': FieldInfo(annotation=str, required=True, description='key to output to.'), 'template': FieldInfo(annotation=str, required=True, description='Template to follow.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - output_key: str
 - template: str
 
extraction_methods.plugins.xml module
XML Method
- class extraction_methods.plugins.xml.XMLExtract(*args: Any, **kwargs: Any)
- Bases: - ExtractionMethod- Method: - xml- Description:
- Processes XML documents to extract metadata 
 - Configuration Options:
 - ``input_term``: Term for method to run on.
 - ``template``: ``REQUIRED`` Template to follow.
 - ``properties``: ``REQUIRED`` List of properties to retrieve from the document.
 - ``namespaces``: ``REQUIRED`` Map of namespaces.
 - Extraction Keys:
 - Extraction keys should be a map with the following entries.
 - ``key``: Key of the property. Passed to xml.etree.ElementTree.find(), so XPath formatted accessors are also supported.
 - ``output_key``: Key to output to.
 - ``attribute``: Name of an element attribute to read. By default the text value of the matched element is used; set this when the value you need is stored in an attribute of the element instead.
 - Example configuration:

   .. code-block:: yaml

       - method: xml
         inputs:
           properties:
             - output_key: start_datetime
               key: './/gml:beginPosition'
               attribute: start
 
 
 - run(body: dict[str, Any]) dict[str, Any]
- Run the method. - Parameters:
- body (dict) – current generated properties 
- Returns:
- updated body dict 
- Return type:
- dict 
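
The difference between reading an element's text and reading one of its attributes can be shown with ``xml.etree.ElementTree`` directly. This is a sketch under stated assumptions: the sample document, the namespace URI, and the helper ``extract_properties`` are invented for illustration, and falling back to ``key`` when ``output_key`` is empty is a guess at how the default is handled.

```python
import xml.etree.ElementTree as ET

# Invented sample document for illustration.
DOCUMENT = """<root xmlns:gml="http://www.opengis.net/gml">
  <gml:beginPosition start="2020-01-01T00:00:00Z">2020-01-01</gml:beginPosition>
</root>"""


def extract_properties(xml_text, properties, namespaces):
    """Pull values out of an XML string (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    output = {}
    for prop in properties:
        element = root.find(prop["key"], namespaces)
        if element is None:
            continue  # property not present in this document
        attribute = prop.get("attribute", "")
        # Read the named attribute if given, otherwise the element text.
        value = element.get(attribute) if attribute else element.text
        output[prop.get("output_key") or prop["key"]] = value
    return output


namespaces = {"gml": "http://www.opengis.net/gml"}
properties = [
    {"key": ".//gml:beginPosition", "output_key": "start_datetime", "attribute": "start"},
    {"key": ".//gml:beginPosition", "output_key": "start_text"},
]
extracted = extract_properties(DOCUMENT, properties, namespaces)
print(extracted)
```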
 
 
- class extraction_methods.plugins.xml.XMLInput(*, exists_key: str = '$', exists_delimiter: str = '.', input_term: str = '$uri', properties: list[XMLProperty], namespaces: dict[str, str])
- Bases: - Input- Model for XML Input. - input_term: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'exists_delimiter': FieldInfo(annotation=str, required=False, default='.', description='Delimiter for nested exists terms.'), 'exists_key': FieldInfo(annotation=str, required=False, default='$', description='Key to signify a previously extracted terms.'), 'input_term': FieldInfo(annotation=str, required=False, default='$uri', description='Term for method to run on.'), 'namespaces': FieldInfo(annotation=dict[str, str], required=True, description='Map of namespaces.'), 'properties': FieldInfo(annotation=list[XMLProperty], required=True, description='List of properties to retrieve from the document.')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1. 
 - namespaces: dict[str, str]
 - properties: list[XMLProperty]
 
- class extraction_methods.plugins.xml.XMLProperty(*, key: str, output_key: str = '', attribute: str = '')
- Bases: - KeyOutputKey- Model for XML property. - attribute: str
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_fields: ClassVar[dict[str, FieldInfo]] = {'attribute': FieldInfo(annotation=str, required=False, default='', description='Attribute of the XML property.'), 'key': FieldInfo(annotation=str, required=True), 'output_key': FieldInfo(annotation=str, required=False, default='')}
- Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo]. - This replaces Model.__fields__ from Pydantic V1.