Regex label

RegexLabelExtract

Bases: ExtractionMethod

Adds the configured label to the output when the input term is a full match for the regex.

Method name: regex_label

Example configuration

- method: regex_label
  inputs:
    label: metadata
    regex: README
    allow_multiple: true
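
Because the method requires a full match, the regex has to account for the whole input term, not just a fragment of it. The following is a minimal sketch using only Python's `re` module; the URIs and the `fully_matches` helper are made up for illustration and are not part of the plugin. It compares the bare pattern `README` with one padded by `.*`:

```python
import re

# Hypothetical input terms; in the plugin this would be the resolved `input_term`.
uris = ["README", "/data/project/README", "/data/project/README.md"]


def fully_matches(pattern: str, term: str) -> bool:
    """Return True only if the pattern matches the entire term."""
    return re.fullmatch(pattern, term) is not None


# The bare pattern only accepts a term that is exactly "README".
exact = [u for u in uris if fully_matches(r"README", u)]

# Padding with `.*` accepts any term that contains "README".
anywhere = [u for u in uris if fully_matches(r".*README.*", u)]

print(exact)
print(anywhere)
```

Under these assumptions, `exact` contains only the term that is literally `README`, while `anywhere` contains all three. If the input term is a full path or URI, a pattern along the lines of `.*README.*` is likely what is wanted.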

Source code in extraction_methods/plugins/regex_label.py
class RegexLabelExtract(ExtractionMethod):
    """
    Adds label if full match of regex.

    **Method name:** ``regex_label``

    Example configuration:
        .. code-block:: yaml

            - method: regex_label
              inputs:
                label: metadata
                regex: README
                allow_multiple: true

    # noqa: W605
    """

    input_class = RegexLabelInput

    @update_input
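    def run(self, body: dict[str, Any]) -> dict[str, Any]:
        # Label only when the regex matches the whole input term.
        if re.fullmatch(self.input.regex, self.input.input_term):
            if self.input.allow_multiple:
                # Append to a list of labels, creating the list on first use.
                body.setdefault(self.input.output_key, []).append(self.input.label)
            else:
                # Single-label mode: overwrite any existing value.
                body[self.input.output_key] = self.input.label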

        return body
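
The effect of `allow_multiple` on the output can be sketched outside the plugin framework. The `apply_label` function below is a hypothetical stand-in that mirrors the logic of `run` with plain arguments; it does not use `ExtractionMethod`, `update_input`, or the real input model, and the terms and labels are invented for the example.

```python
import re
from typing import Any


def apply_label(
    body: dict[str, Any],
    input_term: str,
    regex: str,
    label: str,
    allow_multiple: bool = True,
    output_key: str = "label",
) -> dict[str, Any]:
    """Stand-in for the `run` logic: label `body` if `input_term` fully matches."""
    if re.fullmatch(regex, input_term):
        if allow_multiple:
            # Accumulate labels in a list under `output_key`.
            body.setdefault(output_key, []).append(label)
        else:
            # Store a single label, replacing any previous value.
            body[output_key] = label
    return body


# Two matching rules with allow_multiple=True accumulate into one list.
multi = apply_label({}, "README", r"README", "metadata")
multi = apply_label(multi, "README", r"READ.*", "docs")

# With allow_multiple=False the value is a plain string.
single = apply_label({}, "README", r"README", "metadata", allow_multiple=False)

# A term that does not fully match leaves the body unchanged.
unmatched = apply_label({}, "notes.txt", r"README", "metadata")

print(multi)
print(single)
print(unmatched)
```

One consequence of this logic is that the two modes should not be mixed on the same `output_key`. If a single-label rule has already stored a string there, a later `allow_multiple` rule would call `.append` on that string and raise `AttributeError`.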

RegexLabelInput

Bases: Input

Model for Regex Label Input.

Parameters:

Name            Type  Description                           Default
input_term      str   term for method to run on.            '$uri'
label           str   Label to add if regex passes.         required
regex           str   Regex to test against.                required
allow_multiple  bool  True if multiple labels are allowed.  True
output_key      str   Term for method to output to.         'label'

Source code in extraction_methods/plugins/regex_label.py
class RegexLabelInput(Input):
    """
    Model for Regex Label Input.
    """

    input_term: str = Field(
        default="$uri",
        description="term for method to run on.",
    )
    label: str = Field(
        description="Label to add if regex passes.",
    )
    regex: str = Field(
        description="Regex to test against.",
    )
    allow_multiple: bool = Field(
        default=True,
        description="True if multiple labels are allowed.",
    )
    output_key: str = Field(
        default="label",
        description="Term for method to output to.",
    )