
Regex

RegexExtract

Bases: ExtractionMethod

Searches an input string with a regex containing named capture groups and adds the captured values to the output dictionary, keyed by group name. If the regex does not match, the dictionary is returned unchanged.

Method name: regex

Example configuration

- method: regex
  inputs:
    regex: ^(?:[^_]*_){2}(?P<datetime>\d*)
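
As a rough illustration of what the example pattern captures, the sketch below applies it with Python's standard `re` module. The file name is made up for this example and is not taken from the project. The pattern skips two underscore-terminated fields and then captures the run of digits that follows as `datetime`.

```python
import re

# The pattern from the example configuration above.
PATTERN = r"^(?:[^_]*_){2}(?P<datetime>\d*)"

# Hypothetical file name, invented for illustration only.
sample = "sensor_level2_20200101120000_v1.nc"

match = re.search(PATTERN, sample)
extracted = match.groupdict() if match else {}
print(extracted)  # {'datetime': '20200101120000'}

# Because the group uses \d* (zero or more digits), a third field with no
# leading digits still matches and yields an empty string.
empty = re.search(PATTERN, "a_b_xyz").groupdict()
print(empty)  # {'datetime': ''}
```

If an empty capture is not wanted, `\d+` in place of `\d*` would make the match fail instead; whether that suits a given configuration is a judgment call.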

Source code in extraction_methods/plugins/regex.py
class RegexExtract(ExtractionMethod):
    """
    Takes an input string and a regex with
    named capture groups and returns a dictionary of the values
    extracted using the named capture groups.

    **Method name:** ``regex``

    Example configuration:
        .. code-block:: yaml

            - method: regex
              inputs:
                regex: ^(?:[^_]*_){2}(?P<datetime>\d*)

    # noqa: W605
    """

    input_class = RegexInput

    @update_input
    def run(self, body: dict[str, Any]) -> dict[str, Any]:

        result = re.search(rf"{self.input.regex}", self.input.input_term)

        if result:
            body |= result.groupdict()

        else:
            LOGGER.debug("No matches found for regex extract")

        return body
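
The merge behaviour of `run` can be tried in isolation with a minimal sketch. `regex_extract` below is a hypothetical stand-alone function written for this example; it does not reproduce the `@update_input` decorator or the `RegexInput` model, and it takes the regex and input term as plain arguments instead.

```python
import re
from typing import Any


def regex_extract(body: dict[str, Any], regex: str, input_term: str) -> dict[str, Any]:
    """Hypothetical stand-in mirroring the search-and-merge logic of run()."""
    result = re.search(regex, input_term)
    if result:
        # In-place dict union: named groups are added to body and
        # overwrite any existing keys with the same name.
        body |= result.groupdict()
    return body


pattern = r"^(?:[^_]*_){2}(?P<datetime>\d*)"

# A match adds (or overwrites) the "datetime" key.
matched = regex_extract({"uri": "a_b_2020.nc", "datetime": "old"}, pattern, "a_b_2020.nc")
print(matched)  # {'uri': 'a_b_2020.nc', 'datetime': '2020'}

# No underscores, so the pattern cannot match and body comes back untouched.
unmatched = regex_extract({"uri": "plain.nc"}, pattern, "plain.nc")
print(unmatched)  # {'uri': 'plain.nc'}
```

Note that `|=` mutates the dictionary passed in, so callers that need the original `body` preserved would have to copy it first.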

RegexInput

Bases: Input

Model for Regex Input.

Parameters:

Name        Type  Description                               Default
input_term  str   term for method to run on.                '$uri'
regex       str   The regular expression to match against.  required

Source code in extraction_methods/plugins/regex.py
class RegexInput(Input):
    """
    Model for Regex Input.
    """

    input_term: str = Field(
        default="$uri",
        description="term for method to run on.",
    )
    regex: str = Field(
        description="The regular expression to match against.",
    )
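
Since the keys written to the output come from the named groups in `regex`, it can help to check a pattern before putting it in a configuration. The sketch below uses only the standard library; `named_groups` is a hypothetical helper and not part of the plugin.

```python
import re


def named_groups(regex: str) -> list[str]:
    """Return the named capture groups a pattern defines (hypothetical helper)."""
    # re.compile raises re.error for an invalid pattern; groupindex maps
    # each group name to its group number.
    return list(re.compile(regex).groupindex)


print(named_groups(r"^(?:[^_]*_){2}(?P<datetime>\d*)"))  # ['datetime']

# A pattern with no named groups would match but contribute no keys.
print(named_groups(r"\d+"))  # []
```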