Representations¶

Representations convert a model-agnostic TemplateSequence into the concrete input shape expected by a detector.

A representation receives the full TemplateSequence, not just sequence.templates. That means custom representations can use event timing (dt_prev_ms), extracted parameters, entity IDs, or any other sequence metadata when building model inputs.

>>> from anomalog.representations import (
...     SequentialRepresentation,
...     TemplateCountRepresentation,
...     TemplatePhraseRepresentation,
... )
>>> from anomalog.sequences import TemplateSequence
>>> sequence = TemplateSequence(
...     events=[
...         ("Error on node <*>", ["7"], None),
...         ("Error on node <*>", ["8"], 50),
...     ],
...     label=1,
...     entity_ids=["node-7"],
...     window_id=3,
... )
>>> SequentialRepresentation().represent(sequence)
['Error on node <*>', 'Error on node <*>']
>>> TemplateCountRepresentation().represent(sequence)
Counter({'Error on node <*>': 2})
>>> TemplatePhraseRepresentation(phrase_ngram_min=1, phrase_ngram_max=1).represent(sequence)
Counter({'error on node <*>': 2, 'error': 2, 'on': 2, 'node': 2})

Use:

- SequentialRepresentation for ordered template streams
- TemplateCountRepresentation for sparse template-count vectors
- TemplatePhraseRepresentation for phrase-count features derived from template text
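To make the first two payload shapes concrete, here is a minimal plain-Python sketch that reproduces the outputs shown in the example above. It does not import anomalog and is not the library's implementation; the `events` list simply reuses the `(template, parameters, dt_prev_ms)` tuples from that example.

```python
from collections import Counter

# Same event tuples as the example above: (template, parameters, dt_prev_ms).
events = [
    ("Error on node <*>", ["7"], None),
    ("Error on node <*>", ["8"], 50),
]

# Ordered template stream: keep every template in arrival order.
ordered_templates = [template for template, _params, _dt in events]

# Template counts: collapse the stream into a frequency mapping.
template_counts = Counter(ordered_templates)

print(ordered_templates)  # ['Error on node <*>', 'Error on node <*>']
print(template_counts)    # Counter({'Error on node <*>': 2})
```

The ordered list keeps position information that sequential models need, while the counter discards order and keeps only frequencies.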

The built-ins are intentionally template-centric, but that is a choice of those representations rather than a limit of the interface.

Represented outputs are wrapped in SequenceSample, which preserves entity_ids, label, split_label, and window_id alongside the representation payload.

You can also define your own representation by implementing SequenceRepresentation[T] and passing it to represent_with(...).

>>> from dataclasses import dataclass
>>> @dataclass(frozen=True)
... class SequenceSummaryRepresentation:
...     name = "sequence_summary"
...
...     def represent(self, sequence: TemplateSequence) -> dict[str, int | list[str]]:
...         return {
...             "entity_count": len(sequence.entity_ids),
...             "timed_event_count": sum(
...                 dt_prev_ms is not None for _, _, dt_prev_ms in sequence.events
...             ),
...             "entity_ids": sequence.entity_ids,
...         }
>>> SequenceSummaryRepresentation().represent(sequence)
{'entity_count': 1, 'timed_event_count': 1, 'entity_ids': ['node-7']}

`anomalog.representations`¶

Public sequence representation exports.

`SequenceRepresentation` ¶

Bases: Protocol[TRepresentation]

Protocol for converting full grouped sequences into model inputs.

Implementations receive the complete TemplateSequence, including event timings, extracted parameters, entity IDs, labels, and split metadata, and may choose whichever fields are relevant for a detector.

Attributes:

Name	Type	Description
`name`	`ClassVar[str]`	Stable registry/config name for the representation.

`represent(sequence)` ¶

Convert one grouped sequence into a representation payload.

Parameters:

Name	Type	Description	Default
`sequence`	`TemplateSequence`	Full grouped sequence carrying events, labels, entity ids, and split metadata.	required

Returns:

Name	Type	Description
`TRepresentation`	`TRepresentation`	Detector-specific representation of the sequence.
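Because the base is a `Protocol`, a class can satisfy it structurally without inheriting from it. The sketch below illustrates that idea with stand-ins only: `RepresentationLike`, `EventCountRepresentation`, and the `SimpleNamespace` sequence are hypothetical names for illustration, not part of anomalog.

```python
from types import SimpleNamespace
from typing import Any, ClassVar, Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class RepresentationLike(Protocol[T_co]):
    """Hypothetical stand-in mirroring the documented protocol shape."""

    name: ClassVar[str]

    def represent(self, sequence: Any) -> T_co: ...


class EventCountRepresentation:
    """Satisfies the protocol structurally; no base class is needed."""

    name: ClassVar[str] = "event_count"

    def represent(self, sequence: Any) -> int:
        return len(sequence.events)


# SimpleNamespace stands in for a TemplateSequence with two events.
fake_sequence = SimpleNamespace(events=[("A", [], None), ("B", [], 50)])
representation = EventCountRepresentation()

# runtime_checkable only verifies that the members exist, not their types.
print(isinstance(representation, RepresentationLike))  # True
print(representation.represent(fake_sequence))  # 2
```

A static type checker performs the stricter check, comparing the `represent` signature and return type against the protocol.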

`SequenceRepresentationView` `dataclass` ¶

Bases: Generic[TRepresentation]

Lazy iterable over represented sequence samples.

The representation stage is the point where a model decides which parts of TemplateSequence matter; the full sequence object is passed through to the representation implementation on each iteration.

Attributes:

Name	Type	Description
`sequences`	`SequenceBuilder`	Underlying sequence builder producing `TemplateSequence` objects lazily.
`representation`	`SequenceRepresentation[TRepresentation]`	Representation applied to each yielded sequence.

`__iter__()` ¶

Yield represented sequence samples.

Yields:

Type	Description
`SequenceSample[TRepresentation]`	One represented sample per input template sequence.

`iter_labeled_examples()` ¶

Yield (x, y) pairs only, intentionally dropping split metadata.

Yields:

Type	Description
`tuple[TRepresentation, int]`	Representation payload and label pairs.
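The lazy behaviour described for this view can be sketched with a small generator-based stand-in. `ToySequence`, `ToyView`, and `count_templates` are hypothetical names used only for illustration; the real view wraps a `SequenceBuilder` and yields `SequenceSample` objects, which this sketch simplifies to tuples.

```python
from collections.abc import Callable, Iterable, Iterator
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ToySequence:
    """Hypothetical stand-in for TemplateSequence."""

    templates: list[str]
    label: int


@dataclass(frozen=True)
class ToyView(Generic[T]):
    """Hypothetical stand-in for SequenceRepresentationView."""

    sequences: Iterable[ToySequence]
    represent: Callable[[ToySequence], T]

    def __iter__(self) -> Iterator[tuple[ToySequence, T]]:
        # The representation runs only when a consumer pulls the next item.
        for sequence in self.sequences:
            yield sequence, self.represent(sequence)

    def iter_labeled_examples(self) -> Iterator[tuple[T, int]]:
        # Keep only (x, y); all other metadata is dropped here.
        for sequence, data in self:
            yield data, sequence.label


calls: list[str] = []


def count_templates(sequence: ToySequence) -> int:
    calls.append("represent")
    return len(sequence.templates)


view = ToyView(
    sequences=[ToySequence(["a", "b"], 0), ToySequence(["c"], 1)],
    represent=count_templates,
)
calls_before_iteration = len(calls)  # 0: constructing the view does no work
pairs = list(view.iter_labeled_examples())
print(calls_before_iteration, pairs)  # 0 [(2, 0), (1, 1)]
```

Deferring the work to iteration means a large log corpus never has to be represented in memory all at once.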

`SequenceSample` `dataclass` ¶

Bases: Generic[TRepresentation]

Model-ready data derived from a TemplateSequence.

TemplateSequence is the grouped log window; SequenceSample is the representation-specific payload passed to a detector.

Attributes:

Name	Type	Description
`data`	`TRepresentation`	Detector-ready representation payload.
`label`	`int`	Sequence-level anomaly label derived from the source window.
`entity_ids`	`list[str]`	Unique entity ids present in the source window.
`split_label`	`SplitLabel`	Train/test split assigned during sequence building.
`window_id`	`int`	Stable window identifier within the sequence builder.

`as_labeled_example()` ¶

Return a generic (x, y) example pair.

Returns:

Type	Description
`tuple[TRepresentation, int]`	Representation payload and label.

`from_sequence(sequence, *, data)` `classmethod` ¶

Build a model-ready sample from one template sequence.

Parameters:

Name	Type	Description	Default
`sequence`	`TemplateSequence`	Source grouped sequence carrying labels and metadata.	required
`data`	`TRepresentation`	Representation payload derived from the sequence.	required

Returns:

Type	Description
`SequenceSample[TRepresentation]`	Sample carrying the represented payload together with the original sequence metadata.
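The relationship between a source sequence and its sample can be sketched as follows. `ToySequence` and `ToySample` are hypothetical stand-ins that mirror the documented attribute names; `split_label` is omitted because the `SplitLabel` type is not described here.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ToySequence:
    """Hypothetical stand-in for TemplateSequence."""

    templates: list[str]
    label: int
    entity_ids: list[str]
    window_id: int


@dataclass(frozen=True)
class ToySample(Generic[T]):
    """Hypothetical stand-in for SequenceSample (split_label omitted)."""

    data: T
    label: int
    entity_ids: list[str]
    window_id: int

    @classmethod
    def from_sequence(cls, sequence: ToySequence, *, data: T) -> "ToySample[T]":
        # Copy metadata from the sequence; only the payload is new.
        return cls(
            data=data,
            label=sequence.label,
            entity_ids=sequence.entity_ids,
            window_id=sequence.window_id,
        )

    def as_labeled_example(self) -> tuple[T, int]:
        return self.data, self.label


sequence = ToySequence(["Error on node <*>"] * 2, 1, ["node-7"], 3)
sample = ToySample.from_sequence(sequence, data=len(sequence.templates))
print(sample.as_labeled_example())  # (2, 1)
print(sample.window_id)  # 3
```

Making `data` keyword-only keeps call sites explicit about which argument is the payload and which is the metadata source.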

`SequentialRepresentation` `dataclass` ¶

Bases: SequenceRepresentation[list[str]]

Ordered template-only representation for sequential models.

Attributes:

Name	Type	Description
`name`	`ClassVar[str]`	Registry/config name for the representation.

`represent(sequence)` ¶

Return the ordered template stream for one sequence.

Parameters:

Name	Type	Description	Default
`sequence`	`TemplateSequence`	Sequence whose template order should be preserved exactly.	required

Returns:

Type	Description
`list[str]`	Ordered template stream for the sequence.

`TemplateCountRepresentation` `dataclass` ¶

Bases: SequenceRepresentation[Counter[str]]

Count-based representation that intentionally uses template text only.

Attributes:

Name	Type	Description
`name`	`ClassVar[str]`	Registry/config name for the representation.

`represent(sequence)` ¶

Return one template-count vector.

Parameters:

Name	Type	Description	Default
`sequence`	`TemplateSequence`	Sequence whose template frequencies are being counted.	required

Returns:

Type	Description
`Counter[str]`	Template-frequency vector for the sequence.
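A `Counter[str]` is a sparse mapping, so detectors that expect fixed-width input need a shared vocabulary. The following is one possible way to densify the counts, assuming the vocabulary is built from the windows themselves; `to_dense` and the second template are hypothetical and not part of anomalog.

```python
from collections import Counter


def to_dense(counts: Counter[str], vocabulary: list[str]) -> list[int]:
    """Project sparse counts onto a fixed template order (hypothetical helper)."""
    # Counter returns 0 for templates that are absent from this window.
    return [counts[template] for template in vocabulary]


windows = [
    Counter({"Error on node <*>": 2}),
    Counter({"Error on node <*>": 1, "Node <*> recovered": 1}),
]

# Sort the union of all templates to get a stable column order.
vocabulary = sorted(set().union(*windows))
vectors = [to_dense(counts, vocabulary) for counts in windows]

print(vocabulary)  # ['Error on node <*>', 'Node <*> recovered']
print(vectors)     # [[2, 0], [1, 1]]
```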

`TemplatePhraseRepresentation` `dataclass` ¶

Bases: SequenceRepresentation[Counter[str]]

Phrase-count representation derived from template text only.

This expands each template into normalized full-template phrases and token n-grams. The representation deliberately ignores parameters and timing so that phrase-based detectors react only to recurring message wording.

Attributes:

Name	Type	Description
`name`	`ClassVar[str]`	Registry/config name for the representation.
`phrase_ngram_min`	`int`	Smallest token n-gram size to emit.
`phrase_ngram_max`	`int`	Largest token n-gram size to emit.
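The effect of the two n-gram bounds can be illustrated with a rough sketch. The normalization (lowercasing, whitespace collapsing) and the alphanumeric tokenizer are assumptions chosen to be consistent with the example output shown earlier on this page; the library's actual rules may differ, and `phrase_counts` is a hypothetical helper.

```python
import re
from collections import Counter


def phrase_counts(
    templates: list[str], ngram_min: int, ngram_max: int
) -> Counter[str]:
    """Count full-template phrases plus token n-grams (assumed rules)."""
    counts: Counter[str] = Counter()
    for template in templates:
        # Assumed normalization: lowercase and collapse whitespace.
        normalized = " ".join(template.lower().split())
        counts[normalized] += 1
        # Assumed tokenizer: alphanumeric runs only, so "<*>" is dropped.
        tokens = re.findall(r"[a-z0-9]+", normalized)
        for n in range(ngram_min, ngram_max + 1):
            for start in range(len(tokens) - n + 1):
                counts[" ".join(tokens[start : start + n])] += 1
    return counts


templates = ["Error on node <*>", "Error on node <*>"]
unigrams = phrase_counts(templates, 1, 1)
bigrams = phrase_counts(templates, 2, 2)

print(unigrams["error on node <*>"], unigrams["node"])  # 2 2
print(sorted(bigrams))  # ['error on', 'error on node <*>', 'on node']
```

Raising `ngram_max` adds longer phrases on top of the shorter ones, so the feature space grows with each additional n-gram size.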

`__post_init__()` ¶

Validate phrase extraction settings.

Raises:

Type	Description
`ValueError`	If the configured n-gram bounds are invalid.
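Validation in `__post_init__` is a common dataclass pattern for rejecting bad configuration at construction time. The exact rules anomalog enforces are not stated here, so the checks below (minimum of at least 1, maximum not below minimum) are assumptions, and `NgramBounds` is a hypothetical class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NgramBounds:
    """Hypothetical config holder with assumed validation rules."""

    phrase_ngram_min: int
    phrase_ngram_max: int

    def __post_init__(self) -> None:
        # Assumed rule: an n-gram must contain at least one token.
        if self.phrase_ngram_min < 1:
            raise ValueError("phrase_ngram_min must be at least 1")
        # Assumed rule: the range must not be empty.
        if self.phrase_ngram_max < self.phrase_ngram_min:
            raise ValueError("phrase_ngram_max must be >= phrase_ngram_min")


valid = NgramBounds(phrase_ngram_min=1, phrase_ngram_max=3)

try:
    NgramBounds(phrase_ngram_min=3, phrase_ngram_max=1)
except ValueError as error:
    rejected_message = str(error)

print(rejected_message)  # phrase_ngram_max must be >= phrase_ngram_min
```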

`represent(sequence)` ¶

Return one phrase-count vector.

Parameters:

Name	Type	Description	Default
`sequence`	`TemplateSequence`	Sequence whose template phrases should be counted.	required

Returns:

Type	Description
`Counter[str]`	Phrase-frequency vector for the sequence.
