Representations
Representations convert a model-agnostic TemplateSequence into the concrete
input shape expected by a detector.
A representation receives the full TemplateSequence, not just
sequence.templates. That means custom representations can use event timing
(dt_prev_ms), extracted parameters, entity IDs, or any other sequence
metadata when building model inputs.
>>> from anomalog.representations import (
...     SequentialRepresentation,
...     TemplateCountRepresentation,
...     TemplatePhraseRepresentation,
... )
>>> from anomalog.sequences import TemplateSequence
>>> sequence = TemplateSequence(
...     events=[
...         ("Error on node <*>", ["7"], None),
...         ("Error on node <*>", ["8"], 50),
...     ],
...     label=1,
...     entity_ids=["node-7"],
...     window_id=3,
... )
>>> SequentialRepresentation().represent(sequence)
['Error on node <*>', 'Error on node <*>']
>>> TemplateCountRepresentation().represent(sequence)
Counter({'Error on node <*>': 2})
>>> TemplatePhraseRepresentation(phrase_ngram_min=1, phrase_ngram_max=1).represent(sequence)
Counter({'error on node <*>': 2, 'error': 2, 'on': 2, 'node': 2})
Use:
- SequentialRepresentation for ordered template streams
- TemplateCountRepresentation for sparse template-count vectors
- TemplatePhraseRepresentation for phrase-count features derived from template text
The built-ins are intentionally template-centric, but that is a choice of those representations rather than a limit of the interface.
Represented outputs are wrapped in SequenceSample, which preserves
entity_ids, label, split_label, and window_id alongside the
representation payload.
You can also define your own representation by implementing
SequenceRepresentation[T] and passing it to represent_with(...).
>>> from dataclasses import dataclass
>>> @dataclass(frozen=True)
... class SequenceSummaryRepresentation:
...     name = "sequence_summary"
...
...     def represent(self, sequence: TemplateSequence) -> dict[str, int | list[str]]:
...         return {
...             "entity_count": len(sequence.entity_ids),
...             "timed_event_count": sum(
...                 dt_prev_ms is not None for _, _, dt_prev_ms in sequence.events
...             ),
...             "entity_ids": sequence.entity_ids,
...         }
>>> SequenceSummaryRepresentation().represent(sequence)
{'entity_count': 1, 'timed_event_count': 1, 'entity_ids': ['node-7']}
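As a further illustration of using fields other than template text, here is a minimal sketch of a representation that summarizes inter-event gaps. `FakeSequence` is a hypothetical stand-in for TemplateSequence so the snippet runs without anomalog installed; it assumes, as in the example above, that each event is a `(template, parameters, dt_prev_ms)` tuple. `TimingGapRepresentation` is an illustrative name and not part of the library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class FakeSequence:
    """Hypothetical stand-in exposing only the `events` field used below."""

    events: list[tuple[str, list[str], Optional[int]]] = field(default_factory=list)


@dataclass(frozen=True)
class TimingGapRepresentation:
    """Illustrative only: summarize gaps between consecutive events."""

    name = "timing_gaps"

    def represent(self, sequence) -> dict[str, float]:
        # Keep only events that carry a gap to the previous event.
        gaps = [dt for _, _, dt in sequence.events if dt is not None]
        if not gaps:
            return {"gap_count": 0.0, "mean_gap_ms": 0.0, "max_gap_ms": 0.0}
        return {
            "gap_count": float(len(gaps)),
            "mean_gap_ms": sum(gaps) / len(gaps),
            "max_gap_ms": float(max(gaps)),
        }


timing_features = TimingGapRepresentation().represent(
    FakeSequence(
        events=[
            ("Error on node <*>", ["7"], None),
            ("Error on node <*>", ["8"], 50),
            ("Error on node <*>", ["9"], 150),
        ]
    )
)
```

The first event has no predecessor, so only two gaps (50 ms and 150 ms) contribute to the summary.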
anomalog.representations
Public sequence representation exports.
SequenceRepresentation
Bases: Protocol[TRepresentation]
Protocol for converting full grouped sequences into model inputs.
Implementations receive the complete TemplateSequence, including event
timings, extracted parameters, entity IDs, labels, and split metadata, and
may choose whichever fields are relevant for a detector.
represent(sequence)
Convert one grouped sequence into a representation payload.
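Because SequenceRepresentation is a Protocol, a class only needs a compatible `represent` method and does not have to inherit from anything. The sketch below illustrates that idea with `typing.Protocol`. `RepresentationLike` is a hypothetical stand-in rather than the library's definition, and `runtime_checkable` is used here only to make the structural check visible.

```python
from typing import Any, Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class RepresentationLike(Protocol[T_co]):
    """Hypothetical stand-in for a representation protocol."""

    def represent(self, sequence: Any) -> T_co:
        ...


class EventCount:
    """Satisfies the protocol structurally, with no inheritance."""

    def represent(self, sequence: Any) -> int:
        return len(sequence.events)


class NotARepresentation:
    """Lacks `represent`, so it does not satisfy the protocol."""


# isinstance() on a runtime-checkable protocol only checks that the
# method exists; it does not validate the signature or return type.
conforms = isinstance(EventCount(), RepresentationLike)
rejected = isinstance(NotARepresentation(), RepresentationLike)
```

A static type checker is what verifies the argument and return types; the runtime check above is deliberately shallow.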
SequenceRepresentationView
dataclass
Bases: Generic[TRepresentation]
Lazy iterable over represented sequence samples.
The representation stage is the point where a model decides which parts of
TemplateSequence matter; the full sequence object is passed through to the
representation implementation on each iteration.
__iter__()
Yield represented sequence samples.
Yields:
| Type | Description |
|---|---|
| SequenceSample[TRepresentation] | One represented sample per input template sequence. |
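The lazy behaviour described above can be sketched with a small iterable dataclass. `LazyView` and `count_events` are hypothetical names, plain dicts stand in for TemplateSequence, and the sketch yields raw payloads rather than SequenceSample objects to stay short. It is an assumption of this sketch, not a statement about the library, that re-iterating recomputes every payload.

```python
from dataclasses import dataclass
from typing import Any, Callable, Generic, Iterable, Iterator, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class LazyView(Generic[T]):
    """Hypothetical lazy view: nothing is represented until iteration."""

    sequences: Iterable[Any]
    represent: Callable[[Any], T]

    def __iter__(self) -> Iterator[T]:
        for sequence in self.sequences:
            # The full sequence object is handed to the representation.
            yield self.represent(sequence)


calls: list[str] = []


def count_events(sequence: dict) -> int:
    # Record each call so the laziness is observable.
    calls.append(sequence["window_id"])
    return len(sequence["events"])


view = LazyView(
    sequences=[
        {"window_id": "w1", "events": ["a", "b"]},
        {"window_id": "w2", "events": ["c"]},
    ],
    represent=count_events,
)
calls_before_iteration = list(calls)  # still empty: no work done yet
first_pass = list(view)
second_pass = list(view)  # iterating again calls the representation again
```

Constructing the view does no work; the representation runs once per sequence on each pass over the view.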
SequenceSample
dataclass
Bases: Generic[TRepresentation]
Model-ready data derived from a TemplateSequence.
TemplateSequence is the grouped log window; SequenceSample is the
representation-specific payload passed to a detector.
as_labeled_example()
from_sequence(sequence, *, data)
classmethod
Build a model-ready sample from one template sequence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sequence | TemplateSequence | Source grouped sequence carrying labels and metadata. | required |
| data | TRepresentation | Representation payload derived from the sequence. | required |
Returns:
| Type | Description |
|---|---|
| SequenceSample[TRepresentation] | Sample carrying the represented payload together with the original sequence metadata. |
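A minimal sketch of the pattern behind `from_sequence`: metadata is copied from the source sequence, and only `data` is specific to the representation. `FakeSequence` and `FakeSample` are hypothetical stand-ins; the field names follow the ones mentioned on this page (`entity_ids`, `label`, `split_label`, `window_id`), but their types and defaults are assumptions.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class FakeSequence:
    """Hypothetical stand-in for the metadata carried by a TemplateSequence."""

    templates: list[str]
    label: Optional[int]
    entity_ids: list[str]
    window_id: int
    split_label: Optional[str] = None  # assumed optional


@dataclass(frozen=True)
class FakeSample(Generic[T]):
    """Hypothetical stand-in for SequenceSample."""

    data: T
    label: Optional[int]
    entity_ids: list[str]
    window_id: int
    split_label: Optional[str]

    @classmethod
    def from_sequence(cls, sequence: FakeSequence, *, data: T) -> "FakeSample[T]":
        # Copy metadata from the sequence; only `data` varies by representation.
        return cls(
            data=data,
            label=sequence.label,
            entity_ids=list(sequence.entity_ids),
            window_id=sequence.window_id,
            split_label=sequence.split_label,
        )


source = FakeSequence(
    templates=["Error on node <*>", "Error on node <*>"],
    label=1,
    entity_ids=["node-7"],
    window_id=3,
)
sample = FakeSample.from_sequence(source, data=list(source.templates))
```

Making `data` keyword-only, as in the documented signature, keeps call sites explicit about which argument is the payload.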
SequentialRepresentation
dataclass
Bases: SequenceRepresentation[list[str]]
Ordered template-only representation for sequential models.
represent(sequence)
Return the ordered template stream for one sequence.
TemplateCountRepresentation
dataclass
Bases: SequenceRepresentation[Counter[str]]
Count-based representation that intentionally uses template text only.
represent(sequence)
Return one template-count vector.
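Detectors that expect fixed-width input usually need the sparse counts aligned to a vocabulary. The sketch below shows one way to do that with the standard library only. The `to_dense` helper and the `vocabulary` list are hypothetical and not part of anomalog, and the events reuse the `(template, parameters, dt_prev_ms)` tuple shape from the example at the top of this page.

```python
from collections import Counter


def to_dense(counts: Counter, vocabulary: list[str]) -> list[int]:
    """Align sparse counts to a fixed template order (hypothetical helper)."""
    # Counter returns 0 for missing keys, so unseen templates become zeros.
    return [counts[template] for template in vocabulary]


events = [
    ("Error on node <*>", ["7"], None),
    ("Error on node <*>", ["8"], 50),
]
# Count template text only, ignoring parameters and timing.
counts = Counter(template for template, _, _ in events)

vocabulary = ["Connection lost to <*>", "Error on node <*>"]
dense = to_dense(counts, vocabulary)
```

Templates that are absent from `vocabulary` are silently dropped by this helper, so the vocabulary should be built from the training split before converting other splits.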
TemplatePhraseRepresentation
dataclass
Bases: SequenceRepresentation[Counter[str]]
Phrase-count representation derived from template text only.
__post_init__()
Validate phrase extraction settings.
Raises:
| Type | Description |
|---|---|
| ValueError | If the configured n-gram bounds are invalid. |
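The exact validation rules are not documented here. As a hedged sketch, a frozen dataclass can reject bad bounds in `__post_init__` as shown below. `NgramSettings` is a hypothetical class, the parameter names are borrowed from the example at the top of this page, and both the default values and the specific rules (positive bounds, minimum not above maximum) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NgramSettings:
    """Hypothetical settings object; not the library's class."""

    phrase_ngram_min: int = 1
    phrase_ngram_max: int = 1

    def __post_init__(self) -> None:
        # Assumed rules: bounds are positive and min does not exceed max.
        if self.phrase_ngram_min < 1:
            raise ValueError("phrase_ngram_min must be at least 1")
        if self.phrase_ngram_max < self.phrase_ngram_min:
            raise ValueError("phrase_ngram_max must be >= phrase_ngram_min")


def is_valid(minimum: int, maximum: int) -> bool:
    """Return True when the bounds pass the assumed validation rules."""
    try:
        NgramSettings(phrase_ngram_min=minimum, phrase_ngram_max=maximum)
    except ValueError:
        return False
    return True
```

Validating at construction time means a misconfigured representation fails before any sequence is processed.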
represent(sequence)
Return one phrase-count vector.