
Processor Abstract Classes

class mtap.processing.Processor[source]

Mixin used by all processor abstract base classes that provides the ability to update serving status and use timers.

update_serving_status(status: str)[source]

Updates the serving status of the processor for health checking.


status (str) – One of “SERVING”, “NOT_SERVING”, “UNKNOWN”.

static started_stopwatch(key: str) -> Stopwatch[source]

An object that can be used to time aspects of processing. The stopwatch will be started at creation.


key – The key to store the time under.


An object that is used to do the timing.


>>> # In a process method
>>> with self.started_stopwatch('key'):
...     # do work
...     ...
static unstarted_stopwatch(key: str) -> Stopwatch[source]

An object that can be used to time aspects of processing. The stopwatch will be stopped at creation.


key – The key to store the time under.


An object that is used to do the timing.


>>> # In a process method
>>> with self.unstarted_stopwatch('key') as stopwatch:
...     for _ in range(10):
...         # work you don't want timed
...         ...
...         stopwatch.start()
...         # work you do want timed
...         ...
...         stopwatch.stop()
class mtap.EventProcessor[source]

Bases: Processor

Abstract base class for an event processor.


>>> class ExampleProcessor(EventProcessor):
...     def process(self, event, params):
...          # do work on the event
...          ...
property custom_label_adapters: Mapping[str, ProtoLabelAdapter]

Optional method used to provide non-standard proto label adapters for specific index names. Default implementation returns an empty dictionary.


A mapping from strings to label adapters.

abstract process(event: Event, params: Dict[str, Any]) -> Dict[str, Any] | None[source]

Performs processing on an event, implemented by the subclass.

  • event – The event object to be processed.

  • params – Processing parameters. A dictionary of strings mapped to json-serializable values.


An arbitrary dictionary of strings mapped to json-serializable values which will be returned to the caller, even remotely.
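Because the returned dictionary may be sent back to a remote caller, it can help to confirm that it is JSON-serializable before returning it. The sketch below is one way to do that with the standard library only. The helper name `check_result` and the sample keys are hypothetical and not part of MTAP.

```python
import json
from typing import Any, Dict, Optional


def check_result(result: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the result unchanged if it is None or a JSON-serializable dict with string keys."""
    if result is None:
        return None
    if not all(isinstance(key, str) for key in result):
        raise TypeError("result keys must be strings")
    # json.dumps raises TypeError for values that JSON cannot represent.
    json.dumps(result)
    return result


ok = check_result({"answer": 42, "tags": ["a", "b"]})

try:
    check_result({"bad": {1, 2, 3}})  # sets are not JSON-serializable
    rejected = False
except TypeError:
    rejected = True
```

A process method could pass its result through a check like this during development to catch values such as sets or custom objects early.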


Can be overridden for cleaning up anything that needs to be cleaned up. Will be called by the framework after it’s done with the processor.

class mtap.DocumentProcessor[source]

Bases: EventProcessor

Abstract base class for a document processor.


>>> class ExampleProcessor(mtap.DocumentProcessor):
...     def process_document(self, document, params):
...         # do processing on document
...         ...

>>> class ExampleProcessor(mtap.DocumentProcessor):
...     def process_document(self, document, params):
...         with self.started_stopwatch('key'):
...             # use stopwatch on something
...             ...
abstract process_document(document: Document, params: Dict[str, Any]) -> Dict[str, Any] | None[source]

Performs processing of a document on an event, implemented by the subclass.

  • document – The document object to be processed.

  • params – Processing parameters. A dictionary of strings mapped to json-serializable values.


An arbitrary dictionary of strings mapped to json-serializable values that will be returned to the caller of the processor.


Can be overridden for cleaning up anything that needs to be cleaned up. Will be called by the framework after it’s done with the processor.

Processor Utilities

class mtap.processing.Stopwatch(key: str | None = None, context: Optional = None)[source]

A class for timing runtime of components and returning the total runtime with the processor’s results.

Although it can be instantiated and used outside a processing context, normal usage is to create one with the Processor.started_stopwatch() or Processor.unstarted_stopwatch() methods.


The amount of time elapsed for this timer.




>>> # in an EventProcessor or DocumentProcessor process method call
>>> with self.started_stopwatch('key'):
...     timed_routine()

>>> # in an EventProcessor or DocumentProcessor process method call
>>> with self.unstarted_stopwatch('key') as stopwatch:
...     for _ in range(10):
...         # work you don't want timed
...         ...
...         stopwatch.start()
...         # work you want timed
...         ...
...         stopwatch.stop()

Starts the timer.


Stops or pauses the timer.
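To make the start/stop behavior concrete, here is a minimal stand-in timer written with the standard library only. It is a sketch of how elapsed time can accumulate across several start/stop cycles, not MTAP's Stopwatch implementation. The class name `SketchStopwatch` and its `elapsed` attribute are hypothetical.

```python
import time


class SketchStopwatch:
    """Stand-in timer that accumulates elapsed time across start/stop cycles."""

    def __init__(self, started: bool = False):
        self.elapsed = 0.0        # total seconds accumulated so far
        self._started_at = None   # perf_counter value at the last start, None when stopped
        if started:
            self.start()

    def start(self) -> None:
        # Starting an already running timer is a no-op.
        if self._started_at is None:
            self._started_at = time.perf_counter()

    def stop(self) -> None:
        # Stopping adds the current interval to the running total.
        if self._started_at is not None:
            self.elapsed += time.perf_counter() - self._started_at
            self._started_at = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.stop()  # count a still-running interval when the block exits
        return False


with SketchStopwatch() as stopwatch:
    for _ in range(3):
        time.sleep(0.01)   # untimed work
        stopwatch.start()
        time.sleep(0.01)   # timed work
        stopwatch.stop()
total = stopwatch.elapsed
```

Only the intervals between start() and stop() contribute to the total, which mirrors the unstarted stopwatch pattern shown in the examples above.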

Processor Description Decorators

Descriptors for processor functionality.

mtap.descriptors.processor(name: str, human_name: str | None = None, description: str | None = None, parameters: List[ParameterDescriptor] | None = None, inputs: List[LabelIndexDescriptor] | None = None, outputs: List[LabelIndexDescriptor] | None = None, additional_data: Dict[str, Any] | None = None) None

Decorator that attaches a service name and metadata to a processor, which can then be used for runtime reflection on how the processor works.


A decorator to be applied to subclasses of EventProcessor or DocumentProcessor. It attaches the metadata so that it can be reflected at runtime.


>>> from mtap.descriptors import processor
>>> from mtap.processing import EventProcessor
>>> @processor('example-text-converter')
... class TextConverter(EventProcessor):
...     ...


>>> from mtap.descriptors import processor
>>> from mtap.processing import DocumentProcessor
>>> @processor('example-sentence-detector')
... class SentenceDetector(DocumentProcessor):
...     ...

From our own example processor:

>>> from mtap.descriptors import label_property, labels, parameter, processor
>>> from mtap.processing import DocumentProcessor
>>> @processor('mtap-example-processor-python',
...            human_name="Python Example Processor",
...            description="counts the number of times the letters a "
...                        "and b occur in a document",
...            parameters=[
...                parameter(
...                    'do_work',
...                    required=True,
...                    data_type='bool',
...                    description="Whether the processor should do "
...                                "anything."
...                )
...            ],
...            outputs=[
...                labels('mtap.examples.letter_counts',
...                       properties=[label_property('letter',
...                                                  data_type='str'),
...                                   label_property('count',
...                                                  data_type='int')])
...            ])
... class ExampleProcessor(DocumentProcessor):
...     ...
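For readers unfamiliar with the pattern, the following standard-library sketch shows how a decorator can attach metadata to a class so that it can be read back at runtime. It illustrates the general technique only. The names `describe`, `reflect`, and the `metadata` attribute are hypothetical and do not reflect how MTAP stores its descriptors.

```python
from typing import Any, Dict, Optional


def describe(name: str, description: Optional[str] = None, **extra: Any):
    """Hypothetical decorator factory that stores metadata on the decorated class."""
    def decorator(cls):
        cls.metadata = {"name": name, "description": description, **extra}
        return cls  # return the class itself so it stays usable as normal
    return decorator


@describe("example-sentence-detector", description="Splits text into sentences.")
class SentenceDetector:
    pass


def reflect(cls) -> Dict[str, Any]:
    """Read a copy of the attached metadata back at runtime."""
    return dict(cls.metadata)


info = reflect(SentenceDetector)
```

Because the decorator returns the original class, the decorated processor behaves exactly as before and simply carries the extra metadata with it.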
mtap.descriptors.parameter(name: str, description: str | None = None, data_type: str | None = None, required: bool = False) None

Alias for ParameterDescriptor.

mtap.descriptors.labels(name: str, reference: str | None = None, name_from_parameter: str | None = None, optional: bool = False, description: str | None = None, properties: List[LabelPropertyDescriptor] | None = None) None

Alias for LabelIndexDescriptor.

mtap.descriptors.label_property(name: str, description: str | None = None, data_type: str | None = None, nullable: bool = False) None

Alias for LabelPropertyDescriptor.

class mtap.descriptors.ProcessorDescriptor(name: str, human_name: str | None = None, description: str | None = None, parameters: List[ParameterDescriptor] | None = None, inputs: List[LabelIndexDescriptor] | None = None, outputs: List[LabelIndexDescriptor] | None = None, additional_data: Dict[str, Any] | None = None)[source]

Decorator that attaches a service name and metadata to a processor, which can then be used for runtime reflection on how the processor works.


A decorator to be applied to subclasses of EventProcessor or DocumentProcessor. It attaches the metadata so that it can be reflected at runtime.



name: str

Identifying service name both for launching via command line and for service registration.

Should be a mix of alphanumeric characters and dashes so that it plays nice with the DNS name requirements of service discovery tools like Consul.
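A processor name can be checked against this guidance before it is registered. The sketch below assumes the usual DNS label rule: letters, digits, and dashes only, beginning and ending with an alphanumeric character, and at most 63 characters. The function name `is_dns_friendly` is hypothetical, and MTAP itself may enforce a different rule or none at all.

```python
import re

# Assumed rule (DNS label): letters, digits, and dashes; starts and ends with
# an alphanumeric character; at most 63 characters in total.
_DNS_LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_dns_friendly(name: str) -> bool:
    """Return True if the name fits the assumed DNS label rule."""
    return _DNS_LABEL.fullmatch(name) is not None


good = is_dns_friendly("mtap-example-processor-python")
bad = is_dns_friendly("bad_name")
```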

human_name: str | None = None

An optional human name for the processor.

description: str | None = None

A short description of the processor and what it does.

parameters: List[ParameterDescriptor] | None = None

The processor’s parameters.

inputs: List[LabelIndexDescriptor] | None = None

String identifiers for the label output from a previously-run processor that this processor requires as an input.

Takes the format "[processor-name]/[output]". Examples would be "tagger/pos_tags" or "sentence-detector/sentences".
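As a small illustration of the "[processor-name]/[output]" format, the sketch below splits such an identifier into its two parts. The names `LabelReference` and `parse_reference` are hypothetical helpers for this example and are not part of MTAP.

```python
from typing import NamedTuple


class LabelReference(NamedTuple):
    processor_name: str
    output: str


def parse_reference(reference: str) -> LabelReference:
    """Split "[processor-name]/[output]" on its first slash."""
    processor_name, sep, output = reference.partition("/")
    if not sep or not processor_name or not output:
        raise ValueError(f"expected '[processor-name]/[output]', got {reference!r}")
    return LabelReference(processor_name, output)


ref = parse_reference("sentence-detector/sentences")
```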

outputs: List[LabelIndexDescriptor] | None = None

The label indices this processor outputs.

additional_data: Dict[str, Any] | None = None

Any other data that should be added to the processor’s metadata. It should be serializable to YAML and JSON.

class mtap.descriptors.ParameterDescriptor(name: str, description: str | None = None, data_type: str | None = None, required: bool = False)[source]

A description of one of the processor’s parameters.

name: str

The parameter name / key.

description: str | None = None

A short description of the parameter and what it does.

data_type: str | None = None

The data type of the parameter. str, float, or bool; List[T] or Mapping[T1, T2] of those.

required: bool = False

Whether the processor parameter is required.

class mtap.descriptors.LabelIndexDescriptor(name: str, reference: str | None = None, name_from_parameter: str | None = None, optional: bool = False, description: str | None = None, properties: List[LabelPropertyDescriptor] | None = None)[source]

A description for a label type.

name: str

The label index name.

reference: str | None = None

If this is an output of another processor, that processor’s name followed by a slash and the default output name of the index go here. Example: “sentence-detector/sentences”.

name_from_parameter: str | None = None

If the label index gets its name from a parameter of the processor, specify that name here.

optional: bool = False

Whether this label index is an optional input or output.

description: str | None = None

A short description of the label index.

properties: List[LabelPropertyDescriptor] | None = None

The properties of the labels in the label index.

class mtap.descriptors.LabelPropertyDescriptor(name: str, description: str | None = None, data_type: str | None = None, nullable: bool = False)[source]

Creates a description for a property on a label.

name: str

The property’s name.

description: str | None = None

A short description of the property.

data_type: str | None = None

The data type of the property. Options are "str", "float", or "bool"; "List[T]" or "Mapping[str, T]" where T is one of those types.

nullable: bool = False

Whether the property can have a valid value of null.

Running Services

mtap.processor_parser() -> ArgumentParser[source]

An ArgumentParser that can be used to parse the settings for run_processor().


A parser containing server settings.


Using this as a parent parser:

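>>> from argparse import ArgumentParser
>>> from mtap import processor_parser, run_processor
>>> parser = ArgumentParser(parents=[processor_parser()])
>>> parser.add_argument('--my-arg-1')
>>> parser.add_argument('--my-arg-2')
>>> args = parser.parse_args()
>>> # MyProcessor is a user-defined EventProcessor subclass
>>> processor = MyProcessor(args.my_arg_1, args.my_arg_2)
>>> run_processor(processor, options=args)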
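The parent-parser mechanism itself comes from argparse and can be tried without MTAP installed. In the sketch below, `stand_in_processor_parser` is a hypothetical replacement for processor_parser(), and its `--port` / `-p` option is an assumption made for illustration.

```python
from argparse import ArgumentParser


def stand_in_processor_parser() -> ArgumentParser:
    """Hypothetical stand-in for processor_parser(); the --port option is assumed."""
    # A parent parser needs add_help=False so that -h is not defined twice.
    parent = ArgumentParser(add_help=False)
    parent.add_argument("--port", "-p", type=int, default=0)
    return parent


# The child parser inherits every argument defined on the parent.
parser = ArgumentParser(parents=[stand_in_processor_parser()])
parser.add_argument("--my-arg-1")
args = parser.parse_args(["-p", "8080", "--my-arg-1", "value"])
```

The resulting namespace holds both the inherited server settings and the processor-specific arguments, which is why a single parse can feed both the processor constructor and run_processor().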
mtap.run_processor(proc: EventProcessor, *, options: Namespace | None = None, args: Sequence[str] | None = None, mp_context=None)[source]

Runs the processor as a gRPC service, blocking until an interrupt signal is received.

  • proc – The processor to host.

  • mp – If true, will create instances of proc on multiple forked processes to process events. This is useful if the processor is computationally intensive and would run into Python GIL issues on a single process.

  • options – The parsed arguments from the parser returned by processor_parser().

  • args – Arguments to parse server settings from if options was not supplied.

  • mp_context – A multiprocessing context that gets passed to the process pool executor in the case of mp = True.


Will automatically parse arguments:

>>> run_processor(MyProcessor())

Manual arguments:

>>> run_processor(MyProcessor(), args=['-p', '8080'])