Workflow creation headaches
Designing workflows requires an understanding of all respective extractor interdependencies, parameters and constraints. Beyond that, the overall quality of the workflow annotations often depends heavily on the content used and the context and goal of a specific application.
In short, at least for more complex workflows, it is a tedious and error-prone task: setting up, testing, failing, modifying, testing, failing, modifying again, and so on.
During the second phase of the MICO project, while compiling the broker wishlist (see blog post #1) and deriving the broker design (see blog post #2), we found that semi-automatic workflow creation could be supported. The idea was to use information from the broker data model, including extractor properties, parameters, inputs and outputs, and to implement a GUI that interacts with this model to help a user quickly identify and select compatible extractors, making workflow creation much easier than before.
The 3 pillars and initial extractor selection
As outlined in blog post #2, possible connections between extractors are determined by the MimeType for the data provided/consumed, the SyntacticType to ensure syntactical interoperability, and the respective SemanticType (which can be considered a tag) to signal a semantic match. By using this information, all possible connections are identified and visualized in the GUI (extractors: yellow, extractor output/input: green):
The task of combining extractors is now as simple as activating or deactivating them with a click on the respective (yellow) node. In this example, the selection creates a combination that takes video, applies demux, then triggers Lium speaker diarization and Kaldi ASR, and finally runs language detection and OpenNLP NER on the resulting transcription.
After the selection, and any adaptation of extractor parameters by the user (in many cases, the default settings can be used), the selected combination is converted into an actual workflow. For the above example, this results in the following workflow (extractors: blue, extractor output/input: arrows with green label):
This can now be persisted as a Camel workflow, pushed to the platform, and used for content analysis.
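The matching rule behind this selection step can be illustrated with a minimal Python sketch. It is not the MICO broker implementation: the class names, the `possible_connections` helper, and all type strings (such as `example:Audio`) are placeholders assumed for illustration. The sketch treats two extractors as connectable when one provides a data type whose mimeType, syntacticType, and semanticType all equal a data type the other consumes.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True, order=True)
class DataType:
    """One input or output description (all values below are placeholders)."""
    mime_type: str
    syntactic_type: str
    semantic_type: str


@dataclass(frozen=True)
class Extractor:
    name: str
    consumes: frozenset  # DataType instances accepted as input
    provides: frozenset  # DataType instances produced as output


def possible_connections(extractors):
    """Return (producer, consumer, data type) for every exact three-way match."""
    links = []
    for producer, consumer in permutations(extractors, 2):
        # A connection exists where a provided type equals a consumed type.
        for data_type in sorted(producer.provides & consumer.consumes):
            links.append((producer.name, consumer.name, data_type))
    return links


# Hypothetical type descriptions, not taken from the MICO registry.
MP4 = DataType("video/mp4", "example:Video", "Video")
WAV = DataType("audio/wav", "example:Audio", "Audio")
SEGMENTS = DataType("application/xml", "example:Segments", "SpeakerSegments")
TRANSCRIPT = DataType("text/plain", "example:Text", "Transcript")

demux = Extractor("audio-demux", frozenset({MP4}), frozenset({WAV}))
diarization = Extractor("speaker-diarization", frozenset({WAV}), frozenset({SEGMENTS}))
asr = Extractor("kaldi-asr", frozenset({SEGMENTS}), frozenset({TRANSCRIPT}))

links = possible_connections([demux, diarization, asr])
for producer, consumer, data_type in links:
    print(f"{producer} -> {consumer} via {data_type.mime_type}")
```

A GUI like the one described could draw one edge per returned tuple. Whether the real broker requires exact equality on all three types, or allows looser matching, is an assumption of this sketch.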
Workflow recycling and extension
Of course, existing workflows can also be extended, e.g. to allow processing of different input material. One example of this is shown in the following simple workflow, which applies YoloDetector, DPMDetector, and HOGDetector in parallel in order to detect animals within images:
With an additional click, this workflow can be easily extended with the TVSDetection (Temporal Video Segmentation) component, which provides several functionalities (modes), including key frame extraction, which can be used to allow the processing of videos instead of images:
Making extractor relationships work
The broker data model makes it possible to provide further functionality that simplifies the creation of workflows and the validation of extractor interoperability, e.g.:
- If two extractors provide and consume the same mimeType and syntacticType but do not share a semanticType, they are a potential semantic match and could be linked via a new semanticType; creating this link requires human feedback.
- If two extractors provide and consume the same syntacticType and semanticType but the mimeType does not match, a simple extension of one extractor to support the additional mimeType, e.g., via format conversion, can do the trick to make them interoperable.
One example of the latter is shown in the following two workflows, which differ only in the output mimeType of the audio-demux-mp4 component: it either provides (left) or does not provide (right) audio/wav, which the subsequent speaker-diarization extractor requires.
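The two rules above amount to a small decision procedure, sketched here in self-contained Python. The `diagnose` function, its return strings, and the example type values are hypothetical and only illustrate the idea; in particular, `audio/mp4` as the non-matching demux output is an assumption, since the text only states that audio/wav is missing.

```python
from typing import NamedTuple


class DataType(NamedTuple):
    mime_type: str
    syntactic_type: str
    semantic_type: str


def diagnose(provided: DataType, consumed: DataType) -> str:
    """Classify how close a provided output is to a consumed input."""
    same_mime = provided.mime_type == consumed.mime_type
    same_syntax = provided.syntactic_type == consumed.syntactic_type
    same_semantics = provided.semantic_type == consumed.semantic_type

    if same_mime and same_syntax and same_semantics:
        return "compatible"
    if same_mime and same_syntax:
        # First rule: only the semantic tag differs, so ask a human.
        return "potential semantic match: needs human feedback"
    if same_syntax and same_semantics:
        # Second rule: only the format differs, so conversion may help.
        return "mimeType mismatch: consider format conversion"
    return "incompatible"


# Placeholder values; the syntactic and semantic names are made up.
diarization_input = DataType("audio/wav", "example:Audio", "Audio")
demux_with_wav = DataType("audio/wav", "example:Audio", "Audio")
demux_without_wav = DataType("audio/mp4", "example:Audio", "Audio")

print(diagnose(demux_with_wav, diarization_input))
print(diagnose(demux_without_wav, diarization_input))
```

Checking the full match first keeps the two near-miss cases mutually exclusive, so each pair of data types gets exactly one diagnosis.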
Of course, there are several further ways to support the creation process in the future, and completely different approaches and UIs may be used for implementation!
In the next episode, we will finally discuss how the created workflows are executed.