Technology

MICO Framework

The MICO Platform is an environment for analysing “media in context”. It orchestrates a set of analysis components that work in sequence on a piece of content, each adding its own information to the final result. Examples of analysis components are a “language detector” (identifying the language of a text or an audio track), a “keyframe extractor” (identifying relevant images in a video), a “face detector” (identifying objects that could be faces), a “face recognizer” (assigning faces to concrete persons), an “entity linker” (assigning objects to concrete entities) and a “disambiguation component” (resolving possible alternatives by choosing the more likely one given the context). Further details are described in the deliverables D6.1.1 and D6.2.1.
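The idea of components working in sequence, each adding to a shared result, can be illustrated with a minimal Python sketch. This is not the MICO API: the names `ContentItem`, `language_detector`, `entity_linker` and `run_pipeline` are hypothetical, the language heuristic is a toy stand-in for a real extractor, and the entity lookup table is purely illustrative.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ContentItem:
    """A piece of content plus the annotations accumulated so far."""
    text: str
    annotations: Dict[str, object] = field(default_factory=dict)


# An analysis component takes an item and returns it with added annotations.
Component = Callable[[ContentItem], ContentItem]


def language_detector(item: ContentItem) -> ContentItem:
    # Toy heuristic (assumption): a few German stop words mark the text as "de".
    german_stop_words = {"der", "die", "das", "und"}
    words = set(item.text.lower().split())
    item.annotations["language"] = "de" if words & german_stop_words else "en"
    return item


def entity_linker(item: ContentItem) -> ContentItem:
    # Hypothetical lookup table standing in for a real entity-linking service.
    known_entities = {"Greenpeace": "http://dbpedia.org/resource/Greenpeace"}
    item.annotations["entities"] = {
        name: uri for name, uri in known_entities.items() if name in item.text
    }
    return item


def run_pipeline(item: ContentItem, components: List[Component]) -> ContentItem:
    # Each component sees the annotations left by the components before it.
    for component in components:
        item = component(item)
    return item


result = run_pipeline(
    ContentItem("Greenpeace protests against whale hunting"),
    [language_detector, entity_linker],
)
print(result.annotations)
```

In the platform itself this sequencing is handled by the broker and workflow execution described below, not by a simple in-process loop as in this sketch.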

  • Media Extractors

    The MICO extractors include animal detection, textual analysis, video quality, temporal segmentation, automatic speech recognition, speech-music discrimination, face detection, audio tampering detection and further extractors based on showcase requirements.

  • Media Broker

    Built on top of RabbitMQ, the broker provides processing queues and the infrastructure for service registration and extractor scheduling. This includes an event API for Java and C++ extractors; an extended extractor model covering properties, parameters, multiple inputs and outputs, syntactic vs. semantic descriptions, and contexts; an extractor registration and discovery service; and a “workflow planner” for manual and semi-automatic creation of workflows. Apache Camel is used for workflow execution and dynamic routing in analysis workflows, with support for Enterprise Integration Patterns (EIP).

  • Metadata Model

    Provides a harmonised model for publishing analysis results with common data semantics in the cross-media context. The model is based on RDF and supports all media types, spatial/temporal media fragments, and provenance. Its core concept is the Open Annotation Data Model (OADM), extended with MICO-specific vocabularies and relations. It can represent accumulations and correlations of annotation results, and it facilitates mapping native extractor output to RDF via an API.

  • Querying

    MICO querying extends SPARQL to SPARQL-MM to bridge the gap between interlinked resources and multimedia data, e.g. “point me to scenes within videos where Barack Obama is standing to the left of the CEO of Greenpeace while talking about whale hunting”. It also supports additional features such as asset relatedness and a proper media presentation layer.

  • Recommendation

    In MICO cross-media recommendation, extractor and manual annotations, content metadata, and usage information for all relevant media types can be used to enrich and extend the recommendation models, which in turn provide recommendations for all relevant media types. For example, user preferences for images with lions (identified by automatic image analysis or manual annotation, and by usage information) can be used to recommend related images, documents, posts, videos, or video fragments. The current focus is on: content-based recommendation, using PredictionIO and a monitoring API for data collection and collaborative filtering based on user-item interaction; recommendation of users (using e.g. Apache Mahout); recommendation of fragments (e.g. specific news segments); and support for cross-media recommendation based on e.g. topics specified via media extraction. The persistence layer is based on HDFS (content) and Apache Marmotta (metadata).
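The spatial/temporal media fragments in the metadata model, and relations such as “standing to the left of” in the querying example, can be made concrete with a small Python sketch. It assumes fragments are addressed in the W3C Media Fragments URI style (`#t=start,end` for time in seconds and `#xywh=x,y,w,h` for a pixel region) and handles only that plain form. The helper names `parse_fragment`, `temporally_overlaps` and `left_of`, and the `example.org` URIs, are hypothetical; they are not SPARQL-MM functions, only an illustration of the kind of relation such a query evaluates.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Fragment:
    """A spatio-temporal region of a media asset."""
    start: float  # seconds
    end: float    # seconds
    x: int        # pixels from the left edge
    y: int        # pixels from the top edge
    w: int        # width in pixels
    h: int        # height in pixels


def parse_fragment(uri: str) -> Fragment:
    """Parse a URI such as video.mp4#t=10,20&xywh=160,120,320,240.

    Assumption: both 't' and 'xywh' are present, in plain seconds/pixels.
    """
    pairs = urlparse(uri).fragment.split("&")
    params = dict(pair.split("=", 1) for pair in pairs)
    start, end = (float(v) for v in params["t"].split(","))
    x, y, w, h = (int(v) for v in params["xywh"].split(","))
    return Fragment(start, end, x, y, w, h)


def temporally_overlaps(a: Fragment, b: Fragment) -> bool:
    # Two intervals overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def left_of(a: Fragment, b: Fragment) -> bool:
    # a is left of b when a's right edge does not pass b's left edge.
    return a.x + a.w <= b.x


# Two hypothetical face annotations on the same video.
person_a = parse_fragment("http://example.org/video.mp4#t=10,20&xywh=0,0,100,200")
person_b = parse_fragment("http://example.org/video.mp4#t=15,30&xywh=150,0,100,200")

same_scene = temporally_overlaps(person_a, person_b)
a_left_of_b = left_of(person_a, person_b)
print(same_scene, a_left_of_b)
```

A query like the one quoted above would combine such spatial and temporal checks with ordinary graph patterns that identify who or what each fragment depicts.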

OPEN SOURCE

The MICO platform and the extractor APIs are Open Source Software (OSS), licensed under the Apache License Version 2.0. As for the MICO extractors, there is an “open business” approach: many extractors are OSS (and will be released soon), while others are closed source commercial software (e.g. they have been brought into the project as so-called “background knowledge”). The modular, service-based MICO system architecture makes it easy to develop, deploy and run both types of extractors.

[Figure: mico_extractor_oss]

We used a lot of existing open source software for extractor implementation (see figure). Thus MICO directly benefits from the scientific advances and improvements made within these popular OSS projects.

Apache Stanbol™ provides a set of reusable components for semantic content management.

Apache Stanbol’s intended use is to extend traditional content management systems with semantic services. Other feasible use cases include: direct usage from web applications (e.g. for tag extraction/suggestion, or text completion in search fields), ‘smart’ content workflows or email routing based on extracted entities, topics and more. Apache Stanbol™ is a trademark and project of the Apache Software Foundation independent of MICO.

Apache Marmotta™ (top-level project) is an Open Platform for Linked Data.

Apache Marmotta™ provides an open implementation of a Linked Data Platform that can be used, extended and deployed easily by organisations who want to publish Linked Data or build custom applications on Linked Data. Apache Marmotta™ is a project and a trademark of the Apache Software Foundation, independent of MICO, although our team is committed to the development and continuous growth of this project.