Practical DevOps for Big Data/Introduction to Modelling

Introduction

In the scope of this book, we introduce the DICE profile, its foundations and its architecture; further on, we outline how DICE-profiled models can be exploited in DevOps software development phases. The DICE profile has been structured to fit different abstraction levels (DPIM, DTSM, DDSM), similarly to the OMG Model-Driven Architecture (MDA) standard. For its construction, we have followed a guided process, as recommended by state-of-the-art work for building quality profiles. The DICE profile has been implemented and integrated within Papyrus UML, a UML modelling tool based on the well-known Eclipse integrated development environment. The DICE domain models and the DICE profile are publicly available under an open source license in their corresponding repositories, namely the DICE-Models Repository and the DICE-Profiles Repository. In the future, we will focus on the continued validation of the DICE profile.

Constructing a technically correct, high-quality UML profile that covers the concepts needed for data-intensive applications and the corresponding Big Data technologies requires several steps.

First, conceptual models for each abstraction level, i.e. DPIM, DTSM and DDSM, are needed. We carried out this step by carefully reviewing the abstract concepts for modeling data-intensive applications. This yielded the abstractions for the DPIM level, which make up the DICE domain model at DPIM level (see Platform-Independent Modelling). We then reviewed the different Big Data technologies addressed by DICE (e.g., Hadoop, Spark or Storm) and defined the abstractions of interest, obtaining the DICE domain model at DTSM level (see Technology-Specific Modelling). Finally, a Big Data application is deployed over a specific cloud architecture, and this deployment is captured by a DICE deployment model at DDSM level (see Deployment-Specific Modelling).

As a second step, we realized that fresh concepts for quality assessment were needed, since the DICE Profile, at the DPIM and DTSM domain models, initially offered only concepts for describing an architectural view. We therefore searched the literature for existing UML profiles that address quality concerns, and decided to incorporate MARTE and DAM. Hence, the DICE Profile has deep roots in these two profiles. Our task was to select from the domain models of MARTE and DAM those metaclasses of interest for supporting our specific assessment needs. We studied how to integrate these metaclasses with the already developed DPIM domain model. The result was a final domain model that integrates all the needed features: application abstractions at DPIM level and behavioral abstractions for quality assessment.

As a third step, we addressed the technical details of profile construction. We needed to map the concepts of the DICE profile, at the DPIM, DTSM and DDSM domain models, into proper UML profile constructs, i.e., stereotypes and tags. In particular, we have designed: (i) the DICE Library, containing types specific to data-intensive applications; and (ii) the DICE UML Extensions (stereotypes and tags). The objective was to introduce a small yet comprehensive set of stereotypes for the software designer.

Finally, as a fourth and last step, we conducted a DICE UML profile assessment by identifying a set of requirements based on three case studies from different application domains: fraud detection (see Fraud Detection), vessel traffic management (see Maritime Operations), and acquisition of news from social sensors (see News and Media). We checked whether the requirements were met by the profile and, if a requirement was not met, we went back to the previous step to refine the profile. We thus followed an iterative process for the profile definition.
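The third and fourth steps above can be illustrated with a minimal Python sketch. It is not part of the DICE tooling: the class names, the stereotype names (ComputationNode, StorageNode), the tag name (parallelism) and the requirement identifiers are all hypothetical and chosen only for illustration. The sketch represents a profile as a set of stereotypes with tags, and checks which case-study requirements still refer to concepts the profile does not offer, which is the condition that would trigger another refinement iteration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stereotype:
    """A profile stereotype (hypothetical structure, for illustration only)."""
    name: str          # stereotype name, e.g. "ComputationNode" (made up)
    level: str         # abstraction level: "DPIM", "DTSM" or "DDSM"
    extends: str       # UML metaclass the stereotype extends, e.g. "Class"
    tags: tuple = ()   # names of the tags (attributes) of the stereotype


@dataclass
class Profile:
    """A profile seen simply as a collection of stereotypes."""
    stereotypes: list = field(default_factory=list)

    def concepts(self):
        """Every concept the profile offers: stereotypes and 'Stereotype.tag' pairs."""
        names = {s.name for s in self.stereotypes}
        tags = {f"{s.name}.{t}" for s in self.stereotypes for t in s.tags}
        return names | tags


def unmet_requirements(profile, requirements):
    """Map each unmet requirement id to the sorted list of concepts it still lacks.

    `requirements` maps a requirement id to the set of concepts it needs.
    """
    available = profile.concepts()
    return {
        req_id: sorted(needed - available)
        for req_id, needed in requirements.items()
        if not needed <= available
    }


# Hypothetical first version of the profile and two made-up requirements.
profile = Profile([Stereotype("ComputationNode", "DPIM", "Class", ("parallelism",))])
requirements = {
    "R1": {"ComputationNode.parallelism"},
    "R2": {"StorageNode"},
}

first_check = unmet_requirements(profile, requirements)

# Refinement iteration: go back to the previous step and add the missing stereotype.
profile.stereotypes.append(Stereotype("StorageNode", "DPIM", "Class"))
second_check = unmet_requirements(profile, requirements)
```

In this sketch the first check reports that R2 lacks a StorageNode concept, and the second check comes back empty after the refinement. The real assessment was carried out against requirements drawn from the three case studies, not against a lookup of this kind.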

How the Profile Works

While the annotated UML model is useful for the engineer to specify both the workflow of the data-intensive application (DIA) and its data characteristics, it is not suitable for an assessment of its performance requirements. Following the model-driven engineering paradigm, our goal is to define a quality-driven framework for developing DIAs that leverage Big Data technologies. A key asset of DICE is the so-called DICE profile, which offers the ability to design DIAs using UML and a set of additional stereotypes to characterize specific DIA features. DICE-profiled models are the cornerstone of the DICE framework, since they are exploited by the DICE tool-chain to guide developers through the whole DIA lifecycle (e.g., development, quality analysis, deployment, testing and monitoring).

In this chapter, we have presented the DICE profile, its foundations and its architecture, and we have outlined how DICE-profiled models can be exploited in further software development phases. The DICE profile has deep roots in two other profiles, namely MARTE and DAM, and has been structured to fit different abstraction levels (DPIM, DTSM, DDSM), similarly to the MDA standard. For its construction, we have followed a guided process, as recommended by state-of-the-art work for building quality profiles. In the following sections, we give the details of the DICE Profile at each abstraction level.