Myths and Challenges in Knowledge Extraction and Big Data Analysis

For centuries, science (in German “Wissenschaft”) has aimed to create (“schaffen”) new knowledge (“Wissen”) from the observation of physical phenomena, their modelling, and empirical validation.

Recently, a new source of knowledge has emerged: not (only) the physical world any more, but the virtual world, namely the Web with its ever-growing stream of data materialized in the form of social network chatter, content produced on demand by crowds of people, and messages exchanged among interlinked devices in the Internet of Things. The knowledge we may find there can be dispersed, informal, contradictory, unsubstantiated and ephemeral today, yet commonly accepted as soon as tomorrow.

The challenge is once again to capture and create consolidated knowledge that is new, has not been formalized yet in existing knowledge bases, and is buried inside a big, moving target (the live stream of online data).

The myth is that existing tools (spanning fields like semantic web, machine learning, statistics, NLP, and so on) suffice for this objective. While this is still far from true, some existing approaches do address the problem and provide preliminary insights into what successful attempts may lead to.

I gave a few keynote speeches on this matter (at ICEIS, KDWEB,…), and I also use this argument as a motivating class in academic courses, to help students understand how crucial it is to focus on the problems related to big data modeling and analysis. The talk, reported in the slides below, explores the mixed realistic-utopian domain of data analysis and knowledge extraction through real industrial use cases, and reports on some tools and cases where the digital and physical worlds have been brought together to better understand our society.

The presentation is available on SlideShare and is reported here below:

Modeling, Modeling, Modeling: From Web to Enterprise to Crowd to Social

This is our perspective on the world: it’s all about modeling. 

So, why is it that model-driven engineering is not taking over the whole technological and social eco-system?

Let me make the case that it is.

On the occasion of the 25th edition of the Italian Symposium of Database Systems (SEBD 2017) we (Stefano Ceri and I) were asked to write a retrospective on the last years of database and systems research from our perspective, published in a dedicated volume by Springer, A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. After some brainstorming, we agreed that it all boils down to this: modeling, modeling, modeling.

A long time ago, in the past century, the International DB Research Community used to meet to assess new research directions, starting the meetings with 2-minute gong shows in which everyone could state their opinion and influence the follow-up discussion. Bruce Lindsay from IBM had just been quoted for his message:

There are 3 important things in data management: performance, performance, performance.

Stefano Ceri had a chance to speak out immediately after and to give a syntactically similar but semantically orthogonal message:

There are 3 important things in data management: modeling, modeling, modeling.

Data management is continuously evolving for serving the needs of an increasingly connected society. New challenges apply not only to systems and technology, but also to the models and abstractions for capturing new application requirements.

In our retrospective paper, we describe several models and abstractions which have been progressively designed to capture new forms of data-centered interactions in the last twenty-five years, a period of huge changes due to the spread of web-based applications and the increasingly relevant role of social interactions.

We initially focus on Web-based applications for individuals, then discuss applications among enterprises, and this is all about WebML and IFML; then we discuss how these applications may include rankings which are computed using services or using crowds, and this is related to our work on crowdsourcing (liquid query and the CrowdSearcher tool); we conclude with hints at recent research on how social sources can be used for capturing emerging knowledge (the social knowledge extractor perspective and tooling).


All in all, modeling as a cognitive tool is all around us, and is growing in terms of potential impact thanks to formal cognification.

It’s also true that model-driven engineering is not necessarily the tool of choice for this to happen. Why? As technicians, we always tend to blame the customer for not understanding our product. But maybe we should look into ourselves and the kind of tools (conceptual and technical) the MDE community is offering. I’m pretty sure we could find plenty of room for improvement.

Any idea on how to do this?


A Vision towards the Cognification of Model-driven Software Engineering

Jordi Cabot, Robert Clarisó, Marco Brambilla and Sébastien Gerard submitted a visionary paper on Cognifying Model-driven Software Development to the workshop GrandMDE (Grand Challenges in Modeling) co-located with STAF 2017 in Marburg (Germany) on July 17, 2017. The paper advocates for the cross-domain fertilization of disciplines such as machine learning and artificial intelligence, behavioural analytics, social studies, cognitive science, crowdsourcing and many more, in order to help model-driven software development.
But actually, what is cognification?

Cognification is the application of knowledge to boost the performance and impact of any process.

It is recognized as one of the 12 technological forces that will shape our future. We are flooded with data, ideas, people, activities, businesses, and goals. But this flooding could even be helpful.

The thesis of our paper is that cognification will also revolutionize the way software is built. In particular, we discuss the opportunities and challenges of cognifying Model-Driven Software Engineering (MDSE or MDE) tasks.

MDE has seen limited adoption in the software development industry, probably because the perception from developers’ and managers’ perspective is that its benefits do not outweigh its costs.

We believe cognification could drastically improve the benefits and reduce the costs of adopting MDSE, and thus boost its adoption.

At the practical level, cognification comprises tools that range from artificial intelligence (machine learning, deep learning) to human cognitive capabilities, exploited through online activities, crowdsourcing, gamification and so on.

Opportunities (and challenges) for MDE

Here is a set of MDSE tasks and tools whose benefits can be especially boosted thanks to cognification.

  • A modeling bot playing the role of virtual assistant in the modeling tasks
  • A model inferencer able to deduce a common schema behind a set of unstructured data coming from the software process
  • A code generator able to learn the style and best practices of a company
  • A real-time model reviewer able to give continuous quality feedback
  • A morphing modeling tool, able to adapt its interface at run-time
  • A semantic reasoning platform able to map modeled concepts to existing ontologies
  • A data fusion engine that is able to perform semantic integration and impact analysis of design-time models with runtime data
  • A tool for collaboration between domain experts and modeling designers
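
To make one of the items above more concrete, here is a minimal sketch of what a very naive "model inferencer" could do: scan a set of loosely structured records and deduce a common schema (field names, observed types, and whether a field is always present). This is only an illustration under our own assumptions, not the approach described in the paper; the function name infer_schema and the sample records are hypothetical.

```python
from collections import defaultdict


def infer_schema(records):
    """Deduce a common schema from a list of dict-like records.

    For each field, report the type names observed and whether the
    field appears in every record (i.e. looks mandatory).
    """
    types = defaultdict(set)   # field name -> set of observed type names
    counts = defaultdict(int)  # field name -> number of records containing it
    for record in records:
        for field, value in record.items():
            types[field].add(type(value).__name__)
            counts[field] += 1
    total = len(records)
    return {
        field: {
            "types": sorted(types[field]),
            "required": counts[field] == total,
        }
        for field in sorted(types)
    }


# Hypothetical records, e.g. entries exported from an issue tracker.
records = [
    {"id": 1, "title": "Login fails", "assignee": "anna"},
    {"id": 2, "title": "Add export", "estimate": 3.5},
    {"id": "3", "title": "Fix typo"},
]

schema = infer_schema(records)
for field, info in schema.items():
    print(field, info)
```

A real inferencer would of course need to cope with nested structures, naming variations and noisy values; the sketch only shows the basic idea of generalizing from instances to a model.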

A disclaimer

Obviously, we are aware that some research initiatives aiming at cognifying specific tasks in Software Engineering exist (including some activities of ours). But what we claim here is a change in magnitude of their coverage, integration, and impact in the short-term future.

If you want to get a more detailed description, you can go through the detailed post by Jordi Cabot that reports the whole content of the paper.

Model-driven Development of User Interfaces for IoT via Domain-specific Components & Patterns

This is the summary of a joint contribution with Eric Umuhoza to ICEIS 2017 on Model-driven Development of User Interfaces for IoT via Domain-specific Components & Patterns.
Internet of Things technologies and applications are evolving and continuously gaining traction in all fields and environments, including homes, cities, services, industry and commercial enterprises. However, many problems still need to be addressed.
For instance, the IoT vision is mainly focused on the technological and infrastructure aspects, and on the management and analysis of the huge amount of generated data, while so far the development of front-ends and user interfaces for IoT has not played a relevant role in research.
On the contrary, we believe that user interfaces in the IoT ecosystem can play a key role in the acceptance of solutions by final adopters.
In this paper we present a model-driven approach to the design of IoT interfaces, by defining a specific visual design language and design patterns for IoT applications, and we show them at work. The language we propose is defined as an extension of the OMG standard language called IFML.

The slides of this talk are available online on Slideshare as usual:

Business Process Management & Enterprise Architecture track of ACM SAC 2017

This year I’m co-organizing with Davide Rossi and a bunch of experts in Business Process Management and Enterprise Architecture a new event called BPM-EA, which aims at bringing together the broad topics of business processes, modeling, and enterprise architecture.

These disciplines are quickly evolving and intertwining with each other, and are often referred to with the broad term of business modeling.
I believe there is a strong need to explore new paths of improvement, integration and consolidation of these disciplines.
If you are interested in participating and contributing, we seek contributions in the areas of enterprise and systems architecture and modeling, multilevel model tracing and alignment, model transformation, IT & business alignment (both in terms of modeling and goals), tackling both technical (languages, systems, patterns, tools) and social (collaboration, human-in-the-loop) issues.
The deadline for submitting a paper is September 15, 2016.
You can find the complete call and further details on the event website:
BPMEA track at SAC 2017

Feel free to share your ideas, opinions and criticisms here or as a submission to the event.

To keep updated on my activities you can subscribe to the RSS feed of my blog or follow my twitter account (@MarcoBrambi).

Modeling and data science for citizens: multicultural diversity and environmental monitoring at ICWSM

This year we decided to be present at ICWSM 2016 in Cologne, with two contributions that blend model-driven software engineering and big data analysis, to provide value to users and citizens both in terms of high-quality software and of value-added information.

We joined with two papers, respectively:
Model Driven Development of Social Media Environmental Monitoring Applications, presented at SWEEM (Workshop on the Social Web for Environmental and Ecological Monitoring).

Slides here:


Studying Multicultural Diversity of Cities and Neighborhoods through Social Media Language Detection, presented at the CityLab workshop at ICWSM 2016.

The focus of this work is to study cities as melting pots of people with different cultures, religions, and languages. Through multilingual analysis of Twitter content shared within a city, we analyze the prevalent language in the different neighborhoods of the city and we compare the results with census data, in order to highlight any parallels or discrepancies between the two data sources. We show that the officially identified neighborhoods actually represent significantly different communities and that the use of social media as a data source helps to detect those weak signals that are not captured by traditional data.

Slides here:

We are now continuously looking for new datasets and computational challenges. Feel free to ask or to propose ideas!
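
As a starting point for anyone who wants to try something similar, the neighborhood-level comparison described above can be sketched in a few lines. This is a toy illustration and not the pipeline used in the paper: it assumes that each tweet has already been assigned a neighborhood and a detected language code, it uses total variation distance as one possible discrepancy measure (our choice for this sketch), and all the data values are made up.

```python
from collections import Counter, defaultdict


def language_shares(tweets):
    """Turn (neighborhood, language) pairs into per-neighborhood language shares."""
    counts = defaultdict(Counter)
    for neighborhood, lang in tweets:
        counts[neighborhood][lang] += 1
    shares = {}
    for neighborhood, counter in counts.items():
        total = sum(counter.values())
        shares[neighborhood] = {lang: n / total for lang, n in counter.items()}
    return shares


def prevalent_language(shares):
    """Most frequent language in each neighborhood."""
    return {n: max(dist, key=dist.get) for n, dist in shares.items()}


def discrepancy(observed, census):
    """Total variation distance between two language distributions (0 = identical)."""
    langs = set(observed) | set(census)
    return 0.5 * sum(abs(observed.get(l, 0.0) - census.get(l, 0.0)) for l in langs)


# Made-up input: tweets already geolocated and language-tagged.
tweets = [
    ("A", "it"), ("A", "it"), ("A", "en"), ("A", "zh"),
    ("B", "it"), ("B", "ar"), ("B", "ar"), ("B", "ar"),
]
# Made-up census shares for the same two neighborhoods.
census = {
    "A": {"it": 0.5, "en": 0.25, "zh": 0.25},
    "B": {"it": 0.75, "ar": 0.25},
}

shares = language_shares(tweets)
top = prevalent_language(shares)
gaps = {n: discrepancy(shares[n], census[n]) for n in shares}
print(top)
print(gaps)
```

Neighborhoods with a large gap are the ones where the social media signal and the census disagree, and are therefore the interesting ones to look at more closely.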


Ready to crowdsource your modeling language notation?

As model-driven engineering practitioners, we sometimes encounter weird modelling notations for the languages we use… and this is also definitely true for modelling language adopters!

We always end up wondering who could ever have come up with such a terrible syntax for a language, even for very well established notations (including, for instance, some pieces of UML or BPMN). I take it for granted this is a common experience (raise your hand if not).

This led to the idea that syntax definition should also be a more collaborative task. Therefore, we decided to give it a try and test whether crowdsourcing techniques can be used to create and validate language constructs, in particular their concrete syntax (i.e. notation).
As part of our research work in this area, together with Jordi Cabot’s group, we have set up a crowdsourcing campaign as an experiment, using our tool CrowdSearcher.
This boils down to a very simple case: we are asking anyone on the web to look into a very small subset of BPMN, and to participate in 3 simple tasks, including questions for selecting the best notation for some of the BPMN concepts (it won’t take more than 3 minutes).
We ask you to help us by answering these 3 quick questions!
Some disclaimers:
1. We don’t care if you are the world’s expert in BPMN or if you have never heard about it. We want you!
2. We ask you to register before taking the task (just click on the Register button once you enter the task), simply to make sure we only have one submission per person. All the analysis will run on anonymous data.
3. The results of the survey will be made publicly available in the following months.

No, MDE is not Engineering!

Following up on my previous post on the actual “Engineering” contribution of Model Driven Engineering, here is the final result of the 2-day poll posted on Twitter:

While this is definitely not a statistically significant benchmark, I think it’s a significant insight into the field and into how we ourselves (MDE practitioners and researchers) see it.
Basically, there is absolutely no agreement and common understanding!!

On the question of whether MDE is a sound engineering discipline, one third of respondents said yes, one third said no, and one third was not sure. An almost perfectly even distribution!

In summary, if you don’t count uncertainty, here is what we collected:

No, MDE is NOT Engineering!

Does anyone want to comment on this?

You can also go through the discussion between Lionel Briand, Paola Inverardi and Manfred Broy on MDE maturity in my previous post about the panel at ModelsWard 2016.


How Mature is Model-driven Engineering as an Engineering Discipline? – Panel with Manfred Broy, Paola Inverardi and Lionel Briand

Within ModelsWard 2016, just after the opening speech I gave on February 19 in Rome, the opening panel was about the current maturity of model-driven engineering. I also hosted a poll on Twitter on this matter (results are available in this other post).

I’m happy the panelists raised several issues I pointed out myself in the introduction to the conference: as software modelling scientists, we are facing big challenges nowadays, as the focus of modelling is shifting because software is more and more pervasive, in fields like IoT, social networks and social media, personal and wearable devices, and so on.

The panel included the keynote speakers of the conference: Manfred Broy, Paola Inverardi and Lionel Briand, three well-known names in the Software Engineering and Modeling community.

Manfred Broy highlighted:

  • There is a difference between scientific maturity and practical maturity. Sometimes, the latter in companies is far beyond the former.
  • A truck company in Germany has been practicing modelling for years, and now has this take on the world: whatever is not in the models doesn’t exist
  • The current challenges are about how to model cyber-physical systems
  • The flow of models must be clarified: traceability, refinement, model integration are crucial. You must guarantee syntactic and semantic coherence
  • You also need a coherent infrastructure of tools and artefacts that guarantees logical integration. You cannot obtain coherence of models without coherence of tools.
  • You need a lot of automation, otherwise you won’t get practical maturity. This doesn’t mean having end-to-end or round-trip complete model transformations, but you need to push automation as much as possible

Lionel Briand clarified that:

  • by definition, engineering rests on a deep mathematical background as a foundation and implies the application of the scientific method to solving problems
  • maturity can be evaluated in terms of: how foundational the mathematical underpinning is, how many standards and tools exist and are used, whether the scientific approach is used
  • Tools, methods, engineers, and scale of MDE are increasing (i.e., MDE is increasingly difficult to avoid)

Paola Inverardi recalled a position by Jean Bezivin:

  • we need to split Domain Engineering (where the problem is) and Support Engineering (where the solution will be)
  • MDE is the application of modelling principles and tools to any engineering field
  • So: is actually SOFTWARE the main field of interest of model-driven engineering?
  • In the modern interpretation of life, covering from smart cities to embedded, wearable, and cyber-physical systems, is the border between the environment and the system still relevant?
  • In the future we will need to rely less and less on the “creativity” of engineers when building models, and more and more on the scientific/ quantitative/ empirical methods for building models

The debate obviously revolved around these aspects, starting from Bran Selic who asked a very simple question:

Isn’t it the case that the real problem is about the word “modeling”? In any other field (architecture, mechanics, physics) modelling is implicit and obvious. Why not in our community? In the end, what we want to achieve is to raise abstraction and increase automation, nothing else.

Other issues have been raised too:

  • why is there so much difference in attitude towards modelling between Europe and the US?
  • what’s the role of notations and standards in the success / failure of MDE?

What’s your take on this issue?
Feel free to share your thoughts here or on Twitter, mentioning me  (@MarcoBrambi).
Respond to my poll on twitter!


Webinar on WebRatio BPM Platform 8.4

I’m glad to share the video of the most recent webinar on WebRatio BPM Platform, the BPMN-based tool designed to support you in building high-end BPM Web and mobile Apps with a tailored User Experience. If you have never tried WebRatio BPM Platform, here is a summary of what you can do with it:

  • DEVELOP WEB AND MOBILE APPS through prototypes, then change them as many times as you need. No more time wasted building mockups on paper.
  • NO VENDOR LOCK IN thanks to highly optimized generated code that is open, human readable and based on the most recent Java and JS frameworks.
  • DEFINE A CUSTOM WEB OR MOBILE FRONT END for your BPM App and create a customized user interface, giving every channel a different user experience.
  • SUPPORT YOUR USERS’ MOBILITY thanks to the mobile BPM capabilities that let you work on your BPM App on any device, desktop or mobile, and deliver a seamless user experience.
Discover more on the WebRatio site or watch the video of the webinar on YouTube:
