Extracting Emerging Knowledge from Social Media

Today I presented our full paper titled “Extracting Emerging Knowledge from Social Media” at the WWW 2017 conference.

The work is based on a rather obvious assumption: knowledge in the world continuously evolves, and ontologies are largely incomplete with respect to low-frequency data, which belongs to the so-called long tail.

Socially produced content is an excellent source for discovering emerging knowledge: it is huge, and it immediately reflects the changes in which emerging entities are hidden.

In the paper we propose a method and a tool for discovering emerging entities by extracting them from social media.

Once instrumented by experts through a very simple initialization, the method is capable of finding emerging entities. We propose a mixed syntactic + semantic method that works in three steps. First, it uses seeds, i.e., prototypes of emerging entities provided by experts, to generate candidates. Then, it associates each candidate with a feature vector built from the terms occurring in its social content. Finally, it ranks the candidates by their distance from the centroid of the seeds and returns the top candidates as the result.

The method can be continuously or periodically iterated, using the results as new seeds.
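The ranking step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation from the paper: the bag-of-words term vectors, the cosine distance, the function names, and the toy seed and candidate texts are all hypothetical choices made for the example.

```python
from collections import Counter
import math


def term_vector(text):
    """Bag-of-words term-frequency vector (an assumed, simplistic feature choice)."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Component-wise mean of sparse term vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: count / len(vectors) for term, count in total.items()}


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def rank_candidates(seed_texts, candidate_texts, top_k=2):
    """Return the names of the top_k candidates closest to the seed centroid."""
    seed_centroid = centroid([term_vector(t) for t in seed_texts.values()])
    scored = [
        (cosine_distance(term_vector(text), seed_centroid), name)
        for name, text in candidate_texts.items()
    ]
    scored.sort()  # smallest distance first
    return [name for _, name in scored[:top_k]]


# Hypothetical toy data: entity name -> terms from its social content.
seeds = {
    "designer_a": "fashion runway collection milan designer",
    "designer_b": "fashion designer atelier collection show",
}
candidates = {
    "newcomer": "young designer debut collection fashion week",
    "football_fan": "match goal stadium league",
    "blogger": "fashion outfit photo",
}
top = rank_candidates(seeds, candidates, top_k=2)
print(top)
```

To iterate, the texts of the returned candidates could be merged into `seeds` before calling `rank_candidates` again on a fresh set of candidates.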

The PDF of the full paper presented at WWW 2017 is available online (open access with a Creative Commons license).

You can also check out the slides of my presentation on Slideshare.

A version of the tool is available online for free use, thanks also to our partners Dandelion API and Microsoft Azure. The most recent version of the tool is available on GitHub here.

Reason and meaning of (top) models

Photo of a top model girl

After a semester teaching model-driven software engineering in my Advanced Software Engineering course at Politecnico, I feel the urge to say once again a few words about the reason for models, inspired also by some content in our last book, Model-Driven Software Engineering in Practice, or MDSE (see more here or on www.mdse-book.com).

By the way, don’t mind the girl in the picture for now; we will come back to her later. For now, you (and my students) can just consider her an attention-catching trick.

My main point is that you cannot avoid modeling.
The human mind inadvertently and continuously re-works reality by applying cognitive processes that alter the subjective perception of it. Humans generate a mental representation of the reality which is at the same time able to:

  • generalize specific features of real objects (generalization);
  • classify the objects into coherent clusters (classification);
  • aggregate objects into more complex ones (aggregation).

These are natural behaviors that the human mind is natively able to perform (babies start performing these processes when they are only a few months old) and that people carry out in their everyday life. This process is known as abstraction. It is also widely applied in science and technology, where it is often referred to as modeling.

So, we can informally define a model as a simplified or partial representation of reality, defined in order to accomplish a task or to reach an agreement on a topic. Therefore, by definition, a model will never describe reality in its entirety.

And here we come back to the girl in the picture. She is a “model” (a top model, in fact). She is not reality. She is an idealized representation of reality, incarnating beauty, grace, desire or whatever feature you want to name, which is instrumental to some purpose (in this case, to show and sell clothes). You will not see all the aspects of her life. You only see her as a partial abstraction of the concept of “desirable girl”.

If we want to go back to more “serious” usages, models have been and are of central importance in many scientific contexts. Just think about physics or chemistry: the billiard ball model of a gas or the Bohr model of the atom are probably unacceptable simplifications of reality from many points of view, but at the same time have been paramount for understanding the basics of the field; the uniform motion model in physics is something that will never be accomplished in the real world, but is extremely useful for teaching purposes and as a basis for subsequent, more complex theories. Mathematics and other formal descriptions have been extremely useful in all fields for modeling and building upon models.
Modeling has proven very effective at description and powerful at prediction.

A huge branch of philosophy of science itself is based on models. Thinking about models at the abstract and philosophical level raises questions in semantics (i.e., the representational function performed by models), ontology (i.e., the kind of things that models are), epistemology (i.e., how to learn through or from models) and philosophy.

In many senses, and considering also that observers and observations are recognized to alter reality itself, one can agree at a philosophical level that “everything is a model”, since nothing can be processed by the human mind without being “modeled”.

Therefore, it’s not surprising that models have become crucial also in technical fields such as mechanics, civil engineering, and ultimately in computer science and computer engineering.
Within production processes, modeling allows us to investigate, verify, document, and discuss properties of products before they are actually produced. In many cases, models are even used for directly automating the production of goods.

That’s why a discussion about whether modeling is good or bad is not really appropriate. As I said at the beginning, we all create a mental model of reality, all the time. This is even more true when dealing with objects or systems that need to be developed: in this case, the developer must have in mind a model for her/his objective.

The model always exists; the only choice designers have is about its form: it may be mental (existing only in the designers’ heads) or explicit. In other words, the designer can decide whether to dedicate effort to realizing an explicit representation of the model or to keep it within her/his own mind.

Model-driven Software Engineering in Practice (MDSE)

I find this discussion intriguing and profoundly motivating for neophytes. You can read more on this and on the purpose of modeling in the book Model-Driven Software Engineering in Practice, written by Jordi Cabot, Manuel Wimmer and myself (see more here and on www.mdse-book.com).

To stay updated on my activities you can subscribe to the RSS feed of my blog or follow my Twitter account (@MarcoBrambi).