Urbanscope: Digital Whispers from the Urban Landscape. TEDx Talk Video

Together with the Urbanscope team, we gave a TEDx talk on the topics and results of the project here at Politecnico di Milano. The talk was actually given by our junior researchers, as we wanted it to be a choral performance rather than the typical one-man show.

The message is that cities are not mere physical and organizational devices: they are informational landscapes where places are shaped more and more by streams of data and less by traditional physical evidence. We devise tools and analyses for understanding these streams and the phenomena they represent, in order to better understand our cities.

Two layers coexist: a thick and dynamic layer of digital traces – the informational membrane – grows every day on top of the material layer of the territory, its buildings and infrastructures. Observing, analyzing, and representing these two layers combined provides valuable insights into how the city is used and lived in.

You can now find the video of the talk on the official TEDx YouTube channel:

Urbanscope is a research laboratory that experiments with the collection, organization, analysis, and visualization of cross-domain geo-referenced data.
The research team is based at Politecnico di Milano and encompasses researchers with competencies in Computer Engineering, Communication and Information Design, Management Engineering, and Mathematics.

The aim of Urbanscope is to systematically produce compelling views on urban systems, in order to foster understanding and decision making. Views are like the new lenses of a macroscope: they are designed to support the recognition of specific patterns, thus enabling new perspectives.

If you enjoyed the show, you can explore our beta application at:


and discover the other data science activities we are conducting at the Data Science Lab of Politecnico, DEIB.


The role of Big Data in Banks

I was listening to R. Martin Chavez, Goldman Sachs deputy CFO, just last month at Harvard, at the ComputeFest 2017 event, more precisely at the Symposium on the Future of Computation in Science and Engineering on “Data, Dollars, and Algorithms: The Computational Economy”, held on Thursday, January 19, 2017.

His claim was that

Banks are essentially API providers.

The entire structure and infrastructure of Goldman Sachs is being reorganized accordingly. His case is that you should not compare a bank with a shop or store; you should compare it with Google. Just imagine that every time you wanted to search on Google you had to get in touch (i.e., make a phone call or submit a request) with some Google employee, who at some point would come back to you with the result. Nonsense, right? Well, this is what actually happens with banks. It was happening with consumer-oriented banks before online banking, and it’s still largely happening for business banks.

But this is going to change. The amount of data and the speed and volume of financial transactions don’t allow it any more.

Banks are actually among the richest organizations (not [just] in terms of money, but in terms of data ownership). But they are also craving further, “less official” big data sources.

Juri Marcucci: Importance of Big Data for Central (National) Banks.

Today at the ISTAT National Big Data Committee meeting in Rome, Juri Marcucci from the Bank of Italy discussed their research on integrating Google Trends information into their financial predictive analytics.

Google Trends provides insight into user interests in general, expressed as the probability that a random user will search for a particular keyword (normalized and scaled, with geographical detail down to the city level).

The Bank of Italy is using Google Trends data to complement its predictions of unemployment rates in the short and mid term. It’s definitely a big challenge, but preliminary results are promising in terms of confidence in the obtained models. More details are available in this paper.
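
As an illustration of how a search-interest series can complement an autoregressive predictor, here is a minimal sketch with entirely made-up numbers (the Bank of Italy's actual models are of course far more sophisticated): next month's unemployment rate is regressed on its own lag plus a lagged Google Trends index.

```python
import numpy as np

# Toy monthly series (illustrative numbers, not Bank of Italy data)
unemployment = np.array([11.2, 11.4, 11.5, 11.9, 12.1, 12.0, 12.3, 12.5])
trends_index = np.array([40, 45, 47, 55, 60, 58, 63, 66])  # e.g. searches for a job-related keyword

# Design matrix: intercept, lagged unemployment, lagged trends index
X = np.column_stack([
    np.ones(len(unemployment) - 1),
    unemployment[:-1],
    trends_index[:-1],
])
y = unemployment[1:]

# Ordinary least squares fit, then a one-step-ahead forecast
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
next_month = beta @ [1.0, unemployment[-1], trends_index[-1]]
print(f"one-step-ahead forecast: {next_month:.2f}%")
```

Whether the trends regressor actually improves out-of-sample accuracy is exactly the kind of question the preliminary results mentioned above address.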

Paolo Giudici from the University of Pavia showed how one can correlate the risk of bank defaults with their exposure on Twitter:

Paolo Giudici: bank risk contagion based (also) on Twitter data.

Obviously, all this must take into account the bias of the sources and the quality of the data collected, as Paolo Giudici also pointed out. Assessing the “trustability” of online sources is crucial. In their research, they defined a T-index on Twitter accounts in a way very similar to how academics define the h-index for the relevance of publications, as reported in the photographed slide below.
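
The slide's exact definition is not reproduced here, but assuming the T-index mirrors the h-index construction (an account has T-index t if t of its tweets each received at least t units of engagement, e.g. retweets), a minimal sketch:

```python
def t_index(engagements):
    """h-index-style metric: the largest t such that the account has
    at least t tweets with at least t engagements each."""
    counts = sorted(engagements, reverse=True)
    t = 0
    for rank, value in enumerate(counts, start=1):
        if value >= rank:
            t = rank
        else:
            break
    return t

# Example: an account whose five tweets got these retweet counts
print(t_index([10, 8, 5, 4, 3]))  # → 4: four tweets have at least 4 retweets each
```

The appeal of such a metric is the same as for the h-index: it rewards sustained engagement rather than a single viral tweet.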

Paolo Giudici: T-index describing the quality of Twitter authors in finance.

It’s very interesting to see how creative the use of (non-traditional, web-based) big data is becoming, in very diverse fields, including very traditional ones like macroeconomics and finance.

And once again, I think the biggest challenges and opportunities come from the fusion of multiple data sources together: mobile phones, financial tracks, web searches, online news, social networks, and official statistics.

This is also the path that ISTAT (the Italian national statistics institute) is pursuing. For instance, web scraping techniques covering more than 40,000 e-commerce product prices are now integrated into the calculation of official national inflation rates.
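
As a toy illustration of how scraped prices can feed an index (ISTAT's actual methodology is far more elaborate, with weighting and stratification), here is the Jevons formula, the geometric mean of price relatives commonly used by statistical institutes for elementary price aggregates, applied to a made-up basket:

```python
from math import prod

def jevons_index(base_prices, current_prices):
    """Elementary price index: geometric mean of price relatives
    (Jevons formula), scaled so the base period equals 100."""
    assert len(base_prices) == len(current_prices)
    relatives = [c / b for b, c in zip(base_prices, current_prices)]
    return 100 * prod(relatives) ** (1 / len(relatives))

# Same basket of three products scraped in two months (illustrative prices)
january = [2.50, 10.00, 4.00]
february = [2.60, 10.50, 3.90]
print(f"price index: {jevons_index(january, february):.1f}")  # → price index: 102.1
```

The geometric mean is preferred over the arithmetic one for elementary aggregates because it is less sensitive to outlier price jumps, which scraped e-commerce data is full of.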



The Harvard-Politecnico Joint Program on Data Science in full bloom

After months of preparation, here we are.

This week we kicked off the second edition of the DataShack program on Data Science, which brings together interdisciplinary teams of data science, software engineering & computer science, and design students from Harvard (Institute for Applied Computational Science) and Politecnico di Milano (faculties of Engineering and Design).

The students will address big data extraction, analysis, and visualization problems provided by two real-world stakeholders in Italy: the Como city municipality and Moleskine.

The Moleskine DataShack project will explore the popularity and success of different Moleskine products co-branded with other famous brands (also known as special editions) and launched at specific points in time. The main field of analysis is the impact that different products have on social media channels. Social media analysis will then be correlated with product distribution and sales performance data, along multiple dimensions (temporal, geographical, etc.) and product features.

The Como project consists of collecting and analyzing data about the city and the way people live and move within it, by integrating multiple and diverse data sources. The problems to be addressed include providing estimates of human density and movement within the city, predicting the impact of hypothetical future events, determining the best allocation of sensors in the streets, and defining the optimal user experience and interaction for exploring the city data.

The kickoff meeting of the DataShack 2017 projects, at Harvard. Faculty members Pavlos Protopapas, Stefano Ceri, Paola Bertola, Paolo Ciuccarelli, and myself (Marco Brambilla) are involved in the program.

The teams have been formed, and the problems assigned. I really look forward to advising the groups over the next months and seeing the results that will come out. The students have already shown commitment and engagement. I’m confident that they will be excellent and innovative this year!

For further activities on data science within our group you can refer to the DataScience Lab site, Socialometers, and Urbanscope.


Modeling and Analyzing Engagement in Social Network Challenges

Within a completely new line of research, we are exploring the power of modeling for human behaviour analysis, especially within social networks and/or on the occasion of large-scale live events. Participation in challenges within social networks is a very effective instrument for promoting a brand or event, and it is therefore regarded as an excellent marketing tool.
Our first research on this topic was published in November 2016 at the WISE Conference, covering the analysis of user engagement within social network challenges.
In this paper, we take the challenge organizer’s perspective and study how to raise the engagement of players in challenges where they are stimulated to create and evaluate content, thereby indirectly raising awareness about the brand or event itself. Slides are available on SlideShare:

We illustrate a comprehensive model of the actions and strategies that can be exploited to progressively boost social engagement during the challenge evolution. The model studies the organizer-driven management of interactions among players, and evaluates the effectiveness of each action in light of several other factors (time, repetition, third-party actions, interplay between different social networks, and so on).
We evaluate the model through a set of experiments on a real case, the YourExpo2015 challenge. Overall, our experiments lasted 9 weeks and engaged around 800,000 users on two different social platforms; our quantitative analysis assesses the validity of the model.
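
The published model is not reproduced here, but the flavor of organizer-driven action planning with diminishing returns on repetition can be sketched as follows (action names, lift values, and the decay factor are all illustrative, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str         # organizer action, e.g. "invite" or "repost"
    base_lift: float  # expected engagement gain on first use
    uses: int = 0

def expected_lift(action: Action, decay: float = 0.7) -> float:
    """Toy diminishing-returns model: each repetition of the same
    action yields decay**uses of its base lift."""
    return action.base_lift * (decay ** action.uses)

def plan_next_action(actions):
    """Greedy organizer strategy: fire the action with the highest
    expected lift, then record its use."""
    best = max(actions, key=expected_lift)
    best.uses += 1
    return best.kind

actions = [Action("invite", 100.0), Action("repost", 60.0)]
print([plan_next_action(actions) for _ in range(4)])
# → ['invite', 'invite', 'repost', 'invite']
```

Even this crude sketch shows why repetition matters in the model: the best action to take changes over time as earlier choices wear out.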



To keep updated on my activities you can subscribe to the RSS feed of my blog or follow my Twitter account (@MarcoBrambi).

How Mature Is Model-Driven Engineering as an Engineering Discipline? – Panel with Manfred Broy, Paola Inverardi and Lionel Briand

Within MODELSWARD 2016, just after the opening speech I gave on February 19 in Rome, the opening panel was about the current maturity of model-driven engineering. I also hosted a poll on Twitter on this matter (results are available in this other post).

I’m happy the panelists raised several issues I had pointed out myself in the introduction to the conference: as software modelling scientists, we are facing big challenges nowadays, as the focus of modelling is shifting, due to the fact that software is now more and more pervasive, in fields like the IoT, social networks and social media, personal and wearable devices, and so on.

The panel included the keynote speakers of the conference: Manfred Broy, Paola Inverardi, and Lionel Briand, three well-known names in the Software Engineering and Modeling community.

Manfred Broy highlighted:

  • there is a difference between scientific maturity and practical maturity. Sometimes the latter, in companies, is far ahead of the former.
  • a truck company in Germany has been practicing modelling for years, and now has this take on the world: whatever is not in the models does not exist
  • The current challenges are about how to model cyber-physical systems
  • The flow of models must be clarified: traceability, refinement, and model integration are crucial. You must guarantee syntactic and semantic coherence
  • You also need a coherent infrastructure of tools and artefacts that guarantees logical integration. You cannot obtain coherence of models without coherence of tools.
  • You need a lot of automation, otherwise you won’t get practical maturity. This doesn’t mean having complete end-to-end or round-trip model transformations, but you need to push automation as much as possible

Lionel Briand clarified that:

  • by definition, engineering is underpinned by a deep mathematical foundation and implies the application of the scientific method to problem solving
  • maturity can be evaluated in terms of: how much mathematical underpinning is foundational, how many standards and tools exist and are used, and whether the scientific approach is applied
  • Tools, methods, engineers, and the scale of MDE are all increasing (i.e., MDE is increasingly difficult to avoid)

Paola Inverardi recalled a position by Jean Bézivin:
  • we need to split Domain Engineering (where the problem is) and Support Engineering (where the solution will be)
  • MDE is the application of modelling principles and tools to any engineering field
  • So: is SOFTWARE actually the main field of interest of model-driven engineering?
  • In the modern interpretation of life, spanning from smart cities to embedded, wearable, and cyber-physical systems, is the border between the environment and the system still relevant?
  • In the future we will need to rely less and less on the “creativity” of engineers when building models, and more and more on scientific, quantitative, and empirical methods for building them

The debate obviously centered around these aspects, starting from Bran Selic, who asked a very simple question:

Isn’t it the case that the real problem is the word “modeling”? In other fields (architecture, mechanics, physics) modelling is implicit and obvious. Why not in our community? In the end, what we want to achieve is to raise abstraction and increase automation, nothing else.

Other issues have been raised too:

  • why is there so much difference in attitude towards modelling between Europe and the US?
  • what’s the role of notations and standards in the success / failure of MDE?

What’s your take on this issue?
Feel free to share your thoughts here or on Twitter, mentioning me (@MarcoBrambi).
Respond to my poll on Twitter!
