Urbanscope: Digital Whispers from the Urban Landscape. TedX Talk Video

Together with the Urbanscope team, we gave a TedX talk on the topics and results of the project here at Politecnico di Milano. The talk was actually given by our junior researchers, as we wanted it to be a choral performance as opposed to the typical one-man show.

The message is that cities are not merely physical and organizational devices: they are informational landscapes where places are shaped more by streams of data than by traditional physical evidence. We devise tools and analyses for understanding these streams and the phenomena they represent, in order to better understand our cities.

Two layers coexist: a thick and dynamic layer of digital traces – the informational membrane – grows every day on top of the material layer of the territory, the buildings and the infrastructures. The observation, analysis and representation of these two layers combined provide valuable insights into how the city is used and lived.

You can now find the video of the talk on the official TedX YouTube channel:

Urbanscope is a research laboratory that experiments with the collection, organization, analysis, and visualization of cross-domain geo-referenced data.
The research team is based at Politecnico di Milano and encompasses researchers with competencies in Computing Engineering, Communication and Information Design, Management Engineering, and Mathematics.

The aim of Urbanscope is to systematically produce compelling views on urban systems to foster understanding and decision making. Views are like new lenses of a macroscope: they are designed to support the recognition of specific patterns thus enabling new perspectives.

If you enjoyed the show, you can explore our beta application at:

http://www.urbanscope.polimi.it

and discover the other data science activities we are conducting at the Data Science Lab of Politecnico, DEIB.

 

Modeling, Modeling, Modeling: From Web to Enterprise to Crowd to Social

This is our perspective on the world: it’s all about modeling. 

So, why is it that model-driven engineering is not taking over the whole technological and social eco-system?

Let me make the case that it is.

On the occasion of the 25th edition of the Italian Symposium of Database Systems (SEBD 2017), we (Stefano Ceri and I) were asked to write a retrospective on the last years of database and systems research from our perspective, published in a dedicated Springer volume, A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. After some brainstorming, we agreed that it all boils down to this: modeling, modeling, modeling.

A long time ago, in the past century, the international DB research community used to meet to assess new research directions, opening the meetings with 2-minute gong shows in which everyone stated their opinion and influenced the follow-up discussion. Bruce Lindsay from IBM had just been quoted for his message:

There are 3 important things in data management: performance, performance, performance.

Stefano Ceri had a chance to speak out immediately after and to give a syntactically similar but semantically orthogonal message:

There are 3 important things in data management: modeling, modeling, modeling.

Data management is continuously evolving for serving the needs of an increasingly connected society. New challenges apply not only to systems and technology, but also to the models and abstractions for capturing new application requirements.

In our retrospective paper, we describe several models and abstractions which have been progressively designed to capture new forms of data-centered interactions in the last twenty-five years – a period of huge changes due to the spread of web-based applications and the increasingly relevant role of social interactions.

We initially focus on Web-based applications for individuals, then discuss applications among enterprises; this is all about WebML and IFML. We then discuss how these applications may include rankings computed using services or crowds, which relates to our work on crowdsourcing (liquid query and the crowdsearcher tool). We conclude with hints at recent research on how social sources can be used to capture emerging knowledge (the social knowledge extractor perspective and tooling).

All in all, modeling as a cognitive tool is all around us, and is growing in terms of potential impact thanks to formal cognification.

It’s also true that model-driven engineering is not necessarily the tool of choice for this to happen. Why? As technicians, we always tend to blame the customer for not understanding our product. But maybe we should look at ourselves and at the kind of tools (conceptual and technical) the MDE community is offering. I’m pretty sure we could find plenty of room for improvement.

Any idea on how to do this?

 

A Vision towards the Cognification of Model-driven Software Engineering

Jordi Cabot, Robert Clarisó, Marco Brambilla and Sébastien Gerard submitted a visionary paper on Cognifying Model-driven Software Development to the workshop GrandMDE (Grand Challenges in Modeling), co-located with STAF 2017 in Marburg (Germany) on July 17, 2017. The paper advocates the cross-fertilization of disciplines such as machine learning and artificial intelligence, behavioural analytics, social studies, cognitive science, crowdsourcing and many more, in order to help model-driven software development.
But actually, what is cognification?

Cognification is the application of knowledge to boost the performance and impact of any process.

It is recognized as one of the 12 technological forces that will shape our future. We are flooded with data, ideas, people, activities, businesses, and goals. But this flooding could even be helpful.

The thesis of our paper is that cognification will also revolutionize the way software is built. In particular, we discuss the opportunities and challenges of cognifying Model-Driven Software Engineering (MDSE or MDE) tasks.

MDE has seen limited adoption in the software development industry, probably because the perception from developers’ and managers’ perspective is that its benefits do not outweigh its costs.

We believe cognification could drastically improve the benefits and reduce the costs of adopting MDSE, and thus boost its adoption.

At the practical level, cognification comprises tools that range from artificial intelligence (machine learning, deep learning) to human cognitive capabilities, exploited through online activities, crowdsourcing, gamification and so on.

Opportunities (and challenges) for MDE

Here is a set of MDSE tasks and tools whose benefits can be especially boosted thanks to cognification.

  • A modeling bot playing the role of virtual assistant in the modeling tasks
  • A model inferencer able to deduce a common schema behind a set of unstructured data coming from the software process
  • A code generator able to learn the style and best practices of a company
  • A real-time model reviewer able to give continuous quality feedback
  • A morphing modeling tool, able to adapt its interface at run-time
  • A semantic reasoning platform able to map modeled concepts to existing ontologies
  • A data fusion engine that is able to perform semantic integration and impact analysis of design-time models with runtime data
  • A tool for collaboration between domain experts and modeling designers

A disclaimer

Obviously, we are aware that some research initiatives aiming at cognifying specific tasks in Software Engineering exist (including some activities of ours). But what we claim here is a change in magnitude of their coverage, integration, and impact in the short-term future.

If you want to get a more detailed description, you can go through the detailed post by Jordi Cabot that reports the whole content of the paper.

Model-driven Development of User Interfaces for IoT via Domain-specific Components & Patterns

This is the summary of a joint contribution with Eric Umuhoza to ICEIS 2017 on Model-driven Development of User Interfaces for IoT via Domain-specific Components & Patterns.
Internet of Things technologies and applications are evolving and continuously gaining traction in all fields and environments, including homes, cities, services, industry and commercial enterprises. However, many problems still need to be addressed.
For instance, the IoT vision is mainly focused on the technological and infrastructure aspect, and on the management and analysis of the huge amount of generated data, while so far the development of front-end and user interfaces for IoT has not played a relevant role in research.
On the contrary, we believe that user interfaces in the IoT ecosystem can play a key role in the acceptance of solutions by final adopters.
In this paper we present a model-driven approach to the design of IoT interfaces, by defining a specific visual design language and design patterns for IoT applications, and we show them at work. The language we propose is defined as an extension of the OMG standard language called IFML.

The slides of this talk are available online on Slideshare as usual:

Spark-based Big Data Analysis of Semantic IFML Models and Web Logs for Enhanced User Behavior Analytics

I’d like to report on our demonstration paper at WWW 2017, focusing on Spark-based Big Data Analysis of Semantic IFML Models and Web Logs for Enhanced User Behavior Analytics.

The motivation of the work is that no approaches exist for merging web log analysis and statistics with information about the Web application structure, content and semantics. Indeed, basic Web analytics tools are widespread and provide statistics about Web site navigation at the syntactic level only: they analyze the user interaction at page level in terms of page views, entry and landing pages, page views per visit, and so on. Unfortunately, those tools provide precise statistics neither about the content and semantics of the visited pages, nor about the actual reactions of users to the actual content (instances) they are shown.

With our work we demonstrate the advantages of combining Web application models with runtime navigation logs, with the aim of deepening the understanding of users’ behaviour.

We propose a model-driven approach that combines: user interaction modeling (based on the IFML standard); full code generation of the designed application; user tracking at runtime, through logging of component execution and user activities; integration with page content details; generation of integrated schema-less data streams; and application of large-scale analytics and visualization tools for big data. The results are rendered with data visualization techniques that display the statistics directly on the IFML visual models of the Web application.
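To make the core idea concrete, here is a minimal sketch of joining runtime log records with model-level information. It is plain Python rather than Spark, and everything in it is an assumption made for illustration: the `MODEL` mapping, the `LOG` record layout, and the component and entity names are hypothetical and do not come from the paper or from the IFML standard.

```python
from collections import Counter

# Hypothetical mapping from modeled view components to the page that
# contains them and the kind of content (entity) they display.
MODEL = {
    "ProductList": {"page": "Catalog", "entity": "Product"},
    "ProductDetails": {"page": "ProductPage", "entity": "Product"},
    "ReviewList": {"page": "ProductPage", "entity": "Review"},
}

# Hypothetical runtime log: one record per component execution, with the
# id of the content instance shown to the user (None for listings).
LOG = [
    {"user": "u1", "component": "ProductList", "instance": None},
    {"user": "u1", "component": "ProductDetails", "instance": "p42"},
    {"user": "u1", "component": "ReviewList", "instance": "r7"},
    {"user": "u2", "component": "ProductDetails", "instance": "p42"},
    {"user": "u2", "component": "UnknownWidget", "instance": None},
]


def enrich(log, model):
    """Attach model-level information (page, entity) to each log record."""
    for record in log:
        info = model.get(record["component"])
        if info is None:
            continue  # component not described in the model: skip it
        yield {**record, **info}


def views_per_entity(records):
    """Count component executions per kind of content."""
    return Counter(r["entity"] for r in records)


def views_per_instance(records):
    """Count how often each specific content instance was shown."""
    return Counter(
        (r["entity"], r["instance"]) for r in records if r["instance"] is not None
    )


if __name__ == "__main__":
    enriched = list(enrich(LOG, MODEL))
    print(views_per_entity(enriched))
    print(views_per_instance(enriched))
```

The point of the join is that statistics can then be expressed in terms of modeled concepts (which entity, which instance) rather than raw page URLs, which is what page-level analytics tools cannot offer.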

The paper describing the approach is available in the WWW 2017 proceedings.

The video of the demo is available on YouTube:

Social Media Behaviour during Live Events: the Milano Fashion Week #MFW case

Social media are becoming more and more important in the context of live events, such as fairs, exhibits, festivals, concerts, and so on, as they play an essential role in communicating them to fans, interest groups, and the general population. These kinds of events are geo-localized within a city or territory and are scheduled within a public calendar.

Together with the people in the Fashion in Process group of Politecnico di Milano, we studied the impact on social media of a specific scenario, the Milano Fashion Week (MFW), which is an important event in Milano for the whole fashion business.

We presented this work at the Location and the Web workshop co-located with the WWW 2017 Conference in Perth, Australia.

We focus on how social content spreads in space, measuring the spatial propagation of the event. We build different clusters of fashion brands, characterize several features of their propagation in space, and correlate them with brand popularity and temporal propagation.

We show that the clusters along the space, time and popularity dimensions are only loosely correlated; trying to understand the dynamics of the events based on popularity alone would therefore not be appropriate.

The paper is available as an open-access PDF on the WWW 2017 Conference web site. You can download it here.

A subsequent paper on the temporal analysis of the same event, “Temporal Analysis of Social Media Response to Live Events: The Milano Fashion Week”, focusing on Granger causality and other measures, was published at ICWE 2017 and is available in the Springer proceedings.

The PowerPoint presentation is available on SlideShare.

When a Smart City gets Personal

When people talk about smart cities, the tendency is to think about them in a technology-oriented or sociology-oriented manner.

However, smart cities are the places where we live and work every day now.

Here is a very broad perspective (in Italian) on the experience of big data analysis and smart city instrumentation for the town of Como, in Italy: an account of how phone calls, mobility data, social media, and people counters can contribute to making and evaluating decisions.

You can read it on my Medium channel.

The role of Big Data in Banks

I was listening to R. Martin Chavez, Goldman Sachs deputy CFO, just last month at Harvard, at the ComputeFest 2017 event; more precisely, at the Symposium on the Future of Computation in Science and Engineering on “Data, Dollars, and Algorithms: The Computational Economy”, held on Thursday, January 19, 2017.

His claim was that

Banks are essentially API providers.

The entire structure and infrastructure of Goldman Sachs is being reorganized around that idea. His case is that you should not compare a bank with a shop or store; you should compare it with Google. Just imagine that every time you want to search on Google you need to get in touch with (i.e., make a phone call or submit a request to) some Google employee, who at some point comes back to you with the result. Nonsense, right? Well, this is what actually happens with banks. It was happening with consumer-oriented banks before online banking, and it’s still largely happening for business banks.

But this is going to change. The amount of data and the speed and volume of financial transactions don’t allow that any more.

Banks are actually among the richest organizations, not just in terms of money but in terms of data ownership. Yet they are also craving further, “less official” big data sources.

Juri Marcucci: Importance of Big Data for Central (National) Banks.

Today at the ISTAT National Big Data Committee meeting in Rome, Juri Marcucci from Bank of Italy discussed their research on integrating Google Trends information into their financial predictive analytics.

Google Trends provides insights into user interests in general, expressed as the probability that a random user is going to search for a particular keyword (normalized and scaled, with geographical detail down to city level).
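As a rough illustration of what “normalized and scaled” means, here is a minimal sketch: each period’s keyword searches are divided by the total searches in that period, and the resulting shares are rescaled so that the peak equals 100. The raw counts below are invented for the example (Google does not publish them), and the function name `trends_index` is hypothetical.

```python
def trends_index(keyword_counts, total_counts):
    """Scale per-period search shares to 0-100, relative to the peak share.

    keyword_counts: searches for the keyword in each period
    total_counts:   all searches in the same period (assumed positive)
    """
    shares = [k / t for k, t in zip(keyword_counts, total_counts)]
    peak = max(shares)
    if peak == 0:
        return [0 for _ in shares]
    return [round(100 * s / peak) for s in shares]


if __name__ == "__main__":
    # Invented counts for three periods.
    print(trends_index([50, 200, 100], [1000, 2000, 4000]))
```

Note that the third period has more keyword searches than the first but a lower index, because its share of all searches is smaller: the index measures relative interest, not volume.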

Bank of Italy is using Google Trends data to complement their prediction of unemployment rates in the short and mid term. It’s definitely a big challenge, but preliminary results are promising in terms of confidence in the obtained models. More details are available in this paper.

Paolo Giudici from University of Pavia showed how one can correlate the risk of bank defaults with their exposure on Twitter:

Paolo Giudici: bank risk contagion based (also) on Twitter data.

Obviously, all this must take into account the bias of the sources and the quality of the data collected, as Paolo Giudici also pointed out. Assessing the “trustability” of online sources is crucial. In their research, they defined the T-index on Twitter accounts much as academics define the h-index for the relevance of publications, as reported in the photographed slide below.

Paolo Giudici: T-index describing the quality of Twitter authors in finance.
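For readers unfamiliar with the h-index, here is a minimal sketch of how such an index is computed: it is the largest h such that at least h items each have a count of at least h. The exact T-index definition is on the slide and is not reproduced in this post, so applying the same computation to per-tweet retweet counts, as in the example, is only an assumption made for illustration.

```python
def h_style_index(counts):
    """Largest h such that at least h items have a count of at least h."""
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


if __name__ == "__main__":
    # Hypothetical retweet counts for the tweets of one account.
    retweets_per_tweet = [10, 8, 5, 4, 3]
    print(h_style_index(retweets_per_tweet))
```

An index of this kind rewards accounts that are consistently picked up by others, rather than accounts with a single viral tweet.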

It’s very interesting to see how creative the use of (non-traditional, web-based) big data is becoming in very diverse fields, including very traditional ones like macroeconomics and finance.

And once again, I think the biggest challenges and opportunities come from the fusion of multiple data sources together: mobile phones, financial tracks, web searches, online news, social networks, and official statistics.

This is also the path that ISTAT (the official institute for Italian statistics) is pursuing. For instance, the calculation of official national inflation rates now also integrates web scraping of more than 40,000 e-commerce product prices.

 

 

The Dawn of a new Digital Renaissance in Cultural Heritage

Fluxedo joined forces with the Observatory of Digital Innovation in Arts & Culture Heritage (Osservatorio per l’innovazione digitale nei beni e attività culturali) of the School of Management (MIP) of Politecnico di Milano, to cover the social media analytics of Italian and international museums.

The results of the work were presented during a very successful event on January 19th, 2017, hosted by Piccolo Teatro di Milano.

The live dashboard of the SocialOmeters analysis on the museums is available here:

www.socialometers.com/osservatoriomusei/

A summary of the event, built from its social media content via Storify, is available here.

The official hashtag of the event, #OBAC17, became a Twitter trending topic in Italy, with 579 tweets, 187 users, around 600 likes and retweets, and a potential audience of 2.2 million users.

The event had huge visibility in the national media, as reported in this press review:

1.      La rivoluzione dei musei online. Il primato di Triennale e Pinacoteca – read Il Corriere della Sera Milano

2.      L’innovazione prolifera (ma fatica) – read Il Sole 24 Ore Nòva

3.      I musei italiani e la digitalizzazione: il punto del Politecnico di Milano – read Advertiser

4.      Osservatorio Politecnico, musei social ma con pochi servizi digitali – read Arte Magazine

5.      Arte & Innovazione. Musei italiani sempre più social (52%) e virtuali (20%) – read Corriere del Web

6.      Boom di visitatori nei musei ma è flop dei servizi digitali – read Il Sole 24 Ore Blog

7.      Capitolini, comunali e Maxxi di Roma tra i musei più popolari sui social network – read La Repubblica Roma

8.      La pagina Facebook della Reggia di Venaria è la più apprezzata d’Italia con oltre 166 mila “like” – read La Stampa Torino

9.      Il 52% dei musei italiani è social ma i servizi digitali per la fruizione delle opere sono limitati. Un’analisi dell’Osservatorio Innovazione Digitale nei Beni e Attività Culturali – read Lombard Street

10.  Musei sempre più social, ecco i più cliccati  – read TTG Italia

11.  Tanta cultura, poco digitale: solo il 52% dei musei italiani è sui social e il 43% non ha ancora un sito – read Vodafone News

12.  Tra Twitter e Instagram, 52% musei italiani punta sui social media – read ADNKronos

13.  Musei strizzano occhi a social, ma strada è lunga – read ANSA ViaggiArt

14.  Oltre la metà dei musei italiani è online e sui social, ma i servizi digitali evoluti e quelli on site sono ancora scarsi – read Brand News

15.  Il 52% dei musei italiani è social ma i servizi digitali per la fruizione delle opere sono limitati – read DailyNet

16.  Musei italiani sempre più social, ma i servizi digitali sono limitati – read Diario Innovazione

17.  Musei e social network – read Inside Art

18.  Musei Vaticani e Maxxi tra i più social d’Italia – read Il Messaggero

19.  I musei si fanno spazio sui social – read Italia

20.  Musei italiani social, ma non troppo – read La Repubblica

21.  Capitolini e Maxxi da record sui social – read La Repubblica Roma

22.  Musei lucani poco social e poco visitabili in web – read La Siritide

23.  Venaria regina dei social – read La Stampa Torino

24.  In calo nel 2016 il numero degli ingressi nei luoghi di cultura in Basilicata – read Oltre

25.  Musei sempre più social, ma poco interattivi – read QN ILGIORNO – il Resto del Carlino – LA NAZIONE

26.  Dal marketing alle guide per disabili. Cultura, boom dell’industria digitale – read QN ILGIORNO – il Resto del Carlino – LA NAZIONE

27.  Musei romani sempre più social – read Radio Colonna

28.  Social network: il Maxxi tra i musei più popolari – read Roma2Oggi

29.  Musei italiani sempre più social, ma la strada è ancora lunga – read Travel No Stop

30.  Musei italiani sempre più social e virtuali – read Uomini & Donne della Comunicazione

31.  Tra Twitter e Instagram, un museo su due in Italia scommette sui …  – read Italia per Me

32.  Beni culturali, musei lucani poco social – read La Nuova del Sud

33.  Fb, Instagram e Twitter: i musei italiani puntano sui social ma non basta – read  La Repubblica

34.  Tra Twitter e Instagram, un museo su due in Italia scommette sui social media – read La Stampa

35.  Il 52% dei musei italiani è social ma i servizi digitali per la fruizione delle opere sono limitati – read Sesto Potere

36.  Il 52% dei musei italiani è social, ma la fruizione delle opere digital è limitata – read Il Sole 24 Ore

The Harvard-Politecnico Joint Program on Data Science in full bloom

After months of preparation, here we are.

This week we kicked off the second edition of the DataShack program on Data Science that brings together interdisciplinary teams of data science, software engineering & computer science, and design students from Harvard (Institute of Applied Computational Science) and Politecnico di Milano (faculties of Engineering and Design).

The students will address big data extraction, analysis, and visualization problems provided by two real-world stakeholders in Italy: the Como city municipality and Moleskine.

The Moleskine Data-Shack project will explore the popularity and success of different Moleskine products co-branded with other famous brands (also known as special editions) and launched in specific periods in time. The main field of analysis is the impact that different products have on social media channels. Social media analysis will then be correlated with product distribution and sales performance data, along multiple dimensions (temporal, geographical, etc.) and product features.

The Como project consists of collecting and analyzing data about the city and the way people live and move within it, by integrating multiple and diverse data sources. The problems to be addressed may include providing estimates of human density and movements within the city, predicting the impact of hypothetical future events, determining the best allocation of sensors in the streets, and defining optimal user experience and interaction for exploring the city data.

The kickoff meeting of the DataShack 2017 projects, at Harvard. Faculty members Pavlos Protopapas, Stefano Ceri, Paola Bertola, Paolo Ciuccarelli and myself (Marco Brambilla) are involved in the program.

The teams have been formed, and the problems assigned. I really look forward to advising the groups in the coming months and seeing the results that will come out. The students have already shown commitment and engagement. I’m confident that they will be excellent and innovative this year!

For further activities on data science within our group you can refer to the DataScience Lab site, Socialometers, and Urbanscope.