I’ve been invited to give a keynote talk at the WISE 2022 Conference. Thinking about it, I decided to focus on my idea of a bi-verse. To me, the bi-verse is the duality between the physical and digital worlds.
On one side, the Web and social media are the environments where people post their content, opinions, activities, and resources. Therefore, a considerable amount of user-generated content is produced every day for a wide variety of purposes.
On the other side, people live their everyday life immersed in the physical world, where society, economy, politics, and personal relations continuously evolve. These two opposite and complementary environments are today fully integrated: they reflect each other and they interact with each other in a stronger and stronger way.
Exploring and studying content and data coming from both environments offers a great opportunity to understand the ever-evolving modern society, in terms of topics of interest, events, relations, and behavior.
This slidedeck summarizes my contribution:
In my speech, I discuss business cases and socio-political scenarios, to show how we can extract insights and understand reality by combining and analyzing data from the digital and physical world, so as to reach a better overall picture of reality itself. Along this path, we need to keep into account that reality is complex and varies in time, space, and many other dimensions, including societal and economic variables. The speech highlights the main challenges that need to be addressed and outlines some data science strategies that can be applied to tackle these specific challenges.
We will join and contribute to the final TRIGGER conference is scheduled for May 31st, 2022 in Brussels.
The theme is: “Rethinking the EU’s role in global governance”. In this context, the TRIGGER project is going to present the main research outcomes of the H2020 research program that started in 2018, setting the stage for the collaboration among 14 international partners.
Online reviews have long represented a valuable source for data analysis in the tourism field, but these data sources have been mostly studied in terms of the numerical ratings offered by the review platforms.
In a recent article (available as full open-access) and a respective blog post, we explored if social media and online review platforms can be a good source of quantitative evaluation of service quality of cultural venues, such as museums, theaters and so on. Our paper applies automatic analysis of online reviews, by comparing two different automated analysis approaches to evaluate which of the two is more adequate for assessing the quality dimensions. The analysis covers user-generated reviews over the top 100 Italian museums.
Specifically, we compare two approaches:
a ‘top-down’ approach that is based on a supervised classification based upon strategic choices defined by policy makers’ guidelines at the national level;
a ‘bottom-up’ approach that is based on an unsupervised topic model of the online words of reviewers.
The misalignment of the results of the ‘top-down’ strategic studies and ‘bottom-up’ data-driven approaches highlights how data science can offer an important contribution to decision making in cultural tourism. Both the analysis approaches have been applied to the same dataset of 14,250 Italian reviews.
We identified five quality dimensions that follow the ‘top-down’ perspective:Ticketing and Welcoming, Space, Comfort, Activities, and Communication. Each of these dimensions has been considered as a class in a classification problem over user reviews. The top down approach allowed us to tag each review as descriptive of one of those 5 dimensions. Classification has been implemented both as a machine learning classification problem (using BERT, accuracy 88%) and as and keyword-based tagging (accuracy 80%).
The ‘bottom-up’ approach has been implemented through an unsupervised topic modelling approach, namely LDA (Latent Dirichlet Allocation), implemented and tuned over a range up to 30 topics. The best ‘bottom-up’ model we selected identifies 13 latent dimensions in review texts. We further integrated them in 3 main topics: Museum Cultural Heritage, Personal Experience and Museum Services.
The ‘top-down’ approach (based on a set of keywords defined from the standards issued by the policy maker) resulted in 63% of online reviews that did not fit into any of the predefined quality dimension.
The ‘bottom-up’ data-driven approach overcomes this limitation by searching for the aspects of interest using reviewers’ own words. Indeed, usually museum reviews discuss more about a museum’s cultural heritage aspects (46% average probability) and personal experiences (31% average probability) than the services offered by the museum (23% average probability).
Among the various quantitative findings of the study, I think the most important point is that the aspects considered as quality dimensions by the decision maker can be highly different from those aspects perceived as quality dimensions by museum visitors.
You can find out more about this analysis by reading the full article published online as open-access, or this longer blog post . The full reference to the paper is:
Agostino, D.; Brambilla, M.; Pavanetto, S.; Riva, P. The Contribution of Online Reviews for Quality Evaluation of Cultural Tourism Offers: The Experience of Italian Museums. Sustainability2021, 13, 13340. https://doi.org/10.3390/su132313340
Coronavirus COVID-19 is an extreme challenge for our society, economy, and individual life. However, governments should have learnt from each other. The impact has been spreading slowly across countries. There has been plenty of time to take action. But apparently people and government can’t grasp the risk until it’s onto them. And the way European and American governments are acting is to slow and incremental.
I live in Italy, we rank second in the world for healthcare quality. The mindset of “this won’t happen here” was the attitude at the beginning of this challenge, and look at what happened. I’m reporting here two links to articles that mention a data-driven vision, but also the human, psychological an behavioural aspects involved. They are two simple stories that report the Italian perspective on the virus.
For centuries, science (in German “Wissenschaft”) has aimed to create (“schaften”) new knowledge (“Wissen”) from the observation of physical phenomena, their modelling, and empirical validation.
Recently, a new source of knowledge has emerged: not (only) the physical world any more, but the virtual world, namely the Web with its ever-growing stream of data materialized in the form of social network chattering, content produced on demand by crowds of people, messages exchanged among interlinked devices in the Internet of Things. The knowledge we may find there can be dispersed, informal, contradicting, unsubstantiated and ephemeral today, while already tomorrow it may be commonly accepted.
The challenge is once again to capture and create consolidated knowledge that is new, has not been formalized yet in existing knowledge bases, and is buried inside a big, moving target (the live stream of online data).
The myth is that existing tools (spanning fields like semantic web, machine learning, statistics, NLP, and so on) suffice to the objective. While this may still be far from true, some existing approaches are actually addressing the problem and provide preliminary insights into the possibilities that successful attempts may lead to.
I gave a few keynote speeches on this matter (at ICEIS, KDWEB,…), and I also use this argument as a motivating class in academic courses for letting students understand how crucial is to focus on the problems related to big data modeling and analysis. The talk, reported in the slides below, explores through real industrial use cases, the mixed realistic-utopian domain of data analysis and knowledge extraction and reports on some tools and cases where digital and physical world have brought together for better understanding our society.
We organize a crash-course on how the science of urban data can be applied to solve metropolitan issues.
The course is a 2 days face-to-face event with teaching sessions, workshops, case study discussions and hands-on activities for non-IT professionals in the field of city management. It is issued in two editions along the year:
in Milan, Italy, on November 8th-9th, 2017
in Amsterdam, The Netherlands, on November 30th-December 1st, 2017.
Ideal participants include: Civil servants, Professionals, Students, Urban planners, and managers of city utilities and services. No previous experience in data science or computer science is required. Attendees should have experience in areas such as economic affairs, urban development, management support, strategy & innovation, health & care, public order & safety.
Data is the catalyst needed to make the smart city vision a reality in a transparent and evidence-based (i.e. data-driven) manner. The skills required for data-driven urban analysis and design activities are diverse, and range from data collection (field work, crowdsensing, physical sensor processing, etc.); data processing by employing established big data technology frameworks; data exploration to find patterns and outliers in spatio-temporal data streams; and data visualization conveying the right information in the right manner.
The CrowdInsights professional school “Urban Data Science Bootcamp” provides a no-frills, hands-on introduction to the science of urban data; from data creation, to data analysis, data visualization and sense-making, the bootcamp will introduce more than 10 real-world application uses cases that exemplifies how urban data can be applied to solve metropolitan issues. Attendees will explore the challenges and opportunities that come from the adoption of novel types of urban data source, including social media, mobile phone data, IoT networks, etc.
Together with the Urbanscope team, we gave a TedX talk on the topics and results of the project here at Politecnico di Milano. The talk was actually given by our junior researchers, as we wanted it to be a choral performance as opposed to the typical one-man show.
The message is that cities are not mere physical and organizational devices only: they are informational landscapes where places are shaped more by the streams of data and less by the traditional physical evidences. We devise tools and analysis for understanding these streams and the phenomena they represent, in order to understand better our cities.
Two layers coexist: a thick and dynamic layer of digital traces – the informational membrane – grows everyday on top of the material layer of the territory, the buildings and the infrastructures. The observation, the analysis and the representation of these two layers combined provides valuable insights on how the city is used and lived.
Urbanscope is a research laboratory where collection, organization, analysis, and visualization of cross domain geo-referenced data are experimented.
The research team is based at Politecnico di Milano and encompasses researchers with competencies in Computing Engineering, Communication and Information Design, Management Engineering, and Mathematics.
The aim of Urbanscope is to systematically produce compelling views on urban systems to foster understanding and decision making. Views are like new lenses of a macroscope: they are designed to support the recognition of specific patterns thus enabling new perspectives.
If you enjoyed the show, you can explore our beta application at:
Social media are getting more and more important in the context of live events, such as fairs, exhibits, festivals, concerts, and so on, as they play an essential role in communicating them to fans, interest groups, and the general population. These kinds of events are geo-localized within a city or territory and are scheduled within a public calendar.
Together with the people in the Fashion in Process group of Politecnico di Milano, we studied the impact on social media of a specific scenario, the Milano Fashion Week (MFW), which is an important event in Milano for the whole fashion business.
We focus our attention on the spreading of social content in space, measuring the spreading of the event propagation in space. We build different clusters of fashion brands, we characterize several features of propagation in space and we correlate them to the popularity of the brand and temporal propagation.
We show that the clusters along space, time and popularity dimensions are loosely correlated, and therefore trying to understand the dynamics of the events only based on popularity aspects would not be appropriate.
Daniele Quercia leads the Social Dynamics group at Bell Labs in Cambridge
(UK). He has been named one of Fortune magazine’s 2014 Data All-Stars, and spoke about “happy maps” at TED.His research has been focusing in the area of urban informatics and received best paper awards from Ubicomp 2014 and from ICWSM 2015, and an honourable mention from ICWSM 2013. He was Research Scientist at Yahoo Labs, a Horizon senior researcher at the University of Cambridge, and Postdoctoral Associate at the department of Urban Studies and Planning at MIT. He received his PhD from UC London. His thesis was sponsored by Microsoft Research and was nominated for BCS Best British PhD dissertation in Computer Science.
His presentation will contrast the corporate smart-city rhetoric about efficiency, predictability, and security with a different perspective on the cities, which I think is very inspiring and visionary.
“You’ll get to work on time; no queue when you go shopping, and you are safe because of CCTV cameras around you”. Well, all these things make a city acceptable, but they don’t make a city great.
Daniele is launching goodcitylife.org – a global group of like-minded people who are passionate about building technologies whose focus is not necessarily to create a smart city but to give a good life to city dwellers. The future of the city is, first and foremost, about people, and those people are increasingly networked. We will see how a creative use of network-generated data can tackle hitherto unanswered research questions. Can we rethink existing mapping tools [happy-maps]? Is it possible to capture smellscapes of entire cities and celebrate good odors [smelly-maps]? And soundscapes [chatty-maps]?
When people talk about smart cities, the tendency is to think about them in a technology-oriented or sociology-oriented manner.
However, smart cities are the places where we leave and work everyday now.
Here is a very broad perspective (in Italian) about the experience of big data analysis and smart city instrumentation for the town of Como, in Italy: an experience on how phone calls, mobility data, social media, people counters can contribute to take and evaluate decisions.