A bottom-up, knowledge-aware approach to integrating and querying web data services – ACM Trans. on the Web

The October 2013 issue of ACM Transactions on the Web includes an article of ours on the bottom-up design of domain models for connected web data sources. The problem is growing in importance as a wealth of data services becomes available on the Web, and building and querying Web applications that effectively integrate Web content matters more and more. However, schema integration and ontology matching for registering data services often require a knowledge-intensive, tedious, and error-prone manual process. In the paper we tackle this issue as described below.

The paper was written by Stefano Ceri, Silvia Quarteroni, and myself as part of the Search Computing research project.

The full paper is available for download on the ACM Digital Library (free of charge, courtesy of the ACM Author-izer service) through this URL:

http://dl.acm.org/citation.cfm?id=2493536

This is the summary of the contribution:

We present a bottom-up, semi-automatic service registration process that refers to an external knowledge base and uses simple text processing techniques in order to minimize and possibly avoid the contribution of domain experts in the annotation of data services. The first by-product of this process is a representation of the domain of data services as an entity-relationship diagram, whose entities are named after concepts of the external knowledge base matching service terminology rather than being manually created to accommodate an application-specific ontology. Second, a three-layer annotation of service semantics (service interfaces, access patterns, service marts) describing how services “play” with such domain elements is also automatically constructed at registration time. When evaluated against heterogeneous existing data services and with a synthetic service dataset constructed using Google Fusion Tables, the approach yields good results in terms of data representation accuracy.
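To make the registration idea above more concrete, here is a minimal Python sketch. It is not the implementation from the paper: the tiny knowledge base, the helper names (`tokenize`, `match_concept`, `register`), and the naive matching rule are all assumptions made for illustration. The three classes mirror the three annotation layers named in the summary, read here as conceptual (service mart), logical (access pattern), and physical (service interface).

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical mini knowledge base: concept name -> known synonyms.
KNOWLEDGE_BASE = {
    "Movie": {"movie", "film"},
    "Theater": {"theater", "theatre", "cinema"},
    "City": {"city", "town"},
}


def tokenize(name: str) -> List[str]:
    """Split a name such as 'getFilmsByCity' into lowercase tokens."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[^A-Za-z]+", spaced) if t]


def match_concept(term: str) -> Optional[str]:
    """Return the concept whose synonyms contain the term, if any.

    Stripping trailing 's' characters is a deliberately naive plural rule.
    """
    candidates = {term, term.rstrip("s")}
    for concept, synonyms in KNOWLEDGE_BASE.items():
        if candidates & synonyms:
            return concept
    return None


@dataclass
class ServiceMart:  # entity of the domain diagram, named after a concept
    concept: str


@dataclass
class AccessPattern:  # which concepts the service takes in and returns
    mart: ServiceMart
    inputs: List[str]
    outputs: List[str]


@dataclass
class ServiceInterface:  # the concrete service being registered
    name: str
    pattern: AccessPattern


def annotate(params: List[str]) -> List[str]:
    """Map parameter names to knowledge-base concepts, dropping unknown terms."""
    concepts = []
    for param in params:
        for token in tokenize(param):
            concept = match_concept(token)
            if concept is not None:
                concepts.append(concept)
    return concepts


def register(name: str, inputs: List[str], outputs: List[str]) -> ServiceInterface:
    """Build the three-layer annotation for one service (illustrative only)."""
    name_concepts = annotate([name])
    if not name_concepts:
        raise ValueError("no knowledge-base concept matches the service name")
    mart = ServiceMart(name_concepts[0])
    return ServiceInterface(name, AccessPattern(mart, annotate(inputs), annotate(outputs)))


service = register("getFilmsByCity", ["town"], ["film", "cinema"])
print(service.pattern.mart.concept, service.pattern.inputs, service.pattern.outputs)
```

The point of the sketch is that entity names come from the knowledge base ("Movie", "City") rather than from the service's own vocabulary ("film", "town"), which is what lets differently worded services share one domain diagram.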

We subsequently demonstrate that natural language processing methods can be used to decompose and match simple queries to the data services represented in three layers according to the preceding methodology with satisfactory results. We show how semantic annotations are used at query time to convert the user’s request into an executable logical query. Globally, our findings show that the proposed registration method is effective in creating a uniform semantic representation of data services, suitable for building Web applications and answering search queries.
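As a rough illustration of the query-time step, the Python sketch below splits a simple query into clauses and matches each clause to a registered service mart. Plain keyword overlap stands in here for the natural language processing methods used in the paper, and the registry contents and function names (`decompose`, `match_clause`, `to_logical_query`) are hypothetical.

```python
import re

# Hypothetical registry: service mart -> terms that describe it.
SERVICE_MARTS = {
    "Movie": {"movie", "movies", "film", "films"},
    "Restaurant": {"restaurant", "restaurants", "pizzeria"},
}


def decompose(query):
    """Split a query into clauses at the word 'and' or at commas."""
    parts = re.split(r"\band\b|,", query.lower())
    return [p.strip() for p in parts if p.strip()]


def match_clause(clause):
    """Return the service mart sharing the most terms with the clause, or None."""
    tokens = set(re.findall(r"[a-z]+", clause))
    best, best_overlap = None, 0
    for mart, terms in SERVICE_MARTS.items():
        overlap = len(tokens & terms)
        if overlap > best_overlap:
            best, best_overlap = mart, overlap
    return best


def to_logical_query(query):
    """Pair each clause with its matched service mart."""
    return [(match_clause(c), c) for c in decompose(query)]


print(to_logical_query("Action movies in Milan and a pizzeria nearby"))
```

A real system would go on to bind the remaining words of each clause (such as "Milan") to the inputs of an access pattern; the sketch stops at choosing the service mart.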

The BibTeX reference is as follows:

@article{QBC2013,
  author = {Quarteroni, Silvia and Brambilla, Marco and Ceri, Stefano},
  title = {A bottom-up, knowledge-aware approach to integrating and querying web data services},
  journal = {ACM Trans. Web},
  issue_date = {October 2013},
  volume = {7},
  number = {4},
  month = nov,
  year = {2013},
  issn = {1559-1131},
  pages = {19:1--19:33},
  articleno = {19},
  numpages = {33},
  url = {http://doi.acm.org/10.1145/2493536},
  doi = {10.1145/2493536},
  acmid = {2493536},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {Web data integration, Web data services, Web services, natural language Web query, service querying, structured Web search},
}

To keep updated on my activities you can subscribe to the RSS feed of my blog or follow my twitter account (@MarcoBrambi).

Answering search queries with CrowdSearcher – WWW2012

Our paper "Answering search queries with CrowdSearcher" has been accepted and presented at WWW 2012 in Lyon.

Here is the abstract:
Web users are increasingly relying on social interaction to complete and validate the results of their search activities. While search systems are superior machines to get world-wide information, the opinions collected within friends and expert/local communities can ultimately determine our decisions: human curiosity and creativity is often capable of going much beyond the capabilities of search systems in scouting “interesting” results, or suggesting new, unexpected search directions. Such personalized interaction occurs in most times aside of the search systems and processes, possibly instrumented and mediated by a social network; when such interaction is completed and users resort to the use of search systems, they do it through new queries, loosely related to the previous search or to the social interaction. In this paper we propose CrowdSearcher, a novel search paradigm that embodies crowds as first-class sources for the information seeking process. CrowdSearcher aims at filling the gap between generalized search systems, which operate upon world-wide information – including facts and recommendations as crawled and indexed by computerized systems – with social systems, capable of interacting with real people, in real time, to capture their opinions, suggestions, emotions. The technical contribution of this paper is the discussion of a model and architecture for integrating computerized search with human interaction, by showing how search systems can drive and encapsulate social systems. In particular we show how social platforms, such as Facebook, LinkedIn and Twitter, can be used for crowdsourcing search-related tasks; we demonstrate our approach with several prototypes and we report on our experiment upon real user communities.

The full paper is available here:
http://dl.acm.org/citation.cfm?id=2187836.2187971&coll=DL&dl=ACM

The presentation I gave is this one:

The demo video can be found on YouTube:



Search Computing demonstration at WWW 2011, Hyderabad, India

Together with Alessandro Bozzon, I presented a demonstration of the Search Computing exploratory search paradigm at WWW 2011.

The demonstrated scenario covers real estate and job search. Suppose that a user wants to find a new job that matches a specific expertise in a certain city. Based on his findings, he also wants to search for housing opportunities in nearby neighbourhoods. He then wants to check additional information on the quality of life in the area and on the availability of services (public transportation, schools for his children, and so on). The final decision will be based on a complex function of all these aspects. The figure below shows the graph of the searchable concepts that actually exist and are registered within this scenario. All these concepts are searched through third-party services.

 
Here is a short video with a summary of the demonstration:

Here you can see Alessandro at work, demonstrating the approach to a visitor (big prize if you guess who he is :)):

By the way, if you are looking for more exciting pictures, have a look at this Flickr set of photos I took in Hyderabad, India, while at WWW 2011.



Our paper at WWW 2010

The paper "Liquid query: multi-domain exploratory search on the web" by Bozzon, A., Brambilla, M., Ceri, S., and Fraternali, P. has been published in the Proceedings of the 19th International Conference on World Wide Web (WWW 2010, Raleigh, NC, USA), ACM, New York, NY, USA, pp. 161-170.

Further details available here:

http://dbgroup.como.polimi.it/brambilla/node/131