Yesterday we presented new work at The Web Conference in Lyon, continuing the research line on knowledge extraction from human-generated content that started with our paper “Extracting Emerging Knowledge from Social Media”, presented at the WWW 2017 Conference (see also this past post).
Our motivation stems from the fact that knowledge in the world evolves continuously, so ontologies and knowledge bases are largely incomplete, especially for data belonging to the so-called long tail. We therefore proposed a method for discovering emerging knowledge by extracting it from social content. Once initialized by domain experts, the method finds relevant entities through a mixed syntactic-semantic approach. It uses seeds, i.e., prototypes of emerging entities provided by the experts, to generate candidates; it then associates each candidate with a feature vector built from the terms occurring in its social content, ranks the candidates by their distance from the centroid of the seeds, and returns the top candidates.
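The ranking step can be sketched in a few lines. This is a minimal illustration, not the paper's actual pipeline: the bag-of-words features, the cosine distance, and the example texts are illustrative stand-ins for the real feature construction over social content.

```python
# Minimal sketch of seed-centroid candidate ranking.
# All names and example data below are illustrative, not from the paper.
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words feature vector built from a candidate's social content."""
    return Counter(text.lower().split())

def centroid(vectors):
    """Average the seed vectors term by term."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: c / len(vectors) for t, c in total.items()}

def cosine_distance(a, b):
    dot = sum(a.get(t, 0) * b[t] for t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)

def rank_candidates(seed_texts, candidates, top_k=3):
    """Rank candidates by distance from the centroid of the seed vectors."""
    c = centroid([term_vector(t) for t in seed_texts])
    scored = sorted(candidates.items(),
                    key=lambda kv: cosine_distance(term_vector(kv[1]), c))
    return [name for name, _ in scored[:top_k]]
```

A candidate whose social content shares many terms with the seeds lands near the centroid and is ranked first; an unrelated candidate is pushed to the bottom.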
Building on this foundational idea, we explored running the method iteratively, using its results as new seeds. In this paper we address the following research questions:
- How does the reconstructed domain knowledge evolve if the candidates of one extraction are recursively used as seeds?
- How does the reconstructed domain knowledge spread geographically?
- Can the method be used to inspect the past, present, and future of knowledge?
- Can the method be used to find emerging knowledge?
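The iterative idea behind these questions can be sketched as a simple loop: the top-ranked candidates of one extraction round become the seeds of the next. The `extract_candidates` and `rank` functions below are hypothetical stand-ins for the candidate-generation and centroid-ranking steps, not the paper's implementation.

```python
# Sketch of iterative extraction: each round's top candidates seed the next.
# `extract_candidates` and `rank` are hypothetical stand-ins.
def iterate_extraction(seeds, extract_candidates, rank, rounds=3, top_k=5):
    knowledge = set(seeds)        # everything discovered so far
    current = list(seeds)         # seeds for the current round
    for _ in range(rounds):
        candidates = extract_candidates(current)
        top = rank(current, candidates, top_k)
        new = [c for c in top if c not in knowledge]
        if not new:
            break                 # fixed point: nothing new emerged
        knowledge.update(new)
        current = new             # results of this round seed the next
    return knowledge
```

Whether this loop broadens the reconstructed domain knowledge, drifts away from it, or converges to a fixed point is precisely what the research questions above investigate.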
This is the presentation given at the conference:
You can also find here a PDF preprint of “Iterative Knowledge Extraction from Social Networks” by Brambilla et al.