Ontology of Data Mining

CIS Colloquium, Oct 16, 2015, 11:00AM – 12:00PM, SERC 306

Panče Panov, Jožef Stefan Institute, Ljubljana, Slovenia

In this talk, I will present our work on OntoDM, an ontology for representing entities from the domain of data mining. OntoDM defines the most essential data mining entities in a three-layered ontological structure comprising a specification layer, an implementation layer, and an application layer. It provides a representational framework for describing the mining of structured data, along with taxonomies of datasets, data mining tasks, generalizations, data mining algorithms, and constraints, all organized by the type of data. The ontology comprises three modular sub-ontologies that cover different aspects of data mining and the knowledge discovery process: an ontology of core data mining entities (OntoDM-core), an ontology of datatypes (OntoDT), and an ontology for representing the knowledge discovery process (OntoDM-KDD). OntoDM is designed to support a wide range of use cases, such as the semantic annotation of data mining algorithms, datasets, and results, and the annotation of QSAR studies in the context of drug discovery investigations. The ontology has been thoroughly assessed following established practices in ontology engineering, is fully interoperable with various domain resources, and is easy to extend. It is freely available at http://www.ontodm.com.

Panče Panov is a postdoctoral researcher at the Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia. He completed his PhD in data mining in 2012 at the Jožef Stefan International Postgraduate School, Ljubljana, Slovenia. His thesis concerned the design and implementation of a modular ontology for the domain of data mining. His research interests include machine learning, data mining, the knowledge discovery process, and the application of ontologies in these domains. His contributions include the development of ontologies for describing the domain of data mining and the process of knowledge discovery, which can be employed in various applications. He has been actively involved in several EU-funded projects (IQ, SUMO) and is currently involved in the MAESTRA project. In addition, he has participated in several projects financed by the Slovenian Research Agency and in one bilateral project between Slovenia and Croatia. He is a co-editor of the book "Inductive Databases and Constraint-Based Data Mining", published by Springer in 2010. In 2014, he was program co-chair of the International Conference on Discovery Science and co-editor of the conference proceedings, published by Springer. In 2015, he was a co-editor of a special issue of the journal Machine Learning on Discovery Science.