Motivation & Workshop Description
Ontologies provide a conceptually concise basis for communicating knowledge for many purposes. Previous successful workshops on ontology engineering and problem-solving methods have shown that there is great interest in engineering ontologies for a very wide range of applications, and the community in that field is steadily growing. In recent years, we have also seen a surge of interest in fields other than ontology engineering that tackle the discovery and automatic creation of complex, multirelational knowledge structures. For example, the natural language community tries to acquire word semantics from natural language texts, database researchers tackle the problem of schema induction, and people building intelligent information agents research the learning of complex structures from semi-structured input (HTML, XML). Meanwhile, the machine learning community pursues the induction of more concise and more expressive knowledge structures in general (e.g., relational learning). Traditionally, there have been very few interactions between these groups, even though they all try to learn conceptual structures, which are termed "schemata", "concept hierarchies" or "heterarchies", "conceptual patterns", or "ontologies", depending on which community you talk to. We aim at furthering, or even establishing, communication between these communities through our workshop on ontology learning.
Ontology modeling and maintenance is a time-consuming task. Modeling by hand by human experts is biased, error-prone, and expensive. Deriving ontologies from data manually is difficult and cumbersome, and this appears to hold regardless of the type of data one considers. With this workshop we plan to attract researchers who try to overcome the problem by learning ontologies from natural language text, from semi-structured data (e.g., HTML or XML), or from structured data such as that found in databases.
Natural language texts exhibit morphological, syntactic, semantic, pragmatic and conceptual constraints that interact in order to convey a particular meaning to the reader. Thus, the text transports information to the reader, and the reader embeds this information into their background knowledge. In understanding the text, the reader associates data with conceptual structures and learns new conceptual structures from the interacting constraints given through language. Tools that learn ontologies from natural language exploit the interacting constraints on the various language levels (from morphology to pragmatics and background knowledge) in order to discover new concepts and stipulate relationships between concepts. We solicit submissions that investigate the combination of natural language processing techniques and machine learning methods for the learning task.
With the success of new standards for document publishing on the web, there will be a proliferation of freely and widely available semi-structured data and formal descriptions of such data. HTML data, XML data, XML Document Type Definitions (DTDs), XML Schemata (cf. http://w3c.org), and the like add semantic information, more or less expressive, to documents. A number of approaches understand ontologies as a common generalizing level that may mediate between the various data types and data descriptions. Ontologies play a major role in allowing semantic access to these vast resources of semi-structured data. Though only a few approaches exist so far, we believe that learning ontologies from these data and data descriptions may considerably leverage the application of ontologies and thus facilitate access to these data.
Ontologies have been firmly established as a means for mediating between different databases. Nevertheless, the manual creation of a mediating ontology is again a tedious and often extremely difficult task that may be facilitated through learning methods. The negotiation of a common ontology from a set of data and the evolution of ontologies through the observation of data are currently active research topics. The same applies to the learning of ontologies from metadata, such as database schemata, in order to derive a common high-level abstraction of the underlying data descriptions, which is an important precondition for data warehousing and for intelligent information agents.
The exchange of experience between these different, newly emerging subareas of ontology learning has been neglected so far. We want to stimulate interaction across these disciplines and initiate a dialogue with research on learning complex structures in general. In particular, we are also interested in the maintenance (revision, incrementality) and integration (from various sources) aspects of learning ontologies. We want to further, or even establish, the exchange of ideas between these communities, and maybe even others that we have not thought of. Hence, we solicit papers that present innovative approaches to ontology learning for discussion at the workshop, as well as system demonstrations, applications, and position statements.