Research at Kiminkii

Kiminkii has a strong research background. Our main topic of interest is Multi-Relational Data Mining (MRDM), a branch of Data Mining that deals with analysing structured objects. An animated demo gives a good example of how MRDM works. For detailed information about our current and past research, check out the list of publications.

Kiminkii's primary product, Safarii, acts as a test-bed for new algorithms and techniques, and successful developments are typically included in new releases of the software. See the product pages for a description of the features of the latest release of Safarii, as well as a walkthrough of the product.

Kiminkii's research activities are described in the doctoral thesis Multi-Relational Data Mining, by Arno Knobbe (ISBN 90-292-3834-5, 130 pages). The book gives an overview of the aspects involved in MRDM, and describes in detail a number of important Data Mining techniques. Furthermore, pointers are given for the practical implementation of MRDM in an enterprise environment.

Multi-Relational Data Mining can be ordered by sending an email to info@kiminkii.com, at a price of € 25,- plus shipping.

From the introduction:

"This thesis is concerned with Data Mining: extracting useful insights from large and detailed collections of data. With the increased possibilities in modern society for companies and institutions to gather data cheaply and efficiently, this subject has become of increasing importance. This interest has inspired a rapidly maturing research field with developments both on a theoretical, as well as on a practical level with the availability of a range of commercial tools. Unfortunately, the widespread application of this technology has been limited by an important assumption in mainstream Data Mining approaches. This assumption – all data resides, or can be made to reside, in a single table – prevents the use of these Data Mining tools in certain important domains, or requires considerable massaging and altering of the data as a pre-processing step. This limitation has spawned a relatively recent interest in richer Data Mining paradigms that do allow structured data as opposed to the traditional flat representation.
Over the last decade, we have seen the emergence of Data Mining techniques that cater to the analysis of structured data. These techniques are typically upgrades from well-known and accepted Data Mining techniques for tabular data, and focus on dealing with the richer representational setting. Within these techniques, which we will collectively refer to as Structured Data Mining techniques, we can identify a number of paradigms or ‘traditions’, each of which is inspired by an existing and well-known choice for representing and manipulating structured data. For example, Graph Mining deals with data stored as graphs, whereas Inductive Logic Programming builds on techniques from the logic programming field. This thesis specifically focuses on a tradition that revolves around relational database theory: Multi-Relational Data Mining (MRDM).
Building on relational database theory is an obvious choice, as most data-intensive applications of industrial scale employ a relational database for storage and retrieval. But apart from this pragmatic motivation, there are more substantial reasons for having a relational database view on Structured Data Mining. Relational database theory has a long and rich history of ideas and developments concerning the efficient storage and processing of structured data, which should be exploited in successful Multi-Relational Data Mining technology. Concepts such as data modelling and database normalisation may help to properly approach an MRDM project, and guide the effective and efficient search for interesting knowledge in the data. Recent developments in dealing with extremely large databases and managing query-intensive analytical processing will aid the application of MRDM in larger and more complex domains..."
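
The single-table assumption mentioned in the introduction can be illustrated with a small sketch. It is not taken from the thesis or from Safarii: the customer and purchase tables, their columns, and the helper function names are all hypothetical, chosen only to show what happens when a one-to-many relation is forced into one flat table.

```python
import sqlite3


def build_example_db():
    """Create a hypothetical two-table database: customers and their purchases."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE purchase (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(id),
            amount REAL
        );
        INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO purchase VALUES (1, 1, 10.0), (2, 1, 30.0), (3, 2, 5.0);
        """
    )
    return conn


def joined_rows(conn):
    """Flatten by joining: one row per purchase, so customers are repeated."""
    return conn.execute(
        "SELECT c.name, p.amount FROM customer c "
        "JOIN purchase p ON p.customer_id = c.id ORDER BY c.id, p.id"
    ).fetchall()


def aggregated_rows(conn):
    """Flatten by aggregating: one row per customer, individual purchases are lost."""
    return conn.execute(
        "SELECT c.name, COUNT(p.id), SUM(p.amount) FROM customer c "
        "LEFT JOIN purchase p ON p.customer_id = c.id "
        "GROUP BY c.id, c.name ORDER BY c.id"
    ).fetchall()


if __name__ == "__main__":
    conn = build_example_db()
    print(joined_rows(conn))
    print(aggregated_rows(conn))
```

In this sketch the join produces one row per purchase, so a customer with several purchases no longer corresponds to a single example, while the aggregation keeps one row per customer but discards the individual purchases. Either step is the kind of pre-processing the introduction refers to; a multi-relational approach would work on both tables as they are.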