Technical Report RR-17-333, 15 September 2017
We present RuDiK, a system for the discovery of declarative rules over knowledge bases (KBs). RuDiK discovers both positive rules, which rely on positive relationships between entities, such as "if two persons have the same parent, they are siblings", and negative rules, i.e., patterns that lead to contradictions in the data, such as "if two persons are married, one cannot be the child of the other". While the former class infers new facts in the KB, the latter class is crucial for detecting erroneous triples. The system is designed to: (i) enlarge the expressive power of the rule language to obtain complex rules and wide coverage of the facts in the KB, (ii) allow the discovery of approximate rules, so as to be robust to errors and incompleteness in the KB, and (iii) use disk-based algorithms, effectively enabling rule mining over large KBs on commodity machines. Extensive experiments on real-world KBs show that RuDiK outperforms previous proposals in terms of efficiency and that it discovers more effective rules for the application at hand.
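To make the two rule classes concrete, the following minimal Python sketch applies the two example rules quoted above to a toy set of triples. It only illustrates what a positive and a negative rule do once they are known; it is not RuDiK's discovery algorithm. The entity names, the predicate names (`hasParent`, `spouse`, `sibling`), and both helper functions are assumptions made up for this illustration.

```python
from itertools import permutations

# Toy KB as (subject, predicate, object) triples; all names are hypothetical.
KB = {
    ("alice", "hasParent", "carol"),
    ("bob", "hasParent", "carol"),
    ("dave", "hasParent", "erin"),
    ("carol", "spouse", "alice"),  # deliberately erroneous triple
}

def infer_siblings(kb):
    """Positive rule: hasParent(a, p) and hasParent(b, p), a != b => sibling(a, b)."""
    children_of = {}
    for s, p, o in kb:
        if p == "hasParent":
            children_of.setdefault(o, set()).add(s)
    # Every ordered pair of distinct children of the same parent is a new fact.
    return {
        (a, "sibling", b)
        for kids in children_of.values()
        for a, b in permutations(sorted(kids), 2)
    }

def find_contradictions(kb):
    """Negative rule: spouse(a, b) must not co-occur with a parent/child link between a and b."""
    flagged = set()
    for s, p, o in kb:
        if p == "spouse" and ((o, "hasParent", s) in kb or (s, "hasParent", o) in kb):
            flagged.add((s, p, o))
    return flagged

new_facts = infer_siblings(KB)      # facts the positive rule would add
suspect = find_contradictions(KB)   # triples the negative rule flags as likely errors
```

In this toy KB the positive rule derives sibling facts for `alice` and `bob`, while the negative rule flags the `spouse` triple because one of the two entities is also recorded as the child of the other.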
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Technical Report RR-17-333, 15 September 2017, and is available at: