Abstract
Explainable clustering, also known as conceptual clustering, is receiving increasing attention in machine learning. It is a knowledge-driven unsupervised learning paradigm that partitions data into \theta disjoint clusters, where each cluster is described by an explicit symbolic representation, typically expressed as a closed pattern or itemset. By providing human-interpretable cluster descriptions, explainable clustering plays an important role in explainable artificial intelligence and knowledge discovery. Recent work improved clustering quality by introducing k-relaxed frequent patterns (k-RFPs), a pattern model that relaxes strict coverage constraints through a generalized k-cover definition. That framework integrates constraint-based reasoning, using SAT solvers for pattern generation, with combinatorial optimization, using Integer Linear Programming (ILP) for cluster selection. Despite its effectiveness, the approach suffers from a critical limitation: multiple distinct k-RFPs may induce identical k-covers, which yields redundant symbolic representations that unnecessarily enlarge the search space and increase the computational cost of cluster construction. In this paper, we address this redundancy through a pattern reduction framework. Our contributions are threefold. First, we formally characterize the conditions under which distinct k-RFPs induce identical k-covers, providing theoretical foundations for redundancy detection. Second, we propose an optimization strategy that removes redundant patterns by retaining a single representative pattern for each distinct k-cover. Third, we investigate the interpretability and representativeness of the patterns selected by the ILP model by analyzing their robustness with respect to their induced clusters.
Extensive experiments on several real-world datasets demonstrate that the proposed approach significantly reduces the pattern search space and improves computational efficiency, while preserving, and in some cases improving, the quality of the resulting clusters.