Abstract
Mining multiple longest common subsequences (\textit{MLCS}) from a set of three or more sequences over a finite alphabet $\Sigma$, a classical NP-hard problem, is an important task in a wide variety of application fields. Unfortunately, no exact \textit{MLCS} algorithm or tool can yet handle long (length $\ge$ 1,000) or big (length $\ge$ 10,000) sequences, which seriously hinders the development and utilization of the massive long or big sequences arising in various application fields today. To address this challenge, we first propose a novel key point-based \textit{MLCS} algorithm for mining big sequences, called \textit{KP-MLCS}, and then present a new method that compactly represents all mined \textit{MLCS}s and quickly reveals the common patterns among them. Furthermore, by introducing new techniques such as real-time graphic visualization and serialization, we have developed a new online visual \textit{MLCS} mining tool, called OVT-MLCS. OVT-MLCS not only enables effective online mining, storing, and downloading of \textit{MLCS}s, in the form of graphs and text, from long or big sequences with a scale of 3 to 5000, but also provides user-friendly interactive functions to facilitate inspection and analysis of the mined \textit{MLCS}s. We believe that the functions provided by OVT-MLCS will promote stronger and wider applications of \textit{MLCS}.