This repository shares data sets for the analysis of co-clustering algorithms. It currently contains 72 artificial real-valued data tables, all generated from latent block models. The main benefits of using these data sets are that:
Item 1 ensures that a co-clustering structure exists; items 2 and 3 enable accurate absolute and relative assessment of the benchmarked co-clustering algorithms; item 4 ensures that the learning problem genuinely belongs to the category of co-clustering problems, which cannot be analyzed with one-way clustering tools. A more comprehensive description of the data sets is given here, and supporting evidence for the claims above is detailed there.
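To make "generated from latent block models" more concrete, here is a minimal Python sketch of how a real-valued table with a known co-clustering structure could be sampled. It is not the procedure used to build the 72 tables of this repository: the Gaussian block distribution, the function name `sample_latent_block_model`, and all parameter names and values are assumptions chosen for illustration only.

```python
import numpy as np


def sample_latent_block_model(n_rows, n_cols, row_props, col_props,
                              block_means, block_std, seed=0):
    """Sample a real-valued table from a Gaussian latent block model.

    Hypothetical helper for illustration; the Gaussian assumption and all
    parameter names are not taken from the repository.
    """
    rng = np.random.default_rng(seed)
    # Draw a latent cluster label for every row and every column.
    row_labels = rng.choice(len(row_props), size=n_rows, p=row_props)
    col_labels = rng.choice(len(col_props), size=n_cols, p=col_props)
    # Look up the mean of the block that each cell (i, j) falls into.
    means = np.asarray(block_means, dtype=float)[
        row_labels[:, None], col_labels[None, :]
    ]
    # Cells are independent given the labels, with block-specific means.
    table = rng.normal(loc=means, scale=block_std)
    return table, row_labels, col_labels


# Example: 2 row clusters x 3 column clusters (illustrative values).
table, row_labels, col_labels = sample_latent_block_model(
    n_rows=50, n_cols=40,
    row_props=[0.5, 0.5], col_props=[0.3, 0.3, 0.4],
    block_means=[[0.0, 2.0, 4.0], [4.0, 0.0, 2.0]],
    block_std=0.5,
)
```

Because the row and column labels are returned together with the table, the partition found by a co-clustering algorithm can be compared directly against the true one, which is what makes this kind of artificial data convenient for benchmarking.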
For information about citing data sets in publications, please see here.
If you have comments or suggestions, wish to donate a series of data sets, or have any other question, feel free to contact the repository maintainer.