04/01/2021

    Find a good number of salient patterns in a matrix
    - Infinite plaid models for infinite bi-clustering -NTT Machine Learning and Data Science Center

    NTT Communication Science Laboratories Open House 2016.

    Abstract

    Our goal is to find salient bi-clusters from a given relational data matrix automatically. Salient bi-clusters are sub-matrices that have distinct values from other entries of the data matrix. Such bi-clusters often corresponds to informative subsets of the data; e.g. "good customer groups with best-selling items for them", and "specific gene clusters that are reactive for a specific treatment/chemicals".

    Conventionally, bi-clustering requires us to specify the number of bi-clusters to be extracted before the analysis. Howeverit is generally difficult to know the number of bi-clusters before conducting an actual analysis.

    Our proposed model enables us to forget about this specification of the number of bi-clusters. The model automatically infer an appropriate number of bi-clusters (up to infinite!) for the given data matrix, and performs effective bi-clustering. This model will help users to conduct "easy-to-go" bi-clustering for several situations.

    NTT Communication Science Laboratories Open House 2016 exhibition 2

    Reference

    1. [1]K. Ishiguro, I. Sato, M. Nakano, A. Kimura, N. Ueda, "Infinite plaid models for infinite bi-clustering," in Proc. 30th AAAI Conference on Artificial Intelligence (AAAI), 2016.

    Related content