Generalized Domain Adaptation


Motivation

It is well known that deep-learning models degrade in performance when there are environmental differences (domain differences) between the training data and the test data. Domain adaptation methods have been proposed to address this problem; generally, they train a domain-invariant model by minimizing the distributional discrepancy between domains. Most previous domain adaptation methods require information about which domain each data sample comes from. However, obtaining complete domain information in advance is very costly. We propose a data augmentation method for training a domain-invariant model even when the training data come from various domains and the domain information of each sample is entirely unknown.

Contributions

We propose a data augmentation method that destroys the class information of each data sample in order to reveal its domain information. Specifically, we split the image into pixel blocks and randomly swap their positions. This augmentation enables us to estimate the domain of each sample in a self-supervised manner. We then use the estimated domain information to train a domain-invariant model.
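The block-swap augmentation above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the function name `block_shuffle`, the square-grid geometry, and the assumption that the image height and width are divisible by the block size are all ours.

```python
import numpy as np

def block_shuffle(image: np.ndarray, block_size: int, rng=None) -> np.ndarray:
    """Destroy class information by splitting an (H, W, C) image into
    square pixel blocks and randomly permuting their positions.

    Assumes H and W are divisible by block_size (an illustrative
    simplification, not a requirement of the original method).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = image.shape
    gh, gw = h // block_size, w // block_size
    # Cut the image into a flat list of gh*gw blocks.
    blocks = (image
              .reshape(gh, block_size, gw, block_size, c)
              .transpose(0, 2, 1, 3, 4)
              .reshape(gh * gw, block_size, block_size, c))
    # Randomly permute the block positions.
    blocks = blocks[rng.permutation(gh * gw)]
    # Reassemble the shuffled blocks into an image of the original shape.
    return (blocks
            .reshape(gh, gw, block_size, block_size, c)
            .transpose(0, 2, 1, 3, 4)
            .reshape(h, w, c))
```

Because only block positions change, the shuffled image keeps low-level statistics (color, texture, noise) that are characteristic of its domain, while the spatial layout that encodes the class is destroyed.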

Effectiveness of our method

The figure below shows how data samples behave in the feature space during training. The marker colors represent the ground-truth class of each sample. With the baseline method, samples from different classes are mixed together within the same clusters, so it fails to train a domain-invariant model. With our method, samples of the same class cluster together regardless of domain, so it successfully trains a domain-invariant model.

Future work

Our work is expected to open up avenues for utilizing data that current deep-learning techniques cannot exploit and to expand the application areas of machine learning. In the future, we aim to make our method applicable to media data other than images.

Publications

  1. Mitsuzumi, Irie, Ikami, Shibata, "Generalized Domain Adaptation," Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021. (CVPR 2021 Open Access Repository)
  2. Code: GitHub, nttcslab/Generalized-Domain-Adaptation

Contact

Yu Mitsuzumi
Recognition Research Group, Media Information Laboratory, NTT Communication Science Laboratories
