It is well known that deep-learning models degrade in performance when there are environmental differences (domain differences) between the training data and the test data. Domain adaptation methods have been proposed to address this problem; in general, they train a domain-invariant model by minimizing the distributional discrepancy between the domains. Most previous domain adaptation methods, however, require information about which domain each data sample comes from, and collecting this domain information for every sample in advance is very costly. We propose a data augmentation method for training a domain-invariant model even when the training data come from various domains and the domain information of each sample is entirely unknown.
Our data augmentation destroys the class information of each data sample in order to reveal its domain information. Specifically, we split each image into pixel blocks and randomly swap their positions. This augmentation enables us to estimate the domain of each sample in a self-supervised manner, and we then use the estimated domain information to train a domain-invariant model.
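As a rough illustration, the sketch below shows one way the block-swapping step could be implemented (this is not the released code; the image layout as an HxWxC NumPy array and the `blocks_per_side` grid size are assumptions for illustration). Cutting the image into a grid and permuting the blocks destroys object shape, i.e. class information, while low-level cues such as color and texture statistics, which tend to reflect the domain, survive.

```python
import numpy as np


def block_shuffle(image, blocks_per_side=4, rng=None):
    """Split an HxWxC image into a grid of pixel blocks and randomly
    permute their positions (a minimal sketch of the augmentation).

    Height and width are assumed to be divisible by `blocks_per_side`.
    """
    rng = rng or np.random.default_rng()
    h, w, _ = image.shape
    bh, bw = h // blocks_per_side, w // blocks_per_side

    # Cut the image into a list of (bh, bw, C) blocks, row by row.
    blocks = [
        image[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
        for i in range(blocks_per_side)
        for j in range(blocks_per_side)
    ]

    # Randomly permute the block positions.
    order = rng.permutation(len(blocks))

    # Reassemble the shuffled blocks into an image of the same size.
    out = np.empty_like(image)
    for dst, src in enumerate(order):
        i, j = divmod(dst, blocks_per_side)
        out[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw] = blocks[src]
    return out
```

Features extracted from such shuffled images could then be grouped, for example by clustering, to obtain pseudo domain labels for each sample in a self-supervised manner; the exact estimation procedure follows our method rather than this sketch.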
The figure below shows how the data samples behave in the feature space during training. The marker colors represent the ground-truth class of each sample. With the baseline method, samples from different classes end up mixed within the same clusters, and it fails to train a domain-invariant model. With our method, samples of the same class are clustered together, and a domain-invariant model is successfully trained.
We expect this work to open up avenues for utilizing data that current deep-learning techniques have not been able to exploit and to expand the application areas of machine learning. In the future, we aim to make our method applicable to media data other than images.
Yu Mitsuzumi
Recognition Research Group, Media Information Laboratory, NTT Communication Science Laboratories