Science of Machine Learning
Exhibition Program 5
Stable deep learning for time-series data
Preventing gradient explosions in gated recurrent units
Abstract
We propose a method to stabilize the training of Recurrent Neural Networks (RNNs). The RNN is one of the most successful models for handling time-series data in applications such as speech recognition and machine translation. However, training RNNs requires trial and error and expertise, because training is hampered by the exploding gradient problem. In this study, we focus on the Gated Recurrent Unit (GRU), one of the modern RNN models. We reveal the parameter point at which GRU training is disrupted by exploding gradients and propose an algorithm that prevents the gradient from exploding. Our method reduces the time spent on trial and error and does not require in-depth expertise to tune the hyper-parameters for training GRUs.
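The exhibit's algorithm is not reproduced here, but as a minimal sketch of the problem setting, the code below shows a standard PyTorch GRU training loop with gradient-norm clipping (a common baseline remedy for exploding gradients) together with monitoring of the spectral norm of the recurrent weight matrix, whose size is closely tied to the stability of the hidden-state dynamics. The model, data, and hyper-parameters are illustrative assumptions, not part of the presented work.

```python
# Illustrative sketch (not the authors' algorithm): a GRU training loop with
# gradient clipping and spectral-norm monitoring of the recurrent weights.
import torch
import torch.nn as nn

torch.manual_seed(0)

class GRURegressor(nn.Module):
    """Toy GRU model for a sequence-to-one regression task."""
    def __init__(self, input_size=8, hidden_size=32):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.gru(x)          # out: (batch, seq_len, hidden)
        return self.head(out[:, -1])  # predict from the last time step

model = GRURegressor()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

# Synthetic time-series batch: 16 sequences of length 100 (illustrative data).
x = torch.randn(16, 100, 8)
y = torch.randn(16, 1)

for step in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    # Baseline remedy: clip the global gradient norm before the update.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    # Monitor the spectral norm of the stacked hidden-to-hidden weights;
    # if it grows large, the hidden-state dynamics can become unstable.
    sigma = torch.linalg.matrix_norm(model.gru.weight_hh_l0, ord=2)
    print(f"step {step}: loss={loss.item():.4f}, ||W_hh||_2={sigma.item():.3f}")
```

Gradient clipping rescales the gradient after the fact, whereas the exhibited approach aims to keep the GRU parameters away from the point where explosions arise in the first place; the spectral-norm printout is included only to make that distinction visible.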
Presenters
Sekitoshi Kanai
Software Innovation Center
Yasutoshi Ida
Software Innovation Center
Yu Oya
Software Innovation Center
Yasuhiro Iida
Software Innovation Center