$\approx 15$ years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.">

LSTM: A Search Space Odyssey (original) (raw)

IEEE Account

Purchase Details

Profile Information

Need Help?

A not-for-profit organization, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity.
© Copyright 2026 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.