In PyTorch's LSTM, RNN, and GRU modules, there is a parameter called "num_layers" that controls the number of stacked recurrent layers. Since there are multiple layers, I wonder why "hidden_size" is a single number instead of a list giving the hidden size of each layer, like [10, 20, 30].
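To illustrate what I mean (the sizes here are just made-up examples): with the built-in `nn.LSTM`, the single `hidden_size` value applies to every stacked layer, so there is no way to pass per-layer sizes like [10, 20, 30] directly:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration: a 3-layer stacked LSTM.
# hidden_size is one int shared by all num_layers layers.
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=3)

x = torch.randn(7, 4, 10)   # (seq_len, batch, feature)
out, (h_n, c_n) = lstm(x)

print(out.shape)   # (seq_len, batch, hidden_size) -> torch.Size([7, 4, 20])
print(h_n.shape)   # (num_layers, batch, hidden_size) -> torch.Size([3, 4, 20])
```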
I came across this while working on a regression project, in which I feed sequence data of shape (seq_len, batch, feature) to an LSTM and want a scalar output at every time step.
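For context, my setup looks roughly like this (a minimal sketch with made-up sizes, not my actual model): a stacked LSTM followed by a linear layer that maps each time step's hidden state to a scalar.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration.
seq_len, batch, feature = 5, 3, 10
hidden_size, num_layers = 20, 2

lstm = nn.LSTM(input_size=feature, hidden_size=hidden_size, num_layers=num_layers)
head = nn.Linear(hidden_size, 1)   # scalar regression output per time step

x = torch.randn(seq_len, batch, feature)   # (seq_len, batch, feature)
out, _ = lstm(x)                           # out: (seq_len, batch, hidden_size)
y = head(out).squeeze(-1)                  # (seq_len, batch): one scalar per step

print(y.shape)   # torch.Size([5, 3])
```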
A helpful link for understanding the PyTorch LSTM framework is here. I'd really appreciate it if anyone could answer this.