• Treat the hidden activities of the first-level TRBM as the data for the second-level TRBM.
– So when we learn the second level, we get connections across time in the first hidden layer.
• After greedy learning, we can generate from the composite model
– First, generate from the top-level model by using alternating Gibbs sampling between the current hiddens and visibles of the top-level model, using the dynamic biases created by the previous top-level visibles.
– Then do a single top-down pass through the lower layers, but using the autoregressive inputs coming from earlier states of each layer.
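
The generation procedure above can be sketched for a two-level stack as follows. This is only an illustrative sketch, not the original implementation. It assumes binary units, a history of one time step, and no static biases. The parameter names (W1, A, W2, B_vis, B_hid) and the helper functions are hypothetical. Here the "top-level visibles" are the first hidden layer h1, and the top-level hiddens are h2.

```python
import math
import random

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def matvec(M, x):
    # M is a list of rows; returns the product M x as a list.
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def add(*vectors):
    # Element-wise sum of equally sized vectors.
    return [sum(vals) for vals in zip(*vectors)]

def sample_bernoulli(pre_activations, rng):
    # Each unit turns on with probability sigmoid(pre-activation).
    return [1 if rng.random() < sigmoid(a) else 0 for a in pre_activations]

def random_matrix(rows, cols, rng, scale=0.1):
    # Stand-in for learned weights (assumption: small Gaussian values).
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def generate(params, n_steps, gibbs_steps, rng):
    """Generate n_steps visible frames from a two-level stack.

    Hypothetical parameter shapes:
      W1    (n1 x nv): first-level weights, visibles <-> h1
      A     (nv x nv): autoregressive weights, v[t-1] -> v[t]
      W2    (n2 x n1): top-level weights, h1 <-> h2
      B_vis (n1 x n1): h1[t-1] -> dynamic bias of h1[t]
      B_hid (n2 x n1): h1[t-1] -> dynamic bias of h2[t]
    """
    W1, A, W2 = params["W1"], params["A"], params["W2"]
    B_vis, B_hid = params["B_vis"], params["B_hid"]
    W1_T, W2_T = transpose(W1), transpose(W2)
    h1_prev = [0] * len(W1)      # previous top-level visibles
    v_prev = [0] * len(W1[0])    # previous data-level visibles
    frames = []
    for _ in range(n_steps):
        # Dynamic biases created by the previous top-level visibles.
        bias_h1 = matvec(B_vis, h1_prev)
        bias_h2 = matvec(B_hid, h1_prev)
        # Alternating Gibbs sampling between h2 and h1 at the current step,
        # with the chain initialised at the previous state.
        h1 = list(h1_prev)
        for _ in range(gibbs_steps):
            h2 = sample_bernoulli(add(matvec(W2, h1), bias_h2), rng)
            h1 = sample_bernoulli(add(matvec(W2_T, h2), bias_h1), rng)
        # Single top-down pass, adding the autoregressive input from v[t-1].
        v = sample_bernoulli(add(matvec(W1_T, h1), matvec(A, v_prev)), rng)
        frames.append(v)
        h1_prev, v_prev = h1, v
    return frames
```

With more than two levels, the same top-down step would be repeated once per lower layer, each using its own autoregressive weights. Random weights are used here only so the sketch runs; in practice they would come from the greedy layer-by-layer learning.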