Question 1 of 25
Why do CNNs work better than plain MLPs on images?
Question 2 of 25
What does parameter sharing in a conv layer mean?
Question 3 of 25
Pooling (e.g. 2×2 max pooling) mainly helps by:
Question 4 of 25
An RNN is designed for:
Question 5 of 25
Why do plain RNNs struggle with long sequences?
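One way to build intuition for this question is to measure how strongly the final hidden state still depends on the initial one. The sketch below assumes a scalar vanilla RNN, h_t = tanh(w * h_{t-1} + x), with an illustrative recurrent weight w = 0.9 and a constant input x = 0.5; the function name and these values are assumptions chosen for the demonstration, not part of the quiz.

```python
import math

def gradient_through_time(num_steps, w=0.9, x=0.5):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x).

    By the chain rule, this is the product over all steps of the local
    derivative w * (1 - h_t**2). The values of w and x are illustrative.
    """
    h = 0.0
    grad = 1.0
    for _ in range(num_steps):
        h = math.tanh(w * h + x)
        grad *= w * (1.0 - h * h)  # local derivative d h_t / d h_{t-1}
    return grad

# The gradient is a product of factors below 1, so it shrinks as the
# sequence gets longer.
for steps in (5, 20, 50):
    print(steps, gradient_through_time(steps))
```

Because every factor in the product is smaller than 1 in this setup, the signal from early steps fades roughly exponentially with sequence length.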
Question 6 of 25
In word-by-word sentiment analysis, the RNN hidden state at step *t* summarizes:
Question 7 of 25
What problem does an LSTM mainly address compared with a vanilla RNN?
Question 8 of 25
GRU compared to LSTM:
Question 9 of 25
The LSTM forget gate decides:
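The standard LSTM cell-state update, c_t = f_t * c_{t-1} + i_t * g_t, can be sketched for the scalar case as follows. In a real LSTM the gates f_t and i_t are sigmoid outputs computed from the input and the previous hidden state; here they are set by hand, and the function name is a hypothetical one, purely to show what each gate does.

```python
def lstm_cell_update(c_prev, forget_gate, input_gate, candidate):
    """One scalar LSTM cell-state update: c_t = f_t * c_{t-1} + i_t * g_t."""
    return forget_gate * c_prev + input_gate * candidate

c_prev = 2.0  # illustrative previous cell state

# Forget gate fully open, input gate closed: the old memory is kept as-is.
print(lstm_cell_update(c_prev, forget_gate=1.0, input_gate=0.0, candidate=0.5))  # 2.0

# Forget gate closed, input gate fully open: the old memory is replaced.
print(lstm_cell_update(c_prev, forget_gate=0.0, input_gate=1.0, candidate=0.5))  # 0.5
```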
Question 10 of 25
What is a word embedding?
Question 11 of 25
Why are embeddings important for GenAI?
Question 12 of 25
Using pre-trained embeddings (e.g. GloVe, Word2Vec) vs training embeddings from scratch on 500 reviews:
Question 13 of 25
A 3×3 conv with stride 1 and no padding on a 32×32 feature map produces an output of size:
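The arithmetic behind this kind of question follows the usual output-size formula for a convolution along one spatial dimension, floor((n + 2p - k) / s) + 1. A minimal sketch, with a helper name (`conv_output_size`) chosen here for illustration:

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension.

    Implements floor((n + 2p - k) / s) + 1 for input size n, kernel size k,
    stride s and padding p (dilation is assumed to be 1).
    """
    return (input_size + 2 * padding - kernel_size) // stride + 1

print(conv_output_size(32, 3))                       # 30: stride 1, no padding
print(conv_output_size(32, 3, padding=1))            # 32: padding of 1 keeps the size
print(conv_output_size(32, 3, stride=2, padding=1))  # 16: stride 2 halves the size
```

The same formula applies independently to height and width, so a square input gives a square output.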
Question 14 of 25
In sentiment analysis, averaging word embeddings for a review is:
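To make the averaging approach concrete, the sketch below averages toy 2-dimensional embeddings for two reviews that contain the same words in a different order. The vectors and the helper name are made up for illustration; real embeddings would come from a trained model.

```python
def average_embedding(words, embeddings):
    """Average the vectors of all known words into one fixed-size vector."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        raise ValueError("no known words in the review")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Hypothetical toy embeddings (not from any trained model).
toy = {
    "not": [-0.5, 0.25],
    "good": [1.0, 0.5],
    "bad": [-1.0, 0.5],
    "just": [0.0, 0.25],
}

review_a = average_embedding("not good just bad".split(), toy)
review_b = average_embedding("not bad just good".split(), toy)

# Same words, opposite meaning, identical average: word order is lost.
print(review_a, review_b, review_a == review_b)
```

The result is a simple fixed-size representation, but it cannot distinguish "not good" from "not bad" when the rest of the words match.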
Question 15 of 25
For product review sentiment, why might an LSTM beat a bag-of-words model?
Question 16 of 25
Stride 2 convolution (vs stride 1) typically:
Question 17 of 25
Why share the same filter weights across all image locations in a conv layer?
Question 18 of 25
In an RNN, hₜ (hidden state at time t) mainly summarizes:
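The recurrence behind hₜ can be sketched with a scalar vanilla RNN, hₜ = tanh(w_x * xₜ + w_h * hₜ₋₁). The weights and input values below are arbitrary assumptions chosen for the demonstration, and the function name is hypothetical.

```python
import math

def rnn_hidden_states(inputs, w_x=1.0, w_h=0.8, h0=0.0):
    """Run a scalar vanilla RNN and return the hidden state after each step."""
    states = []
    h = h0
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)  # h_t depends on x_t and h_{t-1}
        states.append(h)
    return states

# Two sequences that differ only in their first element.
states_a = rnn_hidden_states([1.0, 0.2, -0.3])
states_b = rnn_hidden_states([-1.0, 0.2, -0.3])

# The last hidden states differ, so h_t carries information from earlier steps.
print(states_a[-1], states_b[-1])
```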
Question 19 of 25
The LSTM input gate decides:
Question 20 of 25
Compared to an LSTM, a GRU often:
Question 21 of 25
Words with similar meaning (e.g. king and queen) in a good embedding space:
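Closeness in an embedding space is commonly measured with cosine similarity. The sketch below uses hand-picked 3-dimensional toy vectors, which are assumptions for illustration only and not values from Word2Vec or GloVe.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings (not from any trained model).
toy = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "banana": [0.1, 0.0, 0.95],
}

print(cosine_similarity(toy["king"], toy["queen"]))   # close to 1
print(cosine_similarity(toy["king"], toy["banana"]))  # much lower
```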
Question 22 of 25
Backprop through time (BPTT) means:
Question 23 of 25
Padding (e.g. "same" padding) around the input of a conv layer is used to:
Question 24 of 25
Word2Vec / GloVe pre-trained embeddings help downstream tasks because:
Question 25 of 25
For very long documents, plain RNNs struggle mainly because: