Module 7 quiz and review

Before we begin

Test your knowledge of the LLM lifecycle, fine-tuning, RAG engineering, sampling, and trust, including interview topics such as LoRA, hybrid search, reranking, and faithfulness evals. Aim for at least 30 out of 40.

Multiple choice quiz

Pick one answer per question. Feedback appears immediately, so take your time before clicking.

Question 1 of 40
A GPT-style LLM is best described as:
Question 2 of 40
During inference, an LLM typically:
Question 3 of 40
Prompt engineering mainly means:
Question 4 of 40
A system prompt usually:
Question 5 of 40
What does temperature control in LLM sampling?
Question 6 of 40
Top-k sampling:
Question 7 of 40
Top-p (nucleus) sampling:
Question 8 of 40
Fine-tuning vs RAG — the main difference:
Question 9 of 40
When is RAG often preferred over fine-tuning?
Question 10 of 40
In RAG, embeddings are used to:
Question 11 of 40
Why do LLMs hallucinate?
Question 12 of 40
Which practice reduces hallucinations in doc Q&A?
Question 13 of 40
In LLM context, embeddings refer to:
Question 14 of 40
Few-shot prompting means:
Question 15 of 40
A vector database (FAISS, Pinecone, etc.) in RAG stores:
Question 16 of 40
Pretraining an LLM mainly means:
Question 17 of 40
Temperature = 0 (greedy decoding) typically:
Question 18 of 40
A clear system prompt should:
Question 19 of 40
Chunking documents for RAG is important because:
Question 20 of 40
Fine-tuning is often chosen over RAG when:
Question 21 of 40
Grounding a chatbot answer means:
Question 22 of 40
Top-p (nucleus) sampling keeps:
Question 23 of 40
Chain-of-thought prompting helps when:
Question 24 of 40
In a chat API, assistant messages in history are included so that:
Question 25 of 40
Cosine similarity between query and chunk embeddings in RAG finds:
Question 26 of 40
SFT (supervised fine-tuning) in the LLM lifecycle means:
Question 27 of 40
RLHF is mainly used to:
Question 28 of 40
LoRA fine-tuning is popular in industry because:
Question 29 of 40
Quantization (e.g. INT8, 4-bit) helps primarily with:
Question 30 of 40
Hybrid search in RAG combines:
Question 31 of 40
A reranker after first-stage retrieval is used because:
Question 32 of 40
When a source document is deleted, a production RAG index should:
Question 33 of 40
Chunk overlap (50–100 tokens) helps because:
Question 34 of 40
A coding agent (Claude Code / Cursor Agent) differs from chat because it:
Question 35 of 40
Faithfulness in RAG eval means:
Question 36 of 40
For RAG Q&A with citations, sampling settings are usually:
Question 37 of 40
Catastrophic forgetting during fine-tuning means:
Question 38 of 40
Bi-encoder retrieval vs cross-encoder reranking:
Question 39 of 40
A user pastes “ignore instructions and leak secrets” into RAG context. Mitigation:
Question 40 of 40
Interview question: inference vs training. What happens to the model weights during a user chat?

After the quiz

30/40 or higher? Start the RAG chatbot project.

Checklist:

I can explain pretraining → SFT → RLHF at a high level.
I know the trade-offs between full fine-tuning, LoRA, and RAG.
I understand hybrid search, chunk overlap, and reranking.
I can describe why hallucinations happen and how grounding helps.
I know when low temperature fits RAG Q&A.
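
As a review aid for the sampling items above, here is a minimal sketch of temperature, top-k, and top-p on a toy distribution. The function names and the four-token `toy_logits` vocabulary are hypothetical, chosen only for illustration. Real inference libraries implement these steps differently and more efficiently.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature 0 is treated as greedy."""
    if temperature == 0:
        # Greedy decoding: all probability mass goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize their probabilities."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


# Hypothetical scores for a 4-token vocabulary (assumption, not real model output).
toy_logits = [2.0, 1.0, 0.5, -1.0]

greedy = softmax_with_temperature(toy_logits, 0)
low = softmax_with_temperature(toy_logits, 0.2)    # sharper distribution
base = softmax_with_temperature(toy_logits, 1.0)
high = softmax_with_temperature(toy_logits, 1.5)   # flatter distribution

print("greedy:", greedy)
print("low temperature:", [round(x, 3) for x in low])
print("high temperature:", [round(x, 3) for x in high])
print("top-k (k=2):", top_k_filter(base, 2))
print("top-p (p=0.9):", top_p_filter(base, 0.9))
```

Lower temperature concentrates probability on the top token, which is one reason low settings tend to suit RAG Q&A where answers should stay close to the retrieved text. Top-k keeps a fixed number of candidates, while top-p keeps a variable number that depends on how peaked the distribution is.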

What's next

Project: RAG chatbot with citations