3.7 Assignment 05 (Due date: Apr. 19, 2026)
Neural Networks
 
Assignment Objectives:
 
The primary goal of this assignment is to implement a simplified Transformer-based language model for next-word prediction.
This assignment is designed to help you understand the following concepts:
  • Tokenization
  • Word embeddings
  • Positional encoding
  • Multi-head self-attention
  • Residual connections
  • Feed-forward subnetworks
  • Model training and inference
To begin, download Kamangar_05.zip and unzip it on your computer.
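To make the tokenization and training-example steps above concrete, here is a minimal sketch of word-level tokenization and next-word example construction. The helper name build_examples and the seq_len parameter are illustrative assumptions, not part of the required interface:

```python
import re

def build_examples(text, seq_len=4):
    """Word-level tokenization and (input sequence, next word) pairs."""
    words = re.findall(r"[a-z']+", text.lower())          # word-level tokens
    vocab = sorted(set(words))
    word_to_id = {w: i for i, w in enumerate(vocab)}
    ids = [word_to_id[w] for w in words]
    # Each example: seq_len input token ids; the target is the following word.
    examples = [(ids[i:i + seq_len], ids[i + seq_len])
                for i in range(len(ids) - seq_len)]
    return vocab, examples

vocab, examples = build_examples("the cat sat on the mat the cat ran", seq_len=3)
```

Applied to the provided ~1000-word corpus, this sliding-window scheme yields one training example per position, which is the next-word-prediction setup described in the notes below.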
 
Allowed libraries
You may use:
  • numpy
  • torch
  • torch.nn
  • torch.optim
  • re
  • math
  • random
You may not use:
  • torch.nn.Transformer
  • torch.nn.MultiheadAttention
  • pretrained Transformer blocks
  • Hugging Face model classes for the Transformer itself
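Because the built-in attention modules are off limits, the attention mechanism must be written from basic tensor operations. The following is one possible sketch of head splitting with scaled dot-product attention; the class name and the fused qkv projection are design choices, not a required interface:

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, embedding_dim, num_heads):
        super().__init__()
        assert embedding_dim % num_heads == 0, "embedding_dim must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = embedding_dim // num_heads
        self.qkv = nn.Linear(embedding_dim, 3 * embedding_dim)   # fused Q, K, V projection
        self.out = nn.Linear(embedding_dim, embedding_dim)

    def forward(self, x):                          # x: (batch, seq_len, embedding_dim)
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split the embedding into heads: (batch, num_heads, seq_len, head_dim)
        q = q.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)
        weights = scores.softmax(dim=-1)           # (batch, num_heads, seq_len, seq_len)
        out = weights @ v                          # (batch, num_heads, seq_len, head_dim)
        out = out.transpose(1, 2).contiguous().view(b, t, d)     # recombine heads
        return self.out(out)
```

Note that the per-head attention weight tensor has shape (batch, num_heads, seq_len, seq_len), which matches the shape checks suggested by the unit-test names below.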
 
Notes:
  • embedding_dim must be divisible by num_heads
  • self-attention must split the embedding into heads
  • The model must support more than one Transformer block
  • Tensor shapes must be handled correctly
  • The training target is the next word after the input sequence
  • The corpus contains approximately 1000 words and must be used to create next-word prediction examples.
  • Use word-level tokenization
  • Use next-word prediction only
  • Use only the last output position to predict the next word
  • Causal masking is not required.
  • The goal of this assignment is understanding how tensors flow through the model and how the Transformer mechanics fit together. The model does not need to achieve state-of-the-art performance. Keep your implementation modular and readable.
  • DO NOT alter the names or the parameters of the required functions.
  • You may introduce additional functions (helper functions) as needed.
  • The "test_assignment_05.py" file includes a minimal set of unit tests. The assignment grade will be based on your code passing these tests (and possibly additional tests).
  • DO NOT submit the "test_assignment_05.py" file when submitting your Assignment 05.
  • DO NOT submit your environment when submitting your assignment.
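Putting the notes above together, here is a runnable end-to-end sketch of the model structure: learned positional encoding, stacked blocks with residual connections and feed-forward subnetworks, no causal mask, and only the last output position feeding the classifier. Single-head attention is used here for brevity (the assignment itself requires multi-head), and all class and attribute names are illustrative assumptions:

```python
import math
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    """One Transformer block: self-attention and a feed-forward subnetwork,
    each wrapped in a residual connection. No causal mask is applied."""
    def __init__(self, d):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)
        self.ff = nn.Sequential(nn.Linear(d, 4 * d), nn.ReLU(), nn.Linear(4 * d, d))
        self.scale = math.sqrt(d)

    def forward(self, x):                          # x: (batch, seq_len, d)
        w = (self.q(x) @ self.k(x).transpose(-2, -1) / self.scale).softmax(dim=-1)
        x = x + w @ self.v(x)                      # residual around attention
        return x + self.ff(x)                      # residual around feed-forward

class TinyLM(nn.Module):
    def __init__(self, vocab_size, d=16, num_blocks=2, seq_len=4):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d)
        self.pos = nn.Embedding(seq_len, d)        # learned positional encoding
        self.blocks = nn.ModuleList(TinyBlock(d) for _ in range(num_blocks))
        self.classifier = nn.Linear(d, vocab_size)

    def forward(self, ids):                        # ids: (batch, seq_len)
        x = self.tok(ids) + self.pos(torch.arange(ids.shape[1]))
        for blk in self.blocks:
            x = blk(x)
        return self.classifier(x[:, -1, :])        # only the last position predicts the next word
```

Training then reduces to cross-entropy between these logits and the id of the next word, e.g. nn.functional.cross_entropy(model(inputs), targets).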
 
You may run these tests using the command:      python -m pytest --verbose test_assignment_05.py
The following is roughly what your output should look like if all tests pass:
 
test_assignment_05.py::TestTransformer::test_attention_weight_shapes_each_block PASSED        [  7%]
test_assignment_05.py::TestTransformer::test_classifier_weight_shape PASSED                   [ 15%]
test_assignment_05.py::TestTransformer::test_forward_output_shape PASSED                      [ 23%]
test_assignment_05.py::TestTransformer::test_generate_text_exact PASSED                       [ 30%]
test_assignment_05.py::TestTransformer::test_history_decreases PASSED                         [ 38%]
test_assignment_05.py::TestTransformer::test_invalid_embedding_head_combination PASSED        [ 46%]
test_assignment_05.py::TestTransformer::test_invalid_num_blocks PASSED                        [ 53%]
test_assignment_05.py::TestTransformer::test_invalid_sequence_length PASSED                   [ 61%]
test_assignment_05.py::TestTransformer::test_number_of_blocks PASSED                          [ 69%]
test_assignment_05.py::TestTransformer::test_predict_next_word_exact PASSED                   [ 76%]
test_assignment_05.py::TestTransformer::test_predict_next_word_exact_second_pattern PASSED    [ 84%]
test_assignment_05.py::TestTransformer::test_token_embedding_weight_shape PASSED              [ 92%]
test_assignment_05.py::TestTransformer::test_vocab_is_expected PASSED                         [100%]
=============================== 13 passed in 25.94s ================================================
 
Submission Guidelines:
  • The first four lines of your submitted files must have the following format:
 
# Your name (last-name, first-name)
# Your student ID (100x_xxx_xxx)
# Date of submission (yyyy_mm_dd)
# Assignment_nn_kk