Assignment 05 (Optional) (Due date: Dec. 1, 2024)


Neural Networks

Assignment 05

Bonus (Optional) Assignment

Due: Dec. 01, 2024

This assignment is optional. If you choose to complete it, its grade will replace your lowest assignment grade. Please note that no partial credit is given for this assignment: all components must be fully functional to receive a grade.

Assignment Objectives:

The primary goal of this assignment is to implement a Variational Autoencoder (VAE). The assignment comprises two tasks:

Task 1: Latent Space Interaction with VAE

Build and train a Variational Autoencoder (VAE) model to learn a six-dimensional latent space representation of a facial data set.
Interact with the decoder part of the trained VAE by creating a graphical user interface (GUI) with six sliders, one for each value in the six-dimensional latent space.
Reconstruct and display facial images in real time as the user moves the sliders with the mouse.
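
As a rough illustration of the training objective behind Task 1, the sketch below works through the two terms of a common VAE loss (reconstruction error plus KL divergence to a standard normal prior) and the reparameterization step. It is an assumption-laden sketch, not the required implementation: the function names, the sum-of-squares reconstruction term, and the `beta` weight are all illustrative choices, and NumPy only shows the arithmetic, so actual training would need an automatic-differentiation framework of your choice.

```python
import numpy as np

LATENT_DIM = 6  # the only architectural constraint stated in the assignment


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dims, averaged over the batch."""
    per_sample = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    return float(np.mean(per_sample))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Sum-of-squares reconstruction error plus a beta-weighted KL term.

    x and x_hat are flattened image batches of shape (batch, pixels).
    beta=1.0 gives the standard VAE objective (illustrative default).
    """
    recon = float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))
    return recon + beta * kl_divergence(mu, log_var)
```

With `mu` and `log_var` coming from the encoder, `reparameterize` produces the six-dimensional code that is fed to the decoder; the KL term is what keeps the latent space well-behaved enough for slider values to decode into plausible faces.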

Task 2: Masked Image Reconstruction with VAE

Use the same data set as in Task 1 to train a second VAE to handle masked facial images. The mask is assumed to be a square with variable size and location.
Once training is done, let the user load an image and interactively change the position of a square mask over the selected image (using sliders). Display, in real time, the reconstruction produced by the trained VAE from Task 1 alongside the reconstruction produced by the trained VAE from Task 2.
The masked portion of the image should be set to all zeros.
The normalized size of the mask should be adjustable from 0 to 0.5. The normalized position of the top left of the mask should be between 0 and 1.
Reconstruct the image in real time as the user moves or resizes the mask, and display the associated Mean Squared Error (MSE).
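
The masking rules above can be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed implementation: the function names are hypothetical, the mask side is taken relative to the shorter image dimension, a mask that extends past the border is simply clipped, and the MSE is assumed to compare the original (unmasked) image with the reconstruction.

```python
import numpy as np


def apply_square_mask(image, size, top, left):
    """Return a copy of image with a square region set to all zeros.

    size is the normalized side length in [0, 0.5]; top and left are the
    normalized coordinates of the mask's top-left corner in [0, 1].
    """
    if not 0.0 <= size <= 0.5:
        raise ValueError("size must be in [0, 0.5]")
    if not (0.0 <= top <= 1.0 and 0.0 <= left <= 1.0):
        raise ValueError("top and left must be in [0, 1]")
    height, width = image.shape[:2]
    side = int(round(size * min(height, width)))  # assumption: relative to shorter side
    row = int(round(top * height))
    col = int(round(left * width))
    masked = image.copy()
    masked[row:row + side, col:col + side] = 0  # slicing clips at the image border
    return masked


def mse(original, reconstruction):
    """Mean squared error over all pixels."""
    diff = np.asarray(original, dtype=np.float64) - np.asarray(reconstruction, dtype=np.float64)
    return float(np.mean(diff ** 2))


def random_mask_params(rng):
    """Draw a random (size, top, left) triple, e.g. for masking training images."""
    size = rng.uniform(0.0, 0.5)
    top, left = rng.uniform(0.0, 1.0, size=2)
    return float(size), float(top), float(left)
```

During training of the second VAE, `random_mask_params` could supply a fresh mask for every image; in the GUI, the three slider values would be passed to `apply_square_mask` directly and `mse` recomputed after each reconstruction.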

Datasets:

You have the option to use any of the following facial data sets:

LFW (Labeled Faces in the Wild): Contains approximately 13,000 labeled facial images.
CelebA: Comprises over 200,000 celebrity images.
FER2013 (Facial Expression Recognition 2013): Includes around 35,000 images for facial expression analysis.
IMDB-WIKI: Contains over half a million images of celebrities.
CASIA WebFace: Includes roughly 500,000 images of celebrities.
MS Celeb 1M: Consists of around 10 million images of celebrities.
300 Faces In-the-Wild (300W): A dataset of faces annotated with 68 facial landmarks each, displaying varying poses, expressions, and occlusions, often used for facial landmark detection.
Multi-PIE: A dataset with more than 750,000 images of 337 individuals, captured under different illumination, pose, and expression conditions, commonly used for face recognition research.
AFLW (Annotated Facial Landmarks in the Wild): Contains over 25,000 in-the-wild facial images with annotated facial landmarks.

Grading Criteria:

This assignment is entirely optional, and no partial credit is given. To receive a grade for this assignment, all components in Task 1 and Task 2 must be fully functional. The graphical interface and real-time reconstruction are expected to work seamlessly for both tasks.

Notes:

Your GUI must exactly match the image shown below.
Apart from the number of latent variables, you are free to choose the architecture of your VAEs.
Submit your saved trained model with your submission.
Your program must automatically load the saved model when it starts to run.
Your program must automatically show the GUI when it runs.
Do not submit the facial dataset.
Please ensure that you consult the specific data set's documentation and comply with usage terms and permissions when working with facial data sets.
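
One possible way to wire the Task 1 sliders to a decoder and show the window as soon as the program starts is sketched below using Matplotlib's `Slider` widget. Everything here is an assumption for illustration: the layout does not claim to match the required GUI image, `decode` stands for whatever callable wraps your own loaded decoder, and the slider range of ±3 is an arbitrary choice.

```python
import numpy as np

LATENT_DIM = 6  # fixed by the assignment


def sliders_to_latent(values):
    """Convert six slider readings into a (1, 6) float32 latent batch."""
    z = np.asarray(values, dtype=np.float32)
    if z.shape != (LATENT_DIM,):
        raise ValueError(f"expected {LATENT_DIM} slider values, got shape {z.shape}")
    return z[np.newaxis, :]


def run_gui(decode, slider_range=3.0):
    """Open a window with six sliders and redraw the decoded face on every change.

    decode is assumed to map a (1, 6) array to a 2-D grayscale image in [0, 1].
    """
    # Imported here so that sliders_to_latent can be used without a display.
    import matplotlib.pyplot as plt
    from matplotlib.widgets import Slider

    fig, ax = plt.subplots()
    fig.subplots_adjust(bottom=0.45)  # leave room for the sliders
    ax.axis("off")
    start = decode(sliders_to_latent([0.0] * LATENT_DIM))
    image = ax.imshow(start, cmap="gray", vmin=0.0, vmax=1.0)

    sliders = []
    for i in range(LATENT_DIM):
        slider_ax = fig.add_axes([0.2, 0.05 + 0.06 * i, 0.6, 0.03])
        sliders.append(
            Slider(slider_ax, f"z{i + 1}", -slider_range, slider_range, valinit=0.0)
        )

    def on_change(_value):
        z = sliders_to_latent([s.val for s in sliders])
        image.set_data(decode(z))
        fig.canvas.draw_idle()

    for s in sliders:
        s.on_changed(on_change)
    plt.show()  # blocks until the window is closed
```

A main script would load the saved model first, wrap its decoder in a function, and then call `run_gui(that_function)`, so that both the model loading and the GUI happen automatically on launch.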

Submission Guidelines:

The first four lines of your submitted files must have the following format:

# Your name (last-name, first-name)
# Your student ID (100x_xxx_xxx)
# Date of submission (yyyy_mm_dd)
# Assignment_nn_kk

Create a directory and name it according to the submission guidelines and include your files in that directory.
Zip the directory and upload it to Canvas according to the submission guidelines.


Computer Science and Engineering @ UTA	Farhad Kamangar
Last updated: 2024-11-26

	Neural Networks Fall 2024
	Farhad Kamangar kamangar@uta.edu