
Cloud Assignment 1

Purdue University CS351 Spring 2026

Due February 6th 11:00PM

Introduction

In this assignment, you will launch your own Software-as-a-Service (SaaS) on an Infrastructure-as-a-Service (IaaS) platform. You will become familiar with virtual machines using the EC2 service and practice deploying a real application to the cloud. You'll use practical Linux commands, learn DevOps principles, and work within the Python ecosystem, as well as gain some insight into web application architecture. The experience of working through this assignment will help motivate why the cloud has seen such rapid adoption, and show you how running software on the cloud feels. Follow the instructions carefully, and you'll move through this assignment quickly. This is your first step into cloud engineering, so it may feel uncomfortable or frustrating at times. Running software across multiple architectures, services, and regions is not simple, and over the next few assignments you'll learn to tame the complexity that you get a taste of today.

Contents

  1. Initial Setup
    1. Set AWS Budget
    2. Authorize Autograder
  2. Application Design
    1. Login Page
    2. Image Upload Page
    3. Detection Result Page
  3. Application Architecture
    1. Preparing project environment
    2. Notes on our Django application
  4. System Architecture
    1. Create EC2 virtual machine
    2. Connect to the VM
    3. Add swap to the VM
    4. Add autograding access to the VM
  5. Deploying Django to EC2
    1. Syncing source code
    2. Preparing the VM
  6. Teardown

Initial Setup

Use an account eligible for the AWS Free Tier. Make a post on Ed if you don't have an eligible account. This assignment will use AWS region "us-east-1". Do not use AWS services in any other region. You can control your current region from the menu bar of the AWS Console.

https://us-east-1.console.aws.amazon.com/console/home?region=us-east-1#
AWS Console Regions menu with cursor pointing at us-east-1

You may have performed these steps in Cloud Assignment 0. If you got a 100% on the assignment and haven't changed those resources, then you can skip the "Initial Setup" section.

Set AWS Budget

We're going to set up an AWS Budget that will alert you if you get any monetary charges.

  1. Navigate to the Billing feature in the AWS Billing and Cost Management Service.
  2. Click "Create Budget"
  3. Select "Use a template" then "Zero spend budget"
  4. Add your email to the "Email recipients" text area.
  5. Click "Create budget"

Authorize Autograder

To allow instructors to grade your work and ensure authenticity of your submissions, we require read-only access to your AWS account. Follow the steps below to set up access.

  1. Navigate to IAM (Identity and Access Management) in your AWS account.
  2. Go to the "Users" tab.
  3. Create a new user:
  4. Set up an access key for this user
  5. Include the access key and the secret key in your submission

Ensure that this account remains unchanged until the course concludes. Afterwards, you can remove it. Note: If the keys are no longer accessible, we will not be able to grade your assignments. If you delete and re-create your keys, make sure to update the key values in your submitted credentials file.

Now create an empty file called "credentials", and populate it with:

credentials
[default]
aws_access_key_id = access key id you just generated
aws_secret_access_key = secret access key you just generated

Replace the placeholder values above (the text after each "=") with the values from the access key you attached to the autograder user. Remove any quotation marks surrounding the values. This file should be an ASCII text file. You can create one using vim, vscode, or another plain text editor.
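Before submitting, it can help to sanity-check the file's shape. The credentials file uses INI-style syntax, so the sketch below parses it with Python's standard configparser module and reports common mistakes (missing section, missing keys, stray quotation marks). The function name check_credentials and the sample key values are made up for illustration; this is not the autograder's own check.

```python
import configparser

# Placeholder values for illustration only; never hard-code real keys.
SAMPLE = """\
[default]
aws_access_key_id = EXAMPLE_ACCESS_KEY_ID
aws_secret_access_key = EXAMPLE_SECRET_ACCESS_KEY
"""

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def check_credentials(text):
    """Return a list of problems found in credentials-file text (empty if none)."""
    problems = []
    if not text.isascii():
        problems.append("file contains non-ASCII characters")
    # interpolation=None keeps configparser from treating '%' specially.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        return problems + [f"not valid INI syntax: {exc}"]
    if not parser.has_section("default"):
        return problems + ["missing [default] section"]
    for key in REQUIRED_KEYS:
        value = parser.get("default", key, fallback="")
        if not value:
            problems.append(f"missing {key}")
        elif value[0] in "'\"" or value[-1] in "'\"":
            problems.append(f"{key} is wrapped in quotation marks")
    return problems


print(check_credentials(SAMPLE))  # prints []
```

To check your real file, you could read it with open("credentials").read() and pass the text to check_credentials. Keep the output out of screenshots and logs, since error messages may echo parts of the file.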

🚨 Important

Safekeep the credentials file. You will use it for all assignment submissions during the semester. AWS access key secrets and IDs are highly privileged information and you must take care to keep them secure. DO NOT share them publicly, DO NOT store them on an untrustworthy service or device, DO NOT take pictures or screenshots of them, and avoid committing them to source control unless you encrypt them and secure the encryption key. IF YOU SHARE YOUR KEY PUBLICLY OR ARE OTHERWISE CONCERNED ABOUT SECURITY, delete the key in question and create a new one.

Now you can submit this file to CA1 on Gradescope. At this point, you should pass test cases 1.1, 1.2, and 1.3. Each time you resubmit, the autograder will go through the current configuration of your AWS account, showing you your progress in the assignment.

As you work on the assignment, you can use the Gradescope autograder to make sure that you complete all the objectives. Because part of the cloud engineering experience is to navigate the documentation and do your own research, we will not guide you through each command, as you can figure them out by consulting relevant sources, for example this one. Moreover, this gives you more freedom in how you approach the problem, and as long as your solution passes all the test cases you'll receive full credit.

Application Design

Our SaaS is a startup called "Bird.ai". It solves the important problem of finding a bird in a photo. With sunny optimism you expect to make millions off this software from the sleepy comfort of your dorm room. Let's get started on our lottery ticket.

Our SaaS web app will have three pages:

Login Page

Users will log in with a username and password. (You will manually create a user later in the instructions; there won't be a user registration page.) Requiring users to authenticate before being able to view any pages is some of the strongest security a SaaS can have.

http://localhost:8000/login
Bird.ai Login Page

Image Upload Page

Users will be able to upload an image to our application and be redirected to the results page.

http://localhost:8000/
Bird.ai Image Upload Page

Detection Result Page

The results page will show if and where a bird has been detected.

http://localhost:8000/result/
Bird.ai Detection Result Page (Successful)

Application Architecture

Our application will use a web framework called Django. Every web application has to conform to the conventions of the Hypertext Transfer Protocol (HTTP) and uses web browser features to power its user experience. Because of this common medium, web applications generally share a lot of features and characteristics, regardless of programming language or use-case. Frameworks abstract implementation details into an API that puts common tasks easily in reach. Why re-invent the wheel each time you start a new project? A framework's pitch promises you can focus on the essential complexity of the problem you're solving, not the incidental complexity of the web as a platform.

Django is a commonly used Python web framework. It has good documentation, tutorials, a large community, and is an actively developed open source project. These are all important factors to consider when selecting a framework.

Our cloud environment will be running a Linux operating system. If you are on Windows, I recommend using Windows Subsystem for Linux (WSL) for this project. If you are on macOS and would like to use a Linux virtual machine too, I recommend OrbStack.

We'll begin by setting up the Django application locally before deploying it to our cloud server. Be patient: Python has a historically messy packaging system. Working through its quirks now will show you later how useful other course topics like "Containers" and "Automation" are, giving you a feeling for why we adopt certain practices.

Preparing project environment

First, create a project directory, let's call it "cloud_assignment_1". Then, download the bird.ai.zip archive from Brightspace and place it in your "cloud_assignment_1" directory to extract it.

Terminal
$ mkdir -p cloud_assignment_1
$ cd cloud_assignment_1
$ unzip bird.ai.zip

Next, install Python 3.14.2 on your machine. Then, in your project directory, we'll initialize a virtual environment to manage our software dependencies. Reproducibility is very important in modern cloud development. Learn more about virtual environments and the venv tool we're using. Our dependencies will be installed with pip, Python's main package manager. If you are struggling to manage multiple Python versions on your machine, check out pyenv. Note that a Python virtual environment is not a virtual machine; the two are very different. Python virtual environments and "requirements.txt" files have parallels in many other programming language communities, and will be an important part of working with containers in the future.

So, let's create a virtual environment and install Django. Make sure you are using Python 3.14.2.

Terminal
$ cd cloud_assignment_1
$ python --version
Python 3.14.2
$ python -m venv .venv
$ echo '.venv' >> .gitignore # important
$ source .venv/bin/activate # activate the virtual environment
(.venv) $ python -m pip install Django ultralytics whitenoise

Do not commit the virtual environment folder to source control. It contains modules compiled for your machine's hardware architecture, which others working on your project may not share. Your virtual environment is not active until you run source .venv/bin/activate, which adjusts your shell's environment variables so it can locate your project dependencies. It's just a shell script; take a look with cat .venv/bin/activate. When you're done working on the project, run deactivate or simply close the shell session.
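If you're ever unsure whether the python you are running belongs to the virtual environment, you can ask the interpreter itself. The sketch below relies on the standard library's sys.prefix and sys.base_prefix, which differ when the interpreter was started from a venv; the helper name in_virtual_environment is ours.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory (e.g. .venv),
    # while sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("virtual environment in use:", in_virtual_environment())
```

Run it once before and once after source .venv/bin/activate and compare the interpreter paths.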

Next we'll see how to save a requirements.txt file that allows others to reproduce the virtual environment we just created. Remember to run pip help and pip help <command> to learn more about the tools we are using.

Terminal
(.venv) $ pip freeze > requirements.txt # notice virtual environment is active
(.venv) $ cat requirements.txt | grep Django
Django==5.2.10
(.venv) $ django-admin --version
5.2.10
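A requirements.txt produced by pip freeze is just a list of exact version pins, one name==version per line. As a small illustration of that format (not a replacement for pip), the sketch below turns such text into a dictionary. The function name parse_pins is hypothetical, and the sample lines use the versions that appear in this assignment's pip output.

```python
# First lines of a requirements.txt as produced by `pip freeze`
# (versions taken from the pip output shown in this assignment).
SAMPLE = """\
asgiref==3.11.0
Django==5.2.10
sqlparse==0.5.5
"""


def parse_pins(text):
    """Map package name -> exact version for every `name==version` line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


pins = parse_pins(SAMPLE)
print(pins["Django"])  # prints 5.2.10
```

Because every dependency is pinned to an exact version, anyone installing from this file gets the same package versions you tested with.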

Let's prove we're able to recreate our project's environment.

Terminal
(.venv) $ deactivate # exit the virtual environment
$ django-admin --version
bash: django-admin: command not found
$ rm -rf .venv # delete the project dependencies
$ python -m venv .venv # Now recreate our project environment from scratch
$ django-admin --version # We're not there yet
bash: django-admin: command not found
$ source .venv/bin/activate # next step
(.venv) $ django-admin --version # but still not there yet
bash: django-admin: command not found
(.venv) $ pip install -r requirements.txt # install from our dependency list
Collecting asgiref==3.11.0 (from -r requirements.txt (line 1))
Using cached asgiref-3.11.0-py3-none-any.whl.metadata (9.3 kB)
Collecting Django==5.2.10 (from -r requirements.txt (line 2))
Using cached django-5.2.10-py3-none-any.whl.metadata (4.1 kB)
Collecting sqlparse==0.5.5 (from -r requirements.txt (line 3))
Using cached sqlparse-0.5.5-py3-none-any.whl.metadata (4.7 kB)
...
(.venv) $ django-admin --version # now we've fully recreated the environment
5.2.10

Now our Python project has many of the pieces needed for reproducibility. Just a few more steps and we can run our SaaS app locally. Next, using Django's "createsuperuser" command, we're going to create a username and password you'll use to log in to the application. Running this command will initialize a user in the database with "superuser" privileges. In response to the CLI prompt, set the username to "purdue" and the password to "purduecsinthecloud".

Terminal
(.venv) $ cd bird.ai
(.venv) $ python manage.py collectstatic
(.venv) $ python manage.py migrate
(.venv) $ python manage.py createsuperuser # fill in above username and password

We can finally run the web application server. It will take a moment to start up the first time, because our machine learning model is initializing and installing its dependencies. When it's ready, visit http://localhost:8000 and log in with the user you created.

Terminal
(.venv) $ python manage.py runserver

🌈 The More You Know

localhost is a hostname representing your own computer. It corresponds to a special IP address called the loopback address (127.0.0.1). IP addresses let you find computers connected across networks. Often, if you are using a piece of software that's meant to accept network connections (like a web server that wants to be reachable from the Internet), you'll have it bind to a network interface and listen on a port. If a web request is a mail package, the IP address is the city and state while the port is the street address. The Internet will get the package to your city, while the operating system is the mailman who moves the package from the city's post office to the web server's mailbox on the street. The web server will pick up the package, open it, look at the return address, and send you a new package in response. For regular networks, sending mail like this involves fiber optics, Ethernet cables, Internet routers, and expensive hardware. The loopback address belongs to a virtual network interface, implemented in software, and traffic sent to it never leaves your machine. (Later, you'll see another special address, 0.0.0.0. When a server listens on it, it means "all network interfaces on this machine".) The opened package contains a payload in the shape (really, the "protocol") your server was expecting. When we visited the URL http://localhost:8000, the package we sent to our web server contained a message in the shape of HTTP, the lingua franca of the World Wide Web.
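To make the mail analogy concrete, here is a minimal sketch that uses only Python's standard library: a tiny web server binds to the loopback address, and a client in the same process sends it one HTTP request. The handler class HelloHandler and the function fetch_over_loopback are made up for this illustration and have nothing to do with Django's own server.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello from the loopback interface\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet; the default logs each request to stderr


def fetch_over_loopback():
    """Start a server on 127.0.0.1, send it one request, return (status, body)."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # An empty ProxyHandler makes sure the request is not sent through a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"http://127.0.0.1:{port}/") as response:
            return response.status, response.read()
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_over_loopback()
print(status, body.decode().strip())
```

Nothing in this exchange touches a physical network card: the request and the response both travel through the software-only loopback interface.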

Notes on our Django application

Run python manage.py --help to see a list of project-specific commands you use to develop and deploy this application. Remember how we don't want to re-invent the wheel? These commands take care of common tasks for us.

By default, Django is configured to use SQLite as its backing database. In SQLite, databases are single files, which makes them easy to copy around.

Django projects are organized as a collection of applications. For this assignment, we have a single application called "app", which implements all our SaaS features. Django is a Model-View-Controller (MVC) web framework. Many user interfaces are developed in an MVC style. In Django, "Views" determine what a user sees based on HTML templates, "Controllers" decide what actions occur when a user navigates the website and triggers an HTTP request, and "Models" describe the schema of any data persisted in the database between user actions. This will become concrete as you explore our bird recognition application.

Django controllers are defined in a file named views.py. It's a misleading name (Django has been around a while), but we know it's a controller because it decides what to do in response to an HTTP request. The HTML template files in the templates folder are views: they accept arguments and transform those arguments into a format that defines what the user sees. This application doesn't define any of its own models, but it does use models defined in Django's built-in library.

Controllers are mapped to an HTTP URL in the urls.py file. When the Django application receives a web request for "localhost:8000/", it performs a lookup in the urlpatterns object to match the path "/" to a controller, and then passes the request along to that controller for handling.
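The lookup described above can be pictured with a few lines of plain Python. This is a deliberately simplified stand-in, not Django's implementation: real urlpatterns entries are built with Django's path() helper and support things like path parameters. The controller names and the URLPATTERNS list below are hypothetical; only the three paths come from this assignment.

```python
def login_page(request):
    return "login page"


def upload_page(request):
    return "upload page"


def result_page(request):
    return "result page"


# Simplified stand-in for urlpatterns: (path, controller) pairs checked in order.
URLPATTERNS = [
    ("/login", login_page),
    ("/", upload_page),
    ("/result/", result_page),
]


def dispatch(path, request=None):
    """Find the first pattern equal to the path and call its controller."""
    for pattern, controller in URLPATTERNS:
        if pattern == path:
            return controller(request)
    return "404 Not Found"


print(dispatch("/"))         # prints upload page
print(dispatch("/missing"))  # prints 404 Not Found
```

Open urls.py in the bird.ai project and compare: you should be able to trace each page of the application from its URL to the controller that handles it.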

Bird detection occurs in a controller. It's accomplished using a You Only Look Once (YOLO) convolutional neural network model. We're using Ultralytics' latest, specifically the "YOLOE-26: Open-Vocabulary Instance Segmentation" model with the fewest number of parameters available (YOLOE-26n-seg). This lets us use natural language to describe what we would like to detect while still achieving good performance on a relatively weak EC2 instance type.

System Architecture

The system architecture for our SaaS begins with an EC2 virtual machine. You will be spinning up a t4g.small. It has an AWS Graviton 2 processor, an Arm-based chip designed by Amazon that AWS promotes as offering the best price/performance ratio among the chips on its platform.

Let's take a look at some of the free tier instance types:

Type       CPU                                   Price/hr  vCPU  Memory  Network  Block Storage  Hypervisor Schedule
t3.micro   Intel Xeon Platinum 8000 1st/2nd Gen  $0.0104   2     1 GiB   5 Gbps   2.8 Gbps       Burst
t3.small   Intel Xeon Platinum 8000 1st/2nd Gen  $0.0208   2     2 GiB   5 Gbps   2.8 Gbps       Burst
t4g.micro  AWS Graviton 2                        $0.0084   2     1 GiB   5 Gbps   2.8 Gbps       Burst
t4g.small  AWS Graviton 2                        $0.0168   2     2 GiB   5 Gbps   2.8 Gbps       Burst

These free tier instances are some of the cheapest provided by AWS, but why are they so cheap? An instance type is defined both by the hardware composing the server, and by the resource allotment provided by the hypervisor. The t3 and t4g families are "burstable" instances. "Burstable" is AWS's answer to the bin packing problem and noisy neighbor problem of virtual machines. AWS wants to fit as many VMs as possible on a server, but VMs compete for resources: you can't respond to an HTTP request on a cloud server if another VM is monopolizing the physical network interface card you need. Hypervisors therefore control each virtual machine's access to resources by enforcing a schedule. AWS uses its own hypervisor, "Nitro", for most of its EC2 instances.

t3 and t4g instances are heavily throttled by the hypervisor: they are only guaranteed a baseline of 10%-20% of each vCPU. A VM accumulates "CPU Credits" while it runs below that baseline, and later spends them with the hypervisor to "burst" above it. CPU credits govern CPU time; network and disk bandwidth on these instances are burstable as well, under separate credit-style mechanisms. If you limit a virtual machine's resource consumption by a factor of 10, you've increased how many virtual machines you can fit on a server by roughly a factor of 10. Certain customer workloads are bursty enough over time that they love the cost savings and aren't affected by throttling, because they usually have enough CPU credits to "pay" the hypervisor for more resources.
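A little arithmetic makes the credit system easier to reason about. AWS defines one CPU credit as one vCPU running at 100% for one minute, so an instance earns vCPUs × baseline × 60 credits per hour. The sketch below assumes the 20% per-vCPU baseline that AWS documents for the "small" sizes (the table above does not list baselines, so treat that number as an assumption and check the current documentation). The function names are ours.

```python
def credits_earned_per_hour(vcpus, baseline_percent):
    """One CPU credit = one vCPU at 100% utilization for one minute."""
    return vcpus * baseline_percent * 60 / 100


def net_credits_per_hour(vcpus, baseline_percent, utilization_percent):
    """Credits gained (positive) or spent (negative) per hour at a steady utilization."""
    spent = vcpus * utilization_percent * 60 / 100
    return credits_earned_per_hour(vcpus, baseline_percent) - spent


# t4g.small: 2 vCPUs, assumed 20% baseline per vCPU
print(credits_earned_per_hour(2, 20))    # prints 24.0
print(net_credits_per_hour(2, 20, 5))    # prints 18.0 (mostly idle: credits accumulate)
print(net_credits_per_hour(2, 20, 100))  # prints -96.0 (full load: credits drain)
```

Under these assumptions, a mostly idle web server banks credits all day, while sustained full load drains the balance four times faster than an idle hour can refill it. That is the trade the hypervisor's "burst" schedule is making.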

AWS has turned a negative user experience into a profitable product. It charges you less for VM instances that have noisy neighbors and uses the hypervisor to enforce the fairness of a "burst" schedule. That schedule also gives AWS a statistic it can use to optimize how many VMs it packs onto a single server, increasing its profit.

Create EC2 virtual machine

For this part of the assignment, you are required to have a virtual machine named "ca1" running. It should be configured with the Amazon Linux operating system. You also need to configure it with the AWS free tier in mind, and add 30GB in block storage.

Note: the AWS free tier for EC2 changed on July 15, 2025. If your account was created after July 15, 2025, use the t4g.small free tier instance. If your account was created before July 15, 2025, reach out to the instructional staff on Ed.

Connect to the VM

Use ssh and the key pair assigned to the VM in the previous step. You might find ssh -i useful.

Add swap to the VM

Our EC2 instance is small, and when it runs short on memory the kernel's out-of-memory (OOM) killer may reclaim some by killing processes. To prevent memory spikes from closing SSH connections, killing web servers, or crashing the VM, we'll register 2GB of overflow space on disk, called swap.

Terminal
$ sudo fallocate -l 2G /swapfile_2G
$ sudo chmod 600 /swapfile_2G
$ sudo mkswap /swapfile_2G
$ sudo swapon /swapfile_2G
$ sudo swapon --show
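Besides swapon --show, Linux reports swap in /proc/meminfo, which is handy when you want to check it from a script. The sketch below parses the SwapTotal line from text in that file's format. The sample numbers are illustrative rather than copied from a real VM, and the function name swap_total_kib is hypothetical.

```python
def swap_total_kib(meminfo_text):
    """Return the SwapTotal value (in KiB) from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Lines look like "SwapTotal:       2097148 kB"
            return int(line.split()[1])
    return None


# Illustrative excerpt; on the VM you would read the real file instead:
#   with open("/proc/meminfo") as f: text = f.read()
SAMPLE = """\
MemTotal:        1900000 kB
SwapTotal:       2097148 kB
SwapFree:        2097148 kB
"""

kib = swap_total_kib(SAMPLE)
print(f"{kib / 1024 / 1024:.2f} GiB of swap")  # prints 2.00 GiB of swap
```

If SwapTotal reads 0 on your VM, the swapon step did not take effect.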

Add autograding access to the VM

Inside the VM go to the /home/ec2-user/.ssh/authorized_keys file and append the following key for use by the autograder:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAORDoNEnPakX83lU2uUDnMWQm4L4ytchU3Mi80Uw+H7

Deploying Django to EC2

Our application is working, but only on our personal computer. We'll deploy it to a cloud provider so Bird.ai's customers can access this important service from anywhere in the world.

Syncing source code

We need to put our application files on the VM where they can be executed by the Python interpreter. Any assets we may have, such as images, files, or styles, have to be on the VM too.

There are a lot of ways to move files from one machine to another, and one of the fastest is rsync. Install and use rsync to copy the directory bird.ai into the home directory of the EC2 instance at ~/bird.ai. Also sync over the requirements.txt file into the home directory. Do NOT sync over the .venv directory; you will recreate it on the server. (Hint: you can set ssh options using the -e flag and transfer directories using -a. Learn more by reading man rsync.)

Preparing the VM

ssh into your EC2 instance "ca1". Make sure to install Python 3.14.

Terminal
$ sudo dnf update -y
$ sudo dnf install mesa-libGL git-core gcc make patch zlib-devel bzip2-devel \
   readline-devel sqlite-devel openssl-devel tk-devel libffi-devel xz-devel -y
$ curl https://pyenv.run | bash
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
$ echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
$ echo 'eval "$(pyenv init - bash)"' >> ~/.bashrc
$ echo '3.14.2' > ~/.python-version
$ source ~/.bashrc
$ pyenv install

cd into the bird.ai directory. Recreate and activate the virtual environment using the steps above. You will also have to re-run the python manage.py commands that migrated the database, created a user, and prepared any static files.

Once everything is ready, you can start the server listening on all network interfaces (the wildcard address 0.0.0.0) at port 8000.

Terminal
(.venv) $ python manage.py runserver 0.0.0.0:8000

You almost have your very own SaaS, just one more step. Update the security group for your EC2 instance to allow inbound "Custom TCP" traffic to port "8000" from any IP address (CIDR "0.0.0.0/0").
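If CIDR notation is new to you, Python's standard ipaddress module is a quick way to see what a rule like "0.0.0.0/0" matches. The sketch below is only an illustration of CIDR matching, not of how AWS evaluates security groups internally. The helper name allowed and the sample addresses (taken from ranges reserved for documentation) are ours.

```python
import ipaddress


def allowed(source_ip, cidr):
    """True when source_ip falls inside the CIDR block."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(cidr)


# 0.0.0.0/0 has a zero-length prefix, so every IPv4 address matches it.
print(allowed("203.0.113.7", "0.0.0.0/0"))        # prints True
# A /24 only matches addresses that share its first 24 bits.
print(allowed("203.0.113.7", "198.51.100.0/24"))  # prints False
```

Opening a port to 0.0.0.0/0 is what makes the service reachable by the whole Internet, which is what we want for this assignment but is something to think twice about for anything sensitive.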

Now you can visit your web-based Software-as-a-Service at "http://<your-ec2-instance-public-ip>:8000" and upload a photo to check if a bird is present.

Happy birding!

Teardown

Make sure that you remove all resources you created after obtaining a full score on Gradescope. Keep in mind that running instances count against your free tier allowance. After you stop an instance, you are no longer charged usage or data transfer fees for it. However, you will still be billed for associated Elastic IP addresses and EBS volumes. Make sure to terminate (delete) any instances instead of just stopping them. The only resource we created in this assignment is the "ca1" EC2 instance.

https://us-east-1.console.aws.amazon.com/ec2/home?region=us-east-1#Instances:
EC2 Console: Terminate instance button