Week 16

Summary

April 27 - May 3

Meetings

Accomplishments

When mistakes mean bills

We had a student accumulate about $78 in AWS charges at the end of the semester.

I took two approaches to analyzing this problem: reviewing the snapshots we had of his account, and having Claude examine his AWS usage directly.

The snapshots indicated that he had left two Aurora db.r7g.large instances stopped and a t4g.small running for 6 weeks. Snapshots don’t cover all possible AWS services, and they are sparse data points that cannot give a clear picture of a student’s AWS use or tell us whether billing resulted from a cloud assignment or personal use.

I tried to have Claude examine his AWS usage more closely using the same role-assumption authentication we use for autograding, but he had disabled his role. So I had it estimate the costs based on the resources we were able to see. It’s possible that the full $78 was generated by leaving assignment resources running, but his bill screenshot showed service charges from WAF (not part of any cloud assignment) and data transfer (high considering assignment requirements).

These are clues that something else is going on, but the instructional site does not have enough data to perform a proper accounting. Neither the student billing feature nor the individual snapshot runs on a cron, which is an oversight I never attended to. I’ve recommended he go into AWS Cost Explorer and export the data as a CSV we can inspect.

Let’s consider how we’ll remedy this in the future:

Daily cost tracking is cheap (one or two API calls on a cron) and should have been the default already.

CS351-specific labels were never implemented in a cloud assignment, but they can be made a prerequisite for passing one, allowing us to enforce a billing attribution structure in student accounts.

Per-student assignment budgets require us to estimate the cost of completing a cloud assignment so we can set thresholds for alerts.

We can do a better job helping students manage costs by verifying that a student has properly torn down an assignment’s resources, either by proactive monitoring after they get a perfect score or by having the autograder for later assignments verify that previous assignment infrastructure has been torn down.
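As a rough illustration of the daily cost tracking and per-student budget ideas, here is a minimal sketch. It assumes the cron job has a Cost Explorer client for the student's account (for example `boto3.client("ce")` created from the assumed-role session). The function names, the budget figure, and the list of expected services are hypothetical and not part of the current site. The client is passed in as a parameter so the logic can be tried against a stub.

```python
from datetime import date, timedelta


def fetch_daily_costs(ce_client, day):
    """Return {service_name: cost_in_usd} for one calendar day.

    ce_client is assumed to behave like boto3.client("ce").
    """
    response = ce_client.get_cost_and_usage(
        TimePeriod={
            "Start": day.isoformat(),
            "End": (day + timedelta(days=1)).isoformat(),  # End date is exclusive
        },
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
    costs = {}
    for result in response["ResultsByTime"]:
        for group in result["Groups"]:
            service = group["Keys"][0]
            # Amounts come back as strings, e.g. "1.50"
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            costs[service] = costs.get(service, 0.0) + amount
    return costs


def check_budget(costs, budget_usd, expected_services):
    """Return warnings for one student's daily costs (hypothetical policy)."""
    warnings = []
    total = sum(costs.values())
    if total > budget_usd:
        warnings.append(
            f"daily spend ${total:.2f} exceeds budget ${budget_usd:.2f}"
        )
    # Flag spend on services the assignment does not use (WAF in this case).
    for service, amount in sorted(costs.items()):
        if service not in expected_services and amount > 0:
            warnings.append(f"unexpected service: {service} (${amount:.2f})")
    return warnings
```

Run once a day per student, this would be a single `get_cost_and_usage` call. The unexpected-service check is the part that would have surfaced the WAF charges early, before the end-of-semester bill.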

Cloud Assignment 4

The assignment took most students just under 2 hours to complete. It was atypical: it was an easier assignment, and students had 18 working days instead of the usual 10. Let’s take a look at the assignment graphs.

https://www.cs351.cloud/assignments/3/
https://www.cs351.cloud/assignments/3/analysis/

The students have improved. Almost all have pushed themselves into positive territory on the cloud assignments (one of the “Disengaged” cohort dropped the course, leaving two others). 48 students (~70%) received a perfect score on all of the last three cloud assignments.

https://www.cs351.cloud/assignments/3/analysis/

I deprovisioned all student deployments and destroyed the EKS cluster and all other AWS resources powering the CA4 assignment. The dashboard I created for CA4 is now a historical record of activity during the assignment.

https://www.cs351.cloud/ca4/

We incurred almost $500 in costs during this cloud assignment. While planning, I posted in Slack on April 13th that “expected spend during the 10 days of the assignment is ~$500”. We reached roughly that amount only after 18 days of running the assignment, so my post-launch cost optimization tweaks worked well.

https://www.cs351.cloud/cost/
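As a quick sanity check on the cost comparison, the daily burn rate can be worked out from the figures above. This is a back-of-the-envelope sketch that treats "almost $500" as an upper bound of $500 and assumes spend was spread evenly across the days, which is a simplification.

```python
# Figures from the post; the actual total is treated as an upper bound.
planned_total_usd = 500.0
planned_days = 10
actual_total_usd = 500.0
actual_days = 18

planned_daily = planned_total_usd / planned_days
actual_daily = actual_total_usd / actual_days
reduction = 1 - actual_daily / planned_daily

print(f"planned: ${planned_daily:.2f}/day")
print(f"actual:  ${actual_daily:.2f}/day (at most)")
print(f"daily burn rate reduced by about {reduction:.0%}")
```

Under those assumptions the assignment ran at no more than about $28 per day against a planned $50 per day, a reduction of roughly 44%.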

This concludes the cloud assignment experience this year. Thank you!