Legacy Code: Courses Search
Courses Search is an application that provides a more functional version of the official public-facing course search app available at https://my.sa.ucsb.edu/public/curriculum/coursesearch.aspx
Our version, currently deployed at https://courses.dokku-00.cs.ucsb.edu, provides many more features. For example:
- Search by instructor (What courses has Diba Mirza taught?)
- Search by course over a range of quarters (Who has taught CMPSC 130A over time?)
- and many more
Explanation | Link |
---|---|
Source code | https://github.com/ucsb-cs156/proj-courses |
Production Deployment | https://courses.dokku-00.cs.ucsb.edu |
QA Deployment (for CMPSC156 course staff) | https://courses-qa.dokku-00.cs.ucsb.edu |
UCSB_API_KEY values
To deploy proj-courses, in addition to the usual GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET needed for other OAuth apps in this course, you will also need a value for UCSB_API_KEY. This is a key that gives you access to the API for UCSB course information. These keys are obtained from the website https://developer.ucsb.edu.
You can request your own account, but it is typically faster to get a key from the instructor, who will provide it to you on your team Slack channel.
When setting up API keys, we typically enable all of the API endpoints that are not sensitive or protected (i.e. the auto-approved endpoints). As of this writing, those are:
- Academics - Events
- Academics - Academic Quarter Calendar
- Academics - Curriculums
- Dining - Dining Commons
- Dining - Dining Menu
- Students - Student Record Code Lookups
The ones that are currently used in the app are:
- Academics - Curriculums
- Students - Student Record Code Lookups
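Before deploying, it can help to confirm that none of the values named above are missing. The following is a minimal sketch in Python (used here only for illustration; proj-courses itself is a Java/Spring Boot app). It assumes the values are kept in a simple `KEY=value` text file; the helper names `parse_env` and `missing_keys` are hypothetical and not part of the project.

```python
# Keys named in this section; add others (e.g. MONGODB_URI) as needed.
REQUIRED_KEYS = ("GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET", "UCSB_API_KEY")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or have an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    example = "GOOGLE_CLIENT_ID=abc\nUCSB_API_KEY=\n"  # made-up contents
    print(missing_keys(example))
```

A key with an empty value is reported as missing, since an empty `UCSB_API_KEY` would be no more useful than an absent one.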
MongoDB Collections
The Courses Search application uses a MongoDB collection as backup storage for the UCSB Curriculum information, so you will also need to set up a MongoDB collection and put the URL for that collection in your environment variables. Read on for more explanation.
Why MongoDB for UCSB Courses Search
While it might seem redundant to cache curriculum information in a separate database (vs. just going to the API each time), doing so provides several important advantages:
- It may be faster to get data from the MongoDB collection than from the UCSB API for some queries
- Some query types are not supported directly by the UCSB API; in particular, the UCSB API does not support searches across multiple quarters; it can only retrieve data for one quarter at a time.
- It provides a backup in case the UCSB API ever becomes unavailable temporarily or permanently.
In addition, it provides an opportunity to learn about how to work with a MongoDB database.
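Because the UCSB API returns one quarter at a time, a search over a range of quarters has to enumerate the quarters in that range. Here is a small sketch of that enumeration in Python (for illustration only; the app is written in Java). It assumes quarter codes have the form `YYYYQ` with `Q` running from 1 to 4; this document only confirms that `20231` means Winter 2023, so the rest of the encoding is an assumption, and the function name `quarters_between` is hypothetical.

```python
def quarters_between(start, end):
    """List YYYYQ quarter codes from start to end, inclusive.

    Assumes the last digit runs 1..4 within each year (an assumption;
    the document only states that 20231 is Winter 2023).
    """
    year, quarter = int(start[:4]), int(start[4])
    end_year, end_quarter = int(end[:4]), int(end[4])
    codes = []
    while (year, quarter) <= (end_year, end_quarter):
        codes.append(f"{year}{quarter}")
        quarter += 1
        if quarter > 4:  # roll over to the first quarter of the next year
            year, quarter = year + 1, 1
    return codes


if __name__ == "__main__":
    print(quarters_between("20224", "20232"))
```

With the data cached in MongoDB, the same range can instead be covered by a single query, which is the advantage described above.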
Getting the value for MONGODB_URI
In the `.env` file for `proj-courses`, you’ll need a value for `MONGODB_URI`. That URI contains the hostname, username, password, and other information needed to connect to a MongoDB database provided by MongoDB.com.
To start, in order to keep things simple, we will use the same database credentials for localhost, dev, qa and prod. We acknowledge that this is not in keeping with “best practices” which would keep these separate (as they are for the SQL databases). You have the option of learning how to create your own databases on MongoDB.com and then getting separate credentials for these if this becomes an issue that you need to tackle.
To get the value for MONGODB_URI, here’s what you need to do.
- Log in to MongoDB.com; you should have gotten an invitation in your email to a project that matches your team name (e.g. `f23-7pm-1`, `f23-7pm-2`, etc.) under an organization that matches your class name (e.g. `ucsb-cs156-f23`).
- Find your project. That should be a page that looks like this:
- On the card for the deployment, there should be a `Connect` button, as shown below. Click that button.
- That brings up this modal. Click Drivers (the first option).
- That takes you to a page like this one. The page asks you to choose a driver, but it turns out that for both the default option (`Node.js`) and the option we are actually using (`Java`), the URI is exactly the same, and that’s all we need here. You will get the URI from the box as shown below. Note, however, that this is not the final form of the URI; we’ll have to do some editing here. But go ahead and copy that URI and paste it into your `.env` file as the value of `MONGODB_URI`. For example, if the screen looks like this:

  Then in your `.env` file, you’ll start with this:

  MONGODB_URI=mongodb+srv://user:<password>@cluster0.vwrcszq.mongodb.net/?retryWrites=true&w=majority
- Now, we need to do two things. The first one is to add in the name of our database, which we are calling `database` in order to keep things simple. The text `database` goes right after the single `/` and before the `?retryWrites`, like this:
  - OLD: MONGODB_URI=mongodb+srv://user:<password>@cluster0.vwrcszq.mongodb.net/?retryWrites=true&w=majority
  - NEW: MONGODB_URI=mongodb+srv://user:<password>@cluster0.vwrcszq.mongodb.net/database?retryWrites=true&w=majority

  Note that by convention, our projects use `database` as the database name, but if you use something else (say `potato`), then that name should go in this spot in the URI.
- I said there were two things. The second thing is to replace `<password>` (including the `<>`) with the actual password. Note that while there is a password already set for the username `user`, it is not possible to look it up; the only thing we can do is reset it. So the next few steps are to reset this password, and then copy that value in, replacing the `<password>` in the URI.

  Note that another option is to add additional users with names other than `user`; each of those can have a different password. This may be important, because otherwise, if everyone on the team is sharing the same password, any time you need to update it, you have to coordinate with the entire team; changing it changes it for everyone. So, it’s up to the team how to manage this. Decide whether it’s easier to have just one user/password combination, or whether it’s easier to create separate user/passwords for each team member.

  To change the password value, navigate to the main page of the project, so that it looks like this:

  Find `Database Access` on the left nav and click it. You’ll see a page like this one:

  Here, you have a choice: edit the user called `user` to reset the password, or create additional users. If you create additional users, you’ll need to change the `user` in the `MONGODB_URI` as well; for example, if you create `cgaucho`, then the `MONGODB_URI` changes like this:
  - OLD: MONGODB_URI=mongodb+srv://user:<password>@cluster0.vwrcszq.mongodb.net/database?retryWrites=true&w=majority
  - NEW: MONGODB_URI=mongodb+srv://cgaucho:<password>@cluster0.vwrcszq.mongodb.net/database?retryWrites=true&w=majority

  In any case, to change the password, click the `edit` button beside the user. You’ll see this:

  Click the `Edit Password` button.

  Click `Autogenerate Secure Password`. Then either click the `Copy` button to copy the password, or click the `Show` button and copy it manually. Either way, it’s probably a good idea to `Show` the password and copy/paste it into your URI before you click `Update User` to make sure they match. And you must scroll down and click `Update User`! This is easy to forget because the `Update User` button is not visible until you scroll down. I have made this mistake dozens of times, and it’s a time-consuming one to make.

  Here’s what that looks like, including copying the password into the `MONGODB_URI` in `.env` and scrolling down to click Update User:
With that (along with configuring the other values in `.env`), you should be able to run `mvn spring-boot:run`.
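The URI edits described in the steps above (adding the database name, and substituting the username and password) can also be sketched as a small function. This is an illustration in Python, not part of proj-courses; the name `finalize_mongodb_uri` is hypothetical. One detail it adds: MongoDB connection strings expect special characters in the username or password (such as `@` or `/`) to be percent-encoded, so the sketch encodes both.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def finalize_mongodb_uri(template, password, database="database", username=None):
    """Fill in the database name, username, and password of a copied URI.

    template is the URI as copied from MongoDB.com, e.g.
    mongodb+srv://user:<password>@host/?retryWrites=true&w=majority
    """
    parts = urlsplit(template)
    # Split "user:<password>@host" into credentials and host.
    userinfo, _, host = parts.netloc.rpartition("@")
    user = username if username is not None else userinfo.split(":", 1)[0]
    # Percent-encode credentials so characters like @ or / do not break the URI.
    netloc = f"{quote(user, safe='')}:{quote(password, safe='')}@{host}"
    # The database name goes right after the single "/" and before the "?".
    return urlunsplit((parts.scheme, netloc, "/" + database, parts.query, parts.fragment))


if __name__ == "__main__":
    copied = "mongodb+srv://user:<password>@cluster0.vwrcszq.mongodb.net/?retryWrites=true&w=majority"
    # "not-a-real-password" is a placeholder, not an actual credential.
    print(finalize_mongodb_uri(copied, "not-a-real-password"))
```

Passing `username="cgaucho"` yields the variant of the URI for an additional database user.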
To check if it is working, log in as an admin, and try loading courses for a quarter, like this:
As shown in the animation above, if the MongoDB database is working, you’ll be able to see messages indicating success.
At that point, you can also browse the MongoDB collection manually, as shown below.
Navigating to the databases and collections
Your team should have a deployment with the default name (typically `cluster0`). Inside this deployment, there should be a database called `database`, which can contain multiple collections.
Collections function similarly to tables in an SQL database, although they have a different structure: they are key/value stores that function similarly to JSON objects. In fact, they use a variant of JSON known as BSON (which stands for “Binary JSON”). It’s the same idea, except that internally the objects are not literally stored as JSON text, but in a binary encoding of JSON.
To navigate to your database and collections, follow this path in MongoDB.com:
From the deployment card, click `Browse Collections`:
You should see an interface similar to this one:
The collections themselves are explained in further detail below.
The main collection
The main database in your MongoDB deployment should be called `database`, and the main collection should be called `courses`.
You can run queries as shown below. This query finds all records for quarter `20231` (Winter 2023) and the enroll code `07443`.
{ 'courseInfo.quarter' : '20231', 'section.enrollCode' : '07443' }
As we can see here, this brings up two records; this is the result of loading the data for this quarter twice without first defining an index to prevent duplicates (as illustrated below).
Avoiding Duplicate Data
To avoid duplicate data, it is helpful to define an index that prevents storing multiple documents with the same quarter and enroll code. Here’s how.
First, we want to ensure that the combination of `courseInfo.quarter` (e.g. `20231`) and `section.enrollCode` (e.g. `07443`) is unique, so that we don’t end up storing duplicate data. We can do that by defining an index like this:
Go to the index tab (second over on the MongoDB.com collections page):
Click the `Create Index` button at right:
That brings up this modal:
Fill in `Fields` with:
{
"courseInfo.quarter": 1,
"section.enrollCode" : 1
}
Then fill in options with:
{unique:true}
So that it looks like this:
Then click “Review”.
Note that if you already have duplicate data, adding this constraint will not eliminate those. You may need to drop the collection and recreate it with the constraint in place (i.e. add the constraint before you start adding data.)
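To see whether a collection already contains duplicates for the quarter/enroll-code combination, one option is to count how often each combination occurs. The sketch below does this in plain Python over a list of dictionaries shaped like the documents in the `courses` collection; it is an illustration only, the function name `duplicate_keys` is hypothetical, and the sample records are made up (only the field names come from this document).

```python
from collections import Counter


def duplicate_keys(documents):
    """Return (quarter, enrollCode) pairs that occur more than once."""
    counts = Counter(
        (doc["courseInfo"]["quarter"], doc["section"]["enrollCode"])
        for doc in documents
    )
    return sorted(key for key, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Made-up sample: the first two records share quarter and enroll code.
    sample = [
        {"courseInfo": {"quarter": "20231"}, "section": {"enrollCode": "07443"}},
        {"courseInfo": {"quarter": "20231"}, "section": {"enrollCode": "07443"}},
        {"courseInfo": {"quarter": "20231"}, "section": {"enrollCode": "07450"}},
    ]
    print(duplicate_keys(sample))
```

An empty result would suggest the unique index can be added without first dropping the collection.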
Querying Data in MongoDB
The animation below shows how to do some basic queries in MongoDB.
A few tips:
- Use single quotes, not double quotes.
- Query by simply listing the keys (using dot notation for nested fields) and the values those fields should have.
- Examples:
{ 'courseInfo.quarter' : '20231'}
{ 'courseInfo.quarter' : '20231', 'courseInfo.title': 'INTRO DATA SCI 1' }
{ 'courseInfo.quarter' : '20231', 'courseInfo.title': 'INTRO DATA SCI 1' , 'section.enrollCode' : '07443'}
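To make the dot notation concrete, here is a rough Python sketch of how such a filter can be read: each dotted key is followed down through the nested document, and every listed value must match. This only mimics simple equality filters for illustration; it is not MongoDB's implementation (for example, it ignores arrays and query operators), and the helper names and the sample document are hypothetical.

```python
def get_path(document, dotted_key):
    """Follow a dotted key such as 'courseInfo.quarter' into nested dicts."""
    current = document
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # the path does not exist in this document
        current = current[part]
    return current


def matches(document, query):
    """True if every dotted key in the query has the given value."""
    return all(get_path(document, key) == value for key, value in query.items())


if __name__ == "__main__":
    # Made-up document using the field names shown in the examples above.
    doc = {
        "courseInfo": {"quarter": "20231", "title": "INTRO DATA SCI 1"},
        "section": {"enrollCode": "07443"},
    }
    print(matches(doc, {"courseInfo.quarter": "20231", "section.enrollCode": "07443"}))
    print(matches(doc, {"courseInfo.quarter": "20232"}))
```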
Queries with regex
You can use regular expressions to match course ids that start with a certain subject area. For example, this will find courses in quarter `20231` where the `courseInfo.courseId` field starts with `MATH`:
{ 'courseInfo.quarter' : '20231', 'courseInfo.courseId': { $regex: /^MATH/ }}
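A note on anchoring: a pattern without `^` matches the text anywhere in the field, while a pattern beginning with `^` matches only at the start. The sketch below shows the difference using Python's `re` module, whose behavior for these simple patterns is the same as in a MongoDB `$regex`. The course ids are sample values for illustration; `XMATH 1` is deliberately made up to show the difference, and the function names are hypothetical.

```python
import re


def contains(pattern_text, values):
    """Values in which the text appears anywhere (unanchored match)."""
    return [v for v in values if re.search(pattern_text, v)]


def starts_with(pattern_text, values):
    """Values that begin with the text (pattern anchored with ^)."""
    return [v for v in values if re.search("^" + pattern_text, v)]


if __name__ == "__main__":
    course_ids = ["MATH 3A", "CMPSC 130A", "XMATH 1"]  # sample values
    print(contains("MATH", course_ids))     # unanchored
    print(starts_with("MATH", course_ids))  # anchored at the start
```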
Staff Setup for proj-courses
This section is for the staff and instructor. Students and others are welcome to look at it, but it typically doesn’t pertain unless you are on the staff of the course (instructor/TA/LA).
Set up on MongoDB.com
Dokku does have the ability to create MongoDB instances, and eventually it would be nice to migrate to that solution. For the time being, however, we have not figured out how to take advantage of that capability and connect it to our Spring Boot code bases, so we are using the free tier of https://mongodb.com to provision MongoDB databases for our courses.
To set up resources for proj-courses for CS156 using MongoDB.com, here is how we’ve proceeded in the past.
First, log in to MongoDB.com with your UCSB Google Account.
Creating an organization
Here is how you navigate to the page where you can create a new organization, as illustrated in the animation below:
- Select the dropdown, upper left, that shows your organizations.
- Scroll to the bottom where it says `All Organizations`.
- On that page, a `Create New Organization` button appears at upper left.
The form to create a new organization has multiple pages. On the first page, put in the name of your organization (e.g. `ucsb-cs156-f23`), select `MongoDB Atlas`, and then scroll down and click `Next`.
On the next screen, take the default (which is that the option `Require IP Access List for the Atlas Administration API` is selected) and click `Create Organization`.
That takes you to this screen, where you can create projects:
Creating Projects in a MongoDB Organization
If you just created your organization, it will probably be selected as the default organization, upper left, but if not, select your organization like this:
Then, click `Projects` in the left nav to get to the Projects page, where you can see your projects and create a new project by clicking the `New Project` button. I suggest creating one project per team (for the teams that need a MongoDB deployment, which includes, at least, all teams working on proj-courses). Here, we’ll just create the projects, and deal with adding users as a separate step.
Now that we have one project per team, we’ll add users and create deployments.
Adding Users to a Project
Next, pull up a roster with the `@ucsb.edu` email addresses of the members of each of the teams for which you set up a project; in the example here, that’s `f23-7pm-1`, `f23-7pm-2`, `f23-7pm-3`, `f23-7pm-4`, as shown on the project page:
Choose the first project, e.g. `f23-7pm-1`, by clicking its name; that takes you to a page like this one:
Find the `Access Manager` tab at the top of the page, like this, and select `Project Access`:
That takes you here:
Clicking `Invite to Project` takes you here:
Regrettably, it does not seem to be possible to paste in multiple email addresses, even if separated by spaces or commas; this has to be done one at a time.
Paste in each email address to create an invitation. As you do, for each invitation, click `Project Owner` in addition to `Project Read Only`; this makes it unnecessary to click each individual box, and gives students full permission over their project.
Repeat for each student on the team. If you want to delegate this to a TA or LA, you can add them as organization owners (we’ll cover this below).
After inviting all students, it may be helpful to post a message on each of their team Slack channels, or on the proj-courses Slack channel, saying something like the following.
@channel Please look in your email for an invitation to create a MongoDB.com account, and join the project for your team (e.g. `f23-7pm-1`, `f23-7pm-2`, `f23-7pm-3`, `f23-7pm-4`). Please accept that invitation before your next scheduled class, or during your next scheduled class.
You will need this access in order to work on proj-courses. We'll explain more in class, and/or in future posts to this slack channel.
We'd like to see that all team members have done so by the end of the next scheduled class.
Creating Deployments
For each team (in course terms), i.e. each project (in MongoDB.com terms), it is now necessary to create a deployment. The deployment is a kind of virtual server where the MongoDB databases and collections are actually deployed.
To create a deployment for a team/project, navigate to the project page and select the `Data Services` tab (leftmost at top). If it is a new project with no deployments yet, you’ll likely see the very prominent `Create a deployment` card with a `Create` button in the middle of it, like this:
When you click that button you’ll be taken to this screen, where the obvious choice is the free tier. Click on that to select it so that it is outlined in dark green, like this (note that the free tier is not the default option!)
Then scroll down, where you can choose a provider: AWS, Google Cloud, or Microsoft Azure. At present, all three are options for the free tier, so choose whichever you like. AWS is the current default, so that’s what I usually choose.
The default region is `US East (Virginia)`; I typically change this to `US West (Oregon)` since we are located on the West Coast, but that’s again up to you.
I typically take the default name for the cluster:
Finally, click `Create`:
When you do, you’ll be directed to this screen where we’ll set up the security for the cluster:
For the first question, take the default (Username/Password authentication):
Scroll down to this page. Here, the default username will probably be the username from the email address of the person creating the deployment; you should not take this default. This username is the one that will be embedded in the credentials for the database, so it’s more typically something like `user`. I suggest changing it to `user`.
So change it to this:
The default generated password is fine; the one shown in this screenshot is NOT the one I used; I clicked the `Autogenerate Secure Password` button to create a new one after taking this screenshot.
Click `Create User`. Note that the password is now no longer available; it will be necessary, later, for the students to generate their own password when they want to embed the password in their URL. That’s fine; we’ll provide instructions to walk them through that.
The next step is to scroll down to this section, where you’ll need to change the IP address restrictions:
We need to change this to allow connections from all IP addresses. An alternative would be to get the public IP address of their dokku server, or the IP address range for UCSB, and put that in. That would be preferable, but as of this writing, we are still just leaving this open; we used to have to do that since we were using Heroku, and could not predict the IP address ranges that Heroku would be coming from when connecting to MongoDB.
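The trade-off between an open entry and a restricted one can be illustrated with Python's standard `ipaddress` module. This is only a sketch of how an allow list behaves, not MongoDB.com's own logic. It assumes the open entry is treated as the CIDR block `0.0.0.0/0`, and it uses the reserved documentation range `203.0.113.0/24` as a stand-in for a real server or campus range; both are assumptions, and `is_allowed` is a hypothetical helper.

```python
import ipaddress


def is_allowed(client_ip, access_list):
    """True if client_ip falls inside any CIDR block in access_list."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(entry) for entry in access_list)


# Assumption: the "entire internet" entry behaves like this CIDR block.
OPEN_TO_ALL = ["0.0.0.0/0"]
# Placeholder range reserved for documentation, standing in for a real one.
RESTRICTED = ["203.0.113.0/24"]

if __name__ == "__main__":
    print(is_allowed("198.51.100.7", OPEN_TO_ALL))  # any IPv4 client passes
    print(is_allowed("198.51.100.7", RESTRICTED))   # outside the range
    print(is_allowed("203.0.113.25", RESTRICTED))   # inside the range
```

With a restricted list, every machine that runs the app (including each developer's localhost) would need to fall inside a listed range.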
Here’s how to open the deployment to connections from all IP addresses:
- Enter `0.0.0.0` for the IP address, and `Entire internet` as the description.
- Add that entry.
- If desired, remove the specific entry for the IP address you were at when you created the deployment.
- Then click `Finish and Close` to save the changes.
That is illustrated here:
Updating Access
If you need to modify the user or password, you can get back to edit those by selecting `Database Access` from the left nav of the project page, as shown below. Be sure that if you change the password, you click `Save Changes`, or the change will not take effect and the old password will still be the one in use.