
MongoDB Collections

The Courses Search application uses a MongoDB collection as backup storage for the UCSB Curriculum information.

While it might seem redundant to cache curriculum information in a separate database (vs. just going to the API each time), doing so provides several important advantages:

  • It may be faster to get data from the MongoDB collection than from the UCSB API for some queries.
  • Some query types are not supported directly by the UCSB API. In particular, the UCSB API does not support searches across multiple quarters; it can only retrieve data for one quarter at a time.
  • It provides a backup in case the UCSB API ever becomes unavailable temporarily or permanently.
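
As an illustration of the multi-quarter point, a search spanning several quarters can be expressed against the collection with MongoDB's `$in` operator. The sketch below only builds the filter. It assumes quarters are stored as strings in `courseInfo.quarter`, as in the queries on this page; the helper name `multiQuarterFilter` is hypothetical and not part of the application.

```javascript
// Hypothetical helper: build a filter that matches any of several quarters.
// Assumes quarters are stored as strings such as '20231' in courseInfo.quarter.
function multiQuarterFilter(quarters) {
  return { 'courseInfo.quarter': { $in: quarters } };
}

const twoQuarterFilter = multiQuarterFilter(['20231', '20232']);
console.log(JSON.stringify(twoQuarterFilter));

// In mongosh, with the database selected, the filter could be used as:
//   db.courses.find(twoQuarterFilter)
```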

In addition, it provides an opportunity to learn how to work with a MongoDB database.

The main collection

The main database in your MongoDB deployment should be called database, and the main collection should be called courses.

You can run queries like the one below, which finds all records for quarter 20231 (Winter 2023) with enroll code 07443.

{ 'courseInfo.quarter' : '20231', 'section.enrollCode' : '07443' }

image

As the screenshot shows, this query returns two records. The duplicate is the result of loading the data for this quarter twice without first defining a unique index to prevent duplicates (see Avoiding Duplicate Data below).
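
To check the whole collection for duplicates rather than one enroll code at a time, one option is an aggregation that groups documents by quarter and enroll code and keeps the groups with more than one member. The following is a sketch, not something taken from the application: the names `duplicatePipeline` and `findDuplicateKeys` are hypothetical, and the plain-JavaScript function is included only to show what the pipeline computes.

```javascript
// Aggregation pipeline that groups documents by (quarter, enroll code)
// and keeps only the groups containing more than one document.
const duplicatePipeline = [
  {
    $group: {
      _id: { quarter: '$courseInfo.quarter', enrollCode: '$section.enrollCode' },
      count: { $sum: 1 },
    },
  },
  { $match: { count: { $gt: 1 } } },
];

// Plain-JavaScript equivalent of the same grouping, for illustration only.
// Returns the 'quarter/enrollCode' keys that occur more than once.
function findDuplicateKeys(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const key = `${doc.courseInfo.quarter}/${doc.section.enrollCode}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([key]) => key);
}

// In mongosh, the pipeline could be run as:
//   db.courses.aggregate(duplicatePipeline)
```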

Avoiding Duplicate Data

To avoid duplicate data, define a unique index that prevents storing multiple documents with the same quarter and enroll code.

The goal is for the combination of courseInfo.quarter (e.g. '20231') and section.enrollCode (e.g. '07443') to be unique. The steps below define such an index:

Go to the index tab (the second tab on the MongoDB.com collections page):

image

Click the Create Index button at right:

image

That brings up this modal:

image

Fill in Fields with:

{
  "courseInfo.quarter": 1,
  "section.enrollCode" : 1
}

Then fill in Options with:

{unique:true}

So that it looks like this:

image

Then click “Review”.

Note that if the collection already contains duplicate data, adding this constraint will not remove it; MongoDB refuses to build a unique index while duplicate key values exist. You may need to drop the collection and recreate it with the constraint in place (i.e. add the constraint before you start adding data).
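
The same index can also be defined from a script or shell instead of the web page. The sketch below assumes a `collection` object that exposes `createIndex(keys, options)`, as the mongosh `db.courses` object and the Node.js driver's collection objects do. The helper name `createUniqueSectionIndex` is hypothetical.

```javascript
// Index definition matching the Fields and Options entered in the web UI.
const indexFields = { 'courseInfo.quarter': 1, 'section.enrollCode': 1 };
const indexOptions = { unique: true };

// Hypothetical helper: `collection` is assumed to expose
// createIndex(keys, options). The call's result is returned unchanged
// (a promise when the Node.js driver is used).
function createUniqueSectionIndex(collection) {
  return collection.createIndex(indexFields, indexOptions);
}

// In mongosh, after selecting the database:
//   createUniqueSectionIndex(db.courses)
```

With the index in place, an attempt to insert a second document with the same quarter and enroll code is rejected with a duplicate key error rather than stored.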

Querying Data in MongoDB

The animation below shows how to do some basic queries in MongoDB.

A few tips:

  • Use single quotes, not double quotes.
  • Query by simply listing the keys (using dot notation for nested fields) and the values those fields should have.
  • Examples:

    { 'courseInfo.quarter' : '20231'}
    
    { 'courseInfo.quarter' : '20231', 'courseInfo.title': 'INTRO DATA SCI 1' }
    
    { 'courseInfo.quarter' : '20231', 'courseInfo.title': 'INTRO DATA SCI 1' , 'section.enrollCode' : '07443'}
    

see-mongodb-data
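
To make the dot notation in these filters concrete, here is a simplified plain-JavaScript sketch of how an equality filter selects a document. The document shape is an assumption inferred from the field names in the queries above, and the helpers `getPath` and `matches` are hypothetical. MongoDB's real matching also handles arrays and query operators, which this sketch ignores.

```javascript
// Hypothetical document, shaped after the field names used in the queries above.
const sampleDoc = {
  courseInfo: { quarter: '20231', title: 'INTRO DATA SCI 1' },
  section: { enrollCode: '07443' },
};

// Follow a dotted path such as 'courseInfo.quarter' into nested objects.
function getPath(doc, path) {
  return path
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Simplified equality match: every key in the filter must equal
// the value found at that dotted path in the document.
function matches(doc, filter) {
  return Object.entries(filter).every(
    ([path, expected]) => getPath(doc, path) === expected
  );
}

console.log(matches(sampleDoc, { 'courseInfo.quarter': '20231', 'section.enrollCode': '07443' })); // true
console.log(matches(sampleDoc, { 'courseInfo.quarter': '20232' })); // false
```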

Queries with regex

You can use regular expressions to match course ids that start with a certain subject area. For example, this will find courses in quarter 20231 where the courseInfo.courseId field starts with MATH. The ^ anchors the match to the start of the field; without it, the pattern would match MATH anywhere in the course id:

{ 'courseInfo.quarter' : '20231', 'courseInfo.courseId': { $regex: /^MATH/ }}
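
If the subject area comes from user input, it may be worth building the filter in code and escaping any regex metacharacters first. The sketch below is one way to do that; the helper names `escapeRegex` and `subjectFilter` are hypothetical, and the pattern is passed to `$regex` as a string with a leading `^` so that only course ids beginning with the subject area match.

```javascript
// Escape characters that have a special meaning in regular expressions.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: filter for one quarter and a course id prefix.
// The leading '^' anchors the pattern to the start of courseInfo.courseId.
function subjectFilter(quarter, subjectArea) {
  return {
    'courseInfo.quarter': quarter,
    'courseInfo.courseId': { $regex: '^' + escapeRegex(subjectArea) },
  };
}

const mathFilter = subjectFilter('20231', 'MATH');
console.log(JSON.stringify(mathFilter));

// In mongosh: db.courses.find(mathFilter)
```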