Sets - Usage

We've learned all about sets, but it's about time we explored some real-world examples of how they can be used.


We've covered how sets suit situations where you need a collection of items, the order is irrelevant, and you want to avoid duplicates. That knowledge is great to have, but let's see how we can apply it to things that you may come across during your work.

Our domain for examples will be various needs for a fictional chat application, since chat applications have a number of sticky areas that many other domains can relate to.

We'll use the node REPL for these examples. So I'm going to open that up and then we'll import MySet from index where it's defined. And now we're ready to get started.

const MySet = require('./index')

Let's first assume that we want to track members who are online, which is also known as tracking their presence. Seems like it'd be simple enough to plop them into a list, right? We can define that as usersList and we'll populate it with 4 users to start: foo, bar, baz, and buz.

let usersList = ['foo','bar','baz','buz']

Next we'll define a method named addUser which will handle adding users to our chat application. It'll take two arguments, the usersList and our new user. We'll assume that it does something with a user object passed in, and then it pushes the user's username onto our usersList to track which users are online. It'll then return the list of users once it's done.

function addUser(usersList, user) {
  // Something with user itself for storage
  usersList.push(user.username); // track the user as online
  return usersList;
}

So now if we want to add a new user we'll call addUser, pass in our usersList, and then we'll pass in an object that presumably would have more keys, but we just need to pass in the username key, and we'll assign it biz.

addUser(usersList, {username: 'biz'})

Now we see that our usersList has biz appended on the end. If I call this a few more times, we keep adding more biz entries to the list. There could be a number of reasons this may happen with a real chat service: maybe a user logs in from multiple locations or multiple devices. Maybe they disconnected and reconnected quickly, and their old connection didn't time out yet. In short, we only want to show a single username in our presence roster. We also don't want to add a bunch of nasty conditionals to areas of the code that have nothing to do with tracking presence. We'll just switch our usersList data structure over to a set and get the benefits for free. So let's assign it to a new instance of MySet created from our previous list.

usersList = MySet.createFrom(usersList);

And then just for the sake of having a matching API for this example, I'm going to define the method push on our MySet.prototype to be equal to our already existing add method. If you aren't familiar with prototypes in JavaScript, just think of this as me adding the method push to every MySet object, just as though it was defined via a class declaration.

MySet.prototype.push = MySet.prototype.add

Okay now if I call our addUser method again and again and again we can see that no new users are being added to the list - we're keeping only 5 entries since biz is already in the list.
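Here's a rough sketch of that behavior that you can run outside the REPL. Since the real MySet lives in index, this uses a small stand-in built on the native Set. The stand-in and its internals are assumptions for illustration, not the lesson's implementation.

```javascript
// Hypothetical stand-in for MySet; the lesson's real class is defined in ./index.
class MySet extends Set {
  static createFrom(items) {
    return new MySet(items);
  }
}

// Alias push to add so addUser works unchanged with a set.
MySet.prototype.push = MySet.prototype.add;

function addUser(usersList, user) {
  // Something with user itself for storage
  usersList.push(user.username);
  return usersList;
}

let usersList = MySet.createFrom(['foo', 'bar', 'baz', 'buz']);

// Adding the same user repeatedly leaves a single entry in the set.
addUser(usersList, { username: 'biz' });
addUser(usersList, { username: 'biz' });
addUser(usersList, { username: 'biz' });

console.log([...usersList]); // [ 'foo', 'bar', 'baz', 'buz', 'biz' ]
```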

So now we know we're able to quickly determine who is online at any given point, and our data structure is pretty performant, which is good because presence is checked constantly. So now what if we want to have a job that, for instance, spins through all users in the persistence layer and cleans up any users that are no longer really online? With some of our new algebraic methods, this is really easy.

Let's assume we get this list of 10 users that we will wrap in a set.

let persistedUsers = MySet.createFrom([
  'foo', 'bar', 'baz', 'buz',
  'coffee', 'frenchpress', 'pourover',
  'chemex', 'aeropress', 'espresso'
])

We can quickly find which users are persisted and still online by finding the intersection of usersList and persistedUsers.
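As a sketch of what that call might look like, the snippet below assumes MySet exposes a method named intersection. That name, and the stand-in class itself, are assumptions, since the real MySet is defined in index.

```javascript
// Hypothetical stand-in for MySet; the real class is defined in ./index,
// and the method name `intersection` is an assumption.
class MySet extends Set {
  static createFrom(items) {
    return new MySet(items);
  }

  // Elements that appear in both this set and `other`.
  intersection(other) {
    return MySet.createFrom([...this].filter((item) => other.has(item)));
  }
}

const usersList = MySet.createFrom(['foo', 'bar', 'baz', 'buz', 'biz']);
const persistedUsers = MySet.createFrom([
  'foo', 'bar', 'baz', 'buz',
  'coffee', 'frenchpress', 'pourover',
  'chemex', 'aeropress', 'espresso'
]);

const stillOnline = usersList.intersection(persistedUsers);
console.log([...stillOnline]); // [ 'foo', 'bar', 'baz', 'buz' ]
```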


And now we have a set containing foo, bar, baz, buz, the 4 elements that are present in both.

If we want to find only the users marked as online but without an object in storage, meaning they're ghost sessions, then we would want to find the difference between usersList and persistedUsers, rather than the intersection.
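A minimal sketch of that lookup, assuming a method named difference on MySet. Both the method name and the stand-in class are assumptions made for illustration.

```javascript
// Hypothetical stand-in for MySet; the method name `difference` is an assumption.
class MySet extends Set {
  static createFrom(items) {
    return new MySet(items);
  }

  // Elements of this set that are not in `other`.
  difference(other) {
    return MySet.createFrom([...this].filter((item) => !other.has(item)));
  }
}

const usersList = MySet.createFrom(['foo', 'bar', 'baz', 'buz', 'biz']);
const persistedUsers = MySet.createFrom([
  'foo', 'bar', 'baz', 'buz',
  'coffee', 'frenchpress', 'pourover',
  'chemex', 'aeropress', 'espresso'
]);

// Online users with no object in storage.
const ghostSessions = usersList.difference(persistedUsers);
console.log([...ghostSessions]); // [ 'biz' ]
```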


This returns to us only biz, since it does not exist in persistedUsers. So if we wanted to clean up that list, we'd just remove biz.

If we wanted to find only the user objects that haven't been cleaned up yet, we could invert this so that we take the difference between persisted users and our online users, leaving us a set containing 6 out of 10 elements, which won't contain the 4 elements that are in the intersection of our sets.
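The inverted call might look like the sketch below. It makes the same assumption as before: a difference method on a stand-in MySet, neither of which is confirmed by the lesson's source. The only change is which set is the receiver.

```javascript
// Hypothetical stand-in for MySet; the method name `difference` is an assumption.
class MySet extends Set {
  static createFrom(items) {
    return new MySet(items);
  }

  // Elements of this set that are not in `other`.
  difference(other) {
    return MySet.createFrom([...this].filter((item) => !other.has(item)));
  }
}

const usersList = MySet.createFrom(['foo', 'bar', 'baz', 'buz', 'biz']);
const persistedUsers = MySet.createFrom([
  'foo', 'bar', 'baz', 'buz',
  'coffee', 'frenchpress', 'pourover',
  'chemex', 'aeropress', 'espresso'
]);

// Persisted users that are not currently online.
const staleUsers = persistedUsers.difference(usersList);
console.log(staleUsers.size); // 6
console.log([...staleUsers]);
// [ 'coffee', 'frenchpress', 'pourover', 'chemex', 'aeropress', 'espresso' ]
```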


We can now either delete these 6 users from persistence, or depending on the logic, perhaps push their username back on to the active sessions.

Then if we wanted to track total discrepancies between the two lists, we could take the symmetric difference of the two sets, which will yield a list of 7 users that need some form of action taken against them.
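Those 7 users are the ones that appear in exactly one of the two sets, which is the symmetric difference. Here's a rough sketch, assuming a method named symmetricDifference on a stand-in MySet. Both are assumptions rather than the lesson's actual API.

```javascript
// Hypothetical stand-in for MySet; the method name `symmetricDifference`
// is an assumption.
class MySet extends Set {
  static createFrom(items) {
    return new MySet(items);
  }

  // Elements that are in exactly one of the two sets.
  symmetricDifference(other) {
    const onlyHere = [...this].filter((item) => !other.has(item));
    const onlyThere = [...other].filter((item) => !this.has(item));
    return MySet.createFrom([...onlyHere, ...onlyThere]);
  }
}

const usersList = MySet.createFrom(['foo', 'bar', 'baz', 'buz', 'biz']);
const persistedUsers = MySet.createFrom([
  'foo', 'bar', 'baz', 'buz',
  'coffee', 'frenchpress', 'pourover',
  'chemex', 'aeropress', 'espresso'
]);

const discrepancies = usersList.symmetricDifference(persistedUsers);
console.log(discrepancies.size); // 7
console.log([...discrepancies]);
// [ 'biz', 'coffee', 'frenchpress', 'pourover', 'chemex', 'aeropress', 'espresso' ]
```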

So hopefully these examples give you a fair idea of how to use these methods.
The next time you need to do something like maybe persist a blog post with tags to a database, you will know how to find all the elements present in the db that you will have to delete from persistence because they were not present in the new list. Or how to find a list of all tags that two or more blog posts have in common.
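To make the tags case concrete, here's a small sketch using the native Set with plain helper functions. The helper names and the tag data are made up for illustration.

```javascript
// Plain helpers over the native Set; names and data are illustrative only.
const difference = (a, b) => new Set([...a].filter((item) => !b.has(item)));
const intersection = (a, b) => new Set([...a].filter((item) => b.has(item)));

// Tags currently stored for a post versus the tags on the edited post.
const storedTags = new Set(['javascript', 'sets', 'draft']);
const newTags = new Set(['javascript', 'sets', 'data-structures']);

// Stored tags missing from the new list must be deleted from persistence.
const tagsToDelete = difference(storedTags, newTags);
// New tags missing from storage must be inserted.
const tagsToInsert = difference(newTags, storedTags);
console.log([...tagsToDelete]); // [ 'draft' ]
console.log([...tagsToInsert]); // [ 'data-structures' ]

// Tags that two posts have in common.
const otherPostTags = new Set(['sets', 'python']);
const sharedTags = intersection(newTags, otherPostTags);
console.log([...sharedTags]); // [ 'sets' ]
```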

Even if you choose to use these on plain arrays rather than sets, the set algebra methods are incredibly helpful and can greatly simplify a lot of work.
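For plain arrays, one way to sketch the same operations is with filter and includes. The helper names here are illustrative, and the helpers don't remove duplicates that are already in the input arrays.

```javascript
// Set algebra on plain arrays; helper names are illustrative.
// includes() scans the array each time, so this suits small lists.
const intersection = (a, b) => a.filter((item) => b.includes(item));
const difference = (a, b) => a.filter((item) => !b.includes(item));
const symmetricDifference = (a, b) => [...difference(a, b), ...difference(b, a)];

const online = ['foo', 'bar', 'biz'];
const persisted = ['foo', 'bar', 'coffee'];

console.log(intersection(online, persisted)); // [ 'foo', 'bar' ]
console.log(difference(online, persisted)); // [ 'biz' ]
console.log(symmetricDifference(online, persisted)); // [ 'biz', 'coffee' ]
```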