Last Updated: February 25, 2016
·
1.836K
· deckarep

Consider not looping - Sets in Go.

A fundamental data structure that is under-utilized, yet highly useful is the set type. Luckily, it exists in many languages in some form or another. In C#, you have the generic HashSet<T>, in Objective-C you have the NSMutableSet and NSSet objects, and in Python you simply have the built-in set type in addition to the lesser known frozen set.

However they are named, however they are built, they are all modeled after a very important mathematical construct which itself is known as the Set. The definition of a set is simply a well defined collection of distinct objects. The objects that make up the set are known as the members of that set. And theoretically, sets can contain anything: numbers, names, animals, planets, bow-ties, other-sets and on and on...you get the point.

Typically, when you are storing items in a set it's usually useful to store similar or like items but it's not mandatory. It all depends on the context of your application and how you interpret this data. Let us consider a few important properties of well designed set implementation:

  • They only hold distinct members
  • They can tell you their size, also known as cardinality
  • They should easily allow you to test if a member is already present in the set quickly
  • Their members are stored in an undefined order
  • They are able to be combined in various ways producing new sets as a result

Now before we get into the specifics, let's quickly talk about sets in Go. Well, to put it bluntly, there are no sets in Go. I really have no idea why there are no sets in Go seeing as it's a very fundamental collection type that should be in ALL languages in my most humble opinion. Alas, it's not there, at least from what I can tell and there have been some questions in the Go forums and email groups about its omission.

Enter the home-brewed golang-set: http://deckarep.github.io/golang-set/

Installation instructions:

go get github.com/deckarep/golang-set

Get it while it's hot! Let's talk about how this puppy works:

//The 2013 winter schedule for freshmen and freshwomen.

requiredClasses := NewSet()
requiredClasses.Add("Cooking")
requiredClasses.Add("English")
requiredClasses.Add("Math")
requiredClasses.Add("Biology")

scienceClasses := NewSet()
scienceClasses.Add("Biology")
scienceClasses.Add("Chemistry")

electiveClasses := NewSet()
electiveClasses.Add("Welding")
electiveClasses.Add("Music")
electiveClasses.Add("Automotive")

bonusClasses := NewSet()
bonusClasses.Add("Go Programming")
bonusClasses.Add("Python Programming")

Suppose, we are trying to figure out what classes to take for our freshman year in high school. We look at the school brochure and immediately see that there are some required classes (groan), a few electives to choose from and even, if we want to get a few extra credits for this year, some bonus classes.

Nothing earth shattering so far. Where we stand, we have created four distinct sets that all contain Go strings as the members of these sets. Let's see what kinds of questions we can answer and assertions we can make about these 4 distinct sets.

I need to give my friend who's sick at home a list all the classes available. Let's get the union of all these classes and send them to him.

allClasses := requiredClasses.Union(scienceClasses).Union(electiveClasses).Union(bonusClasses)

Great, now I have a new set called allClasses that I can send on over to him. This new set now contains a distinct list of the entire winter schedule which means Biology will not appear in the allClasses set twice, yes, by design.

I wonder now if cooking is considered a science class these days since certain cooking shows like Good Eats have a lot of science in them.

fmt.Println(scienceClasses.Contains("Cooking")) 
//Returns: false

Nevermind, how about what are all the classes I can take that are not science classes?

fmt.Println(allClasses.Difference(scienceClasses)) 
//Returns: Set{Music, Automotive, Go Programming, Python Programming, Cooking, English, Math, Welding}

Okay, interesting, what about: Which of the science classes are also required classes?

fmt.Println(scienceClasses.Intersect(requiredClasses)) 
//Returns: Set{Biology}

How many bonus classes are offered for this winter schedule?

fmt.Println(bonusClasses.Cardinality()) 
//Returns: 2

Do you have any of the following classes? I really like them!

fmt.Println(allClasses.ContainsAll("Welding", "Automotive", "English")) 
//Returns: true

As you can see, we can actually answer quite a few questions by taking some sets and composing them, decomposing them, and generally manipulating them to answer a variety of questions. And by no means is this meant to be an exhaustive reference of set manipulation but please observe one thing. No loops! This is the power of sets, the power to treat a collection of objects as well, a Set. As an exercise how would you have accomplished the same thing using other more typical collections like a Go slice? Or even a map?

I hope this was fun for you as it was for me!

-Ralph

If you like this consider my other article where I use Go to build a simple swear word filter along with Redis sets