Sets
After lists, sets are one of the simplest structures you can use. They're a good general purpose storage container with a couple nice extra features. To look at more data structures, head back to the series page.
To learn more about all these structures, you can also download one of our Data Structures eBooks!
What are Sets good for?
Sets are very useful for deduplication problems, as you can quickly find if an item already exists in the set. Because they are ordered, they are also useful for finding the largest or smallest of a group of items, or other queries related to the ordering of elements. Most basic set operations are logarithmic in their time complexity, so it's a good general purpose structure.
There are also many specific "set" operations that are useful in various ways, like union
, intersection
, and "set difference" (the (\\)
operator).
What are Sets not as good for?
Sets do not track the order in which items were inserted. This means you cannot use them in many cases where items are synchronized with indices across different data structures. If all you are doing is adding items and looping through all of them, sets have more overhead and are slightly slower than lists.
Set Parameters and Constraints
Sets have a single type parameter for the contained type. This type must satisfy the Ord
type class, so that the set can create an ordered binary tree.
How do I create a Set?
There are 3 main ways to initialize a set. The empty
expression will return a fresh set with 0 elements. The singleton
function takes a single item and creates a set containing only that item. Then fromList
will take an existing list of items and turn that list into a set.
>> import qualified Data.Set as S
>> let a = S.empty
>> let b = S.singleton "Hello"
>> let c = S.fromList ["Hello", "World"]
As we'll see, this pattern of 3 initializers is quite common among data structures.
How do I get the size of a Set?
Use Data.Set.size
. This operation takes O(1)
time.
size :: Set a -> Int
How do I add elements to a Set?
You can add an item to a set with the insert
function. This operation is O(log n)
.
insert :: a -> Set a -> Set a
How do I access Set elements?
The primary "accessing" mechanic of sets is to ask if an item is already a member of the set (or not).
member :: a -> Set a -> Bool
notMember :: a -> Set a -> Bool
Another useful idea taking advantage of the ordered nature of sets comes with the "lookup" family of functions. These will give the "next" element in the set that is greater (or less) than the input. These of course will return Nothing
if the set is empty or no item fits the criteria.
lookupLT :: a -> Set a -> Maybe a
lookupGT :: a -> Set a -> Maybe a
lookupLE :: a -> Set a -> Maybe a
lookupGE :: a -> Set a -> Maybe a
You can, of course, also "loop" through the items of a set with fmap
or a fold.
In certain cases, you may also want to flatten your set into a list with toList
. Many other structures share this function.
toList :: Set a -> [a]
How do I delete elements from a Set?
Removing items is done with the delete
function. This operation is O(log n)
. If the item does not exist in the set, the output is the same as the input set.
delete :: a -> Set a -> Set a
How do I combine Sets?
Two sets can be combined in a few ways. The union
function is the most basic, creating a new set with all items that exist in either of the input sets. This function is also the Semigroup/Monoid instance, so you can get its behavior with the (<>)
operator.
union :: Set a -> Set a -> Set a
Many sets can be combined in this way all at once with unions
.
unions :: [Set a] -> Set a
If you only want elements that are in both sets you can use intersection
.
intersection :: Set a -> Set a -> Set a
If you want all the elements of one set that do NOT appear in a second set, this is the difference
function. It also has an equivalent operator (\\)
.
difference :: Set a -> Set a -> Set a
(\\) = difference
Many of these functions appear for other data structures as well. Combining with union
is the most canonical, and we'll see that it exists for most of our other structures. And unless otherwise stated, you can assume union
is the method used for the Monoid/Semigroup instance.
Import Notes
You should typically import Set functions in a qualified way. However, you can import the structure name itself unqualified. The main operator you might use with set is set difference (\\)
. You can usually get away with leaving this unqualified as well.
import Data.Set (Set, (\\))
import qualified Data.Set as S
If you are also using the Sequence data structure below, you may want to import these types using as Set
and as Seq
to avoid ambiguity.
Conclusion
If you want to learn about more data structures in Haskell, make sure to head back to the series page! Next on our list is the Ordered Map!