nub

The nub function removes duplicate elements from a list. In particular, it keeps only the first occurrence of each element. (The name nub means `essence'.) It is a special case of nubBy, which allows the programmer to supply their own equality test. If there exists instance Ord a, it's faster to use nubOrd from the containers package, which takes only O(n log d) time where d is the number of distinct elements in the list. Another approach to speed up nub is to use map Data.List.NonEmpty.head . Data.List.NonEmpty.group . sort, which takes O(n log n) time, requires instance Ord a and doesn't preserve the order.

Examples

>>> nub [1,2,3,4,3,2,1,2,4,3,5]
[1,2,3,4,5]
>>> nub "hello, world!"
"helo, wrd!"
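The two faster alternatives mentioned above can be compared side by side. The following is a minimal sketch, assuming the containers package is available; the names dedupKeepOrder and dedupSorted are hypothetical wrappers introduced only for illustration.

```haskell
import Data.Containers.ListUtils (nubOrd)
import Data.List (sort)
import qualified Data.List.NonEmpty as NE

-- Order-preserving variant: same result as nub, but requires Ord.
-- (dedupKeepOrder is a hypothetical name, not a library function.)
dedupKeepOrder :: Ord a => [a] -> [a]
dedupKeepOrder = nubOrd

-- Sort first, then keep the head of each run of equal elements.
-- The result comes out sorted, so the original order is lost.
-- (dedupSorted is a hypothetical name, not a library function.)
dedupSorted :: Ord a => [a] -> [a]
dedupSorted = map NE.head . NE.group . sort
```

For the input [3,1,3,2,1], dedupKeepOrder keeps first occurrences in place and gives [3,1,2], while dedupSorted gives [1,2,3].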
The nub function removes duplicate elements from a list. In particular, it keeps only the first occurrence of each element. (The name nub means 'essence'.) It is a special case of nubBy, which allows the programmer to supply their own equality test.
O(n log n). Fold values into a list with duplicates removed, while preserving their first occurrences
The nub function removes duplicate elements from a vector.
The nub function removes duplicate elements from a list. In particular, it keeps only the first occurrence of each element. (The name nub means `essence'.) It is a special case of nubBy, which allows the programmer to supply their own equality test.
>>> nub [1,2,3,4,3,2,1,2,4,3,5]
[1,2,3,4,5]
Return a new Shell that discards duplicates from the input Shell:
>>> view (nub (select [1, 1, 2, 3, 3, 4, 3]))
1
2
3
4
Removes duplicate elements from the list. See also nubBy
The nub function removes duplicate elements from a list. In particular, it keeps only the first occurrence of each element. (The name nub means `essence'.) It is a special case of nubBy, which allows the programmer to supply their own equality test.
>>> nub [1,2,3,4,3,2,1,2,4,3,5]
[1,2,3,4,5]
If the order of outputs does not matter and there exists instance Ord a, it's faster to use map Data.List.NonEmpty.head . Data.List.NonEmpty.group . sort, which takes only O(n log n) time.
On ordered lists, nub is equivalent to Data.List.nub, except that it runs in linear time instead of quadratic. On unordered lists it also removes elements that are smaller than any preceding element.
nub [1,1,1,2,2] == [1,2]
nub [2,0,1,3,3] == [2,3]
nub = nubBy (<)
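The behaviour described for this ordered-list variant can be reproduced with a single pass that compares each element against the last element kept. This is a sketch under that assumption, not the library's actual source; nubSortedBy and nubSorted are hypothetical names.

```haskell
-- Keep an element only if the predicate holds between the last kept
-- element and the candidate. One pass over the list, so linear time.
-- (nubSortedBy is a hypothetical name used for illustration.)
nubSortedBy :: (a -> a -> Bool) -> [a] -> [a]
nubSortedBy _ [] = []
nubSortedBy p (x : xs) = x : go x xs
  where
    go _ [] = []
    go prev (y : ys)
      | p prev y = y : go y ys   -- y passes the test, it becomes the new reference
      | otherwise = go prev ys   -- y is dropped, the reference stays the same

-- With (<) as the predicate, only strictly increasing elements survive.
nubSorted :: Ord a => [a] -> [a]
nubSorted = nubSortedBy (<)
```

On [2,0,1,3,3] the elements 0 and 1 are dropped because they are not greater than 2, which matches the second example above.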
O(n log n). Remove duplicate entries from the heap.
>>> nub (fromList [1,1,2,6,6])
fromList [1,2,6]
Used as a scan. Returns Just for the first occurrence of an element, returns Nothing for any other occurrences. Example:
>>> stream = Stream.fromList [1::Int,1,2,3,4,4,5,1,5,7]

>>> Stream.fold Fold.toList $ Stream.scanMaybe Fold.nub stream
[1,2,3,4,5,7]
Pre-release
The memory used is proportional to the number of unique elements in the stream. To limit the memory, use "take" to limit the number of unique elements in the stream.
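The scan idea (Just for a first occurrence, Nothing otherwise) and the memory bound obtained with take can be illustrated on plain lists. This is a sketch using Data.Set rather than the streaming library's own types; nubScan and firstUnique are hypothetical names, and the Ord constraint is an assumption of the sketch.

```haskell
import Data.Maybe (catMaybes)
import qualified Data.Set as Set

-- Emit Just x the first time x is seen and Nothing for later occurrences,
-- carrying the set of already-seen elements along the list.
-- (nubScan is a hypothetical name used for illustration.)
nubScan :: Ord a => [a] -> [Maybe a]
nubScan = go Set.empty
  where
    go _ [] = []
    go seen (x : xs)
      | x `Set.member` seen = Nothing : go seen xs
      | otherwise = Just x : go (Set.insert x seen) xs

-- Keep only the first k unique elements. Because the scan is lazy,
-- consumption stops after k elements, so the set never grows beyond k.
firstUnique :: Ord a => Int -> [a] -> [a]
firstUnique k = take k . catMaybes . nubScan
```

Since the scan is lazy, firstUnique also terminates on an infinite input such as cycle [1,1,2,3,4] once the requested number of unique elements has been produced.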
O(n). Remove duplicate elements from a sorted list.
Remove duplicates from a list, keeping only the first occurrence of each element. Because of a very weak constraint on a, this operation takes O(n²) time. Consider using nubOrd instead.
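To see why an Eq-only constraint leads to quadratic time, consider a sketch of such a function: with nothing but (==) available, each element has to be compared against every element kept so far. The name nubEq is hypothetical and the code is an illustration, not any library's source.

```haskell
-- Eq gives only an equality test, so membership in the "seen" list is a
-- linear search. With n elements, all distinct, that is about n^2/2 comparisons.
-- (nubEq is a hypothetical name used for illustration.)
nubEq :: Eq a => [a] -> [a]
nubEq = go []
  where
    go _ [] = []
    go seen (x : xs)
      | x `elem` seen = go seen xs          -- duplicate: drop it
      | otherwise = x : go (x : seen) xs    -- first occurrence: keep and remember
```

An Ord constraint lets the linear search be replaced by a set lookup, which is where the better bounds for the Ord-based variants come from.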
Delete all but the first copy of each element, special case of nubBy.
O(n). This upgrades nub from Data.List from O(n^2) to O(n) by using productive unordered discrimination.
nub = nubWith id
nub as = head <$> group as
Version of the traditional nub function using the PartialOrd notion of equality.
Remove all duplicates. Since 0.7.0.0
O(n^2). Removes duplicate elements from a slist. In particular, it keeps only the first occurrence of each element. It is a special case of nubBy, which allows you to supply your own equality test.
>>> nub $ replicate 5 'a'
Slist {sList = "a", sSize = Size 1}

>>> nub mempty
Slist {sList = [], sSize = Size 0}

>>> nub $ slist [1,2,3,4,3,2,1,2,4,3,5]
Slist {sList = [1,2,3,4,5], sSize = Size 5}
Functions to remove duplicates from a list.

Performance

To check the performance, a number of benchmarks were run on lists of Ints and Texts. Two kinds of lists were used:
  • Lists which consist of many different elements
  • Lists which consist of many identical elements
Here are some recommendations for usage of particular functions based on benchmarking results.
  • hashNub is faster than ordNub when there are not many different values in the list.
  • hashNub is the fastest with Text.
  • intNub is faster when you work with lists of Ints.
  • intNubOn is fast with lists of types that have a fixed numeric representation.
  • sortNub has better performance than ordNub but should be used when sorting is also needed.
  • unstableNub has better performance than hashNub but doesn't preserve the original order.
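The trade-offs in these recommendations (stable versus sorted output, Int-keyed versus Ord-keyed) can be sketched with functions from the containers package as stand-ins. The functions named in the list above are not used here, and whether they are implemented this way is an assumption; Color, stableOrd, sortedUnique, uniqueInts and uniqueColors are hypothetical names.

```haskell
import Data.Containers.ListUtils (nubInt, nubIntOn, nubOrd)
import qualified Data.Set as Set

-- A small type with a fixed numeric representation via Enum (hypothetical).
data Color = Red | Green | Blue
  deriving (Show, Eq, Enum)

-- Order-preserving removal of duplicates for any Ord type.
stableOrd :: Ord a => [a] -> [a]
stableOrd = nubOrd

-- Removes duplicates and sorts as a side effect; original order is lost.
sortedUnique :: Ord a => [a] -> [a]
sortedUnique = Set.toList . Set.fromList

-- Specialised to Int keys.
uniqueInts :: [Int] -> [Int]
uniqueInts = nubInt

-- Map each value to an Int key and deduplicate on that key.
uniqueColors :: [Color] -> [Color]
uniqueColors = nubIntOn fromEnum
```

For [Blue, Red, Blue, Green, Red], uniqueColors keeps first occurrences and yields [Blue, Red, Green]; sortedUnique [3,1,3,2] yields [1,2,3].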