Comparing Ordered Dictionaries

During second week of GSoC 2019 coding period on Pharo project, I was learning more about existing implementations of Ordered dictionaries, comparing them and polishing Containers-OrderPreservingDictionary a little bit.

I couldn’t help but notice all the similarities between Collections - OrderedDictionary and Containers-OrderPreservingDictionary. They both start from the same place, inheriting from Collections. The diagram below shows inheritance relationships:

All of these classes: OrderedDictionary, CTStandardOrderedDictionary and CTOrderPreservingDictionary had dictionaries that are keeping their key-value pairs in some sort of an order. I wanted to know what are the differences between these orders. Let’s see:

| orderedDict |
orderedDict := OrderedDictionary   new. 
orderedDict at: 1 put: 'first'.
orderedDict at: 2 put: 'second'.
orderedDict at: 1 put: 'another first'.
orderedDict  >>> 'an OrderedDictionary(1->''another first'' 2->''second'')'
| standardOrederedDict |
standardOrederedDict := CTStandardOrderedDictionary  new. 
standardOrederedDict  at: 1 put: 'first'.
standardOrederedDict  at: 2 put: 'second'.
standardOrederedDict  at: 1 put: 'another first'.
standardOrederedDict  >>> 'a CTStandardOrderedDictionary(1->''another first'' 2->''second'')'
| orderPreservingDict |
orderPreservingDict  := CTOrderPreservingDictionary new. 
orderPreservingDict at: 1 put: 'first'.
orderPreservingDict at: 2 put: 'second'.
orderPreservingDict  at: 1 put: 'another first'.
orderPreservingDict  >>> 'a CTOrderPreservingDictionary(1->''another first'' 2->''second'')'

The order is preserved in all three classes in the same way.

If the order is same, what could be the other differences between all these classes from the diagram?

I decided to compare these pairs:

  1. OrderedDictionary vs CTStandardOrderedDictionary
  2. CTStandardOrderedDictionary vs CTOrderPreservingDictionary
  3. OrderedIdentityDictionary vs CTStandardOrderedIdentityDictionary
  4. CTOrderPreservingIdentityDictionary vs CTStandardOrderedIdentityDictionary
  5. CTOrderPreservingDictionary vs CTOrderPreservingStringDictionary

1. OrderedDictionary vs CTStandardOrderedDictionary

Both of these classes are subclass of Collections. Majority of their methods are completely the same. There are some differences though:

OrderedDictionary >> associationAt: aKey ifPresent: aBlock
	^ dictionary associationAt: aKey ifPresent: aBlock
CTStandardOrderedDictionary >> associationAt: aKey ifPresent: aBlock
	"Squeak and GS do not have #associationAt:ifPresent: so it
	is reimplemented for portability"
	^ aBlock cull:
		(dictionary
			associationAt: aKey
			ifAbsent: [^ nil])

2. CTStandardOrderedDictionary vs CTOrderPreservingDictionary

We saw that the order is preserved in the same manner in both of these classes.

The difference: By default, when CTStandardOrderedDictionary raises an exception and CTOrderPreservingDictionary returns a configurable default value (nil by default) instead of raising an exception.

| dict |
dict := CTStandardOrderedDictionary new. 
dict at: #missing
>>> 'KeyNotFound'
| dict |
dict := CTOrderPreservingDictionary new. 
dict defaultValue: 666.
dict at: #missing
>>> 666

In conclusion several methods in CTOrderPreservingDictionary are overriden so that they return a default value.

(OrderedDictionary like CTStandardOrderedDictionary raises an error)

3. OrderedIdentityDictionary vs CTStandardOrderedIdentityDictionary

All identity dictionaries have the feature of using == instead of = for key comparing.

isIdentityDictionary
	^ true

4. CTOrderPreservingIdentityDictionary vs CTStandardOrderedIdentityDictionary

5. CTOrderPreservingDictionary vs CTOrderPreservingStringDictionary

What I expected from CTOrderPreservingStringDictionary based on its name was that they key values could only be Strings. That didn’t make much sense and when I checked it didn’t behave like that:

| strDict |
 strDict  := CTOrderPreservingStringDictionary new. 
 strDict at: 1 put: 10.
 strDict at: 2 put: 20.
 strDict  >>> 'a CTOrderPreservingStringDictionary(1->10 2->20)'

So you can use other data types besides Strings for both keys and values. Then why String in the name of the class?

CTOrderPreservingDictionary >> at: aKey
	^ self
		at: aKey
		ifAbsent: [defaultValue]
at: aKey
	^ self
		at: aKey
		ifAbsent: ['']

Conclusion:

There are a lot of code repetitions in these classes and subclass behavior differences are not that big. One of the solutions that is being discussed with mentors is merging some classes and using strategy.