Final Report
Hello everyone. The last three months passed by really quickly and the Google Summer of Code (GSoC) 2019 is coming to an end. In this blog post (which is also a part of my final report) I will summarize the main results of my GSoC 2019 project titled New Collections for Pharo, my overall experience and what I have learned. I am really happy that I had a chance to participate in this GSoC and I can say that this was an incredible experience.
Introduction
As many of you already know, Pharo is an actively developed, elegant pure object-oriented programming language that comes with an immersive IDE that simplifies testing and debugging and can significantly reduce development time. Pharo also comes with a large set of existing collections (see following figure). There is an existing effort - The Containers modular library, to include many more new collections in Pharo. This project has attracted the attention of the Pharo community and I had a chance to work on it during this GSoC.
Goals
My project goal was to collect, clean test and document alternate collections as well as implement a few new ones (with appropriate tests and documentations) and include them in the library.
Trees are very important and useful data structures. They have many applications, ranging from computer science and language processing to 2D/3D image processing and rendering, and many more… After a consultation with my mentors and the Pharo community it was decided that I am to implement different Tree collections that can easily be extended and used by other users.
End Results
During this GSoC I worked on improving several different containers (e.g. Containers-Stack, Containers-Queue, Containers-Trie, Containers-OrderedMultiMap, and many others). I’ve fixed mistakes, cleaned and refactored the code, and included necessary documentation and tests. As part of the Containers-Tree library, I added new collections: Binary Trees, N-ary Trees, Binary Search Trees and AVL Trees. Detailed documentation and tests are provided for each new collection. All these collections follow the object-oriented paradigm and are designed to be easily extended and used by other users. In the next section I will describe all contributions in greater detail.
What has been done
Containers library is designed to be modular and as such each collection is independent of the others. During this GSoC I’ve worked on several different Containers.
Contribution Lists
The following list shows GitHub pull requests, each one containing several commits. Ticked boxes indicate that all the changes are merged into the base branch. Most of the commits have been included into the base branch, while a few still await to be merged (the ones that are not merged are marked as - under review). Changes made to the Containers-Tree project are more significant, and they are presented in a separate list.
- Containers-SkipList: fix how to depend on it
- Containers-UniqueOrdered: add how to load and depend on
- Containers-Grid: add how to depend on code
- Containers-Array2D: add how to depend on code
- Containers-OrderPrservingDictionary: remove deprecated method
- Containers-PropertyEnvironment: change coverage status button url
- Containers-Stack: edit method comments
- Containers-Array2D: Patch/separate tests
- Containers-Queue: separate tests, add missing tests and comments (under review)
- Containers-Trie: add example(under review)
- Containers-Array2D: change coveralls url to Array2D instead of Grid(under review)
- Containers-OrderPreservingDictionary: Add class and package comments with examples(under review)
- Containers-OrderedMultiMap: move tests to separate package(under review)
Containers-Tree contributions:
- Publish all the alternate package into Container-Tree
- Add binary node and tree class with getters, setters and other methods
- Fixed an issue - isEmpty
- Fixed and restructured Binary Tree
- Add a new composite binary tree implementation
- Binary Search Trees (using composite design)
- Add N-ary trees with tests
- Add traversing methods and operations methods with tests
- Fix or operator, and change argument names in AbstractTree
- Add class and method comments for search tree
- Add class and method comments for n-ary tree(under review)
- Add avl tree with tests and commetns, modify binary search tree(under review)
Challenges Faced and Lessons Learned
One of the biggest challenges was designing a tree data structure that can easily be extended and reused for different applications. I started with implementing a rooted binary tree (and binary search tree), for detailed description see this blog post. After consulting with my mentors, we decided to do the things the Pharo way - i.e. redesign the trees using object oriented approach and design patterns and going from rooted representation to a composite one. This was a demanding task as I had a lot to learn.
Composite design pattern lets you treat a group of objects the same way as you would a single instance. In our case we can treat a node of a tree the same way as a subtree and consequently as an entire tree. This would let users (clients) treat nodes and compositions of nodes uniformly.
In our case, with a tree, there was no need to separate node class and a tree class. A tree could be represented as a single node and all the operations that can be performed on a tree you should be able to do on a single node and vice a versa. As a result a tree could be represented with one class. For more details see this blog post about restructuring trees.
In order to further improve the design and avoid unnecessary if-checks I implemented the Null object design pattern. This simplified the code significantly and made it more readable. For a detailed description about the Null Object see this blog post.
I summarize some key things that I’ve learned:
- I am more comfortable thinking about object oriented design - i.e the Pharo way.
- I realize the importance and advantages of design patterns.
- I learned some new design patterns that I didn’t know before.
- I believe I am able to recognize when to use some design patterns and when not to.
- I feel a lot more comfortable programming in Pharo.
- I learned how to write better tests.
Code Base
As the Containers Library is a big project that is actively being developed, I provide links to the current state of the code at the end of the GSoC 2019. A few commit merges are still pending. For links to the active projects, please see Active Project Links in the Auxiliary Links section.
- Containers-Trie
- Containers-Grid
- Containers-Stack
- Containers-Queue
- Containers-SkipList
- Containers-Array2D
- Containers-UniqueOrdered
- Containers-OrderedMultiMap
- Containers-PropertyEnvironment
- Containers-OrderPreservingDictionary
Overall Experience
I had a lot of fun during this GSoC but I have also learned a lot. The Pharo community is great and people are always eager to help. I am really happy I had a chance to be part of this community and I hope to continue my work on the Containers library. I can say that this was a very valuable experience.
I would like to sincerely thank my mentors - Stéphane Ducasse, Gordana Rakić, Guille Polito and Yurko Tymchuk for their help and support during this project.
Plans after GSoC 2019
Containers library is a big project that will be actively developed and maintained well after this GSoC has ended. I plan on continuing to contribute to the Containers library and other Pharo projects, and hope to be an active member of the open source community.
Auxiliary Links
Active Project Links
During the GSoC 2019 I contributed to following projects:
- Containers-Trie
- Containers-Grid
- Containers-Stack
- Containers-Queue
- Containers-SkipList
- Containers-Array2D
- Containers-UniqueOrdered
- Containers-OrderedMultiMap
- Containers-PropertyEnvironment
- Containers-OrderPreservingDictionary
Biggest contribution was done to the project:
Blog
During the GSoC I wrote weekly blog posts about the things I learned and what I did. I include here a list of all the previous blog posts:
- New Collections for Pharo
- Pair Programming
- Pair Programming - Part II
- Baseline
- Comparing Ordered Dictionaries
- Prelude to Trees
- Left Child, Right Sibling
- Tree Application Programming Interface (API)
- Binary Search Tree in Pharo
- Restructuring Trees
- Self-Balancing Trees
- Null Object Pattern
- State Design Pattern
- Strategy Design Pattern
- M-ary trees
Personal Links
- GitHub: https://github.com/SmiljanaKnezev
- Twitter: https://twitter.com/KnezevSmiljana