Final Report

Hello everyone. The last three months passed by really quickly and the Google Summer of Code (GSoC) 2019 is coming to an end. In this blog post (which is also a part of my final report) I will summarize the main results of my GSoC 2019 project titled New Collections for Pharo, my overall experience and what I have learned. I am really happy that I had a chance to participate in this GSoC and I can say that this was an incredible experience.

Introduction

As many of you already know, Pharo is an actively developed, elegant pure object-oriented programming language that comes with an immersive IDE that simplifies testing and debugging and can significantly reduce development time. Pharo also comes with a large set of existing collections (see following figure). There is an existing effort - The Containers modular library, to include many more new collections in Pharo. This project has attracted the attention of the Pharo community and I had a chance to work on it during this GSoC.

Goals

My project goal was to collect, clean test and document alternate collections as well as implement a few new ones (with appropriate tests and documentations) and include them in the library.

Trees are very important and useful data structures. They have many applications, ranging from computer science and language processing to 2D/3D image processing and rendering, and many more… After a consultation with my mentors and the Pharo community it was decided that I am to implement different Tree collections that can easily be extended and used by other users.

End Results

During this GSoC I worked on improving several different containers (e.g. Containers-Stack, Containers-Queue, Containers-Trie, Containers-OrderedMultiMap, and many others). I’ve fixed mistakes, cleaned and refactored the code, and included necessary documentation and tests. As part of the Containers-Tree library, I added new collections: Binary Trees, N-ary Trees, Binary Search Trees and AVL Trees. Detailed documentation and tests are provided for each new collection. All these collections follow the object-oriented paradigm and are designed to be easily extended and used by other users. In the next section I will describe all contributions in greater detail.

What has been done

Containers library is designed to be modular and as such each collection is independent of the others. During this GSoC I’ve worked on several different Containers.

Contribution Lists

The following list shows GitHub pull requests, each one containing several commits. Ticked boxes indicate that all the changes are merged into the base branch. Most of the commits have been included into the base branch, while a few still await to be merged (the ones that are not merged are marked as - under review). Changes made to the Containers-Tree project are more significant, and they are presented in a separate list.

Containers-Tree contributions:

Challenges Faced and Lessons Learned

One of the biggest challenges was designing a tree data structure that can easily be extended and reused for different applications. I started with implementing a rooted binary tree (and binary search tree), for detailed description see this blog post. After consulting with my mentors, we decided to do the things the Pharo way - i.e. redesign the trees using object oriented approach and design patterns and going from rooted representation to a composite one. This was a demanding task as I had a lot to learn.

Composite design pattern lets you treat a group of objects the same way as you would a single instance. In our case we can treat a node of a tree the same way as a subtree and consequently as an entire tree. This would let users (clients) treat nodes and compositions of nodes uniformly.

In our case, with a tree, there was no need to separate node class and a tree class. A tree could be represented as a single node and all the operations that can be performed on a tree you should be able to do on a single node and vice a versa. As a result a tree could be represented with one class. For more details see this blog post about restructuring trees.

In order to further improve the design and avoid unnecessary if-checks I implemented the Null object design pattern. This simplified the code significantly and made it more readable. For a detailed description about the Null Object see this blog post.

I summarize some key things that I’ve learned:

Code Base

As the Containers Library is a big project that is actively being developed, I provide links to the current state of the code at the end of the GSoC 2019. A few commit merges are still pending. For links to the active projects, please see Active Project Links in the Auxiliary Links section.


Overall Experience

I had a lot of fun during this GSoC but I have also learned a lot. The Pharo community is great and people are always eager to help. I am really happy I had a chance to be part of this community and I hope to continue my work on the Containers library. I can say that this was a very valuable experience.

I would like to sincerely thank my mentors - Stéphane Ducasse, Gordana Rakić, Guille Polito and Yurko Tymchuk for their help and support during this project.

Plans after GSoC 2019

Containers library is a big project that will be actively developed and maintained well after this GSoC has ended. I plan on continuing to contribute to the Containers library and other Pharo projects, and hope to be an active member of the open source community.

During the GSoC 2019 I contributed to following projects:

Biggest contribution was done to the project:

Blog

During the GSoC I wrote weekly blog posts about the things I learned and what I did. I include here a list of all the previous blog posts:

  1. New Collections for Pharo
  2. Pair Programming
  3. Pair Programming - Part II
  4. Baseline
  5. Comparing Ordered Dictionaries
  6. Prelude to Trees
  7. Left Child, Right Sibling
  8. Tree Application Programming Interface (API)
  9. Binary Search Tree in Pharo
  10. Restructuring Trees
  11. Self-Balancing Trees
  12. Null Object Pattern
  13. State Design Pattern
  14. Strategy Design Pattern
  15. M-ary trees