Communicating the changes in Pharo 3.0

This is the most significant release of Pharo so far. The momentum is growing and Pharo is improving on many fronts. To exhibit this, we decided to augment the textual announcement with a visualization.

We wanted a visualization that is not only beautiful but also built with Pharo. The above picture was produced with the most recent version of CodeCity, created by Richard Wettel, which runs on top of Moose 5.0 and Pharo 3.0 (if you want to try the visualization, you can download the image).

The goal of the visualization is to show the amount and spread of changes throughout the system. Specifically, every city district is a package, every building is a class, and the red bricks represent the modified methods in Pharo 3.0.

I was asked several times how I built it. Here is a short description that covers both the technical details and some design decisions. Let’s proceed.

First, we need to obtain all the changes that happened in Pharo 3.0. This has two sides.

Given that CodeCity adds extra packages that we do not want to take into account, we need to identify the packages that belong to Pharo 3.0 itself. To this end, we use a heuristic and select all packages whose Monticello working copy has the Pharo 3.0 repository associated with it.

packages := RPackageOrganizer default packages select: [ :package |
     MCWorkingCopy allManagers
          detect: [ :each | each packageName = package name ]
          ifOne: [ :mc | mc repositoryGroup repositories anySatisfy: [ :each | each location = 'http://smalltalkhub.com/mc/Pharo/Pharo30/main' ] ]
          ifNone: [ false ] ].

Next, we need to identify all methods defined in the above packages that have been modified after the start of Pharo 3.0. For this, we use another heuristic (which I took from Sven Van Caekenberghe) and simply compare the method modification timestamp with the date on which the development of Pharo 3.0 started.

all := IdentityDictionary new.
packages do: [ :package |
     package methodReferences do: [ :m |
          all at: m compiledMethod put:
               ([ '2013-03-18T00:00:00' asDateAndTime <=
                  (DateAndTime fromMethodTimeStamp: m compiledMethod timeStamp) ]
                         on: Error
                         do: [ false ]) ] ].

Given that comparing timestamps is time-consuming and that this information will be used at rendering time, we compute it once and store it in a dictionary to speed up the visualization.
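
As a quick sanity check before rendering, the dictionary can be summarized. This is only a sketch: it assumes `all` has already been filled by the script above (compiled method as key, Boolean as value), and the name `changedCount` is introduced here purely for illustration.

```smalltalk
"Assumption: `all` maps each compiled method to true (changed) or false."
| changedCount |
changedCount := all values count: [ :isChanged | isChanged ].
"Answers an association: changed methods -> all methods considered."
changedCount -> all size
```

Inspecting the result gives a rough idea of how much red to expect in the city.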

Like any visualization engine in Moose, CodeCity offers a fluent API that is suitable for scripting. Using this API we can build nodes, specify shapes and define nesting (see below).

Talking about nesting, CodeCity is at its best when we get deep nesting. However, in Pharo, packages have almost no nesting. To improve the rendering, we can use another heuristic and split the package names by -. For example, something like AST-Interpreter-Core would produce 3 pseudo packages: AST, AST-Interpreter and AST-Interpreter-Core.

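allPackageStrings := packages collectAsSet: #name.
"Add every dash-separated prefix as a pseudo package, so that
 AST-Interpreter-Core also yields AST-Interpreter and AST."
allPackageStrings copy do: [ :each |
     | prefix dash |
     prefix := each.
     [ (dash := prefix lastIndexOf: $-) > 0 ] whileTrue: [
          prefix := prefix first: dash - 1.
          allPackageStrings add: prefix ] ].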
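
To see which names the splitting is meant to yield for a single package, here is a small standalone sketch that can be evaluated in a playground. The variable names `name` and `prefixes` are illustrative only and are not part of the script.

```smalltalk
"Collect a package name together with all of its dash-separated prefixes."
| name prefixes dash |
name := 'AST-Interpreter-Core'.
prefixes := OrderedCollection with: name.
[ (dash := name lastIndexOf: $-) > 0 ] whileTrue: [
     "Cut the name just before its last dash and remember the result."
     name := name first: dash - 1.
     prefixes add: name ].
prefixes
```

The resulting collection holds AST-Interpreter-Core, AST-Interpreter and AST, which are the three pseudo packages from the example.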

We are now ready to tackle the actual visualization. The script below builds a node for each package; inside each package it builds the nodes for the classes, and inside each class it builds the nodes for all of the class's methods. Afterwards, it defines the nesting of packages based on string matching.

builder := CCBuilder new.
builder shape platform color: Color white.
builder
     nodes: allPackageStrings
     forEach: [ :eachPackageName |
          | package |
          package := RPackageOrganizer default packageNamed: eachPackageName 
                                               ifAbsent: [ RPackage new ].
          builder shape platform
               color: Color lightGray.
          builder nodes: package definedClasses forEach: [ :class |
               builder shape box
                    color: [:m |
                         (all at: m ifAbsent: [ false ])
                              ifTrue: [ Color red ]
                              ifFalse: [ Color lightGray ]].
               builder nodes: (class methods sorted: [:a :b |
                                     (all at: b ifAbsent: [false]) ]).
               builder wallLayout ].
          builder packingLayout innerGap: 20 asCCPoint ].
builder nest: allPackageStrings node: #yourself in: [ :each |
     dash := each lastIndexOf: $-.
     dash > 0 ifTrue: [ each first: dash - 1 ] ].
builder packingLayout innerGap: 40 asCCPoint.
builder

Another complication of the visualization is the positioning of the changed methods. Because we want to convey the amount of change that touched the system, the changed methods need to appear on top of the other ones, so that the red is easy to spot when we look over the city. This is why the script sorts the methods of each class so that the changed ones come last.
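
The ordering idea can be tried in isolation. The sketch below uses hypothetical stand-ins (plain selectors instead of compiled methods, and a Set named `changed` instead of the dictionary of changes), and it assumes, as the sorting in the script suggests, that nodes later in the collection end up higher in the building. The sort block used here is a variation that also compares the first argument; it is not the block from the script.

```smalltalk
"Hypothetical data: pretend #parse and #visit were modified."
| changed ordered |
changed := Set withAll: #(#parse #visit).
"a may precede b when a is unchanged or b is changed."
ordered := #(#parse #name #visit #size) sorted: [ :a :b |
     (changed includes: a) not or: [ changed includes: b ] ].
ordered
```

The unchanged selectors come before the changed ones; the order within each group is not something this sketch relies on.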

An interesting side effect of this decision is that it also works beautifully when we use CodeCity's ability to observe the city from above (which in effect turns the 3D visualization into a treemap).

Job done. Pharo 3.0 has changed a lot.

This is indeed a large script. However, if you look closely, you will see that most of the code is about building data structures that are suitable for the visualization (detecting packages, querying changed methods, manufacturing nesting) rather than about specifying the visualization itself. Furthermore, the goal here is not to produce a reusable piece of code, but to produce a visualization that can have impact.

The exercise took no more than an hour, and it was not linear. For something like this, the interactivity offered by the GTPlayground is of critical importance. Thanks to the very short feedback loop, I managed to produce several intermediate visualizations and to spend most of my time tweaking the parameters until I found a suitable balance between the accuracy of the visualization and its goal.