Demystifying software queries

During Moose trainings, I encourage participants to formulate problems that they would like to have checked in their system. Usually, they start with problems like finding unused methods or uncommented methods. These are generic queries and are easy to formulate because people have seen them before.

However, when left generic, they have no real practical value. For example, unused methods in the context of a framework or a library are not a problem at all. At the same time, missing comments are much more problematic in a public API than in an internal implementation.

One reason engineers do not formulate more specific questions is that they simply do not see themselves able to answer them. Once they see that the cost of implementing checks can be low, they do get more courageous, and gradually move towards more specific ones, such as: is there a direct dependency between component A and component B?

These types of questions capture more value because they are anchored in the context of the system. For example, if the framework depends on the client code, the engineer knows that immediate action is needed.

When I said courage, I did not mean it metaphorically. There is so much myth around software analysis that it does require courage to break the spell and go into the wilderness of software as data.

Let me tell you a short story. A couple of weeks ago, during a Moose training, I had the opportunity to meet a couple of truly courageous engineers. At some point, one of them got to formulate a problem, albeit in a rather quick and whispery voice:

We have two components A and B. Objects from A call objects from B, and they pass other A objects as parameters. And we do not want these B objects to directly set values in A objects. Can we get all methods from B that violate this constraint?

He knew it was a cool question, but he was also confident that it was for the gods to answer. Given that it was the end of the first training day and everyone was tired, I suggested that we just start describing the problem in more detail to see what it would take to possibly answer it. They agreed, and we proceeded.

After a short brainstorming, we identified a handful of pieces of information that we needed:

  1. distinguish classes from A and from B
  2. get methods from B that are being called from A
  3. get only those methods that receive an A as a parameter and call a setter method of this object
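
To make the target of the third item concrete, here is a minimal sketch of the kind of code the query is meant to catch. The classes AAccount (in component A) and BInterestCalculator (in component B) and all their methods are hypothetical and exist only for illustration; they are not taken from the analyzed system.

```smalltalk
"Hypothetical classes, for illustration only:
 AAccount belongs to component A, BInterestCalculator to component B."

AAccount>>balance
    "A getter defined in A."
    ^ balance

AAccount>>balance: aNumber
    "A setter defined in A."
    balance := aNumber

BInterestCalculator>>applyInterestTo: anAccount
    "Called from A, with an A object passed as a parameter.
     Sending the setter balance: to that parameter is the violation."
    anAccount balance: anAccount balance * 1.05
```

Here applyInterestTo: is a method of B that receives an A object as a parameter and calls a setter on it, so it should show up in the result.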

Identifying classes from A and from B was rather straightforward, and they had already done it for other queries. It simply required extending FAMIXType with testing methods:

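FAMIXType>>isA
    "Answer true if the fully qualified name places the type in component A."
    ^ '*::A::*' match: self mooseName
FAMIXType>>isB
    "Answer true if the fully qualified name places the type in component B."
    ^ '*::B::*' match: self mooseName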

Then we needed a means to check which methods from B are being called from A classes. This was slightly more difficult:

"All methods defined in classes of B."
bMethods := (model allClasses select: #isB) flatCollect: #methods.
"Keep only those invoked from at least one class of A."
bMethodsInvokedFromA := bMethods select: [:each |
    each invokingClasses anySatisfy: #isA ].
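
The same building blocks can also address the simpler question mentioned earlier: is there a direct dependency between two components? The following is a minimal sketch, not a definitive check. It assumes the isA and isB extensions above and a loaded model, and it only looks at invocations, ignoring inheritance and attribute references. The variable names aMethods and aMethodsInvokedFromB are hypothetical.

```smalltalk
"All methods defined in classes of A."
aMethods := (model allClasses select: #isA) flatCollect: #methods.
"Methods of A that are invoked from at least one class of B."
aMethodsInvokedFromB := aMethods select: [:each |
    each invokingClasses anySatisfy: #isB ].
"True if B calls into A at least once."
aMethodsInvokedFromB notEmpty
```

Inspecting aMethodsInvokedFromB rather than just the boolean shows where the dependency comes from.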

The tricky part was to obtain only those methods that receive an A object as a parameter and call a setter method of this object. It was tricky because they needed to correlate information from multiple sources. At this point, I re-explained the invocation object provided by FAMIX, the central source code meta-model in Moose. I used the picture below.

Invocation.png

An invocation represents the call relationship between methods. It has knowledge about the sender method, the receiver method, and the variable that receives the message being called. Thus, in the end, the query became:

"B methods that call a setter defined in A on a variable that is a parameter."
problematicMethods := bMethodsInvokedFromA select: [:each |
    each outgoingInvocations anySatisfy: [:inv |
        inv to parentType isA and: [
            inv to isSetter and: [
                inv receivingVariable isParameter]]]].
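
When a combined query like this returns a surprising result, it can help to apply the conditions one at a time and look at what survives each step. The sketch below assumes bMethodsInvokedFromA and isA are defined as above and reuses only the selectors that appear in the query. The temporary variable names are hypothetical.

```smalltalk
| callsIntoA setterCallsIntoA setterCallsOnParameters |
"Invocations sent from the B methods to methods defined in A."
callsIntoA := (bMethodsInvokedFromA flatCollect: #outgoingInvocations)
    select: [:inv | inv to parentType isA ].
"Of those, only the ones that target a setter."
setterCallsIntoA := callsIntoA select: [:inv | inv to isSetter ].
"Of those, only the ones whose receiving variable is a parameter."
setterCallsOnParameters := setterCallsIntoA
    select: [:inv | inv receivingVariable isParameter ].
"How many invocations survive each filter."
{ callsIntoA size. setterCallsIntoA size. setterCallsOnParameters size }
```

If the last number is zero while the second is not, setters of A are being called from B, but not on parameters, which narrows down where to look next.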

It took us 30 minutes to decompose the problem, and by the time we were done, the query was ready as well. The difficult bit consisted in understanding the invocation object.

This was indeed a tricky problem, but Moose makes it doable in a short amount of time. The solution is not trivial, yet once you go through it, it becomes accessible.

The essence is to dare formulate. And when armed with the right infrastructure, everything else follows.

Posted by Tudor Girba at 22 October 2011, 10:28 am with tags assessment, story, spike, moose