After getting some things done, now I’m returning to Code Contracts. In the first parts of this series I gave you some basic information about Code Contracts, the static and dynamic checker, the code rewriter and how implications can be modelled. This day’s blog post comes up with another interesting aspect: method purity. There seems to be some misunderstanding on this term – I want to clear some clouds and show you what method purity means and which implications it has.
Code Contracts come with an attribute called [Pure]
for decorating class methods (and by which you can declare a method as being „pure“). Method purity is a quality, expressing that the method is free of observable side effects. This means, it does not change the observable state on any objects seen by callers in some way. For example, pure methods do neither modify properties of a class nor reference type method arguments. Even the modification of a private field (which is not seen by callers directly) is prohibited, if class methods work with this field and return values, that depend on it. In this case the change of the field is indirectly reflected by a changed method return value, thus the observable state has changed! To summarize, mostly pure methods are only allowed to change objects that have been created after method entry.
The following code shows the simple class Point
and gives you a short example of how to use Pure in declaration and pure method calls in contracts. The static and dynamic checkers will both be very happy with that 🙂 :
public class Point { public int X { [Pure] get; set; } public int Y { [Pure] get; set; } public Point() { } [Pure] public double GetDistance() { return Math.Sqrt(X * X + Y * Y); } public void SwapXY() { Contract.Ensures(Contract.OldValue<int>(X) == Y); Contract.Ensures(Contract.OldValue<int>(Y) == X); Contract.Ensures(Contract.OldValue<double>(GetDistance()) == GetDistance()); int temp = X; X = Y; Y = temp; } }
(Perhaps not the best example, because we have redundant information in code and contracts in SwapXY()
– that should not be the goal in most cases, but in this simple scenario it’s quite ok for demonstration purposes)
All methods that are called within a contract (pre-/postcondition etc.) must be declared pure. But why?
First it has a very certain reason: contracts like the Contract.Requires()
and Contract.Ensures()
assertions will not be evaluated in every execution and build configuration. For example, you can choose just to check pre-conditions, thus stepping over the post-conditions. I don’t need to explain, that methods with side effects in those post-conditions would be a messy thing. The execution results of your code would depend on the configuration of Code Contracts and if the contracts are evaluated – this must be prohibited and that’s why all methods called in contracts must be pure. Remember that Code Contracts is declarative code. It should not influence the behavior of your code, but just decorate it in some way.
Second, declaring methods as pure is generally a good thing. Beside of allowing you to use those methods in contract declarations, it furthermore helps the static checker in its task. Since method arguments can not be changed arbitrarily in pure methods, the static checker can assume that those objects will be the same after the method call.
Third (and for me that raises Code Contracts to an even higher level), declaring a method as pure can yield to a better software design and just clarifies your intent. You as programmer are able to express explicitly that your method does not have side effects. The intention of the method is revealed to callers (intention revealing), who can rely on the fact, that the state of the method’s class will be the same after calling the method as before. Thinking about declaring as many methods as possible as pure quickly brings us to command-query-separation, where you are trying to have exactly two types of methods. On the one handside the commands, which change class state and on the other handside queries, which calculate some value from current class state, but don’t modify it. Thus, queries are pure methods and with the [Pure] attribute you are able to explicitly express this intention. Dividing commands/queries and expressing queries with [Pure] is often (while not everytime) a good thing, since it leads to clear design and distinct intention and let you avoid side-effecting methods, which can be horrible code smells indeed (think of deep-nested side-effecting methods, for example).
Currently there is no component of Code Contracts which checks if methods declared pure are pure indeed. Thus if a programmer decorated a method with [Pure], it’s just believed. The Code Contracts team is working heavy on that, thus to come up with a purity checker in a future release.
(Update: thanks to AJ, I clarified my intent with the following section) Moreover one may ask: Are there any implications to automatic parallelism? The answer is: not directly. Parallelism is a hot topic and one could think, that method purity brings automatic parallelism of calls to such methods to life. And purity is indeed a strong characteristic of a method, but it’s not sufficient for the auto-par task. It’s just a one-way road, since it declares, that a method is side-effect-free (thus it doesn’t write any shared state). But in general a method can also read shared objects and depending on this it could return another value. Thus, parallelism of those method’s calls could run into problems (returning non-expected values), if the state is changed by other methods between subsequent calls. Perhaps there will be more attributes similar to [Pure] for such use cases in the future or other structures to handle that? I’m a believer…
So far for now on this topic. As you can see, decorating your methods in a declarative way with attributes like [Pure] can express your intent in a very smooth way. It helps you to clarify your design and it helps other programmers to know your intent – not just by code comments, but by checked documentation via Code Contracts.
„Are there any direct implications to parallelism? The answer is: no. … one could think, that method purity brings automatic parallelism of calls to such methods to life. But that’s not true.“
I guess I cannot agree with you here. Actually purity is a very important characteristic in the (research) area of automatic parallelization, i.e. letting the compiler or runtime figure out how to parallelize your code.
If you have three calls to pure functions in a sequence, some automat may decide to run them in parallel and without locks. Of course purity is neither sufficient to automating parallelism (e.g. if one method depends on the result of another), nor is parallelism the only end to purity (call rearranging may be done for different reasons). Anyway, dismissing the relationship is like saying „streets are not made for cars“ just because streets may also be used by moter cycles and car can drive off-road…
Hi AJ,
I didn’t want to dismiss purity as quality completely from the parallelism topic, so perhaps I have failed to express myself very well. What I wanted to say was, that while I see purity as important characteristic, I don’t see direct implications to the topic of auto-parallelism because of this „one side of the medal“ thing: A pure method is like „I don’t write shared state, but I can read it“. So I didn’t want to dismiss the relationship in general, but just wanted to say: there is no direct implication, that means an implication that brings auto-parallelism directly to life. If you couldn’t agree again, please let me know. Your opinion is appreciated everytime.
Thanks, Matthias
Hi Matthias,
thank your for this good discussion. It is a great example for demonstration how [Pure()] works. I have a small annotation. I think you don’t need decorate the property getters. Code Contract tools assume the property getters are pure.
Robert
I more pressing question for me is: what about LINQ to Objects. Are iterators impure? — they use IDisposable so I guess yes. That’s a bit of a shame though as they are quite useful for specifications too even if only applied in run-time checks in debug mode.