Saturday, November 28, 2009

epiphany: the path to null-safety is the zero-method value object

Or, The Zen Of Zero-Method Value Objects...

I had an epiphany the other day.

The path to null-safety is the zero-method value object.

Rationale: an interface that defines no methods cannot lead to a call site that throws a NullPointerException, for the simple reason that if there is no method, there can be no call. If there is no call, there can be no NullPointerException.

Example 1: What you can't call, can't blow up


interface DangerousDateRange
{  
    boolean contains(Date operand);
}
DangerousDateRange dateRange = ...;
...
dateRange.contains(date); // legal, but potentially unsafe

interface SafeDateRange
{
}
SafeDateRange safeDateRange = ...;
...
safeDateRange.contains(date); // illegal, but very safe

OK, a trivial point, you say. But what use is a zero-method value object? How useful can that really be? After all, how can a date range answer the contains question if it has no contains() method in the first place?

How useful? Very.

A zero-method object can still be the container of state, namely the value.

Operations can be performed on another interface. The interface can be well defined, even for the null reference.

Example 2: Making a zero-method value interface useful


interface SafeDateRange
{
   // here is where the value contract goes ...

   interface Ops
   {
       boolean contains(Date date);  
       SafeDateRange asValue();
   };

   // here is where we store state ...

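   // NOTE: Java makes member classes of an interface implicitly public, so
   // "private" is pseudocode here; real code would hide Impl some other way,
   // e.g. as a package-private class.
   private class Impl implements SafeDateRange
   {
        private final Ops ops;

        private Impl(final Date lower, final Date upper)
        {
            ops =
                new Ops()
                {
                    public boolean contains(Date date)
                    {
                       // ... whatever
                    }

                    public SafeDateRange asValue()
                    {
                         return Impl.this;
                    }
                };
        }

        Ops asOps()
        {
                return ops;
        }
   }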

   // this is how we generate new values

   BinaryOp<Date, Date, Ops> NEW = new BinaryOp<Date, Date, Ops>()
   {
         // Impl is a value, not an Ops, so hand back its Ops view
         public Ops apply(final Date lower, final Date upper) { return new Impl(lower, upper).asOps(); }
   };

   // and this is how we provide access to the behaviour
   // in a null-safe manner

   UnaryOp<SafeDateRange, Ops> OPS = new UnaryOp<SafeDateRange, Ops>()
   {
        public Ops apply(final SafeDateRange value)
        {
             if (value instanceof Impl)
             {
                 return ((Impl)value).asOps();
             }
             else
             {
                 // and this is how we make it null-safe

                 return new SafeDateRange.Ops()
                 {
                     public boolean contains(Date date)
                     {
                         return true;
                     }

                     public SafeDateRange asValue()
                     {
                         return null;
                     }
                 };
             }
         }
   };

}
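
For readers who want something that compiles, here is a minimal sketch of Example 2 under a few assumptions that are not in the original post: UnaryOp and BinaryOp are declared locally as stand-ins for the library's function interfaces, the implementation is hidden as a package-private class named SafeDateRangeImpl (a hypothetical name), bounds are treated as inclusive, and a null bound is treated as "unbounded". Treat it as one possible reading of the idea rather than the library's actual code.

```java
import java.util.Date;

// Demo first, so the file can be run directly with the single-file launcher.
class SafeDateRangeDemo {
    public static void main(String[] args) {
        Date lower = new Date(1_000L), upper = new Date(2_000L), test = new Date(1_500L);
        SafeDateRange range = SafeDateRange.NEW.apply(lower, upper).asValue();
        System.out.println(SafeDateRange.OPS.apply(range).contains(test)); // true
        System.out.println(SafeDateRange.OPS.apply(null).contains(test));  // true
    }
}

// Assumed stand-ins for the post's UnaryOp and BinaryOp interfaces.
interface UnaryOp<A, R> { R apply(A a); }
interface BinaryOp<A, B, R> { R apply(A a, B b); }

// The zero-method value interface: clients can hold one but cannot call it.
interface SafeDateRange {

    interface Ops {
        boolean contains(Date date);
        SafeDateRange asValue();
    }

    // How new values are generated.
    BinaryOp<Date, Date, Ops> NEW =
        (lower, upper) -> new SafeDateRangeImpl(lower, upper).asOps();

    // Null-safe access to the behaviour: never returns null.
    UnaryOp<SafeDateRange, Ops> OPS = value -> {
        if (value instanceof SafeDateRangeImpl) {
            return ((SafeDateRangeImpl) value).asOps();
        }
        // Null (or foreign) value: behave like a range with no bounds.
        return new Ops() {
            @Override public boolean contains(Date date) { return true; }
            @Override public SafeDateRange asValue() { return null; }
        };
    };
}

// Hypothetical hidden implementation; package-private instead of "private".
final class SafeDateRangeImpl implements SafeDateRange {
    private final SafeDateRange.Ops ops;

    SafeDateRangeImpl(final Date lower, final Date upper) {
        ops = new SafeDateRange.Ops() {
            @Override public boolean contains(Date date) {
                // Assumption: inclusive bounds; a null bound is always satisfied.
                boolean aboveLower = lower == null || (date != null && !date.before(lower));
                boolean belowUpper = upper == null || (date != null && !date.after(upper));
                return aboveLower && belowUpper;
            }
            @Override public SafeDateRange asValue() { return SafeDateRangeImpl.this; }
        };
    }

    SafeDateRange.Ops asOps() { return ops; }
}
```

Code outside the package can hold a SafeDateRange and pass it to OPS, but it has no way to name the implementation class, which is the effect the "private" modifier was reaching for.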

Example 3: Using A Null Safe Date Range



SafeDateRange safeDateRange = SafeDateRange.NEW.apply(lower, upper).asValue();
SafeDateRange.Impl impl = null; //illegal - can't declare impl

SafeDateRange.OPS.apply(safeDateRange).contains(test); // safe, legal, well-defined
SafeDateRange.OPS.apply(null).contains(test); // safe, legal, well-defined

impl.contains(test);    // illegal: impl can't have been declared in the first place

Example 4: Caveat: There will always be idiots...


class Foo
{
    SafeDateRange     safe;    // always do this
    SafeDateRange.Ops unsafe;  // almost never do this 
}

Foo foolish = new Foo();

...
SafeDateRange.OPS.apply(foolish.safe).contains(test); // legal, safe
foolish.unsafe.contains(test);                        // legal, unsafe


Yep, there will always be idiots. But hopefully you can teach idiots that declaring long-lived references (e.g. member variables) to SafeDateRange.Ops is almost always a bad thing. Exception: if you can guarantee, by construction, that a reference to SafeDateRange.Ops is never null, you can relax this rule.

Update: Thinking about this further, declaring an Ops reference is fine (and, for performance reasons, probably preferable), but you have to ensure that it is never null to use it safely. See my later post for a discussion of this.
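
One way to read "never null, by construction" is to normalise the reference once, in the constructor, and keep the field final. The sketch below is only an illustration: RangeOps is a simplified, hypothetical stand-in for SafeDateRange.Ops, and Booking and UNBOUNDED are names invented for this example.

```java
import java.util.Date;

// Hypothetical holder of a long-lived ops reference.
final class Booking {
    // Safe to keep: final, and only ever assigned a non-null value below.
    private final RangeOps ops;

    Booking(RangeOps ops) {
        // Normalise null once, at the boundary, instead of at every call site.
        this.ops = (ops != null) ? ops : RangeOps.UNBOUNDED;
    }

    boolean covers(Date date) {
        return ops.contains(date); // ops cannot be null here
    }

    public static void main(String[] args) {
        System.out.println(new Booking(null).covers(new Date(0L))); // true
    }
}

// Simplified, hypothetical stand-in for SafeDateRange.Ops.
interface RangeOps {
    boolean contains(Date date);

    // Null object: answers like the Ops returned for a null value.
    RangeOps UNBOUNDED = date -> true;
}
```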

Credit Where Credit Is Due

I am almost certain that I am not the first to discover the zen of zero-method value objects, so if you have seen this or a similar idea written up before, please let me know and I will include a reference to it.

It would not surprise me, for example, if this is very similar to one of Tony Morris's ideas.

Update:

As suspected, this is something addressed by Tony Morris's Functional Java, as per this response from him:


Hi Jon,
What you are discovering has this catamorphism:
λa.λx. (a -> x) -> x -> x

To translate that to Java:

interface Function<X, Y> { Y apply(X x); }
interface Thunk<X> { X apply(); }
interface Nullable<A> {
  <X> X cata(Function<A, X> s, Thunk<X> n);
}

It's available in Functional Java

Many (non-total) languages do not have null.
And this:
Haskell calls it Maybe, ML calls it option, Scala calls it Option,
Functional Java calls it Option. Note that it (the type constructor)
is a covariant functor and a monad. The catamorphism is the important
part. You can use languages that simply don't have null, or assume
language subsets without it (many Java users do). You can also write
in languages that guarantee termination i.e. all total functions (with
some sacrifices of course) e.g. Agda, Coq, Isabelle.
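
To make the catamorphism concrete, here is a small sketch of the Nullable interface from the reply above. It is an assumption-laden illustration, not Functional Java's API: it swaps in java.util.function.Function and Supplier for the Function and Thunk interfaces in the reply, and the factory method of is invented for this example.

```java
import java.util.function.Function;
import java.util.function.Supplier;

class NullableDemo {
    public static void main(String[] args) {
        String missing = null;
        // The caller must say what happens in both cases; no dereference of null.
        int a = Nullable.of("epiphany").cata(String::length, () -> 0);
        int b = Nullable.of(missing).cata(String::length, () -> 0);
        System.out.println(a + " " + b); // 8 0
    }
}

interface Nullable<A> {
    // The catamorphism: (a -> x) -> x -> x, with the second x passed lazily.
    <X> X cata(Function<A, X> some, Supplier<X> none);

    // Hypothetical factory: wraps a possibly-null reference.
    static <A> Nullable<A> of(final A value) {
        return new Nullable<A>() {
            @Override
            public <X> X cata(Function<A, X> some, Supplier<X> none) {
                return value != null ? some.apply(value) : none.get();
            }
        };
    }
}
```

Like the zero-method value object, Nullable exposes nothing that can be called on the wrapped value directly; the only way in is cata, which forces the null case to be handled.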

6 Comments:

Anonymous The Other One said...

It seems like a lot of work just to avoid those null pointer exceptions.

Dereferencing NULL is just the computer's way of telling you that you did something wrong (well, telling the user that the programmer did something wrong).

I think it would be more useful to build robust interfaces which are easier to use and so less likely to cause programmers to write bad code which could cause NULL dereferencing.

30 November 2009 at 16:22  
Blogger Jon Seymour said...

The best way to explain why it is worth it is with an example.

Suppose you have 3 dates: two that specify a date range and one that specifies a test date, and you want to test whether the test date is within the range. Further suppose that the test date satisfies the range criteria if all of the specified bounds are satisfied. That is, if the lower bound is not specified, then the test date is in the range if it is less than the upper bound, etc.

Further assume that any of these 3 dates can be null.

This is a nightmare to implement correctly in the presence of nulls.

With my library you can express this test as:

Date.NEW.apply(lower, upper).contains(test);

It's that simple.

This is possible because unspecified bounds or dates get mapped to special values (MINIMUM_LOWER_BOUND, MAXIMUM_UPPER_BOUND or UNSPECIFIED_DATE) whose Option interfaces (called Ops in my post) answer set membership questions sensibly.

Trying to code that question neatly with java.util.Dates and not getting it wrong is very, very tricky.

It is really hard to get this wrong. It is completely null safe, there are no conditions.

Date.NEW.apply(lower, upper).contains(test);

jon.

30 November 2009 at 20:26  
Blogger Andres N. Kievsky said...

Objective-C just ignores the method call if you try to perform it on a NULL reference. That is so awesome and useful, it even becomes an idiom. I don't know why the Sun geeks didn't bother copying that - they did copy other stuff, though, like interfaces.

30 November 2009 at 20:41  
Blogger Jon Seymour said...

Jaw drops.

How on earth do you even know something is broken?

jon.

30 November 2009 at 21:48  
Blogger Jon Seymour said...

Although, that said, one could argue that if you didn't intend

Date.NEW.apply(null, null).contains(notnull)

to be interpreted as "is a finite date a member of the infinite interval?", then you would be as clueless as the Objective-C user who doesn't get a null pointer exception.

30 November 2009 at 21:50  
Blogger Andres N. Kievsky said...

Interestingly, in Obj-C the situation is somewhat similar to Haskell's Maybe.

The built-in data structures (NSArray, etc) don't store NULLs or primary values. They only take objects - there's a special object for NULL (NSNull), and another special object for ints, reals, numbers, etc. So you can't send an int directly, you have to hand it wrapped up. Which is similar to what you have found, and makes life easier.

As they say, all problems in CS are solved by adding a layer of indirection... *grin*

30 November 2009 at 22:50  
