the importance of data types and units

2017/01/15

While working on the bachelor thesis I came across a hard to detect type of bug. It’s not one of the typical off-by-one, array-out-of-bound  or one of those I-inverted-the-indices-of-the-array bug. This one is harder to spot and is caused by a bad design which in turn makes it hard to spot the bug.

What happened

As all bugs the software behaves wrong. In this case I got some garbage numbers which just didn’t make sense. Basically the code is interpolating over a field of wind vectors and returns the interpolated wind vectors. The problem was, that the vector all seemingly looked more or less in the same direction. That was weird.

Bug type definition

I would call this bug type something like “mixed units bug” since I mixed degree and radian:

double lng = ((c.getLongitudeInRadian() - metadata.getNorthWestCorner().getLongitudeInRadian()) /
    metadata.getDeltaLng()) % 1;
double lat = ((c.getLatitudeInRadian() - metadata.getNorthWestCorner().getLatitudeInRadian()) /
    metadata.getDeltaLat()) % 1;
return new Double[] {lat,lng};

Next to the fact that this is not very nice code, I stumbled over a self made trap: getDeltaLng() and getDeltaLat() do not actually return a radian unit which you would usually use when doing math with spheres but a degree value. So since the radian value of a sphere traditionally would go from [0,2π) a difference 0.2 would make a lot more in radian than 0.2 in degree.

How to avoid this type of bugs

As usual: Try to think before you code unless you like to live with a unmanageable pile of junk code with no clear guideline and a lot of magic conversion in between. In the worst case you will end up converting a radian value again into radian which will give you some really unusable numbers.

The next obvious point is to use the same unit throughout the whole project. If you happen to come across that you may need angular values in degree and radian and coordinates on the sphere also in degree (mostly known as GPS), radian and even cartesian you better use an abstract data type which strictly forces you to set the type of unit when setting the values via setter-methods:

public class Coordinate {
    private static final double MIN_LONGITUDE = -180.0;
    private static final double MAX_LONGITUDE = 180.0;
    private static final double MIN_LATITUDE = -90.0;
    private static final double MAX_LATITUDE = 90.0;

    private BigDecimal longitude;
    private BigDecimal latitude;

    public final String toString() {
        return "Longitude: " + longitude + "°, Latitude: " + latitude;
    }

    public final double getLongitudeInDegree() {
        return longitude.doubleValue();
    }

    public final void setLongitudeInDegree(double l) {
        if (l > MAX_LONGITUDE || l <= MIN_LONGITUDE)
            throw new IllegalArgumentException("Longitude is out of range.");
        this.longitude = BigDecimal.valueOf(l);
    }

    public final double getLatitudeInDegree() {
        return latitude.doubleValue();
    }

    public final void setLatitudeInDegree(double l) {
        if (l > MAX_LATITUDE || l < MIN_LATITUDE)
            throw new IllegalArgumentException("Latitude is out of range.");
        this.latitude = BigDecimal.valueOf(l);
    }

    public final double getLongitudeInRadian() {
        return getLongitudeInDegree()/180*Math.PI;
    }

    public final void setLongitudeInRadian(double l) {
        setLongitudeInDegree(l/Math.PI*180);
    }

    public final double getLatitudeInRadian() {
        return getLatitudeInDegree()/180*Math.PI;
    }

    public final void setLatitudeInRadian(double l) {
        setLatitudeInDegree(l/Math.PI*180);
    }
}

As a programmer I don’t have to care anymore in what format these values are stored internally, I just have to indicate in what unit (or format) I want to set or get the value. In this way my code even gets better reusability since I can easily add other getter- and setter-methods later without breaking any old usage of this data type. In this case I could easily add a way to get and set the coordinate in cartesian values without refactoring other usage of this class.

Even better: As you may have noticed I built in some range checking of the values. In that way I can enforce and assert that the values will never under no circumstances provoke undefined behavior because they went out of range.

Closing words

Due to this lack of using strict typing and enforcing the developer to think about what he deals with or rather what unit he’s working with, I spent and lost a few hours tracking down a annoying and hard to get bug.