[Geojson] Binary GeoJSON

Arlo Belshee Arlo.Belshee at microsoft.com
Fri Feb 3 09:35:40 PST 2012


A thing to keep in mind: why are you doing this, and will your implementation get the results you're after? I mention this because we just went through a similar process with OData, and decided not to do binary. Instead, we gzip the text Json and clean up the text Json representation.

Your goals may be very different from ours, but make sure that you know what they are and that you keep them clearly in mind. There are downsides to multiple formats (fragmenting the ecosystem), and you don't want to take them on unless you're also gaining something important.

With OData, our goals were:

* Not sacrifice readability; improve it where we can
* Decrease power consumption and latency for consuming OData payloads from mobile devices
* Still support the wide ecosystem of clients and servers (either endianness, all native bit sizes from 8 to 64)

The third point rules out relying on any of the really fast binary parsing techniques (such as memory overlays). No matter which binary representation we choose, a big chunk of the market will not be able to parse it by direct memory mapping.
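
To make the portability problem concrete, here is a minimal Python sketch. The function names are hypothetical, and `struct` only stands in for a C-style memory overlay. It reads the same four bytes once in the byte order the writer used and once in the opposite order, which is effectively what an overlay on an opposite-endian device would see:

```python
import struct

def write_count_little_endian(count: int) -> bytes:
    # Writer side: a 32-bit unsigned integer laid out little-endian.
    return struct.pack("<I", count)

def read_as(buf: bytes, byte_order: str) -> int:
    # byte_order is "<" (little-endian) or ">" (big-endian).
    return struct.unpack(byte_order + "I", buf)[0]

payload = write_count_little_endian(1)
same_order = read_as(payload, "<")      # 1
opposite_order = read_as(payload, ">")  # 16777216, i.e. 0x01000000
```

A portable parser has to decode field by field with an explicit byte order, as `read_as(payload, "<")` does, and that per-field work is what gives up the speed of the overlay.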

Given that, we then measured the CPU usage and wall-clock time to parse a binary format that was not memory-mappable, vs the same data in textual Json. We also wrote a new textual Json format, where we eliminated duplication and redundant metadata. Both the binary format and the new textual format contained the same information.
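
A comparison of this kind can be sketched roughly as follows. This is not the harness we used for OData; the binary layout, the function names, and the sample data are all assumptions made up for illustration. The point is only that both encodings carry the same information and that CPU time and wall-clock time are recorded separately:

```python
import json
import struct
import time

def encode_text(points):
    return json.dumps(points, separators=(",", ":")).encode("utf-8")

def encode_binary(points):
    # Hypothetical layout: uint32 count, then little-endian (x, y) doubles.
    out = [struct.pack("<I", len(points))]
    out.extend(struct.pack("<dd", x, y) for x, y in points)
    return b"".join(out)

def parse_text(buf):
    return [tuple(p) for p in json.loads(buf.decode("utf-8"))]

def parse_binary(buf):
    # Field-by-field decode with an explicit byte order (no memory overlay).
    (count,) = struct.unpack_from("<I", buf, 0)
    return list(struct.iter_unpack("<dd", buf[4:4 + 16 * count]))

def measure(parse, payload, repeats=50):
    """Return (cpu_seconds, wall_seconds) for `repeats` parses of payload."""
    cpu_start = time.process_time()
    wall_start = time.perf_counter()
    for _ in range(repeats):
        parse(payload)
    return (time.process_time() - cpu_start,
            time.perf_counter() - wall_start)

points = [(i * 0.25, i * -0.5) for i in range(2000)]
text_cpu, text_wall = measure(parse_text, encode_text(points))
bin_cpu, bin_wall = measure(parse_binary, encode_binary(points))
```

The numbers you get will depend heavily on the runtime, the device, and the shape of your data, so measure on the hardware you actually care about.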

In our case, we found that the textual one was about 10% slower than the binary, both in CPU usage and wall-clock time. The old Json was 100% slower than our new "light" format, and the old Atom format was 250% slower than the light Json.

Thus, going to binary would have gained us a small advantage over just making a better Json format, but not enough to be worth decreasing readability. The light Json format reduced battery usage by 50% from our previous fastest; getting it up to a 55% improvement just isn't important enough.

For latency, all that matters is the number of bits sent to the client. Really, all that matters is how long it takes a cell phone to do TCP negotiation, but after that it's bit count. The textual Json zipped to within 2% of the size of the zipped binary. So they were identical (within variability of measurement) for download time.
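
If you want to run the same size check on your own data, a rough sketch looks like this. The packed-doubles layout and the random sample coordinates are assumptions for illustration, not the OData formats or our measurement:

```python
import gzip
import json
import random
import struct

def compressed_sizes(points):
    """Byte counts for text and binary encodings, raw and gzipped."""
    text = json.dumps(points, separators=(",", ":")).encode("utf-8")
    # Hypothetical binary layout: consecutive little-endian (x, y) doubles.
    binary = b"".join(struct.pack("<dd", x, y) for x, y in points)
    return {
        "text_raw": len(text),
        "text_gzip": len(gzip.compress(text)),
        "binary_raw": len(binary),
        "binary_gzip": len(gzip.compress(binary)),
    }

rng = random.Random(0)  # fixed seed so repeated runs see the same data
points = [(round(rng.uniform(-180, 180), 6), round(rng.uniform(-90, 90), 6))
          for _ in range(5000)]
sizes = compressed_sizes(points)
# Ratio of gzipped text to gzipped binary; 1.0 would mean identical size.
ratio = sizes["text_gzip"] / sizes["binary_gzip"]
```

How close `ratio` comes to 1.0 depends on the data. Random coordinates like these are harder on the text encoding than real payloads with repeated structure, so try it with representative data before drawing conclusions.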

Thus, our solution to binary: lighter textual Json + gzip.

I'm not saying that this will be your best outcome. I'm just saying that we entered the process assuming that we were going to do an efficient binary format. Fortunately, we clarified the goal to be an efficient format, binary or otherwise, as measured by latency and battery usage. That let us find that our best option wasn't actually binary at all.

Arlo

-----Original Message-----
From: geojson-bounces at lists.geojson.org [mailto:geojson-bounces at lists.geojson.org] On Behalf Of Reed Underwood
Sent: Thursday, February 02, 2012 5:38 PM
To: geojson at lists.geojson.org
Subject: [Geojson] Binary GeoJSON

Hello,

I'm interested in developing a spec for a binary equivalent to GeoJSON. If there are efforts along these lines already in motion, please let me know. If you're interested in collaborating on something like this, please let me know.

I plan on building a model implementation in C and would like to move on this sooner rather than later.

Cheers,
Reed Underwood
Boston University
_______________________________________________
Geojson mailing list
Geojson at lists.geojson.org
http://lists.geojson.org/listinfo.cgi/geojson-geojson.org







