[Geojson] Interpretation of "extra" coordinate dimensions
Arlo.Belshee at microsoft.com
Wed Aug 31 12:50:44 PDT 2011
These situations are fundamentally observational data. With observational data, re-use and clarity are maximized when you can pass around "observations." An observation is, effectively, a struct containing coordinates on spatial, temporal, and observational axes.
And often we don't just want point observations, we want to observe shapes (eg, a thermocline is a surface, not a point).
Splitting the observations into multiple features further exacerbates the data de-sync problem. Now, we no longer have the highly-coupled data even on the same object.
The measurements belong at a location. In fact, their being associated with that position is the most important aspect of either them or the position (in the case of the maritime scientist, the boat's path is only interesting in that it ties together the measurements, and the measurements are only interesting in that they are spatially located).
With these measurements on the coordinate directly, we're representing this tightly-coupled 1:1 relationship effectively as multiple columns in the same table, with values per record (stretching a metaphor here, but I think it applies). The observational coordinate system is effectively a single data table, which keeps everything together that should be together.
When we move the measurements to an array on the enclosing feature, we're representing tightly-coupled 1:1 data via a many-to-many join (using the feature as a join table), with position in sequence as the join function. There are now a ton of data sync errors that could happen. If we accidentally alter one array without making the same change to the other, all of our data become invalid. Worse yet, we can't even tell that they became invalid.
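A minimal sketch of that failure mode, in Python. The positions, the speed values, and all variable names are made up for illustration; they are not from any real data set or library.

```python
# Hypothetical sample data: three control points and one speed reading per point.
positions = [[102.0, 0.0], [102.0, 3.5], [102.0, 7.8]]
speeds = [3.1, 3.5, 3.9]

# Parallel arrays: a post-processing step drops the first position but
# forgets the matching speed. Nothing fails; zip() stops at the shorter list,
# so every remaining position is silently paired with the wrong speed.
filtered_positions = positions[1:]
paired = list(zip(filtered_positions, speeds))

# Inline layout: each observation is [x, y, speed], so dropping a point
# drops its measurement with it and the rest stay correctly paired.
observations = [p + [s] for p, s in zip(positions, speeds)]
filtered_observations = observations[1:]
```

In the parallel-array case the point at y = 3.5 ends up with the speed that was measured at y = 0.0, and nothing in the data signals that anything went wrong.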
This happens in the real world. For example, a particular measuring device may fail at one measurement point, or post-processing may discard a particular data element. Unless we foresee those problems and extend the protocol to handle them (eg, define the meaning of NULLs and empties in the various ancillary arrays), we end up destroying our data.
As soon as we break the data items out to a different feature and then hold that feature by reference, we further extend the types of problems we can have. Now, not only are we representing 1:1, tightly related data via a many:many relationship, we're explicitly removing all transactions. We can now update one feature without even knowing the other exists, making it yet easier to corrupt the data.
Now, I'm not saying that all data should be put in the coordinates. There are downsides to doing so.
However, for data that is fundamentally observational, it can make a ton of sense to put the observation coordinates right alongside the positional and temporal coordinates in the same observation. That's why GML defines all three types of coordinate systems and explicitly allows a data set to say which coordinates it's using for its data.
GeoJson doesn't go crazy and support everything the way that GML does. It doesn't make the payload attempt to describe not only data but rules for interpretation of those data. However, it is nice that it can work for observational and spatio-temporal data as well as just simple spatial data.
From: evan.bowling at gmail.com [mailto:evan.bowling at gmail.com] On Behalf Of Evan James Bowling
Sent: Wednesday, August 31, 2011 11:18 AM
To: Arlo Belshee
Cc: Daniel Azuma; geojson at lists.geojson.org
Subject: Re: [Geojson] Interpretation of "extra" coordinate dimensions
Thank you for the examples Arlo, I'm not very familiar with such uses. While I agree that it will be commonplace for non-trivial uses of GeoJSON to require additional meta-data for each coordinate, I still feel that a separation of the actual geometry and meta-data will improve re-use and clarity. In some ways, parsers could be simplified as well.
I recalled the following related experience:
Using ArcGIS you can create additional meta-data for each geographic feature such as a point or polygon in the attributes table of a shapefile. You can also join a table of data to a list of features, and add any number of columns to these tables to include additional ids, titles, etc.
I feel that allowing meta-data to be stored within the "coordinate" property is inappropriate. Instead, I would want to see a separate object referencing a feature id
// Modified example from earlier mail:
{
  "id": 1,
  "coordinates": [[102.0, 0.0, 0.0, "2011-03-29T08:38:50Z"], [102.0, 3.5, 0.0, "2011-03-29T09:38:50Z"], [102.0, 7.8, 0.0, "2011-03-29T10:38:50Z"]]
}
{
  "featureId": 1,
  "metadata": [[3.54, 39.80], [3.54, 39.80], [3.54, 39.80]]
}
However, the alternative is much more concise. Assuming the meta data is added properly, the pure geometry could easily be extracted by other applications. That said, different applications might try to interpret the extra data in different ways (e.g. a user might not be able to tell the difference between two uses, and send the wrong file to a web app).
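For comparison, here is a rough Python sketch of what a consumer of the two-object layout above would have to do to put the pieces back together. The "featureId" and "metadata" keys come from the example in this mail, not from the GeoJSON spec, and join_metadata is a hypothetical helper name.

```python
# The two objects from the example, as Python dicts.
geometry = {
    "id": 1,
    "coordinates": [
        [102.0, 0.0, 0.0, "2011-03-29T08:38:50Z"],
        [102.0, 3.5, 0.0, "2011-03-29T09:38:50Z"],
        [102.0, 7.8, 0.0, "2011-03-29T10:38:50Z"],
    ],
}
metadata = {
    "featureId": 1,
    "metadata": [[3.54, 39.80], [3.54, 39.80], [3.54, 39.80]],
}


def join_metadata(geometry, metadata):
    """Pair each coordinate with its metadata row, joining by index."""
    if geometry["id"] != metadata["featureId"]:
        raise ValueError("metadata refers to a different feature")
    coords = geometry["coordinates"]
    rows = metadata["metadata"]
    # A length check catches a dropped row, but not rows that were reordered.
    if len(coords) != len(rows):
        raise ValueError("coordinate and metadata arrays differ in length")
    return [list(c) + list(r) for c, r in zip(coords, rows)]


joined = join_metadata(geometry, metadata)
```

The id check and the length check are the only safeguards available in this layout; everything else rests on both arrays having been kept in the same order.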
On Wed, Aug 31, 2011 at 12:34 PM, Arlo Belshee <Arlo.Belshee at microsoft.com> wrote:
I am also not a GeoJson spec author. I'm fairly new to this protocol. So please don't take what I say as implying what was in the heads of the authors as they were writing it. However, I see some strong advantages of the current design.
The need for extra dimensions shows up commonly in scientific and maritime navigational data. It probably shows up in other places too, but those are the ones I know off the top of my head.
In both cases, you have a feature that has a spatial property that is not just a point. For example, it could be a linestring that the boat is travelling. There are data items that you want to associate not with the path as a whole, but with each control point.
In navigation, you get cases like Daniel's: at each point in the path, you want to store not just its location, but the bearing, speed, water depth, and other values to aid in navigating the boat.
You don't want to store these at the feature level, because you need them directly mapped to the control points in your path. You could have them as separate arrays, with cross-correlation by index, but this increases the chance of off-by-one errors resulting in crashed boats.
Similarly, marine scientific measurements often include measuring a whole bunch of different things at different locations. Again, these may be one feature per measurement point, but often are not. It is very reasonable for a feature to represent a particular journey, a particular body of water, or the like.
For example, you might have a feature "Mississippi River." It has a geographic property that marks the position of the river, with measurements of flow & temperature at each location.
All of these are advanced uses of geospatial data. GeoJson could well choose to not support them, in order to "better" support simpler uses. But a number of simpler uses just naturally extend to advanced uses. It's nice to not have to change format when that happens.
From: geojson-bounces at lists.geojson.org [mailto:geojson-bounces at lists.geojson.org] On Behalf Of Evan James Bowling
Sent: Wednesday, August 31, 2011 10:18 AM
To: Daniel Azuma
Cc: geojson at lists.geojson.org
Subject: Re: [Geojson] Interpretation of "extra" coordinate dimensions
1. The GeoJSON spec should specifically forbid any dimensions beyond XYZM until enough practical applications can be brought up to merit the flexibility.
2. JSON in general is used for transferring small-medium datasets. Locking down the spec on this detail can help data sets to remain readable as well as self-describing.
3. All of the uses of GeoJSON I have come across would be functional with the limited number of dimensions.
4. Perhaps a more general spec can be laid out for representing more types of spatial information, and GeoJSON would be a more restricted version using similar property names. There are certainly a lot of new web applications that could benefit from a shared base data structure such as games, music notation (vexflow), white boarding tools, and mind maps.
Thanks for bringing this up Daniel!
On Wed, Aug 31, 2011 at 12:03 PM, Daniel Azuma <dazuma at alumni.caltech.edu> wrote:
Just joined the list, so a quick intro: My name is Daniel Azuma. I wrote and maintain a GeoJSON builder/parser for Ruby called rgeo-geojson. (http://virtuoso.rubyforge.org/rgeo-geojson/) It parses GeoJSON into objects built by RGeo, which is a Ruby implementation of the OGC SF spec.
Right now my parser ignores and throws away any "extra" dimensions beyond XYZ(M), because the SF spec (and hence RGeo) doesn't have any notion of dimensions beyond XYZM. However, I notice that the GeoJSON spec does allow for any number of dimensions in a coordinate. I wanted to ask what is generally expected of parsers when dealing with such "extra" dimensions, whether they should be considered meaningful.
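To make the current behavior concrete, here is a rough sketch in Python (not the actual rgeo-geojson code, which is Ruby) of a parser step that discards everything past the first three values of each position. The function name and the max_dims parameter are hypothetical.

```python
def truncate_coordinates(coords, max_dims=3):
    """Drop values beyond max_dims from every position in a coordinates array.

    Assumes a position is a list whose first element is a number; any other
    list is treated as a nested array (LineString, Polygon ring, and so on).
    """
    if coords and isinstance(coords[0], (int, float)):
        return list(coords[:max_dims])
    return [truncate_coordinates(c, max_dims) for c in coords]


point = [102.0, 0.0, 0.0, "2011-03-29T08:38:50Z", 3.54, 39.80]
line = [
    [102.0, 0.0, 0.0, "2011-03-29T08:38:50Z"],
    [102.0, 3.5, 0.0, "2011-03-29T09:38:50Z"],
]

xyz_point = truncate_coordinates(point)
xyz_line = truncate_coordinates(line)
```

Whatever the extra values meant to the producer, they are gone after this step, which is why the intended interpretation matters.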
I ask because I have a user who wants to use extra dimensions to store metadata associated with point coordinates. That is, he wants to do this:
"coordinates": [102.0, 0.0, 0.0, "2011-03-29T08:38:50Z", 3.54, 39.80]
where the coordinates are X, Y, Z, timestamp, speed, bearing. I responded to him that I thought such metadata should be represented as properties in a Feature object, but he prefers conciseness over expressiveness in his use case. So I was wondering about the intent of allowing arbitrary extra dimensions in a coordinate, and whether, in the opinion of the authors of the spec, this was a case that GeoJSON parsers do (or should) handle.
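For illustration, a consumer that did agree to this user's convention might read a position roughly as follows. This is a Python sketch; the Observation type and its field names are hypothetical and simply mirror the order X, Y, Z, timestamp, speed, bearing described above. Nothing in the position itself says that this is the order.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Observation(NamedTuple):
    """One position interpreted under the assumed six-value convention."""
    x: float
    y: float
    z: float
    timestamp: datetime
    speed: float
    bearing: float


def parse_observation(position):
    """Interpret a six-element position as X, Y, Z, timestamp, speed, bearing."""
    if len(position) != 6:
        raise ValueError("expected exactly six values per position")
    x, y, z, stamp, speed, bearing = position
    # The timestamp is assumed to be UTC in the form 2011-03-29T08:38:50Z.
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return Observation(x, y, z, when, speed, bearing)


obs = parse_observation([102.0, 0.0, 0.0, "2011-03-29T08:38:50Z", 3.54, 39.80])
```

The meaning of each slot lives entirely in the code that reads it, which is the trade of expressiveness for conciseness mentioned above.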