[Geojson] GeometryCollection not treated as a Geometry type

John Herring john.herring at oracle.com
Mon Oct 8 12:50:35 PDT 2007


Christopher, 
 
>> A MultiPoint is, in my mind, a single geometry, not a geometry 
>> collection -- a geometrycollection is a list of disconnected  
>> components. (In general, it probably only makes sense to create  
>> GeometryCollections from objects of different types -- at least,  
>> that's all I can imagine.)

This is a divergence from ISO 19107, and from Simple Features, which both
subtype MultiPoint from GeometryCollection. Further, GeometryCollections
need not be disconnected, and in 3D can be used to give us shells (the
solid's equivalent for a polygon's rings). Again ISO 19107 created the most
useful types by using collections of homogeneous types to create composite
curves, composite surfaces and composite solids. 
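
The subtyping described above can be sketched as a small class hierarchy. This is only an illustration, not the ISO 19107 or Simple Features class definitions: the class names mirror the types mentioned in this thread, while the attribute `members` and the helper `count_members` are hypothetical names chosen for the example.

```python
class Geometry:
    """Root of the illustrative hierarchy."""


class Point(Geometry):
    def __init__(self, x, y):
        self.x = x
        self.y = y


class GeometryCollection(Geometry):
    """A single geometry made up of member geometries."""

    def __init__(self, members):
        self.members = list(members)


class MultiPoint(GeometryCollection):
    """A homogeneous collection: every member must be a Point."""

    def __init__(self, points):
        points = list(points)
        if not all(isinstance(p, Point) for p in points):
            raise TypeError("MultiPoint members must all be Point instances")
        super().__init__(points)


def count_members(collection):
    # Written against GeometryCollection, so it also accepts a MultiPoint.
    if not isinstance(collection, GeometryCollection):
        raise TypeError("expected a GeometryCollection")
    return len(collection.members)


multipoint = MultiPoint([Point(0, 0), Point(1, 1)])
```

Because `MultiPoint` is a subtype, any code written for `GeometryCollection` accepts it unchanged, which is the substitutability at issue in this thread.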

This is what I meant when I said GeoJson was diverging from the rest of OGC,
which uses ISO 19107 as the basic geometry model, even for those
specifications which predate ISO 19107, because ISO 19107 was derived in some
sense from the earlier geometry models used in OGC prior to 1999 when Simple
Features came on-line (became a standard). Simple Features also feeds ISO
19109, which is ISO's Feature model. It was all written by the same group of
about 20 people who were familiar with both ISO TC 211 and OGC, and were often
members of both. 

>> Hm, I'm not sure I understand: multipoint says '"coordinates"  
>> must be an array of the things described by Point' -- that means  
>> that since point has [x, y], MultiPoint has [[x,y], [x,y]] --  
>> That's clear to me, but clearly isn't to you. Can you explain  
>> what you would prefer?

The issue arose from the use of separators in GML and the way coordinates
and coordinate strings are written (in some versions of GML). If you
marshal out a multidimensional array (and you know the dimensions) you write
out row major form, which means that the 3x3 matrix

	1	2	3
	4	5	6
	7	8	9

marshals out as [1 2 3 4 5 6 7 8 9], and it is knowing that it is 3x3 that
allows you to reconstruct the original sequence [[1 2 3] [4 5 6] [7 8 9]].
This encoding shortcut is used in the "pos" array in some forms of GML. The
issue here is that it is not obvious from the start what is going on. You
will get the same sort of confusion from folks who do not see much
difference between an array of arrays and a multidimensional array, like
folks who are used to C "pointer arithmetic." 
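
The row-major round trip can be sketched in a few lines of Python. The function names `flatten_row_major` and `unflatten_row_major` are made up for this illustration and do not come from GML or any library; the point is only that the flat list cannot be regrouped without knowing the row width.

```python
def flatten_row_major(rows):
    """Marshal a list of equal-length rows into one flat list, row by row."""
    return [value for row in rows for value in row]


def unflatten_row_major(flat, width):
    """Rebuild the rows from a flat list, given the row width."""
    if width <= 0 or len(flat) % width != 0:
        raise ValueError("flat list length is not a multiple of the row width")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = flatten_row_major(matrix)
restored = unflatten_row_major(flat, 3)
```

The same flat list regroups differently under a different assumed width, which is why the flat form is ambiguous to a reader who has not been told the dimension.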

>> For the record, the change was an attempt to be exactly the same 
>> -- specifically, the same as WKT. (I don't understand GML well enough to
>> comment.) In WKT, a Union of a point and a linestring returns a 
>> geometrycollection with a Point and a Linestring in it: this representation 
>> is mimicked in GeoJSON (and that's not changing!): the same structure in 
>> GeoJSON, WKT, and GML.

Ah ha, the dangers of backward engineering. WKT in Simple Features (06-103)
is defined to be a representation of the semantics presented earlier in the
document, especially as defined from the "Figure 1: Geometry class
hierarchy". Now from the diagram, only instantiable variants of the various
classes need to show up in the WKT, but the semantics of the diagram are
assumed to be preserved, even though some of the finer semantic points are
lost in the transfer to a text format without a need to explicitly reflect
inheritance. Now what occurred in GeoJson was a backward engineering from
Clause 7 (the WKT) to get the semantics of what should have been gotten from
Figure 1 (the UML) and its follow-on text (Clause 6). Now because the WKT
did not need to reflect the substitutability inherent in Figure 1, which is
somewhat replicated in all the GML versions (from its derivation from ISO
19107), that step missed the semantics from the UML that were not reflected
in the WKT. 

Further, since the feature model was never reflected in the WKT (nor for
that matter in the Simple Feature UML) but derived from the Feature volume
of the OGC Abstract Specification, which is consistent with the Feature
Model in ISO 19109, derivation from WKT could not and did not reflect the
proper semantics for features and feature collections (most important is the
subtyping of feature collection from feature). GML again is consistent with
ISO 19109, and therefore has an ISO compliant mechanism defined for doing
application schemata. 

What we are dealing with is the old down-easter problem - "sorry, you can't
get there from here." Representation formats, especially ones that are
incomplete, that do not have to deal with semantics, generally do not deal
with semantics. GeoJson, being an "object" encoder, had to deal with
features, feature collections, and the semantics of geometry; none of which
are reflected in the less functional WKT. That is the major source of your
divergence, not the "wording issues."

Regards,
John

You do what you can when you can because you can.

The opinions expressed in this email are 
purely my own and do not necessarily 
represent the opinions of any organization
or otherwise sane person or persons.

John R. Herring
Architect, Spatial Products
Oracle Corporation
One Oracle Drive
Nashua, New Hampshire 03062
ph: 1 603 897 3216
fx: 1 603 897 3334

Annue cœptis - Novus Ordo Seclorum
  


-----Original Message-----
From: Christopher Schmidt [mailto:crschmidt at metacarta.com] 
Sent: Monday, October 08, 2007 12:26 PM
To: John Herring
Cc: 'Andrew Turner'; geojson at lists.geojson.org
Subject: Re: [Geojson] GeometryCollection not treated as a Geometry type

On Mon, Oct 08, 2007 at 10:30:10AM -0400, John Herring wrote:
> Case in point (pardon the semi-pun):
> 
> From the place cited:
> 
> >> In addition to the type member, any GeoJSON object that represents 
> >> a single geometry (referred to as a geometry object below) must 
> >> have a member with the name "coordinates". This does not apply to 
> >> geometry objects of type "GeometryCollection".
> >> For type "Point", each element in the coordinates array is a number 
> >> representing the point coordinate in one dimension. The order of 
> >> elements follows x, y, z order (or longitude, latitude, elevation 
> >> for coordinates in decimal degrees).
> >> For type "MultiPoint", each element in the coordinates array is a   
> >> coordinates array as described for type "Point".
> 
> This has a couple of semantic disconnects. First, the concept of a 
> single geometry is undefined.

That's probably true. Please feel free to suggest a wording change -- the
only reason that it is worded like that is to exclude GeometryCollections.
In fact, that entire first chunk of text needs rewording -- the key points
are: 
 * All geometries have "coordinates":, EXCEPT:
 * If the geometry object is a GeometryCollection, which has
   'geometries' instead.
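
The two key points above can be sketched as a small check. This assumes geometry objects are plain Python dicts decoded from JSON; the function names `required_member` and `has_required_member` are hypothetical and are not part of any GeoJSON library.

```python
def required_member(geometry_type):
    """Name the member a geometry object of this type must carry."""
    if geometry_type == "GeometryCollection":
        return "geometries"
    return "coordinates"


def has_required_member(geometry):
    """True if the dict carries the member its 'type' calls for."""
    return required_member(geometry.get("type")) in geometry


point = {"type": "Point", "coordinates": [0.0, 0.0]}
collection = {"type": "GeometryCollection", "geometries": [point]}
malformed = {"type": "GeometryCollection", "coordinates": [0.0, 0.0]}
```

Here `has_required_member` accepts `point` and `collection` and rejects `malformed`; it checks only for the presence of the member, not the shape of its value.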

> A geometry collection is a single object, and represents a single 
> geometry (albeit potentially disconnected).

Agreed. Bad wording on my part -- I'm sorry.

> A multipoint
> is a geometry collection but is included as an example in a list from 
> which geometry collections are specifically excluded.

A MultiPoint is, in my mind, a single geometry, not a geometry collection --
a geometrycollection is a list of disconnected components.
(In general, it probably only makes sense to create GeometryCollections
from objects of different types -- at least, that's all I can imagine.)   

> And Point is defined in such
> a manner as to confuse it with multipoint. 

Hm, I'm not sure I understand: multipoint says '"coordinates" must be an
array of the things described by Point' -- that means that since point has
[x, y], MultiPoint has [[x,y], [x,y]] -- That's clear to me, but clearly
isn't to you. Can you explain what you would prefer?
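
One way to make the Point versus MultiPoint distinction concrete is to look at how deeply the coordinates array is nested. The helper `nesting_depth` below is a hypothetical name used only for this sketch, and it inspects only the first element at each level.

```python
def nesting_depth(coordinates):
    """Count how many list levels wrap the first number in a coordinates array."""
    depth = 0
    current = coordinates
    while isinstance(current, list):
        depth += 1
        if not current:
            break
        current = current[0]
    return depth


point_coordinates = [0.0, 0.0]
multipoint_coordinates = [[0.0, 0.0], [1.0, 1.0]]
```

Under this reading a Point's coordinates have depth 1 and a MultiPoint's have depth 2, so the two are arrays of arrays rather than one flat run of numbers.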

> Three sentences with three
> confusions of terminology is a bit dense for special casing. While I 
> do not mind folks ignoring ISO 19107 (which is the official OGC 
> geometry volume) in small things, it is disconcerting to have the 
> requirement to catalogue the disconnects to understand what is supposed to
> be a simple specification.

Understood. I *think* this is just a case of bad wording on my part, because
I was in a rush to go fix code instead of fixing the spec. Can you propose a
better wording for this section? I think it all stems from the use of the
term 'single geometry', which is obviously poorly defined. 

> You guys are spending way too much time being different for no 
> apparent reason.

For the record, the change was an attempt to be exactly the same --
specifically, the same as WKT. (I don't understand GML well enough to
comment.) In WKT, a Union of a point and a linestring returns a
geometrycollection with a Point and a Linestring in it: this representation
is mimicked in GeoJSON (and that's not changing!): the same structure in
GeoJSON, WKT, and GML (thanks to OGR):

>>> g.ExportToWkt()
'GEOMETRYCOLLECTION (POINT (0 0),LINESTRING (1 1,2 2))'
>>> g.ExportToGML()
'<gml:GeometryCollection><gml:geometryMember><gml:Point><gml:coordinates>0,0</gml:coordinates></gml:Point></gml:geometryMember><gml:geometryMember><gml:LineString><gml:coordinates>1,1 2,2</gml:coordinates></gml:LineString></gml:geometryMember></gml:GeometryCollection>'
>>> g.ExportToJson()
{'type': 'GeometryCollection', 'geometries': [{'type': 'Point',
'coordinates': [0.0, 0.0]}, {'type': 'LineString', 'coordinates': [[1.0,
1.0], [2.0, 2.0]]}]}

The only difference in the spec is that GeometryCollection is now
universally treated as a geometry object, meaning it can fill the 'geometry'
member of a feature, whereas before the spec was unclear on that. 
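
As an illustration of a GeometryCollection filling the 'geometry' member, the sketch below builds such a feature with Python's standard `json` module. The collection is the one from the OGR output above; the feature's type value "Feature" is an assumption here, since this message does not spell out the feature's other members.

```python
import json

collection = {
    "type": "GeometryCollection",
    "geometries": [
        {"type": "Point", "coordinates": [0.0, 0.0]},
        {"type": "LineString", "coordinates": [[1.0, 1.0], [2.0, 2.0]]},
    ],
}

# Assumed feature layout: only 'type' and 'geometry' are shown.
feature = {"type": "Feature", "geometry": collection}

encoded = json.dumps(feature)
decoded = json.loads(encoded)
```

After the round trip, `decoded["geometry"]` is the same collection, carrying 'geometries' rather than 'coordinates'.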

> It is a waste of your time, and will be a waste of the time of the 
> reader who will have to realize the differences in rules in each 
> specification instead of following a consistent approach for all OGC 
> specification of geometry representations. Occam and Einstein both saw 
> that simple things should be done simply.

To me, I look at those three different representations, and they all seem to
be saying the same thing. Where am I going wrong?

Regards,
--
Christopher Schmidt
MetaCarta
