[GeoJSON] processing model

Sun May 19 07:32:36 PDT 2013

Hello Erik,

On 18.05.13 19:11, Erik Wilde wrote:
> hello stefan.
>
> my second follow-up (after the initial "namespaces" email).
>
> On 2013-05-17 23:28 , Stefan Drees wrote:
>> On 18.05.13 07:37, Stefan Drees wrote:
>>> On 18.05.13 00:05, Erik Wilde wrote:
>>> What about starting from the Robustness Principle (a.k.a. Postel’s law)
>>> cf. [RFC1122]: “Be liberal in what you accept, and conservative in what
>>> you send”. A GeoJSON consumer is expected to “ignore” instead of “complain
>>> about” any “unknown stuff” present in a response and still make the most
>>> of it. ...
>> Softening my suggestion of a starting point a bit: I know we describe 1)
>> a format here and do not specify 2) a protocol plus a format.
>
> i might sound a bit pedantic here, but since it matters in this case (i
> think): a format *is* a protocol: it is a convention on how to come to a
> shared interpretation of some representation.

on a sunny sunday I had to decide which mail to give feedback on ;-) so 
I chose this one.

From a far-away viewpoint things often look identical; on closer
inspection, useful differences may become visible. I think it is useful
not to discuss purchasing ingredients or cooking meals when buying
inventory for a kitchen. Any container I consider for a kitchen has to
deal successfully with water, heat, sharp knives etc., and its placement
in the room should be meaningful both for the process of stowing the
purchased ingredients and for cooking with them.

The distinction is that the reason why something was once designed in a
specific way often differs from the perspective of the user processing it.

If one perspective contradicts the other, a design change is the
long-term solution, while processing workarounds are the short-term
solution.

So a format, IMO, gives form to the common assumptions about what is
needed, when and where, for reaching goals 1, 2, ... N. From the moment
of its creation the format lives on forever, since the GeoJSON format
(often copied, presumably not so often referenced) is a form.

Now we attempt to make small adjustments, like smoothing a bit here with
sandpaper, covering broken promises with a small white sticker, ...

>> In the case of (2) we would have a no-brainer jump-start with Postel's
>> Law, but as we only (as of now) describe one part of the "equation", we
>> might put some rules inside our format spec (at minimum attributed with
>> SHOULDs and SHOULD NOTs as applicable) ensuring that (where foreseeable)
>> not the complete burden ends up on the consumer's/client's shoulders.
>
> claiming that (1) and (2) are the same, i think that yes, Postel is a
> good start. but even Postel needs definition, and if you wanted
> something like the "mustUnderstand" semantics i mentioned in my previous
> email, you would need to design that into your format, so that consumers
> would understand when to stop.
>
> JSON as a format has no processing model other than how to parse the
> serialization into an object model. it simply is a representation of
> structured data. since JSON has no schema language, there usually is no
> validation, and consumers tend to just consume what they know, and
> ignore the rest. but that is just a side-effect of how they are
> implemented in most cases. making all of this explicit in a processing
> model allows producers and consumers to make informed decisions about
> where to put or not put additional data.
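
To make the "consume what you know, and ignore the rest" behaviour
concrete, here is a minimal Python sketch of a consumer that does the
ignoring explicitly instead of as a side effect. The function name
consume_feature and the set KNOWN_FEATURE_MEMBERS are hypothetical and
chosen only for illustration; which member names a real consumer knows
is an assumption, not something the format spec or this thread defines.

```python
import json

# Assumption for illustration: the member names this consumer understands.
KNOWN_FEATURE_MEMBERS = {"type", "geometry", "properties", "id", "bbox"}


def consume_feature(text):
    """Parse a Feature, keep the known members, and list the ignored ones."""
    obj = json.loads(text)
    if not isinstance(obj, dict) or obj.get("type") != "Feature":
        raise ValueError("not a GeoJSON Feature")
    # Keep only what this consumer knows how to interpret.
    known = {k: v for k, v in obj.items() if k in KNOWN_FEATURE_MEMBERS}
    # Everything else is ignored, but the decision is visible to the caller.
    ignored = sorted(k for k in obj if k not in KNOWN_FEATURE_MEMBERS)
    return known, ignored
```

Returning the ignored names alongside the data is one possible design
choice: the consumer stays liberal in what it accepts, yet a producer can
still find out where its additional data was dropped.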

we live in interesting times ;-)

Currently a proposed json WG
(http://datatracker.ietf.org/doc/charter-ietf-json/) at the IETF is on
its way to being chartered. Its first and only defined step targets
correcting the skewed situation that many standards have begun to refer
to the JSON RFC (even the ECMAScript spec) while the JSON RFC points to
the ECMA spec. Thus: move the JSON RFC to the standards track and, like
we plan with GeoJSON, make a few minor modifications ;-)

Also, looking at: http://datatracker.ietf.org/wg/json/ I find the 
following active internet drafts (as of 2013-05-19):

draft-fge-json-schema-validation-00
JSON Schema: interactive and non interactive validation

draft-kelly-json-hal-05
JSON Hypertext Application Language

draft-luff-json-hyper-schema-00
JSON Hyper-Schema: Hypertext definitions for JSON Schema

draft-newton-json-content-rules-01
A Language for Rules Describing JSON Content

draft-nottingham-json-home-03
Home Documents for HTTP APIs

draft-sakimura-json-meta-00
JSON Meta Object

draft-snell-json-test-05
JSON Predicate

draft-zyp-json-schema-04
JSON Schema: core definitions and terminology

Some of them do look very promising and I feel I should read those 
before sticking my nose further into a debate on schema validation of 
JSON :-)

>> If one specifies a complete communication protocol, both sides (producer
>> and consumer) are declared to be known up to a certain detail and are
>> assigned roles where a balance is sought (and the terms and places are
>> all present to specify this) for sharing the burdens.
>
> that's exactly the case for a format: the format should allow producer
> and consumer to come to a shared, non-conflicting interpretation of the
> exchanged data. a format is a protocol.

the container of a protocol, the passive part, yes. It should fit its
purpose, and here the two of course should overlap. But in my experience
this overlap should be minimized. Of course, if you can map huge parts of
a dynamic processing into a static, proven form, this helps a lot,
because the difficult processing then has fewer things to care about
(for each initiation).

>> In our case we should IMO, as far as possible, do precisely as you (Erik)
>> suggested: "[...] make the processing model explicit and define how
>> GeoJSON is supposed to be consumed." with the reasoning, that it fosters
>> producer-consumer relationships by minimizing any impedance mismatch.
>
> great, that's all i was asking for. that allows producers and consumers
> to understand the expectations. it also allows things such as
> "validators" to exist where people can submit their GeoJSON, and it
> might tell them "yes, i can find the data that MUST be present, but i
> also see data in places where there really SHOULD NOT be any."

I am interested in helping to provide such a service, though maybe not 
to operate it ;-)
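
As a rough idea of what the core of such a validator could look like,
here is a minimal Python sketch that reports missing MUST members as
errors and unexpected members as SHOULD NOT warnings. The names
check_feature, REQUIRED and EXPECTED are hypothetical, and the rule sets
are assumptions made up for this example; the actual rules would have to
come out of the processing model we are discussing.

```python
import json

# Illustrative assumptions only: these sets are not quoted from the spec.
REQUIRED = {"type", "geometry", "properties"}  # treated as MUST be present
EXPECTED = REQUIRED | {"id", "bbox"}           # anything else: SHOULD NOT


def check_feature(text):
    """Return a list of human-readable findings for one Feature document."""
    obj = json.loads(text)
    if not isinstance(obj, dict):
        return ["error: top-level value is not a JSON object"]
    present = set(obj)
    messages = []
    # Members that have to be there but are not.
    for name in sorted(REQUIRED - present):
        messages.append(f"error: member '{name}' MUST be present")
    # Members found in places where there should not be any.
    for name in sorted(present - EXPECTED):
        messages.append(f"warning: member '{name}' SHOULD NOT appear here")
    return messages
```

An empty result means the checker found the data that must be present and
nothing in places where there should not be any.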

>> But, this will not be easy, as we do not describe a protocol, only a
>> format (as focus). On the other hand, if the GeoJSON community knows,
>> that in reality there has been a consolidation on some describable
>> processing model, this would be a good starting point to be helpful in
>> offering guidance without writing something that is "wrong" in n percent
>> of the use cases.
>
> agreed, defining processing models is not an easy task. however, it is
> one of the most important things a proper format has to do, because it
> dramatically reduces the risk of developers coming to different
> conclusions how to use/adapt/extend/repurpose a format, and then end up
> with non-interoperable implementations.
>
>> We should maybe split processing and namespacing into two different
>> threads.
>
> done, but namespaces imho are simply a part of a processing model, if
> you want to have a well-defined extensibility model.

we may well decide to merge the topics at any time, but I find it good to
separate the topic of identity and scope (namespace, fence, scope, ...)
from processing assumptions, to be in a better position to decide what to
combine and what to separate in the final specification.

> cheers,
>
> dret.
>

All the best and sorry for the longish mail,

Stefan.