API redesign progress

Thu Jul 5 15:20:20 UTC 2012

On 2012-07-04 20:47, Linus Torvalds wrote:
> On Wed, Jul 4, 2012 at 8:53 AM, Jef Driesen <jefdriesen at telenet.be> wrote:
>>
>> Now that we are discussing major api improvements, let's also have a look
>> at the sample parsing api. The current api generates a stream of sample
>> values, where the time value is special in the sense that it starts a new
>> sample. I think it would be cleaner to somehow return an opaque sample, and
>> then get all the values available for that sample. Something similar to
>> this:
>
> I think that would result in simpler code, yes. That said, at least
> for the really fundamental ones I would actually prefer to not have to
> see those "dc_sample_get_xyz()" calls for each.
>
> Because we *know* that for each sample, we're going to want time,
> depth, temperature and cylinder pressure if available. So if we want a
> really convenient format, just give those directly without even asking
> for it (so maybe "dc_sample_t" could have them as defined members that
> we can just access, or maybe there would be a separate structure that
> contains the standard ones).

The problem with such a convenient format is that the only fields that 
are guaranteed to be available are time and depth. Everything else is 
optional and may not always be available. So if we introduce an 
all-in-one structure like the one below, how do you know whether a 
particular field is available?

struct dc_sample_basic_t {
    unsigned int time;
    double depth;
    double temperature;
    double pressure[MAX];
};

Using some magic value to indicate the "not available" case is ugly and 
might cause problems too. For example zero or a negative value for 
temperature would be a bad choice, because they are perfectly valid 
temperatures.

Another complication is the multi-valued fields like the tank 
pressure. In the structure above I defined it as an array, but this 
won't work well, because the value for MAX will be different for every 
device. If we hardcode some global value that is large enough for all 
currently supported devices, then we are still screwed if a new device 
appears with a larger number. And we can't fix it without breaking 
existing code.

I think what we really need here, is some kind of api to query which 
types are actually present.
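
Purely as an illustration, such a query could be a bitmask of the 
optional fields that are present, plus a counted list of (tank, 
pressure) pairs in place of the fixed-size array. This is only a sketch 
under stated assumptions: sample_basic_t, sample_pressure_t, 
SAMPLE_HAS_TEMPERATURE, sample_get_temperature() and 
sample_get_pressure() are hypothetical names, not part of 
libdivecomputer.

```c
#include <stddef.h>

/* Hypothetical sketch: none of these names exist in libdivecomputer. */

/* One bit per optional field; further optional fields would add bits. */
enum {
    SAMPLE_HAS_TEMPERATURE = 1u << 0
};

typedef struct {
    unsigned int tank;   /* tank index reported by the device */
    double value;        /* pressure value for that tank */
} sample_pressure_t;

typedef struct {
    unsigned int time;                 /* always present */
    double depth;                      /* always present */
    unsigned int present;              /* bitmask of SAMPLE_HAS_* flags */
    double temperature;                /* valid only if flagged as present */
    const sample_pressure_t *pressure; /* npressure entries, may be NULL */
    size_t npressure;                  /* 0 means no pressure recorded */
} sample_basic_t;

/* Returns 1 and stores the temperature if present; otherwise returns 0
 * and leaves *out untouched, so no magic value is ever needed. */
int sample_get_temperature(const sample_basic_t *s, double *out)
{
    if ((s->present & SAMPLE_HAS_TEMPERATURE) == 0)
        return 0;
    *out = s->temperature;
    return 1;
}

/* Looks up the pressure for one tank index; returns 1 if found, else 0. */
int sample_get_pressure(const sample_basic_t *s, unsigned int tank,
                        double *out)
{
    for (size_t i = 0; i < s->npressure; ++i) {
        if (s->pressure[i].tank == tank) {
            *out = s->pressure[i].value;
            return 1;
        }
    }
    return 0;
}
```

Because the pressure list carries its own length, a device with more 
tanks needs no change to the structure layout, and a negative 
temperature stays a perfectly ordinary value.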

BTW, with the dc_sample_get_xyz() accessor functions we can extend them 
to accept a unit parameter, and return the value in the requested units 
(e.g. si, metric or imperial) directly. That might be a bit of overkill, 
though, and it may be better to leave unit conversion to the 
application. Just a thought.
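
To make the idea concrete, here is a minimal sketch of accessors that 
take a unit parameter. It assumes values are stored internally in metres 
and degrees Celsius; units_t, sample_t, sample_depth_in() and 
sample_temperature_in() are hypothetical names chosen for illustration, 
not an existing or planned libdivecomputer interface.

```c
/* Hypothetical sketch: these names are not part of libdivecomputer. */

typedef enum {
    UNITS_SI,        /* metres, kelvin */
    UNITS_METRIC,    /* metres, degrees Celsius */
    UNITS_IMPERIAL   /* feet, degrees Fahrenheit */
} units_t;

/* Assumed internal representation: metres and degrees Celsius. */
typedef struct {
    double depth_m;
    double temperature_c;
} sample_t;

/* Returns the depth converted to the requested unit system. */
double sample_depth_in(const sample_t *s, units_t units)
{
    if (units == UNITS_IMPERIAL)
        return s->depth_m / 0.3048;   /* 1 ft is exactly 0.3048 m */
    return s->depth_m;                /* SI and metric both use metres */
}

/* Returns the temperature converted to the requested unit system. */
double sample_temperature_in(const sample_t *s, units_t units)
{
    switch (units) {
    case UNITS_SI:
        return s->temperature_c + 273.15;            /* kelvin */
    case UNITS_IMPERIAL:
        return s->temperature_c * 9.0 / 5.0 + 32.0;  /* Fahrenheit */
    default:
        return s->temperature_c;                     /* Celsius */
    }
}
```

The conversions are trivial, which is part of the argument for leaving 
them to the application instead.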

> Note that you already do "pressure" wrong, in that there doesn't seem
> to be any way to get the pressure samples from *multiple* cylinders in
> the same sample.
>
> And no, gas-change events are not sufficient (although we can work
> around *some* of the limitations with them). There are literally dive
> computers out there that can listen to four different HP sensors at
> the same time, and people actually use them for more complex cases
> than just switching between cylinders for one diver. One dive
> instructor talked about tracking student air use from his dive
> computer using that kind of setup, for example.
>
> So I haven't seen it personally yet, but the "there is only one
> pressure per sample" is not correct.
>
> At a minimum, the pressure needs to have an index, and then for the
> multi-cylinder case you could generate multiple samples with the same
> time (but different high pressure index and values).
>
> (And again, cylinder change events are *independent* of the HP
> samples: when you switch cylinders, the pressure sensor does not move
> with the cylinder switch. So for my own setup, for example, the HP
> sensor is always connected to one particular cylinder, and the
> pressure samples keep coming from that one even if I switch cylinders
> on the dive computer and start breathing from the other one. So please
> never try to mix up "pressure info" with "cylinder change" events,
> that way lies madness).

Multiple pressure values per sample are fully supported today, and I 
have no intention of dropping that in the new api. With the current api 
you get a pressure value and a tank id, just as you describe. I don't know 
why you think only a single pressure value is supported, although it 
might be due to the next paragraph :-)
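
For illustration, this is roughly how an application could regroup such 
a stream, where a time value starts a new sample and every pressure 
value carries its own tank id. It is a sketch under stated assumptions 
only: stream_value_t, grouped_sample_t and group_stream() are 
hypothetical stand-ins, not the actual libdivecomputer types, and 
MAX_TANKS_SEEN is a limit chosen by the application for this example, 
not something the library would impose.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a typed value stream; not the real api. */
typedef enum { VALUE_TIME, VALUE_DEPTH, VALUE_PRESSURE } value_type_t;

typedef struct {
    value_type_t type;
    unsigned int tank;   /* only meaningful for VALUE_PRESSURE */
    double value;
} stream_value_t;

#define MAX_TANKS_SEEN 8 /* application-side cap for this sketch only */

typedef struct {
    double time;
    double depth;
    size_t npressure;                  /* pressure values in this sample */
    unsigned int tank[MAX_TANKS_SEEN];
    double pressure[MAX_TANKS_SEEN];
} grouped_sample_t;

/* Groups a value stream into samples: each VALUE_TIME opens a new
 * sample, all following values belong to it. Returns the number of
 * samples written to out (at most cap). */
size_t group_stream(const stream_value_t *in, size_t n,
                    grouped_sample_t *out, size_t cap)
{
    size_t count = 0;

    for (size_t i = 0; i < n; ++i) {
        if (in[i].type == VALUE_TIME) {
            if (count == cap)
                break;                 /* output buffer is full */
            out[count].time = in[i].value;
            out[count].depth = 0.0;
            out[count].npressure = 0;
            count++;
            continue;
        }
        if (count == 0)
            continue;                  /* value before the first time */

        grouped_sample_t *cur = &out[count - 1];
        if (in[i].type == VALUE_DEPTH) {
            cur->depth = in[i].value;
        } else if (in[i].type == VALUE_PRESSURE &&
                   cur->npressure < MAX_TANKS_SEEN) {
            cur->tank[cur->npressure] = in[i].tank;
            cur->pressure[cur->npressure] = in[i].value;
            cur->npressure++;
        }
    }
    return count;
}
```

A sample with two pressure values in the stream ends up with npressure 
equal to 2, one entry per tank id, independent of any gas switch.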

Note that today, many dive computers with support for multiple pressure 
sensors only store a pressure value for the active tank. In such a case 
you will only get a single pressure value per sample, but this is a 
limitation of the dive computer data format, not the libdivecomputer 
api. If a dive computer records multiple pressure values per sample, 
you'll get all of them. I assume this limitation exists to save memory. 
Inactive tanks are likely to stay at a constant pressure (unless you 
have a leak, have your BCD attached to it, or are doing buddy-breathing, 
etc) and recording it is just a waste of memory. Another reason may be 
that a dive computer display only has space to show one pressure value 
at a time, and only that value gets recorded. I would have to double 
check, but I think this last one also applies to those dive computers 
that support monitoring your buddy's tank pressure (e.g. some Oceanic 
models). By default your own pressure is shown and you have to do 
something like pushing a button to actually see your buddy's pressure, 
and only then does it get recorded.

Tank pressures and gas switches are indeed independent, but I never 
claimed otherwise.

>> At the moment we also have:
>>
>>  * events: Anything that happens only at a specific point in time. All
>> other sample types are typically measurements obtained from a sensor (or
>> calculations based on them), with values available for each sample point
>> (or at least every x sample points). But that is not the case for events.
>> Typical examples are warnings, gas changes, etc. At the moment many events
>> are supported, but the api sucks. The current plan is to remove the events,
>> use the vendor type instead (see below), and then add back support for the
>> important events with a better api. The details of this last part are still
>> unclear, but support for gas switches is definitely high on the list.
>
> I wouldn't mind that. Right now subsurface takes the events pretty
> much the way you feed them to us (except we then generate a string for
> them), and quite frankly, the end result is not pretty. We have this
> "filter events" thing in subsurface just to get rid of the crazy ones.
>
> [...]
>
> So the "raw events" from the dive computer are often totally useless.
> It would be much better to have some sane per-computer "turn this
> event into something useful" model that can fix things like this.
>
> (Sure, some people may again want the raw data, so have some raw data
> model for those things, but I'm talking about uses like subsurface
> that really want the *abstraction* part of libdivecomputer that makes
> all computers look sane - whether they really are that or not)

What I had in mind was that for now, we just resort to using raw data 
(e.g. the vendor extension), and try to document the format (much of 
this stuff is still unknown). Then later on, we can come back to this 
and provide some helper functions to assist parsing this data, or parse 
it internally into some nicer structure. But right now there are many 
things that I consider far more important.

>>  * rbt: The RBT (remaining bottom time) is the time you can spend at the
>> current depth and still have enough gas supply to make a safe ascent. It's
>> currently available for the Uwatec Smart/Galileo only. Other air integrated
>> dive computers often support a similar concept, so I think we can just keep
>> this. Maybe rename it to the less cryptic "airtime"? Note that the actual
>> interpretation may vary between models, depending on whether factors like
>> the safety reserve, the ascent time, decompression time, etc. are taken into
>> account or not.
>
> I'm not convinced the rbt is all that useful in post-dive logging. The
> dive computers that log it do so because they already calculated it,
> and the log is often just basically a "this is what I'm going to
> display" (many dive computers that *don't* log the rbt will still log
> the fact that they warned about low rbt - you can often set these
> things to warn when they think you are getting low).
>
> But post-dive, what's the advantage of seeing the rbt? It's kind of
> pointless, and it was always really just calculated to begin with (ie
> it has no *real* data behind it, and the dive logging software could
> calculate its own version using the SAC rate etc). At the time of the
> dive, rbt is useful as a "this is what I estimate", but *after* the
> dive the fact that it is just an estimate makes it kind of pointless.
>
> And as you say, it's not well-specified to begin with, since different
> computers calculate the remaining time totally differently (not just
> in details, but in the whole "what does it mean" department).

I agree that the most interesting data is the "fundamental" data which 
is obtained directly from a sensor (e.g. time, depth, temperature, 
pressure). Everything else is usually calculated from this fundamental 
data, and can be recalculated (in a much more consistent way) post-dive. 
But nevertheless many people are still interested in this kind of info. 
One of the most common questions I get from divinglog or macdive users 
is: "Why don't I get info X when downloading my dives? It's available in 
the manufacturer's application, so it must be available in the data 
somewhere!".

So even though I'm personally not convinced of their usefulness either, 
the info is there and if it doesn't cause major trouble, it doesn't 
really hurt to support it to some extent. But that's only for the most 
common types of course. For the really highly device-specific things, 
the answer is definitely no. That's simply out of the scope of the 
libdivecomputer library. And if anyone really insists, they can still 
get it from the raw data.

>> What else should we support? Some common requests I get are:
>>
>>  * gas switches
>>  * no decompression time
>>  * decompression stops
>>  * ...
>
> Gas switching is a real event (just as long as you don't mix it up
> with the pressure data). I would suggest against bothering with the
> computed stuff, although the "violated deco stop" event is obviously
> an interesting event.

I also consider gas switching a must-have feature. It's one of the 
important things that is missing at the moment.

Like you say, there are probably a few more events that are 
interesting, but whatever list we end up with, it will always be a bit 
subjective. And the mapping from the raw device events to our standard 
set isn't always straightforward. For example the Suunto has voluntary 
and mandatory safety stops, deep stops and deco stops. Then what do we 
consider a "violated deco stop"? Only the real deco stops, or also the 
others? I'm pretty sure that if we pick one option, there will be some 
complaints that it doesn't match with the manufacturer's application. 
Not that I care too much about a perfect match...

> At the same time, the fact that the computer thinks you are in deco
> (or not) _is_ interesting information, so I think it's roughly the
> right thing to do. It's just that I don't like how libdivecomputer
> gives the data: it was not at all clear to me what the deco events
> meant initially.

The exact meaning of the events isn't always clear to me either. Very 
often they show up in the manufacturer's application as a single word, 
without giving any real explanation. Even if you start digging deeper in 
the manuals there is no clear explanation, if there is an explanation at 
all. Some events in the data do not even show up in the applications and 
remain a mystery...

Jef