Next round of api improvements.

Jef Driesen jefdriesen at telenet.be
Wed Nov 21 16:57:46 UTC 2012


On 2012-11-16 20:18, Dirk Hohndel wrote:
> Jef Driesen <jefdriesen at telenet.be> writes:
>> But while I'm not too concerned about the memory allocations by
>> themselves, I still find it a bit silly having to allocate resources
>> where it's not strictly necessary. For the sample data I simply 
>> can't
>> think of any use-case where you would want to hold on to the sample
>> objects outside the callback function. Hence my search for possible
>> alternatives.
>
> I understand. But I think you are focusing on the wrong problem.
> Having a consistent API that treats memory consistently and doesn't
> require the application to know the internals of your implementation 
> is
> a HUGE advantage. It makes libdivecomputer easier to use.
>
> "Any memory allocation that libdivecomputer returns to the 
> application
> is owned by the application and needs to be freed when it is no 
> longer
> needed".
>
> This is easy to understand, easy to implement, even easy to verify 
> with
> static analysis tools.
>
> Elegance and best possible algorithms are honorable goals. But
> consistency and ease of use (and debuggability) are more important, I
> think.

If the "application owns" is the preferred option, then that's fine for 
me. No problem.

>> I also considered using a bitmap (it was mentioned briefly somewhere 
>> in
>> my mail I think), but I couldn't find a way to make it work with 
>> vendor
>> extensions. A bitfield is limited to 8 to 64 different types, 
>> depending
>> on the integer size we decide to use. For the normal samples, that 
>> is
>> probably more than we'll ever need. But if every backend can support 
>> a
>> few extension types, and each one needs another bit, then we'll 
>> quickly
>> running out of bits. Do you have a solution for this?
>
> Yes. Sort of. You have one bit in your bitmap that says "vendor
> extension" and then the vendor extension can use another int (16bit 
> or
> something) with an enum (so 64k values for 16 bits) to indicate which
> extension this is. So we don't waste the space in every sample, but 
> if
> there is an extension we have tons of room to grow and add extensions
> for new things to come.

If I understand your proposal correctly, you would have a data 
structure similar to the existing vendor data:

struct dc_sample_vendor_t {
    unsigned int type;
    unsigned int size;
    const void *data;
};

That would work for a one vendor extension per sample, but I don't see 
how you would make that work nicely in case there can be multiple 
extensions per sample. It also makes it harder to use a custom data 
structure for each extension, instead of just a blob of raw data. Yes, 
you can abuse the void pointer to point to any structure, but that isn't 
the most pretty interface.

Let's assume that a device supports a heartrate and an airtime value 
per sample. And assume for a moment those are not supported with the 
main api.

If we introduce some kind of a generic dc_sample_get_vendor() function, 
then we have to make sure it can return both the heartrate and airtime 
value. We can probably solve that in a similar way as you proposed for 
the tank pressures. However, my idea was that the backend would provide 
some dc_xxx_sample_get_heartrate() and dc_xxx_sample_get_airtime() 
functions with a corresponding data structure. But then a single bit to 
indicate the presence of a vendor extension isn't sufficient. If every 
such extensions needs it own bit in the main bitfield, then we'll 
quickly run out of bits. There are many backends and many possible 
extensions, so we have a virtually unlimited number of possible 
extensions. We would at least need an independent bitmap only for the 
extensions, or some other solution.

I think this is something we'll have to explore further.

>> While I'm definitely not opposed to the bitmap idea, I'm not that 
>> that
>> happy with the basic_sample_t struct. I would prefer to implement 
>> this
>> with separate functions. One for the bitmap itself, and one function 
>> for
>> each value. Since there are only two values that are always 
>> available, I
>> wouldn't treat them any different from the others.
>
> The reason that I thought of treating them differently is that these 
> two
> values are available in every sample from every DC (otherwise, what's
> the point of that DC). So you can have one call that gives you what 
> you
> can always get (time, depth) plus the information about everything 
> else
> that is available.
>
> But if you'd rather have this be three API calls (get_time, 
> get_depth,
> get_available_data_bitmap or something), I'll be happy with that, 
> too.
>
>> For the multivalued data, I'm not sure how well such a struct
>> (containing a length and pointer to an array) will work for language
>> bindings (python, .net, etc). That's something we'll have to check.
>
> I don't know about .net. In python this would work just fine.

I'm pretty sure such a structure isn't supported out of the box and 
will need some manual marshaling on the .net side. So my guess is that 
it can be done, but it won't be very pretty.

We can maybe consider to provide language bindings for the most popular 
languages (.net and python) ourselves?

>> I prefer enums over pre-processor constants too. I know this will 
>> loose
>> the conditional compiling, but this is a trick that will work for 
>> C/C++
>> only.
>
> The problem with enums is that you now have to have strong versioning 
> of
> your API - something which libdivecomputer has not a good history of
> doing (and I know all the reasons why - don't take this as criticism,
> just as an observation). So if you use enums then I have to recreate 
> the
> logic of "which version of libdivecomputer supports which values for
> this function" in my app - and so does everyone else. We need to
> basically create a "version number to feature mapping" in the client.
>
> If you do this with #defines then this is much easier to take care 
> of.
> At least for C/C++ programers - I have seen other language bindings
> parse the defines for this as well, but don't know what the current
> users of libdivecomputer are

As far as I know there are at least .NET (divinglog) and python 
(kenozooid) based applications. I got quite some questions about .NET 
support, so I assume there will be others too.

Anyway, I think it's very important to not treat these non C/C++ 
applications as second class citizens, especially because at least one 
of them (divinglog) is likely the application with the largest userbase. 
So that means all features should be usable from the languages bindings 
too.

BTW, that's for example one of the main reasons the logging interface 
doesn't exposes a vararg interface, but calls vsnprintf internally to 
convert the messages to strings. Strings are supported by all languages, 
varargs are not.

I guess that once we have decided on the final list of dive and sample 
data that we will support, the chances that we'll have to introduce 
additional samples in the future are relative small. I mean the goal for 
the v1.0 was to maintain backwards compatibility from then on. If we 
then document the first version number in which every feature appeared, 
then it shouldn't be that much of a hassle to do:

#if DC_VERSION_CHECK(x,y,z)

#endif

and then language bindings can do the same at runtime.

But okay, this is something we can discuss further.

>>>> 3. Supported dive and sample data
>>>> ==================================
>>>>   * temperature
>>>
>>> air temperature, I assume?
>>
>> Depends. Different dive computers have different interpretations. It
>> could be air/water temperature, the min/max/avg value, and so on. If
>> that's even know to us. Not sure how to deal with this. We could 
>> just
>> supply a temperature, and leaving the actual interpretation to the 
>> user,
>> or we could have an extra flag with the exact interpretation (if 
>> known)
>> as you mention below. But even then we're still left with devices 
>> that
>> provide more than one temperature.
>
> So offer a couple of values with flags that determin the semantics
>
> * air_temperature
> * min_water_temperature
> * avg_water_temperature
>
> And offer flags that clarify semantics
>
> AIR_TEMP_IS_FIRST_SAMPLE
> AVG_IS_LAST_TEMP
> MIN_TEMP_IS_AT_MAX_DEPTH
>
> (btw, I love the last one as I frequently dive in a water body that 
> for
> more than half of the year has temperature go UP as you go deeper...)

To be honest, I don't really like this. This tastes again as adapting 
our model to that of the dive computers. If we supply three values, then 
it breaks again if there is some dive computer out there that supports 
some fourth value. And with those flags, the semantics can be altered in 
many different ways. I fear this will quickly get messy on the 
application side.

If you take a step back and think about it, the temperature is just a 
poor substitute for those devices without a full temperature profile. If 
you have a full temperature profile, then those min/max/avg values and 
begin/end values suddenly become a lot less interesting.

So it might be more interesting to have the begin/end temperature, and 
the temperature at maximum depth delivered in the sample data. We 
already do this today for the suunto vyper backend, and I think it works 
quite well. Instead of temperature graph, you now draw a few temperature 
dots. If we do that, we can then reserve the temperature field 
exclusively for the true single temperature value case.

>> Heartrate is only supported by the Uwatec Galileo as far as I know. 
>> It
>> certainly not a must-have feature, but I can imagine that if you 
>> have
>> such a device, this is an interesting value. Although the actual
>> usefulness is questionable if you ask me, but it's cool gadget :-)
>
> So much of what high end DCs provide is more cool than entirely 
> useful -
> but I agree that we should make it available regardless. 
> Visualization
> is the application's job :-)

Yes, that's the idea. As long as it's reasonable device independent 
(e.g. there is no reason why other devices can't include a heartrate 
monitor, and beats per minute is a standard unit too) and doesn't 
complicate the api by (e.g. if you don't need it, you just don't read 
the value and that's it), then I have no real objections. The same for 
other types like CNS/OTU.

But things like for example tissue saturation in the decompression 
model of device X, or RGBM warning Y, that's a whole other story.

>> I have no idea which devices record this in the data. I
>> know the Suunto's have this weird OLF (Oxygen Limit Factor) value, 
>> which
>> is either the CNS or OTU, whichever is higher. And I think the 
>> Oceanics
>> have a scale from 1 to 5 or something. So that might be a bit more
>> difficult to support in a uniform way.
>
> I'll keep singing the same song. Pick one semantic as default and
> provide clearly specified flags to modify that.

But then you introduce extra complexity on the application side. If you 
get an CNS value, then you have to check all those flags to figure out 
whether it's a percentage (default), or some scale index, or that weird 
suunto OLF value, or...

You can't just take the value anymore and treat is as a percentage. 
 From my point of view that makes the abstraction provided by 
libdivecomputer less useful. I mean how different is that from doing a 
model based interpretation:

if (suunto)
    /* OLF */
else if (oceanic)
    /* Range index */
else
    /* Percentage */

I think we are better off always providing a percentage, and if that is 
not available then you don't get a CNS value at all.

>> What I had in mind, is to have a single DC_SAMPLE_EVENT type. If 
>> this
>> bit is set, there are events available. The actual event value would
>> then be another bitmap containing all the events. With this approach 
>> we
>> can support several events without needing additional sample types 
>> for
>> each event type.
>
> Good idea.

Then the next step will be to make a list of the events we want.

> See what I wrote above. The moment you don't have precisely defined
> semantics of the value (even if the semantic is just our current best
> guess based on reverse engineering) the usefulness of that value to 
> the
> application declines dramatically. An app is always at liberty to 
> ignore
> that additional information ("hey, here's some temperature, no idea 
> what
> it is"), but if it isn't there all that will happen is that apps 
> start
> to make educated guesses based on the type of DC used. And that's 
> just
> ugly.
>
> I feel very strongly about that. I think and API that doesn't define
> what the values it gives you actually mean is really of rather 
> limited
> use.

I agree, although not as strong as you do.

Let's say we define the meaning of the temperature as the minimum water 
temperature, which is probably the most common value. But then you 
happen to have some dive computer which only provides the average water 
temperature. Since that's all we have, that's what we give to the 
application and goes into your logbook. Is the small difference in 
semantics really that important, considering we don't have any other 
info? I mean to the user it won't make that much difference, because the 
semantics will remain consistent across all his dives with that device. 
In a paper logbook you would also just write down the value that you get 
from your dive computer.

So while I agree that a crystal clear semantics would be ideal, I also 
believe some small deviations are acceptable. You can also put it in 
another way: how can we even be sure about the actual semantics given 
the fact that it's all based on reverse engineering, so there is always 
an uncertainty factor present.

I think it gets a lot more dramatically if an application interprets a 
value incorrect because it fails to take into account one of those flags 
correctly. For example interpreting the oceanic CNS range index as a 
percentage, which means it will never go past 5% (or some other small 
value). Ouch.

Jef





More information about the Devel mailing list