On 2012-11-16 20:18, Dirk Hohndel wrote:
Jef Driesen <jefdriesen@telenet.be> writes:
But while I'm not too concerned about the memory allocations by themselves, I still find it a bit silly having to allocate resources where it's not strictly necessary. For the sample data I simply can't think of any use-case where you would want to hold on to the sample objects outside the callback function. Hence my search for possible alternatives.
I understand. But I think you are focusing on the wrong problem. Having a consistent API that treats memory consistently and doesn't require the application to know the internals of your implementation is a HUGE advantage. It makes libdivecomputer easier to use.
"Any memory allocation that libdivecomputer returns to the application is owned by the application and needs to be freed when it is no longer needed".
This is easy to understand, easy to implement, even easy to verify with static analysis tools.
Elegance and best possible algorithms are honorable goals. But consistency and ease of use (and debuggability) are more important, I think.
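To make that concrete, here is a minimal sketch of that rule in practice (the names dc_sample_t, dc_parser_get_sample and dc_sample_free are made up, just to illustrate the ownership contract):

dc_sample_t *sample = NULL;

/* The library allocates the object and hands it to the application... */
if (dc_parser_get_sample (parser, i, &sample) == DC_STATUS_SUCCESS) {
    /* ...the application uses it... */
    dc_sample_free (sample);    /* ...and the application frees it. */
}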
If the "application owns" is the preferred option, then that's fine for me. No problem.
I also considered using a bitmap (it was mentioned briefly somewhere in my mail, I think), but I couldn't find a way to make it work with vendor extensions. A bitfield is limited to 8 to 64 different types, depending on the integer size we decide to use. For the normal samples, that is probably more than we'll ever need. But if every backend can support a few extension types, and each one needs another bit, then we'll quickly run out of bits. Do you have a solution for this?
Yes. Sort of. You have one bit in your bitmap that says "vendor extension" and then the vendor extension can use another int (16bit or something) with an enum (so 64k values for 16 bits) to indicate which extension this is. So we don't waste the space in every sample, but if there is an extension we have tons of room to grow and add extensions for new things to come.
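Roughly like this (all names invented for illustration):

/* One reserved bit in the main sample bitmap: */
#define DC_SAMPLE_VENDOR (1u << 31)

/* A separate 16 bit value identifies which extension it is,
 * leaving room for 64k extension types: */
typedef enum dc_vendor_type_t {
    DC_VENDOR_NONE = 0,
    DC_VENDOR_UWATEC_HEARTRATE,
    DC_VENDOR_SUUNTO_OLF,
    /* ... */
} dc_vendor_type_t;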
If I understand your proposal correctly, you would have a data structure similar to the existing vendor data:
struct dc_sample_vendor_t {
    unsigned int type;
    unsigned int size;
    const void *data;
};
That would work for one vendor extension per sample, but I don't see how you would make that work nicely in case there can be multiple extensions per sample. It also makes it harder to use a custom data structure for each extension, instead of just a blob of raw data. Yes, you can abuse the void pointer to point to any structure, but that isn't the prettiest interface.
Let's assume that a device supports a heartrate and an airtime value per sample. And assume for a moment those are not supported with the main api.
If we introduce some kind of generic dc_sample_get_vendor() function, then we have to make sure it can return both the heartrate and the airtime value. We can probably solve that in a similar way as you proposed for the tank pressures. However, my idea was that the backend would provide dc_xxx_sample_get_heartrate() and dc_xxx_sample_get_airtime() functions with a corresponding data structure. But then a single bit to indicate the presence of a vendor extension isn't sufficient. If every such extension needs its own bit in the main bitfield, then we'll quickly run out of bits. There are many backends and many possible extensions per backend, so we have a virtually unlimited number of possible extensions. We would at least need an independent bitmap just for the extensions, or some other solution.
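For example, something along these lines (hypothetical names, with dc_xxx_ standing in for the backend prefix):

/* An extension bitmap per backend, independent of the main bitfield: */
dc_status_t dc_uwatec_sample_get_bitmap (dc_sample_t *sample, unsigned int *bitmap);

/* Backend specific accessors with their own typed results: */
dc_status_t dc_uwatec_sample_get_heartrate (dc_sample_t *sample, unsigned int *bpm);
dc_status_t dc_uwatec_sample_get_airtime (dc_sample_t *sample, unsigned int *minutes);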
I think this is something we'll have to explore further.
While I'm definitely not opposed to the bitmap idea, I'm not that happy with the basic_sample_t struct. I would prefer to implement this with separate functions: one for the bitmap itself, and one function for each value. Since there are only two values that are always available, I wouldn't treat them any differently from the others.
The reason that I thought of treating them differently is that these two values are available in every sample from every DC (otherwise, what's the point of that DC). So you can have one call that gives you what you can always get (time, depth) plus the information about everything else that is available.
But if you'd rather have this be three API calls (get_time, get_depth, get_available_data_bitmap or something), I'll be happy with that, too.
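Sketched out, something like this (illustrative only, not an existing API):

dc_status_t dc_sample_get_time (dc_sample_t *sample, unsigned int *seconds);
dc_status_t dc_sample_get_depth (dc_sample_t *sample, double *depth); /* meters */
dc_status_t dc_sample_get_bitmap (dc_sample_t *sample, unsigned int *bitmap);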
For the multivalued data, I'm not sure how well such a struct (containing a length and pointer to an array) will work for language bindings (python, .net, etc). That's something we'll have to check.
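For reference, this is the kind of structure I mean, e.g. for tank pressures (hypothetical):

typedef struct dc_sample_pressures_t {
    unsigned int ntanks;    /* number of entries in the array */
    const double *pressure; /* one value per tank, in bar */
} dc_sample_pressures_t;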
I don't know about .net. In python this would work just fine.
I'm pretty sure such a structure isn't supported out of the box and will need some manual marshaling on the .net side. So my guess is that it can be done, but it won't be very pretty.
Maybe we could consider providing language bindings for the most popular languages (.net and python) ourselves?
I prefer enums over pre-processor constants too. I know this will lose conditional compilation, but that is a trick that only works for C/C++ anyway.
The problem with enums is that you now need strong versioning of your API - something libdivecomputer doesn't have a good history of doing (and I know all the reasons why - don't take this as criticism, just as an observation). So if you use enums, then I have to recreate the logic of "which version of libdivecomputer supports which values for this function" in my app - and so does everyone else. We basically need to create a "version number to feature mapping" in the client.
If you do this with #defines, then this is much easier to take care of. At least for C/C++ programmers - I have seen other language bindings parse the defines for this as well, but I don't know what the current users of libdivecomputer are.
As far as I know, there are at least .NET (divinglog) and python (kenozooid) based applications. I got quite a few questions about .NET support, so I assume there will be others too.
Anyway, I think it's very important not to treat these non-C/C++ applications as second-class citizens, especially because at least one of them (divinglog) is likely the application with the largest userbase. So that means all features should be usable from the language bindings too.
BTW, that's for example one of the main reasons the logging interface doesn't expose a varargs interface, but calls vsnprintf internally to convert the messages to strings. Strings are supported by all languages, varargs are not.
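Simplified, the approach looks roughly like this (a sketch, not the actual code):

#include <stdarg.h>
#include <stdio.h>

static void
log_write (const char *format, ...)
{
    char message[256];
    va_list ap;

    /* Format the varargs into a plain string first... */
    va_start (ap, format);
    vsnprintf (message, sizeof (message), format, ap);
    va_end (ap);

    /* ...so the registered callback (and any language binding behind
     * it) only ever sees a C string, never a va_list. */
    /* callback (message); */
}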
I guess that once we have decided on the final list of dive and sample data that we will support, the chances that we'll have to introduce additional samples in the future are relatively small. I mean, the goal for v1.0 was to maintain backwards compatibility from then on. If we then document the first version number in which every feature appeared, it shouldn't be that much of a hassle to do:
#if DC_VERSION_CHECK(x,y,z)
#endif
and then language bindings can do the same at runtime.
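For example, guarding a hypothetical feature that first appeared in 1.0.0 (and assuming a dc_version() style runtime query next to the DC_VERSION_CHECK macro):

#include <libdivecomputer/version.h>

/* Compile time, for C/C++ applications: */
#if DC_VERSION_CHECK(1, 0, 0)
/* the feature is known to the headers */
#endif

/* Runtime, which is what a language binding would do instead: */
static int
has_feature (void)
{
    dc_version_t version;
    dc_version (&version);
    return (version.major >= 1);
}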
But okay, this is something we can discuss further.
- Supported dive and sample data
==================================
- temperature
air temperature, I assume?
Depends. Different dive computers have different interpretations. It could be the air/water temperature, the min/max/avg value, and so on. If that's even known to us. I'm not sure how to deal with this. We could just supply a temperature and leave the actual interpretation to the user, or we could have an extra flag with the exact interpretation (if known), as you mention below. But even then we're still left with devices that provide more than one temperature.
So offer a couple of values with flags that determine the semantics:
- air_temperature
- min_water_temperature
- avg_water_temperature
And offer flags that clarify the semantics:
- AIR_TEMP_IS_FIRST_SAMPLE
- AVG_IS_LAST_TEMP
- MIN_TEMP_IS_AT_MAX_DEPTH
(btw, I love the last one as I frequently dive in a water body that for more than half of the year has temperature go UP as you go deeper...)
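Encoded as an enum, purely as illustration (names invented here):

typedef enum dc_temperature_flags_t {
    DC_TEMPFLAG_AIR_IS_FIRST_SAMPLE = (1 << 0),
    DC_TEMPFLAG_AVG_IS_LAST_TEMP    = (1 << 1),
    DC_TEMPFLAG_MIN_IS_AT_MAX_DEPTH = (1 << 2),
} dc_temperature_flags_t;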
To be honest, I don't really like this. This again smells like adapting our model to that of the dive computers. If we supply three values, then it breaks again if there is some dive computer out there that supports a fourth value. And with those flags, the semantics can be altered in many different ways. I fear this will quickly get messy on the application side.
If you take a step back and think about it, the temperature is just a poor substitute for those devices without a full temperature profile. If you have a full temperature profile, then those min/max/avg values and begin/end values suddenly become a lot less interesting.
So it might be more interesting to have the begin/end temperature, and the temperature at maximum depth, delivered in the sample data. We already do this today for the suunto vyper backend, and I think it works quite well. Instead of a temperature graph, you now draw a few temperature dots. If we do that, we can then reserve the temperature field exclusively for the true single temperature value case.
Heartrate is only supported by the Uwatec Galileo as far as I know. It's certainly not a must-have feature, but I can imagine that if you have such a device, this is an interesting value. The actual usefulness is questionable if you ask me, but it's a cool gadget :-)
So much of what high end DCs provide is more cool than entirely useful - but I agree that we should make it available regardless. Visualization is the application's job :-)
Yes, that's the idea. As long as it's reasonably device independent (e.g. there is no reason why other devices can't include a heartrate monitor, and beats per minute is a standard unit too) and doesn't complicate the api (e.g. if you don't need it, you just don't read the value and that's it), I have no real objections. The same goes for other types like CNS/OTU.
But things like for example tissue saturation in the decompression model of device X, or RGBM warning Y, that's a whole other story.
I have no idea which devices record this in the data. I know the Suuntos have this weird OLF (Oxygen Limit Factor) value, which is either the CNS or OTU, whichever is higher. And I think the Oceanics have a scale from 1 to 5 or something. So that might be a bit more difficult to support in a uniform way.
I'll keep singing the same song. Pick one semantic as default and provide clearly specified flags to modify that.
But then you introduce extra complexity on the application side. If you get a CNS value, then you have to check all those flags to figure out whether it's a percentage (default), or some scale index, or that weird suunto OLF value, or...
You can't just take the value anymore and treat it as a percentage. From my point of view that makes the abstraction provided by libdivecomputer less useful. I mean, how different is that from doing a model based interpretation:
if (suunto)
    /* OLF */
else if (oceanic)
    /* Range index */
else
    /* Percentage */
I think we are better off always providing a percentage, and if that is not available then you don't get a CNS value at all.
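In other words, with a hypothetical dc_sample_get_cns() accessor the contract would simply be: a real percentage, or nothing at all:

double cns = 0.0;

if (dc_sample_get_cns (sample, &cns) == DC_STATUS_SUCCESS) {
    printf ("CNS: %.0f%%\n", cns * 100.0);
}
/* Devices that only report an OLF or a 1-5 range index
 * simply don't deliver a CNS value. */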
What I had in mind is to have a single DC_SAMPLE_EVENT type. If this bit is set, there are events available. The actual event value would then be another bitmap containing all the events. With this approach we can support several events without needing an additional sample type for each event type.
Good idea.
Then the next step will be to make a list of the events we want.
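For illustration, it could look something like this (bit positions and event names invented here):

/* One bit in the sample bitmap says "events present"... */
#define DC_SAMPLE_EVENT (1 << 5)

/* ...and the value itself is a bitmap of individual events: */
typedef enum dc_event_flags_t {
    DC_EVENT_ASCENT    = (1 << 0),
    DC_EVENT_DECOSTOP  = (1 << 1),
    DC_EVENT_VIOLATION = (1 << 2),
    DC_EVENT_BOOKMARK  = (1 << 3),
    /* ... */
} dc_event_flags_t;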
See what I wrote above. The moment you don't have precisely defined semantics for a value (even if the semantics are just our current best guess based on reverse engineering), the usefulness of that value to the application declines dramatically. An app is always at liberty to ignore that additional information ("hey, here's some temperature, no idea what it is"), but if it isn't there, all that will happen is that apps start making educated guesses based on the type of DC used. And that's just ugly.
I feel very strongly about that. I think an API that doesn't define what the values it gives you actually mean is really of rather limited use.
I agree, although not as strongly as you do.
Let's say we define the meaning of the temperature as the minimum water temperature, which is probably the most common value. But then you happen to have some dive computer which only provides the average water temperature. Since that's all we have, that's what we give to the application, and that's what goes into your logbook. Is the small difference in semantics really that important, considering we don't have any other info? I mean, to the user it won't make much difference, because the semantics will remain consistent across all his dives with that device. In a paper logbook you would also just write down the value you get from your dive computer.
So while I agree that crystal clear semantics would be ideal, I also believe some small deviations are acceptable. You can also put it another way: how can we even be sure about the actual semantics, given that it's all based on reverse engineering? There is always an uncertainty factor present.
I think it gets a lot more dramatic if an application interprets a value incorrectly because it fails to take one of those flags into account. For example, interpreting the oceanic CNS range index as a percentage, which means it would never go past 5% (or some other small value). Ouch.
Jef