On 2012-11-21 19:22, Dirk Hohndel wrote:
On Wed, 2012-11-21 at 17:57 +0100, Jef Driesen wrote:
That would work for one vendor extension per sample, but I don't see how you would make that work nicely when there can be multiple extensions per sample. It also makes it harder to use a custom data structure for each extension, instead of just a blob of raw data. Yes, you can abuse the void pointer to point to any structure, but that isn't the prettiest interface.
Let's assume that a device supports a heartrate and an airtime value per sample. And assume for a moment those are not supported with the main api.
If we introduce some kind of a generic dc_sample_get_vendor() function, then we have to make sure it can return both the heartrate and airtime value. We can probably solve that in a similar way as you proposed for the tank pressures. However, my idea was that the backend would provide some dc_xxx_sample_get_heartrate() and dc_xxx_sample_get_airtime() functions with a corresponding data structure. But then a single bit to indicate the presence of a vendor extension isn't sufficient. If every such extension needs its own bit in the main bitfield, then we'll quickly run out of bits. There are many backends and many possible extensions, so we have a virtually unlimited number of possible extensions. We would at least need an independent bitmap just for the extensions, or some other solution.
There are several problems with that idea. Obviously you run out of bits quickly. But more importantly you create a massive amount of function entry points in the library. So your API gets very complex and even worse, for every new feature that you support you add new functions and now you need matching preprocessor code that allows the application to appropriately do conditional compiles...
If it's just the number of functions, then it's easy to achieve the same result with just a single function of course. For example with a function that works like the dc_parser_get_field function in the current api:
dc_sample_get_vendor (sample, type, &value);
I'm fine with a one function approach, so that's not the main problem.
This brings us straight back to the enum vs. #define discussion we had somewhere else. Think about the users of your library, the people writing applications. It is very unpopular to create a hard link between a specific version of an app and a specific version of a library - especially as more software starts using libdivecomputer. So you should assume that applications want to be able to say "works with libdivecomputer 0.3.0 or later" - but then they need a rational way to test if a specific feature is available.
How does this change depending on whether you conditionally compile with:
#if DC_VERSION_CHECK(x,y,z)
or:
#ifdef DC_SAMPLE_FOO
In both cases, the compiled version will depend on a certain minimum version, but the code can be compiled for several versions.
In the first case the developer has to look up once (e.g. when doing the implementation) from which version on a particular feature is supported. But from there on, it's crystal clear from reading the code which version you need to get that feature. In the second case, you can blindly add the #ifdef, but now you can't tell from reading the code which libdivecomputer version you need. So the "works with libdivecomputer x.y.z or later" part just became more difficult to see if you ask me.
Why do I mention this here? Because with your function name based approach, you now need a matching feature flag for each function that you introduce. Not my idea of a great API.
Now imagine the other extreme.
dc_sample_get_vendor() returns a structure like this:
struct dc_sample_vendor_t {
    unsigned int type;
    unsigned int size;
    const void *data;
};
The type is a unique number that you assign that precisely describes the data that is delivered. Better yet, you can reuse that number if multiple computers deliver the same data... let's assume heartrate... my guess is that almost every computer will give that to you as an integer, and the backend can pad this so it's 16 bit wide (and deal with other tricks that the dc might do to fit it into the fewest number of bits). So now you have ONE type
#define DC_VSAMPLE_HEARTRATE 0xABC
and data is known to be simply a pointer to the number of beats per minute.
This seems MUCH cleaner than an API that requires you to call
dc_uwatec_get_sample_heartrate()
if the user happens to have an Uwatec computer with heartrate support but
dc_suunto_get_sample_heartrate()
if it's a Suunto that now supports heartrate.
Instead what you do is this.
Foreach(sample) {
    dc_get_sample_basic(&basic_data); // basic_data has time, depth
    while (dc_sample_get_vendor(time, &sample_vendor)) {
        switch (sample_vendor.type) {
        case DC_VSAMPLE_HEARTRATE:
            ...
            break;
        ...
        default:
            // Jef added more features, need to update app
        }
    }
}
Now as I said, that's an extreme example and pushes EVERYTHING into the sample_get_vendor logic. What might be more reasonable is to have a few common values (as you suggested earlier):
Foreach(sample) {
    dc_get_sample_basic(&basic_data); // basic_data has time, depth
    if (basic_data.extradatabits & DC_SAMPLE_TANK_PRESSURE)
        dc_get_tank_pressure(time, &tank_pressure);
    if (basic_data.extradatabits & DC_SAMPLE_DECO_NDL)
        dc_get_deco_ndl(time, &deco_ndl);
    while (dc_sample_get_vendor(time, &sample_vendor)) {
        switch (sample_vendor.type) {
        case DC_VSAMPLE_HEARTRATE:
            ...
            break;
        ...
        default:
            // Jef added more features, need to update app
        }
    }
}
So we break out the common pieces of information and make them available with their own API functions... but that of course brings us back to the type specific flags:
struct dc_deco_ndl_t {
    int type;
    int depth;    // in mm
    int duration; // in s
    int tts;      // in s
}
Where type then tells you if this dive computer provides no duration, just the duration of the deepest stop, just the time to surface, or both.
The heartrate was maybe a very bad example. If Uwatec and Suunto both support a heartrate with the same beats per minute format, then that's already a first indication that it should probably be supported through the main interface, and not as a vendor extension. But okay, this was just a theoretical example.
Anyway, I see the vendor extensions as something to support things that are really *highly vendor specific*. Take for example the oceanics. They have a fixed 8 bytes of data per sample. Each model stores roughly the same set of information (o2bg, n2bg, pressure, depth, vari, dtr, atr, deco, temperature, alarms, etc), but each model packs this data slightly differently into those 8 bytes. (This is a real pain in the ass, and the reason why the oceanic code needs by far the most bugfixes.) Here we could support the basic fields (time, depth, temperature, etc) through the standard interface, and leave everything else as raw data. But as a vendor extension, we could take care of the crazy oceanic data packing internally, and present the data in some nicer struct:
struct dc_oceanic_sample_t {
    unsigned int o2bg, n2bg; /* O2 and N2 bar graph */
    unsigned int dtr, atr;   /* Dive and air time remaining */
    ...
};
This type of extension is extremely unlikely to be shared between different backends. So I think that one of the reason we have a different opinion is that we also have a different view on what we consider to be a vendor extension.
I'm also convinced that there will probably be very few applications that actually make use of the vendor extensions. If you look at the general purpose logbook applications that exist today (divinglog, macdive, etc), then you'll notice they only support the device independent features that would be supported by the standard interface. I don't see how you can even write an application that supports all possible features of each device, considering the vast amount of differences. The only applications that support these features are the vendor applications, which are by definition tied to some specific devices. But such applications are not really our (main) target audience.
BTW, these vendor extensions have the lowest possible priority for me. There will be plenty of work to get the standard interface implemented and reverse engineering bugs to fix. So the first version(s) will likely support no vendor extensions at all. However while this is pretty low on my priority list, I think it's good to keep future extensibility in mind.
We can maybe consider to provide language bindings for the most popular languages (.net and python) ourselves?
Not a bad idea - but what is a bad idea is to then go for the least common denominator and create an ugly, badly implemented C API just because it's harder to map a well designed API into .net constructs.
You can always have the .net binding create its own data structures that follow the preferred way to do things in .net - but don't force that onto the C API...
The reason why I mentioned maintaining the language bindings as part of the project is that, if there is some ugly and obscure construct required, then we can take care of that ourselves, and hide it from the user. Then we don't have to worry anymore about users having trouble with the .NET integration.
Unfortunately I speak from experience here. I spent considerable amounts of time on the .NET interface for the current api (for divinglog). There were several non-trivial issues (calling conventions, interactions with the garbage collector, manual marshaling, etc) that required quite some research on my side to figure out why things were not working as expected. I'm pretty sure the average .NET developer wouldn't be able to get this right either.
So I'm not saying we should model after the least common denominator, just that we have to take care that whatever api we pick, it should be possible to use from other languages. That's why I referred to the varargs functions for example.
Anyway, I think it's very important to not treat these non C/C++ applications as second class citizens, especially because at least one of them (divinglog) is likely the application with the largest userbase. So that means all features should be usable from the language bindings too.
Certainly. But since these apps are binary-only they live in a somewhat different world where they can simply take a random snapshot of libdivecomputer, implement all what they want from the API and not worry about libdivecomputer changing (or the user picking a different version of libdivecomputer). For many other applications life isn't that simple.
The rules for binary compatibility are the same for C and .NET applications. If your application depends on symbol X, and you switch back to an older library without this symbol, then both the C and .NET app will fail. They may fail in some different way, but that doesn't matter for this discussion.
Sure, I could fork libdivecomputer and simply pull whatever version I want to rely on into the Subsurface git and tell people to just build from that version - but that's not my vision of how software engineering across projects should work. I wouldn't want to do that for libxml2, glib, gtk, libc... so why should I have to do that for libdivecomputer?
So fundamentally:
If your API does not allow an application a sane way to detect which features are supported and how to conditionally compile for those features, then your API is broken.
Sure, that's definitely not what we want :-)
I'm just trying to understand why you dislike the DC_VERSION_CHECK and prefer the DC_SAMPLE_FOO conditional compilation. In the end, regardless of how you do the conditional compilation, your application will need to know the data format associated with each vendor extension. That is also true whether you use an individual function for each extension, or one function with some type indicator. I mean everything that you wrote about the gyro stuff below can be done with both your proposal and mine.
I guess that once we have decided on the final list of dive and sample data that we will support, the chances that we'll have to introduce additional samples in the future are relatively small. I mean the goal for the v1.0 was to maintain backwards compatibility from then on. If we then document the first version number in which every feature appeared, then it shouldn't be that much of a hassle to do:
#if DC_VERSION_CHECK(x,y,z)
#endif
and then language bindings can do the same at runtime.
Certainly. But think twice about what your API will look like. How many functions will you have? What data types do they return? And how are they extended to provide more data as that becomes available from dive computers (think details about the deco model, saturation of different compartments, gyro information about vertical and lateral movement, pictures that were taken during that sample, voice recordings of the conversation during the dive... I know you think I'm insane, but look at what the Suunto Solution provided and what you can get from a state of the art DC like the OSTC or the Uemis SDA... to assume that the speed of innovation will suddenly come to a halt is being really foolish).
That's why I really wasn't joking with my extreme example above. Have the smallest possible API with the most flexible data types that you can think of. In my model it is easy and transparent to add
#define DC_VSAMPLE_GYRO 0xDEAD
and then have the data pointer point to a
struct dc_gyro_t {
    int flags;
    int delta_north; // in mm
    int delta_west;  // in mm
    int accuracy;    // in mm
};
which allows you to track the movement of the diver. And all the application does is
#if defined(DC_VSAMPLE_GYRO)
...
#endif
in the switch statement that loops over all the vendor samples. Clean, easy to maintain, easy to handle from an application point of view.
Now imagine the same thing with an API that only supports per vendor data without complex data structures. So now you are adding the following entry points to your API:
dc_atomics_gyro_get_delta_north()
dc_atomics_gyro_get_delta_west()
dc_atomics_gyro_get_delta_accuracy()
dc_uemis_gyro_get_delta_north()
dc_uemis_gyro_get_delta_west()
dc_uemis_gyro_get_delta_accuracy()
etc. And imagine the code flow in the application that has to duplicate all this crap.
Oh, but that was definitely not my intention. There is absolutely no reason to limit a vendor extension to just a single value. It's perfectly acceptable to use a struct like your dc_gyro_t.
To be honest, I don't really like this. This again smacks of adapting our model to that of the dive computers. If we supply three values, then it breaks again if there is some dive computer out there that supports some fourth value. And with those flags, the semantics can be altered in many different ways. I fear this will quickly get messy on the application side.
Quite the contrary. This allows the application to actually understand the data it gets in a dive computer INDEPENDENT way. If all the app cares for is "give me some temperature" then it ignores the flags. If the application would like to be able to plot the temperature where it was taken (even on the dive computer that may only record the temperature at the deepest point of the plot or the minimum temp or something) then it can do so in a sane way.
What you are proposing is to simply pretend that all dive computers are the same and they give us a "temperature". And if the application wants to know what this temperature actually means, tough shit, write DC specific code.
[...]
I'll keep singing the same song. Pick one semantic as default and provide clearly specified flags to modify that.
At this point I might as well abandon using libdivecomputer (or use libdivecomputer to do the downloading of raw data) and write the rest in my application - that way at least I know what I'm getting.
Do you see my point? I don't want to adapt libdivecomputer to ONE SPECIFIC divecomputer. I want libdivecomputer to be designed so that it can give me accurate and well defined information for EVERY divecomputer it supports. That's why I want the flags. That's why I want the flexibility.
You seem to want to have an API which provides the information as it would come from some idealized divecomputer with a fixed feature set and ill defined semantics and then let the application deal with the rest. I want the API to be able to provide us the maximum amount of data with the best possible semantic definition for each and every divecomputer out there - and not dumb it down to some least common denominator or some middle ground.
[...]
So it might be more interesting to have the begin/end temperature, and the temperature at maximum depth delivered in the sample data. We already do this today for the suunto vyper backend, and I think it works quite well. Instead of a temperature graph, you now draw a few temperature dots. If we do that, we can then reserve the temperature field exclusively for the true single temperature value case.
And again, you take a specific computer and try to make the model match that computer.
I try to provide an API that allows every computer to provide as much data as they have. And then have the app decide which information it wants to display and how it wants to display it.
So if your computer gives you a full temperature graph on every sample, GREAT. If the computer doesn't give sample temperature at all but only a beginning and min temperature - provide that instead. And if it is end temperature and temperature at maximum depth, then provide that AND DON'T PRETEND THEY ARE BEGINNING TEMPERATURE AND MINIMUM TEMPERATURE. Provide them as what they are and let the app decide what to do with them.
I'm not trying to make the model match one specific dive computer. Placing the different temperature values on the profile would be completely independent of any dive computer model for example. It can be done for every model. Imagine there is some crazy dive computer that gives you temperature at the decostops. That would still work!
Let me go back to the temperature problem, and formulate it in a different way. At the moment we already have five possible interpretations (begin/end, min/max/avg, and at maximum depth). A device can support one or more of them, in all possible combinations. Let's say I agree with you and we want to preserve the exact semantics. What should the api and data structure look like? Just a single temperature with a flag won't work for devices that have two values. If we extend the data structure with some more temperature fields (and flags), then it will break again if there is another device with one more value. How would you solve this problem?
So I'm not saying I'm against trying to preserve the exact temperature semantics. What I don't like are the flags you proposed like MIN_TEMP_IS_AT_MAXDEPTH, because it alters the interpretation of the min_temperature field. I'm not convinced that's the right solution. So maybe we should think again and try to come up with a third solution. I'll start by providing a couple of ideas that crossed my mind:
I already mentioned placing the different temperature values into the samples. We could also support each temperature variant as a different type: DC_FIELD_TEMPERATURE_{MIN,MAX,...}. When a particular variant isn't available we don't set the corresponding bit (or return UNSUPPORTED). Or we define a structure with all the variants we wish to support, along with a bitmap to tell which fields are present (or when using Kelvins, we could consider a value of 0 as not present):
struct dc_field_temperature_t {
    bitmap;
    minimum;
    maximum;
    ...
}
But then you introduce extra complexity on the application side. If you get an CNS value, then you have to check all those flags to figure out whether it's a percentage (default), or some scale index, or that weird suunto OLF value, or...
You can't just take the value anymore and treat it as a percentage. From my point of view that makes the abstraction provided by libdivecomputer less useful. I mean how different is that from doing a model based interpretation:
if (suunto)
    /* OLF */
else if (oceanic)
    /* Range index */
else
    /* Percentage */
I think we are better off always providing a percentage, and if that is not available then you don't get a CNS value at all.
No, Jef, that is NOT being better off. That is better off assuming all you care about are the people who have a computer that follows the model that you pick and everyone else can go screw themselves. We are better off if we provide the app as much information as is there.
And if application writers are as lazy as you seem to assume they are more than welcome to say
get the CNS
if (flag & CNS_IS_PERCENTAGE)
    show cns value
else
    tell user to go fuck themselves and buy my dive computer
Subsurface would rather be able to figure out a way to display the CNS regardless how the dive computer manufacturer decides to encode it - and figure out new ways to do this if new dive computers with a better (or even only different) model show up.
CNS values are defined as percentages, just like depths are defined as lengths. The fact that some vendors choose to store it differently (to save some bits?) doesn't change that. Just like we convert all depths to a standard unit, we would do the same for CNS. If that's not possible because it's some vendor specific concept like OLF, then we can't do the conversion and we act as if the device doesn't support CNS. If you allow the CNS value to contain different types of values (e.g. CNS or OLF depending on the flag), then why not also leave the depth in imperial or metric units, and provide a flag with the units? Because to my mind that's exactly the same as what you propose here for CNS. (I know CNS is just an example here, but the same logic applies to other values too of course.)
From my point of view, one of the main advantages of using libdivecomputer is that it provides an abstraction. Call that abstraction an idealized dive computer if you like. But as is the case with all abstractions, you can't support every single feature that's out there. So yes, in some cases you will end up with a more limited subset than what the device supports. I believe that's a trade-off you have to make.
I agree, although not as strong as you do.
Let's say we define the meaning of the temperature as the minimum water temperature, which is probably the most common value. But then you happen to have some dive computer which only provides the average water temperature. Since that's all we have, that's what we give to the application and goes into your logbook. Is the small difference in semantics really that important, considering we don't have any other info? I mean to the user it won't make that much difference, because the semantics will remain consistent across all his dives with that device. In a paper logbook you would also just write down the value that you get from your dive computer.
Again, you assume that the application is lazy. If what libdivecomputer does is "heck, it's the only temperature I get, let's claim it's the minimum", then all the application can do is show the minimum temperature. And if my computer picks the temperature at maximum depth, and I am diving in Hoodsport in the winter, and the computer gives 10C as the temperature at maximum depth, but the lazy-ass app then claims "minimum temperature was 10C", then I will think that the application author is on crack, while in reality it is libdivecomputer that is giving me information that is WRONG and therefore worthless.
Why is this so hard to understand?
If the other application authors really are that lazy, fine, they can ignore the flags that correctly specify what type of information is provided. Or if there is no sane way to present the information that is provided (heck, what DO those Suunto OLF values mean? beats me...)
Suunto OLF (Oxygen Limit Fraction) is the maximum value of CNS and OTU, both expressed as percentages. So it's either CNS or OTU, but you can't tell which one.
But that is a decision you should LEAVE TO THE APP and not make in libdivecomputer.
From what you are writing I think that you want an API that simplifies things to a degree that destroys information. And I am vehemently opposed to that.
While I really understand your concern regarding the temperature problem (and we can probably find some way around it), I really don't get the CNS one. For the temperature, ignoring the flag will still yield a temperature. The semantics may be a bit wrong, but it's still a temperature value. However for the CNS value you would get something completely different! So this is not about applications being lazy, but about providing an abstract model with standardized units. By not supporting certain vendor specific concepts (e.g. OLF) we are actually preserving that information, albeit in the form of raw data or a vendor extension.
So while I agree that crystal clear semantics would be ideal, I also believe some small deviations are acceptable. You can also put it another way: how can we even be sure about the actual semantics, given that it's all based on reverse engineering and there is always an uncertainty factor present?
It is less and less based on reverse engineering as more and more dive computer vendors seem willing to give us API information. And I think this trend will only continue (of the ones that I have contacted, only one so far has told me "no, we will not provide that").
I don't know which vendors you contacted, but as far as I know the "big players" like Suunto, Uwatec, Mares, etc have no interest in supporting third-party applications. Only the relatively new players on the market are supporting us, like Reefnet, HW, Atomics, Shearwater, etc. No matter how great those few vendors are, the majority of the divers are still using a device from one of the big players.
I think it gets a lot more dramatic if an application interprets a value incorrectly because it fails to take one of those flags into account. For example interpreting the oceanic CNS range index as a percentage, which means it will never go past 5% (or some other small value). Ouch.
So you are saying you are doing a better job by not giving any data at all in that case, instead of providing the data with a flag that defines what it means?
Yes. See above for my reason behind it. If it doesn't fit into the abstraction provided by libdivecomputer, then it's likely a vendor specific feature, and should be treated likewise.
Jef