Jef Driesen jefdriesen@telenet.be writes:
Yuck, yuck, yuck. It's standard and universally hated. It creates significantly more pointless code and boilerplate on both sides of the API for no real benefit.
And it's a wonderful way to create subtle, really hard to debug problems: you forget to reference the object and keep using it, and Valgrind won't find that you are using 'freed' memory because it isn't freed - it just changes values under you. I absolutely HATE this idea.
Personally, I don't think reference counting is that bad. On the application side it's just a few extra dc_object_ref calls, and dc_object_unref instead of dc_object_free. And you can equally well shoot yourself in the foot with plain malloc. But okay, your preference is noted :-)
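For what it's worth, on the application side it would have looked roughly like this (dc_sample_t and the callback shape are placeholders for whatever the final API ends up being):

static dc_sample_t *saved_sample;

void sample_cb (dc_sample_t *sample, void *userdata)
{
	dc_object_ref (sample);		// take an extra reference inside the callback
	saved_sample = sample;		// now it's safe to keep using it later
}

// ... later, when the application is done with it ...
dc_object_unref (saved_sample);		// drops the last reference and frees it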
I really want to stress my last point. It's debuggability that I am most concerned about. If the application owns the memory, there are very powerful tools to help you figure out where you shot yourself in the foot (ask me how I know).
Please make the memory application-owned and have the application be responsible for freeing it. Unless you are planning to run libdivecomputer on a wristwatch, the overhead for that is negligible with today's OSs (which all have a reasonably smart malloc implementation that will reuse a recently freed memory block on its own).
I certainly agree that this is probably the most intuitive and simple design.
But while I'm not too concerned about the memory allocations by themselves, I still find it a bit silly having to allocate resources where it's not strictly necessary. For the sample data I simply can't think of any use-case where you would want to hold on to the sample objects outside the callback function. Hence my search for possible alternatives.
I understand. But I think you are focusing on the wrong problem. Having a consistent API that treats memory consistently and doesn't require the application to know the internals of your implementation is a HUGE advantage. It makes libdivecomputer easier to use.
"Any memory allocation that libdivecomputer returns to the application is owned by the application and needs to be freed when it is no longer needed".
This is easy to understand, easy to implement, even easy to verify with static analysis tools.
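In practice that means every call that hands out memory pairs with exactly one free call, nothing more. A minimal sketch, with made-up dc_parser_samples_get and dc_sample_free names just to illustrate the pattern:

dc_sample_t *sample = NULL;
if (dc_parser_samples_get (parser, i, &sample) == DC_STATUS_SUCCESS) {
	// ... use the sample ...
	dc_sample_free (sample);	// the application owns it, the application frees it
}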
Elegance and best possible algorithms are honorable goals. But consistency and ease of use (and debuggability) are more important, I think.
- Querying available sample values?
=====================================
With the current design, the samples are pushed to the application, and there is no need to know which values are actually supported. You'll only get those values that are actually present. However, with the new design, the application has to call an accessor function to retrieve a piece of data, and thus it needs some way to query which values are present. The only values that are guaranteed to be available for each sample are time and depth. Everything else is optional.
The level at which we can tell whether a particular sample value is present varies. It can be at the device level (for example a non-air-integrated model will never store tank pressure data), at the dive level (for example a model with a wireless pressure sensor can have dives with and without tank pressure stored), or even at the sample level (for example a tank pressure stored only every x samples).
Or you have a Suunto dive computer and the sample data is randomly missing... :-/
What do you mean?
Oh - we have several Suunto DCs where the wireless sensor keeps dropping out and randomly you have samples with missing pressure... I was basically agreeing with you, giving another scenario where this could be the case.
I was hoping for a somewhat different design.
int dc_sample_get_basic (dc_sample_t *sample, basic_sample_t *basic_sample);
returns DC_STATUS_SUCCESS if you got a sample back, various errors otherwise.
typedef struct basic_sample {
	int time;			// in seconds
	int depth;			// in mm (or you can do a double in meters)
	uint64_t data_available;	// bitmap of what else is available
} basic_sample_t;
Then your header files define those bits and the corresponding data structures.
#define DC_PRESSURE_SAMPLE (1 << 0)

typedef struct dc_pressure_sample {
	int nr_tanks;
	int *tankpressure;
} dc_pressure_sample_t;
So this tells you how many tank pressures there are and gives you a pointer to the array of pressures (or in the typical case to the single pressure) - an integer in mbar or a double in bar or whatever.
And for simple stuff just have an int (or you can go fancy and define types for all of them).
#define DC_TEMPERATURE_SAMPLE (1 << 1)

typedef int dc_temperature_sample_t;	// temperature in mkelvin (or whatever you want)
This way ONE call gets you the basics that every dive computer supports, and at the same time the information about what else is available for THIS sample. And you can keep things API compatible by simply adding new bits whenever you add a new piece of information that you can get in a sample - the app can then do

#ifdef DC_DECO_SAMPLE
	...
#endif

and work with older and newer versions of libdivecomputer...
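To make that concrete, here is a sketch of the application side with the structures above (dc_sample_get_basic as proposed; dc_sample_get_pressure is made up for illustration):

basic_sample_t basic;

if (dc_sample_get_basic (sample, &basic) != DC_STATUS_SUCCESS)
	return;

if (basic.data_available & DC_PRESSURE_SAMPLE) {
	dc_pressure_sample_t pressure;
	dc_sample_get_pressure (sample, &pressure);
	for (int i = 0; i < pressure.nr_tanks; i++)
		printf ("tank %d: %d mbar\n", i, pressure.tankpressure[i]);
}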
I also considered using a bitmap (it was mentioned briefly somewhere in my mail I think), but I couldn't find a way to make it work with vendor extensions. A bitfield is limited to 8 to 64 different types, depending on the integer size we decide to use. For the normal samples, that is probably more than we'll ever need. But if every backend can support a few extension types, and each one needs another bit, then we'll quickly run out of bits. Do you have a solution for this?
Yes. Sort of. You have one bit in your bitmap that says "vendor extension" and then the vendor extension can use another int (16bit or something) with an enum (so 64k values for 16 bits) to indicate which extension this is. So we don't waste the space in every sample, but if there is an extension we have tons of room to grow and add extensions for new things to come.
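Something along these lines - all names made up, with the extension bit deliberately placed at the top of the bitmap:

#define DC_VENDOR_SAMPLE (UINT64_C(1) << 63)	// "a vendor extension is present"

typedef struct dc_vendor_sample {
	uint16_t type;		// which extension this is - room for 64k of them
	const void *data;	// extension-specific payload
	size_t size;
} dc_vendor_sample_t;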
While I'm definitely not opposed to the bitmap idea, I'm not all that happy with the basic_sample_t struct. I would prefer to implement this with separate functions: one for the bitmap itself, and one function for each value. Since there are only two values that are always available, I wouldn't treat them any differently from the others.
The reason that I thought of treating them differently is that these two values are available in every sample from every DC (otherwise, what's the point of that DC). So you can have one call that gives you what you can always get (time, depth) plus the information about everything else that is available.
But if you'd rather have this be three API calls (get_time, get_depth, get_available_data_bitmap or something), I'll be happy with that, too.
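I.e. something along these lines (the exact signatures are just my guess):

int dc_sample_get_time (dc_sample_t *sample, int *time);	// seconds
int dc_sample_get_depth (dc_sample_t *sample, int *depth);	// mm
int dc_sample_get_available (dc_sample_t *sample, uint64_t *bitmap);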
For the multivalued data, I'm not sure how well such a struct (containing a length and pointer to an array) will work for language bindings (python, .net, etc). That's something we'll have to check.
I don't know about .net. In python this would work just fine.
I prefer enums over pre-processor constants too. I know this loses the conditional compilation, but that is a trick that works for C/C++ only.
The problem with enums is that you now have to have strong versioning of your API - something which libdivecomputer does not have a good history of doing (and I know all the reasons why - don't take this as criticism, just as an observation). So if you use enums then I have to recreate the logic of "which version of libdivecomputer supports which values for this function" in my app - and so does everyone else. We basically need to create a "version number to feature mapping" in the client.
If you do this with #defines then this is much easier to take care of. At least for C/C++ programmers - I have seen other language bindings parse the defines for this as well, but I don't know what the current users of libdivecomputer are doing.
- Supported dive and sample data
==================================
- temperature
air temperature, I assume?
Depends. Different dive computers have different interpretations. It could be air/water temperature, the min/max/avg value, and so on. If that's even known to us. Not sure how to deal with this. We could just supply a temperature and leave the actual interpretation to the user, or we could have an extra flag with the exact interpretation (if known) as you mention below. But even then we're still left with devices that provide more than one temperature.
So offer a couple of values with flags that determine the semantics

* air_temperature
* min_water_temperature
* avg_water_temperature

And offer flags that clarify the semantics

AIR_TEMP_IS_FIRST_SAMPLE
AVG_IS_LAST_TEMP
MIN_TEMP_IS_AT_MAX_DEPTH
(btw, I love the last one as I frequently dive in a water body that for more than half of the year has temperature go UP as you go deeper...)
etc.
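In header terms that could look something like this (all names made up, mkelvin picked to match the temperature type above):

typedef struct dc_temperature {
	int air_temperature;		// mkelvin
	int min_water_temperature;	// mkelvin
	int avg_water_temperature;	// mkelvin
	uint32_t flags;			// semantic modifiers, see below
} dc_temperature_t;

#define AIR_TEMP_IS_FIRST_SAMPLE	(1 << 0)
#define AVG_IS_LAST_TEMP		(1 << 1)
#define MIN_TEMP_IS_AT_MAX_DEPTH	(1 << 2)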
For example, for the Suunto Vyper family there is a temperature at the start/end of the dive, and at maximum depth. In this particular case I solved it by injecting artificial temperature values into the samples.
- salinity, altitude/atmospheric pressure
- dive mode (open/closed circuit, gauge, etc)
Samples:
- time
- depth
- temperature
- tank pressure
- heartrate
- airtime
That seems like a good list. Heartrate is the one that seems a bit out of place - are there many computers that support that? I would have seen CNS, NDL as more likely candidates for the default list.
Heartrate is only supported by the Uwatec Galileo as far as I know. It's certainly not a must-have feature, but I can imagine that if you have such a device, this is an interesting value. The actual usefulness is questionable if you ask me, but it's a cool gadget :-)
So much of what high end DCs provide is more cool than entirely useful - but I agree that we should make it available regardless. Visualization is the application's job :-)
CNS/OTU can be calculated from the samples, but we can consider supporting it too.
Agree on both parts. I know several DCs provide CNS information and would love to be able to use that instead of the internally calculated values in Subsurface.
I have no idea which devices record this in the data. I know the Suuntos have this weird OLF (Oxygen Limit Factor) value, which is either the CNS or OTU, whichever is higher. And I think the Oceanics have a scale from 1 to 5 or something. So that might be a bit more difficult to support in a uniform way.
I'll keep singing the same song. Pick one semantic as default and provide clearly specified flags to modify that.
I assume with NDL you mean the remaining no-deco time? I don't think any device supports that. And then you are left with treating NDL as just the absence of deco stop info.
Several do. OSTC and Uemis Zurich both give you their current no deco time in either every sample or every nth sample (OSTC).
- Deco stops (depth and time)
- Warnings, bookmarks, etc
For the events, I'm not sure yet how the querying (see previous item) should be done. The normal samples are continuous values, and will usually appear at least every X samples, or not at all. But the events are discrete in nature and will only appear occasionally. Should this be done differently in the API?
We could do this with the bitmap proposed above, right?
What I had in mind is to have a single DC_SAMPLE_EVENT type. If this bit is set, there are events available. The actual event value would then be another bitmap containing all the events. With this approach we can support several events without needing additional sample types for each event type.
Good idea.
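So roughly, if I understand you correctly (the event names and the accessor are invented for illustration):

#define DC_SAMPLE_EVENT (1 << 2)	// bit in data_available: one or more events present

// the event value is itself a bitmap
#define DC_EVENT_BOOKMARK	(1 << 0)
#define DC_EVENT_ASCENT		(1 << 1)
#define DC_EVENT_DECOSTOP	(1 << 2)

if (basic.data_available & DC_SAMPLE_EVENT) {
	uint32_t events;
	dc_sample_get_events (sample, &events);
	if (events & DC_EVENT_BOOKMARK) {
		// ... handle the bookmark ...
	}
}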
We'll also have to agree on a data structure and the exact semantics for each of those pieces of data. Some won't need much discussion, but others will. For example take the temperature. A device can record a single value which could be the minimum, maximum or average temperature. It can have all three, or a full temperature profile. How do we deal with that? Things are not always as easy as they may look.
I think the best way to deal with that is to have a flags bitfield that is part of every sample. Be generous, at least 64 bits. And simply have a clearly defined default semantic (whatever Suunto uses, or some other common DC), and whenever there's a different semantic for such a value, add a flag bit that again has a clear semantic.
So for example by default temperature is the minimum temperature. And then there can be a DC_DATA_TEMP_IS_AVERAGE bit that tells you that instead this is the average temperature.
So you have one of those flags bitfield in the dive data, and one in each sample. And you do your darnedest to make sure that there is a precise definition of the default meaning of each value, and a precise definition of how the bits modify that.
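For the temperature example that would look something like this (DC_DATA_TEMP_IS_AVERAGE as above, the second flag made up):

// default semantic: temperature is the minimum water temperature
#define DC_DATA_TEMP_IS_AVERAGE	(UINT64_C(1) << 0)
#define DC_DATA_TEMP_IS_AIR	(UINT64_C(1) << 1)

if (dive_flags & DC_DATA_TEMP_IS_AVERAGE) {
	// the temperature value is the average, not the minimum
}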
I fear that the use of extra flags will quickly lead to a mess again. We now have this for the temperature, tomorrow we may need extra flags for something else, and before we know it we end up with flags everywhere.
And it doesn't solve the case where there are multiple temperature values either.
See what I wrote above. The moment you don't have precisely defined semantics of the value (even if the semantic is just our current best guess based on reverse engineering) the usefulness of that value to the application declines dramatically. An app is always at liberty to ignore that additional information ("hey, here's some temperature, no idea what it is"), but if it isn't there all that will happen is that apps start to make educated guesses based on the type of DC used. And that's just ugly.
I feel very strongly about that. I think an API that doesn't define what the values it gives you actually mean is of rather limited use.
Sorry for the possible confusion, but imperial was never an option. It's either metric or SI. The only difference between the two is the temperature (Celsius vs Kelvin). For the other units it's just a difference in scaling factors (Pascal vs bar).
Good!
/D