Next round of api improvements.

Jef Driesen jefdriesen at telenet.be
Fri Nov 16 18:40:46 UTC 2012


On 2012-11-15 19:43, Dirk Hohndel wrote:
> On Thu, 2012-11-15 at 12:16 +0100, Jef Driesen wrote:
>> C. Reference counting
>>
>> An alternative option would be to make the new objects reference
>> counted. This changes nothing to the fact that we have to decide 
>> whether
>> the application has to take a reference explicitly (e.g. case A) or
>> gains it automatically (e.g. case B). But with the reference count, 
>> we
>> can detect whether the application still holds a reference or not, 
>> and
>> avoid unnecessary memory allocations:
> [...]
>> It's of course slightly more complex to implement, and we have to 
>> make
>> sure the reference counting is thread-safe too using atomic 
>> operations
>> or protected with a mutex), but I think the resulting api will be 
>> more
>> intuitive. Reference counting is pretty standard today.
>
> Yuck, yuck, yuck. It's standard and universally hated. It creates
> significantly more pointless code and boilerplate on both sides of 
> the
> API for no real benefit.
>
> And it's a wonderful way to create subtle, really hard to debug 
> problems
> - you forget to reference the object and keep using it - Valgrind 
> won't
> find that you are using 'freed' memory because it isn't freed - it 
> just
> changes values under you. I absolutely HATE this idea.

Personally, I don't think reference counting is that bad. On the 
application side it's just a few extra dc_object_ref calls, and 
dc_object_unref instead of dc_object_free. And you can equally well 
shoot yourself in the foot with plain malloc. But okay, your preference 
is noted :-)

> Please make the memory application owned and have the application be
> responsible for freeing memory. Unless you are planning to run
> libdivecomputer on a wristwatch the overhead for that is negligible 
> with
> today's OSs (that all have a reasonably smart malloc implementation 
> that
> will reuse the recently freed memory block on its own).

I certainly agree that this is probably the most intuitive and simple 
design.

But while I'm not too concerned about the memory allocations by 
themselves, I still find it a bit silly having to allocate resources 
where it's not strictly necessary. For the sample data I simply can't 
think of any use-case where you would want to hold on to the sample 
objects outside the callback function. Hence my search for possible 
alternatives.

>> D. Mixed ownership
>>
>> The last option is to allow different ownership rules for the dive 
>> and
>> sample objects. For example we could choose "applications owns" for 
>> the
>> dives, and "library owns" for the samples. In that case, the dives 
>> can
>> be used outside the callback function, while their is no unnecessary
>> overhead for the samples. The obvious downside is less consistency.
>
> I vote for consistency any time.

I usually prefer consistency too...

The idea originated from the fact that there exits two types of 
containers. In a traditional container (e.g. array, linked list, etc) 
the elements are already owned by the container. So when you iterate 
such a container, you just get a direct reference to the element in the 
container. There is no need to deal with ownership anywhere, because 
that's already taken care of.

But in our case, the containers act more like generators, where the 
elements are produced on-the-fly. There it makes sense to have the 
application own the generated items. However, for the samples, you can 
consider a dive as a traditional container for the samples. The samples 
are already present, they just need to be wrapped in a device 
independent structure.

>> 2. Querying available sample values?
>> =====================================
>>
>> With the current design, the samples are pushed to the application, 
>> and
>> there is no need to know which values are actually supported. You'll
>> only get those values that are actually present. However, with the 
>> new
>> design, the application has to call an accessor function to retrieve 
>> a
>> piece of data, and thus it needs some way to query which values are
>> presented. The only values that are guaranteed to be available for 
>> each
>> sample are time and depth. Everything else is optional.
>>
>> The level at which we can tell whether a particular sample value is
>> present varies. It can be at the device level (for example a non-air
>> integrated model will never store tank pressure data), at the dive 
>> level
>> (for example a model with a wireless pressure sensor can have dives 
>> with
>> and without tank pressure stored), or even at the sample level (for
>> example a tank pressure every x sample).
>
> Or you have a Suunto dive computer and the sample data is randomly
> missing... :-/

What do you mean?

>>  To support all possible cases
>> this querying should thus be done at the sample level.
>>
>> I think the simplest solution is to use the accessor function itself 
>> to
>> probe for the presence of the info. You just call the accessor 
>> function
>> and if the info isn't available, a special error code like 
>> UNSUPPORTED
>> is returned:
>>
>> if (dc_sample_get_depth (sample, &depth) == DC_STATUS_SUCCESS) {
>>     /* Depth retrieved successfully. */
>> } else {
>>     /* No depth available (or an error occurred). */
>> }
>>
>> For multivalued data, like the tank pressure, we could then use an
>> extra index parameter in the accessor function, and then call it
>> repeatedly until the UNSUPPORTED error is returned:
>>
>> i = 0;
>> while (dc_sample_get_pressure (sample, &pressure, i) ==
>> DC_STATUS_SUCCESS) {
>>     /* Tank pressure retrieved successfully. */
>>
>>     i++;
>> }
>>
>> An alternative solution for the multivalued data would be to 
>> retrieve
>> all values at once using an array:
>>
>> dc_sample_get_pressure (dc_sample_t *sample, double pressure[],
>> unsigned int n);
>>
>> But then the caller somehow has to know the maximum value for n, and
>> the api also has to communicate back the actual number of values
>> returned. The second part is easy with an extra output parameter, 
>> but I
>> don't know a nice solution for the first problem (other than a 
>> second
>> dc_sample_get_pressure_count function).
>>
>> Alternatives for the probing approach outlined above, are an 
>> iterator
>> like interface to iterator over the available sample values, or a
>> function to retrieve a bitmap with the available sample values.
>
> I was hoping for a somewhat different design.
>
> int dc_sample_get_basic(sample, basic_sample);
>
> returns DC_STATUS_SUCCESS if you got a sample back, various errors
> otherwise.
>
> basic_sample_t {
> 	int time; // in seconds
> 	int depth; // in mm (or you can do a double in meters)
> 	uint64_t data_available; // bitmap on what else is available
> }
>
> Then your header files define those bits and the corresponding data
> structures.
>
> #define DC_PRESSURE_SAMPLE 1<<0
> dc_pressure_sample_t {
> 	int nr_tanks;
> 	int *tankpressure;
> }
>
> So this tells you how many pressure samples and returns a pointer to 
> the
> array of pressures (or in the typical case to the single pressure) - 
> an
> integer in mbar or a double in bar or whatever.
>
> And for simple stuff just have an int (or you can go fancy and define
> types for all of them).
>
> #define DC_TEMPERATURE_SAMPLE 1<<1
> typedef dc_temperature_sample_t int; // temperature in mkelvin (or
> whatever you want)
>
> This way ONE call gets you the basics that every dive computer 
> supports
> and at the same time the information what else is available for THIS
> sample. And you can keep things API compatible by simply adding new 
> bits
> whenever you add a new piece of information that you can get in a 
> sample
> - the app can then do
>
> #ifdef DC_DECO_SAMPLE
> ...
> #endif
>
> and work with older and newer versions of libdivecomputer...

I also considered using a bitmap (it was mentioned briefly somewhere in 
my mail I think), but I couldn't find a way to make it work with vendor 
extensions. A bitfield is limited to 8 to 64 different types, depending 
on the integer size we decide to use. For the normal samples, that is 
probably more than we'll ever need. But if every backend can support a 
few extension types, and each one needs another bit, then we'll quickly 
running out of bits. Do you have a solution for this?

While I'm definitely not opposed to the bitmap idea, I'm not that that 
happy with the basic_sample_t struct. I would prefer to implement this 
with separate functions. One for the bitmap itself, and one function for 
each value. Since there are only two values that are always available, I 
wouldn't treat them any different from the others.

For the multivalued data, I'm not sure how well such a struct 
(containing a length and pointer to an array) will work for language 
bindings (python, .net, etc). That's something we'll have to check.

I prefer enums over pre-processor constants too. I know this will loose 
the conditional compiling, but this is a trick that will work for C/C++ 
only.

>> 3. Supported dive and sample data
>> ==================================
>>
>> Which dive data and sample data should we support?
>>
>> I already made a tentative list, but this is open for discussion of
>> course. Items can be added or removed after discussion.
>>
>> Dives:
>>   * date/time
>>   * divetime
>
> length of dive as reported by DC?

Yes, and if such value isn't available, fallback to a value calculated 
from the samples. Exactly the same as we do today.

>>   * maxdepth
>>   * gasmixes
>
> That's variable length, right?
>
>>   * tank pressures (begin/end)
>
> ditto

Yes, multiple gases and tanks will be fully supported.

>>   * temperature
>
> air temperature, I assume?

Depends. Different dive computers have different interpretations. It 
could be air/water temperature, the min/max/avg value, and so on. If 
that's even know to us. Not sure how to deal with this. We could just 
supply a temperature, and leaving the actual interpretation to the user, 
or we could have an extra flag with the exact interpretation (if known) 
as you mention below. But even then we're still left with devices that 
provide more than one temperature.

For example for the suunto vyper family, there is a temperature at the 
start/end of the dive, and at maximum depth. In this particular case I 
solved this by injecting artificial temperature values into the samples.

>>   * salinity, altitude/atmospheric pressure
>>   * dive mode (open/closed circuit, gauge, etc)
>
>
>> Samples:
>>   * time
>>   * depth
>>   * temperature
>>   * tank pressure
>>   * heartrate
>>   * airtime
>
> That seems like a good list. Heartrate is the one that seems a bit 
> out
> of place - are there many computers that support that? I would have 
> seen
> CNS, NDL as more likely candidates for the default list.

Heartrate is only supported by the Uwatec Galileo as far as I know. It 
certainly not a must-have feature, but I can imagine that if you have 
such a device, this is an interesting value. Although the actual 
usefulness is questionable if you ask me, but it's cool gadget :-)

CNS/OTU can be calculated from the samples, but we can consider to 
support it too. I have no idea which devices record this in the data. I 
know the Suunto's have this weird OLF (Oxygen Limit Factor) value, which 
is either the CNS or OTU, whichever is higher. And I think the Oceanics 
have a scale from 1 to 5 or something. So that might be a bit more 
difficult to support in a uniform way.

I assume with NDL, you mean the remaining no-deco time? I don't think 
any device does support that. And then you are left with NDL equals the 
absense of decostop info.

> Also, your tank pressure contains the tank index as well, right?

Of course.

>> Events:
>>   * Gas changes
>
> as we discussed in earlier email - I think having the tank index in 
> the
> sample is a better solution...

Indeed.

>>   * Deco stops (depth and time)
>>   * Warnings, bookmarks, etc
>>
>> For the events, I'm not sure yet how the querying (see previous 
>> item)
>> should be done. The normal samples are continuous values, and will
>> usually appear at least every X sample, or not at all. But the 
>> events
>> are discrete in nature and will only appear occasionally. Should 
>> this be
>> done differently in the api?
>
> We could do this with the bitmap proposed above, right?

What I had in mind, is to have a single DC_SAMPLE_EVENT type. If this 
bit is set, there are events available. The actual event value would 
then be another bitmap containing all the events. With this approach we 
can support several events without needing additional sample types for 
each event type.

>> We'll also have to agree on a data structure and the exact semantics
>> for each of those pieces of data. Some won't need much discussion, 
>> but
>> others will. For example take the temperature. A device can record a
>> single value which could be the minimum, maximum or average 
>> temperature.
>> It can have all three, or a full temperature profile. How do we deal
>> with that? Things are not always as easy as they may look.
>
> I think the best way to deal with that is to have a bitfield flags 
> that
> is part of every sample. Be generous, at least 64 bits. And simply 
> have
> a clearly defined default semantic (whatever Suuntu uses, or some 
> other
> common DC) and whenever there's a different semantic for such a 
> value,
> add a flag bit that again has clear semantic.
>
> So for example by default temperature is the minimum temperature. And
> then there can be a DC_DATA_TEMP_IS_AVERAGE bit that tells you that
> instead this is the average temperature.
>
> So you have one of those flags bitfield in the dive data, and one in
> each sample. And you do your darnedest to make sure that there is a
> precise definition of the default meaning of each value, and a 
> precise
> definition of how the bits modify that.

I fear that the use of extra flags will quickly lead to a mess again. 
We now have this for the temperature, tomorrow we may need extra flags 
for something else, and before we know we end up with a flags 
everywhere.

And it doesn't solve the case where there are multiple temperature 
values either.

>> Closely related to the data structures is the choice of the unit
>> system. The choice is either metric or SI. SI units have the 
>> advantage
>> all values are positive, but applications are likely to use metric 
>> or
>> imperial. We can also consider to introduce a unit system as used in
>> subsurface, where all values are scaled to fit into an integer (e.g.
>> depths in millimeters instead of meters, and so on)? Or are we happy
>> with floating point values?
>
> I'm obviously biased here... but in the end, I'll be happy with 
> floats
> or integers. But please, please, please, ONLY USE SI-related values.
> It's the applications job to convert this into whatever insane
> preferences the user has. Distance in ft, volume in liter, 
> temperature
> in C, pressure in psi... whatever. But libdivecomputer should ALWAYS
> give us SI-related units (I say related because one can argue if C or 
> K
> is the right temperature value - we went with mK because that gives 
> us
> plenty of precision in an unsigned integer - but one might reasonably
> argue that float centigrade is an equally valid SI-related unit... 
> just
> don't allow F or other random non SI units...). If the DC is silly
> enough to return data in the unit system that the user has set up, 
> then
> libdivecomputer should convert that into sane units.

Sorry for the possible confusion, but imperial was never an option. 
It's either metric or SI. The only difference between the two is the 
temperature (Celsius vs Kelvin). For the other units it's just a 
difference in scaling factors (Pascal vs bar).

>> That's it for now. Comments are welcome! If you think I missed
>> something important, don't hesitate to bring it up. It's not 
>> unlikely
>> that I forgot about something :-)
>
> I like the overall direction. And I'm glad that we can discuss this 
> hear
> before it is set in stone... the current API has caused us some
> frustration here and there and this is a great opportunity to create
> something that can be maintained in a backwards compatible manner for 
> a
> long time to come...

Great.

Jef





More information about the Devel mailing list