Next round of api improvements.
Jef Driesen
jefdriesen at telenet.be
Thu Nov 15 11:16:14 UTC 2012
Hi,
For the next release, I'm planning to introduce a new dc_dive_t and
dc_sample_t interface. This will again be a major change to the
libdivecomputer api. Please note that the interface presented in this
email is by no means a final version. It's intended to serve as a
starting point for further discussions. Now, let's have a closer look.
The role of the new dc_dive_t object has been discussed before [1], so
I'll just summarize here. In a nutshell, the new dc_dive_t object is
basically a container for the raw dive data, with the parser
functionality built-in. As a result, there is no need anymore to create
a dc_parser_t object explicitely.
The dc_device_foreach() function will be modified to return dc_dive_t
objects directly in the callback, instead of the raw dive data. The raw
dive data (and fingerprint) can be retrieved from the dc_dive_t object
by means of the dc_dive_get_raw() and dc_dive_get_fingerprint()
functions. The dive metadata can be retrieved with a number of accessor
functions, similar to the current dc_parser_t interface:
int
dive_cb (dc_dive_t *dive, void *userdata)
{
/* Get raw dive and fingerprint data. */
dc_dive_get_raw (dive, &data, &size);
dc_dive_get_fingerprint (dive, &fpdata, &fpsize);
/* Get dive metadata directly. */
dc_dive_get_datetime (dive, &datetime);
dc_dive_get_maxdepth (dive, &maxdepth);
...
}
Note that this new dc_dive_t interface will also simplify the
transition from a callback based interface to an iterator based
interface, because the dc_iterator_next() function can now return the
dc_dive_t object:
dc_dive_t *dive;
dc_iterator_t *iterator;
dc_device_iterator (device, &iterator);
while (dc_iterator_next (iterator, &dive) == DC_STATUS_SUCCESS) {
/* Contents of the callback function goes here. */
}
dc_iterator_free(iterator);
Although this will be something for later, we can already keep this in
mind of course.
For parsing the samples, I have something very similar in mind. With
the current interface, the application receives a stream of sample
values (e.g. time, depth, etc), and the time sample serves as the marker
to start a new sample. However, with the new interface, the sample
values would be accumulated internally in the new dc_sample_t object.
Once the entire sample has been parsed, the sample is passed to the
callback function. As a result, there is no need anymore to treat the
time sample as special, and the individual sample values can be
retrieved with a number of accessor functions:
void
sample_cb (dc_sample_t *sample, void *userdata)
{
dc_sample_get_time (sample, &time);
dc_sample_get_depth (sample, &depth);
...
}
With these new dive and sample objects it's also possible to add vendor
specific extensions in the future. We just need a new accessor function
and a corresponding data structure.
So far the overview of the high level design that I had in mind. The
remainder of this email consists of more detailed information on a
number of technical challenges that will have to be addressed in this
new design.
1. Object ownership, lifetime and memory management.
=====================================================
How should we deal with the ownership of the new objects?
With the current api, the application has to call dc_parser_new()
function explicitly, and their is no doubt that the application is the
owner of the returned object. However, because the new dive and sample
objects will be allocated internally, their ownership is not immediately
obvious. It all depends on the internal implementation. There are a
couple of options, each with their own advantages and disadvantages:
A. Library owns
The first option is to make the library the owner of the objects. In
this case, the object is allocated internally, passed to the callback
function, and destroyed internally again:
dc_device_foreach (...)
{
while (!finished) {
dc_dive_t *dive = NULL;
dc_dive_new (&dive, ...);
callback (dive, userdata);
dc_dive_free (dive);
}
}
int
dive_cb (dc_dive_t *dive, void *userdata)
{
...
}
The consequence is that the lifetime of the object is limited to scope
of the callback function, and an application won't be able to keep a
reference to the object. This will be problematic for some legitimate
scenario's. For example storing the objects in a list for later
processing (e.g. after all dives have been downloaded), or passing
objects between threads (e.g. download the dives in a worker thread,
then and pass them to the main GUI thread to display them in real-time).
The only way to keep a (valid) reference to the object is to make a
copy. In the current design, this works fine because the raw data is
trivial to copy. It's just a byte array after all. But the new
dive/sample objects are opaque objects and can't be copied unless a
helper function is provided to make a deep copy (e.g. a dc_dive_clone()
functions). But that's not exactly the most elegant interface if you ask
me.
B. Application owns
The second option is to make the application the owner of the objects.
In this case, the object is allocated internally, passed to the callback
function, and should be destroyed by the application, not the library:
dc_device_foreach (...)
{
while (!finished) {
dc_dive_t *dive = NULL;
dc_dive_new (&dive, ...);
callback (dive, userdata);
}
}
int
dive_cb (dc_dive_t *dive, void *userdata)
{
...
dc_dive_free (dive);
}
This may work well for those scenario's where you really need to take
the ownership, but it adds unnecessary overhead when you don't need it
and there is no way to avoid this extra overhead. Take the sample
parsing for example. I think it's safe to assume that most applications
will convert the sample data to their native format on the fly, and thus
never keep a reference to the libdivecomputer sample objects. The result
is that a large number of sample objects will be allocated and
immediately destroyed again. Thus many memory allocations for no good
reasons. I don't like that. We should primarly aim for a good design,
but it doesn't hurt to keep an eye on the efficiency too.
Note that in my code snippet for the previous "library owns" case,
their is an equal amount of memory allocations taking place. But those
can easily be avoided by allocating just a single object, and re-using
it every time. However for the "application owns" case no such
optimization is possible.
C. Reference counting
An alternative option would be to make the new objects reference
counted. This changes nothing to the fact that we have to decide whether
the application has to take a reference explicitly (e.g. case A) or
gains it automatically (e.g. case B). But with the reference count, we
can detect whether the application still holds a reference or not, and
avoid unnecessary memory allocations:
dc_device_foreach (...)
{
dc_dive_t *dive = NULL;
while (!finished) {
if (dive && --dive->refcount == 0) {
/* We just dropped the last reference and
we can safely re-use the previous dive. */
} else {
/* The application still holds a reference to the
previous dive and we have to allocate a new one. */
dc_dive_new (&dive, ...);
}
callback (dive, userdata);
}
dc_dive_free (dive);
}
int
dive_cb (dc_dive_t *dive, void *userdata)
{
...
/* Optionally keep a reference. */
dc_dive_ref (dive);
}
It's of course slightly more complex to implement, and we have to make
sure the reference counting is thread-safe too using atomic operations
or protected with a mutex), but I think the resulting api will be more
intuitive. Reference counting is pretty standard today.
D. Mixed ownership
The last option is to allow different ownership rules for the dive and
sample objects. For example we could choose "applications owns" for the
dives, and "library owns" for the samples. In that case, the dives can
be used outside the callback function, while their is no unnecessary
overhead for the samples. The obvious downside is less consistency.
2. Querying available sample values?
=====================================
With the current design, the samples are pushed to the application, and
there is no need to know which values are actually supported. You'll
only get those values that are actually present. However, with the new
design, the application has to call an accessor function to retrieve a
piece of data, and thus it needs some way to query which values are
presented. The only values that are guaranteed to be available for each
sample are time and depth. Everything else is optional.
The level at which we can tell whether a particular sample value is
present varies. It can be at the device level (for example a non-air
integrated model will never store tank pressure data), at the dive level
(for example a model with a wireless pressure sensor can have dives with
and without tank pressure stored), or even at the sample level (for
example a tank pressure every x sample). To support all possible cases
this querying should thus be done at the sample level.
I think the simplest solution is to use the accessor function itself to
probe for the presence of the info. You just call the accessor function
and if the info isn't available, a special error code like UNSUPPORTED
is returned:
if (dc_sample_get_depth (sample, &depth) == DC_STATUS_SUCCESS) {
/* Depth retrieved successfully. */
} else {
/* No depth available (or an error occurred). */
}
For multivalued data, like the tank pressure, we could then use an
extra index parameter in the accessor function, and then call it
repeatedly until the UNSUPPORTED error is returned:
i = 0;
while (dc_sample_get_pressure (sample, &pressure, i) ==
DC_STATUS_SUCCESS) {
/* Tank pressure retrieved successfully. */
i++;
}
An alternative solution for the multivalued data would be to retrieve
all values at once using an array:
dc_sample_get_pressure (dc_sample_t *sample, double pressure[],
unsigned int n);
But then the caller somehow has to know the maximum value for n, and
the api also has to communicate back the actual number of values
returned. The second part is easy with an extra output parameter, but I
don't know a nice solution for the first problem (other than a second
dc_sample_get_pressure_count function).
Alternatives for the probing approach outlined above, are an iterator
like interface to iterator over the available sample values, or a
function to retrieve a bitmap with the available sample values.
3. Supported dive and sample data
==================================
Which dive data and sample data should we support?
I already made a tentative list, but this is open for discussion of
course. Items can be added or removed after discussion.
Dives:
* date/time
* divetime
* maxdepth
* gasmixes
* tank pressures (begin/end)
* temperature
* salinity, altitude/atmospheric pressure
* dive mode (open/closed circuit, gauge, etc)
Samples:
* time
* depth
* temperature
* tank pressure
* heartrate
* airtime
Events:
* Gas changes
* Deco stops (depth and time)
* Warnings, bookmarks, etc
For the events, I'm not sure yet how the querying (see previous item)
should be done. The normal samples are continuous values, and will
usually appear at least every X sample, or not at all. But the events
are discrete in nature and will only appear occasionally. Should this be
done differently in the api?
We'll also have to agree on a data structure and the exact semantics
for each of those pieces of data. Some won't need much discussion, but
others will. For example take the temperature. A device can record a
single value which could be the minimum, maximum or average temperature.
It can have all three, or a full temperature profile. How do we deal
with that? Things are not always as easy as they may look.
Closely related to the data structures is the choice of the unit
system. The choice is either metric or SI. SI units have the advantage
all values are positive, but applications are likely to use metric or
imperial. We can also consider to introduce a unit system as used in
subsurface, where all values are scaled to fit into an integer (e.g.
depths in millimeters instead of meters, and so on)? Or are we happy
with floating point values?
That's it for now. Comments are welcome! If you think I missed
something important, don't hesitate to bring it up. It's not unlikely
that I forgot about something :-)
[1] http://sourceforge.net/mailarchive/message.php?msg_id=29475503
Jef
More information about the Devel
mailing list