API redesign progress
Jef Driesen
jefdriesen at telenet.be
Fri Jun 29 14:54:54 UTC 2012
Hi Linus,
Allow me to start this email with some background information on the
libdivecomputer project first.
The current libdivecomputer api has been build around the assumption of
a "pipeline" model. In this model, there is a "device" component which
takes care of the communication with the dive computer, and produces raw
dives as the output. The "parser" component consumes those raw dives and
translates them into some more structured data. The key concept is that
both components are completely independent of each other. This provides
a great amount of flexibility, because you are not limited to a single
scenario.
I agree that most applications (subsurface, divinglog, etc) are only
interested in the final data, and don't care about the extra flexibility
or the intermediate data. But that doesn't mean there are no
applications that do care. I can give you several examples.
One such example is macdive. Currently it uses only the device
functionality, but not the libdivecomputer parser. The reason for that
is partly legacy (the libdivecomputer parser wasn't ready back then) and
partly because macdive supports some highly device specific features,
which are unlikely to make it into the libdivecomputer api. Therefore
access to the raw dive data is important.
Or, imagine the "dctool" command line application that I have been
planning to write (but for which I had no time yet). The idea was to
have a separate processes for downloading and parsing:
dctool download <args> | dctool parse <args>
The dives and their metadata would be serialized to some application
defined data format, and read again by the parser tool. This is a very
simple illustration of the offline parsing case, where there is no
active connection with the device.
And last but not least, there is the scenario where an application uses
the raw dive data (together with the associated metadata) as its
internal data format. Again this requires offline parsing support. This
may sound a bit strange at first, but if you think about it, it has
several advantages. First of all it's the most compact format possible,
and you never loose a single bit of information. If the application
gains new features or is capable of parsing more information, you can
apply that to all your old dives too. This is something you can't do if
you only care about the parsed representation. Migrating to another
application becomes easier too.
Anyway, the reason why I'm writing all this is to make clear there are
perfectly valid scenarios besides the typical scenario used by
subsurface. I don't mind adapting the api to something that makes the
common scenario easier to use, but at the same time I don't want to drop
the possibility to support alternative scenarios. The same goes for the
other features you are not using or don't care about (e.g. fingerprint
support). And yes I'm aware applications doing this kind of offline
parsing will need some additional knowledge, but then that's the
responsibility of the app developer, not the libdivecomputer api.
Having said all that, let's get back to your proposal! It basically
comes down to a design where the dive callback function is changed into
something like this:
int
dive_cb (dc_dive_t *dive, void *userdata)
{
...
dc_dive_get_datetime (dive, ...);
dc_dive_get_something (dive, ...);
dc_dive_get_anotherthing (dive, ...);
...
}
where the new dc_dive_t object is conceptually a container for storing
the dive data, combined with a built-in parser. I don't have any
particular arguments against such a change. I do have a couple of
questions/remarks regarding the actual design and implementation. Let's
go over them one by one.
1. Access to raw dive/fingerprint data
As you mentioned before, this can be achieved with a
dc_dive_get_rawdata() and dc_dive_get_fingerprint() function, so that
shouldn't be a problem.
2. Ownership and lifetime
I'm convinced the lifetime of the dc_device_t object should not be
limited to the scope of the callback function. Because if you do that,
you can't maintain a reference to the object for use outside the
callback function. For example a perfectly valid scenario would be to
append the dive object to a list inside the callback, and process them
afterwards, when the download has finished.
This would require that the dive object is allocated by the device
object, but not owned by it, and therefore should be freed by the
application (e.g a dc_dive_free function is needed):
int
dive_cb (dc_dive_t *dive, void *userdata)
{
...
dc_dive_free (dive);
}
A side effect is that the dive object can't maintain a back reference
to the device object, because the device connection might be closed
before the dive object is freed:
int
dive_cb (dc_dive_t *dive, void *userdata)
{
mylist.add (dive);
}
dc_device_open (&device, ...);
dc_device_foreach (device, dive_cb);
dc_device_close (device);
foreach dive in mylist {
...
dc_dive_free (dive);
}
I think that's actually a good thing, because leaving the device
connection open longer than necessary is bad practice (e.g. higher power
consumption for most devices). For the implementation this won't cause
any problems at all.
Note that this is consistent with the device enumeration interface,
where you have to call dc_descriptor_free() for each descriptor you
obtain. I'm aware that the current implementation of the
dc_descriptor_free() function is just an empty stub, but that may change
some day.
3. Thread safety
I intended to introduce a new library context object to eliminate all
global state (something which is present today in the logging code).
Full thread safety could then be achieved by requiring that every thread
operates on its own independent context. The context object is also
convenient to perform any library initialization and shutdown tasks.
However the new dc_dive_t object would cause some serious trouble here.
Assume you have an application that does the downloading on a worker
thread and the processing of the dives on the main thread. According to
the rule above, contexts shouldn't be shared between threads, and the
worker thread will have to create its own context and pass it to the
dc_device_open() function. Now, because the dive object is created
directly by the device object, it will automatically inherit the same
context. That means we won't be able to hand over the dive object to the
main thread, without breaking our threading rule.
/* Shared data (synchronization omitted) */
queue_t queue;
/* Main thread */
worker_thread_run ();
while (1) {
dc_dive_t dive = queue.pop ();
/* Ouch, this dive is associated with
* the context of the worker thread! */
dc_dive_free (dive);
}
/* Worker thread */
int
dive_cb (dc_dive_t *dive, void *userdata)
{
queue.push (dive);
}
dc_context_t *context = NULL;
dc_device_t *device = NULL;
dc_context_new (&context);
dc_device_open (&device, context, ...);
dc_device_foreach (device, dive_cb, ...);
dc_device_close (device);
dc_context_free (context);
The only workaround I can think of would be to make the context object
itself thread-safe (e.g. with an internal mutex), so it can be shared
safely between multiple threads. The disadvantage is that this approach
would introduce some additional complexity and runtime overhead,
compared to the simple thread-safety rule that we had first. We may also
have to implement reference counting for the context object to make
tracking its lifetime a bit easier and less error prone.
Of course the application will still need proper synchronization to
protect the shared data (e.g. the queue in the pseudo code above).
4. Offline parsing
For the offline parsing scenario, I think we can do something very
similar to what I proposed before for the dc_parser_t api:
dc_status_t
dc_dive_new (dc_dive_t **dive,
const char *data, unsigned int size,
dc_family_t type,
const dc_event_devinfo_t *devinfo,
const dc_event_clock_t *clock);
With this function we can create a dive object manually for the raw
dive data and its metadata. Applications that have no need for this
functionallity can just ignore this function.
5. Accessor functions
Is there a preference to have separate accessor functions instead of
the ioctl like dc_dive_get_field() function? The main argument in favor
for the ioctl like function was/is that it's a bit easier to add support
for new fields. No new functions are required, just a new enum value. If
you try to run a (compiled) application against an older version of the
library, it would just return an unsupported error, instead of
complaining about unresolved symbols. On the other hand it's a very ugly
interface, and very easy to misuse (e.g. no type checking at all). So
maybe we should switch to separate functions?
Jef
More information about the Devel
mailing list