Hi,
For the next release, I'm planning to introduce a new dc_dive_t and dc_sample_t interface. This will again be a major change to the libdivecomputer api. Please note that the interface presented in this email is by no means a final version. It's intended to serve as a starting point for further discussions. Now, let's have a closer look.
The role of the new dc_dive_t object has been discussed before [1], so I'll just summarize here. In a nutshell, the new dc_dive_t object is basically a container for the raw dive data, with the parser functionality built-in. As a result, there is no need anymore to create a dc_parser_t object explicitely.
The dc_device_foreach() function will be modified to return dc_dive_t objects directly in the callback, instead of the raw dive data. The raw dive data (and fingerprint) can be retrieved from the dc_dive_t object by means of the dc_dive_get_raw() and dc_dive_get_fingerprint() functions. The dive metadata can be retrieved with a number of accessor functions, similar to the current dc_parser_t interface:
int dive_cb (dc_dive_t *dive, void *userdata) { /* Get raw dive and fingerprint data. */ dc_dive_get_raw (dive, &data, &size); dc_dive_get_fingerprint (dive, &fpdata, &fpsize);
/* Get dive metadata directly. */ dc_dive_get_datetime (dive, &datetime); dc_dive_get_maxdepth (dive, &maxdepth); ... }
Note that this new dc_dive_t interface will also simplify the transition from a callback based interface to an iterator based interface, because the dc_iterator_next() function can now return the dc_dive_t object:
dc_dive_t *dive; dc_iterator_t *iterator; dc_device_iterator (device, &iterator); while (dc_iterator_next (iterator, &dive) == DC_STATUS_SUCCESS) { /* Contents of the callback function goes here. */ } dc_iterator_free(iterator);
Although this will be something for later, we can already keep this in mind of course.
For parsing the samples, I have something very similar in mind. With the current interface, the application receives a stream of sample values (e.g. time, depth, etc), and the time sample serves as the marker to start a new sample. However, with the new interface, the sample values would be accumulated internally in the new dc_sample_t object. Once the entire sample has been parsed, the sample is passed to the callback function. As a result, there is no need anymore to treat the time sample as special, and the individual sample values can be retrieved with a number of accessor functions:
void sample_cb (dc_sample_t *sample, void *userdata) { dc_sample_get_time (sample, &time); dc_sample_get_depth (sample, &depth); ... }
With these new dive and sample objects it's also possible to add vendor specific extensions in the future. We just need a new accessor function and a corresponding data structure.
So far the overview of the high level design that I had in mind. The remainder of this email consists of more detailed information on a number of technical challenges that will have to be addressed in this new design.
1. Object ownership, lifetime and memory management. =====================================================
How should we deal with the ownership of the new objects?
With the current api, the application has to call dc_parser_new() function explicitly, and their is no doubt that the application is the owner of the returned object. However, because the new dive and sample objects will be allocated internally, their ownership is not immediately obvious. It all depends on the internal implementation. There are a couple of options, each with their own advantages and disadvantages:
A. Library owns
The first option is to make the library the owner of the objects. In this case, the object is allocated internally, passed to the callback function, and destroyed internally again:
dc_device_foreach (...) { while (!finished) { dc_dive_t *dive = NULL; dc_dive_new (&dive, ...); callback (dive, userdata); dc_dive_free (dive); } }
int dive_cb (dc_dive_t *dive, void *userdata) { ... }
The consequence is that the lifetime of the object is limited to scope of the callback function, and an application won't be able to keep a reference to the object. This will be problematic for some legitimate scenario's. For example storing the objects in a list for later processing (e.g. after all dives have been downloaded), or passing objects between threads (e.g. download the dives in a worker thread, then and pass them to the main GUI thread to display them in real-time).
The only way to keep a (valid) reference to the object is to make a copy. In the current design, this works fine because the raw data is trivial to copy. It's just a byte array after all. But the new dive/sample objects are opaque objects and can't be copied unless a helper function is provided to make a deep copy (e.g. a dc_dive_clone() functions). But that's not exactly the most elegant interface if you ask me.
B. Application owns
The second option is to make the application the owner of the objects. In this case, the object is allocated internally, passed to the callback function, and should be destroyed by the application, not the library:
dc_device_foreach (...) { while (!finished) { dc_dive_t *dive = NULL; dc_dive_new (&dive, ...); callback (dive, userdata); } }
int dive_cb (dc_dive_t *dive, void *userdata) { ... dc_dive_free (dive); }
This may work well for those scenario's where you really need to take the ownership, but it adds unnecessary overhead when you don't need it and there is no way to avoid this extra overhead. Take the sample parsing for example. I think it's safe to assume that most applications will convert the sample data to their native format on the fly, and thus never keep a reference to the libdivecomputer sample objects. The result is that a large number of sample objects will be allocated and immediately destroyed again. Thus many memory allocations for no good reasons. I don't like that. We should primarly aim for a good design, but it doesn't hurt to keep an eye on the efficiency too.
Note that in my code snippet for the previous "library owns" case, their is an equal amount of memory allocations taking place. But those can easily be avoided by allocating just a single object, and re-using it every time. However for the "application owns" case no such optimization is possible.
C. Reference counting
An alternative option would be to make the new objects reference counted. This changes nothing to the fact that we have to decide whether the application has to take a reference explicitly (e.g. case A) or gains it automatically (e.g. case B). But with the reference count, we can detect whether the application still holds a reference or not, and avoid unnecessary memory allocations:
dc_device_foreach (...) { dc_dive_t *dive = NULL;
while (!finished) { if (dive && --dive->refcount == 0) { /* We just dropped the last reference and we can safely re-use the previous dive. */ } else { /* The application still holds a reference to the previous dive and we have to allocate a new one. */ dc_dive_new (&dive, ...); }
callback (dive, userdata); }
dc_dive_free (dive); }
int dive_cb (dc_dive_t *dive, void *userdata) { ... /* Optionally keep a reference. */ dc_dive_ref (dive); }
It's of course slightly more complex to implement, and we have to make sure the reference counting is thread-safe too using atomic operations or protected with a mutex), but I think the resulting api will be more intuitive. Reference counting is pretty standard today.
D. Mixed ownership
The last option is to allow different ownership rules for the dive and sample objects. For example we could choose "applications owns" for the dives, and "library owns" for the samples. In that case, the dives can be used outside the callback function, while their is no unnecessary overhead for the samples. The obvious downside is less consistency.
2. Querying available sample values? =====================================
With the current design, the samples are pushed to the application, and there is no need to know which values are actually supported. You'll only get those values that are actually present. However, with the new design, the application has to call an accessor function to retrieve a piece of data, and thus it needs some way to query which values are presented. The only values that are guaranteed to be available for each sample are time and depth. Everything else is optional.
The level at which we can tell whether a particular sample value is present varies. It can be at the device level (for example a non-air integrated model will never store tank pressure data), at the dive level (for example a model with a wireless pressure sensor can have dives with and without tank pressure stored), or even at the sample level (for example a tank pressure every x sample). To support all possible cases this querying should thus be done at the sample level.
I think the simplest solution is to use the accessor function itself to probe for the presence of the info. You just call the accessor function and if the info isn't available, a special error code like UNSUPPORTED is returned:
if (dc_sample_get_depth (sample, &depth) == DC_STATUS_SUCCESS) { /* Depth retrieved successfully. */ } else { /* No depth available (or an error occurred). */ }
For multivalued data, like the tank pressure, we could then use an extra index parameter in the accessor function, and then call it repeatedly until the UNSUPPORTED error is returned:
i = 0; while (dc_sample_get_pressure (sample, &pressure, i) == DC_STATUS_SUCCESS) { /* Tank pressure retrieved successfully. */
i++; }
An alternative solution for the multivalued data would be to retrieve all values at once using an array:
dc_sample_get_pressure (dc_sample_t *sample, double pressure[], unsigned int n);
But then the caller somehow has to know the maximum value for n, and the api also has to communicate back the actual number of values returned. The second part is easy with an extra output parameter, but I don't know a nice solution for the first problem (other than a second dc_sample_get_pressure_count function).
Alternatives for the probing approach outlined above, are an iterator like interface to iterator over the available sample values, or a function to retrieve a bitmap with the available sample values.
3. Supported dive and sample data ==================================
Which dive data and sample data should we support?
I already made a tentative list, but this is open for discussion of course. Items can be added or removed after discussion.
Dives: * date/time * divetime * maxdepth * gasmixes * tank pressures (begin/end) * temperature * salinity, altitude/atmospheric pressure * dive mode (open/closed circuit, gauge, etc)
Samples: * time * depth * temperature * tank pressure * heartrate * airtime
Events: * Gas changes * Deco stops (depth and time) * Warnings, bookmarks, etc
For the events, I'm not sure yet how the querying (see previous item) should be done. The normal samples are continuous values, and will usually appear at least every X sample, or not at all. But the events are discrete in nature and will only appear occasionally. Should this be done differently in the api?
We'll also have to agree on a data structure and the exact semantics for each of those pieces of data. Some won't need much discussion, but others will. For example take the temperature. A device can record a single value which could be the minimum, maximum or average temperature. It can have all three, or a full temperature profile. How do we deal with that? Things are not always as easy as they may look.
Closely related to the data structures is the choice of the unit system. The choice is either metric or SI. SI units have the advantage all values are positive, but applications are likely to use metric or imperial. We can also consider to introduce a unit system as used in subsurface, where all values are scaled to fit into an integer (e.g. depths in millimeters instead of meters, and so on)? Or are we happy with floating point values?
That's it for now. Comments are welcome! If you think I missed something important, don't hesitate to bring it up. It's not unlikely that I forgot about something :-)
[1] http://sourceforge.net/mailarchive/message.php?msg_id=29475503
Jef