Parsing header and profile data

Thu Jul 29 11:48:40 UTC 2010

Hi,

I have a couple of questions and proposals regarding the parsing of the 
header and profile data.

1. HEADER DATA

Currently, there is no way to retrieve the dive header data (except for 
the date/time). Usually there is lots of additional information present 
that might be useful for an application. As usual, the problem is that 
there is a lot of variation in the information that is stored in this 
header data. There are only a few fields that are always available (or 
can be calculated from the sample data):

divetime
maxdepth
avgdepth
gasmix {helium, oxygen, nitrogen}

Are there any fields missing in this list? The motivation for divetime 
and maxdepth is obvious, avgdepth is necessary to calculate surface air 
consumption (SAC), and for every dive you'll have at least one gasmix.
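
To illustrate why avgdepth matters, here is a minimal sketch of a SAC
calculation. The function name and its parameters are hypothetical and
not part of the library. It assumes a divetime in seconds, a surface
pressure of 1 bar, roughly 1 bar of extra pressure per 10 m of depth,
and ideal gas behaviour:

```c
#include <assert.h>

/* Hypothetical helper: surface air consumption in litres per minute.
 * Assumptions: divetime in seconds, 1 bar at the surface and
 * 1 bar per 10 m of depth. */
double
sac_rate (double pressure_drop_bar, double tank_volume_l,
          double avgdepth_m, unsigned int divetime_s)
{
    assert (divetime_s > 0);

    /* Gas consumed, expressed as litres at surface pressure. */
    double gas_used_l = pressure_drop_bar * tank_volume_l;

    /* Average ambient pressure during the dive. */
    double ambient_bar = 1.0 + avgdepth_m / 10.0;

    double minutes = divetime_s / 60.0;

    return gas_used_l / (ambient_bar * minutes);
}
```

With a 12 litre tank, a pressure drop of 150 bar and a one hour dive at
an average depth of 20 m, this gives about 10 litres per minute.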

I'm aware that this list is quite short, but I think that this will 
already suffice for the majority of use cases. For example, a 
device-independent logbook application is most likely limited to the least 
common denominator anyway, because it's nearly impossible to support 
every single piece of information that is present on each device.

For the API, I have been experimenting with an ioctl()-style interface:

parser_status_t
parser_get_field (parser_t *parser, parser_field_type_t type,
                  unsigned int flags, void *value);

which you would use as follows:

double maxdepth = 0;
unsigned int divetime = 0;
unsigned int ngasmix = 0;

parser_get_field (parser, FIELD_TYPE_DIVETIME, 0, &divetime);
parser_get_field (parser, FIELD_TYPE_MAXDEPTH, 0, &maxdepth);
parser_get_field (parser, FIELD_TYPE_GASMIX_COUNT, 0, &ngasmix);
for (unsigned int i = 0; i < ngasmix; ++i) {
     gasmix_t gasmix = {0};
     parser_get_field (parser, FIELD_TYPE_GASMIX, i, &gasmix);
}

It's not the most elegant API, but I haven't found a better one.

+ Easily extensible to other field types.
+ Supports fields with multiple values (like the gas mixtures above).
- No type safety (void pointer), which means you have to be very careful 
to use the appropriate data type for each field type.
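
One possible way to soften the type safety problem is a set of thin
typed wrappers on top of the untyped call. The sketch below is only an
illustration: the type definitions and the toy parser_get_field
implementation are stand-ins so that it compiles on its own, and the
wrapper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations, the real library types will differ. */
typedef enum { PARSER_STATUS_SUCCESS, PARSER_STATUS_ERROR } parser_status_t;
typedef enum {
    FIELD_TYPE_DIVETIME,
    FIELD_TYPE_MAXDEPTH,
    FIELD_TYPE_GASMIX_COUNT,
    FIELD_TYPE_GASMIX
} parser_field_type_t;
typedef struct { double helium, oxygen, nitrogen; } gasmix_t;
typedef struct parser_t {
    unsigned int divetime;
    double maxdepth;
    unsigned int ngasmix;
    const gasmix_t *gasmix;
} parser_t;

/* Toy implementation of the untyped call, for illustration only. */
parser_status_t
parser_get_field (parser_t *parser, parser_field_type_t type,
                  unsigned int flags, void *value)
{
    if (parser == NULL || value == NULL)
        return PARSER_STATUS_ERROR;

    switch (type) {
    case FIELD_TYPE_DIVETIME:
        *(unsigned int *) value = parser->divetime;
        return PARSER_STATUS_SUCCESS;
    case FIELD_TYPE_MAXDEPTH:
        *(double *) value = parser->maxdepth;
        return PARSER_STATUS_SUCCESS;
    case FIELD_TYPE_GASMIX_COUNT:
        *(unsigned int *) value = parser->ngasmix;
        return PARSER_STATUS_SUCCESS;
    case FIELD_TYPE_GASMIX:
        if (flags >= parser->ngasmix)
            return PARSER_STATUS_ERROR;
        *(gasmix_t *) value = parser->gasmix[flags];
        return PARSER_STATUS_SUCCESS;
    }
    return PARSER_STATUS_ERROR;
}

/* Hypothetical typed wrappers: the compiler now checks the pointer type. */
parser_status_t
parser_get_divetime (parser_t *parser, unsigned int *divetime)
{
    assert (divetime != NULL);
    return parser_get_field (parser, FIELD_TYPE_DIVETIME, 0, divetime);
}

parser_status_t
parser_get_maxdepth (parser_t *parser, double *maxdepth)
{
    assert (maxdepth != NULL);
    return parser_get_field (parser, FIELD_TYPE_MAXDEPTH, 0, maxdepth);
}

parser_status_t
parser_get_gasmix (parser_t *parser, unsigned int idx, gasmix_t *gasmix)
{
    assert (gasmix != NULL);
    return parser_get_field (parser, FIELD_TYPE_GASMIX, idx, gasmix);
}
```

Passing a double pointer to parser_get_divetime now produces a compiler
diagnostic, while the generic call remains available for new field types.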

Another option would be to add a separate function for each field (like 
the current parser_get_datetime function), or a single struct to get all 
the fields at once. The struct approach may need a few extra structs, 
though, for fields that can have a variable number of values (gas mixtures).
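
For comparison, a rough sketch of what the single struct variant could
look like. All names here (dive_header_t, dive_header_new, and so on)
are made up for the example, and the heap allocated array is just one
way to deal with the variable number of gas mixtures:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double helium, oxygen, nitrogen; } gasmix_t;

/* Hypothetical "all fields at once" struct. */
typedef struct {
    unsigned int divetime;
    double maxdepth;
    double avgdepth;
    unsigned int ngasmix;   /* number of entries in gasmix[] */
    gasmix_t *gasmix;       /* heap allocated, owned by the struct */
} dive_header_t;

/* Allocate a zeroed header with room for ngasmix gas mixtures. */
dive_header_t *
dive_header_new (unsigned int ngasmix)
{
    dive_header_t *header = calloc (1, sizeof (*header));
    if (header == NULL)
        return NULL;

    if (ngasmix > 0) {
        header->gasmix = calloc (ngasmix, sizeof (*header->gasmix));
        if (header->gasmix == NULL) {
            free (header);
            return NULL;
        }
    }
    header->ngasmix = ngasmix;
    return header;
}

/* Bounds checked access to one gas mixture. */
const gasmix_t *
dive_header_gasmix (const dive_header_t *header, unsigned int i)
{
    assert (header != NULL && i < header->ngasmix);
    return &header->gasmix[i];
}

void
dive_header_free (dive_header_t *header)
{
    if (header == NULL)
        return;
    free (header->gasmix);
    free (header);
}
```

This is type safe, but every new field changes the struct layout, which
is exactly where the ioctl() style call is more flexible.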

2. PROFILE DATA

There are more or less two categories of sample data, based on how the 
device obtains the data.

The first category, which I believe is also the most important one, is 
the type of data that a device can measure with one of its sensors. The 
typical examples are:

     * Depth
     * Temperature
     * Tank pressure
     * Compass bearing
     * Heart rate

And probably a few more that we haven't seen yet (for example a 
rebreather equipped with ppO2 sensors).

Since this type of data is always associated with a sensor that measures 
some physical quantity, there is always a standard format (with a 
difference in units only). That makes supporting this type of data very 
straightforward.
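
As a small illustration of "a difference in units only", the sketch
below normalizes measured values to metric units. The units_t enum and
the function names are assumptions made for the example. The conversion
factors are the usual ones (1 ft = 0.3048 m, 1 psi = 0.0689476 bar
approximately):

```c
#include <assert.h>

/* Hypothetical unit system reported by a device. */
typedef enum { UNITS_METRIC, UNITS_IMPERIAL } units_t;

static int
is_imperial (units_t units)
{
    assert (units == UNITS_METRIC || units == UNITS_IMPERIAL);
    return units == UNITS_IMPERIAL;
}

/* Feet to metres. */
double
depth_to_metres (double value, units_t units)
{
    return is_imperial (units) ? value * 0.3048 : value;
}

/* Fahrenheit to Celsius. */
double
temperature_to_celsius (double value, units_t units)
{
    return is_imperial (units) ? (value - 32.0) * 5.0 / 9.0 : value;
}

/* Pounds per square inch to bar. */
double
pressure_to_bar (double value, units_t units)
{
    return is_imperial (units) ? value * 0.0689476 : value;
}
```

Once every backend converts to the same units, the application never
has to know which unit system the device used internally.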

Of course a dive computer does a little more than measuring some data, 
and that's where it starts to get tricky. The main difference with the 
previous category is that this type of data is calculated by the dive 
computer. Because every device is different, there is also a lot of 
variation and no standard format. Typical examples are:

     * Air time remaining
     * Dive time remaining (or no decompression time if you prefer)
     * Decompression stops (time and depth)
     * Oxygen and nitrogen loading
     * Ascent rate
     * Alarms
     * ...

If you look at the ascent rate for example, it may seem easy to come up 
with a standard format, but it's not. There are devices that store the 
ascent rate as a numerical value (e.g. expressed in m/min), others that 
use lookup tables with speed ranges, others that only record a warning 
when some limit is exceeded, etc.
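
To make the problem concrete, here is a sketch of what a "common"
ascent rate sample would have to look like to cover just these three
cases. The type and function names are hypothetical, and using the
midpoint of a speed range as a numerical estimate is an assumption of
the example, not something any device does:

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    ASCENT_NUMERIC,   /* numerical value in m/min */
    ASCENT_RANGE,     /* entry from a table with speed ranges */
    ASCENT_WARNING    /* only a "too fast" flag */
} ascent_kind_t;

typedef struct {
    ascent_kind_t kind;
    union {
        double rate;                      /* m/min */
        struct { double min, max; } range; /* m/min */
        int exceeded;                     /* non-zero if limit exceeded */
    } u;
} ascent_sample_t;

/* Store a numerical estimate in *rate and return 1 if the
 * representation carries one, otherwise return 0. */
int
ascent_rate_estimate (const ascent_sample_t *sample, double *rate)
{
    assert (sample != NULL && rate != NULL);

    switch (sample->kind) {
    case ASCENT_NUMERIC:
        *rate = sample->u.rate;
        return 1;
    case ASCENT_RANGE:
        /* Assumption: use the midpoint of the range. */
        *rate = (sample->u.range.min + sample->u.range.max) / 2.0;
        return 1;
    case ASCENT_WARNING:
        /* A warning alone says nothing about the actual speed. */
        return 0;
    }
    return 0;
}
```

Even this small example already loses information in two of the three
cases, which is the kind of mess described here.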

In the past, I have tried to support some of these sample types, but 
it's quite difficult to do that correctly and easily ends up in a mess. 
Take for example a look at the long list of events. There are many 
events that have a very similar meaning, but not exactly the same. There 
are events that also have an equivalent real sample type, and so on.

Therefore, I'm considering keeping only some of them (e.g. the more 
general ones), and leaving everything else in a vendor specific binary 
format. Instead of trying to fit everything into a common format, we 
could document each vendor format. An application that wants to use 
some of this data will probably need some device specific code anyway.
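
A minimal sketch of how that could look from the application side. The
sample layout, the SAMPLE_TYPE_VENDOR value and the "vendor X" format
(type 1 holding the remaining air time as a 16 bit little endian number
of minutes) are all invented for the example:

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    SAMPLE_TYPE_DEPTH,
    SAMPLE_TYPE_TEMPERATURE,
    SAMPLE_TYPE_VENDOR      /* everything without a common format */
} sample_type_t;

typedef struct {
    sample_type_t type;
    union {
        double depth;
        double temperature;
        struct {
            unsigned int type;          /* vendor defined sample type */
            size_t size;                /* size of data in bytes */
            const unsigned char *data;  /* raw, documented per vendor */
        } vendor;
    } value;
} sample_t;

/* Invented vendor format, for illustration only. */
#define VENDOR_X_AIRTIME 1

/* Device specific decoding in the application: return 1 and store the
 * remaining air time in minutes, or return 0 if the sample is not a
 * vendor X air time sample. */
int
vendor_x_airtime (const sample_t *sample, unsigned int *minutes)
{
    assert (sample != NULL && minutes != NULL);

    if (sample->type != SAMPLE_TYPE_VENDOR)
        return 0;
    if (sample->value.vendor.type != VENDOR_X_AIRTIME ||
        sample->value.vendor.size < 2 ||
        sample->value.vendor.data == NULL)
        return 0;

    const unsigned char *p = sample->value.vendor.data;
    *minutes = (unsigned int) p[0] | ((unsigned int) p[1] << 8);
    return 1;
}
```

The library only has to pass the bytes through unmodified. Applications
that don't care about vendor data simply ignore these samples.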

Does that sound reasonable? The next question is of course which sample 
types do we want to keep?

Note that since this data is usually calculated from the measurements in 
the previous category, it is often possible to recalculate this data 
manually, although you'll not always get exactly the same results. For 
example if you implement a decompression algorithm you can recalculate 
decostops, N2/O2 loadings, etc. Since you can do that also for devices 
that do not store this kind of information, that could even be 
considered as an advantage!
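
A decompression algorithm is too large for an email, but the same idea
can be shown with a much simpler quantity. The sketch below recalculates
the ascent rate from the depth samples alone. The depth_sample_t type
and the function name are hypothetical, and it assumes time in seconds
and depth in metres:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical measured sample: time in seconds, depth in metres. */
typedef struct {
    unsigned int time;
    double depth;
} depth_sample_t;

/* Fastest ascent rate in m/min between consecutive samples.
 * Returns 0 if the profile never ascends. */
double
max_ascent_rate (const depth_sample_t *samples, size_t count)
{
    assert (samples != NULL || count == 0);

    double max = 0.0;
    for (size_t i = 1; i < count; ++i) {
        /* Skip samples whose timestamps do not advance. */
        if (samples[i].time <= samples[i - 1].time)
            continue;

        unsigned int dt = samples[i].time - samples[i - 1].time;
        double rise = samples[i - 1].depth - samples[i].depth;
        double rate = rise * 60.0 / dt;
        if (rate > max)
            max = rate;
    }
    return max;
}
```

The result depends on the sample interval, so it won't necessarily
match the value a device computed internally from its own, usually
faster, measurements.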

As usual, all comments and feedback are welcome!

Jef