On 2014-11-17 01:17, John Van Ostrand wrote:
On Sunday, November 16, 2014, Jef Driesen jef@libdivecomputer.org wrote:
There is a problem with how you pass the data to the dive callback function. I didn't notice this the first time, but you pass a pointer to the cochran_data_t structure to the callback function. That's not how this is supposed to work. You need to extract the raw binary data of each dive, and then pass that to the callback function. What you do now is pass the data of all dives, with a couple of fields pointing to the dive of interest. That's wrong and needs to change.
The cochran_data_t alone is already a problem here, because it contains pointers. The dive data is supposed to be a simple byte array which can be serialized as-is. Remember, libdivecomputer has absolutely no control over how or when the application uses the data. For example the application can copy the dives, and only parse them after the download has finished. In that case the device handle will already have been closed when parsing the data. Thus anything in your structure pointing to data in that handle will have been freed, and unexpected things will happen. With a self contained blob of binary data that's not a problem. You can copy it, write it to disk, etc with no problems.
I'm a little confused by what is expected. I understand that an application may choose to save the dive data and not parse so I can fix that. The only way that makes sense to me is to create a blob in malloc'ed space and hope the application frees it. That doesn't really seem to make sense to me. The blob will have to contain a size, offset into the data (there is pre-dive data) and some information about the type of data it is so that the common parser code can deal with it properly. The app won't necessarily know the structure of the blob or its length so what it can do with it will be limited.
Subsurface hasn't complained about freeing memory so there's one application that would leak memory.
I'll explain the principle with some trivial examples. A typical implementation of the foreach function will look like this:
dc_status_t xxx_device_foreach (dc_device_t *abstract, dc_dive_callback_t callback, void *userdata) { unsigned char dive[SIZE_DIVE]; unsigned char fingerprint[SIZE_FP];
for (i = 0; i < ndives; ++i) { /* Download the i'th dive here and pass it to the callback function .*/ callback (dive, sizeof(dive), fingerprint, sizeof(fingerprint), userdata)) } }
Of course this is overly simplified. In practice, dives are not fixed length, and the dive buffer will be dynamically allocated, but that doesn't matter here. What does matter, is that the memory buffer is clearly owned by the foreach function. As long as you're inside the callback function, the application can freely access the data. But once the callback function return, you can no longer assume the data is still accessible. The implementation of the foreach function may re-use the same buffer for the other dives, or even free the buffer.
So if you want to use the data after the callback function has returned, then you HAVE to copy the data. For example an application might not parse the dives immediately, but store them in a list, and only parse them after the download has finished:
static int dive_cb (const unsigned char *data, unsigned int size, const unsigned char *fingerprint, unsigned int fsize, void *userdata) { list *dives = userdata;
/* Copy the dive data into a new buffer. */ dc_buffer_t *buffer = NULL; dc_buffer_new(&buffer, size); dc_buffer_append(buffer, data, size);
/* Append the dive to the list. */ dives.add(buffer); }
int main (int argc, char *argv[]) { list *dives = ...;
/* Download the dives. */ dc_device_t *device = NULL; dc_device_open (&device, ...); dc_device_foreach(device, dive_cb, dives); dc_device_close(device);
/* Parse all dives. */ for (i = 0; i < dives.count; ++i){ /* Do something with the i'th dive here. */ } }
With a simple byte array, this will work just fine, because copying a byte array is well defined. But for structures containing pointers, that's not the case. If you copy the structure, the pointers will still point to their original data. But that data may no longer exist.
The important message here is that the data is owned by the foreach function. The application should certainly not try to free it. If the data is needed after the callback function returns, then the application needs to copy it.
The reason why you don't notice any problems with subsurface, is because it parses the dive inside the callback function.
Regarding your question about the data format of the blob. The application doesn't really need to know anything about the format. The parser will take care of that, by translating the raw data into something more practical.
Based on your documentation in the libdivecomputer wiki (which is excellent btw! You did a really nice job there!), each dive has a fixed size (256 bytes) header containing the logbook summary, and a variable length part with the sample data. That's exactly the same structure as used in the Oceanics. There, I simply concatenate those two parts, and use that as the raw data format for the dives. I suggest you do the same. The header is fixed size, so you don't even need to include a length.
The fact that the logbook header and profile data are stored in separate memory areas is just an implementation detail. The dive computer could equally well store it as one single blob. I assume that the reason behind using two different memory areas is the following. The length of the dive profiles is variable, and depends on many factors, like the sample rate, dive duration, etc. Thus the amount of profiles that can be stored isn't fixed. By storing the logbook summary separately, the dive computer can always display basic information for a fixed pre-defined number of dives. Even if the profile data for those dives has already been overwritten. (If you download such a dive where the profile data is no longer available, you just pass only the logbook data to the callback function.)
If the data format is different for the different models supported by the backend, and there is no identification data stored in the dive data itself, then you proceed as follows. First of all, extend the cochran_commander_parser_create() function with an extra parameter to pass the model number. Store the model number internally into the parser structure, and then you can use it when needed, just like you do now. Next, you need some way to pass the model number from the device to the parser. That is done as follows. In your xxx_device_foreach implementation, you emit a DC_EVENT_DEVINFO event containing the model number (and serial number if available). This info gets cached internally (see the device_event_emit() function), and in the dc_parser_new() function you can pass the model number to your cochran_commander_parser_create() function.
It looks like you have only two variants (e.g. commander and emc). If there is no model number in the data, you just assign artificial model numbers (e.g. commander=0 and emc=1). For completeness, you use the same numbers in the table in the descriptor.c file, like this:
{"Cochran", "Commander", DC_FAMILY_COCHRAN_COMMANDER, 0}, {"Cochran", "EMC-16", DC_FAMILY_COCHRAN_COMMANDER, 1}, {"Cochran", "EMC-20H", DC_FAMILY_COCHRAN_COMMANDER, 1},
Note that it's not a problem to list multiple models with the same model number, as long as the EMC-16 and EMC-20H share the same data format. If you want to be able to tell apart the individual models, then you'll need to assign different numbers to each of them. You can use the upper 16 bits for the main model, and the the lower 16bits for the individual models, as you already did with the COCHRAN_MODEL_XXX constants. As long as the same numbers are used everywhere, you can use whatever number work best. They are just artificial numbers :-) (For model numbers that are embedded in the data, that's of course another story.)
If you have further questions, just ask! I'll try to help where I can.
Jef