[PATCH] Revisions requested

John Van Ostrand john at vanostrand.com
Sun Nov 16 16:17:40 PST 2014


On Sunday, November 16, 2014, Jef Driesen <jef at libdivecomputer.org> wrote:
> John,
>
> There is a problem with how you pass the data to the dive callback
> function. I didn't notice this the first time, but you pass a pointer to
> the cochran_data_t structure to the callback function. That's not how this
> is supposed to work. You need to extract the raw binary data of each dive,
> and then pass that to the callback function. What you do now is pass the
> data of all dives, with a couple of fields pointing to the dive of
> interest. That's wrong and needs to change.
>
> The cochran_data_t alone is already a problem here, because it contains
> pointers. The dive data is supposed to be a simple byte array which can be
> serialized as-is. Remember, libdivecomputer has absolutely no control over
> how or when the application uses the data. For example the application can
> copy the dives, and only parse them after the download has finished. In
> that case the device handle will already have been closed when parsing the
> data. Thus anything in your structure pointing to data in that handle will
> have been freed, and unexpected things will happen. With a self-contained
> blob of binary data that's not a problem. You can copy it, write it to
> disk, etc. with no problems.

I'm a little confused by what is expected. I understand that an application
may choose to save the dive data and not parse it right away, so I can fix
that. The only way I can see is to create a blob in malloc'ed space and hope
the application frees it, which doesn't seem right. The blob will have to
contain a size, an offset into the data (there is pre-dive data) and some
information about the type of data it is, so that the common parser code can
deal with it properly. The app won't necessarily know the structure of the
blob or its length, so what it can do with it will be limited.

Subsurface hasn't complained about freeing memory, so there's at least one
application that would leak memory.
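What Jef describes can be sketched like this: the callback receives one
self-contained byte array per dive, which the application is free to copy.
This is a simplified illustration, not the real libdivecomputer callback
(that one also carries a fingerprint), and dive_blob_t is invented:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one copied dive; not a libdivecomputer type. */
typedef struct {
    unsigned char *data;
    unsigned int size;
} dive_blob_t;

/* Because the blob contains no pointers into the device handle, the
 * application can copy it here and parse it long after the download
 * has finished and the device has been closed. */
static int
dive_callback (const unsigned char *data, unsigned int size, void *userdata)
{
    dive_blob_t *blob = (dive_blob_t *) userdata;

    blob->data = (unsigned char *) malloc (size);
    if (blob->data == NULL)
        return 0; /* Abort the enumeration. */

    memcpy (blob->data, data, size);
    blob->size = size;

    return 1; /* Continue with the next dive. */
}
```

Since the blob is a plain byte array, the application doesn't need to know
its internal structure to copy it or write it to disk; only the parser does.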

>
> See some more comments inline.
>
> On 03-11-14 02:58, John Van Ostrand wrote:
>>
>> On Sun, Nov 2, 2014 at 1:13 PM, Jef Driesen <jef at libdivecomputer.org>
>> wrote:
>>>
>>> Nice work. This is already a huge step in the right direction, but we're
>>> not there yet.
>>>
>>> There is a problem in the way you store the cochran_data_t structure in
>>> the device handle. Most of this info is only used by the dump/foreach
>>> function (or one of its helpers), so it doesn't really belong inside the
>>> handle.
>>
>> The structure members under "// Config items" are used to allow for
>> different Cochran models and different feature configurations for a given
>> model. The fields are set during cochran_set_device_config() so that I
>> know
>> how to handle reading the logbook and the sample data. So while I'm
>> detecting the model I thought I might as well set the variables needed in
>> foreach so I have one section of code to look at when adding a new model.
>>
>> Some of the variables I need to keep between calls to foreach and this
>> seemed to be a good place.
>>
>> I know I'm keeping id*, config* and misc* around after they are needed. I
>> could drop them sooner but they come in handy for the dump function.
>
> It's fine to store certain data in the device handle if it's needed for
> the device connection state. For example the id packets are read during the
> open, and you need them later in the other functions. That's a perfect
> argument to store them in the device handle.
>
> But nearly everything else is state for either the dump or foreach
> function. After those functions return there is no need to keep any of the
> data around. If those functions get called a second time, they should start
> with a completely fresh state. Thus the "global" device handle is the wrong
> place to store this data. The fact that there are memory leaks when someone
> calls those functions twice is a symptom of that. If the state were private
> to the function, then you wouldn't have to worry about leftovers from a
> previous call.
>

I see that I can keep state in local variables in the foreach function.
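A minimal sketch of that idea, with an invented cochran_state_t whose
fields are placeholders, not the real layout:

```c
#include <string.h>

/* All download state lives in one structure that is local to the
 * foreach call, so a second invocation starts completely fresh and
 * nothing lingers in the device handle. The fields are invented. */
typedef struct {
    unsigned char logbook[512];
    unsigned int ndives;
    unsigned int sample_size;
} cochran_state_t;

static int
cochran_foreach_sketch (void)
{
    cochran_state_t state;            /* Stack allocated: no leak on a */
    memset (&state, 0, sizeof state); /* second call, nothing to free. */

    /* ...the read helpers would fill in state here... */

    return (int) state.ndives;
}
```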

>>> This results in a number of problems. For example you allocate several
>>> pieces of data (id, config, etc) in those functions and store them in the
>>> handle. But if someone calls one of those functions a second time, you
>>> allocate them again and overwrite the original pointers. So you'll end up
>>> with a memory leak. You can free the old data first, but I think you're
>>> better off using local data structures because there is no reason why any
>>> of this data needs to persist between the high-level calls.
>>>
>>>
>> I see, calling _open and _foreach a second time could leak memory.
>>
>> Fixed.
>
> It's better now, but freeing the pointers is just fixing the symptoms,
> and not the underlying problem (see above).
>
>>> Overall, I find the logic rather hard to follow because it's spread over
>>> all your cochran_read_* functions. These functions do not only download
>>> some data, but also set some fields, which are then used in one of the
>>> other functions. I think it would be easier if the processing was a bit
>>> more separated from the downloading. I have the feeling that there's
>>> something wrong, but I can't really tell what or where. Maybe it's just
>>> that I don't understand the format very well. Do you have some data files
>>> available?
>>>
>>
>> There are two reasons I split the read up into several functions. The
>> cochran_read_id() function reads the blocks needed to detect the DC model
>> so it knows how to perform the other reads. I do this in the
>> cochran_common_device_open() because I wanted to give immediate feedback
>> to the application if the DC wasn't supported.
>
> That's fine. I didn't have any problem with that.
>
>> The other reads could be in one function but it would be a large
>> function. There are some handle data settings in the read functions.
>>
>> cochran_read_id(): determine model type and features
>> cochran_read_config(): determine the head pointer to the sample buffer; I
>> see now that I don't use this.
>> cochran_read_misc(): read only to mimic Analyst
>> cochran_read_logbook(): get the number of logged dives (from config1) and
>> calculate logbook_size
>> cochran_read_parms(): doesn't read anything; calculates parameters needed
>> for reading samples.
>> cochran_read_samples(): reads the sample data.
>>
>> There really isn't much happening inside the read calls that isn't needed
>> there. The read_config() has a few lines that can actually be removed. The
>> read_logbook has another few lines that calculate the size, and
>> read_samples has two lines that calculate size. Since all the read_*()
>> functions are called by read_all() it's trivial to move them there.
>>
>> Done. Calculations are moved to the read_all function.
>
> I always try to keep the code for downloading separate from the
> parsing/processing. That usually makes it easier to follow the logic. So I
> think your latest version is a move in the right direction.
>
>>> Note that several of those data pieces have a fixed size, so why even
>>> allocate them? Just use a byte array with the right size. Some like:
>>>
>>> struct {
>>>      ...
>>>      unsigned char id0[67];
>>>      unsigned char id1[67];
>>>      unsigned char config1[512];
>>>      unsigned char config2[512];
>>>      ...
>>> };
>>>
>>> Now you no longer have to worry about freeing them. Instead of config1,
>>> config2, etc you could even use an array here. That will make your code
>>> even more generic. Instead of repeating the same code, you can use a
>>> loop.
>>>
>>>
>> I used mallocs to save memory. I can make them arrays. Done.
>
> If you're allocating them anyway, you're not really saving anything :-)
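Jef's suggestion of replacing config1/config2 with an array and a loop
might look like this; the sizes are taken from his example struct and the
actual read command is stubbed out:

```c
#include <string.h>

#define COCHRAN_CONFIG_PAGES 2
#define COCHRAN_CONFIG_SIZE  512

/* Fixed-size buffers inside the structure: nothing to malloc or free. */
typedef struct {
    unsigned char id0[67];
    unsigned char id1[67];
    unsigned char config[COCHRAN_CONFIG_PAGES][COCHRAN_CONFIG_SIZE];
} cochran_pages_t;

/* One loop instead of repeated config1/config2 code. The real version
 * would issue one read command per page; here it just stub-fills each
 * page with its index. */
static void
read_config_pages (cochran_pages_t *pages)
{
    for (unsigned int i = 0; i < COCHRAN_CONFIG_PAGES; ++i)
        memset (pages->config[i], (int) i, COCHRAN_CONFIG_SIZE);
}
```

Adding support for a model with more config pages then only means bumping
the constant, not duplicating another block of code.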
>
>>> BTW, I'm not sure what's in these config pages, but based on the name, I
>>> assume it contains settings. So this is something applications may want
>>> to access directly. I suspect the protocol also supports changing those
>>> settings. Have a look at hw_ostc_device_eeprom_{read,write}. (This is
>>> certainly not a must have feature right now, but it's something you may
>>> want to keep in mind already.)
>>>
>>>
>> Yes, that is device configuration as well as current calculated tissue
>> loads. I didn't look at the OSTC code so I didn't see any evidence of
>> setting configs on the DCs. I intend to write code to do just that for the
>> Cochran and I wasn't sure where to do that since it looked like
>> libdivecomputer doesn't expose an API for it. There are some settings,
>> like date/time, that aren't configurable on the DC. And I would like to be
>> able to set configuration from Linux or even a tablet.
>
> There is indeed no generic api for writing settings, because this is done
> very differently for different models. This is not a high priority, and
> something we can always look at later, when we have the basic downloading
> done. But I just mentioned it in case you were planning to implement this
> too.
>
>>> You have the same problem with the progress field. Again, I think it
>>> makes more sense to make this a local variable, and pass it as a
>>> parameter to the internal read function. Note that you're also reporting
>>> progress for each individual step, rather than the entire download.
>>>
>>>
>> I don't know the size of the entire download when I start needing a
>> progress bar. You'll note I don't have one for cochran_read_id(),
>> read_config() or read_misc(). These are small enough I figured one wasn't
>> needed. I don't know the size of the sample data until I read the logbook.
>> So it seemed the best approach was to have two.
>
> There are a few other backends that have the same problem where the total
> size isn't known from the start. What I did there, is to initialize the
> maximum field with the worst-case maximum (and in some cases that's
> UINT_MAX, see EVENT_PROGRESS_INITIALIZER). As soon as you have a better
> value available, you update the maximum value. There are backends where we
> never know the exact amount of data until the download is over. In that
> case we simply can't do better than assuming the worst case maximum.
>
> The main problem with two independent progress bars is that the user will
> see a progress bar that goes from 0% to 100% twice. From the user's point
> of view, that's just very counterintuitive, because you don't know when the
> download will be finished. And that's exactly the point of the progress bar
> in the first place.
>
> With the worst case maximum approach, you do get one progress for the
> entire download. Initially the progress bar will stay near zero, but once
> the real maximum is known, you go nicely to 100%.
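The worst-case-maximum scheme can be sketched as follows. The
current/maximum pair mirrors the shape of libdivecomputer's
dc_event_progress_t, but the type and the percentage helper here are
invented for illustration:

```c
#include <limits.h>

/* Mirrors the shape of dc_event_progress_t: bytes done vs. total. */
typedef struct {
    unsigned int current;
    unsigned int maximum;
} progress_sketch_t;

/* Widen before multiplying so current * 100 cannot overflow. */
static unsigned int
progress_percent (const progress_sketch_t *p)
{
    return (unsigned int)(((unsigned long long) p->current * 100)
                          / p->maximum);
}
```

Starting with maximum = UINT_MAX keeps the bar near zero; once the logbook
read reveals the real sample size, lowering maximum lets the bar run
monotonically to 100% instead of restarting from zero.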
>
>> The other backends I looked at read from the DC in blocks of data and
>> were able to manage the progress within the function. The Cochran does
>> very big reads at very high data rates (800 Kb/s). So I can't do the
>> progress at the cochran_common_device_read() function because it issues
>> one command to read a huge block of data. So I did it in the
>> cochran_packet() function. Since that is exposed I couldn't add a
>> parameter without breaking the API. I could add a separate packet function
>> that accepts a progress but this seems to be a better approach.
>
> Look for example at the hw_ostc3_transfer() function. The progress
> structure is passed as a parameter there. A single progress structure for
> the entire download operation is used there, and the hw_ostc3_transfer()
> function just updates for one specific part of the download.
>
> You can find a similar situation in the suunto_vyper_device_read_dive()
> function. This is also a public function, but there is a
> suunto_vyper_read_dive() helper function that does the actual work and
> takes the progress parameter. So you don't ever see the progress parameter
> on the public function.
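The suunto_vyper pattern Jef points at, reduced to a sketch; the function
names and the progress type are stand-ins, not the real Suunto code:

```c
#include <stddef.h>

/* Stand-in for dc_event_progress_t. */
typedef struct {
    unsigned int current;
    unsigned int maximum;
} progress_t;

/* Internal helper: does the actual work and takes the progress
 * parameter, so the whole download can share one progress structure. */
static int
read_dive_internal (unsigned int size, progress_t *progress)
{
    /* ...download 'size' bytes here... */
    if (progress != NULL)
        progress->current += size;
    return (int) size;
}

/* Public entry point: same signature as before, so the API doesn't
 * change; the progress parameter never appears here. */
int
read_dive_public (unsigned int size)
{
    return read_dive_internal (size, NULL);
}
```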
>
>>> Those high baudrates are indeed a bit odd. Do they even work? You don't
>>> check the return value of the serial_configure() function. So if the
>>> underlying driver (or OS) doesn't support it, unexpected results may
>>> happen. I implemented the support for non-standard baudrates a long time
>>> ago, and unfortunately it's not guaranteed to work everywhere. Initially
>>> it was needed for the iconhd backend, but in the end it turned out that
>>> this baudrate was a bug in the mares application. The iconhd uses some
>>> microchip with CDC-ACM, where the baudrate doesn't matter at all. So
>>> every value works equally well.
>>>
>>
>> They are a little twitchy but I get good reads most of the time. I've
>> tried different rates and it does require a non-standard rate. At least
>> I'm not aware of a standard rate that high.
>
> What I meant here, is that because you don't check the return value of
> the serial_configure() function, you don't know whether setting the custom
> baudrate actually succeeded. Support for custom baudrates depends heavily
> on the OS and driver. So if the call fails, you'll be using a different
> baudrate than the one you think you are using.
>
> For example on linux, setting a custom baudrate requires setting the
> baudrate to 38400, and then some linux specific call to change the
> baudrate to the custom baudrate. But if that call fails, then the actual
> baudrate will be 38400, and not your custom one.

I can check that. In the case of a failure the download would time out, so
checking the return value would save the user a few seconds.
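The check would look something like this. serial_configure_stub is a
stand-in for the real serial_configure(), whose exact signature and error
value should be confirmed against the actual serial header; the rejection
threshold here is made up purely to exercise the failure path:

```c
/* Stand-in for serial_configure(); the real function also takes the
 * port handle plus databits, parity, stopbits and flow control. */
static int
serial_configure_stub (long baudrate)
{
    /* Pretend the driver rejects custom rates above some limit. */
    return (baudrate > 1000000) ? -1 : 0;
}

static int
cochran_set_baudrate (long baudrate)
{
    if (serial_configure_stub (baudrate) != 0) {
        /* Fail fast with an I/O error instead of letting the download
         * time out later at the wrong baudrate. */
        return -1; /* e.g. DC_STATUS_IO */
    }
    return 0;
}
```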

>
>>> We're getting closer, but the windows build is still broken:
>>>
>>>    CC       cochran_common_parser.lo
>>> In file included from ../../source/src/cochran_common_parser.c:33:
>>> ../../source/src/cochran_common.h:73: error: expected
>>> specifier-qualifier-list before 'time_t'
>>> ../../source/src/cochran_common_parser.c: In function
>>> 'cochran_common_parser_get_datetime':
>>> ../../source/src/cochran_common_parser.c:113: error: 'cochran_data_t'
>>> has no member named 'date_format'
>>> ../../source/src/cochran_common_parser.c: In function
>>> 'cochran_common_parser_get_field':
>>> ../../source/src/cochran_common_parser.c:147: warning: control reaches
>>> end of non-void function
>>> ../../source/src/cochran_common_parser.c: In function
>>> 'cochran_common_parser_samples_foreach':
>>> ../../source/src/cochran_common_parser.c:166: warning: control reaches
>>> end of non-void function
>>> ../../source/src/cochran_common_parser.c: In function
>>> 'cochran_common_handle_event':
>>> ../../source/src/cochran_common_parser.c:175: warning: unused variable
>>> 'data'
>>> make[1]: *** [cochran_common_parser.lo] Error 1
>>>
>>> I think you're missing an #include <time.h> somewhere for the use of
>>> time_t in the cochran_data_t structure. But you shouldn't need that
>>> because libdivecomputer has its own time functions and structures. Use
>>> those instead.
>>>
>>
>> I think I've fixed that.
>
> It still doesn't compile for me. However, the only place where you use
> something from time.h, is to set the current_dive_start_time field which
> isn't used anywhere else. Just remove this, and everything starts to work.
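For date handling without time_t, libdivecomputer's own dc_datetime_t
(from <libdivecomputer/datetime.h>) is the portable route. A sketch with
a stand-in type and an invented byte layout (the real Cochran date
encoding is whatever the format actually uses):

```c
/* Same shape as dc_datetime_t, redefined here so the sketch is
 * self-contained. */
typedef struct {
    int year, month, day;
    int hour, minute, second;
} datetime_sketch_t;

/* Decode a six-byte date field into the structure. The byte order is
 * purely illustrative, not the actual Cochran layout. */
static void
decode_datetime (const unsigned char raw[6], datetime_sketch_t *dt)
{
    dt->year   = 2000 + raw[0];
    dt->month  = raw[1];
    dt->day    = raw[2];
    dt->hour   = raw[3];
    dt->minute = raw[4];
    dt->second = raw[5];
}
```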
>
> I also need to link with -lm on my system. The symbols file also needs a
> small update after the renaming. See the attached patches.
>
> The clang static analyzer also finds some potential issues. See the
> attached logfile.
>
> Jef
>

-- 
John Van Ostrand
At large on sabbatical

