On Fri, Feb 5, 2016 at 8:19 AM, Jef Driesen <jef@libdivecomputer.org> wrote:

On 2016-01-29 10:55, john@vanostrand.com wrote:

On Wed, Jan 27, 2016 at 10:21 AM, Jef Driesen
<jef@libdivecomputer.org> wrote:

The dc_device_foreach() function must return the dives in reverse
chronological order. This is a requirement for the download only new
dives feature. When dives are returned in reverse chronological
order, an application (or libdivecomputer itself) can simply stop
downloading dives as soon as a previously downloaded dive is
recognized. Very simple and efficient.
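
To make the early-stop idea above concrete, here is a minimal sketch in C. It is not the libdivecomputer implementation: dive_t, dive_cb_t, FP_SIZE and foreach_new_dives are hypothetical names invented for illustration, and the fingerprint is assumed to be a small fixed-size byte string.

```c
#include <stddef.h>
#include <string.h>

#define FP_SIZE 4 /* hypothetical fingerprint length, an assumption */

/* Hypothetical dive record; only the fingerprint matters here. */
typedef struct {
    unsigned char fingerprint[FP_SIZE];
} dive_t;

/* Hypothetical callback; returning 0 asks the loop to stop early. */
typedef int (*dive_cb_t)(const dive_t *dive, void *userdata);

/*
 * Walk the dives newest-first and stop at the first one whose
 * fingerprint matches the newest dive downloaded previously.
 * dives[0] is the oldest dive, dives[count - 1] the newest.
 * Returns the number of dives that were treated as new.
 */
static size_t
foreach_new_dives(const dive_t *dives, size_t count,
                  const unsigned char *known_fp,
                  dive_cb_t callback, void *userdata)
{
    size_t reported = 0;

    for (size_t i = count; i > 0; i--) {
        const dive_t *dive = &dives[i - 1];

        /* Everything from here on is older and already downloaded. */
        if (known_fp && memcmp(dive->fingerprint, known_fp, FP_SIZE) == 0)
            break;

        reported++;
        if (callback && !callback(dive, userdata))
            break;
    }

    return reported;
}
```

With dives stored oldest-first, iterating from the highest index down gives the reverse chronological order, and the loop never touches dives older than the known fingerprint.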

This is my loop in dc_device_foreach.

int i;
for (i = data->dive_count - 1; i > data->fp_dive_num; i--) {

Sorry, I missed that.

Thus if we need to return the dives in reverse chronological order,
it makes sense to process (and also download) the data in this
order. Otherwise you'll end up with a rather inefficient
implementation.

I have to read through your documentation again, but processing the
dives in reverse order might also make the recovery of corrupt dives
easier. If the tail of a dive is missing, then it can run at most
until the start of the next dive. And due to the reverse order we
already have that one.

3. Corrupt dive handling. In some cases (like a low battery,
especially in cold water) the computer resets during a dive. This
results in a "start-dive" block written but no valid "end-dive" block
written. We know information from the start of a dive (like
date/time, gasses, profile start pointer, etc.) but we don't know
information accumulated during or at the end of a dive (like
end-profile pointer, max depth, min temp, etc.). I've taken to
guessing the end of a dive by starting with the next dive's
pre-dive-profile-pointer and backing up until we think we have the
previous dive's end. We haven't resolved our differences on this. It
seems to come down to the question: Do we present a partial or broken
profile in the interest of giving the diver something, or do we give
nothing in the interest of being accurate?

That's a difficult question. In general, I prefer to be very strict
and simply fail on unexpected data. Usually this is the correct
thing to do, because such unexpected data often turns out to be a
wrong assumption in the code. So being strict helps find bugs.
But sometimes the data is really wrong (due to a firmware bug,
running out of battery during the dive, etc), and if it happens
frequently, then a workaround might indeed be necessary.

It also depends on where the data "corruption" is located. If the
information needed to move from one dive to another is good, but we
are unsure about the contents of the dive, then we can return the
bogus dive to the application and let the parser deal with it. You
might get incorrect data for that particular dive, or even a failure
to parse the dive. This would ensure that we can still download the
other dives. But if the primary structure is damaged and we can no
longer safely move to the next dive, then I think we should fail.

I've been working on code to retrieve corrupt dives and it's becoming
unwieldy. In working backwards from a future dive's profile I have to
remove inter-dive events which vary in length. It's possible that two
events might match the data and the code would have to decide which
event to use. The code also needs to do range checking, like ensuring
it doesn't exceed the malloc'ed memory and things like ring-buffer
wrapping. I should also remove any surface time (the time the DC still
stores samples in case you re-descend.)

I think the problem of recovering corrupt samples becomes a lot easier if we can defer it to the parser. At the device layer we simply use the begin/end/pre-dive pointers. If the tail of the dive is missing (e.g. end pointer equal to 0xFFFFFFFF), we assume it runs to the start of the next dive. And then it's up to the parser to deal with this.
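
A minimal sketch of that device-layer rule, in C. The logbook_entry_t layout, the PTR_UNSET name and the dive_profile_end helper are hypothetical and only illustrate the fallback; the real logbook format is not shown in this thread.

```c
#include <stdint.h>

#define PTR_UNSET 0xFFFFFFFFu /* end pointer that was never written */

/* Hypothetical logbook entry holding only the profile pointers. */
typedef struct {
    uint32_t begin; /* address of the first profile byte */
    uint32_t end;   /* address past the last profile byte, or PTR_UNSET */
} logbook_entry_t;

/*
 * Return the end address to use for a dive. next_begin is the start
 * of the chronologically next dive, or the ring buffer head pointer
 * when there is no next dive.
 */
static uint32_t
dive_profile_end(const logbook_entry_t *dive, uint32_t next_begin)
{
    if (dive->end == PTR_UNSET)
        return next_begin; /* tail missing: run up to the next dive */
    return dive->end;
}
```

Because the dives are processed newest-first, next_begin is already known by the time the incomplete dive is reached.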

In the parser we no longer have to worry about ringbuffer wrapping, only the end of the buffer. That should already make things easier.
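
To illustrate why the parser only sees a linear buffer, here is a sketch of how the device layer could unwrap a ring buffer region before handing it over. The ringbuffer_linearize helper is hypothetical, and it assumes begin and end are byte offsets into the ring buffer, with begin == end meaning an empty region.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the region [begin, end) of a ring buffer of rb_size bytes into
 * the linear buffer out. When end < begin the region wraps around the
 * end of the ring buffer and is copied in two pieces.
 * Returns the number of bytes copied, or 0 if the offsets are out of
 * range, the region is empty, or it does not fit in out.
 */
static size_t
ringbuffer_linearize(const unsigned char *rb, size_t rb_size,
                     size_t begin, size_t end,
                     unsigned char *out, size_t out_size)
{
    if (rb_size == 0 || begin >= rb_size || end >= rb_size)
        return 0;

    size_t length = (end >= begin) ? end - begin : rb_size - begin + end;
    if (length > out_size)
        return 0;

    if (end >= begin) {
        memcpy(out, rb + begin, length);
    } else {
        size_t first = rb_size - begin; /* bytes up to the wrap point */
        memcpy(out, rb + begin, first);
        memcpy(out + first, rb, end);
    }

    return length;
}
```

After this step the range checks in the parser reduce to comparing an offset against the length of the linear buffer.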

I agree, and after sleeping on it I came to the same conclusion. It also allows for another subtle feature.

You had mentioned in the past that an application could choose to download dives and store the dive blob to parse later. With that in mind I created the "dctool_export" patch so I could more easily extract dive blobs to test with the new incomplete dive parser. I've been testing various algorithms and it's easier to do outside of LDC and Subsurface.

There are ways to make this better but I think I'm going to have to
think about it for a bit. I'm considering pulling that code. Any ideas
on what I should do with the code so it isn't lost? An #ifdef maybe?

I've done some work and I may have a decent algorithm to backtrack
over the inter-dive events to get to good data. It goes as far as
removing good data in an effort to eliminate giving bad data.

To refresh the issue: it has never been finding the beginning of data.
The issue has been finding the end. We do have the start of the
next dive or, if no next dive, we have the ring buffer head pointer.
What happens if we simply use those pointers is that interdive data,
the data recorded between dives like power on events, is interpreted
as profile data. The profile then looks erratic at the end. So the
trick is to backtrack to find the real end of good data. Ideally we'd
be able to do that accurately, but if we have to remove good data to be
sure, it would result in only losing a few seconds of dive samples.

Now that I've thought about the problem more and identified many of
the issues I have a standalone test program that's doing a good job of
backtracking to get to good data. I've used recursion to simplify the
code and arrive at an optimal solution. In the dozen examples I'm
using it's working on 11 of them well enough to remove all non-profile
data. The 12th leaves several seconds of non-profile data.

Don't we have the pre-dive pointer to indicate where the interdive data starts? Thus if the tail of a dive is missing, then the end is not really at the start of the next dive, but at the start of the interdive data. So the risk of trying to interpret interdive data as sample data disappears. Or am I missing something here?

I wish it were that easy. The pre-dive pointer is the same on the incomplete dive as it is on the next dive. The DC stores the ring buffer head and pre-dive pointers in a config block. It seems to only update the pre-dive pointer when a dive is successfully ended. So a dive after an incomplete dive has the same pre-dive pointer but uses the updated ring-buffer head.

I've worked with several tactics to find the real dive end. To make it easier, this is what an incomplete dive profile might look like:

[0 or more dive samples with dive events]
[0 or more surface samples, these have mostly -0 values for ascent and depth change]

[possibly non-sample, random memory data] (I have one dive like this)

[1 or more inter-dive samples]

The tactics are:

1. Back parse inter-dive events. This has worked well except it leaves surface samples.

2. Back parse surface samples. This works mostly except in the one case where random data is there.

3. Parse forward and watch data for reasonableness, like temp sample not changing much, reasonable depth and ascent rates, reasonable depths. Stop when it goes out of bounds.

4. Parse forward and look for surface samples. A diver might surface for a few minutes, so it's the last one we care about.
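
As an illustration of the forward-parsing tactic with reasonableness checks, here is a rough sketch in C. It is not the standalone test program mentioned above. The sample_t struct, the count_plausible_samples helper and all threshold values are assumptions made up for this example, and it presumes the raw samples were already decoded into depth and temperature values.

```c
#include <stddef.h>

/* Hypothetical decoded sample; the real sample layout is not shown here. */
typedef struct {
    double depth_m; /* depth in metres */
    double temp_c;  /* temperature in degrees Celsius */
} sample_t;

/* Assumed plausibility limits, chosen only for illustration. */
#define MAX_DEPTH_M      200.0 /* deepest depth accepted */
#define MAX_DEPTH_STEP_M 5.0   /* largest depth change between samples */
#define MAX_TEMP_STEP_C  2.0   /* largest temperature change between samples */

static double
abs_diff(double a, double b)
{
    return (a > b) ? a - b : b - a;
}

/*
 * Parse forward and return the number of leading samples that look
 * like real profile data. Parsing stops at the first sample that is
 * out of bounds or changes too abruptly from the previous one.
 */
static size_t
count_plausible_samples(const sample_t *samples, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (samples[i].depth_m < 0.0 || samples[i].depth_m > MAX_DEPTH_M)
            return i;
        if (i > 0) {
            if (abs_diff(samples[i].depth_m, samples[i - 1].depth_m) > MAX_DEPTH_STEP_M)
                return i;
            if (abs_diff(samples[i].temp_c, samples[i - 1].temp_c) > MAX_TEMP_STEP_C)
                return i;
        }
    }
    return count;
}
```

A check like this errs on the side of cutting the profile short, which matches the trade-off of losing a few seconds of good samples rather than showing inter-dive data as profile data.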

John Van Ostrand

At large on sabbatical