Re: [PATCH] Cochran_commander refactoring

13 Jan 2015


      John,
I finally had some time again for another look at the cochran backend. 
This time I concentrated mainly on the data from your Commander. The 
main issue there is that the profile ringbuffer is completely full. The 
oldest logbook entries are pointing to profile data that has already 
been overwritten with newer dives. Because this is not handled properly, 
those older dives end up with completely bogus profile data.
There are 347 logbook entries, and the end-of-profile (eop) pointer 
contains the value 0x0009a20d. If I dump the pre, begin and end pointers 
again, I get this:
346 00098290 00098478 00099d5a   488 1203
  345 00097d5f 00097d5f 00097ddf     0 1201
  344 0009661a 000966d3 000978ae   185 1201
  ...
  225 0009b16e 0009b16e 0009cbce     0 1201
  224 00099e22 00099e6f 0009acbd    77 1201 <- Contains the eop address.
  223 00096e85 00098109 00099971  4740 1201
  ...
   78 0009b79a 0009c2a1 0009d247  2823 1205
   77 00099eb5 00099edf 0009b2e9    42 1201 <- Yet another time here.
   76 00098588 000985c4 00099a04    60 1201
  ...
    1 00021928 00021a00 00023420   216 1201
    0 00020000 000201fd 00021477   509 1201
As you can see dive #224 contains the eop address, which means part of 
the profile belongs to dive #346. Thus all older dives no longer have 
valid profile data.
The easiest solution to fix this is to process dives from the oldest to 
the newest one (which you already do), and sum the profile length. Once 
the total length exceeds the size of the ringbuffer (e.g. the maximum 
amount of profile data), all older dives will no longer have profile 
data. For those dives you only have the logbook entry.
I have attached a patch with a proof of concept implementation. It's far 
from complete, but I think you'll get the idea. The patch also 
illustrate how you can nicely concentrate all the main logic for 
downloading the dives into the foreach function, instead of scattered 
over many functions. The main idea is that you try to separate the 
download and parsing logic as much as possible.
(More comments inline.)
On 2014-12-22 21:58, John Van Ostrand wrote:
...
On Mon, Dec 22, 2014 at 10:38 AM, Jef Driesen jef@libdivecomputer.org
...
A memory dump should be independent from whatever logic we use for
locating the dives. If you simply read everything from address 0 to 
the
end, then you we can keep using those old memory dumps. If a device 
has
lots of memory, then yes, it may take a while to download, but that's 
fine.
Downloading a memory dump is not the normal use-case. It's intended 
for
troubleshooting, and then you do want all the data. (For reference,
downloading a full memory on a Suunto Vyper takes 10-15 minutes, and 
that's
only 8KB of data!)
If we really don't know the total amount of memory, then we'll have to
make an assumption. If that assumptions turns out to be wrong, we can
always adjust it. But at least we always have all data up to a certain
address.
I'll have to commit to knowing the end of memory then. Right now I've
avoided stating the end explicitly because I'm not sure where it is. If 
all
Cochran DCs return 0xFF when past memory I can be a little liberal and 
not
crash.
Rather than assuming a value that might be too large, I would start with 
a relative small value and increase whenever there is evidence that 
there is indeed more memory. Since the last memory area contains the 
profile ringbuffer, this evidence will most likely be a ringbuffer 
pointer that points past our assumed end. That's something that's very 
easy to check.
If we start with a value that is too large, and there is a device where 
the profile data crosses the ringbuffer end, then we will bogus profile 
data. I'll illustrate with an example. Assume for a momement that the 
real profile ringbuffer is in the range 0x200-0x400, and we 
(incorrectly) assume the end is at 0x800. If there happens to be a dive 
that crosses the ringbuffer end, for example from 0x380 to 0x280, then 
we'll read a first part from 0x380 to 0x800 and a second part from 0x200 
to 0x280.
200  280  380  400            800
  |xxxxx    xxxxx|              |    <- Correct
  |yyyyy    yyyyy|yyyyyyyyyyyyyy|    <- Wrong
As you can see that first part contains an extra 0x400 bytes that 
shouldn't be there. We can't easily detect this kind of errors, other 
than the user noticing the resulting profile is completely bogus.
If we had estimated the amount of memory too small, for example at 
0x300, then we would be able to tell immediately that the 0x380 pointer 
is outside the allowed range. Then we know we have to adjust the range.
...
...
Since the other pieces of data (id, misc and config) have a fixed 
size,
the structure for a good memory dump could be as simple as 
concatenating
all pieces:
id       67 bytes
misc     1500 bytes
config   2x512 (emc) or 4x512 (commander) bytes
memory   variable length, memory starting at address
An alternative is to only include the memory part, and take care of 
the
other three separately, by means of the vendor event. This is how it's 
done
in the several other backends (e.g. oceanic and reefnet).
Considering I'm one of two users of this I'd like to have the extra 
data.
And considering the person doing the dump knows the model it should be
apparent how to parse the data so I don't need to make a generic 
format.
I'm not sure I understand your concerns here. Of course we want that 
extra data. To locate the dives, we need at a minimum the id and config0 
packets. And depending on what we discover, we might need the others too 
at some point. I did only mention that instead of pre-pending the id, 
misc and config blocks to the memory dump, we could deliver them 
separately by means of the vendor event. That's how it's done in those 
other backends.
...
...
Do you think we should consider the pre-dive event as part of the dive
data? Based on your description, these pre-dive events do not look 
really
interesting from a dive logbook point of view. I have no idea how the
parser should deal with it, other than just ignoring them.
If the data were readily available I'd be interested in some of the 
data,
like altitude and temp changes while travelling. However I've looked at
some of the raw pre-dive data and I can't parse the event data. All 
that
said I think this data is a passing interest.
I took my cue from Analyst. It doesn't show the pre-dive events on dive
profiles nor does it show the off-gassing data. It has an option to 
display
"inter-dive" events and does so similar to a dive. I've not been 
interested
in the data, except to find out what it was. I've been thinking of
libdivecomputer in the context of Subsurface which has no mechanism to 
show
inter-dive events or data. I think libdivecomputer would need some 
other
mechanism to present this data and if only the Cochran does this it may 
not
make sense to spend the time.
That's also my impression.
But remember that even if libdivecomputer itself doesn't use this 
information nor provides an api to parse it, there might be someone out 
there who is interested in the raw data. It's perfectly possible to 
build a cochran specific application that use libdivecomputer only for 
downloading the data, and does the parsing on its own. Probably not a 
very likely scenario, but not impossible. Btw, that's also the reason 
why libdivecomputer always provides access to the raw data!
Note that if we now strip this info, and change our minds later one, 
then it might be too late. Changing the data format will break backwards 
compatibility.
...
...
...
A pointer of FFFFFFFF means the end of dive was never logged. I had a
problem with my DC where it would reboot not logging the end of dive. 
The
code tries to salvage the dive logs.
Can you recover some of the samples? Based on what I've seen so far 
(see
my long table) it makes no sense to try to recover the samples, 
because it
looks like at least part of the data you are recovering are the 
pre-dive
events of another dive. That certainly doesn't look right.
Look for example at dive #79:
78 0014b8f3 0014b94a 0014ef9d    87 1800
79 0014f6a5 0014f70d ffffffff   104 1800
80 0014f6a5 0015043c 00151442  3479 1373862
Your code to guess the end pointer will replace ffffffff with 
0015043c,
which is the start of the next dive (#80). So for dive #79 you take 
the
range 0014f70d to 0015043c as the sample data. But that memory range 
is
also the last part of the pre-dive events for dive #80. This overlap
doesn't make any sense to me.
You can see several of those starting about there. The battery was low 
and
the DC was starting to exhibit the crash problem so I was intentionally
invoking it about that time so I could be confident in describing the
problem.
Take a look at
95 001627c7 0016447b ffffffff 7348 1451976
 96 001627c7 0016463c 001661b3 7797 1451976
That 7348 byte of samples on dive 95 was a 40 minute dive in Tobermory,
Canada followed by another about that length. Both imported mostly
correctly. I'd rather have that flawed dive profile than nothing.
I'm not sure what dive 79 was, but I figured the user would rather see
corrupt dive imported than none so s/he could make a choice.
If you look at dive #96, then there is 7797 bytes of pre-dive events in 
the range 001627c7-0016463c. But 7348 of those bytes are also the 
pre-dive events of dive #96 (range 001627c7-0016447b). That's what 
puzzles me. How can the pre-dive events of two dives overlap? That 
simply doesn't make any sense to me.
If you then guess the end address of dive #95 to be the start address of 
dive #96, then the samples of dive #95 end up as somewhere in the 
pre-dive data of dive #96. That's just weird.
The only explanation that I can come up with is that the dive computer 
started writing dive #95. Then it failed before the dive was finished 
(e.g. battery empty, reboot or whatever) and the end pointer didn't get 
written correctly. Hence the ffffffff value. But the end-of-profile 
pointer is probably only updated after the dive is finished. Hence the 
pre-dive pointer of the next dive retains the same value. I suspect that 
the current write position is somehow preserved and thus the next dive 
will start from there again. But this is of course just guessing.
I understand why you're trying to salvage those corrupt dives (in the 
sense that having something is better than nothing at all), but I worry 
that this will complicate the code or produce bogus data (which is even 
worse). When I look at some of those salvaged dives, that seems to be 
the case already. They end abrupt, which is probably normal. But there 
appears to be some amount of "random" data (e.g. spikes in the profile) 
too. And that's not good. Maybe this is just a bug in the parsing that 
we can fix? I simply don't know at this point.
What does the Cochran application do with those corrupt dives? Does it 
show them? With or without a profile?
...
...
...
I've been holding off asking Cochran again for help. They previously
refused citing concern that open source code would reveal too much 
about
their algorithm. Now that the code exists I wonder if they would be
willing
to help by telling me how to determine model and memory 
configuration, or
if they will demand that I stop.. It's also the same reason I've been
trying to respect their privacy by not reading anything but my data.
That's their usual argument. The truth is that I've never encountered 
a
device where you can download anything worth protecting (e.g. firmware
code, decompression algorithm). All you can access is the memory 
containing
the dives.
Have you had any success in changing their minds? Do you have any good
examples of DC manufacturers that have been open that I could use as
examples when talking to Cochran next?
My experience so far is that either they are open minded (HW, Reefnet, 
Atomics, Shearwater, etc) or not. Convincing them appears to be rather 
difficult :-(
Jef

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [PATCH] Cochran_commander refactoring