Hi,
I have committed some several backwards incompatible changes to the libdivecomputer api. To keep the inconvenience to a minimum, a stable "release-0.1" branch has been created, which will only receive pure bugfixes. Applications are recommended to stick to this stable branch for now. Especially because these changes are only the first in a series, and backwards compatibility will be broken again in the master branch.
In a nutshell, the most important changes are:
* Introduce a namespace prefix for the high-level public api. * Use common status codes and backend type.
All functions, structs and enums in the device.h and parser.h headers have been renamed. The {device,parser}_status_t enums have been merged into a single dc_status_t enum. Similarly, the {device,parser}_type_t enums have been merged into a single dc_family_t enum. Although there is no functional change, this has a significant impact on the entire api and breaks backward compatibility.
* Add device enumeration support.
With the new device enumeration support, applications can now enumerate all the supported devices at runtime, and don't have to maintain their own list anymore. The api uses an iterator style interface. See universal.c for some example code.
* Add dc_device_open() and dc_parser_new() convenience functions.
With these new convenience functions, most applications won't need to use the device specific xxx_device_open() and xxx_parser_create() functions anymore. This has the advantage that you don't have add or change any code to support new models and/or backends. Again, see universal.c for example code.
As usual, feedback is always welcome!
Jef
On Fri, Jun 22, 2012 at 4:51 AM, Jef Driesen jefdriesen@telenet.be wrote:
In a nutshell, the most important changes are:
* Introduce a namespace prefix for the high-level public api. * Use common status codes and backend type.
Yup, good. That cleaned things up.
* Add device enumeration support.
And this is great, and I already changed my subsurface tree to use it.
* Add dc_device_open() and dc_parser_new() convenience functions.
Likewise.
Very good. Much better than the old interface. I've made the necessary subsurface changes, and while I'll need to tweak it some, it already resulted in
3 files changed, 138 insertions(+), 282 deletions(-)
while actually getting a better computer list with those fewer lines of code.
I think I'll just force subsurface users to get the new libdivecomputer source, because this is a big step forward. Even if it isn't perhaps ready and stable yet, I much prefer this over the release-0.1 interfaces.
I'll try to give more feedback when I've tested the basic changes a bit more, but for now, this looks like a huge improvement.
Linus
On Fri, Jun 22, 2012 at 1:29 PM, Linus Torvalds torvalds@linux-foundation.org wrote:
I think I'll just force subsurface users to get the new libdivecomputer source, because this is a big step forward. Even if it isn't perhaps ready and stable yet, I much prefer this over the release-0.1 interfaces.
Ok, I fixed up a few silly problems, tested it with my Vyper Air, and was all happy.
subsurface changes pushed out.
Looking at the subsurface libdivecomputer interfaces, the horrible device open and parser setup stuff is now gone. A few comments:
- do we have to do the parser create for each dive, or could that be something that is just done once? As libdivecomputer gets more detaisl from the computer (which might update the parse state), why not just do it all automatically?
It seems silly do do the create_parser() in the dive callback. In fact, it seems silly to do it in the application at all, why isn't it done implicitly and automatically by dc_device_open()?
- The event handling is still horrible. There does not seem to be any sane generic way to just get the event type and meaning.
- Is there some way to get air temperature before the dive?
But the big things that I *really* hated (enumeration of device models etc) are fixed and libdivecomputer.c doesn't look crazy. Good.
Linus
On 06/22/2012 10:53 PM, Linus Torvalds wrote:
Looking at the subsurface libdivecomputer interfaces, the horrible device open and parser setup stuff is now gone. A few comments:
- do we have to do the parser create for each dive, or could that be
something that is just done once? As libdivecomputer gets more detaisl from the computer (which might update the parse state), why not just do it all automatically?
There is no need to create a new parser for every dive. You can perfectly re-use the same one for every dive (downloaded from the same device of course). Just create the parser once, call set_data() for each dive, and parse it.
This scenario is exactly the reason why there is a set_data() function, and the dive data isn't directly passed to the dc_parser_new() function. The set_data() function can be called multiple times.
It seems silly do do the create_parser() in the dive callback. In
fact, it seems silly to do it in the application at all, why isn't it done implicitly and automatically by dc_device_open()?
Creating the parser in the callback is convenient. In some cases you can't create a parser upfront (at the time of calling dc_device_open, or even immediately after it) because the actual model number is required for parsing the data. The model code from the device descriptor isn't sufficient, because in most cases the model isn't necessary for downloading. As long as you select a model from the same family, downloading will work fine, but for parsing you need the exact model number. In practice this mismatch isn't an issue because the exact model number is usually autodetected somewhere during the download.
But the model code may not be available until some or all the data has been downloaded. So you have to postpone creating the parser until you have all the info. And in the dive callback you are guaranteed to have all the info.
You also have to take into account the use-case where you want to parse data without a device connection available. For example re-parsing older dives, importing data that was downloaded elsewhere, etc. An example is the Uwatec Smart. Downloading isn't supported on Mac OS X (due to the lack of IrDA support), and a possible workaround is to download the data elsewhere (the Uwatec SmartTrak application stores the raw data internally) and than import it. If the parser is directly tied to a device connection, then such use-case is impossible to support.
BTW, for this situation I want to introduce some variant of the new dc_parser_new function that doesn't take a device handle. I have been thinking of something like this:
dc_status_t dc_parser_new_for_data (dc_parser_t **parser, const dc_event_devinfo_t *devinfo, const dc_event_clock_t *clock);
- The event handling is still horrible. There does not seem to be any
sane generic way to just get the event type and meaning.
Which events are you referring to? The device events (e.g. DC_EVENT_*), or the parser events (e.g. DC_SAMPLE_EVENT)? Can you explain in some more detail?
- Is there some way to get air temperature before the dive?
Not yet, but it's on the todo list. This is probably the most often requested feature, but I want to concentrate on the api redesign first. The obvious implemention is to extend the get_field() api with a temperature field.
But the big things that I *really* hated (enumeration of device models etc) are fixed and libdivecomputer.c doesn't look crazy. Good.
Excellent!
Jef
On Sat, Jun 23, 2012 at 1:45 PM, Jef Driesen jefdriesen@telenet.be wrote:
Creating the parser in the callback is convenient. In some cases you can't create a parser upfront (at the time of calling dc_device_open, or even immediately after it) because the actual model number is required for parsing the data.
Jef, this is exactly the kind of totally broken thinking that makes libdivecomputer interfaces so nasty to use.
The user SHOULD NOT CARE about these kinds of internal libdivecomputer issues. The whole *point* of libdivecomputer should be to hide all these internal implementation issues, and as long as you continue to think that your internal details should matter for the interface, the interface will be crap.
Why the hell should I care that you need some model number for the parser? I do not even *KNOW* the model number, for chissake! The only thing that knows (and cares) is libdivecomputer itself.
So according to your absolutely insane logic, I cannot create the parser at device open time, because I don't know if that particular device needs the model number. But that means that I should magically decide to create the dive parser after I get the DC_EVENT_DEVINFO callback. Can you not see that THAT MAKES NO SENSE! From an interface standpoint, that is absolutely insane. I don't even *care* about the DC_EVENT_DEVINFO - the fact that I actually catch it and print out the model number on the progress bar is simply because that (to me) is an indicator that libdivecomputer is working at all.
You fixed the interfaces to the top-level open code, and to the top-level parser create code. Please just fix this too.
The fix is damn easy: get rid of the insane and useless "dc_parser_new()" and "dc_parser_destroy()" interfaces entirely. Create the parser inside of libdivecomputer (whenever it makes sense) and don't expose that totally idiotic and pointless interface to the user at all. Seriously. There is absolutely *ZERO* upside to ask me to create a parser, since I don't care, and I cannot possibly do it at a sane time anyway, because for the user, no sane time exists at all.
Just create the parser internally, and hide that information in the "dc_device_t" structure that is totally internal to libdivecomputer, and is just an opaque object to the users.
Please. Stop these insane interfaces that only complicate the use of libdivecomputer, and don't help anybody.
You also have to take into account the use-case where you want to parse data without a device connection available.
No you don't. You can expose damn well any interface you want for that case, but that's a totally separate case. That in *no* way means that when we have the device available, the user should somehow magically know when to create the parser. There is zero point.
BTW, for this situation I want to introduce some variant of the new dc_parser_new function that doesn't take a device handle. I have been thinking of something like this:
dc_status_t dc_parser_new_for_data (dc_parser_t **parser, const dc_event_devinfo_t *devinfo, const dc_event_clock_t *clock);
Congratulations on making the interface EVEN MORE BROKEN.
Don't do this. Seriously. Just f*cking don't. Any sane user of the libdivecomputer stuff will never ever care. If I opened a device through libdivecomputer, you already have ALL that information. It's totally pointless to give it to be as a callback, and then expect me to just give it back to you. Can't you see how STUPID that is?
Make that interface what you do *internally*. Use it for the pointless case where somebody cares. Don't force it for the normal case.
Even if I were to want to parse a pre-downloaded dump from a file, I WOULD NEVER EVER WANT TO USE THAT INTERFACE. I would want to use the "open dive computer" interface, and just give you the filename. And then libdivecomputer can again internally do the above.
So feel free to do those kinds of changes for your internal flow, but if you really think that it's a good idea to expose that kind of internal detail to the user, I don't know what to say. Except to say that you're wrong. It's a really *bad* idea to make these kinds of interfaces. Don't make things worse, make them better.
- The event handling is still horrible. There does not seem to be any sane generic way to just get the event type and meaning.
Which events are you referring to? The device events (e.g. DC_EVENT_*), or the parser events (e.g. DC_SAMPLE_EVENT)? Can you explain in some more detail?
The DC_SAMPLE_EVENT interface is just crazy.
Why do I have to have this kind of code in there:
static const char *events[] = { "none", "deco", "rbt", "ascent", "ceiling", "workload", "transmitter", "violation", "bookmark", "surface", "safety stop", "gaschange", "safety stop (voluntary)", "safety stop (mandatory)", "deepstop", "ceiling (safety stop)", "unknown", "divetime", "maxdepth", "OLF", "PO2", "airtime", "rgbm", "heading", "tissue level warning" }; const int nr_events = sizeof(events) / sizeof(const char *); ... type = value.event.type; name = "invalid event number"; if (type < nr_events) name = events[type];
where that set of strings is something I had to just mindlessly copy from your "example.c".
The 'enum parser_sample_event_t' numbers are totally meaningless to me. They don't tell me anything. Same goes for 'enum parser_sample_flags_t'. And what does the even 'value' actually do? None of this makes any sense, and more importantly, as far as I can tell, I *cannot* make sense of them.
Yes, the 'flags' things seems to possibly make sense as a "this begins the event" and "this ends it". That's not documented, but at least it's *somehow* sensible. But if that is all it does, why is it some bitfield thing? What else can it possibly be? What should I do as a user to protect myself from you changing the flags?
And the 'value' field? What does it do? It makes no sense to me, and I can't tell what it would do. Does it have any meaning? Is it in degrees for the heading event? Is it minutes for the airtime event? What is it?
Linus
Linus,
Could you do me a favor and provide some high level overview of the api design that you have in mind? I'm trying to understand your point of view, but it's not always crystal clear to me. If you could explain your api design with some pseudo code, call sequences, etc I think that would help a lot.
Some more comments below.
On 2012-06-24 20:32, Linus Torvalds wrote:
On Sat, Jun 23, 2012 at 1:45 PM, Jef Driesen jefdriesen@telenet.be wrote:
Creating the parser in the callback is convenient. In some cases you can't create a parser upfront (at the time of calling dc_device_open, or even immediately after it) because the actual model number is required for parsing the data.
Jef, this is exactly the kind of totally broken thinking that makes libdivecomputer interfaces so nasty to use.
The user SHOULD NOT CARE about these kinds of internal libdivecomputer issues. The whole *point* of libdivecomputer should be to hide all these internal implementation issues, and as long as you continue to think that your internal details should matter for the interface, the interface will be crap.
Why the hell should I care that you need some model number for the parser? I do not even *KNOW* the model number, for chissake! The only thing that knows (and cares) is libdivecomputer itself.
So according to your absolutely insane logic, I cannot create the parser at device open time, because I don't know if that particular device needs the model number. But that means that I should magically decide to create the dive parser after I get the DC_EVENT_DEVINFO callback. Can you not see that THAT MAKES NO SENSE! From an interface standpoint, that is absolutely insane. I don't even *care* about the DC_EVENT_DEVINFO - the fact that I actually catch it and print out the model number on the progress bar is simply because that (to me) is an indicator that libdivecomputer is working at all.
Why would you want to create a parser in advance? You can't really parse anything until you get the first dive in the dive callback function. And in that case you can equally well create the parser in the dive callback. So I don't understand why this is a problem.
If you call dc_parser_new() from inside the dive callback, you don't need to care about the model number at all. If a parser needs it, it's guaranteed to be available in the device handle (regardless of whether you catch the events or not). If it's not required, then no problem either. That's what I meant when I said it's convenient: you don't need any device specific knowledge, and it will always just work.
The only real issue I see here is that the api doesn't prevent you from calling dc_parser_new() immediately after dc_device_open(), in which case you may end up with a non functional parser.
Why you use DC_EVENT_DEVINFO as an indicator is a mystery to me. We have DC_EVENT_{PROGRESS,WAITING} for that purpose. The primary purpose for DEVINFO is for use with the fingerprint feature, which subsurface isn't using.
You fixed the interfaces to the top-level open code, and to the top-level parser create code. Please just fix this too.
The fix is damn easy: get rid of the insane and useless "dc_parser_new()" and "dc_parser_destroy()" interfaces entirely. Create the parser inside of libdivecomputer (whenever it makes sense) and don't expose that totally idiotic and pointless interface to the user at all. Seriously. There is absolutely *ZERO* upside to ask me to create a parser, since I don't care, and I cannot possibly do it at a sane time anyway, because for the user, no sane time exists at all.
Just create the parser internally, and hide that information in the "dc_device_t" structure that is totally internal to libdivecomputer, and is just an opaque object to the users.
Please. Stop these insane interfaces that only complicate the use of libdivecomputer, and don't help anybody.
I'm not sure what you propose here. Do you mean to integrate the parser object into the device object? How is that supposed to work? For example a device can contain more than one dive, so how do you decide which dive to parse? Or do you mean the dive callback should be changed to receive a parser object directly, pre-populated with the dive data, model numbers, and whatever else is necessary. Something like:
int callback (dc_parser_t *parser, void *userdata) { ... dc_parser_get_datetime (parser, ...); dc_parser_samples_foreach (parser, ...); ... }
Essentially turning the dc_parser_t into a dc_dive_t object. I considered doing this in the early days of the libdivecomputer project, but there are a couple of problems with this approach.
With this api, your have no other option than to do the parsing directly in the callback function. However some devices (e.g. the sensusultra) have very strict timing requirements and you are very limited in the amount of processing you can do in the callback without messing up the downloading. In this cases it's usually better to save the data and process it once the download has finished, or offload it to a separate thread. But that's impossible because the lifetime of the parser is now limited to the duration of callback function.
Assume we introduce some method to extend the lifetime of the parser to remain valid outside the callback (e.g. a deep copy or something). We still can't transfer this parser to another thread because it may share some state with the device handle. Remember that in the roadmap emails [1], I announced that I'm about to introduce a new library context object to fix the problems related to this shared state. With the context object, you could create one context for the device thread and another one for the parser thread. If you then call:
dc_context_new (&ctx_device); dc_context_new (&ctx_parser);
dc_device_open (&device, ctx_device, ...);
callback (...) { dc_parser_new (&parser, ctx_parser, device); }
The device and parser won't share any state and everything is fine. But if the parser is created directly by the device, there is no way to use another context object. The sensusultra timing problem can be fixed with internal caching, but the threading problem is a more fundamental one. How would you solve this?
[1] http://www.divesoftware.org/libdc/roadmap.html#17
You also have to take into account the use-case where you want to parse data without a device connection available.
No you don't. You can expose damn well any interface you want for that case, but that's a totally separate case. That in *no* way means that when we have the device available, the user should somehow magically know when to create the parser. There is zero point.
BTW, for this situation I want to introduce some variant of the new dc_parser_new function that doesn't take a device handle. I have been thinking of something like this:
dc_status_t dc_parser_new_for_data (dc_parser_t **parser, const dc_event_devinfo_t *devinfo, const dc_event_clock_t *clock);
Congratulations on making the interface EVEN MORE BROKEN.
Don't do this. Seriously. Just f*cking don't. Any sane user of the libdivecomputer stuff will never ever care. If I opened a device through libdivecomputer, you already have ALL that information. It's totally pointless to give it to be as a callback, and then expect me to just give it back to you. Can't you see how STUPID that is?
Make that interface what you do *internally*. Use it for the pointless case where somebody cares. Don't force it for the normal case.
I think you misunderstood me here. This new dc_parser_new_for_data() function is supposed to co-exist with the dc_parser_new() function. You would only use it for the "offline" parsing use-case, so it's not intended as a replacement for the dc_parser_new() function. Isn't that what you meant when you wrote I can "expose damn well any interface you want" above?
Even if I were to want to parse a pre-downloaded dump from a file, I WOULD NEVER EVER WANT TO USE THAT INTERFACE. I would want to use the "open dive computer" interface, and just give you the filename. And then libdivecomputer can again internally do the above.
So feel free to do those kinds of changes for your internal flow, but if you really think that it's a good idea to expose that kind of internal detail to the user, I don't know what to say. Except to say that you're wrong. It's a really *bad* idea to make these kinds of interfaces. Don't make things worse, make them better.
I'm not convinced a file import interface is the right solution here. What if the application doesn't store the data in a raw file (for example in a database), or uses some different format (for example the individual dives instead of a memory dump)? If an application wants to support offline parsing, then it will need some specific knowledge anyway. For example in the Uwatec Smarttrak scenario, the data isn't stored in the original format, and you have to combine several database fields to reconstruct it. You can't do this without any knowledge of the format. So the internals are already exposed.
- The event handling is still horrible. There does not seem to be any sane generic way to just get the event type and meaning.
Which events are you referring to? The device events (e.g. DC_EVENT_*), or the parser events (e.g. DC_SAMPLE_EVENT)? Can you explain in some more detail?
The DC_SAMPLE_EVENT interface is just crazy.
Why do I have to have this kind of code in there:
static const char *events[] = { "none", "deco", "rbt", "ascent", "ceiling",
"workload", "transmitter", "violation", "bookmark", "surface", "safety stop", "gaschange", "safety stop (voluntary)", "safety stop (mandatory)", "deepstop", "ceiling (safety stop)", "unknown", "divetime", "maxdepth", "OLF", "PO2", "airtime", "rgbm", "heading", "tissue level warning" }; const int nr_events = sizeof(events) / sizeof(const char *); ... type = value.event.type; name = "invalid event number"; if (type < nr_events) name = events[type];
where that set of strings is something I had to just mindlessly copy from your "example.c".
The 'enum parser_sample_event_t' numbers are totally meaningless to me. They don't tell me anything. Same goes for 'enum parser_sample_flags_t'. And what does the even 'value' actually do? None of this makes any sense, and more importantly, as far as I can tell, I *cannot* make sense of them.
Yes, the 'flags' things seems to possibly make sense as a "this begins the event" and "this ends it". That's not documented, but at least it's *somehow* sensible. But if that is all it does, why is it some bitfield thing? What else can it possibly be? What should I do as a user to protect myself from you changing the flags?
And the 'value' field? What does it do? It makes no sense to me, and I can't tell what it would do. Does it have any meaning? Is it in degrees for the heading event? Is it minutes for the airtime event? What is it?
The sample events are indeed a big mess. I'm aware of this, and they will be removed from the api. I don't know yet how to implement an appropriate replacement, so at the moment the idea is to introduce a vendor extension type and just pass the raw binary data. Applications that insist on using this type of data can still get it with some additional effort. But at least we won't be stuck forever with a broken interface.
Jef
On Mon, Jun 25, 2012 at 1:02 PM, Jef Driesen jefdriesen@telenet.be wrote:
Why would you want to create a parser in advance? You can't really parse anything until you get the first dive in the dive callback function. And in that case you can equally well create the parser in the dive callback. So I don't understand why this is a problem.
I don't want to create a parser AT ALL.
That's 100% your internal problem, and you should consider it that way.
You should create the parser on demand. You should *not* ask the application to create it and worry about it, since the application has absolutely zero use for it.
The app just wants to get the results.
If you call dc_parser_new() from inside the dive callback, you don't need to care about the model number at all. If a parser needs it, it's guaranteed to be available in the device handle (regardless of whether you catch the events or not). If it's not required, then no problem either. That's what I meant when I said it's convenient: you don't need any device specific knowledge, and it will always just work.
I currently *do* create the parser in the dive callback.
And it's stupid. It means that I either have to keep track of the old parser for the next dive callback (why?) or I do what I do now - create it anew for each dive. Right now I create it for each dive, because that's what your example.c code does, and it's not at all clear from anything whether I needed to do it or not.
But even if I can cache it across dives, the interface is just totally moronic. Why does that interface exist at all? Why should I do it, when you could do it so much better yourself without me ever calling "dc_parser_new()" at all?
In other words, please answer me this *simple* question: what is the advantage of me knowing about "create parser" at all? Seriously. I don't care. There is absolutely *NOTHING* I can do with the "dc_parser_t" structure.
And if there is nothing I can do with it, I should never even have to see it, and I shouldn't have to maintain it.
Why don't you just make it a hidden member in the opaque "dc_device_t" data structure, and you create it invisibly on demand? Why do you force me to use insane interfaces? What's the advantage of doing
dc_parser_set_data(parser, data, size); ... dc_parser_get_field (parser, DC_FIELD_DIVETIME, 0, &divetime); ...
etc? Why should the user care? The user *shouldn't* care.
This whole idiotic "you're passing me values just so that I can mindlessly pass them back to you" is a completely horrible model. Those values make no sense to me - why do you even give them to me? And the data structures you internally create to hold the parser state for the values equally make no sense to me - so why do you have interfaces that require that stupid back-and-forth model?
You could hide *everything* in 'dc_device_t'. When I get a divecb callback event, I shouldn't need to care about your parsers. But even more fundamentally, I shouldn't even need to care about that 'data/size' thing. It's totally irrelevant to me. I can't do anything with it anyway. Stop giving it.
Seriously.
So I would suggest that in dive_cb(), you should pass me exactly *one* thing as an argument: the user-data that I wanted to have and told you I wanted to have. And then you pass me an opaque "dive" data structure that contains all the information you have gathered so far. I don't want to know what it contains, I don't *want* to know about parsers, I don't want to know about crud like that. It might even be dive-computer-specific structure - not generic at all - that your particular dive computer wants. Do this, instead of giving me "const unsigned char *data, unsigned int size" that makes no sense, or the "const unsigned char *fingerprint, unsigned int fsize" stuff etc. None of that matters.
So the interface I would suggest would look simply like
int dive_cb(dc_dive_t *dive, void *userdata) {
because anything else is just garbage as far as I'm concerned. I can't use it anyway.
Then, give me a few functions to get the dive data from that device during that dive. Don't make me create a parser, don't make me do that "dc_parser_set_data()" insanity (you just *gave* me that data, you already know it, just remember it yourself in the device data structure!). Just give me
dc_dive_get_field(dc_dive_t *dive, enum dive_field, int index, void *result);
so that I can get the normal fields (DC_FIELD_DATETIME, DC_FIELD_DURATION, DC_FIELD_GASMIX etc). Or you might expose them as separate accessor functions, ie
/* Return dive date and time */ void dc_dive_get_datetime(dc_dive_t *dive, dc_datetime_t *datetime);
/* Return dive duration in seconds */ long dc_dive_get_duration(dc_dive_t *dive);
/* return number of gasmixes in 'mix[]' array of size maxmixes */ int dc_dive_get_gasmix(dc_dive_t *dive, dc_gasmix_t *mix, int maxmixes)
etc. If you need to create a parser, you do it here. But maybe for some dive computers the "parser" is some static thing that you don't even need to create at all. Why have that whole parser-dependent model and expose it? Even if you internally use a parser, just put the pointer to that parser in that "dc_device_t" data structure (or in the "dc_dive_t" one - maybe some day you decide that you really *do* want to do per-dive parser generation).
Why you use DC_EVENT_DEVINFO as an indicator is a mystery to me. We have DC_EVENT_{PROGRESS,WAITING} for that purpose. The primary purpose for DEVINFO is for use with the fingerprint feature, which subsurface isn't using.
You even said that you want to make the parser generation function take that devinfo structure pointer. You said you'd want to introduce something like
dc_status_t dc_parser_new_for_data (dc_parser_t **parser, const dc_event_devinfo_t *devinfo, const dc_event_clock_t *clock);
but I don't even *have* that structure until after I get the DC_EVENT_DEVINFO callback. I don't *care* about it.
And more importantly, I shouldn't care. I shouldn't have to care about *any* of this.
So why do you make the interfaces have all these totally irrelevant things?
You used to do this with the whole dive computer open thing. You've cleaned that up, and now we don't have to care about random internal libdivecomputer implementation issues any more. And that's GREAT.
Now I'm just asking you to basically do the same for the parser stuff and the event handling: clean it up so that the users don't have to see your internal implementation issues that are totally irrelevant to any user.
Even *if* the user actually wants to save some "raw dive data", make a helper function for that! So even if somebody wants to save the dive data you got, just add another
dc_dive_get_raw_divedata(dc_dive_t *dive, void **buffer, unsigned int *size);
and then that can give people the dive data. But the whole "let's give some raw dive data to the user, just so that the user should then set up a parser and pass it back to us" model is exactly the wrong way around. It designs things for the wrong model.
I'm not sure what you propose here. Do you mean to integrate the parser object into the device object? How is that supposed to work? For example a device can contain more than one dive, so how do you decide which dive to parse?
Look at the suggestion above. Put the parser pointer into the dc_device_t structure (you claimed earlier that it is a "one per device" thing), hide it from the user that doesn't care. Don't expose it.
The dive data you can pass in the per-dive opaque structure. Again, the user really really doesn't even want to know the details. You can hide anything you want in there. Maybe it just contains the raw dive data. Maybe it contains some extra fields (like your own internal pointers to the functions to parse it). Maybe it contains a back-pointer to the "dc_device_t" so that from the dive data you can always just find the device data, and the device data then contains the parser data.
See? There are many different implementation models for this, and the library interface shouldn't expose what you do internally. It should make sense at a higher level, and by making sense at that higher level it actually allows you to *change* your implementation.
For example, right now you have this model for the dive callback function:
static int dive_cb(const unsigned char *data, unsigned int size, const unsigned char *fingerprint, unsigned int fsize, void *userdata) {
and you could literally *inside* libdivecomputer change it to do
typedef struct dc_dive_t { dc_device_t *device; const unsigned char *data; unsigned int size; const unsigned char *fingerprint; unsigned int fsize; } dc_dive_t;
and then pass me a pointer to that object (without ever actually exposing the structure to me, so that I couldn't see what it contains) as the 'dive'. Now whenever I pass you that 'dive' back to the parse functions, you have the data right there. No need for 'dc_parser_set_data()' etc insane contortions, and the interface is simpler not just for me, but for you too. You hardly have to change any code at all, because you really have all the same data - but you've cleaned up the interface.
And now that you have that opaque data structure, if you make some organizational changes, none of the users of the library need to even know. They don't care. Maybe the "data" part won't even be a single buffer for some devices - maybe the dive computer itself uses XML, and instead of having that "const unsigned char *data/size" thing for those dive computers, maybe you'll use some xml library for that, and it has a "xmlNode" entry for the dive instead?
Ok, so hopefully no dive computer ever does anything that insane, but you know it could happen. It may not be XML, but it might be some other similar odd encapsulated format. The uemis has that odd base64 encoding, maybe you want the 'dc_dive_t' to have the pre-decoded information in etc.
Or if it's a database, it would again not be about "const unsigned char *data/size" model, but be about some kind of "dive key + connection information for the database". See? Exposing it as a "data/size" thing is wrong. Give the user just the opaque pointer, and you can hide *anything* in there, and hide it in the format that makes sense for *that* particular dive computer.
THAT is a good model. It's a good model for the user (who sees "ok, that's a dive, I want to know when it happened and the details of it"), and it's good for you ("I can put the data in the most simple form for *this* particular dive computer"), and it's flexible and simple. It doesn't make the user do extra things.
Or do you mean the dive callback should be changed to receive a parser object directly, pre-populated with the dive data, model numbers, and whatever else is necessary. Something like:
int callback (dc_parser_t *parser, void *userdata) {
Yes, I think that would actually be an improvement over the current model, and I don't think it would be wrong either. In many ways, my "dc_dive_t" *is* exactly that pre-populated parser thing, but I would suggest against thinking of it as a "parser" thing. It might contain pointers to the real parser, or more likely actually contain just a pointer to the dc_device_t, which in turn contains a pointer to the parser. Don't name things by how you happened to implement them, name them by what they represent.
It represents the data for "one dive". Name it that way.
Linus
Hi Jef,
[...]
To piggyback on the following from Linus
The app just wants to get the results.
And the result is just collection of dives.
For me, the functions like below would be more than enough[1]
- iterate over dives from a device/file or stream/array of bytes (return null if no more dives) - iterate over samples from a dive (return null if no more samples) - get dive data field value (if you feel some universal data structure is not possible) - get additional dive sample data field value (assuming the basic dive sample is dive time and depth and that universal dive sample data structure is not possible)
There is need for new parser, buffer or state machine - the library should deal with it internally.
Also, above means that there shall be no parser callbacks - if my app needs them, then I can deal with callbacks between iterator calls. The same for any state my app needs... but no artificial state, which is imposed on old API users due to callbacks being part of API.
[...]
Regards,
w
[1] I am skipping the obvious stuff like getting device info, configuration, etc.
On 2012-06-26 02:04, Artur Wroblewski wrote:
For me, the functions like below would be more than enough[1]
- iterate over dives from a device/file or stream/array of bytes
(return null if no more dives)
- iterate over samples from a dive (return null if no more samples)
- get dive data field value (if you feel some universal data
structure is not possible)
- get additional dive sample data field value (assuming the basic
dive sample is dive time and depth and that universal dive sample data structure is not possible)
There is need for new parser, buffer or state machine - the library should deal with it internally.
Also, above means that there shall be no parser callbacks - if my app needs them, then I can deal with callbacks between iterator calls. The same for any state my app needs... but no artificial state, which is imposed on old API users due to callbacks being part of API.
At some point I do want to replace the callback based interface with an iterator based interface. The callback interface is ugly, and breaks the control flow of the application. An iterator based interface is much more elegant (on the application side), but also add extra complexity (on the library side) and is easier to misuse. The callbacks was simply the easiest option to get something working.
However since such a change can easily be implemented without breaking backwards compatibility (by reimplementing the callback interface on top of the iterator interface), I suggest we concentrate on the changes that are currently being discussed first. And once those are implemented, we can have a look at another round of improvements.
Anyway, a proposal for the iterator interface could be something like this:
dc_dive_t *dive= NULL; dc_device_t *device = NULL; dc_iterator_t *iterator = NULL;
dc_device_open (&device, ...); dc_device_iterator (device, &iterator); while (dc_iterator_next (iterator, &dive) == DC_STATUS_SUCCESS) { /* Do something with the dive here */ dc_dive_free (dive); } dc_iterator_free (iterator); dc_device_close (device);
With something similar for the parsing of the samples (e.g. replace "dive" with "sample").
Jef
On Mon, Jul 2, 2012 at 12:34 PM, Jef Driesen jefdriesen@telenet.be wrote:
At some point I do want to replace the callback based interface with an iterator based interface.
I'd love to see that, because it allows us to just keep our own data in the natural data structures, instead of having to encapsulate them into some callback structure and a single pointer to it.
So in general, I understand that you *internally* may want to use a much more complex interface, and that you may even want to expose those complex interfaces to applications like MacDive that then actually have their own parsing logic etc. But at the same time, I would really love for there to be a simple and straightforward interface that does not have those kinds of internal details associated with them.
The same way you were able to abstract all the per-dive-computer open functions into one "dc_device_open()" one (while still having the "suunto_d9_device_open()" etc stuff internally), I would really hope that libdivecomputer migrates to that kind of simpler interface (while possibly having the more complex interface available for users that *want* to see the internals).
Linus
Hi Linus,
Allow me to start this email with some background information on the libdivecomputer project first.
The current libdivecomputer api has been build around the assumption of a "pipeline" model. In this model, there is a "device" component which takes care of the communication with the dive computer, and produces raw dives as the output. The "parser" component consumes those raw dives and translates them into some more structured data. The key concept is that both components are completely independent of each other. This provides a great amount of flexibility, because you are not limited to a single scenario.
I agree that most applications (subsurface, divinglog, etc) are only interested in the final data, and don't care about the extra flexibility or the intermediate data. But that doesn't mean there are no applications that do care. I can give you several examples.
One such example is macdive. Currently it uses only the device functionality, but not the libdivecomputer parser. The reason for that is partly legacy (the libdivecomputer parser wasn't ready back then) and partly because macdive supports some highly device specific features, which are unlikely to make it into the libdivecomputer api. Therefore access to the raw dive data is important.
Or, imagine the "dctool" command line application that I have been planning to write (but for which I had no time yet). The idea was to have a separate processes for downloading and parsing:
dctool download <args> | dctool parse <args>
The dives and their metadata would be serialized to some application defined data format, and read again by the parser tool. This is a very simple illustration of the offline parsing case, where there is no active connection with the device.
And last but not least, there is the scenario where an application uses the raw dive data (together with the associated metadata) as its internal data format. Again this requires offline parsing support. This may sound a bit strange at first, but if you think about it, it has several advantages. First of all it's the most compact format possible, and you never loose a single bit of information. If the application gains new features or is capable of parsing more information, you can apply that to all your old dives too. This is something you can't do if you only care about the parsed representation. Migrating to another application becomes easier too.
Anyway, the reason why I'm writing all this is to make clear there are perfectly valid scenarios besides the typical scenario used by subsurface. I don't mind adapting the api to something that makes the common scenario easier to use, but at the same time I don't want to drop the possibility to support alternative scenarios. The same goes for the other features you are not using or don't care about (e.g. fingerprint support). And yes I'm aware applications doing this kind of offline parsing will need some additional knowledge, but then that's the responsibility of the app developer, not the libdivecomputer api.
Having said all that, let's get back to your proposal! It basically comes down to a design where the dive callback function is changed into something like this:
int dive_cb (dc_dive_t *dive, void *userdata) { ... dc_dive_get_datetime (dive, ...); dc_dive_get_something (dive, ...); dc_dive_get_anotherthing (dive, ...); ... }
where the new dc_dive_t object is conceptually a container for storing the dive data, combined with a built-in parser. I don't have any particular arguments against such a change. I do have a couple of questions/remarks regarding the actual design and implementation. Let's go over them one by one.
1. Access to raw dive/fingerprint data
As you mentioned before, this can be achieved with a dc_dive_get_rawdata() and dc_dive_get_fingerprint() function, so that shouldn't be a problem.
2. Ownership and lifetime
I'm convinced the lifetime of the dc_device_t object should not be limited to the scope of the callback function. Because if you do that, you can't maintain a reference to the object for use outside the callback function. For example a perfectly valid scenario would be to append the dive object to a list inside the callback, and process them afterwards, when the download has finished.
This would require that the dive object is allocated by the device object, but not owned by it, and therefore should be freed by the application (e.g a dc_dive_free function is needed):
int dive_cb (dc_dive_t *dive, void *userdata) { ... dc_dive_free (dive); }
A side effect is that the dive object can't maintain a back reference to the device object, because the device connection might be closed before the dive object is freed:
int dive_cb (dc_dive_t *dive, void *userdata) { mylist.add (dive); }
dc_device_open (&device, ...); dc_device_foreach (device, dive_cb); dc_device_close (device);
foreach dive in mylist { ... dc_dive_free (dive); }
I think that's actually a good thing, because leaving the device connection open longer than necessary is bad practice (e.g. higher power consumption for most devices). For the implementation this won't cause any problems at all.
Note that this is consistent with the device enumeration interface, where you have to call dc_descriptor_free() for each descriptor you obtain. I'm aware that the current implementation of the dc_descriptor_free() function is just an empty stub, but that may change some day.
3. Thread safety
I intended to introduce a new library context object to eliminate all global state (something which is present today in the logging code). Full thread safety could then be achieved by requiring that every thread operates on its own independent context. The context object is also convenient to perform any library initialization and shutdown tasks. However the new dc_dive_t object would cause some serious trouble here.
Assume you have an application that does the downloading on a worker thread and the processing of the dives on the main thread. According to the rule above, contexts shouldn't be shared between threads, and the worker thread will have to create its own context and pass it to the dc_device_open() function. Now, because the dive object is created directly by the device object, it will automatically inherit the same context. That means we won't be able to hand over the dive object to the main thread, without breaking our threading rule.
/* Shared data (synchronization omitted) */
queue_t queue;
/* Main thread */
worker_thread_run ();
while (1) { dc_dive_t dive = queue.pop ();
/* Ouch, this dive is associated with * the context of the worker thread! */
dc_dive_free (dive); }
/* Worker thread */
int dive_cb (dc_dive_t *dive, void *userdata) { queue.push (dive); }
dc_context_t *context = NULL; dc_device_t *device = NULL;
dc_context_new (&context);
dc_device_open (&device, context, ...); dc_device_foreach (device, dive_cb, ...); dc_device_close (device);
dc_context_free (context);
The only workaround I can think of would be to make the context object itself thread-safe (e.g. with an internal mutex), so it can be shared safely between multiple threads. The disadvantage is that this approach would introduce some additional complexity and runtime overhead, compared to the simple thread-safety rule that we had first. We may also have to implement reference counting for the context object to make tracking its lifetime a bit easier and less error prone.
Of course the application will still need proper synchronization to protect the shared data (e.g. the queue in the pseudo code above).
4. Offline parsing
For the offline parsing scenario, I think we can do something very similar to what I proposed before for the dc_parser_t api:
dc_status_t dc_dive_new (dc_dive_t **dive, const char *data, unsigned int size, dc_family_t type, const dc_event_devinfo_t *devinfo, const dc_event_clock_t *clock);
With this function we can create a dive object manually for the raw dive data and its metadata. Applications that have no need for this functionallity can just ignore this function.
5. Accessor functions
Is there a preference to have separate accessor functions instead of the ioctl like dc_dive_get_field() function? The main argument in favor for the ioctl like function was/is that it's a bit easier to add support for new fields. No new functions are required, just a new enum value. If you try to run a (compiled) application against an older version of the library, it would just return an unsupported error, instead of complaining about unresolved symbols. On the other hand it's a very ugly interface, and very easy to misuse (e.g. no type checking at all). So maybe we should switch to separate functions?
Jef
Hi,
Now that we are discussing major api improvements, let's also have a look at the sample parsing api. The current api generates a stream of sample values, where the time value is special in the sense that it starts a new sample. I think it would be cleaner to somehow return an opaque sample, and then get all the values available for that sample. Something similar to this:
void sample_cb (dc_sample_t *sample, void *userdata) { dc_sample_get_time (sample, ...); dc_sample_get_depth (sample, ...); dc_sample_get_something (sample, ...); ... }
Let's discuss this a bit in more detail.
1. Sample types
The first question is of course which sample types do we want to support? The absolute minimum to support is:
* time * depth * pressure * temperature
At the moment we also have:
* events: Anything that happens only at specific point in time. All other sample types are typically measurements obtained from a sensor (or calculations based on them), with values available for each sample point (or at least every x sample points). But that is not the case for events. Typical examples are warnings, gas changes, etc. At the moment many events are supported, but the api sucks. The current plan is remove the events, use the vendor type instead (see below), and then add back support for the important events with a better api. The details of this last part are still unclear, but support for gas switches is definitely high on the list.
* rbt: The RBT (remaining bottom time) is the time you can spend at the current depth and still have enough gas supply to make a safe ascent. It's currently available for the Uwatec Smart/Galileo only. Other air integrated dive computers often support a similar concept, so I think we can just keep this. Maybe rename it to the less cryptic "airtime"? Note that the actual interpretation may vary between models, depending on whether factors like the safety reserve, the ascent time, decompression time, etc are taking into account or not.
* heartbeat: Another Uwatec Smart/Galileo feature. Although I don't know any other model that supports a heart monitor sensor, but it doesn't cause any trouble either. So I think we can keep this one too. BTW, is the correct name heartbeat or heartrate? I think it's heartrate, expressed in beats per minute (bpm). It's a bit confusing because both have the same translation in my native language.
* bearing: Yet another Uwatec Smart/Galileo feature. There are several other models that have a digital compass, but they usually report it as an event, not a sample type. I also believe it makes more sense to consider a change in the compass bearing as an event. It's a setting you change, not some sensor that is polled every sample. I don't think any dive computer will ever generate a continuous stream with compass bearings. So I think we should drop this. BTW, is the correct name comnpass bearing or heading?
* vendor: A vendor specific extension. This is intended for all samples that don't fit well into one of the types supported by the libdivecomputer library. The application is responsible for parsing the raw data, but at least it doesn't have to care about where and how the samples are stored. There are actually two variants of the vendor type today. For the first variant each piece of info gets its own vendor sample. That means a dc_sample_t can contain multiple vendor values. The second variant is the raw data representing the entire dc_sample_t data. Both have their use, so I propose to reserve the vendor type for the first case, and add a new raw type for the second case.
What else should we support? Some common requests I get are:
* gas switches * no decompression time * decompression stops * ...
Decompression is a bit troublesome. Some models have just a simple deco flag. Others provide a time and depth, which is typically the time to surface (TTS) (e.g. total time for all stops and ascent combined) and the depth of the current stop. But their are plenty of other interpretations possible too. I won't be surprised some of the more technical dive computers provide a full decoplan (e.g. time at each depth). I have no idea how to support something like this in a clean way.
Note that an alternative solution would be to implement a decompression algorithm in the application and calculate all kinds of deco related parameters (tissue saturation, deco stops, etc) that way. It won't necessary match with your dive computer, but at least you would get a consistent application experience, regardless of the type of dive computer you have. Personally, I think this makes more sense for a general purpose logbook application, but not everyone may agree with that of course.
2. Which sample types are present?
With the current callback api, you just get all the sample types pushed to the application. With the new proposal the application needs to retrieve each sample value manually. Thus somehow we need a mechanism to query which sample types are available in the dc_sample_t object. Suggestions are welcome.
Also, how do we deal with sample types that can occur multiple times (e.g. vendor extensions, pressure values for multiple tanks, alarms, etc)?
3. Memory management
To be consistent with the dc_dive_t object, the dc_sample_t object should be allocated by the dc_dive_t object and freed by the application. However I'm not convinced that is a good idea. The number of samples is usually relative large (e.g. 3600 samples for a 1 hour dive at a 1 second sample rate) and in the typical scenario there is no reason to retain the sample object after having extracting the values (e.g. you call free right before leaving the callback function). Thus you end up with several hundreds or thousands of tiny objects that are allocated and immediately freed again. That's just very inefficient if you ask me!
A possible workaround would be to make the objects reference counted. Then, if the application doesn't keep a reference to the object, we can re-use the previous one, instead of creating a new one from scratch. But the drawback is that if we want to support passing objects between threads (see my previous email) the reference counting should be made thread-safe too.
Note that I'm not an export on multi-threading, so please correct me if I'm writing nonsense! Also everything I wrote so far is open for discussion. Feedback is highly appreciated!
Jef
On Wed, Jul 4, 2012 at 4:53 PM, Jef Driesen jefdriesen@telenet.be wrote: [...]
Now that we are discussing major api improvements, let's also have a look at the sample parsing api.
[...]
What else should we support?
Additionally, my dive computer stores the following on sample level
- gradient factor (of leading tissue) - ppO2 (there can be multiple O2 sensors data stored by other computers) - cns
See also: http://www.heinrichsweikamp.info/read.php?2,8503,8553 - the custom functions documentation shows what data is stored by OSTC (i.e. divisor options for temperature, deco, cns, etc).
[...]
Regards,
w
On 2012-07-04 19:10, Artur Wroblewski wrote:
On Wed, Jul 4, 2012 at 4:53 PM, Jef Driesen jefdriesen@telenet.be wrote: [...]
Now that we are discussing major api improvements, let's also have a look at the sample parsing api.
[...]
What else should we support?
Additionally, my dive computer stores the following on sample level
- gradient factor (of leading tissue)
- ppO2 (there can be multiple O2 sensors data stored by other
computers)
- cns
See also: http://www.heinrichsweikamp.info/read.php?2,8503,8553 - the custom functions documentation shows what data is stored by OSTC (i.e. divisor options for temperature, deco, cns, etc).
I think we can consider adding ppO2 to the list of supported types. It's probably mainly relevant for rebreathers, where the traditional concept of gas switches doesn't exist and the actual breathing gas changes continuously. For open circuit and also semi-closed I think, there is no ppO2 sensor and I assume the value just gets calculated from the pre-configured gas mix and current depth?
For the other two, I'm not so sure. The gradient factors are very specific to the decompression model, and cns values can be calculated. I also don't know any other dive computer that has these.
Jef
On Thu, Jul 5, 2012 at 4:21 PM, Jef Driesen jefdriesen@telenet.be wrote:
On 2012-07-04 19:10, Artur Wroblewski wrote:
On Wed, Jul 4, 2012 at 4:53 PM, Jef Driesen jefdriesen@telenet.be wrote: [...]
Now that we are discussing major api improvements, let's also have a look at the sample parsing api.
[...]
What else should we support?
Additionally, my dive computer stores the following on sample level
- gradient factor (of leading tissue)
- ppO2 (there can be multiple O2 sensors data stored by other computers)
- cns
See also: http://www.heinrichsweikamp.info/read.php?2,8503,8553 - the custom functions documentation shows what data is stored by OSTC (i.e. divisor options for temperature, deco, cns, etc).
I think we can consider adding ppO2 to the list of supported types. It's probably mainly relevant for rebreathers, where the traditional concept of gas switches doesn't exist and the actual breathing gas changes continuously. For open circuit and also semi-closed I think, there is no ppO2 sensor and I assume the value just gets calculated from the pre-configured gas mix and current depth?
It is for rebreathers indeed. You will have averaged (I believe) ppO2 from multiple sensors, but I suspect some computers might report ppO2 from all sensors on top of that. Note, that some of sensors might be in error.
(please note I am not a CCR diver)
For the other two, I'm not so sure. The gradient factors are very specific to the decompression model,
Maybe... but I wonder how data from VPM-B + GFS combination is being stored
http://www.hhssoftware.com/v-planner/vpmdetail.html
:)
and cns values can be calculated.
You can't if you don't have residual oxygen load from previous dive (i.e. removed dive from logbook).
Also, CNS depends on the pressure, so your dive computer _might_ calculate it in real time (useful on altitudes different than sea level I suppose?).
I know you can estimate (or recalculate) a lot of values, but I am interested in what my machine reported to me at given time and depth. I am _not_ interested what some application thinks, even if it provides quite accurate information. Actually... I am interested in the dive computer data when it is innaccurate. :)
[...]
Regards,
w
On 2012-07-05 19:47, Artur Wroblewski wrote:
On Thu, Jul 5, 2012 at 4:21 PM, Jef Driesen jefdriesen@telenet.be wrote:
I think we can consider adding ppO2 to the list of supported types. It's probably mainly relevant for rebreathers, where the traditional concept of gas switches doesn't exist and the actual breathing gas changes continuously. For open circuit and also semi-closed I think, there is no ppO2 sensor and I assume the value just gets calculated from the pre-configured gas mix and current depth?
It is for rebreathers indeed. You will have averaged (I believe) ppO2 from multiple sensors, but I suspect some computers might report ppO2 from all sensors on top of that. Note, that some of sensors might be in error.
(please note I am not a CCR diver)
I'm not a CCR (or SCR) diver either. I know the basic principles, but that's about it.
I know you can estimate (or recalculate) a lot of values, but I am interested in what my machine reported to me at given time and depth. I am _not_ interested what some application thinks, even if it provides quite accurate information. Actually... I am interested in the dive computer data when it is innaccurate. :)
The primary use-case for the libdivecomputer library is a general purpose logbook application (like macdive, divinglog, etc), which is not tied to a specific model of dive computer. None of those applications will ever support every single feature of each dive computer. The goal is to support a good subset that will satisfy the majority of the users. It's a trade off you have to make.
If you want to squeeze out every single piece of information, then you are in fact looking at a highly device specific application (like ostctools), not some general purpose application. You can still use libdivecomputer for this purpose, but it will require some additional effort too. For example you can use libdivecomputer for the downloading (which is usually the hardest task), and do the parsing yourself.
Jef
On Wed, Jul 4, 2012 at 8:53 AM, Jef Driesen jefdriesen@telenet.be wrote:
Now that we are discussing major api improvements, let's also have a look at the sample parsing api. The current api generates a stream of sample values, where the time value is special in the sense that it starts a new sample. I think it would be cleaner to somehow return an opaque sample, and then get all the values available for that sample. Something similar to this:
I think that would result in simpler code, yes. That said, at least for the really fundamental ones I would actually prefer to not have to see those "dc_sample_get_xyz()" calls for each.
Because we *know* that for each sample, we're going to want time, depth, temperature and cylinder pressure if available. So if we want a really convenient format, just give those directly without even asking for it (so maybe "dc_sample_t" could have them as defined members that we can just access, or maybe there would be a separate structure that contains the standard ones).
The first question is of course which sample types do we want to support? The absolute minimum to support is:
- time
- depth
- pressure
- temperature
Note that you already do "pressure" wrong, in that there doesn't seem to be any way to get the pressure samples from *multiple* cylinders in the same sample.
And no, gas-change events are not sufficient (although we can work around *some* of the limitations with them). There are literally dive computers out there that can listen to four different HP sensors at the same time, and people actually use them for more complex cases than just switching between cylinders for one diver. One dive instructor talked about tracking student air use from his dive computer using that kind of setup, for example.
So I haven't seen it personally yet, but the "there is only one pressure per sample" is not correct.
At a minimum, the pressure needs to have an index, and then for the multi-cylinder case you could generate multiple samples with the same time (but different high pressure index and values).
(And again, cylinder change events are *independent* of the HP samples: when you switch cylinders, the pressure sensor does not move with the cylinder switch. So for my own setup, for example, the HP sensor is always connected to one particular cylinder, and the pressure samples keep coming from that one even if I switch cylinders on the dive computer and start breathing from the other one. So please never try to mix up "pressure info" with "cylinder change" events, that way lies madness).
At the moment we also have:
- events: Anything that happens only at specific point in time. All other
sample types are typically measurements obtained from a sensor (or calculations based on them), with values available for each sample point (or at least every x sample points). But that is not the case for events. Typical examples are warnings, gas changes, etc. At the moment many events are supported, but the api sucks. The current plan is remove the events, use the vendor type instead (see below), and then add back support for the important events with a better api. The details of this last part are still unclear, but support for gas switches is definitely high on the list.
I wouldn't mind that. Right now subsurface takes the events pretty much the way you feed them to us (except we then generate a string for them), and quite frankly, the end result is not pretty. We have this "filter events" thing in subsurface just to get rid of the crazy ones.
For example, the Suunto Vyper Air claims to do deco diving with multiple gases, and yes, it does do the calculations for it. But the frigging thing is totally broken.
With the Suunto Vyper Air, if you have a deco gas (say 50% oxygen) that you obviously want to use at the *end* of your dive to off-gas, the f*cking idiotic dive computer will constantly complain about the fact that you are using your bottom gas (say, air) until you hit 70 ft ON THE WAY DOWN. Because the idiot computer things that since you are shallow (never mind that you are at the beginning of your dive) you should use your high-oxygen cylinder.
As a result, all my deco dives are completely *full* of the idiotic Suunto "gas change warning" events. In fact, I won't even tell my dive computer that I have nitrox in my second cylinder if I'm not actually going to do a real deco dive, because it's so annoying (not just in the logs - it's actually distracting on the dive itself, because the "you should change gas" warning will actually replace the PSI reading with the O2% reading just to try to remind me).
So the "raw events" from the dive computer are often totally useless. It would be much better to have some sane per-computer "turn this event into something useful" model that can fix things like this.
(Sure, some people may again want the raw data, so have some raw data model for those things, but I'm talking about uses like subsurface that really want the *abstraction* part of libdivecomputer that makes all computers look sane - whether they really are that or not)
- rbt: The RBT (remaining bottom time) is the time you can spend at the
current depth and still have enough gas supply to make a safe ascent. It's currently available for the Uwatec Smart/Galileo only. Other air integrated dive computers often support a similar concept, so I think we can just keep this. Maybe rename it to the less cryptic "airtime"? Note that the actual interpretation may vary between models, depending on whether factors like the safety reserve, the ascent time, decompression time, etc are taking into account or not.
I'm not convinced the rbt is all that useful in post-dive logging. The dive computers that log it do so because they already calculated it, and the log is often just basically a "this is what I'm going to display" (many dive computers that *don't* log the rbt will still log the fact that they warned about low rbt - you can often set these things to warn when they think you are getting low).
But post-dive, what's the advantage of seeing the rbt? It's kind of pointless, and it was always really just calculated to begin with (ie it has no *real* data behind it, and the dive logging software could calculate its own version using the SAC rate etc). At the time of the dive, rbt is useful as a "this is what I estimate", but *after* the dive the fact that it is just an estimate makes it kind of pointless.
And as you say, it's not well-specified to begin with, since different computers calculate the remaining time totally differently (not just in details, but in the whole "what does it mean" department).
- heartbeat: Another Uwatec Smart/Galileo feature. Although I don't know
any other model that supports a heart monitor sensor, but it doesn't cause any trouble either. So I think we can keep this one too. BTW, is the correct name heartbeat or heartrate? I think it's heartrate, expressed in beats per minute (bpm). It's a bit confusing because both have the same translation in my native language.
I would call it heartrate.
- bearing: Yet another Uwatec Smart/Galileo feature. There are several
other models that have a digital compass, but they usually report it as an event, not a sample type. I also believe it makes more sense to consider a change in the compass bearing as an event. It's a setting you change, not some sensor that is polled every sample. I don't think any dive computer will ever generate a continuous stream with compass bearings. So I think we should drop this. BTW, is the correct name comnpass bearing or heading?
Either bearing or heading, and yes, at least for mine, it's an event (and I think it only happens when you ask for it with a bookmark).
- vendor: A vendor specific extension. This is intended for all samples
that don't fit well into one of the types supported by the libdivecomputer library. The application is responsible for parsing the raw data, but at least it doesn't have to care about where and how the samples are stored. There are actually two variants of the vendor type today. For the first variant each piece of info gets its own vendor sample. That means a dc_sample_t can contain multiple vendor values. The second variant is the raw data representing the entire dc_sample_t data. Both have their use, so I propose to reserve the vendor type for the first case, and add a new raw type for the second case.
Fair enough.
What else should we support? Some common requests I get are:
- gas switches
- no decompression time
- decompression stops
- ...
Gas switching is a real event (just as long as you don't mix it up with the pressure data). I would suggest against bothering with the computed stuff, although the "violated deco stop" event is obviously an interesting event.
Decompression is a bit troublesome. Some models have just a simple deco flag. Others provide a time and depth, which is typically the time to surface (TTS) (e.g. total time for all stops and ascent combined) and the depth of the current stop. But their are plenty of other interpretations possible too. I won't be surprised some of the more technical dive computers provide a full decoplan (e.g. time at each depth). I have no idea how to support something like this in a clean way.
I despise the deco events I see from suunto. In the raw format we get it now, there's a "you are now in deco" and then a second "you are now no longer in deco" event. I guess it makes sense, but I really don't know how to sensibly even show it sanely. Rigth now subsurface shows it as two events, and it's just annoying.
At the same time, the fact that the computer thinks you are in deco (or not) _is_ interesting information, so I think it's roughly the right thing to do. It's just that I don't like how libdivecomputer gives the data: it was not at all clear to me what the deco events meant initially.
Note that an alternative solution would be to implement a decompression algorithm in the application and calculate all kinds of deco related parameters (tissue saturation, deco stops, etc) that way. It won't necessary match with your dive computer, but at least you would get a consistent application experience, regardless of the type of dive computer you have. Personally, I think this makes more sense for a general purpose logbook application, but not everyone may agree with that of course.
I agree with you. *If* we really care, we could calculate most of these things on our own. In many cases, calculating them on our own is way more interesting than what the computer gives us, because then you can change the model and see "oh, according to *that* model I am already dead".
Linus
On 2012-07-04 20:47, Linus Torvalds wrote:
On Wed, Jul 4, 2012 at 8:53 AM, Jef Driesen jefdriesen@telenet.be wrote:
Now that we are discussing major api improvements, let's also have a look at the sample parsing api. The current api generates a stream of sample values, where the time value is special in the sense that it starts a new sample. I think it would be cleaner to somehow return an opaque sample, and then get all the values available for that sample. Something similar to this:
I think that would result in simpler code, yes. That said, at least for the really fundamental ones I would actually prefer to not have to see those "dc_sample_get_xyz()" calls for each.
Because we *know* that for each sample, we're going to want time, depth, temperature and cylinder pressure if available. So if we want a really convenient format, just give those directly without even asking for it (so maybe "dc_sample_t" could have them as defined members that we can just access, or maybe there would be a separate structure that contains the standard ones).
The problem with such a convenient format is that the only fields that are guaranteed to be available are time and depth. Everything else is optional and may not always be available. So if we would introduce some all-in-one structure, like the one below, how do you know when a particular field is available or not?
struct dc_sample_basic_t { unsigned int time; double depth; double temperature; double pressure[MAX]; };
Using some magic value to indicate the "not available" case is ugly and might cause problems too. For example zero or a negative value for temperature would be a bad choice, because they are perfectly valid temperatures.
Another complication are the multi-valued fields like the tank pressure. In the structure above I defined it as an array, but this won't work well, because the value for MAX will be different for every device. If we hardcode some global value that is large enough for all currently supported devices, then we are still screwed if a new devices appears with a larger number. And we can't fix it without breaking existing code.
I think what we really need here, is some kind of api to query which types are actually present.
BTW, with the dc_sample_get_xyz() accessor functions we can extend them to accept a unit parameter, and return the value in the requested units (e.g. si, metric or imperial) directly. Although it might be a bit overkill and better to leave unit conversion to the application. Just a thought.
Note that you already do "pressure" wrong, in that there doesn't seem to be any way to get the pressure samples from *multiple* cylinders in the same sample.
And no, gas-change events are not sufficient (although we can work around *some* of the limitations with them). There are literally dive computers out there that can listen to four different HP sensors at the same time, and people actually use them for more complex cases than just switching between cylinders for one diver. One dive instructor talked about tracking student air use from his dive computer using that kind of setup, for example.
So I haven't seen it personally yet, but the "there is only one pressure per sample" is not correct.
At a minimum, the pressure needs to have an index, and then for the multi-cylinder case you could generate multiple samples with the same time (but different high pressure index and values).
(And again, cylinder change events are *independent* of the HP samples: when you switch cylinders, the pressure sensor does not move with the cylinder switch. So for my own setup, for example, the HP sensor is always connected to one particular cylinder, and the pressure samples keep coming from that one even if I switch cylinders on the dive computer and start breathing from the other one. So please never try to mix up "pressure info" with "cylinder change" events, that way lies madness).
Multiple pressure values per sample are fully supported today, and I have no intentions to drop that in the new api. With the current api you get a pressure value and a tank id, just as you describe. I don't know why you think only a single pressure value is supported, although it might be due to the next paragraph :-)
Note that today, many dive computers with support for multiple pressure sensor, only store a pressure value for the active tank. In such case you will only get a single pressure value per sample, but this a limitation of the dive computer data format, not the libdivecomputer api. If a dive computer records multiple pressure values per sample, you'll get all of them. I assume this limitation is done to save memory. Inactive tanks are likely to stay at a constant pressure (unless you have a leak, have your BCD attached to it, or are doing buddy-breathing, etc) and record it is just a waste of memory. Another reason may be that a dive computer display has only space to show one pressure value at a time, and only that value gets recorded. I would have to double check, but I think this last one also applies to those dive computers that support monitoring your buddy's tank pressure (e.g. some oceanics). By default your own pressure is shown and you have to do something like pushing a button to actually see your buddy's pressure, and only then it gets recorded.
Tank pressures and gas switches are indeed independent, but I never claimed otherwise.
At the moment we also have:
- events: Anything that happens only at specific point in time. All
other sample types are typically measurements obtained from a sensor (or calculations based on them), with values available for each sample point (or at least every x sample points). But that is not the case for events. Typical examples are warnings, gas changes, etc. At the moment many events are supported, but the api sucks. The current plan is remove the events, use the vendor type instead (see below), and then add back support for the important events with a better api. The details of this last part are still unclear, but support for gas switches is definitely high on the list.
I wouldn't mind that. Right now subsurface takes the events pretty much the way you feed them to us (except we then generate a string for them), and quite frankly, the end result is not pretty. We have this "filter events" thing in subsurface just to get rid of the crazy ones.
[...]
So the "raw events" from the dive computer are often totally useless. It would be much better to have some sane per-computer "turn this event into something useful" model that can fix things like this.
(Sure, some people may again want the raw data, so have some raw data model for those things, but I'm talking about uses like subsurface that really want the *abstraction* part of libdivecomputer that makes all computers look sane - whether they really are that or not)
What I had in mind was that for now, we just resort to using raw data (e.g. the vendor extension), and try to document the format (much of this stuff is still unknown). Then later on, we can come back to this and provide some helper functions to assist parsing this data, or parse it internally into some nicer structure. But right now there are many things that I consider far more important.
- rbt: The RBT (remaining bottom time) is the time you can spend at
the current depth and still have enough gas supply to make a safe ascent. It's currently available for the Uwatec Smart/Galileo only. Other air integrated dive computers often support a similar concept, so I think we can just keep this. Maybe rename it to the less cryptic "airtime"? Note that the actual interpretation may vary between models, depending on whether factors like the safety reserve, the ascent time, decompression time, etc are taking into account or not.
I'm not convinced the rbt is all that useful in post-dive logging. The dive computers that log it do so because they already calculated it, and the log is often just basically a "this is what I'm going to display" (many dive computers that *don't* log the rbt will still log the fact that they warned about low rbt - you can often set these things to warn when they think you are getting low).
But post-dive, what's the advantage of seeing the rbt? It's kind of pointless, and it was always really just calculated to begin with (ie it has no *real* data behind it, and the dive logging software could calculate its own version using the SAC rate etc). At the time of the dive, rbt is useful as a "this is what I estimate", but *after* the dive the fact that it is just an estimate makes it kind of pointless.
And as you say, it's not well-specified to begin with, since different computers calculate the remaining time totally differently (not just in details, but in the whole "what does it mean" department).
I agree that the most interesting data is the "fundamental" data which is obtained directly from a sensor (e.g. time, depth, temperature, pressure). Everything else is usually calculated from this fundamental data, and can be recalculated (in a much more consistent way) post-dive. But nevertheless many people are still interested in this kind of info. One of the most common questions I get from divinglog or macdive users is: "Why don't I get info X when downloading my dives? It's available in the manufacturer's application, so it must be available in the data somewhere!".
So even while I'm personally not convinced of their usefulness either, the info is there and if it doesn't cause major trouble, it doesn't really hurt to support it to some extent. But that's only for the most common types of course. For the really highly device specific things, the answer is definitely no. That's simply out of the scope of the libdivecomputer library. And if anyone really insist, you can still get it from the raw data.
What else should we support? Some common requests I get are:
- gas switches
- no decompression time
- decompression stops
- ...
Gas switching is a real event (just as long as you don't mix it up with the pressure data). I would suggest against bothering with the computed stuff, although the "violated deco stop" event is obviously an interesting event.
I also consider gas switching a must have feature. It's one of the important things that is missing at the moment.
Like you say, there are probably a few more events that are interesting, but whatever list we end up with, it will always be a bit subjective. And the mapping from the raw device events to our standard set isn't always straightforward. For example the Suunto has voluntary and mandatory safety stops, deep stops and deco stops. Then what do we consider a "violated deco stop"? Only the real deco stops, or also the others? I'm pretty sure that if we pick one option, there will be some complaints that it doesn't match with the manufacturer's application. Not that I care too much about a perfect match...
At the same time, the fact that the computer thinks you are in deco (or not) _is_ interesting information, so I think it's roughly the right thing to do. It's just that I don't like how libdivecomputer gives the data: it was not at all clear to me what the deco events meant initially.
The exact meaning of the events isn't always clear for me either. Very often they show up in the manufacturer's application as a single word, without giving any real explanation. Even if you start digging deeper in the manuals there is no clear explanation, if there is an explanation at all. Some events in the data do not even show up in the applications and remain a mystery...
Jef
On Thu, Jul 5, 2012 at 4:20 PM, Jef Driesen jefdriesen@telenet.be wrote: [...]
BTW, with the dc_sample_get_xyz() accessor functions we can extend them to accept a unit parameter, and return the value in the requested units (e.g. si, metric or imperial) directly. Although it might be a bit overkill and better to leave unit conversion to the application. Just a thought.
IMHO, stick to SI. It is a standard... Pisses off everyone, but at the same time does not make happy anyone like metric or imperial do. For me unit conversion belongs to presentation layer. Of course, respect to those who disagree. ;)
[...]
Regards,
w
On 2012-07-05 19:58, Artur Wroblewski wrote:
On Thu, Jul 5, 2012 at 4:20 PM, Jef Driesen jefdriesen@telenet.be wrote: [...]
BTW, with the dc_sample_get_xyz() accessor functions we can extend them to accept a unit parameter, and return the value in the requested units (e.g. si, metric or imperial) directly. Although it might be a bit overkill and better to leave unit conversion to the application. Just a thought.
IMHO, stick to SI. It is a standard... Pisses off everyone, but at the same time does not make happy anyone like metric or imperial do. For me unit conversion belongs to presentation layer. Of course, respect to those who disagree. ;)
For me, the choice is either SI or metric. Definitely no imperial units!
The differences between SI and metric are actually quite small. It's basically only pressure (Pascal vs bar) and temperature (Kelvin vs Celsius) that are different. And for the pressure the difference is nothing more than a scale factor to a more everyday magnitude. What I do like about degrees Kelvin is that there are no negative values possible.
Jef
On 06.07.12 15:17, Jef Driesen wrote:
The differences between SI and metric are actually quite small. It's basically only pressure (Pascal vs bar) and temperature (Kelvin vs Celsius) that are different. And for the pressure the difference is nothing more than a scale factor to a more everyday magnitude. What I do like about degrees Kelvin is that there are no negative values possible.
Please go for bar and Celcius. Those are units that we actually use. No need to add mandatory conversions on the client side.
Henrik
On 22-06-12 13:51, Jef Driesen wrote:
I have committed some several backwards incompatible changes to the libdivecomputer api. To keep the inconvenience to a minimum, a stable "release-0.1" branch has been created, which will only receive pure bugfixes. Applications are recommended to stick to this stable branch for now. Especially because these changes are only the first in a series, and backwards compatibility will be broken again in the master branch.
I have committed another change to the master branch. There is now a new dc_context_t object that is intended to initialize and shutdown the library. All public interfaces have been changed to include a context pointer as one of their arguments. If you are using the dc_device_open() and dc_parser_new() functions, you'll only have to update the dc_device_open() call, because the context pointer is automatically inherited.
At the moment, the main benefit is the improved logging, which can be enabled/disabled at both compile time (--enable-logging configure option) or runtime (dc_context_set_loglevel). By default, all logging goes to stderr as before, but an application can register a custom logging function (dc_context_set_logfunc). This will make it easier to display the log messages in your user interface.
As usual, feedback is always welcome!
Jef
Den 27.08.12 23:18, skrev Jef Driesen:
I have committed another change to the master branch. There is now a new dc_context_t object that is intended to initialize and shutdown the library. All public interfaces have been changed to include a context pointer as one of their arguments. If you are using the dc_device_open() and dc_parser_new() functions, you'll only have to update the dc_device_open() call, because the context pointer is automatically inherited.
Hi Jef.
When I compile the new libdivecomputer on MacOSX 10.8, I get the following error:
CC atomics_cobalt.lo atomics_cobalt.c: In function ‘atomics_cobalt_device_version’: atomics_cobalt.c:224: error: ‘DEVICE_STATUS_TIMEOUT’ undeclared (first use in this function) atomics_cobalt.c:224: error: (Each undeclared identifier is reported only once atomics_cobalt.c:224: error: for each function it appears in.) atomics_cobalt.c:224: error: ‘DEVICE_STATUS_IO’ undeclared (first use in this function) atomics_cobalt.c: In function ‘atomics_cobalt_read_dive’: atomics_cobalt.c:280: error: ‘DEVICE_STATUS_TIMEOUT’ undeclared (first use in this function) atomics_cobalt.c:280: error: ‘DEVICE_STATUS_IO’ undeclared (first use in this function) make[3]: *** [atomics_cobalt.lo] Error 1 make[2]: *** [all] Error 2 make[1]: *** [all-recursive] Error 1 make: *** [all] Error 2
The only references to DEVICE_STATUS_TIMEOUT and DEVICE_STATUS_IO I can find is in atomics_cobalt.c. What am I missing?
Cheers, Henrik
On 28-08-12 08:12, Henrik wrote:
Den 27.08.12 23:18, skrev Jef Driesen:
I have committed another change to the master branch. There is now a new dc_context_t object that is intended to initialize and shutdown the library. All public interfaces have been changed to include a context pointer as one of their arguments. If you are using the dc_device_open() and dc_parser_new() functions, you'll only have to update the dc_device_open() call, because the context pointer is automatically inherited.
Hi Jef.
When I compile the new libdivecomputer on MacOSX 10.8, I get the following error:
CC atomics_cobalt.lo atomics_cobalt.c: In function ‘atomics_cobalt_device_version’: atomics_cobalt.c:224: error: ‘DEVICE_STATUS_TIMEOUT’ undeclared (first use in this function) atomics_cobalt.c:224: error: (Each undeclared identifier is reported only once atomics_cobalt.c:224: error: for each function it appears in.) atomics_cobalt.c:224: error: ‘DEVICE_STATUS_IO’ undeclared (first use in this function) atomics_cobalt.c: In function ‘atomics_cobalt_read_dive’: atomics_cobalt.c:280: error: ‘DEVICE_STATUS_TIMEOUT’ undeclared (first use in this function) atomics_cobalt.c:280: error: ‘DEVICE_STATUS_IO’ undeclared (first use in this function) make[3]: *** [atomics_cobalt.lo] Error 1 make[2]: *** [all] Error 2 make[1]: *** [all-recursive] Error 1 make: *** [all] Error 2
The only references to DEVICE_STATUS_TIMEOUT and DEVICE_STATUS_IO I can find is in atomics_cobalt.c. What am I missing?
I noticed this too. I made a mistake during merging the bugfixes from the release-0.1 branch. No idea why I didn't notice this before pushing the changes yesterday. I guess I was half asleep already :-) Anyway, the fix will be available in a couple of minutes.
Jef
Den 28.08.12 08:30, skrev Jef Driesen:
I noticed this too. I made a mistake during merging the bugfixes from the release-0.1 branch. No idea why I didn't notice this before pushing the changes yesterday. I guess I was half asleep already :-) Anyway, the fix will be available in a couple of minutes.
Great!
After I changed those to DC_STATUS_*, I stumbled upon a new error:
CC irda_dummy.lo irda_dummy.c:56: error: conflicting types for ‘irda_socket_open’ irda.h:35: error: previous declaration of ‘irda_socket_open’ was here make[3]: *** [irda_dummy.lo] Error 1
It looks like irda_socket_open has two different declarations:
$ grep -R irda_socket_open * src/irda.c:irda_socket_open (irda_t **out, dc_context_t *context) src/irda.h:int irda_socket_open (irda_t **device, dc_context_t *context); src/irda_dummy.c:irda_socket_open (irda_t **out) src/uwatec_smart.c: int rc = irda_socket_open (&device->socket, context);
Cheers, Henrik
On 28-08-12 08:45, Henrik wrote:
Den 28.08.12 08:30, skrev Jef Driesen:
I noticed this too. I made a mistake during merging the bugfixes from the release-0.1 branch. No idea why I didn't notice this before pushing the changes yesterday. I guess I was half asleep already :-) Anyway, the fix will be available in a couple of minutes.
Great!
After I changed those to DC_STATUS_*, I stumbled upon a new error:
CC irda_dummy.lo
irda_dummy.c:56: error: conflicting types for ‘irda_socket_open’ irda.h:35: error: previous declaration of ‘irda_socket_open’ was here make[3]: *** [irda_dummy.lo] Error 1
It looks like irda_socket_open has two different declarations:
$ grep -R irda_socket_open * src/irda.c:irda_socket_open (irda_t **out, dc_context_t *context) src/irda.h:int irda_socket_open (irda_t **device, dc_context_t *context); src/irda_dummy.c:irda_socket_open (irda_t **out) src/uwatec_smart.c: int rc = irda_socket_open (&device->socket, context);
Should be fixed too now.
Jef
On 2012-06-22 13:51, Jef Driesen wrote:
- Add device enumeration support.
With the new device enumeration support, applications can now enumerate all the supported devices at runtime, and don't have to maintain their own list anymore. The api uses an iterator style interface. See universal.c for some example code.
I noticed two possible "problems" with the new device enumeration api. Right now each device is identified by means of its name (with a vendor and product part) and model number. (There is also the family type, but that's irrelevant for this discussion.) Because it's the combination of both fields which should be unique, there can be two slightly confusing cases:
1. Devices with the same name, but different model numbers.
This should be rare, but can happen when there are multiple variants of a particular device. The only example that exists today is the Oceanic OC1. There is a variant with model number 0x434E and one with 0x4449.
Having two devices with the same name might be a bit annoying. In a typical application only the name will be displayed, and thus the end-user would see a duplicated entry. Note that for the Oceanic's, the actual model number isn't necessary for downloading, so both entries would work equally fine and the application could hide the duplicate entries. However I can't guarantee that will always be the case.
Removing the duplicated entries from the library isn't really an option either, because then reverse lookups won't work anymore. Assume a user owns an OC1, but selects another model from the same family, for example the Atom 2.0. Downloading will work fine, because internally the backend will autodetect the correct model, and report back the correct model number to the application. The application could then (but that's entirely optional) lookup the associated name based on the model code and the family type. But of course this reverse search will only work if the duplicate entries are still there.
2. Devices with a different name, but the same model number.
This can happen when identical devices are sold under a different name (e.g. rebranded models), or when a newly introduced variant is fully backwards compatible with previous variants and shares the same model code. An example of the first case are the Aladin Air X series, which were renamed to Aladin Air Z. Examples of the second case is the HW OSTC, which exists in 3 variants (OSTC, MK2 and 2N).
For the end-users it's probably convenient to have these "duplicate" entries, because they may not be aware they are in fact the same device. However for a reverse lookup, it means all entries will match and you end up with multiple candidates instead of an exact match.
Definitely not a major problem, but something you might notice some day.
Jef