Arrays and maps: Members, entries, values, contents, pairs, … #1338

Closed
ChristianGruen opened this issue Jul 22, 2024 · 5 comments
Labels
Enhancement A change or improvement to an existing feature Propose for V4.0 The WG should consider this item critical to 4.0 XPath An issue related to XPath XQFO An issue related to Functions and Operators

ChristianGruen commented Jul 22, 2024

With version 4.0, we are adding a lot of promising and powerful new map and array features. This is a big step forward, compared to the obvious limitations of 3.1.

Some aspects of the 3.1 design have made it difficult (or impossible) to fully adjust arrays and maps, but (in my opinion) the old overall concept was impressively consistent – and it is definitely a big challenge to achieve a 4.0 design that is not too fragmented.

To me, this becomes particularly evident in the case of arrays. The following example sums up the items of all members of an array. For the cumbersome 3.1 solution…

for $pos in 1 to array:size($array)
return sum($array($pos))

…we now have several (roughly?) equivalent options to do this; for example…

  1. for member $m in $array return sum($m)
  2. array:members($array) ! sum(?value)
  3. $array?entry::* ! sum(?value)
  4. $array?value::* ! sum(.)

…which is great – but the downside is that we have introduced a terminological jungle. The examples above could imply that:

  • for 1., an array member is a sequence (which it indeed is);
  • for 2., an array member is a map;
  • for 3., an array has entries (but there is no array:entries);
  • for 4., an array has values (which is true, but array:value returns a different structure).
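To make the baseline concrete, here is a minimal sketch of the 3.1 loop run against sample data. The array contents are hypothetical, and the second expression is just one other plain 3.1 way to get the same sums; the draft 4.0 forms listed above are deliberately not reproduced, since their syntax is still under discussion.

```xquery
xquery version "3.1";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

(: Hypothetical sample data: three members, each a sequence of numbers. :)
let $array := [ (1, 2), 3, () ]
return (
  (: The 3.1 loop from the issue text: one sum per member. Yields 3, 3, 0. :)
  (for $pos in 1 to array:size($array)
   return sum($array($pos))),
  (: The same sums via array:for-each, flattened with ?*. Yields 3, 3, 0. :)
  array:for-each($array, function($m) { sum($m) })?*
)
```

Note that the empty member contributes 0, because sum over an empty sequence returns 0.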

Next, with the current proposals, $array:content::1 gives us the sequence-concatenated version of the first member of an array. Similar observations can be made with maps: map:entries($map) returns singleton maps, whereas $map?entry::* is actually equivalent to map:pairs.

The fundamental obstacles are clear and have already been discussed a lot, but I think that with each new concept, we should try really hard not to blur terminology, and work with terms that users can assign to the underlying concepts without too much guessing or trial’n’error.

My general suggestions would be to…

  1. align the new lookup terminology and the builtin functions, and
  2. omit, rename or drop builtin functions that do not rely on the existing or arising terminology.

My concrete proposals:

  1. As we already have map:pairs, $map-or-array?entry::* should become $map-or-array?pair::*, and we should add an array:pairs function, and probably array:of-pairs (see 77 Lookup returning path selection #832). We shouldn’t do it the other way round and rename map:pairs to map:entries, as the existing map:entry function returns a singleton map.
  2. If we keep calling the sequence-concatenated result “content”, we should include it in the definition of sequence-concatenation. In addition, (array|map):values should be renamed to (array|map):contents (see Editorial: array:values, map:values #1179).
  3. Due to the existence of array:value::*, we should make clear what an “array value” is, how it positions itself in relation to an “array member”, and we should add map:values and array:values for equivalent results.
  4. Due to the existence of array:key::*, we should add an array:keys function (which returns a dense integer range). 1 to array:size($array) could then be written as array:keys($array).
  5. As we have map:entries and map:merge, we could add equivalent array:entries and array:merge functions.
  6. I would suggest dropping array:members/array:of-members in favor of either array:split/array:join, array:pairs/array:of-pairs (see 1.) or array:entries/array:merge (see 5). I really believe that an “array member” should not be a map; an “array pair” or “array entry” certainly could.
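As a rough sketch of proposals 1 and 4, the two functions could be approximated in plain XQuery 3.1 as follows. The names local:array-keys and local:array-pairs are hypothetical stand-ins, and the pair shape (a map with "key" and "value" entries) is an assumption modelled on map:pairs; neither is a settled design.

```xquery
xquery version "3.1";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

(: Hypothetical stand-in for the proposed array:keys:
   the dense integer range 1 to array:size($array). :)
declare function local:array-keys($array as array(*)) as xs:integer* {
  1 to array:size($array)
};

(: Hypothetical stand-in for the proposed array:pairs, assuming one
   map per member with "key" (the position) and "value" (the member). :)
declare function local:array-pairs($array as array(*)) as map(*)* {
  for $pos in local:array-keys($array)
  return map { "key": $pos, "value": $array($pos) }
};

(: Usage: the keys are 1, 2, 3; the pairs are three maps. :)
local:array-keys(["a", "b", "c"]),
local:array-pairs(["a", "b", "c"])
```

With these in place, the loop `for $pos in 1 to array:size($array)` can be read as iterating over the array's keys, which is the alignment with maps that the proposal is after.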

One might question whether we should really introduce map terminology for arrays. I think we have no other choice if we want to treat maps and arrays identically with lookup key specifiers, and it may help us later on to treat both data structures as similarly as possible.

@ChristianGruen

In #1457 (comment), I found cases in which “map keys” and “array indexes” are mentioned. I think it gets more and more confusing to respect the differences between maps and arrays, and analogous to array:key::*, I would be happy if we treated array index values as keys (adding an array:keys function, as suggested in this issue, would support this approach).

ndw commented Oct 7, 2024

I think that if we're going to contemplate adding array:pairs, we need to revisit a more fundamental question: are we going to continue to present arrays as a sequence of values, or are we going to present them as maps with sequential integer keys starting at 1? (I'm not really talking about how they're implemented or even how they're defined technically, I'm talking about how the specification is going to present them.)

If we're expecting the reader to build a mental model of arrays as a sequence of values and maps as a set of key/value pairs, then I worry that adding functions that make them more uniform just makes them harder to understand.

I wonder if there'd be some substantial simplification possible if we just accepted that arrays are maps with sequential, integer keys. All the functions that apply to map {1: "Hello", 2: "World"} apply equally to array{("Hello", "World")} in exactly the way they would if the latter was (literally) implemented as the former.

@ChristianGruen

> I wonder if there'd be some substantial simplification possible if we just accepted that arrays are maps with sequential, integer keys. All the functions that apply to map {1: "Hello", 2: "World"} apply equally to array{("Hello", "World")} in exactly the way they would if the latter was (literally) implemented as the former.

I could imagine that this would be a huge change and affect virtually every expression and function that handles arrays. It might also introduce backward incompatibilities, as arrays are stricter data types than maps (with additional bound and type checks). But your idea sounds enticing; maybe I’m overly cautious.
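The bound checks mentioned here can be seen in plain XQuery 3.1. The sketch below reuses the map and array from the quoted comment; the "out of bounds" string is only a placeholder for the error case.

```xquery
xquery version "3.1";

let $map   := map { 1: "Hello", 2: "World" }
let $array := array { ("Hello", "World") }
return (
  (: Within bounds, both structures answer lookups the same way. :)
  $map(1), $array(1),
  $map?2, $array?2,
  (: An absent map key yields the empty sequence, so the count is 0. :)
  count($map(3)),
  (: An out-of-range array position raises a dynamic error instead. :)
  try { $array(3) } catch * { "out of bounds" }
)
```

Code that relies on either behavior (a silent empty result or an error) would be affected if arrays were redefined as maps, which is one place backward incompatibilities could arise.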

My intent in the scope of this issue would be mostly to get as consistent as possible, even if we cannot roll back the 3.1 decision to treat maps and arrays differently: If we want to provide a pair lookup specifier for arrays (which is what we currently do), we should also have a corresponding function. If we don’t want the function, we shouldn’t support $array?pair::* either.

@michaelhkay

> I wonder if there'd be some substantial simplification possible if we just accepted that arrays are maps with sequential, integer keys. All the functions that apply to map {1: "Hello", 2: "World"} apply equally to array{("Hello", "World")} in exactly the way they would if the latter was (literally) implemented as the former.

It's an appealing idea but the devil is in the detail. While it's true that all functions that access maps could be made to view arrays as maps with integer keys, the same isn't true for construction, and therefore it isn't true for operations (such as filtering and mapping) that combine retrieval access and construction. There's also the detail that array keys are naturally sorted.
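One possible illustration of this difference, in plain XQuery 3.1, is removal: a map keeps the keys of the surviving entries, while an array is rebuilt with consecutive positions. This is a sketch chosen for this thread and not necessarily the case the comment has in mind.

```xquery
xquery version "3.1";
declare namespace map   = "http://www.w3.org/2005/xpath-functions/map";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

let $map   := map { 1: "Hello", 2: "World" }
let $array := array { ("Hello", "World") }
return (
  (: The surviving map entry keeps its key: the only key left is 2. :)
  map:keys(map:remove($map, 1)),
  (: The surviving array member is renumbered: "World" is now at position 1. :)
  array:remove($array, 1)(1)
)
```

So an operation that reads entries and then constructs a new value cannot simply carry the keys over for arrays, whereas for maps it can.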

@ChristianGruen

Superseded by #1871.
