Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize the natural sorting: improve sorting and performance #83

Merged
merged 1 commit into from
Feb 17, 2023

Conversation

dyphire
Copy link
Contributor

@dyphire dyphire commented Dec 10, 2022

Original sorting method doesn't correctly sort numbers with decimal points.

@CogentRedTester
Copy link
Owner

Can you give some before and after examples of how a set of files will be sorted as a result of this change?

@dyphire
Copy link
Contributor Author

dyphire commented Dec 10, 2022

Can you give some before and after examples of how a set of files will be sorted as a result of this change?

This is the method 4 mentioned in there: http://notebook.kulchenko.com/algorithms/alphanumeric-natural-sorting-for-humans-in-lua

The following is an example of the difference between the two sorting methods.
git mater:
image
with this pr:
image

In addition, the original sorting method has the following problems:

Ignore leading zeros. This one sorts "060" and "60" together (so "050" will come before "60"), but the order in which "060" and "60" are sorted is undetermined

file-browser.lua Outdated
Comment on lines 472 to 473
table.sort(t, function(a, b) return a.type:sub(1, 1) .. (a.label or a.name):lower():gsub("%.?%d+", padnum) .. ("%3d"):format(#b)
< b.type:sub(1, 1) .. (b.label or b.name):lower():gsub("%.?%d+", padnum) .. ("%3d"):format(#a) end)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the ("%3d"):format(#b) and ("%3d"):format(#a) doing?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case the rest of the comparison is the same, it'll sort them by string length.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in this case a and b are both item tables though? The length operator should not work. But I assume the idea is that if extra digits are inserted for the comparison then we want to prefer the item which had fewer digits initially?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly, that seems to be a bug. It should use the length of (label or name) instead.

@christoph-heinrich
Copy link

There have been performance and sorting improvements on this algorithm since this was posted.
I'd recommend to adapt the current state of autoload.lua.
The same algorithm is also used in uosc (plus some special handling depending on platform).

@CogentRedTester
Copy link
Owner

There have been performance and sorting improvements on this algorithm since this was posted. I'd recommend to adapt the current state of autoload.lua. The same algorithm is also used in uosc (plus some special handling depending on platform).

@dyphire do you have a response to this?

@dyphire
Copy link
Contributor Author

dyphire commented Jan 29, 2023

There have been performance and sorting improvements on this algorithm since this was posted. I'd recommend to adapt the current state of autoload.lua. The same algorithm is also used in uosc (plus some special handling depending on platform).

@dyphire do you have a response to this?

I agree with @christoph-heinrich that it is necessary to improve the implementation of the algorithm. But I'm not sure how to handle the sorting of folders and files in the new algorithm.

@christoph-heinrich
Copy link

But I'm not sure how to handle the sorting of folders and files in the new algorithm.

I didn't test it, but I think something like that should work.

tuples[i] = {f.type:sub(1, 1) .. ((f.label or f.name)):lower():gsub("0*(%d+)%.?(%d*)", padnum), f}

@dyphire
Copy link
Contributor Author

dyphire commented Jan 29, 2023

I didn't test it, but I think something like that should work.

tuples[i] = {f.type:sub(1, 1) .. ((f.label or f.name)):lower():gsub("0*(%d+)%.?(%d*)", padnum), f}

Yes, You're right. It's works.

@dyphire dyphire changed the title optimize the natural sorting: sort numbers with decimal points optimize the natural sorting: improve sorting and performance Jan 29, 2023
Copy link
Owner

@CogentRedTester CogentRedTester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Next time send me a ping when you want me to do a re-review. I had no idea that you had pushed an updated commit until today.

Comment on lines 463 to 464
--the number format functionality was proposed by github user twophyro, and was presumably taken
--from here: http://notebook.kulchenko.com/algorithms/alphanumeric-natural-sorting-for-humans-in-lua
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to change this comment? Perhaps cite autoload.lua instead? Or does autoload.lua also use an algorithm from that page?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, autoload.lua also use an algorithm from that page: mpv-player/mpv#10779

@CogentRedTester
Copy link
Owner

@dyphire I'm happy to merge this if you are.

Copy link
Owner

@CogentRedTester CogentRedTester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually wait, one change.

@dyphire
Copy link
Contributor Author

dyphire commented Feb 17, 2023

@dyphire I'm happy to merge this if you are.

It has been ready since the last push

@CogentRedTester CogentRedTester merged commit 695362e into CogentRedTester:master Feb 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants