Skip to content

Make import of std.internal.unicode_tables lazy. #5945

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

wilzbach
Copy link
Member

@wilzbach wilzbach commented Dec 19, 2017

There's no need to open and open and parse the gigantic
std.internal.unicode_tables if it's never needed.

My journey:

std.net.curl -> etc.c.curl -> std.socket -> std.stdio -> std.uni -> std.internal.unicode_tables.

> cd std && echo "import std.uni;" > foo.d
> time dmd -c -o- foo.d
dmd -c -o- foo.d  0.10s user 0.00s system 99% cpu 0.101 total

> time dmd -c -o- foo.d -I..
dmd -c -o- foo.d -I..  0.04s user 0.01s system 99% cpu 0.050 total

Follow-up to #5916

Alternatively a benchmark with avgtime:

> avgtime -r100 dmd -c -o- foo.d
Total time (ms): 9942.26
Repetitions    : 100
Sample mode    : 96 (15 occurrences)
Median time    : 96.908
Avg time       : 99.4226
Std dev.       : 6.38707
Minimum        : 92.079
Maximum        : 120.269
95% conf.int.  : [86.9042, 111.941]  e = 12.5184
99% conf.int.  : [82.9706, 115.875]  e = 16.452
EstimatedAvg95%: [98.1707, 100.674]  e = 1.25184
EstimatedAvg99%: [97.7774, 101.068]  e = 1.6452

> avgtime -r100 dmd -c -o- foo.d -I..
Total time (ms): 4678.42
Repetitions    : 100
Sample mode    : 46 (40 occurrences)
Median time    : 46.537
Avg time       : 46.7842
Std dev.       : 1.39462
Minimum        : 44.338
Maximum        : 53.491
95% conf.int.  : [44.0508, 49.5176]  e = 2.73341
99% conf.int.  : [43.1919, 50.3765]  e = 3.59231
EstimatedAvg95%: [46.5108, 47.0575]  e = 0.273341
EstimatedAvg99%: [46.425, 47.1434]  e = 0.359231

There's no need to open and open and parse the gigantic
`std.internal.unicode_tables` if it's never needed.
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @wilzbach!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

std/uni.d Outdated
alias UpperTriple = AliasSeq!(toUpperIndex, MAX_SIMPLE_UPPER, toUpperTab);

enum MAX_SIMPLE_LOWER = 1043;
alias LowerTriple = AliasSeq!(toLowerIndex, MAX_SIMPLE_LOWER, toLowerTab);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only bit I couldn't figure out how to do lazily. Better ideas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are these aliases used? Since they were originally private, you may be able to just rewrite the places where they're used to lazily instantiate the definition. Or maybe something like:

private alias UpperTriple() = ...

might do the trick? Not 100% this syntax is supported, or whether it will do what I think it will do.

@wilzbach
Copy link
Member Author

wilzbach commented Dec 19, 2017

Another module:

> cd std && echo "import std.net.curl;" > foo.d
> time dmd -c -o- foo.d
time dmd -c -o- foo.d
dmd -c -o- foo.d  0.24s user 0.04s system 99% cpu 0.288 total

> time dmd -c -o- foo.d -I..
dmd -c -o- foo.d -I..  0.21s user 0.03s system 99% cpu 0.241 total

With avgtime:

> avgtime -r100 dmd -c -o- foo.d
------------------------
Total time (ms): 31928.2
Repetitions    : 100
Sample mode    : 340 (29 occurrences)
Median time    : 315.292
Avg time       : 319.282
Std dev.       : 28.5593
Minimum        : 276.011
Maximum        : 375.411
95% conf.int.  : [263.307, 375.257]  e = 55.9752
99% conf.int.  : [245.718, 392.846]  e = 73.5638
EstimatedAvg95%: [313.685, 324.88]  e = 5.59752
EstimatedAvg99%: [311.926, 326.639]  e = 7.35638
> avgtime -r100 dmd -c -o- foo.d -I..

------------------------
Total time (ms): 24724
Repetitions    : 100
Sample mode    : 230 (46 occurrences)
Median time    : 237.948
Avg time       : 247.24
Std dev.       : 19.8293
Minimum        : 228.172
Maximum        : 301.335
95% conf.int.  : [208.375, 286.105]  e = 38.8647
99% conf.int.  : [196.163, 298.317]  e = 51.0769
EstimatedAvg95%: [243.353, 251.126]  e = 3.88647
EstimatedAvg99%: [242.132, 252.347]  e = 5.10769

@wilzbach
Copy link
Member Author

I managed to make a couple of other imports lazy and now std.uni is quite slim:

Before:

parse     foo
importall foo
import    object	(/usr/include/dlang/dmd/object.d)
import    std.uni	(/usr/include/dlang/dmd/std/uni.d)
import    std.meta	(/usr/include/dlang/dmd/std/meta.d)
import    std.traits	(/usr/include/dlang/dmd/std/traits.d)
import    std.functional	(/usr/include/dlang/dmd/std/functional.d)
import    std.range.primitives	(/usr/include/dlang/dmd/std/range/primitives.d)
import    std.internal.unicode_tables	(/usr/include/dlang/dmd/std/internal/unicode_tables.d)
semantic  foo
import    std.range	(/usr/include/dlang/dmd/std/range/package.d)
import    std.typecons	(/usr/include/dlang/dmd/std/typecons.d)
import    core.stdc.stdint	(/usr/include/dlang/dmd/core/stdc/stdint.d)
import    core.stdc.stddef	(/usr/include/dlang/dmd/core/stdc/stddef.d)
import    core.stdc.signal	(/usr/include/dlang/dmd/core/stdc/signal.d)
import    core.stdc.wchar_	(/usr/include/dlang/dmd/core/stdc/wchar_.d)
import    core.stdc.config	(/usr/include/dlang/dmd/core/stdc/config.d)
import    core.stdc.stdarg	(/usr/include/dlang/dmd/core/stdc/stdarg.d)
import    core.stdc.stdlib	(/usr/include/dlang/dmd/core/stdc/stdlib.d)
import    core.stdc.stdio	(/usr/include/dlang/dmd/core/stdc/stdio.d)
import    core.stdc.time	(/usr/include/dlang/dmd/core/stdc/time.d)
import    core.sys.posix.sys.types	(/usr/include/dlang/dmd/core/sys/posix/sys/types.d)
import    core.sys.posix.config	(/usr/include/dlang/dmd/core/sys/posix/config.d)
import    std.array	(/usr/include/dlang/dmd/std/array.d)
import    std.algorithm.iteration	(/usr/include/dlang/dmd/std/algorithm/iteration.d)
import    std.range.interfaces	(/usr/include/dlang/dmd/std/range/interfaces.d)
import    std.format	(/usr/include/dlang/dmd/std/format.d)
import    core.vararg	(/usr/include/dlang/dmd/core/vararg.d)
import    std.exception	(/usr/include/dlang/dmd/std/exception.d)
semantic2 foo
semantic3 foo
code      foo

After

parse     foo
importall foo
import    object	(/home/seb/dlang/druntime/import/object.d)
import    std.uni	(/home/seb/dlang/phobos/std/uni.d)
import    std.meta	(/home/seb/dlang/phobos/std/meta.d)
import    std.traits	(/home/seb/dlang/phobos/std/traits.d)
import    std.functional	(/home/seb/dlang/phobos/std/functional.d)
import    std.range.primitives	(/home/seb/dlang/phobos/std/range/primitives.d)
semantic  foo
semantic2 foo
semantic3 foo
code      foo

@@ -1588,7 +1588,7 @@ private auto packedArrayView(T)(inout(size_t)* ptr, size_t items) @trusted pure
// Partially unrolled binary search using Shar's method
//============================================================================

string genUnrolledSwitchSearch(size_t size) @safe pure nothrow
auto genUnrolledSwitchSearch(size_t size) @safe pure nothrow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for this change?

}

@safe pure nothrow @nogc @property
{
import std.internal.unicode_tables; // generated file

// It's important to use auto return here, so that the compiler
// only runs semantic on the return type if the function gets
// used. Also these are functions rather than templates to not
// increase the object size of the caller.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the logic of the comment here. Why would templates increase the object size of the caller if they never get used? And does returning auto really suppress the import inside the function body until the function is called?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And does returning auto really suppress the import inside the function body until the function is called?

Yes it does, the compiler needs to have the full function body and run semantic3 on it to determine the return type. Auto return functions behave a lot like templates in that sense, they don't get compiled into the call-site though (AFAIR).

Copy link
Member

@DmitryOlshansky DmitryOlshansky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I uge you to check how it performs after this surgery.

Secondly putting all tables to templates means we get to see them duplicated in many object files. As Martin I think he was the last to optimize tgis w.r.t. dynamic libraries

{
/// Run-time checked search.
static auto opCall(C)(in C[] name)
if (is(C : dchar))
{
// lazily import the big unicode_table
mixin(q{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG, this going beyond acceptable mutilation imho.

@@ -6095,6 +6099,10 @@ template SetSearcher(alias table, string kind)
/// Compile-time checked search.
static @property auto opDispatch(string name)()
{
// lazily import the big unicode_table
mixin(q{
import std.internal.unicode_tables;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@quickfur
Copy link
Member

Maybe another approach that will speed up imports without massive mutilation of std.uni might be to generate .di file from unicode_tables.d and import that instead. That way, no actual code change will be required, and we avoid template bloat.

@quickfur
Copy link
Member

ping @wilzbach @DmitryOlshansky

Any ideas how to move this forward?

@CyberShadow
Copy link
Member

Can we use .di files here to just provide an interface to the tables, so that the actual tables are parsed only when building Phobos and never when building user code?

@wilzbach
Copy link
Member Author

wilzbach commented Feb 4, 2018

Can we use .di files here to just provide an interface to the tables, so that the actual tables are parsed only when building Phobos and never when building user code?

Yeah that should work in theory, but then we would probably have to change the layout of Phobos a bit (e.g. like Druntime with it's import folder). However, I have an idea and will give it a try.

@wilzbach
Copy link
Member Author

wilzbach commented Feb 4, 2018

Can we use .di files here to just provide an interface to the tables, so that the actual tables are parsed only when building Phobos and never when building user code?

Tried it here: #6123.
tl;dr: DMD's header generation has many bugs, I bumped into two. They can be fixed though (submitted PRs). The big problem though is that the unicode_tables file uses templates too, so they won't be removed by the header file generation.

Copy link
Member

@MartinNowak MartinNowak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OKish though I'm slightly wondering about some function -> template conversions.
Functions have the advantage that they are shipped pre-compiled with the phobos release.


ushort toUpperIndex(dchar c) { return toUpperIndexTrie[c]; }
ushort toUpperSimpleIndex(dchar c) { return toUpperSimpleIndexTrie[c]; }
dchar toUpperTab(size_t idx) { return toUpperTable[idx]; }
dchar toUpperTab()(size_t idx) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not a function?

Actually making things a function above was a decision to also save on codegen time when sth. gets used, because the binary is already in phobos. Imports in function bodies do not leak to the call site, so you also save those when merely calling.
The compiler can still inline things if found appropriate at which point you pay the same price as for instantiating the template.

// lazily import the big unicode_table
mixin(q{
import std.internal.unicode_tables;
alias table = } ~ tableName ~ ";");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

mixin("std.internal.unicode_tables : table="~tableName~";")

instead?

@@ -6218,7 +6226,7 @@ auto caseEnclose(CodepointSet set)
/+
fetch codepoint set corresponding to a name (InBlock or binary property)
+/
@trusted CodepointSet getUnicodeSet(in char[] name, bool negated, bool casefold)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make it a template?

@MartinNowak
Copy link
Member

Yeah that should work in theory, but then we would probably have to change the layout of Phobos a bit (e.g. like Druntime with it's import folder). However, I have an idea and will give it a try.

Dmd prefers importing .di over .d files, so you could have them side by side. But our experience with .di files in druntime was fairly bad, try to avoid them.

@LightBender
Copy link
Contributor

PR closed as stalled. If you wish to resurrect it you are welcome to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants