convert last small purpose of builtin.pm to C and NOOP require's I/O #22699


Open
wants to merge 2 commits into
base: blead

Contributor

@bulk88 bulk88 commented Oct 24, 2024

See commit text. Embedding modern Perl's near-universal .pm'es/pragmas into static XS/C to avoid parsing/IO is advantageous for all perl CI everywhere. builtin.pm was very easy to do, b/c it already was 99% static XS. And it's part of the .pm dep tree of -E"".

-E say is supposed to be shorter to type, but what's the point if it requires typing a -I../lib every time for core hacking?

More philosophically, I want -E"say();" to be as easy as my first -e"print();" was many years ago. I only do Perl because I could never get the C compiler to work after borrowing a big purple C book from my middle school library. I don't know why that book was even in that library.

Many decades ago, perl5 was supposed to be a better shell script, or batch file. And perl was promised to be a single executable disk binary. Not 100s or 1000s of disk files for the C++ STL/.NET/Node/Java base class libraries, just to do hello world. A broken first-ever newbie dev perl install that can at least do -E"say()" will maybe keep someone in the perl community. With a "feature.pm file not found" error, well, that person moves on to another programming language in a few minutes and never looks back at perl.

node.bin is a fat-packed, pre-compiled, pre-jitted ~70 MB single OS binary file with the basic class library burned in (undump() sort of). No env var or broken install problems on that platform. The .js files on disk from the installer are only for the JS debugger to use. Node can't be compared to P5, but P5 can at least knock the low-hanging fruit off and embed the basic .pm/pragmas or primary execution paths of them/lazy load pragma .pm'es etc.


  • This set of changes requires a perldelta entry, and it is included.

hv_store(inc_hv, "builtin.pm", STRLENs("builtin.pm"), newSVpvs(__FILE__), 0);
ver_gv = gv_fetchpvs("builtin::VERSION", GV_ADDMULTI, SVt_PV);
ver_sv = GvSV(ver_gv);
/* Remember to keep $VERSION in this file and $VERSION in builtin.pm synced. */
Contributor

Is it possible to have a test for that?

Contributor Author

The test is already in this patch. Change either side to 0.017 and the test fails.

Contributor

Apologies, I missed that.

-builtin.pm is now primarily for POD and .pm indexing tools, core, CPAN or
user written.  It also is a backup mechanism for very strange %INC
localization, clearing, or manipulation done by users, probably in a .t,
and whatever %INC manipulation is being done is probably developer error.
-This removes all the libc/kernel I/O calls for builtin.pm, and Perl code
parser overhead.
-A large benefit is, this commit is 50% of the work, to make

perl -E 'say "hi";'

"/lib"-less or not dependent on any file I/O. perl.bin, libperl.so,
and miniperl.bin should be able to execute as a standalone binary.
If perl -e "1;" doesn't need a dozen separate library files,
perl -E "1;" also shouldn't need a dozen files.

perl -E "say 'Hello world';" should work, even with a broken perl
installation or unreachable "/lib/*.pm"s or broken "portable" perls.

Only a feature.pm dep is left, for -E to be lib-less.  That is for another
patch and PR in the future.
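
As a rough illustration of the idea in this commit message (not code from the PR), here is a toy model in plain C of a require that consults a loaded-modules table before touching disk. Everything in it (inc_table, toy_boot, toy_require, the placeholder path string) is hypothetical and only mimics the role that %INC and the hv_store() call play in the real interpreter.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-in for %INC: a small fixed-size table of module name -> source.
 * All names in this sketch are hypothetical, not perl internals. */
#define MAX_MODS 8

struct inc_entry { const char *name; const char *path; };

struct inc_entry inc_table[MAX_MODS];
int inc_count = 0;
int disk_lookups = 0;   /* counts simulated stat()/open() attempts */

/* Return the recorded source for a module, or NULL if it is not registered. */
const char *inc_fetch(const char *name) {
    for (int i = 0; i < inc_count; i++)
        if (strcmp(inc_table[i].name, name) == 0)
            return inc_table[i].path;
    return NULL;
}

/* Register a module; silently ignored when the toy table is full. */
void inc_store(const char *name, const char *path) {
    if (inc_count < MAX_MODS) {
        inc_table[inc_count].name = name;
        inc_table[inc_count].path = path;
        inc_count++;
    }
}

/* Simulated require: only goes to "disk" when the table has no entry. */
const char *toy_require(const char *name) {
    const char *path = inc_fetch(name);
    if (path)
        return path;        /* already registered: no I/O, no parsing */
    disk_lookups++;         /* stand-in for searching the library dirs */
    inc_store(name, "(found on disk)");
    return inc_fetch(name);
}

/* Simulated interpreter boot: register the compiled-in module up front. */
void toy_boot(void) {
    inc_store("builtin.pm", "builtin.c");
}
```

In this model, calling toy_boot() and then toy_require("builtin.pm") leaves disk_lookups at 0, while a first toy_require("feature.pm") still bumps it to 1, which mirrors the "feature.pm is the remaining dep" point above.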
@bulk88 bulk88 force-pushed the remove_builtin_package_IO_dep_on_dot_pm branch from d5208db to eb882bf October 24, 2024 21:24
Contributor Author

bulk88 commented Oct 24, 2024

Repushed; I forgot to stage an extra one-sentence comment in the .pm.

builtin.c Outdated
@@ -774,6 +774,9 @@ XS(XS_builtin_import)
void
Perl_boot_core_builtin(pTHX)
{
HV * inc_hv;
GV * ver_gv;
SV * ver_sv;
I32 i;
Contributor

Can we fold these declarations into the assignments below now that we're C99? That would seem more readable to me. The loop index should probably also move but that might be considered out of scope for this PR.

Contributor Author

Done in the latest revision.

Contributor

Leont commented Oct 24, 2024

This would mean %INC would no longer be empty at startup, right? That will actually break code of mine. I suspect there will be tests broken by this too.

Contributor Author

bulk88 commented Oct 24, 2024

This would mean %INC would no longer be empty at startup, right? That will actually break code of mine. I suspect there will be tests broken by this too.

cperl had a dozen things in %INC on startup. I don't remember any bug reports from back then, and nothing comes up on Google/GH in the way of bug reports or complaints. The sample size of people who would write a ticket for that fork is small, but I would've found a ticket of some kind if it was a common pattern to depend on a deep compare of %INC against a const hash.

A preloaded %INC sounds safer than hard-coding pp_require to skip the IO on string_eq. "UNAUTHORIZED RELEASE"/CPAN/future back ports, IDK what: delete $INC{} and require(); is cargo-culted for forking a module. If it's an XS .so, you are on your own to wipe the package glob without SEGVs. %INC preloads can be undone by anyone at any time; a memcmp() in pp_require can't easily be undone, or requires a documented backdoor/provision.

Or idea 3: /lib/strict/PP.pm, and hardcode str_eq/memcmp into pp_require. Remember, the original PP .pm implementation, if it exists, has to stay around for some months/years and has to be CI-ed against the builtin XS version on a make test. It might be build-only and never install the .pm, maybe keep the .pod separate, but the PP .pm has to initially stay around, I think, for CI.

Another argument is that hard-coding string_eq skips into pp_require() is fine. The strict.pm module/package/disk file is P5P-owned; random local() monkey patching of CVs/globs by random modules, where the developer decided to runtime-replace the vars token in use strict 'vars'; with use strict 'dacostumers';, is not supported, ever.

Another argument: the upstream dep list of any CPAN/core module is UB. There is no API contract that a module will never add or remove an upstream dep between releases. And the interp is a module ever since DynaLoader:: was introduced, or maybe https://perl5.git.perl.org/perl5.git/blob/e334a159a5616cab575044bafaf68f75b7bb3a16:/usub/curses.mus so if %INC is suddenly dirty, a clean one was never promised. See this line in modern perl
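
To make the comparison above concrete, here is a minimal sketch in plain C, again a toy model and not perl's pp_require. The function names and the single inc_has_builtin flag are assumptions made up for illustration; the flag stands in for the `$INC{"builtin.pm"}` entry.

```c
#include <stdbool.h>
#include <string.h>

/* Toy model only: one flag stands in for the %INC entry of builtin.pm. */
static bool inc_has_builtin = true;   /* set at simulated boot (the preload) */

/* Approach A: require trusts the table.
 * Returns true when it would fall back to searching disk. */
bool require_with_preload(const char *name) {
    if (strcmp(name, "builtin.pm") == 0 && inc_has_builtin)
        return false;   /* entry present: skip the I/O */
    return true;        /* entry missing: search disk as usual */
}

/* Approach B: the file name is special-cased inside require itself. */
bool require_with_hardcoded_skip(const char *name) {
    if (strcmp(name, "builtin.pm") == 0)
        return false;   /* always skipped, whatever the table says */
    return true;
}

/* Stand-in for user code running delete $INC{"builtin.pm"}. */
void user_deletes_inc_entry(void) {
    inc_has_builtin = false;
}
```

In this model, after user_deletes_inc_entry() the preload variant reports that it would go to disk again, while the hard-coded variant still skips. That is the asymmetry being argued: the preload can be undone from user code, the string compare cannot without an extra escape hatch.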

perl5/gv.c

Line 1235 in 9a9d70c

if (stash_name && memEQs(stash_name, HvNAMELEN_get(stash), "IO::File")

Contributor Author

bulk88 commented Oct 25, 2024

Another idea, for modules that are P5P upstream, or P5P-only and not shipped on CPAN, the "no updates unless you install a new major release" kind of module:

I propose that P5P modules are from now on blacklisted from %INC and should only appear in a new `%^INC` global hash, and that hash should be RO from day one. Light footsteps towards code signing/DRM/security/tampering with .pm files by maids/auditing perl interps for useless govt reports.

-silence nearby MSVC x64 only truncation warnings
Contributor

tonycoz commented Nov 4, 2024

The original specification of builtin suggested we might ship builtin as a module on CPAN that implemented backports of at least some of the builtins.

From what I can see this change prevents such an implementation from working.

Contributor Author

bulk88 commented Apr 5, 2025

The original specification of builtin suggested we might ship builtin as a module on CPAN that implemented backports of at least some of the builtins.

From what I can see this change prevents such an implementation from working.

You bring up a good question here. So I need to do more thinking on my side now.

NOTE NOTE NOTE!!!!!! This is a long @#$ post. I'm probably just going to add 5-10 more lines of new C code in this PR, as a simple, non-controversial, quick engineering "fix" regarding TonyC's question, vs waiting for a social or legal or policy or management "fix" for this PR. The rest of this post is about very important "policy" and "direction" problems with the Perl 5 lang/grammar and the PAUSE SQL DB, and not about C/PP source code or engineering. The "policy" or "direction" problems/concerns need to be answered "officially", whatever "officially" means here.

I did read https://github.com/Perl/PPCs/blob/f22f090f1a77002d7f9b3af38717b2067e495e9a/ppcs/ppc0009-builtin-namespace.md all the way to the end just now. Yes, Revision 1 (or current/last as of writing) disables the provision for a CPAN builtin.tar.gz installed in /site/lib to ever exist, since the interp won't check the FS or do disk I/O anymore for builtin.pm.

But I see conflicts with the text inside PPC-0009 and current established P5P policies/practices. Quoting PPC-0009

Backwards Compatibility .....CUT..... As a related note,
by creating a dual-life distribution containing the
builtin.pm pragma module along with a polyfill implementation of any
functions it ought to contain, this can be shipped to CPAN in order
to allow code written using this new mechanism to be at least
partly supported by older perl versions. Because the pragma still
works as a regular module, code written using the use builtin
... syntax would work as intended on older versions of perl if
the dual-life builtin distribution is installed.

Now, let's look at P5P's official policies

https://github.com/Perl/perl5/blob/blead/pod/perlpolicy.pod#maintenance-and-support

This document codifies the support and maintenance
commitments that the Perl community should expect from Perl's
developers:

We "officially" support the two most recent stable release
series. 5.36.x and earlier are now out of support. As of the
release of 5.42.0, we will "officially" end support for Perl
5.38.x, other than providing security updates as described below.

To the best of our ability, we will attempt to fix critical
issues in the two most recent stable 5.x release series. Fixes
for the current release series take precedence over fixes for
the previous release series.

https://github.com/Perl/perl5/blob/blead/pod/perlpolicy.pod#maintenance-branches

New releases of maintenance branches should only contain
changes that fall into one of the "acceptable" categories set out
below, but must not contain any changes that fall into one of the
"unacceptable" categories. (For example, a fix for a crashing bug must
not be included if it breaks binary compatibility.)

It is not
necessary to include every change meeting these criteria, and in
general the focus should be on addressing security issues,
crashing bugs, regressions and serious installation issues. The
temptation to include a plethora of minor changes that don't affect
the installation or execution of perl (e.g. spelling
corrections in documentation) should be resisted in order to reduce
the overall risk of overlooking something. The intention is to
create maintenance releases which are both worthwhile and which
users can have full confidence in the stability of. (A secondary
concern is to avoid burning out the maint-release manager or
overwhelming other committers voting on changes to be included (see
["Getting changes into a maint branch"](https://github.com/Perl/perl5/blob/blead/pod/perlpolicy.pod#Getting-changes-into-a-maint-branch) 
below).)

Over many years, P5P has reduced further and further the number of stable perl releases per year, and the commit counts in them, to near zero except for CVE fixes. In 2024/2025, nothing except a SEGV from PP code (no CPAN XS allowed) or a CVE qualifies for EOL/legacy "support" from P5P. Whether by policy or just by cherry-pick votes of P5P devs with a commit bit, the truth is that maint perl tar.gz'es at 5.4X.1 or 5.4X.2 just don't exist anymore, and if they exist, they offer too little value to bother installing them and redeploying all your production code.

A perl user needs to wait until June of next year, or use blead perl in production. That's the real P5P maint policy.

So the text in PPC-0009 wants to ship a builtin.tgz on CPAN for ancient/historic perl interps. The code in this PR, by disabling stat() or open() on the builtin.pm disk file, doesn't affect that provision in PPC-0009: it doesn't stop someone with a commit bit/PAUSE control from shipping a builtin.tgz on CPAN for ancient/historic perl interps, since those are ancient perl interps going backwards into the past that never had a builtin.pm or *main::builtin::.

My PR can only change what a blead perl/future perl does going forwards, it can't change what happens on legacy libperl.so'es already deployed in the field. This write up by me is so long, because it brings a serious design question, of what the "pragmas" in Perl 5 language really are. All of the Perl 5 language pragmas today are mis-use of Larry's/Perl's dl_open() keyword and mis-use of a disk I/O PP language keyword, because nobody, including me, ever thought of anything better.

Larry Wall's official language extension mechanism in Perl 1-5.000 alpha, which is https://perldoc.perl.org/perlvar, has been obsolete for decades. I don't personally think, and nobody on the WWW thinks, that the historic "extension mechanism" is correct and proper and should be "expanded". Anyone can chime in on the $^1_11\1READABLE from perlvar.pod vs use token API design pattern topic. I'd like to hear someone's backstory or historical review on this API design part of the P5 lang.

I'm not advocating for use strict; to be "put on CPAN". NO NO NO. But if use strict; is a P5P-controlled string for the next 100 years, why is libperl.so/.dll doing DISK I/O and ASCII PARSING for something that is not end-user serviceable, not Perl 5 end-user runtime modifiable, not independently maintained on CPAN, and forever part of the P5 grammar and P5 git repo?

The above isn't my idea. It's someone else's. And there is Artistic 2.0 licensed C code out there, implementing all of Perl 5's all-lowercase pragmas as test-passing .c code for someone's /bin/perl/libperl.so, ya know ;-)

PPC-0009 was written on Oct 16, 2021. This is April 2025. AFAIK, that provision of PPC-0009 hasn't been done by now. But there are multiple imitations/copy-cats of that provision on CPAN right now.

It's been 4 years, and nobody wrote the official CPAN code for builtin.pm. Are there any volunteers in year 2025 willing to do this?

https://metacpan.org/pod/builtin::compat

Is that provision from PPC-0009 even applicable anymore if builtin::compat.tgz exists on CPAN currently?

Is there a legitimate reason for P5P to ship a builtin.tgz that is in competition with builtin::compat.tgz and its maintainer Graham Knop who has a commit bit, and was a P5P release manager?

Graham K 5.40.0 2024-Jun-09

Will Knop (the person) suddenly become adversarial/ideological over his builtin::compat name space? Or will they curse someone out in the Perl community using many homophobic and racist words, and an obvious official ethics /Conduct Code process is followed after that, and then he is banned from the Perl community?

That's fantasy talk IMO. There isn't a reason for that current CPAN module and its PAUSE package namespace to turn into a war zone requiring builtin.tgz to exist on CPAN. War zone == CPAN modules owned by Schmorp/Lehmann.

There is also the module by PPC-0009's sponsor/author at

https://metacpan.org/release/PEVANS/builtin-Backport-0.02

and while I do see this file and this line of source code https://metacpan.org/release/PEVANS/builtin-Backport-0.02/source/lib/builtin.pm#L3, according to https://metacpan.org/dist/builtin-Backport/permissions that is not the "official" builtin.pm. It's just a random citizen's builtin::Backport.pm file. I wonder why I don't see BIG RED LETTERS saying unauthorized on metacpan for that tarball, like I did on the old cpan.org site. He didn't use the \n trick in the package statement to trick the PAUSE indexer.

But it looks abandoned by its CPAN/PAUSE owner, and it doesn't have a GH repo published anywhere. I tried searching for it. I don't think it exists. And it looks like the Perl community agreed on the CPAN module located at https://metacpan.org/pod/builtin::compat as the implementation of PPC-0009's text that talks about backporting through CPAN to ancient perls.

https://github.com/haarg/builtin-compat/commits/master/ shows https://metacpan.org/pod/builtin::compat has very many updates, and its owner easily accepts other people's patches, and does quick releases on CPAN of new .TGZs.

But let's discuss this provision as if it will happen, or as if someone will write the code in the next 3 months and ship builtin.tgz on CPAN, or at least write a branch and have the PR rejected by consensus at the end.

The P5P PAUSE account doesn't have namespace ownership of package builtin

https://pause.perl.org/pause/authenquery?ACTION=peek_perms&pause99_peek_perms_by=ml&pause99_peek_perms_query=version&pause99_peek_perms_sub=Submit&pause99_peek_perms_by=a&pause99_peek_perms_query=P5P&pause99_peek_perms_sub=1

Some other people do

https://pause.perl.org/pause/authenquery?ACTION=peek_perms&pause99_peek_perms_by=ml&pause99_peek_perms_query=builtin&pause99_peek_perms_sub=Submit

For the purpose of this discussion, let's pretend LeoNerd had a tragic interaction with a public transit vehicle 12 hours ago. So he isn't alive to write new code going into the future. So what person with a P5P commit bit is going to write the code and release a future builtin.tgz on CPAN?

How does a builtin.tgz on CPAN not violate P5P's maint policy, and how is it not, in other people's words, an unacceptable support/maint burden on unpaid Perl volunteers?

https://github.com/Perl/perl5/blob/blead/pod/perlpolicy.pod#maintenance-branches

The only prior talk I've found, at https://groups.google.com/g/perl.perl5.porters/c/5V2aK9OSb5g/m/JKlpRWvHBgAJ (same thread: https://www.nntp.perl.org/group/perl.perl5.porters/2023/07/msg266763.html), suggests PEVANS was told it's a policy violation to do what PPC-0009 says. But that is 1 person's (James E Keenan's) P5P ML post, which is someone's personal opinion and not a "decision", because it's a 1-person ML post. And PEVANS called his module builtin-Backport.tgz, not builtin.tgz, and he knows there is some fuzziness over who "owns" the grammar token and the disk file, and it's not him.

Someone correct me if the kid51 Keenan ML post from 2023 was the "official decision", which said PPC-0009's builtin.tgz on CPAN text will never happen and is against consensus/P5P policy. 1 other person said in the thread "There are already 2 useful copy-cat implementations of PPC-0009's builtin.tgz on CPAN provision. I don't want to see a 3rd implementation of PPC-0009's idea on CPAN."

I can't find any other prior talk about an official P5P-sponsored and controlled builtin.tgz on CPAN other than that July 2023 ML thread. But that 2023 ML thread doesn't solve the policy question; it's just chat-room style talk to me.

Now let's say it's not a maint policy violation. Won't all future updates/changes to builtin.pm for future perls be delivered every year during the month of June inside perl-5.4X.0.tgz, or be delivered in perl-5.3X.1.tgz or perl-5.3X.2.tgz?

If future use builtin qw( ); features are delivered as C code inside perl-5.4X.0.tgz, perl-5.3X.1.tgz or perl-5.3X.2.tgz, this PR changes nothing, since a builtin.tgz still won't exist on CPAN, because any update comes inside a new perl-5.4X.0.tgz or perl-5.3X.1.tgz.

Those paragraphs in PPC-0009 need official clarification: are those paragraphs about builtin.tgz on CPAN actually true and going to be done, or can they not be done and were a mistake in the text?

Official clarification is also needed on this: is the CPAN module called https://metacpan.org/pod/builtin::compat the actual deployed production implementation of that provision inside PPC-0009, or did Knop create an unauthorized, unofficial, not P5P-connected CPAN module for their personal/private day job needs, or is builtin::compat.tgz the "consensus" implementation of PPC-0009 even if it's not a P5P official product (aka PAUSE "dual-life" status, with a P5P GH bug tracker)?

PEVANS's CPAN module already has a bug ticket at https://rt.cpan.org/Public/Bug/Display.html?id=158702, reported by ETHER, describing a runtime bug where his builtin-Backport.tgz causes a Perl ecosystem runtime conflict/PAUSE package namespace ownership problem/issue/concern with other P5P/CPAN code. It is unanswered as of me writing this post.

So official clarification is needed: does PEVANS, as the author of PPC-0009, have legal or social or moral ownership of 4 different things?

  • PAUSE package namespace builtin
  • ownership of a right to upload a file called builtin.tgz to CPAN, if he wants to upload a file with that name
  • social control of the features offered by builtin.tgz on CPAN to legacy/out-of-maint/unsupported ancient stable perls? Social control, like refusing to accept another person's patch adding a new token from P5P blead perl to the builtin.tgz that he ships on CPAN, refusing for any reason, right or wrong.

It can be argued that P5P loses control of obsolete stable "Perl 5" releases when P5P, according to its own policy, drops "maint" for old stable Perl 5 interps after 24 months, so what his builtin.tgz file does on cpan.org is "out of scope" for discussion at P5P. Glass-half-full version of the same words: P5P forks the "Perl 5 language" every 24 months, and what CPAN modules do to ancient EOL perl interps is irrelevant. And P5P doesn't offer support for random malware that end users download from CPAN every day.

  • social control, p5p.git/cpan/* style, over the parser token builtin in the Perl 5 language

I'm writing all of this up since it's a good topic that needs a final decision by someone: who controls, or what group "owns", the PP source code ASCII string use all_lower_case;, and can those pragmas, since they are inseparable from the disk file libperl.so, just be .xs-ed or .c-ed or PP-minified and fat-packed into the disk file libperl.so for performance and optimization reasons?

If an end user for some reason really wants to replace the P5P C code at runtime in their perl process (monkey patching), they can just delete the %INC entry at runtime in a BEGIN block and call use, do, or require again.

The cPerl fork decided to always keep the P5P PP versions of the core pragmas around for miniperl.exe reasons, and for testing its own .c implementations against the P5P PP code as a sanity/back-compat test during its interp core self-make-test. I agree with leaving a monkey-patch hole open at runtime if for some crazy reason an end user wants to monkey patch their perl binary and their perl process. Perl 5 isn't a WWW HTML browser app with mandatory extreme security rules for end-user code. The Perl 5 lang will always give end users a box of bullets and a long rope. It's their address space, their OS process, and their PP/C source code. Perl isn't a mobile app store or an OS hypervisor project.

5 participants