46 changes: 31 additions & 15 deletions src/wp-includes/script-loader.php
@@ -3438,26 +3438,48 @@ function wp_enqueue_command_palette_assets() {
'is_network_admin' => is_network_admin(),
);

/**
* Extracts root-level text nodes from HTML string.
*
* @ignore
* @param string $label HTML string to extract text from.
* @return string Extracted text content, trimmed.
*/
$extract_root_text = static function ( $label ) {
if ( '' === $label ) {
return '';
}

$processor = WP_HTML_Processor::create_fragment( $label );
$text_parts = array();

if ( $processor->next_token() ) {
$root_depth = $processor->get_current_depth();
do {
if ( '#text' === $processor->get_token_type() && $root_depth === $processor->get_current_depth() ) {
$text_parts[] = $processor->get_modifiable_text();
}
} while ( $processor->next_token() );
}

return trim( implode( '', $text_parts ) );
};

hey nice job @t-hamano getting this built. I hope it wasn’t too obscure to figure out.

this looks like it should be solid, but I can share a couple of points of feedback.

Finding root-level text nodes

when creating a fragment (with the default <body> context) we will always have an open HTML element and BODY element, meaning that root-level text will always have a depth of 3 (and likewise, the breadcrumb depth will be three).

this means we can eliminate the nested loop and directly check if the depth is 3. we don’t have to capture the root depth. that open HTML and BODY are guarantees with how it works.
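As a rough sketch of the fixed-depth check described above (assuming WordPress is loaded so that `WP_HTML_Processor` is available; the function name `extract_root_text_fixed_depth()` is hypothetical and only used for illustration):

```php
<?php
// Sketch only: assumes WordPress is loaded and provides WP_HTML_Processor.
// extract_root_text_fixed_depth() is a hypothetical name, not part of the PR.
function extract_root_text_fixed_depth( $label ) {
	$processor = WP_HTML_Processor::create_fragment( $label );
	if ( null === $processor ) {
		return '';
	}

	$text_parts = array();
	while ( $processor->next_token() ) {
		// In a <body> fragment the open HTML and BODY elements are always present,
		// so a root-level text node is expected at depth 3 (HTML > BODY > #text).
		if ( '#text' === $processor->get_token_type() && 3 === $processor->get_current_depth() ) {
			$text_parts[] = $processor->get_modifiable_text();
		}
	}

	return trim( implode( '', $text_parts ) );
}
```

With a label such as `Comments <span>5</span>`, only the text outside the SPAN sits at depth 3, so the count inside the SPAN should be skipped.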

On the other hand, we can also test this via the breadcrumbs. I found an issue that we should probably change/fix on matches_breadcrumbs(), because that won’t work here, but for the time being this would.

while ( $processor->next_token() ) {
	if ( array( 'HTML', 'BODY', '#text' ) !== $processor->get_breadcrumbs() ) {
		continue;
	}

	$text_parts…
}

Efficiency and reliability

The HTML Processor is particularly convenient here because it tracks depth automatically. On the other hand, if you find that it’s too slow or fails too frequently (because it receives the fraction of input documents it can’t parse), then we can still adjust the lever on the reliability/practicality spectrum. The Tag Processor will not fail on the inputs that tripped up the PCRE matches, though because it doesn’t track document structure it can still misread some inputs (for example, those with mismatched tags).

Still, the Tag Processor won’t fail to parse each token and is considerably faster than the full-fledged HTML Processor. If we were to choose this approach, we’d have to track depth manually, which, again, could be wrong because HTML is so wonderfully complex (whereas the HTML Processor will not be wrong here).

$processor = new WP_HTML_Tag_Processor( $label );
$depth     = 0;
while ( $processor->next_token() ) {
	$token_name = $processor->get_token_name();

	if ( '#text' === $token_name && 0 === $depth ) {
		$text_parts…
		continue;
	}

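	// Only tags change the depth; skip nested text, comments, and other tokens.
	if ( '#tag' !== $processor->get_token_type() ) {
		continue;
	}

	if ( $processor->is_tag_closer() ) {
		--$depth;
	} elseif ( ! WP_HTML_Processor::is_void( $token_name ) ) {
		++$depth;
	}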
}

The choice is up to you. The only thing I’d watch out for is that occasionally we get things like “nested” A tags, and those can cause the HTML Processor to abort out of caution.
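One way to guard against such aborts is to check `get_last_error()` after the loop and fall back when it is not `null`. The following is a minimal sketch, assuming WordPress is loaded; the function name `extract_root_text_with_fallback()` is hypothetical, and the `wp_strip_all_tags()` fallback is an assumption chosen for illustration (note that it keeps nested text rather than dropping it):

```php
<?php
// Sketch only: assumes WordPress is loaded.
// extract_root_text_with_fallback() is a hypothetical name, not part of the PR.
function extract_root_text_with_fallback( $label ) {
	$processor = WP_HTML_Processor::create_fragment( $label );

	if ( null !== $processor ) {
		$text_parts = array();
		while ( $processor->next_token() ) {
			// Root-level text in a <body> fragment has exactly these breadcrumbs.
			if ( array( 'HTML', 'BODY', '#text' ) === $processor->get_breadcrumbs() ) {
				$text_parts[] = $processor->get_modifiable_text();
			}
		}

		// A null error means the processor reached the end without aborting.
		if ( null === $processor->get_last_error() ) {
			return trim( implode( '', $text_parts ) );
		}
	}

	// Assumed fallback: strip tags but keep all text, including nested text.
	return trim( wp_strip_all_tags( $label ) );
}
```

Whether a lossy fallback like this is preferable to returning an empty label is a judgment call that depends on how the label is used.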


if ( $menu ) {
$menu_commands = array();
foreach ( $menu as $menu_item ) {
if ( empty( $menu_item[0] ) || ! empty( $menu_item[1] ) && ! current_user_can( $menu_item[1] ) ) {
continue;
}

// Remove all HTML tags and their contents.
$menu_label = $menu_item[0];
while ( preg_match( '/<[^>]*>/', $menu_label ) ) {
$menu_label = preg_replace( '/<[^>]*>.*?<\/[^>]*>|<[^>]*\/>|<[^>]*>/s', '', $menu_label );
}
$menu_label = trim( $menu_label );
$menu_label = $extract_root_text( $menu_item[0] );
$menu_url = '';
$menu_slug = $menu_item[2];

if ( preg_match( '/\.php($|\?)/', $menu_slug ) || wp_http_validate_url( $menu_slug ) ) {
$menu_url = $menu_slug;
} elseif ( ! empty( menu_page_url( $menu_slug, false ) ) ) {
$menu_url = html_entity_decode( menu_page_url( $menu_slug, false ), ENT_QUOTES, get_bloginfo( 'charset' ) );
$menu_url = WP_HTML_Decoder::decode_attribute( menu_page_url( $menu_slug, false ) );
}

if ( $menu_url ) {
@@ -3474,21 +3496,15 @@ function wp_enqueue_command_palette_assets() {
continue;
}

// Remove all HTML tags and their contents.
$submenu_label = $submenu_item[0];
while ( preg_match( '/<[^>]*>/', $submenu_label ) ) {
$submenu_label = preg_replace( '/<[^>]*>.*?<\/[^>]*>|<[^>]*\/>|<[^>]*>/s', '', $submenu_label );
}
$submenu_label = trim( $submenu_label );
$submenu_label = $extract_root_text( $submenu_item[0] );
$submenu_url = '';
$submenu_slug = $submenu_item[2];

if ( preg_match( '/\.php($|\?)/', $submenu_slug ) || wp_http_validate_url( $submenu_slug ) ) {
$submenu_url = $submenu_slug;
} elseif ( ! empty( menu_page_url( $submenu_slug, false ) ) ) {
$submenu_url = html_entity_decode( menu_page_url( $submenu_slug, false ), ENT_QUOTES, get_bloginfo( 'charset' ) );
$submenu_url = WP_HTML_Decoder::decode_attribute( menu_page_url( $submenu_slug, false ) );
}

if ( $submenu_url ) {
$menu_commands[] = array(
'label' => sprintf(