@wrobelda wrobelda commented Nov 16, 2025

Previous MR #4820 introduced a bug where the URI wasn't getting expanded. This is because it is obtained from a non-standard data-uri attribute which defaultLinkTo() doesn't support.

On top of that:

  • sanitize the HTML in Content
  • use the longer Description found in the JSON
  • fix timestamp processing, including for relative Today and Yesterday strings
  • move media to enclosures
  • be explicit about the elements chosen to augment the description
  • simplify the image URL processing

github-actions bot commented Nov 16, 2025

Pull request artifacts

| Bridge | Context | Status |
|---|---|---|
| Kleinanzeigen | 1 By search (current) | ✔️ |
| Kleinanzeigen | 1 By search (pr) | Bridge returned error 0! (20408); Type: ErrorException; Message: Array to string conversion |
| Kleinanzeigen | 2 By profile (current) | ⚠️ The feed has no items |
| Kleinanzeigen | 2 By profile (pr) | ⚠️ The feed has no items |

last change: Sunday 2025-11-16 16:00:59

@Mynacol Mynacol left a comment

The PR still has CI failures, please look at them and my comments. Thanks!


if ($element->find('img', 0)) {
//enhance img quality. Cannot use convertLazyLoading() here due to non-standard URI suffix in srcset.
$item['enclosures'] = [preg_replace('/rule=\$_\d+\.AUTO/i', 'rule=$_57.AUTO', $element->find('img', 0)->getAttribute('src')) . '#.image'];

What is the . '#.image' part doing at the end?

); //enhance img quality

$item['content'] = '<img src="' . $imgUrl . '"/>' . $element->find('div.aditem-main', 0)->outertext;
$item['uri'] = urljoin($this->getURI(), $element->getAttribute('data-href'));

This could fail if Kleinanzeigen decides to make the value an absolute URL. Add a small if or ternary check to catch that.
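A minimal sketch of such a guard, as an illustration only: `resolveAdUri()` is a hypothetical helper, the example URLs are made up, and the plain string concatenation is a naive stand-in for the bridge's `urljoin()` call so the snippet runs on its own.

```php
<?php
// Hypothetical helper, not part of the bridge: keep absolute URLs untouched
// and only prefix relative values with the site base.
function resolveAdUri(string $base, string $href): string
{
    // parse_url() yields a string only when a scheme is present
    // (null when it is missing, false when the URL is badly malformed).
    if (is_string(parse_url($href, PHP_URL_SCHEME))) {
        return $href;
    }

    // Naive stand-in for urljoin(); sufficient for root-relative paths.
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

$relative = resolveAdUri('https://example.com/', '/s-anzeige/some-ad/123');
$absolute = resolveAdUri('https://example.com/', 'https://example.com/s-anzeige/some-ad/123');
```

With these inputs both calls should produce the same absolute URL. Protocol-relative values such as `//host/path` are not covered by this sketch.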


$item['timestamp'] = strtotime($dateString);
} else {
$item['timestamp'] = time();

Wouldn't it be better to omit the timestamp when it is unavailable or parsing fails? Otherwise, the timestamp of existing ads could change over time. Not sure here.
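One way this could look, as a sketch under assumptions rather than the PR's actual code: `parseAdDate()` is a hypothetical helper, and the label formats ("Heute, 14:30", "Gestern, 09:05", "12.11.2025") are assumed for illustration. It returns `null` on failure so the caller can leave `timestamp` unset instead of falling back to `time()`.

```php
<?php
// Hypothetical helper: returns a Unix timestamp, or null when the string
// cannot be parsed. The accepted formats are assumptions for this sketch.
function parseAdDate(string $dateString, DateTimeImmutable $now): ?int
{
    $dateString = trim($dateString);
    $relative = ['Heute' => 'today', 'Gestern' => 'yesterday'];

    foreach ($relative as $label => $modifier) {
        if (preg_match('/^' . $label . ',\s*(\d{1,2}):(\d{2})$/u', $dateString, $m)) {
            // "today"/"yesterday" resolve to midnight; then apply the clock time.
            return $now->modify($modifier)
                ->setTime((int) $m[1], (int) $m[2])
                ->getTimestamp();
        }
    }

    // The leading "!" resets unspecified fields, so the time is 00:00:00.
    $absolute = DateTimeImmutable::createFromFormat('!d.m.Y', $dateString, $now->getTimezone());

    return $absolute === false ? null : $absolute->getTimestamp();
}

$item = [];
$timestamp = parseAdDate('not a date', new DateTimeImmutable('now'));
if ($timestamp !== null) {
    $item['timestamp'] = $timestamp; // only set when parsing succeeded
}
```

Passing `$now` in explicitly keeps the relative cases deterministic and easy to test.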

)
); //enhance img quality

$item['content'] = '<img src="' . $imgUrl . '"/>' . $element->find('div.aditem-main', 0)->outertext;

I would still add the image to the content HTML in addition to the enclosure. Depending on the RSS reader, the display of enclosures can vary widely.
If you want, you can add a bridge option to control that.
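Roughly what I have in mind, as a hedged sketch: `buildMedia()` and the `$embedImage` flag are hypothetical names, and in the bridge the flag would presumably come from a checkbox parameter rather than a function argument.

```php
<?php
// Hypothetical helper: always attach the image as an enclosure and,
// when $embedImage is true, also prepend it to the content HTML.
function buildMedia(string $imgUrl, string $bodyHtml, bool $embedImage): array
{
    $item = [
        'enclosures' => [$imgUrl],
        'content' => $bodyHtml,
    ];

    if ($embedImage) {
        // Escape the URL before placing it inside an HTML attribute.
        $item['content'] = '<img src="' . htmlspecialchars($imgUrl, ENT_QUOTES) . '"/>' . $bodyHtml;
    }

    return $item;
}

$withImage = buildMedia('https://example.com/a.jpg', '<div>Ad text</div>', true);
$withoutImage = buildMedia('https://example.com/a.jpg', '<div>Ad text</div>', false);
```

Either way the enclosure is kept, so readers that only show enclosures and readers that only render content both get the image when the option is on.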
