Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept last_reviewed_on field as a string input #302

Open
NickWalt01 opened this issue May 4, 2022 · 17 comments
Open

Accept last_reviewed_on field as a string input #302

NickWalt01 opened this issue May 4, 2022 · 17 comments

Comments

@NickWalt01
Copy link

NickWalt01 commented May 4, 2022

What should change

Accept the use of last_reviewed_on field as a string input in any string variation; 2021/02/23, March 2027 23, January 13 2020, rather than a Date data type only. Whilst updating dependencies in the repositories listed in the next section I have found exceptions regarding the last_reviewed_on field as shown here. It appears that middleman expects the Date to be 2021/02/23 as highlighted on this page but the embedded ruby in this repository expects it to be 2021-02-23 as a Date object. As this field causes an exception it makes the title field fail causing the real error to be hidden by middleman asking for H1 tags when <%= current_page.data.title %> is used in the html file.

It currently fails here and the fix is to accept the string that is converted to a Date object:

def format_string_date(string_date)
   Date.parse(string_date)
end

User need

At MoJ we are using the tech-docs-gem in this repository to publish our template documents based on this template repository. We have many html pages that are utilising the tech-docs-gem to generate the html sites.

I have modified the tech-docs-gem ruby code and tested it works with both a string and a Date object.

@NickWalt01
Copy link
Author

Hi @laurahiles any idea how long it will take to implement the change? I can share the code and test if it helps

kr8n3r added a commit to alphagov/paas-tech-docs that referenced this issue Dec 2, 2022
Upstream tecg-docs gem with rub 2.7.2 is set up to expect dates as "yyyy-mm-dd" which are incompatible with ruby 3.x nd dependency updates - see alphagov/tech-docs-gem#302 for more details.

Until upstream is fixed, we're removing the review date frontmatter from the only place we use it, so that we can upgrade to a non-vulnerable ruby version
@richardTowers
Copy link
Contributor

Thanks for raising this Nick. I've done a bit of digging, and to summarise the issue as I understand it:

  • In The Good Old Days, users of this gem had frontmatter like last_reviewed_on: 1970-01-01, which middleman loaded with load_file, resulting in {"last_reviewed_on" => #<Date: 1970-01-01...>}
  • ruby 3.1 switched to Psych 4.x by default, which changes the default for load_file from unsafe_load_file to safe_load_file, which means trying to parse last_reviewed_on: 1970-01-01 raises an exception (because Dates are not allowed)
  • If users of the gem instead try a different date format like last_reviewed_on: 1970/01/01 that parses as a string ({"last_reviewed_on" => "1970/01/01"}, but then this gem expects a Date so it blows up later
  • Given that middleman can't handle dates, and this gem can't handle strings, there's no way to construct frontmatter that works.

I have an untested hypothesis that this is fixed in middleman 4.4.3, because they switched to ::YAML.safe_load(..., permitted_classes: [Date, ...], ...) which should mean formats that parse as dates are allowed now (e.g. 1970-01-01).

You could still make a strong argument that this gem should support both strings and dates, since middleman's docs clearly suggest that weird stringy date formats are permitted.

@richardTowers
Copy link
Contributor

@LeePorte reports that the issue is still present with middleman 4.4.3 however.

@NickWalt01
Copy link
Author

Hello @richardTowers thanks for having a look at this.

Your summary make perfect sense, I do remember the load_file to unsafe_load_file change causing an issue here.

The repo is using Middleman 4.4.2 at the moment, Gemfile.lock, so if it does get fixed in that dependency I can bump the version but we do have some other dependency issues in the Gemfile. The repo is using govuk_tech_docs (3.2.1). Due to these issues we paused updating the repo.

I do not have my branch with the fix anymore. Is it likely a fix will be applied?

@richardTowers
Copy link
Contributor

It's at least on our radar now, I can't promise we'll get to it immediately though.

@richardTowers
Copy link
Contributor

I've raised middleman/middleman#2614 with middleman. We should still consider supporting strings, irrespective of whether they decide to fix this.

@lfdebrux
Copy link
Member

Ah, so has this broken building docs completely now?

@lfdebrux
Copy link
Member

You could still make a strong argument that this gem should support both strings and dates, since middleman's docs clearly suggest that weird stringy date formats are permitted.

Could you point me towards where this is documented?

@NickWalt01
Copy link
Author

Hi @lfdebrux my notes on the topic are here. I am not sure where the documentation is for Middleman. There is this page that states: "If you want, you can specify a full date and time as a date entry in the front matter, to help with ordering multiple posts from the same day. " and has the date as date: 2011/10/18

@lfdebrux
Copy link
Member

Hi @lfdebrux my notes on the topic are here. I am not sure where the documentation is for Middleman. There is this page that states: "If you want, you can specify a full date and time as a date entry in the front matter, to help with ordering multiple posts from the same day. " and has the date as date: 2011/10/18

Cool, thanks for the added context. Apologies, when I first looked at this ticket I thought it was solely a feature request, I didn't realise that the issue was preventing the docs from building the site even if the date was formatted correctly.

My main concern in allowing a string for the last_reviewed_on field is that it might mean we get the whole issue with ambiguously written dates; Ruby's Date.parse seems a little too flexible [1], it's not clear to me whether the result of parsing say 04/02/2020 is guaranteed to always be 4th February 2020, or if it is locale dependent. I also noticed that it says that 'if string does not specify a valid date, the result is unpredictable', which is worrying!

I think it might be safer to use Date.strptime(string, '%Y-%m-%d') instead, restricting the kinds of dates accepted by last_reviewed_date; does that sound reasonable @NickWalt01 @richardTowers?

@NickWalt01
Copy link
Author

Hi @lfdebrux

Date.strptime("2022-08-22", '%Y-%m-%d') is accepted as a valid date, but Date.strptime("2022/08/22", '%Y-%m-%d') is not valid, so for me yes that is acceptable and I agree that it is better to restrict the date format that is used in the doc files to YYYY-MM-DD

@richardTowers
Copy link
Contributor

richardTowers commented Jan 23, 2023

I'd be fine with restricting dates to '%Y-%m-%d', but I think there's a tricky interaction here between ruby / yaml / middleman / tech docs gem.

The issue is that if you put a date in your frontmatter like:

---
date: 1970-01-01
---

Middleman will parse this YAML using YAML.load, which will now error in ruby >=3.1 because of the switch to safe_load. That's a bug in middleman in my opinion. Users of middleman that don't use the tech docs gem can work around this by using any format which ruby doesn't interpret as a date, but the tech docs gem actually wants a date so our users are between a rock and a hard place.

I don't think the tech docs gem gets involved early enough in the loading of a template to fiddle with the YAML frontmatter before it's parsed (but maybe it does?).

I guess we could also consider monkey patching middleman while we wait for a release... 🙈

@NickWalt01
Copy link
Author

Looking at the Middleman commit history and the last tag release date, I can't see a new release soon, happy to wait for a new one as the docs work with the older version of Ruby, but 2.7 will be phased out soon:

Ruby 2.7
status: security maintenance
release date: 2019-12-25

Ruby 2.6
status: eol
release date: 2018-12-25
EOL date: 2022-04-12

@lfdebrux
Copy link
Member

lfdebrux commented Jan 27, 2023

Right, thanks for spelling it out for me @richardTowers. How frustrating.

@richardTowers
Copy link
Contributor

There's now a release of Middleman with support for Ruby 3.1 - v4.5.0. In theory, this bug should be fixed on middleman 4.5.0, but I haven't tested it yet.

@NickWalt01
Copy link
Author

Hi @richardTowers thanks for the update, nice, we will give it a try with the new versions

@stevenjmesser
Copy link

There's now a release of Middleman with support for Ruby 3.1 - v4.5.0. In theory, this bug should be fixed on middleman 4.5.0, but I haven't tested it yet.

I ran bundle update on design-system-team-docs and it seems to have fixed the problem using YYYY-MM-DD. I had to lock the contracts gem at 0.16.1 because 0.17 errored, but all seems present and correct. Relevant changes in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants