[Discussion] Highlight Keywords and Description more on resource details page? #147

Closed
andrewsu opened this issue Dec 20, 2022 · 4 comments

@andrewsu
Contributor

andrewsu commented Dec 20, 2022

For discussion/consideration: right now, the resource details page has sections for Overview, Keywords, Description, Provenance, Funding, and Metadata. Would it be worth pulling the Keywords and Description out to be a more integrated part of the page (under the resource title, perhaps)? I think this would surface information that is useful to the user at first glance.

@gtsueng
Collaborator

gtsueng commented Jul 31, 2024

Description coverage is generally very high (>90%), but I'm not sure about the coverage of keywords. @leandrocollares, can you check the coverage of the keywords field currently on production? That will tell us how much impact this UI improvement might have.
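For reference, a coverage check like the one requested above could be sketched as follows. This is only an illustration: it assumes the records are available as a list of plain dicts, and the function names `is_filled` and `field_coverage` are hypothetical, not part of any existing NDE tooling.

```python
from typing import Any, Iterable


def is_filled(value: Any) -> bool:
    """Treat None, blank strings, and empty containers as missing."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) > 0
    return True


def field_coverage(records: Iterable[dict], field: str) -> float:
    """Fraction of records in which `field` is present and non-empty."""
    total = 0
    filled = 0
    for record in records:
        total += 1
        if is_filled(record.get(field)):
            filled += 1
    return filled / total if total else 0.0


# Hypothetical sample records, for illustration only.
sample = [
    {"description": "RNA-seq of infected cells", "keywords": ["RNA-seq"]},
    {"description": "Serology panel", "keywords": []},
    {"description": "   "},
    {"description": "Cohort study metadata"},
]
print(f"description: {field_coverage(sample, 'description'):.0%}")
print(f"keywords:    {field_coverage(sample, 'keywords'):.0%}")
```

Counting blank strings and empty lists as missing matters here, since a field that is present but empty would not help the UI.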

@gtsueng
Collaborator

gtsueng commented Aug 7, 2024

Keyword coverage is here: https://docs.google.com/presentation/d/173iq49N4FRulkf7R-s-L3defEDY2hjEzxnQvBQKo7Kk/edit#slide=id.g2f10e6bde1c_0_158

@DylanWelzel to provide a dump of descriptions for @leandrocollares to analyze. Description length will play an important role in determining whether or not ChatGPT can be used to generate a short description of the Dataset. (It tends to hallucinate values for topicCategory and measurementTechnique when the name + description of a dataset is too short.)

@leandrocollares
Contributor

@gtsueng
Collaborator

gtsueng commented Aug 12, 2024

Other considerations:

  • Length limits at which ChatGPT starts hallucinating results for measurementTechnique: https://docs.google.com/spreadsheets/d/1crfLDl5_c7jZ47JefhOCf6tx_cM-u8AkK6rsXttM6s8/edit?gid=1639648736#gid=1639648736
  • The optimal title length for SEO is ~70-80 characters
  • The optimal description length for SEO is ~140-160 characters
  • Descriptions under ~400 characters appear to cover around the same number of records as descriptions under ~50 words (~1 million records fall within this character/word length range)
  • The description shown in the search results/card view is currently limited to ~180 characters. Longer descriptions can be displayed by clicking on the interactive element.
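To make the thresholds above concrete, here is a minimal sketch of how a description could be profiled against them. The constants are the approximate figures from the list, and the names `length_profile` and `card_preview` are hypothetical; this is not the portal's actual truncation logic.

```python
# Approximate thresholds taken from the considerations above (assumptions).
SEO_DESCRIPTION_MAX = 160   # upper end of the ~140-160 character SEO range
CARD_PREVIEW_MAX = 180      # ~180 characters shown in the card view
SHORT_CHAR_MAX = 400        # the "<400 characters" range
SHORT_WORD_MAX = 50         # the "<50 words" range


def normalize(description: str) -> str:
    """Collapse runs of whitespace so lengths are comparable."""
    return " ".join(description.split())


def length_profile(description: str) -> dict:
    """Report how a description compares with each threshold."""
    text = normalize(description)
    n_chars = len(text)
    n_words = len(text.split())
    return {
        "chars": n_chars,
        "words": n_words,
        "fits_seo": n_chars <= SEO_DESCRIPTION_MAX,
        "fits_card": n_chars <= CARD_PREVIEW_MAX,
        "under_400_chars": n_chars < SHORT_CHAR_MAX,
        "under_50_words": n_words < SHORT_WORD_MAX,
    }


def card_preview(description: str, limit: int = CARD_PREVIEW_MAX) -> str:
    """Cut at the last word boundary within `limit` and mark the cut."""
    text = normalize(description)
    if len(text) <= limit:
        return text
    cut = text[:limit].rsplit(" ", 1)[0]
    return cut + "..."


if __name__ == "__main__":
    example = "Whole-genome sequencing of clinical isolates. " * 6
    print(length_profile(example))
    print(card_preview(example))
```

Running `length_profile` over a dump of descriptions and counting the flags would show how many records already fit the card view and how many would need a generated short description.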

Since keyword coverage is currently so low and messy, I'm going to split this issue in two. We can revisit the placement of keywords in the future, once that field has been standardized/augmented. In the meantime, we can move forward with generating improved descriptions (NIAID-Data-Ecosystem/nde-crawlers#159).

This issue has been split into the following issues:

I am marking this issue as pending close out and will close it after a week. If you have additional comments on the prominence of keywords or descriptions, please add them to the appropriate ticket, rather than this one.

@gtsueng gtsueng closed this as completed Aug 26, 2024
5 participants