Change all URLs to new CDN #7

Open
Mauville opened this issue Aug 14, 2024 · 2 comments

Comments

@Mauville
Owner

MedPix has moved all of its content to a CDN.

Old links looked like
https://medpix.nlm.nih.gov/images/full/synpic52419.jpg

Now they look like
https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full/synpic17159.jpg

The dataset links need to be updated. A simple rename appears to be enough, but if the CDN keeps changing, this could become a recurring problem.
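
The rename could be sketched roughly as follows, assuming every old link lives on `medpix.nlm.nih.gov` and that only the host and path prefix changed while the file name stayed the same. `CDN_BASE`, `OLD_HOST`, and `to_cdn_url` are hypothetical names used for illustration, not part of the existing scraper.

```python
from urllib.parse import urlparse

# Assumption: current CDN prefix, copied from the example link in this issue.
CDN_BASE = "https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full/"
# Assumption: all old dataset links point at this host.
OLD_HOST = "medpix.nlm.nih.gov"


def to_cdn_url(url: str, cdn_base: str = CDN_BASE) -> str:
    """Rewrite an old MedPix image link to the CDN; leave other links untouched."""
    parsed = urlparse(url)
    if parsed.netloc != OLD_HOST:
        return url
    # Keep only the file name, e.g. "synpic52419.jpg".
    filename = parsed.path.rsplit("/", 1)[-1]
    return cdn_base + filename
```

Keeping the prefix in a single constant means that if the CDN moves again, only `CDN_BASE` would need to change.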

A simple fix for the scraper is to add the following lines:

import urllib.request

# Rebuild the URL against the CDN, keeping only the original file name.
filename = url.split("/")[-1]
url = f"https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full/{filename}"
urllib.request.urlretrieve(url, f"/content/drive/Shareddrives/DeepLearning/data/output/{filename}")
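
If the output folder or the CDN prefix differs on another machine, the same fix could be wrapped in a small helper. This is only a sketch: `download_image`, its parameters, and the injectable `retrieve` argument are hypothetical and not taken from the actual scraper.

```python
import urllib.request
from pathlib import Path

# Assumption: current CDN prefix, copied from the example link in this issue.
CDN_BASE = "https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full/"


def download_image(url, output_dir, retrieve=urllib.request.urlretrieve, cdn_base=CDN_BASE):
    """Download the image named in `url` from the CDN into `output_dir`."""
    filename = url.split("/")[-1]
    cdn_url = f"{cdn_base}{filename}"
    destination = Path(output_dir) / filename
    # Create the output folder if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    retrieve(cdn_url, str(destination))
    return destination
```

Passing `retrieve` as an argument makes it possible to try the URL rewriting without touching the network.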

@Qasim-Latrobe

Thanks for the prompt response and guidance. Yes, I am able to download the MedPix dataset 🙂

@SID-6921

How can I get the dataset? Please help me.
