Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: it works, is ugly #30

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

alvarmaciel
Copy link
Collaborator

@alvarmaciel alvarmaciel commented Jun 16, 2023

Add the ability to download from any public github URL

@alvarmaciel alvarmaciel linked an issue Jun 16, 2023 that may be closed by this pull request
@alvarmaciel alvarmaciel marked this pull request as ready for review June 16, 2023 16:49
@github-actions
Copy link

github-actions bot commented Jun 16, 2023

Coverage

Coverage Report
FileStmtsMissCoverMissing
shbin.py1471589%65–66, 76, 97–98, 101, 139–141, 151, 216–217, 224–225, 256
TOTAL1471589% 

Tests Skipped Failures Errors Time
46 0 💤 0 ❌ 0 🔥 8.639s ⏱️

Co-authored-by: Martín Gaitán <[email protected]>
@alvarmaciel alvarmaciel requested a review from mgaitan June 16, 2023 20:03
Comment on lines +71 to +76
result = re.findall(pattern, url)
if result:
extracted_string = result[0]
return extracted_string
else:
raise DocoptExit(f"Ensure your path is from github repository. (error {e})")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here e in the message is not defined.

Also, it's considered more pythonic the EAFP approach

So, it should something like

Suggested change
result = re.findall(pattern, url)
if result:
extracted_string = result[0]
return extracted_string
else:
raise DocoptExit(f"Ensure your path is from github repository. (error {e})")
try:
return re.findall(pattern, url)[0]
except TypeError:
raise DocoptExit(f"Ensure the URL is from github repository")

Comment on lines +70 to +71
pattern = r"https://github.com/.*?/(?:tree|blob)/(.*)"
result = re.findall(pattern, url)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not use findall for a single potential match. also it would be a little better to compile the pattern at module level

PATTERN = re.compile(r"https://github.com/.*?/(?:tree|blob)/(.*)")

def get_path(url):
    try:
         return re.match(pattern, url).group(1)
    except ...     

if result:
full_name_repo = result[0][0]

is_from_my_repo = "https://" not in url_or_path or full_name_repo == repo.full_name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few things here.

  • the first condition should be not url_or_path.startswith("https://")

  • is it enough to assume that if it doesn't start with https:// then is from repo?

  • wording: "my_repo" . I'm the user but the repo could someone else and I just have permissions, right? like mgaitan downloading from Shiphero/pastebin. I would use is_from_the_repo or something like that

Comment on lines +125 to +129
if is_from_my_repo:
path = re.sub(rf"^https://github\.com/{repo.full_name}/(blob|tree)/{repo.default_branch}/", "", url_or_path)
else:
repo = get_repo(full_name_repo)
path = get_path(url_or_path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we are trying to parse with a regex, couldn't we generalize it an accept urls or user/repo/path forms for any repo and then check if it's the our not?

suppose shbin zauberzeug/nicegui/examples

Comment on lines +145 to +149
url = f"https://raw.githubusercontent.com/{repo.full_name}/{path}"
response = requests.get(url)
if response.status_code > 200:
raise Exception("There was a problem with your download, please check url")
content = response.content
Copy link
Collaborator

@mgaitan mgaitan Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we unify own or foreign contents with requests? the only difference it that you need to pass the token in the header.

In [16]: requests.get("https://raw.githubusercontent.com/Shiphero/pastebin/main/mgaitan/15e1VFq87UA.txt",
    ...:     headers={"Authorization": f"token {os.environ['SHBIN_GITHUB_TOKEN']}", "Accept": "application/vnd.github.v3.raw'"}
    ...: )
Out[16]: <Response [200]>

In [17]: Out[16].content[:144]
Out[17]: b"    \n>       mock_sns.push.assert_called_once_with(topic_arn, message)\nE       AssertionError: expected call not found.\nE       Expected: push('"

the good news is that passing a token to a public content doesn't affect, so this could be the only code

In [20]: requests.get("https://raw.githubusercontent.com/facebook/react/main/README.md",
    ...:     headers={"Authorization": f"token {os.environ['SHBIN_GITHUB_TOKEN']}", "Accept": "application/vnd.github.v3.raw'"}
    ...: )
Out[20]: <Response [200]>

Comment on lines +152 to +153
except Exception as e:
print(f"[red]x[/red] {e}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which other exception could be?

Comment on lines +370 to +372
with pytest.raises(Exception) as exc_info:
raise Exception("There was a problem with your download, please check url")
assert str(exc_info.value) == "There was a problem with your download, please check url"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has no sense Alvar. It's a self-fulfilling prophecy: "I will prove that the following code raise an exception " and the following code is... literally raise Exception().

Comment on lines +366 to +367
working_dir = tmp_path / "working_dir"
working_dir.mkdir()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need a subfolder. tmp_path is already a temporary Path instance.

requests_mock.get("https://raw.githubusercontent.com/another_awesome/repository/main/hello.md", status_code=400)
working_dir = tmp_path / "working_dir"
working_dir.mkdir()
os.chdir(working_dir)
Copy link
Collaborator

@mgaitan mgaitan Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a good practice to change the dir in this way. When needed, use monkeypatch.chdir so it's undo automatically on teardown (even if the test fails or errored). https://docs.pytest.org/en/7.1.x/reference/reference.html#pytest.MonkeyPatch.chdir

Comment on lines +357 to +358
os.chdir(working_dir)
main(["dl", "https://github.com/another_awesome/repository/blob/main/hello.md"])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would nice to have an option to define where to download the contents, like

shbin dl URL  [-o  path] 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

allow shbin dl to download from any accesible repo
2 participants