Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change to youtube-dl dependency breaks stand-alone module usage #110

Open
ktws opened this issue Jul 28, 2015 · 27 comments
Open

Change to youtube-dl dependency breaks stand-alone module usage #110

ktws opened this issue Jul 28, 2015 · 27 comments

Comments

@ktws
Copy link

ktws commented Jul 28, 2015

Documentation has previously stated (and still does at: http://np1.github.io/pafy/) that Pafy is a small, standalone, single importable module file (pafy.py). The new youtube-dl dependency breaks all software currently using pafy.py as a stand-alone module.

In addition youtube-dl is a resource hog, both in memory usage and start-up time, and not suitable for use on lower end machines. A good example would be the original Raspberry Pi where youtube-dl takes up to 10 seconds to load and "non youtube-dl pafy" takes a couple of seconds or less, depending on network speed.

I'm sure there's a good reason for this change, however i'd still like to urge you to think again and keep Pafy a fast, lightweight youtube utility by reverting the youtube-dl dependency.

@ids1024
Copy link
Contributor

ids1024 commented Jul 28, 2015

You make a good point, and the dependency on youtube-dl is still open to debate. I suppose I merged the code in too quickly, but it can still be reverted if it is determined that the disadvantages outweigh the advantages. I did not really consider low end machines, since youtube-dl runs just fine on my computer with a quad-core Ivy Bridge processor and 16 GB of ram.

Luckily I have a first generation Raspberry Pi (model B), so I can test it.

Using time one my raspberry pi, ytdl -i seems to take 33 seconds with the youtube-dl version, and 8 with the old one. This is roughly a four-fold difference, so it is definitely significant.

On the other side, here are some reasons for the youtube-dl dependency:

  • Pafy's code is much simpler now. Not only is it shorter, but the most incomprehensible code is now removed.
  • Effort does not need to be duplicated. If something new is implemented or youtube breaks the current api, youtube-dl can just implement the changes with no change or little change to pafy.
  • Youtube-dl has numerous people involved. At present, I am the only one maintaining pafy, although I did not originally develop it and various people have made contributions.
  • I am reluctant to admit this, but I do not really understand much of the code that was removed, which is not good for trying to find any bugs in it. The code has to do some really crazy things like parsing JavaScript. Youtube-dl probably also implements this better, although that could be debated.
  • The youtube-dl-based code fixes at least one issue, support for live streams. I noticed this a while after the change when someone asked about live streams on irc. It may have fixed other issues as well.

So, another option would be to support both youtube-dl and the current code base as backends for handling streams, which would perhaps not be too difficult. This is obviously not preferable, as the latter would likely be as well supported. I am open to any suggestions, and both reverting the change and keeping it are currently options.

It could also be possible to copy some of youtube-dl's code into pafy. I will look into this a little. Obviously there is no need for all the code relating to sources other than youtube, and the code providing the command line is also not needed. Perhaps taking one or two files and integrating them into pafy could provide better performance. I am not sure quite what aspect of youtube-dl makes it noticeably slower.

@ktws
Copy link
Author

ktws commented Jul 28, 2015

Yeh, I don't understand a lot of the code either and i've spent hours trying to dissect & digest! I think realistically, long term, you're right to use youtube-dl because of the developer support base it has. It's slower sure, but I guess it'll be more stable long term.

It might be an idea to move pafy-youtube-dl to a different file and keep pafy-original as "pafy.py" though. That way you can add a warning in pafy-original that it's no longer supported and developers should use the new version instead. A few months down the line (or when it breaks) pafy-original can be removed altogether. Something like that anyway, you get the idea.

Might be worth getting in touch with a youtube-dl developer to see if there is a more efficient way of fetching youtube only data for pafy. Worth a shot anyway.

@lol768
Copy link
Contributor

lol768 commented Jul 30, 2015

I am not sure quite what aspect of youtube-dl makes it noticeably slower.

Probably the re-encode that takes place after a video download? I could see that being slow with the Broadcom SoC on the rPi.

@ids1024
Copy link
Contributor

ids1024 commented Jul 30, 2015

@lol768 that may be slow, but it is not the issue here. I have not changed the actual downloading code, and just a metadata retrieval is slow as well.

I added a commit that makes it a bit faster at least.

@ids1024
Copy link
Contributor

ids1024 commented Jul 31, 2015

On the bright side, it seems a portion of the delay on the raspberry pi is just importing youtube_dl, so multiple sequential queries within the same program will not be quite as bad as the numbers above suggest.

Perhaps it would be best to make the new youtube-dl version a fork (pafy-ng?), while the old version could be maintained as is to a limited extent, with bug-fixes but probably no new features. The new version could drop python 2 support and make other changes, possibly breaking api compatibility is some minor ways if that is advantageous. (It would make sense to somewhat change the output of get_playlist() in porting it to gdata3, which would probably not break many programs, if any).

Or perhaps instead of pafy-ng, it should be pafy 2.0. There could be a 0.3.76 etc. as minor fixes to the current youtube-dl-less version as well. This model of creating a new version with breaking changes while maintaining (at least somewhat) the older version is not uncommon, for example with python 2 and 3. This seems like the best option.

@ghost
Copy link

ghost commented Aug 23, 2015

@ktws @ids1024 Hi, can anyone try this lite version of youtube-dl?, removed support for other sites than youtube.com. It should be faster on Rpi...

@ids1024
Copy link
Contributor

ids1024 commented Aug 23, 2015

@pulpe That appears to be a binary file, and I don't like running binaries from random websites. This is something you created yourself? I see you have a repository at https://github.com/pulpe/youtube-dl-lite, but that seems to just have the commits from youtube-dl. Could you push your changes to that repo so I can test it without downloading a binary file?

It could also help to use pypy instead of cpython. The pypy website has downloads for rasbian, but I haven't tested it on my pi since it doesn't seem seem to be packaged by archlinux-arm.

@ghost
Copy link

ghost commented Aug 23, 2015

It's a zipped python file, is executable. Commit is now pushed to repo...

@ids1024
Copy link
Contributor

ids1024 commented Aug 23, 2015

@pulpe Oh, that's not so bad. file youtube-dl-lite gave me youtube-dl-lite: a /usr/bin/env python script executable (binary data), so I though it was precompiled python bytecode or something like that. It's good to have it in git though, for easy diffs between it and standard youtube-dl and whatnot.

Here are the timings I get running time ytdl -i 4BdVs0K6094 on my pi.:

Last pafy release: 4.392 seconds
Current pafy git with youtube-dl: 33.903 seconds
Current pafy git with youtube-dl-lite: 14.423 seconds

So it is indeed substantially faster, though not quite as fast as one might like.

@ghost
Copy link

ghost commented Aug 23, 2015

Looks like complete rewrite is needed.

@ids1024
Copy link
Contributor

ids1024 commented Aug 28, 2015

It would be good to have more input on this change. I still think it is better to use youtube-dl as a backend, but it would be good to hear how many people find this a problem, exactly why (slow on slow hardware, prefer to avoid additional dependencies, anything I might not have considered) as well as any other ideas.

This is a somewhat major change, so, as I believe I mentioned earlier, it should probably has a bit of a version bump, so either 0.4 or 1.0. I'm currently leaning more toward just 0.4.

@ids1024
Copy link
Contributor

ids1024 commented Sep 26, 2015

I'm thinking about releasing a new version of pafy soon, with the youtube-dl changes. If anyone has something to say about this change before the release, speak up now.

@zgrimshell
Copy link
Contributor

hey @ids1024 well, it's big change indeed as pafy looses probably one of the most attracting points: being a stand-alone python module. On the other hand, it's up to you and active contributors to decide is a change needed for making future development and usage of pafy and mps-youtube easier that way. If it doesn't make it a lot easier, then I would lean against adding youtube-dl as dependency - but as I said that call should be on you and current active contributors.

Also 1.0 vs 0.4 - for sure 0.4 as 1.0 should show some stable API or a release on which people can rely (dependency on youtube-dl makes a huge change for people that wanted a stand-alone module for their software, so unless youtube-dl is there to stick and proves stable/good enough, then it should be 0,4).

Just my 2c

P.S. ($LIFE is taking too much of my time so the result is my huge inactivity but I read mails from time to time)

@ids1024
Copy link
Contributor

ids1024 commented Sep 26, 2015

This does make pafy much simpler and fixes several bugs, but there certainly are some arguments against the change.

As for the version number, I think it would be good to have a 1.0 release at some point, but that can be put off until later. So make it 0.4.

So, I think what would make sense is to release pafy 0.4 with the youtube-dl changes, but also release a 0.3.76 with the other minor bug fixes but with the old stream code instead of youtube-dl. Bug fixes (and features, if straightforward) can be backported 0.3.x for at least some time.

I will do that if no one voices further concerns or suggests something better.

@JKatzwinkel
Copy link
Member

I am not happy with relying on youtube-dl of the speed and resources issue and simply the concern about having a dependency. Part of the beauty of pafy is that it's standalone, as zgrimshell points out, and as I tell people while promoting mpsyt. I'd prefer pafy to not require youtube-dl.

On the other hand, I've also had some difficulties comprehending the pafy code to some extent, which kind of kept me from get involved in maintenance. So I really get it when you feel like maintaining pafy will be easier with the youtube-dl changes.

I think I can live with the pafy-ng/pafy_3.7.x model. Hopefully I will be able to keep up with the development of both versions.

@ids1024
Copy link
Contributor

ids1024 commented Sep 26, 2015

It does seem that some people oppose this change, though I personally don't see that it matters much if pafy is standalone or has an added dependency. I must admit that the speed difference could be significant for some uses on slow hardware though.

An alternative to having separate versions would be to have the code support both backends. So, youtube-dl would be the default but the old stream code could be used if youtube-dl is not installed or if the default is overridden by an argument to the Pafy class. I think this would not be too difficult. Any thoughts?

@ids1024
Copy link
Contributor

ids1024 commented Oct 4, 2015

I have released 0.3.76 and 0.4.0 today, the first with and the second without youtube-dl changes. This does not have to be a permanent change, and more discussion is encouraged.

@JKatzwinkel
Copy link
Member

You probably mean the other way around: pafy 0.4.0 being the one with the youtube-dl dependency, and 0.3.76 from 0.3.x branch the one without it.
Anyway, thanks for putting thought in and taking care of this. This is probably an ok intermediate solution and I hope further discussion will come up to help sort things out.

@scolby33
Copy link

scolby33 commented Oct 4, 2015

I am having an issue with the 0.4.0 release of pafy. Here is an example of the issue using the example code from the readme:

>>> import pafy
>>> v = pafy.new("https://www.youtube.com/watch?v=bMt47wvK6u0")
>>> v.title
'Richard Jones: Introduction to game programming - PyCon 2014'
*snip*
>>> v.streams
[]
>>> v.getbest()
*returns nothing*

Python 3.5.0 (compiled from Homebrew) running in a clean virtualenv
Mac OS X 10.11

$ pip freeze
pafy==0.4.0
wheel==0.24.0
youtube-dl==2015.9.28

This sequence of actions works perfectly with pafy 0.3.76:

*snip*
>>> v.streams
[normal:mp4@1280x720, normal:webm@640x360, normal:mp4@640x360, normal:flv@320x240, normal:3gp@320x240, normal:3gp@176x144]
>>> v.getbest()
normal:mp4@1280x720

Any ideas?

@ids1024
Copy link
Contributor

ids1024 commented Oct 4, 2015

@scolby33 I found the problem and fixed it in commit 12be97e.

I'll release a version 0.4.1 by the end of the day with this fix, since it is a fairly major issue.

@ids1024
Copy link
Contributor

ids1024 commented Dec 21, 2015

Commit c7fceb8 on the 0.3.x branch pulls in a single file from youtube_dl, which is used for javascript parsing. This fixes the recent errors, and makes the code simpler and more maintainable. If something breaks, that file can just be replaced with the changes in youtube_dl, if it is that code that was broken (as it seems to generally be). I will release a new 0.3.x version soon.

There are still advantages to the youtube_dl dependency, so the 0.4.x versions will still have it for now, but this is still open for discussion.

@ids1024
Copy link
Contributor

ids1024 commented Jan 2, 2016

I think it may be good to merge the 0.4.x and 0.3.x versions, and allow changing the backend at run time. This is how I think it would work:

  • If youtube-dl is installed, use that as a backend, with the built in (current 0.3.x) backend as a fallback
  • Allow changing this by setting an environmental variable (something like export PAFY_BACKEND=internal).

This should be fairly simple to implement. Does this seem like a reasonable way? Is there some other way the backend should be set?

@JKatzwinkel
Copy link
Member

Why use env vars instead of config file?

@ids1024
Copy link
Contributor

ids1024 commented Jan 8, 2016

I'm not really sure whether or not an environmental variable would be better. An environmental variable seems like a somewhat simpler method, and can be set either permanently (.bashrc/.zshrc, /etc/profile.d, etc.) or on a per call basis (but then, perhaps that does not help).

@madalu
Copy link

madalu commented Jan 9, 2016

I would kindly ask that the dependency on youtube-dl be made optional. With the youtube-dl dependency, mps-youtube stalls for 20-30 seconds between songs on my pogoplug running arch linux, making it unworkable. Previously there was just a few seconds pause between songs.

@ids1024
Copy link
Contributor

ids1024 commented Jan 10, 2016

@madalu You can use version 0.3.82, which has the internal stream code instead of using youtube_dl but is otherwise the same. I probably will merge the 0.3.x version and 0.4.x as stated above, as soon as I get around to it. It shouldn't be too hard.

@ids1024
Copy link
Contributor

ids1024 commented Jan 22, 2016

I have created a branch with backend switching, using an environmental variable as stated above:
https://github.com/mps-youtube/pafy/tree/backend_switching

It probably has some bugs, since fairly substantial changes were required. Once it is working properly, it can be merged into develop and then released as version 0.5.0.

0-wiz-0 pushed a commit to NetBSD/pkgsrc-wip that referenced this issue Aug 27, 2017
pkgsrc changes:
- Add support for ALTERNATIVES
- Rework the appending of ${PYVERSSUFFIX} to bin/ytdl. Instead of using
  `post-extract' and a patch for setup.py directly do that via `post-install'
  target, similarly to other Python multi-packages (this is simpler and less
  intrusive).

Changes:
15 January 2017
Version 0.5.3.1

[Bugfix] - Fix issue when stdout.encoding is None (#165)

-------------------------------------------------------------------------------

13 January 2017
Version 0.5.3

[Bugfix] - Handle occassional issue reading category (@regisb)
[Bugfix] - Encode filename to system locale (#163)
[Update] - Support m.youtube.com urls (@HouseK) (#159)
[Update] - Support gaming.youtube.com urls (@Phosphorus-M) (#161)
[Update] - Fix running under Boost.Python (#162)

-------------------------------------------------------------------------------

18 August 2016
Version 0.5.2

[Bugfix] - Fix internal backend (#140)
[Update] - Support liked video playlist (#149)
[Update] - Add spaces to file name format (@ghost) (#145)

-------------------------------------------------------------------------------
31 May 2016
Version 0.5.1

[Bugfix] - Fix python 2 import (#134)
[Bugfix] - Fix two bugs in the internal backend (major bug remains unfixed)
[Update] - Remove deprecated signature argument

-------------------------------------------------------------------------------

28 February 2016
Version 0.5.0

[Feature] - Support both youtube-dl backend and internal backend (old 0.3.x
            branch. This makes youtube-dl an optional dependency, though it
            is more stable and thus recommended. There will be no more 0.3.x
            releases, as this eliminates the need for them.

            To use the internal backend even if youtube-dl is installed, set
            the environmental variable PAFY_BACKEND to "internal".
[Bugfix] - Fix UnicodeDecodeError(krishnawhite) (#129)
[Bugfix] - Fix quality property

-------------------------------------------------------------------------------

1 January 2016
Version 0.4.3

[Feature] - get_playlist2() command with new interface and support for
  playlists longer than 200 videos (ids1024)
[Bugfix] - Fix ZeroDivisionError (cpnielsen) (#125)

-------------------------------------------------------------------------------

11 November 2015
Version 0.4.2

[Bugfix] - Fix crash when snippet has no tags property
[Bugfix] - Fix issue with get_filesize on some streams
[Bugfix] - Fix bigthumb and bigthumbhd (#118)
[Feature] - Allow specifying youtube-dl arguments (#115 #120)

-------------------------------------------------------------------------------

4 October 2015
Version 0.4.1

[Bugfix] - Fix Pafy.streams

-------------------------------------------------------------------------------

4 October 2015
Version 0.4.0

[Major Update!] - Pull in youtube-dl as a dependency, replacing current
                  stream code [#109]

This is a major change, and will likely prove controversial, so version
0.3.76 has been released concurrently with all the changes not related to
youtube-dl. This makes the code simpler and easier to maintain while fixing
various bugs, but has some disadvantages.

If this is a problem for you, please use version 0.3.76 AND state your reasons
and suggestions at issue #110 on Github.
mps-youtube/pafy#110

-------------------------------------------------------------------------------

4 October 2015
Version 0.3.76

[Bugfix] - Fix bug in ytdl (#101 #102)
[Update] - Use better method for matching urls (#104)
[Bugfix] - Use .opus extension for opus audio (#108)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants