This repository has been archived by the owner on Sep 28, 2020. It is now read-only.

YoutubeExtractor.YoutubeParseException: Could not parse the Youtube page #368

Open
Bazmoc opened this issue Nov 9, 2019 · 15 comments

Comments

@Bazmoc

Bazmoc commented Nov 9, 2019

Tested on 3 links; all had the same error.

@mwindischbauer91

me too ...

@tmontney

tmontney commented Dec 8, 2019

Same issue:

at System.Text.RegularExpressions.Match.Result(String replacement)
at YoutubeExtractor.DownloadUrlResolver.GetHtml5PlayerVersion(JObject json)
at YoutubeExtractor.DownloadUrlResolver.GetDownloadUrls(String videoUrl, Boolean decryptSignature)
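
One plausible reading of this trace: `Match.Result` throws `NotSupportedException` when it is called on a failed match, which suggests the regex in `GetHtml5PlayerVersion` no longer matches the player script path. The sketch below reproduces that behaviour in isolation. The pattern and the sample paths are assumptions made up for illustration, not values taken from the library or from YouTube.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchResultDemo
{
    // Hypothetical pattern: expects "player-<version>.js".
    private static readonly Regex PlayerPattern = new Regex(@"player-(.+?)\.js");

    // Returns the captured version, or null when the pattern does not match.
    public static string TryGetVersion(string scriptPath)
    {
        Match match = PlayerPattern.Match(scriptPath);
        return match.Success ? match.Groups[1].Value : null;
    }

    // Mirrors the failing call: Result() on a failed match throws.
    public static bool ResultThrows(string scriptPath)
    {
        try
        {
            PlayerPattern.Match(scriptPath).Result("$1");
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

Checking `Match.Success` before reading the capture turns the crash into a case the caller can handle, for example by reporting which path failed to match.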

@noammaoz

noammaoz commented Jan 1, 2020

I found that the JSON no longer contains the args url_encoded_fmt_stream_map and adaptive_fmts, so the functions GetStreamMap and GetAdaptiveStreamMap fail to provide the URLs.

Any idea how to solve this?

@noammaoz

noammaoz commented Jan 2, 2020

I tried changing GetStreamMap from
json["args"]["url_encoded_fmt_stream_map"];
to
json["args"]["player_response"];
and then navigating to ["streamingData"]["formats"].
I did the same in GetAdaptiveStreamMap: json["args"]["player_response"]; -> ["streamingData"]["adaptiveFormats"];
The function ExtractDownloadUrls still fails because the structure has changed.
Any idea how to solve this?

@noammaoz

noammaoz commented Jan 5, 2020 via email

@rizksobhi

rizksobhi commented Jan 5, 2020

I was able to get around this by doing the following:
1- Replace ExtractDownloadUrls with the following:

        private static IEnumerable<ExtractionInfo> ExtractDownloadUrls(JObject json)
        {
            var info = new List<ExtractionInfo>();

            var formats = GetStreamMap(json);
            var adaptiveFormats = GetAdaptiveStreamMap(json);

            ExtractInfo(info, formats);
            ExtractInfo(info, adaptiveFormats);

            return info;
        }

2- Add the following helper function:

        private static void ExtractInfo(List<ExtractionInfo> info, JArray items)
        {
            if (items != null)
            {
                foreach (var item in items)
                {
                    // Entries without a plain "url" are skipped; new Uri(null) would throw.
                    var url = item["url"]?.ToString();
                    if (url == null)
                    {
                        continue;
                    }

                    // Only unencrypted streams are handled here, so no decryption is requested.
                    info.Add(new ExtractionInfo { RequiresDecryption = false, Uri = new Uri(url) });
                }
            }
        }

3- Replace GetAdaptiveStreamMap with the following:

        private static JArray GetAdaptiveStreamMap(JObject json)
        {
            JArray adaptiveFormat = null;
            JToken streamMap = json["args"]["player_response"];

            string streamMapString = streamMap == null ? null : streamMap.ToString();

            if (streamMapString != null)
            {
                dynamic playerResponse = JsonConvert.DeserializeObject(streamMapString);
                adaptiveFormat = playerResponse?.streamingData?.adaptiveFormats;
            }

            return adaptiveFormat;
        }

4- Replace GetStreamMap with the following:

        private static JArray GetStreamMap(JObject json)
        {
            JToken streamMap = json["args"]["player_response"];

            string streamMapString = streamMap == null ? null : streamMap.ToString();

            if (streamMapString == null || streamMapString.Contains("been+removed"))
            {
                throw new Exception("Video is removed or has an age restriction.");
            }

            dynamic playerResponse = JsonConvert.DeserializeObject(streamMapString);

            return playerResponse.streamingData?.formats;
        }

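5- Replace GetHtml5PlayerVersion with the following:

        private static string GetHtml5PlayerVersion(JObject json)
        {
            // Accept both "player-<version>.js" and "player_<version>.js"; the dot is escaped.
            var regex = new Regex(@"player[-_](.+?)\.js");

            string js = json["assets"]["js"].ToString();

            // Match.Result throws on a failed match, so check Success first.
            Match match = regex.Match(js);
            if (!match.Success)
            {
                throw new Exception("Could not find the player version in: " + js);
            }

            return match.Result("$1");
        }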

My fix works only for unencrypted content. We still need to fix the isEncrypted attribute.
Cheers

@noammaoz

noammaoz commented Jan 7, 2020

Thanks a lot for the response and the solution.
The URL is not always in item["url"]; it should be:
var url = item["url"]?.ToString();
if (url == null)
{
    url = item["cipher"]?.ToString();
}

DecipherWithVersion in Decipherer.cs also needs to change
from string jsUrl = string.Format("http://s.ytimg.com/yts/jsbin/player{0}.js", cipherVersion);
to string jsUrl = string.Format("http://s.ytimg.com/yts/jsbin/player_{0}.js", cipherVersion);

I still have a problem downloading, and I think it is related to the encrypted signature. I'm getting a 403 error.
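
Some context that may explain the 403: when a format entry carries `cipher` instead of `url`, the value appears to be a URL-encoded query string (with fields such as `url`, `s` and `sp`) rather than a URL. It therefore cannot be passed to `new Uri(...)` as-is, and the signature in `s` still has to be deciphered before the stream URL is usable. A minimal sketch of telling the two cases apart follows. The field names reflect the format as observed around the time of this thread and may have changed. `StreamEntry`, `CipherParser`, the `"signature"` default and the sample values are hypothetical, not part of YoutubeExtractor.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

// Hypothetical container for one parsed format entry.
public sealed class StreamEntry
{
    public Uri Uri { get; set; }
    // Null when the entry had a plain URL and needs no decryption.
    public string EncryptedSignature { get; set; }
    // Name of the query parameter that should carry the deciphered signature.
    public string SignatureParameter { get; set; }
    public bool RequiresDecryption => EncryptedSignature != null;
}

public static class CipherParser
{
    // Pass item["url"] and item["cipher"] as strings; either may be null.
    public static StreamEntry Parse(string url, string cipher)
    {
        if (!string.IsNullOrEmpty(url))
        {
            return new StreamEntry { Uri = new Uri(url) };
        }
        if (string.IsNullOrEmpty(cipher))
        {
            return null; // nothing usable in this entry
        }

        // ParseQueryString splits on '&' and URL-decodes each value.
        NameValueCollection fields = HttpUtility.ParseQueryString(cipher);
        string innerUrl = fields["url"];
        if (string.IsNullOrEmpty(innerUrl))
        {
            return null;
        }

        return new StreamEntry
        {
            Uri = new Uri(innerUrl),
            EncryptedSignature = fields["s"],
            SignatureParameter = fields["sp"] ?? "signature" // assumed default
        };
    }
}
```

On .NET Framework, `HttpUtility` requires a reference to System.Web. An entry with `RequiresDecryption` set would still need to go through the decipherer before the download request is made.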

@thehighboy

What's the link with the issue, or are you having problems with all links in general?

@noammaoz

noammaoz commented Jan 8, 2020

for example: https://www.youtube.com/watch?v=DkeiKbqa02g

@thehighboy

> for example: https://www.youtube.com/watch?v=DkeiKbqa02g

I can download it with no problem.
[Screenshot: Annotation 2020-01-08 213157]

I will download this library and see if I can fix it for you.

@noammaoz

noammaoz commented Jan 9, 2020

Which library are you using?

@thehighboy

> Which library are you using?

I'm using my own code, but when I started my project I used the decipher code from this one. I am working on this one as I type and should be able to upload it tomorrow.

@LinPolly

LinPolly commented Jan 9, 2020

LinPolly/YoutubeExtractor

I updated it for YouTube's structure change. You can try it.

@thehighboy

> LinPolly/YoutubeExtractor
>
> I updated it for YouTube's structure change. You can try it.

I took a quick look at it, and it did download noammaoz's link. I see you never changed the title retrieval; does it ever return a title that way? Also, if you get the cipher once, you can reuse it for the lifetime of the app. Thank you. Now I will continue with my own app, as it needs love too ;)
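
The point about reusing the cipher could be sketched as a small cache keyed by player version, so the player script is fetched and parsed once per version rather than once per video. Everything here is hypothetical: `CipherCache`, the `loadOperations` delegate, and the choice to store the operations as a string are illustrative assumptions, not part of YoutubeExtractor or the fork.

```csharp
using System;
using System.Collections.Concurrent;

public static class CipherCache
{
    // Maps a player version to its decipher operations for the lifetime of the process.
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    // loadOperations runs only on a cache miss. Under concurrent misses it may
    // run more than once, but only one result is kept.
    public static string GetOperations(string playerVersion, Func<string, string> loadOperations)
    {
        return Cache.GetOrAdd(playerVersion, loadOperations);
    }
}
```

Keying by version rather than caching a single value means a new player script is picked up without restarting the app.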

@noammaoz

noammaoz commented Jan 9, 2020

> LinPolly/YoutubeExtractor

This library is working fine. Thanks a lot.

7 participants