The parser doesn't work well on many sites, especially forums #1
Here are some sites the parser won't get any proxies from: http://nntime.com/
Hi, thanks. The old v1.3.0 BETA could parse sites such as these: they all require JS, and 1.3.0 shipped a headless Chrome with Cloudflare bypassing. Maybe I'll add that back in the next patches. As for the format: proxies are parsed with a simple regex (ip:port), but I'll add support for this in the next patch.
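The reply above says the parser uses "a simple regex (ip:port)". A minimal guess at what such a pattern looks like, and why it misses the pipe-separated rows reported in this issue (the pattern name `STRICT` and the exact regex are hypothetical; the real parser's source isn't shown in the thread):

```python
import re

# Hypothetical reconstruction of a strict "ip:port" matcher,
# as described by the maintainer (assumption, not the actual code).
STRICT = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}:\d{2,5}")

print(STRICT.findall("1.2.3.4:8080"))            # ['1.2.3.4:8080']
print(STRICT.findall("113.120.189.184 | 9999"))  # [] -- pipe-separated rows are missed
```

Because the pattern hard-codes the `:` separator, any site that lists the IP and port in separate columns (as nntime.com and the example rows below do) produces no matches at all.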
**Tbh this version is great, but now that you mention the old version could parse sites like this, it would be great if the current update could parse all sites, even proxies listed with the port but without a ":". Also, I don't see a download link for the previous versions. Can't wait to see the next update, thank you.**
Hello @assnctr, can you update the Proxy Parser to be able to scrape proxies listed as separate ip and port columns, like the sites I sent you above (http://nntime.com/)? I searched for v1.3.0 BETA and couldn't find a download link :(
Hello @assnctr,
I tried your proxy parser, and I can say it's the best proxy parser I've ever found.
But there's a problem: the parser doesn't scrape proxies from many sites.
If a site lists its proxies like this, the parser doesn't scrape them.
Example:
113.120.189.184 | 9999 | High anonymity | HTTP | Jining, Shandong (China Telecom) | 3 s | 2019-01-05 23:30:59
124.94.196.188 | 9999 | High anonymity | HTTP | Fuxin, Liaoning (China Unicom) | 1 s | 2019-01-05 22:30:59
110.52.235.76 | 9999 | High anonymity | HTTP | Yueyang, Hunan (China Unicom) | 0.3 s | 2019-01-05 21:30:56
39.137.107.98 | 80 | High anonymity | HTTP | China (China Mobile) | 2 s | 2019-01-05 20:30:58
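Rows like the ones above could be handled by pairing an IPv4 address with the next nearby number instead of requiring a literal `:`. A sketch of such a tolerant extractor (the function name, pattern, and validation are assumptions for illustration, not the parser's actual code):

```python
import re

# Pair an IPv4 address with a following port across common separators
# ("|", ":", comma, tab, or plain whitespace).
PROXY_RE = re.compile(
    r"(\d{1,3}(?:\.\d{1,3}){3})"   # IPv4 address
    r"\s*[|:,\t ]\s*"              # any common separator
    r"(\d{2,5})"                   # port number
)

def extract_proxies(text: str) -> list[str]:
    """Return candidate proxies from free-form text as 'ip:port' strings."""
    found = []
    for ip, port in PROXY_RE.findall(text):
        # Reject impossible octets and ports before accepting a match.
        if all(0 <= int(o) <= 255 for o in ip.split(".")) and 0 < int(port) <= 65535:
            found.append(f"{ip}:{port}")
    return found

row = "113.120.189.184 | 9999 | HTTP | 3 s | 2019-01-05 23:30:59"
print(extract_proxies(row))  # ['113.120.189.184:9999']
```

The date and latency columns don't false-match because they never form a four-octet address, so the same pattern still works on plain `ip:port` lists.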
Also, is there a way to make your parser take all the proxies from http://proxydb.net/ without being so heavy or taking ages? I tried everything and it won't work either.
I hope you fix this and make the parser more advanced when it scrapes these formats.
Overall, great work, and thank you in advance.