Commit 59ef5df

Initial commit
0 parents  commit 59ef5df

26 files changed: 5510 additions, 0 deletions

.gitignore

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
/node_modules
/vendor
/logs
.env.exemple
phpunit.xml
.phpunit.result.cache
Homestead.json
Homestead.yaml
npm-debug.log
yarn-error.log
*.sublime-*

CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
# Changelog

All notable changes to **[snippet-sniffer](https://github.com/snippetify/snippet-sniffer)** will be documented in this file.

## 1.0.0 - 2020-06-26

- initial release

CONTRIBUTING.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# Contributing

Contributions are **welcome** and will be fully **credited**.

Please read and understand the contribution guide before creating an issue or pull request.

## Etiquette

This project is open source, and as such, the maintainers give their free time to build and maintain the source code held within. They make the code freely available in the hope that it will be of use to other developers. It would be extremely unfair for them to suffer abuse or anger for their hard work.

Please be considerate towards maintainers when raising issues or presenting pull requests. Let's show the world that developers are civilized and selfless people.

It's the duty of the maintainer to ensure that all submissions to the project are of sufficient quality to benefit the project. Many developers have different skillsets, strengths, and weaknesses. Respect the maintainer's decision, and do not be upset or abusive if your submission is not used.

## Viability

When requesting or submitting new features, first consider whether it might be useful to others. Open source projects are used by many developers, who may have entirely different needs to your own. Think about whether or not your feature is likely to be used by other users of the project.

## Procedure

Before filing an issue:

- Attempt to replicate the problem, to ensure that it wasn't a coincidental incident.
- Check to make sure your feature suggestion isn't already present within the project.
- Check the pull requests tab to ensure that the bug doesn't have a fix in progress.
- Check the pull requests tab to ensure that the feature isn't already in progress.

Before submitting a pull request:

- Check the codebase to ensure that your feature doesn't already exist.
- Check the pull requests to ensure that another person hasn't already submitted the feature or fix.

## Requirements

If the project maintainer has any additional requirements, you will find them listed here.

- **[PSR-2 Coding Standard](https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-2-coding-style-guide.md)** - The easiest way to apply the conventions is to install [PHP Code Sniffer](http://pear.php.net/package/PHP_CodeSniffer).
- **Add tests!** - Your patch won't be accepted if it doesn't have tests.
- **Document any change in behaviour** - Make sure the `README.md` and any other relevant documentation are kept up-to-date.
- **Consider our release cycle** - We try to follow [SemVer v2.0.0](http://semver.org/). Randomly breaking public APIs is not an option.
- **One pull request per feature** - If you want to do more than one thing, send multiple pull requests.
- **Send coherent history** - Make sure each individual commit in your pull request is meaningful. If you had to make multiple intermediate commits while developing, please [squash them](http://www.git-scm.com/book/en/v2/Git-Tools-Rewriting-History#Changing-Multiple-Commit-Messages) before submitting.

**Happy coding**!

LICENSE.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2020 Snippetify LLC

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
# Snippet sniffer

**Snippet sniffer** allows you to extract code snippets from any website.

## What it does

This library allows you to:

1. Get URL seeds from a search engine API (Google).
2. Get code snippets from any web page by crawling the URL seeds.

## How to use it

```bash
$ composer require snippetify/snippet-sniffer
```

```php
use Snippetify\SnippetSniffer\SnippetSniffer;

// Configurations
$config = [
    // Required
    // Search engine api configuration keys
    'provider' => [
        "cx" => "your google Search engine ID",
        "key" => "your google API key",
        'name' => 'provider name (google)',
    ],
    // Optional
    // Useful for adding meta information to each snippet
    'app' => [
        "name" => "your App name",
        'version' => 'your App version',
    ],
    // Optional
    // Useful for logging
    'logger' => [
        "name" => "logger name",
        'file' => 'logger file path',
    ]
];

// Required
// Your query
$query = "your query";

// Optional
// Meta params
$meta = [
    "page" => 1,
    "limit" => 10,
];

// Fetch snippets
// @return Snippetify\SnippetSniffer\Common\Snippet[]
$snippets = SnippetSniffer::create($config)->fetch($query, $meta);
/*
 * Snippet object public attributes [
 *  title: string,
 *  code: string,
 *  description: string,
 *  tags: array, // Array of string, also contains the snippet language
 *  meta: array
 * ]
 */
```
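
Once `fetch` has returned, each element exposes the public attributes listed above. The sketch below shows one way to consume them. It uses plain stand-in objects shaped like those attributes so that it runs without the library or API keys, and `groupSnippetsByTag` is a hypothetical helper, not part of snippet-sniffer.

```php
<?php

/**
 * Hypothetical helper (not part of snippet-sniffer): index snippets by tag.
 * It relies only on the documented public `title` and `tags` attributes.
 *
 * @param  object[] $snippets
 * @return array    Map of lower-cased tag => list of snippet titles
 */
function groupSnippetsByTag(array $snippets): array
{
    $byTag = [];
    foreach ($snippets as $snippet) {
        foreach ($snippet->tags as $tag) {
            // Normalise case so "PHP" and "php" land in the same bucket.
            $byTag[strtolower($tag)][] = $snippet->title;
        }
    }
    ksort($byTag);

    return $byTag;
}

// Stand-in data shaped like the documented Snippet attributes.
$snippets = [
    (object) [
        'title' => 'Reverse a string',
        'code' => 'strrev($s);',
        'description' => '',
        'tags' => ['php', 'string'],
        'meta' => [],
    ],
    (object) [
        'title' => 'Sum an array',
        'code' => 'array_sum($a);',
        'description' => '',
        'tags' => ['PHP', 'array'],
        'meta' => [],
    ],
];

$grouped = groupSnippetsByTag($snippets);
print_r($grouped);
```

With real results, pass the array returned by `SnippetSniffer::create($config)->fetch($query, $meta)` in place of the stand-in data.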

#### Providers

Providers allow you to get a **stack of seeds** (URLs to scrape) from a search engine API. Only the Google search engine API is supported at this time, but you can create your own provider.

```php
use Snippetify\SnippetSniffer\Providers\GoogleProvider;

// Search engine api configuration keys
$config = [
    "cx" => "your google Search engine ID",
    "key" => "your google API key"
];

// Your query
$query = "your query";

// Meta params
$meta = [
    "page" => 1,
    "limit" => 10,
];

// url seeds
// @return GuzzleHttp\Psr7\Uri[]
$urlSeeds = GoogleProvider::create($config)->fetch($query, $meta);
```
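
How `page` and `limit` are translated into search API parameters is not described here. Assuming the provider targets Google's Custom Search JSON API, where `start` is a 1-based result index and `num` accepts values from 1 to 10, one plausible mapping is sketched below. `toGoogleParams` is a hypothetical function for illustration, and the library may handle paging differently.

```php
<?php

/**
 * Hypothetical mapping (an assumption, not the library's code) from the
 * `page`/`limit` meta params to Custom Search `start`/`num` parameters.
 */
function toGoogleParams(array $meta): array
{
    // Fall back to the first page and clamp invalid values.
    $page = max(1, (int) ($meta['page'] ?? 1));
    // The API returns at most 10 results per request.
    $limit = min(10, max(1, (int) ($meta['limit'] ?? 10)));

    return [
        'start' => ($page - 1) * $limit + 1, // 1-based index of the first result
        'num' => $limit,
    ];
}

var_dump(toGoogleParams(['page' => 3, 'limit' => 10]));
```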

##### Add new providers

1. Git clone the project
2. Create your new class in the `Snippetify\SnippetSniffer\Providers` folder
3. Each provider implements `Snippetify\SnippetSniffer\Providers\ProviderInterface`
4. Take a look at `Snippetify\SnippetSniffer\Providers\GoogleProvider` for guidance
5. Your fetch method must return an array of `GuzzleHttp\Psr7\Uri`
6. Add it to the providers stack in `Snippetify\SnippetSniffer\Core.php`
7. Write tests. Take a look at `Snippetify\SnippetSniffer\Tests\Providers\GoogleProviderTest` for guidance
8. Send us a pull request
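
The steps above can be sketched roughly as follows. The exact signature of `ProviderInterface` is not shown in this README, so the example declares a stand-in interface (`SeedProvider`) and returns plain URL strings to stay runnable without dependencies. Both `SeedProvider` and `StaticListProvider` are hypothetical names. A real provider would implement the library's interface and wrap each URL in `GuzzleHttp\Psr7\Uri`.

```php
<?php

// Stand-in for the library's ProviderInterface (assumed shape, may differ).
interface SeedProvider
{
    /** @return string[] URL seeds; real providers return GuzzleHttp\Psr7\Uri[] */
    public function fetch(string $query, array $meta = []): array;
}

// Hypothetical provider that serves seeds from a fixed, in-memory list.
final class StaticListProvider implements SeedProvider
{
    /** @var string[] */
    private $urls;

    public function __construct(array $urls)
    {
        $this->urls = $urls;
    }

    public function fetch(string $query, array $meta = []): array
    {
        $page = max(1, (int) ($meta['page'] ?? 1));
        $limit = max(1, (int) ($meta['limit'] ?? 10));

        // Keep only well-formed URLs that mention the query.
        $matches = array_values(array_filter(
            $this->urls,
            function (string $url) use ($query): bool {
                return filter_var($url, FILTER_VALIDATE_URL) !== false
                    && stripos($url, $query) !== false;
            }
        ));

        // Return the requested page of results.
        return array_slice($matches, ($page - 1) * $limit, $limit);
    }
}

$provider = new StaticListProvider([
    'https://example.com/php-arrays',
    'not a url',
    'https://example.com/js-arrays',
    'https://example.org/php-strings',
]);

print_r($provider->fetch('php', ['page' => 1, 'limit' => 10]));
```

In the actual project, the last line of `fetch` would map each string to `new Uri($url)` so that the return type matches step 5.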

#### Scrapers

Scrapers allow you to scrape an HTML page and extract its snippets.

```php
use GuzzleHttp\Psr7\Uri;
use Snippetify\SnippetSniffer\Scrapers\DefaultScraper;

// Configurations
$config = [
    // Optional
    // Useful for adding meta information to each snippet
    'app' => [
        "name" => "your App name",
        'version' => 'your App version',
    ],
    // Optional
    // Useful for logging
    'logger' => [
        "name" => "logger name",
        'file' => 'logger file path',
    ]
];

// Your url
$urlSeed = "website url to scrape";

// Fetch snippets
// @return Snippetify\SnippetSniffer\Common\Snippet[]
$snippets = (new DefaultScraper($config))->fetch(new Uri($urlSeed));
```

##### Add new scrapers

1. Git clone the project
2. Create your new class in the `Snippetify\SnippetSniffer\Scrapers` folder
3. Each scraper implements `Snippetify\SnippetSniffer\Scrapers\ScraperInterface`
4. Take a look at `Snippetify\SnippetSniffer\Scrapers\StackoverflowScraper` for guidance
5. Your fetch method must return an array of `Snippetify\SnippetSniffer\Common\Snippet`
6. Add it to the scrapers stack in `Snippetify\SnippetSniffer\Core.php`
7. Write tests. Take a look at `Snippetify\SnippetSniffer\Tests\Scrapers\StackoverflowScraperTest` for guidance
8. Send us a pull request
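
The extraction step at the heart of a scraper can be sketched without the library. The example below is a minimal, dependency-free illustration that uses PHP's bundled DOM extension rather than the Goutte crawler this package requires. `extractCodeBlocks` and the `language-*` class convention are assumptions made for illustration. A real scraper would implement `ScraperInterface` and return `Snippet` objects instead of plain objects.

```php
<?php

/**
 * Hypothetical extraction routine: collect every <pre><code> block of a page
 * as a plain object shaped like the documented Snippet attributes.
 *
 * @return object[]
 */
function extractCodeBlocks(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    $dom = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Use the page title as the snippet title when one is present.
    $title = '';
    $titleNodes = $dom->getElementsByTagName('title');
    if ($titleNodes->length > 0) {
        $title = trim($titleNodes->item(0)->textContent);
    }

    $snippets = [];
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//pre/code') as $node) {
        $code = trim($node->textContent);
        if ($code === '') {
            continue; // Skip empty blocks.
        }

        // Assumed convention: the language is exposed as class="language-xxx".
        $tags = [];
        if (preg_match('/language-([a-z0-9+#]+)/i', $node->getAttribute('class'), $m)) {
            $tags[] = strtolower($m[1]);
        }

        $snippets[] = (object) [
            'title' => $title,
            'code' => $code,
            'description' => '',
            'tags' => $tags,
            'meta' => [],
        ];
    }

    return $snippets;
}

$html = '<html><head><title>Demo page</title></head><body>'
    . '<pre><code class="language-php">echo 1 + 1;</code></pre>'
    . '<pre><code>   </code></pre>'
    . '</body></html>';

print_r(extractCodeBlocks($html));
```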

## Changelog

Please see [CHANGELOG](https://github.com/snippetify/snippet-sniffer/blob/master/CHANGELOG.md) for more information on what has changed recently.

## Testing

You must set the **PROVIDER_NAME**, **PROVIDER_CX** and **PROVIDER_KEY** keys in the `phpunit.xml` file before running the tests.

```bash
composer test
```

## Contributing

Please see [CONTRIBUTING](https://github.com/snippetify/snippet-sniffer/blob/master/CONTRIBUTING.md) for details.

## Credits

1. [Evens Pierre](https://github.com/pierrevensy)

## License

The MIT License (MIT). Please see [License File](https://github.com/snippetify/snippet-sniffer/blob/master/LICENSE.md) for more information.

composer.json

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
{
    "name": "snippetify/snippet-sniffer",
    "description": "Crawling and scraping web pages to extract snippets",
    "type": "library",
    "homepage": "https://snippetify.com",
    "keywords": [
        "bot",
        "snippet",
        "library",
        "scraping",
        "crawling"
    ],
    "license": "MIT",
    "authors": [
        {
            "name": "Evens Pierre",
            "email": "[email protected]"
        },
        {
            "name": "Snippetify Community",
            "homepage": "https://github.com/snippetify"
        }
    ],
    "minimum-stability": "dev",
    "autoload": {
        "psr-4": {
            "Snippetify\\SnippetSniffer\\": "src/"
        }
    },
    "autoload-dev": {
        "psr-4": {
            "Snippetify\\SnippetSniffer\\Tests\\": "tests/"
        }
    },
    "require": {
        "php": "^7.3.0",
        "fabpot/goutte": "^4.0@dev",
        "guzzlehttp/psr7": "^2.0@dev",
        "monolog/monolog": "^2.0@dev"
    },
    "require-dev": {
        "phpunit/phpunit": "^9.0"
    },
    "scripts": {
        "test": "phpunit",
        "post-package-install": [
            "php -r \"copy('phpunit.xml.dist', 'phpunit.xml');\""
        ]
    }
}
