CLI mode return & minor fixes #5

Open
wants to merge 2 commits into
base: main
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,4 +1,7 @@
/build/*
/src/drivers/**/categories.json
/src/drivers/**/technologies/*
/src/drivers/**/wappalyzer.js
/src/images/icons/converted/*
/src/manifest.json
/src/manifest.bak.json
8 changes: 7 additions & 1 deletion README.md
@@ -25,6 +25,12 @@ yarn install

## Usage

### Command line

```sh
node src/drivers/npm/cli.js https://example.com
```

### Chrome extension

- Go to `about:extensions`
@@ -507,4 +513,4 @@ Application version information can be obtained from a pattern using a capture g
</td>
</tr>
</tbody>
</table>
</table>
11 changes: 5 additions & 6 deletions bin/build.js
@@ -5,18 +5,17 @@ const currentVersion = JSON.parse(
fs.readFileSync('./src/manifest-v3.json')
).version

const version = process.argv[2]
let version = process.argv[2]

if (!version) {
// eslint-disable-next-line no-console
console.error(
`No version number specified. Current version is ${currentVersion}.`
console.warn(
`No version number specified. Current version is ${currentVersion}: it will be used.`
)

process.exit(1)
version = currentVersion
}

;['./src/manifest-v2.json', './src/manifest-v3.json'].forEach((file) => {
;['./src/drivers/npm/package.json', './src/manifest-v2.json', './src/manifest-v3.json'].forEach((file) => {
const json = JSON.parse(fs.readFileSync(file))

json.version = version
21 changes: 21 additions & 0 deletions bin/link.js
@@ -0,0 +1,21 @@
const fs = require('fs')

const link = (src, dest) => {
if (fs.existsSync(dest)) {
fs.unlinkSync(dest)
}

fs.linkSync(src, dest)
}

link('./src/js/wappalyzer.js', './src/drivers/npm/wappalyzer.js')
link('./src/categories.json', './src/drivers/npm/categories.json')

for (const index of Array(27).keys()) {
const character = index ? String.fromCharCode(index + 96) : '_'

link(
`./src/technologies/${character}.json`,
`./src/drivers/npm/technologies/${character}.json`
)
}
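
The loop in `bin/link.js` derives the technology file names from an index: 0 becomes `_` and 1 to 26 become `a` to `z`. The following is a minimal sketch that isolates that mapping so it can be checked without touching the file system. `technologyFileNames` is a hypothetical helper name used for illustration only and is not part of this change.

```javascript
// Sketch only: reproduces the index-to-name mapping used by the loop in
// bin/link.js. `technologyFileNames` is a hypothetical helper.
const technologyFileNames = () =>
  Array.from(Array(27).keys(), (index) =>
    // Index 0 maps to '_'; indexes 1..26 map to char codes 97..122 ('a'..'z')
    index ? String.fromCharCode(index + 96) : '_'
  ).map((character) => `${character}.json`)

console.log(technologyFileNames().join(' '))
```

Running it prints `_.json` followed by `a.json` through `z.json`, which are the 27 files the script links into `src/drivers/npm/technologies/`.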
5 changes: 3 additions & 2 deletions package.json
@@ -17,13 +17,14 @@
"terminal-overwrite": "^2.0.1"
},
"scripts": {
"link": "node ./bin/link.js; node ./bin/manifest.js v3",
"lint": "eslint src/**/*.{js,json}",
"lint:fix": "eslint --fix src/**/*.{js,json}",
"validate": "yarn run lint && jsonlint -qV ./schema.json ./src/technologies/ && node ./bin/validate.js",
"convert": "node --no-warnings ./bin/convert.js",
"prettify": "jsonlint -si --trim-trailing-commas --enforce-double-quotes ./src/categories.json ./src/technologies/*.json",
"build": "yarn run validate && yarn run prettify && yarn run convert && node ./bin/build.js",
"build:safari": "xcrun safari-web-extension-converter --swift --project-location build --force src",
"build": "yarn run link && yarn run validate && yarn run prettify && yarn run convert && node ./bin/build.js",
"build:safari": "xcrun safari-web-extension-converter --swift --project-location build --force src/drivers/webextension",
"manifest": "node ./bin/manifest.js"
}
}
31 changes: 31 additions & 0 deletions src/drivers/npm/Dockerfile
@@ -0,0 +1,31 @@
FROM node:14-alpine

MAINTAINER Wappalyzer <[email protected]>

ENV WAPPALYZER_ROOT /opt/wappalyzer
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true
ENV CHROMIUM_BIN /usr/bin/chromium-browser

RUN apk update && apk add -u --no-cache \
nodejs \
udev \
chromium \
ttf-freefont \
yarn

RUN mkdir -p "$WAPPALYZER_ROOT/browsers"

WORKDIR "$WAPPALYZER_ROOT"

COPY technologies ./technologies
COPY \
cli.js \
categories.json \
driver.js \
package.json \
wappalyzer.js \
yarn.lock ./

RUN yarn install

ENTRYPOINT ["node", "cli.js"]
155 changes: 155 additions & 0 deletions src/drivers/npm/README.md
@@ -0,0 +1,155 @@
# Wappalyzer

[Wappalyzer](https://www.wappalyzer.com/) identifies technologies on websites.

*Note:* The [wappalyzer-core](https://www.npmjs.com/package/wappalyzer-core) package provides a low-level API without dependencies.

## Command line

### Installation

```shell
$ npm i -g wappalyzer
```

### Usage

```
wappalyzer <url> [options]
```

#### Options

```
-b, --batch-size=... Process links in batches
-d, --debug Output debug messages
-t, --delay=ms Wait for ms milliseconds between requests
-h, --help This text
-H, --header Extra header to send with requests
--html-max-cols=... Limit the number of HTML characters per line processed
--html-max-rows=... Limit the number of HTML lines processed
-D, --max-depth=... Don't analyse pages more than num levels deep
-m, --max-urls=... Exit when num URLs have been analysed
-w, --max-wait=... Wait no more than ms milliseconds for page resources to load
-p, --probe=[basic|full] Perform a deeper scan by performing additional requests and inspecting DNS records
-P, --pretty Pretty-print JSON output
--proxy=... Proxy URL, e.g. 'http://user:pass@proxy:8080'
-r, --recursive Follow links on pages (crawler)
-a, --user-agent=... Set the user agent string
-n, --no-scripts Disable JavaScript on web pages
-N, --no-redirect Disable cross-domain redirects
-e, --extended Output additional information
--local-storage=... JSON object to use as local storage
--session-storage=... JSON object to use as session storage
--defer=ms Defer scan for ms milliseconds after page load

```


## Dependency

### Installation

```shell
$ npm i wappalyzer
```

### Usage

```javascript
const Wappalyzer = require('wappalyzer')

const url = 'https://www.wappalyzer.com'

const options = {
debug: false,
delay: 500,
headers: {},
maxDepth: 3,
maxUrls: 10,
maxWait: 5000,
recursive: true,
probe: true,
proxy: false,
userAgent: 'Wappalyzer',
htmlMaxCols: 2000,
htmlMaxRows: 2000,
noScripts: false,
noRedirect: false,
};

const wappalyzer = new Wappalyzer(options)

;(async function() {
try {
await wappalyzer.init()

// Optionally set additional request headers
const headers = {}

// Optionally set local and/or session storage
const storage = {
local: {},
session: {}
}

const site = await wappalyzer.open(url, headers, storage)

// Optionally capture and output errors
site.on('error', console.error)

const results = await site.analyze()

console.log(JSON.stringify(results, null, 2))
} catch (error) {
console.error(error)
}

await wappalyzer.destroy()
})()
```

Multiple URLs can be processed in parallel:

```javascript
const Wappalyzer = require('wappalyzer');

const urls = ['https://www.wappalyzer.com', 'https://www.example.com']

const wappalyzer = new Wappalyzer()

;(async function() {
try {
await wappalyzer.init()

const results = await Promise.all(
urls.map(async (url) => {
const site = await wappalyzer.open(url)

const results = await site.analyze()

return { url, results }
})
)

console.log(JSON.stringify(results, null, 2))
} catch (error) {
console.error(error)
}

await wappalyzer.destroy()
})()
```

### Events

Listen to events with `site.on(eventName, callback)`. Use the `page` parameter to access the Puppeteer page instance ([reference](https://github.com/puppeteer/puppeteer/blob/main/docs/api.md#class-page)).

| Event | Parameters | Description |
|-------------|--------------------------------|------------------------------------------|
| `log` | `message`, `source` | Debug messages |
| `error` | `message`, `source` | Error messages |
| `request` | `page`, `request` | Emitted at the start of a request |
| `response` | `page`, `request` | Emitted upon receiving a server response |
| `goto` | `page`, `url`, `html`, `cookies`, `scriptsSrc`, `scripts`, `meta`, `js`, `language`, `links` | Emitted after a page has been analysed |
| `analyze` | `urls`, `technologies`, `meta` | Emitted when the site has been analysed |
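
As an illustration of the table above, here is a minimal sketch that collects `log` and `error` events. It assumes the handlers receive `message` and `source` as separate arguments, as the table lists them. `collectMessages` is a hypothetical helper, and `fakeSite` is a plain Node.js `EventEmitter` standing in for a real `site` object so the sketch runs without launching a browser.

```javascript
const { EventEmitter } = require('events')

// Hypothetical helper: records `log` and `error` events from any object
// that exposes `on(eventName, callback)`, as `site` does.
const collectMessages = (site) => {
  const messages = []

  site.on('log', (message, source) => {
    messages.push({ level: 'log', message, source })
  })

  site.on('error', (message, source) => {
    messages.push({ level: 'error', message, source })
  })

  return messages
}

// Stand-in for the object returned by `wappalyzer.open(url)` (assumption)
const fakeSite = new EventEmitter()
const messages = collectMessages(fakeSite)

fakeSite.emit('log', 'page loaded', 'driver')
fakeSite.emit('error', 'request timed out', 'driver')

console.log(messages)
```

With a real site, the same helper would be called right after `wappalyzer.open(url)` and before `site.analyze()`, so that no events are missed.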