sitecheck.cabal
Name: sitecheck
Version: 0.0.5
Cabal-Version: >= 1.10
Author: Brenton Ashworth
Maintainer: brenton at formpluslogic dot com
Synopsis: Single domain web site testing crawler library.
Description: SiteCheck is a Haskell library for web developers and web site
             administrators which allows for the creation of crawlers which
             can find and report problems in a single domain.
             .
             Because it is a crawler, its default behavior is to report any
             pages which return non-200 status codes. New behaviors may be
             added to the crawler to allow it to report any information that
             can be obtained from analyzing the contents of a web page.
             .
             SiteCheck will crawl a domain and record the status of every
             page which is retrieved. Any URL which points outside of the
             domain being searched will be sent a HEAD request so that it can
             be reported but not crawled.
             .
             Every web site is different and so the crawling process can be
             scripted. A Script allows the developer to set output options,
             curl options, add seed URLs, specify decisions which are to be
             made when a specific URL is encountered and provide mappings
             which allow scraped links to be filtered and transformed.
             .
             Decisions include submitting a form, adding the links on a page
             to the crawl stack or ignoring a page. SiteCheck is built on top
             of Network.Curl. This, combined with Decisions, allows it to log
             in and crawl secure areas of a site.
             .
             SiteCheck will automatically follow redirects until either a
             non-redirecting status code is returned or a cycle is detected.
             Redirect status codes are not reported unless a cycle is detected.
             Otherwise the final status code is reported. Output can be
             configured to show redirects.
Category: Web
Homepage: http://github.com/brentonashworth/sitecheck
Bug-Reports: http://github.com/brentonashworth/sitecheck
Copyright: (c) 2011 Brenton Ashworth
License: BSD3
License-File: LICENSE
Stability: alpha
Build-Type: Simple
Tested-With: GHC ==7.0.3
Library
  Default-Language: Haskell98
  Build-Depends:    base >= 4.0,
                    tagsoup ==0.12.*,
                    curl ==1.3.*,
                    url ==2.1.*,
                    regex-posix,
                    regex-compat,
                    containers
  Exposed-Modules:  Network.SiteCheck,
                    Network.SiteCheck.Filter,
                    Network.SiteCheck.URL,
                    Network.SiteCheck.Data
  Other-modules:    Network.SiteCheck.Util,
                    Network.SiteCheck.Output
  Hs-Source-Dirs:   src
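
The redirect handling described in the Description field (follow redirects until a non-redirecting status code is returned or a cycle is detected) can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions, not SiteCheck's actual API: the names `Response`, `Outcome`, `follow` and `exampleSite` are hypothetical, the set of status codes treated as redirects is an assumption, and an in-memory `Data.Map` (from the `containers` package listed in Build-Depends) stands in for real HTTP requests.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- | Hypothetical response model: a status code plus an optional redirect target.
data Response = Response
  { status   :: Int
  , location :: Maybe String
  } deriving (Show, Eq)

-- | Outcome of following a chain of redirects.
data Outcome
  = Final String Int   -- ^ last URL reached and its non-redirect status
  | Cycle [String]     -- ^ URLs visited, in order, before a repeat was seen
  | Unknown String     -- ^ URL with no recorded response
  deriving (Show, Eq)

-- | Assumption: these are the status codes treated as redirects.
isRedirect :: Int -> Bool
isRedirect code = code `elem` [301, 302, 303, 307, 308]

-- | Follow redirects from a starting URL, remembering every URL already
-- visited so that a repeated URL is reported as a cycle instead of looping.
follow :: Map.Map String Response -> String -> Outcome
follow site = go Set.empty []
  where
    go seen path url
      | url `Set.member` seen = Cycle (reverse path)
      | otherwise =
          case Map.lookup url site of
            Nothing -> Unknown url
            Just (Response code (Just next))
              | isRedirect code -> go (Set.insert url seen) (url : path) next
            Just (Response code _) -> Final url code

-- | Hypothetical site: /a redirects to /b, while /x and /y redirect to each other.
exampleSite :: Map.Map String Response
exampleSite = Map.fromList
  [ ("/a", Response 301 (Just "/b"))
  , ("/b", Response 200 Nothing)
  , ("/x", Response 302 (Just "/y"))
  , ("/y", Response 302 (Just "/x"))
  ]

main :: IO ()
main = do
  print (follow exampleSite "/a")  -- prints: Final "/b" 200
  print (follow exampleSite "/x")  -- prints: Cycle ["/x","/y"]
```

Only the final status is returned for a normal chain, which matches the statement that redirect status codes are not reported unless a cycle is detected. A real crawler would replace the map lookup with a network request.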