maroulb/P3Pscan

P3Pscan

Scripts to (mass-)scan URLs for implementation of P3P privacy policies.

With regard to the proposed W3C standard, the script:

  • scans the 'well-known location' (URL + '/w3c/p3p.xml') for a Policy Reference File ('p3p.xml'). If a reference file is found, the script tries to fetch the corresponding policy and evaluates it against the standard,
  • scans the HTTP headers of the main site and of the 'well-known location' for the so-called Compact Policy. If one is found, its validity is evaluated,
  • scans the HTML of the main site for the pattern '<link rel="P3Pv1" href='. If such a link is found, the script tries to fetch the corresponding policy and evaluates its validity against the standard.
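
The three lookups above can be sketched without any network access. This is a minimal illustration, not the code from P3PscanST.py: the function names are hypothetical, the well-known location is built from the site root, and the link pattern assumes that `rel` appears before `href`, as in the pattern quoted above.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches <link ... rel="P3Pv1" ... href="..."> (rel before href only).
LINK_RE = re.compile(
    r'<link\b[^>]*\brel\s*=\s*["\']P3Pv1["\'][^>]*\bhref\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)
# Match the CP="..." and policyref="..." parts of a P3P response header.
CP_RE = re.compile(r'\bCP\s*=\s*"([^"]*)"', re.IGNORECASE)
POLICYREF_RE = re.compile(r'\bpolicyref\s*=\s*"([^"]*)"', re.IGNORECASE)


def well_known_location(url):
    """Return the URL of '/w3c/p3p.xml' at the root of the given site."""
    # Assumption: bare host names are scanned over plain http.
    parts = urlsplit(url if "://" in url else "http://" + url)
    return urlunsplit((parts.scheme, parts.netloc, "/w3c/p3p.xml", "", ""))


def parse_p3p_header(headers):
    """Extract compact policy tokens and policyref from a header mapping.

    Returns None if the response carries no P3P header at all.
    """
    value = next((v for k, v in headers.items() if k.lower() == "p3p"), None)
    if value is None:
        return None
    cp = CP_RE.search(value)
    ref = POLICYREF_RE.search(value)
    return {
        "tokens": cp.group(1).split() if cp else [],
        "policyref": ref.group(1) if ref else None,
    }


def find_policy_links(html):
    """Return the href values of all P3Pv1 link tags found in the HTML."""
    return LINK_RE.findall(html)
```

Checking the extracted tokens or the fetched policy against the standard is deliberately left out of this sketch.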

The result of a scan is a <p3pScanReport-[name_of_URL(-List)].csv> file that contains a detailed report on each scanned URL, with information on the well-known location (WKL), headers, links and the validity of any (compact) policies found. In addition, a <domainreports[_name_of_URL(-List)]> folder is created that contains a record for each URL where a P3P artefact was found. The artefacts are stored in URL-specific text files for later evaluation.
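
As a rough illustration of the output naming and the per-URL report rows, the sketch below builds both names from the name of the input and writes rows with the standard `csv` module. The helper names and the column names are hypothetical, since the actual report layout is not documented here.

```python
import csv

# Hypothetical columns; the real report may use different ones.
COLUMNS = ["url", "wkl_found", "header_cp_found", "link_found"]


def output_names(list_name):
    """Build the report file name and artefact folder name for an input."""
    return f"p3pScanReport-{list_name}.csv", f"domainreports_{list_name}"


def write_report(rows, fileobj):
    """Write one CSV line per scanned URL to an open text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends.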

Usage: single-threaded version (P3PscanST.py)

python P3PscanST.py [option] [ URL | file ]

Options and arguments:
-h: print a help message and exit
-u URL: test a single URL
file: test a list of URLs from a text file

The provided text file must contain a list of URLs, one per line. See the testlist folder for examples.
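
A list in this format can be read with a few lines of Python. This is only a sketch with a hypothetical function name; skipping blank lines is an assumption, not documented behaviour of the script.

```python
def read_url_list(lines):
    """Yield stripped, non-empty URLs from an iterable of text lines."""
    for line in lines:
        url = line.strip()
        if url:  # assumption: blank lines are ignored
            yield url


# Usage with a file on disk (the file name is a placeholder):
# with open("urls.txt", encoding="utf-8") as handle:
#     urls = list(read_url_list(handle))
```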

Usage: multi-threaded version (P3PscanMT.py)

python P3PscanMT.py [option] [ file ]

Options and arguments:
-h: print a help message and exit
file: test a list of URLs from a text file

To test a list of URLs, provide a text file in which each line consists of the tuple "rank,url" (e.g. "top-1m.csv" from Alexa). See the testlist folder for examples.
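
Parsing the "rank,url" format might look like the following sketch. The function name is hypothetical, and silently skipping malformed lines is an assumption rather than what P3PscanMT.py necessarily does.

```python
import csv


def read_ranked_urls(lines):
    """Yield (rank, url) tuples from lines of the form 'rank,url'."""
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        rank, url = row[0].strip(), row[1].strip()
        if rank.isdigit() and url:
            yield int(rank), url
```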

Notes

  • This is just a little playground to get an idea about the adoption of P3P on the web.

  • In particular, the multi-threaded version is just a PoC that needs some improvement. For example, for large lists of URLs, the qURL queue should be filled in batches by a separate group of worker threads.
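
One possible shape for the batched queue filling mentioned above is sketched below. It is a simplification: a single producer thread (rather than a group of workers) reads the input in batches and feeds a bounded queue, so a large list is never held in the queue all at once. All names apart from the idea of a URL queue are hypothetical, and `scan` stands in for whatever per-URL check is performed.

```python
import queue
import threading
from itertools import islice

SENTINEL = None  # tells a worker that no more URLs will arrive


def fill_in_batches(urls, q_url, batch_size, n_workers):
    """Read the input batch by batch and feed the bounded URL queue."""
    it = iter(urls)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        for url in batch:
            q_url.put(url)  # blocks while the queue is full
    for _ in range(n_workers):
        q_url.put(SENTINEL)  # one stop signal per worker


def worker(q_url, results, scan):
    """Take URLs from the queue until the sentinel is seen."""
    while True:
        url = q_url.get()
        if url is SENTINEL:
            break
        results.put((url, scan(url)))


def run(urls, scan, n_workers=4, batch_size=100):
    """Scan all URLs with n_workers threads and return {url: result}."""
    q_url = queue.Queue(maxsize=batch_size)
    results = queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(q_url, results, scan))
        for _ in range(n_workers)
    ]
    for t in workers:
        t.start()
    producer = threading.Thread(
        target=fill_in_batches, args=(urls, q_url, batch_size, n_workers)
    )
    producer.start()
    producer.join()
    for t in workers:
        t.join()
    out = {}
    while not results.empty():
        url, result = results.get()
        out[url] = result
    return out
```

Because the queue is bounded, `urls` can be a lazy iterator over a large file, and the producer simply waits whenever the workers fall behind.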
