discrmetapath

Usage:

1. Clone this repo to your local machine.
2. Download Wikipedia's database from here.
3. Import the database into your MySQL server.
4. Edit `DiscrMetaPath/src/main/java/edu/nd/dsg/util/ConnectionPool.java` and change `URL`, `USER`, and `PASS` to match your MySQL server.
5. Unzip `DiscrMetaPath/data.tar.gz` into `DiscrMetaPath/`. Afterwards, all the data should be under `DiscrMetaPath/data`.
6. Build the project with `make wikibuild`. The jar file is generated under `DiscrMetaPath/target/`.
7. Run the generated jar file with `java -jar JAR_FILE_YOU_GENERATED`.
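
For orientation, the three settings named in the `ConnectionPool.java` step might look roughly like the sketch below. This is not the actual contents of `ConnectionPool.java`: the class name `ConnectionSettingsSketch`, the database name `wikipedia`, the host, the port, and the credentials are all placeholder assumptions. Only the field names `URL`, `USER`, and `PASS` come from this README.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical stand-in for the settings in ConnectionPool.java.
public class ConnectionSettingsSketch {
    // Placeholder values (assumptions): replace with your own server details.
    static final String URL = "jdbc:mysql://localhost:3306/wikipedia";
    static final String USER = "wiki_user";
    static final String PASS = "change_me";

    // Opens a plain JDBC connection with the settings above.
    static Connection open() throws SQLException {
        return DriverManager.getConnection(URL, USER, PASS);
    }

    public static void main(String[] args) {
        // Fails with an SQLException if no MySQL JDBC driver is on the
        // classpath or if the server is unreachable.
        try (Connection conn = open()) {
            System.out.println("Connected: " + !conn.isClosed());
        } catch (SQLException e) {
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```

Running this small check before the build can confirm that the MySQL import and the credentials work, independently of the project itself.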

The command line arguments are:

    Usage
    Generate paths: -GEN [-NoSQL cache types first to speedup] [-all get all paths instead of pathLength == 2] [-p build patent]
    Translate paths: -TRANS [-a output all paths] [-nd do not get most discri/similar paths] [-oNum get NUM paths between discri&similar paths] [-p build patent]
    Generate Term frequency: -TERM [-BuildWikiTF generate term frequency] [-BuildPatentTF generate term frequency] [-BuildWikiDF generate document frequency] [-BuildPatentDF generate document frequency]
    Generate Cos distance frequency(sequential): -COS [-p build patent]
    Generate BM25 score: -BM [-ACC accumulative (x,y),(x+y,z),...] [-NODE  sequential (x,y),(y,z),...] [-p build patent]

Results:

If you are only interested in our results, you can get the data from the result folder. The data format for each file is:

For CrowdFlower result files:

  _unit_id,
  _golden,
  _canary,
  _unit_state,
  _trusted_judgments,
  _last_judgment_at,
  choose_path,    // Path chosen by the human annotator
  choose_path:confidence,
  end,    // End article
  path_1, // Path between start and end, generated by our algorithm
  path_2,
  path_3,
  path_4,
  path_5,
  start   // Start article
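
As a rough illustration of reading these columns, the sketch below maps one CSV row to its column names. It assumes each result file starts with a header line carrying the column names listed above, which this README does not confirm. The class `CrowdFlowerRowSketch`, its methods, and the sample row in `main` are made up for illustration and are not taken from the result files.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the DiscrMetaPath code base.
public class CrowdFlowerRowSketch {

    // Splits one CSV line, honoring double-quoted fields and "" escapes.
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // escaped quote inside a quoted field
                    i++;
                } else if (c == '"') {
                    inQuotes = false;    // closing quote
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;         // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Pairs the header names with the values of one data line.
    static Map<String, String> toRecord(String headerLine, String dataLine) {
        List<String> names = splitCsvLine(headerLine);
        List<String> values = splitCsvLine(dataLine);
        Map<String, String> record = new HashMap<>();
        for (int i = 0; i < names.size() && i < values.size(); i++) {
            record.put(names.get(i), values.get(i));
        }
        return record;
    }

    public static void main(String[] args) {
        // Invented sample data with a subset of the columns.
        String header = "_unit_id,choose_path,end,path_1,start";
        String row = "1,path_1,\"End, article\",\"A -> B -> C\",Start article";
        Map<String, String> r = toRecord(header, row);
        System.out.println(r.get("start") + " to " + r.get("end")
                + ": annotator chose " + r.get("choose_path"));
    }
}
```

Looking fields up by name rather than by position keeps the code independent of the column order, which matters if the exported files order their columns differently.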

For other csv files:

  groupId, // Each unique groupId represents one CrowdFlower task
  pathId, //  Equivalent to CrowdFlower's path_*
  nodeId, //  Score of the node at position `i` in path_*

