TeamCohen/GuineaPig

GuineaPig

Guinea Pig is (yet another) workflow language for Hadoop. For more information, including a tutorial, see: http://curtis.ml.cmu.edu/w/courses/index.php/Guinea_Pig

As the name suggests, Guinea Pig is similar to Pig, with some important differences.

  • Guinea Pig is pure Python, and embedded in Python, so there's less new stuff to learn.

  • Guinea Pig is simple. Programs use only ten pre-defined classes (like Join and Flatten), and the full implementation is fewer than 1500 non-comment source lines.

  • Guinea Pig programs can be executed incrementally, and you can inspect and/or re-use partially constructed outputs - similar to the way that you might use make to implement a workflow.

  • Guinea Pig programs can be executed with or without a Hadoop backend, so you can use it for small-to-medium-sized workflows, and then migrate these easily to a cluster.
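
To make the ideas of a flatten step and make-style reuse of intermediate outputs more concrete, here is a minimal sketch in plain Python. It does not use Guinea Pig itself and is not its API: the helpers `flatten`, `tokens`, `materialize`, and `word_count`, and the file names `tokens.json` and `counts.json`, are all hypothetical names chosen for illustration. See the tutorial linked above for the real classes.

```python
import json
import os
from collections import Counter


def flatten(rows, by):
    """Apply `by` to each row and yield every item it produces."""
    for row in rows:
        yield from by(row)


def tokens(line):
    """Split a line on whitespace and lowercase each token."""
    return (tok.lower() for tok in line.split())


def materialize(path, compute):
    """Make-style caching: reuse `path` if it exists, else compute and store it."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    result = compute()
    with open(path, "w") as f:
        json.dump(result, f)
    return result


def word_count(lines, workdir):
    """Two-stage pipeline; each stage's output is kept as a file in `workdir`."""
    toks = materialize(
        os.path.join(workdir, "tokens.json"),
        lambda: list(flatten(lines, by=tokens)),
    )
    counts = materialize(
        os.path.join(workdir, "counts.json"),
        lambda: dict(Counter(toks)),
    )
    return counts
```

Calling `word_count(lines, workdir)` a second time with the same `workdir` reads the stored files instead of recomputing, and the intermediate `tokens.json` can be inspected on its own. Deleting a stored file forces that stage to be rebuilt, which is the make-like behavior this sketch is meant to suggest.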

About

Pure Python, Pig-like language
