Loading Baseball Statistics Using Clojure

Beginning Baseball Stats Analysis with Clojure

Who pitched for the Pirates last year?

That’s my use case.

Free stats online!

They come as text in standard, documented formats!

Download the data.

Load players.

We need a data structure to hold players. Use a map that maps directly to the data format. Write a function that, given a line of data, returns a player.

(defn player
  "Returns a player map parsed from `data`, one comma-separated roster line."
  [data]
  (let [[id lname fname bats throws team-id position]
        (clojure.string/split data #",")]
    {:id id :lname lname :fname fname :bats bats :throws throws
     :team-id team-id :position position}))
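As a quick check, the parser can be tried on a single line. The sketch below repeats the function under the hypothetical name `parse-player` so that it runs on its own. The sample line is made up for illustration and is not taken from the roster file; only the field order follows the destructuring above.

```clojure
(require '[clojure.string :as str])

;; Stand-alone copy of the parser above; `parse-player` is a hypothetical name.
(defn parse-player
  "Returns a player map parsed from one comma-separated roster line."
  [line]
  (let [[id lname fname bats throws team-id position] (str/split line #",")]
    {:id id :lname lname :fname fname :bats bats :throws throws
     :team-id team-id :position position}))

;; Made-up sample line, not real roster data.
(def sample-line "doej001,Doe,John,R,L,PIT,P")

(:position (parse-player sample-line)) ;; => "P"
(:throws (parse-player sample-line))   ;; => "L"
```

Every value stays a string, so comparisons such as `(= (:position p) "P")` work without any conversion.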

Now write a function that, given a list of lines, returns a list of players:

(defn players-for-lines
  "Returns a list of players created from `lines`."
  [lines]
  (if (empty? lines)
    '()
    (conj (players-for-lines (rest lines))
          (player (clojure.string/trim (first lines))))))
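The explicit recursion above can also be written with `map`, which does not grow the call stack on long files. This is an alternative sketch, not the version used in the rest of this walkthrough; `parse-player` and `players-from` are hypothetical names, and the sample lines are invented.

```clojure
(require '[clojure.string :as str])

;; Stand-alone copy of the line parser; `parse-player` is a hypothetical name.
(defn parse-player
  [line]
  (let [[id lname fname bats throws team-id position] (str/split line #",")]
    {:id id :lname lname :fname fname :bats bats :throws throws
     :team-id team-id :position position}))

;; `players-from` is a hypothetical name for the map-based variant.
(defn players-from
  "Returns a lazy sequence of players, one per line, in the order given."
  [lines]
  (map (comp parse-player str/trim) lines))

;; Invented lines; the first carries a trailing carriage return on purpose.
(def sample-lines ["doej001,Doe,John,R,L,PIT,P\r"
                   "roej001,Roe,Jane,L,R,PIT,C"])

(map :lname (players-from sample-lines))    ;; => ("Doe" "Roe")
(map :position (players-from sample-lines)) ;; => ("P" "C")
```

Trimming before parsing matters because a stray `\r` would otherwise end up inside the `:position` value and break later equality checks.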

Now a function to read a file and return a list of players:

(defn load-players
  "Returns a roster of players loaded from file `file`."
  [file]
  (players-for-lines (clojure.string/split (slurp file) #"\n")))


(def pirates (load-players (str data-directory "PIT2013.ROS")))
(count pirates)


(first pirates)

Answer the question.

(def pirates-pitchers (filter #(= (:position %1) "P") pirates))
(pprint (last pirates-pitchers))

Answer some other questions.

What’s the ratio of left-handed to right-handed Pirates pitchers?

(def southpaws (filter #(= (:throws %1) "L") pirates-pitchers))
(def righties  (filter #(= (:throws %1) "R") pirates-pitchers))
(/ (count southpaws) (count righties))
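Note that dividing two integers in Clojure yields an exact ratio rather than a decimal. A minimal sketch with invented pitchers (the maps below are not real roster data) shows one way to count both groups in a single pass with `frequencies`; the names `sample-pitchers` and `by-hand` are hypothetical.

```clojure
;; Invented sample pitchers; only the :throws key matters here.
(def sample-pitchers
  [{:lname "A" :throws "L"}
   {:lname "B" :throws "R"}
   {:lname "C" :throws "R"}
   {:lname "D" :throws "R"}])

;; Count pitchers per throwing hand: {"L" 1, "R" 3} for the sample above.
(def by-hand (frequencies (map :throws sample-pitchers)))

;; Integer division gives an exact ratio.
(/ (get by-hand "L" 0) (get by-hand "R" 0)) ;; => 1/3

;; Wrap in `double` when a decimal is preferred.
(double (/ (get by-hand "L" 0) (get by-hand "R" 0)))
```

If a roster had no right-handed pitchers, the division would throw an `ArithmeticException`, so a real script may want to check for zero first.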


Does Francisco Liriano bat right or left?

(:bats (first (filter #(and (= (:fname %1) "Francisco")
                            (= (:lname %1) "Liriano")) pirates-pitchers)))
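If lookups by name come up often, the filter above could be wrapped in a small helper. This is only a sketch: `find-player` is a hypothetical name, and the sample players are invented rather than taken from the roster.

```clojure
;; `find-player` is a hypothetical helper, not part of the code above.
(defn find-player
  "Returns the first player in `players` with the given first and last name,
  or nil when there is no match."
  [players fname lname]
  (first (filter #(and (= (:fname %) fname)
                       (= (:lname %) lname))
                 players)))

;; Invented sample data.
(def sample-players
  [{:fname "John" :lname "Doe" :bats "R"}
   {:fname "Jane" :lname "Roe" :bats "L"}])

(:bats (find-player sample-players "Jane" "Roe")) ;; => "L"
(:bats (find-player sample-players "No" "Body"))  ;; => nil
```

Because keyword lookup on `nil` returns `nil`, a missing player produces `nil` instead of an exception.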


Load the teams.

We need a data structure to hold the team data.

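Team data is provided as CSV, one team per line, with four fields: team id, league, city, and name.
A line for the Pirates would look like PIT,N,Pittsburgh,Pirates.
How about a map with keys :id, :league, :city, and :name?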

Each line in the file represents one team, so write a function that takes a line as input and returns a team.

(defn team
  "Returns a team map parsed from `data`, one comma-separated line."
  [data]
  (let [[id league city name] (clojure.string/split data #",")]
    {:id id :league league :city city :name name}))

There are a number of teams in one file, one per line, so write a function that takes a list of lines and returns a map of teams keyed by team id.

(defn teams-for-lines
  "Returns a map of teams created from `lines`, indexed by id."
  [lines]
  (if (empty? lines)
    {}
    (let [this-team (team (clojure.string/trim (first lines)))
          this-team-id (:id this-team)]
      (conj (teams-for-lines (rest lines))
            (hash-map this-team-id this-team)))))
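The same index can be built without recursion by pouring id/team pairs into an empty map with `into`. The sketch below is an alternative under hypothetical names (`parse-team`, `teams-by-id`); the sample lines are written to match the four-field format and are illustrative rather than copied from the data file.

```clojure
(require '[clojure.string :as str])

;; Stand-alone copy of the team parser; `parse-team` is a hypothetical name.
(defn parse-team
  "Returns a team map parsed from one comma-separated line."
  [line]
  (let [[id league city name] (str/split line #",")]
    {:id id :league league :city city :name name}))

;; `teams-by-id` is a hypothetical name for the non-recursive variant.
(defn teams-by-id
  "Returns a map from team id to team map, built from `lines`."
  [lines]
  (into {}
        (map (fn [line]
               (let [team (parse-team (str/trim line))]
                 [(:id team) team]))
             lines)))

;; Illustrative sample lines in the four-field format.
(def sample-team-lines ["PIT,N,Pittsburgh,Pirates"
                        "BOS,A,Boston,Red Sox"])

(get (teams-by-id sample-team-lines) "PIT")
;; => {:id "PIT", :league "N", :city "Pittsburgh", :name "Pirates"}
```

Each two-element vector is treated as a key/value entry by `into`, so a later line with a repeated id would simply overwrite the earlier one.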

Now a function to read a file, break it into lines, and pass them to teams-for-lines:

(defn load-teams
  "Returns teams loaded from file `file`."
  [file]
  (teams-for-lines (clojure.string/split (slurp file) #"\n")))


(def teams (load-teams "TEAMS2013"))
(keys teams)

("COL" "NYN" "MIA" "BOS" "TEX" "CIN" "CHN" "KCA" "WAS" "BAL" "HOU" "SEA" "MIL" "PHI" "MIN" "TBA" "DET" "ANA" "SLN" "NYA" "TOR" "ARI" "OAK" "LAN" "ATL" "SFN" "PIT" "CHA" "CLE" "SDN")

(get teams "PIT")

{:id "PIT", :league "N", :city "Pittsburgh", :name "Pirates"}

Assign players to a Team’s roster.

(defn load-roster
  "Returns the players in `players` whose :team-id matches the id of `team`."
  [team players]
  (filter #(= (:team-id %1) (:id team)) players))