jmastr/gokogiri

Gokogiri

LibXML bindings for the Go programming language.

The gokogiri package provides a Go interface to the libxml2 library.

It is inspired by the Ruby-based Nokogiri API and lets you parse, manipulate, and create HTML and XML documents. Nodes can be selected using either CSS selectors (in much the same fashion as jQuery) or XPath 1.0 expressions, and a simple DOM-like interface allows for building up documents from scratch.

Its default parsing options ignore errors and warnings, making it suitable for the poorly formed 'tag soup' often found on the web. For standards-compliant behaviour, the xml.StrictParsingOption is provided.

This fork incorporates the changes required to compile with Go 1.4, libxml 2.13, and oniguruma 6 and above.

To install:

  • sudo apt-get install libxml2-dev libonig-dev
  • go get github.com/jmastr/gokogiri

To run the tests:

  • go test github.com/jmastr/gokogiri/html
  • go test github.com/jmastr/gokogiri/xml

Basic example:

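package main

import (
  "io"
  "log"
  "net/http"

  "github.com/jmastr/gokogiri"
)

func main() {
  // fetch and read a web page
  resp, err := http.Get("http://www.google.com")
  if err != nil {
    log.Fatal(err)
  }
  defer resp.Body.Close()

  page, err := io.ReadAll(resp.Body)
  if err != nil {
    log.Fatal(err)
  }

  // parse the web page
  doc, err := gokogiri.ParseHtml(page)
  if err != nil {
    log.Fatal(err)
  }
  // release the memory held by the underlying libxml2 document
  defer doc.Free()

  // perform operations on the parsed page -- consult the tests for examples
}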

Original upstream version by Zhigang Chen and Hampton Catlin.
