hpricot
#66 ([PATCH] for RakeFile: rlcodegen name changed in ragel v5.18) on Hpricot
The latest version of ragel (5.18) has removed rlcodegen and split the functionality into four differently named programs. I've included a patch for the RakeFile which checks the ragel version number…
Hpricot Documentation
Hpricot is a fast, flexible HTML parser written in C. It's designed to be very accommodating (like Tanaka Akira's HTree) and to have a very helpful library (like some JavaScript libs: JQuery, Prototy…
scRUBYt!
Mechanize and Hpricot on Steroids. scRUBYt! is a simple to learn and use, yet powerful web scraping toolkit written in Ruby. The idea behind making scRUBYt! was to show a few simple concepts of Web ex…
Scrapes
Scrapes is a framework for crawling and scraping multi-page web sites. Unlike other scraping frameworks, Scrapes is designed to work with “dirty” web sites. That is, web sites that were not designed t…
Ragel State Machine Compiler
Ragel compiles finite state machines from regular languages into executable C, C++, Objective-C, or D code. Ragel state machines can not only recognize byte sequences as regular expression machines do…
Hpricot Roadmap
Hpricot is a very flexible HTML parser, based on Tanaka Akira's HTree, but with the scanner recoded in C. I've also borrowed a number of ideas from JQuery to make traversing and altering the HTML a lot…