Programming + Academic + Something Interesting

[Node.js] Which libraries can we use to crawl a website: cheerio

I’ll use two libraries: “request” and “cheerio”.

npm install request
npm install cheerio

“request” fetches the data from a target URL.
“cheerio” then parses the retrieved HTML so that we can query it through a jQuery-like DOM API.

A complete sample that fetches a page and prints its URL and title follows.

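#!/usr/bin/env node

var request = require("request");
var cheerio = require("cheerio");

var request_url = "http://www.google.com";

// Fetch the page, then parse the returned HTML with cheerio.
request({url: request_url}, function(error, response, body)
{
  if (error) {
    // On a network-level failure there is no response object to inspect.
    console.error(error);
    return;
  }

  if (response.statusCode == 200) {
    // Declare $ locally so it does not leak into the global scope.
    var $ = cheerio.load(body);

    var url = response.request.href;
    var title = $("title").text();

    console.log(url);
    console.log(title);
  } else {
    // Any non-200 status: just report the code.
    console.log(response.statusCode);
  }
});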
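
A crawler usually goes on to follow the links it finds, and href values on a page are often relative. The sketch below is not part of the original sample: resolveLinks is a hypothetical helper, and the URLs are placeholders. It assumes the page URL (for example the value printed by the sample above) is used as the base, and relies only on the URL class built into Node.js.

```javascript
// Resolve hrefs found on a page against that page's URL.
// resolveLinks is a hypothetical helper, not part of request or cheerio.
function resolveLinks(pageUrl, hrefs) {
  var resolved = [];
  hrefs.forEach(function(href) {
    try {
      // new URL(relative, base) produces an absolute URL.
      var u = new URL(href, pageUrl);
      // Keep only links a web crawler can actually fetch.
      if (u.protocol === "http:" || u.protocol === "https:") {
        resolved.push(u.href);
      }
    } catch (e) {
      // Skip hrefs that cannot be parsed as a URL.
    }
  });
  return resolved;
}

// Placeholder inputs for illustration only.
var absolute = resolveLinks("http://example.com/docs/", [
  "about.html",
  "/search",
  "mailto:someone@example.com",
  "https://example.org/page"
]);

console.log(absolute);
```

The resulting absolute URLs could then be passed back to “request” one at a time; deduplication and rate limiting are left out of this sketch.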
