Speed up Jekyll related posts functionality (--lsi, classifier-reborn, gsl, nmatrix, narray, Numo::NArray, Numo::GSL) #83

@0xdevalias

(See also: #1)

Jekyll can "create an index for related posts" using the --lsi build command option, which uses the classifier-reborn gem to create a site variable of related posts:

More info on Jekyll's usage of LSI:

The gsl gem can make use of nmatrix and narray:

narray is in maintenance mode, and directs to numo-narray:

numo-gsl provides a GSL interface for Ruby with Numo::NArray:

I'm unsure whether the numo gems can be used with classifier-reborn, or which of nmatrix and narray is faster, so I opened jekyll/classifier-reborn#192 to ask.

As noted in jekyll/classifier-reborn#193, I'm not sure whether classifier-reborn is actively updated or maintained.


nmatrix was last updated in 2018, and at least one issue claims that Numo::NArray outperforms NMatrix:

Several years have passed since the new version of NArray came out.

It appeared that NMatrix was not being maintained well.
And I think Numo::NArray now outperforms NMatrix in almost every way. (benchmark needed)

Newcomers try NMatrix first. After a while, they notice that NArray is far better in performance.
And they begin to make libraries dependent on NArray.


rb-gsl was last updated in 2017, and claims compatibility only with GSL versions up to v2.1:

Ruby/GSL is compatible with GSL versions upto 2.1.

I've asked whether it is still maintained, but my guess is that it probably is not.


My comment in reply to the following StackOverflow question:

The `--lsi` option comes from the [`classifier-reborn`][1] gem, which includes the following note about increasing speed under the [dependencies][2] heading:

> To speed up LSI classification by at least 10x consider installing
> following libraries.
> 
> [GSL - GNU Scientific Library][3]
>
> [Ruby/GSL Gem][4]
> 
> Note that LSI will work without these libraries, but as soon as they
> are installed, classifier will make use of them. No configuration
> changes are needed, we like to keep things ridiculously easy for you.

The [`gsl` gem's installation docs][5] mention:

> the GSL libraries must already be installed before Ruby/GSL can be installed:
>
> - Debian/Ubuntu: `libgsl0-dev`
> - Fedora/SuSE: `gsl-devel`
> - Gentoo: `sci-libs/gsl`
> - OS X: `brew install gsl`

The [`gsl` gem can also make use of `nmatrix` or `narray`][6], which I believe may further increase the speed/efficiency:

> In order to use rb-gsl with NMatrix you must first set the NMATRIX
> environment variable and then install rb-gsl:
> - `gem install nmatrix`
> - `export NMATRIX=1`
> - `gem install rb-gsl`
> 
> This will compile rb-gsl with NMatrix specific functions.
> 
> For using rb-gsl with NArray:
> - `gem install narray`
> - `export NARRAY=1`
> - `gem install rb-gsl`
> 
> Note that setting both `NMATRIX` and `NARRAY` variables will lead to
> undefined behaviour. Only one can be used at a time.
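
Because setting both variables leads to undefined behaviour, a small pre-install check can catch the mistake before `gem install rb-gsl` runs. This is a sketch under stated assumptions: the helper name `rb_gsl_backend` is hypothetical, and it treats any non-empty value as "set", which may not match exactly how the gem's build script reads the variables.

```ruby
# Hypothetical pre-install check (not part of rb-gsl).
# Reports which array backend the NMATRIX / NARRAY environment variables
# select, and refuses the combination the docs warn about.
# `env` defaults to the real environment but can be any Hash-like object.
def rb_gsl_backend(env = ENV)
  # Assumption: any non-empty value counts as "set".
  set = ->(key) { !env[key].to_s.empty? }
  nmatrix = set.call("NMATRIX")
  narray  = set.call("NARRAY")

  if nmatrix && narray
    raise ArgumentError, "NMATRIX and NARRAY are both set; use only one"
  end

  return :nmatrix if nmatrix
  return :narray if narray
  :none
end

puts rb_gsl_backend({ "NARRAY" => "1" }) # prints "narray"
```

Calling `rb_gsl_backend` with no argument inspects the current shell environment instead of the example hash.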

I'm not sure whether `nmatrix` or `narray` is the better/faster choice, though I did open `https://github.com/jekyll/classifier-reborn/issues/192` on the `classifier-reborn` repo.
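
Until there is an answer, the question could be settled empirically with a small timing harness. The sketch below uses only Ruby's standard `benchmark` library; the helper name `compare_timings` and the two workloads are placeholders I made up for illustration. In a real comparison, each lambda would wrap the same LSI workload run against a differently compiled `rb-gsl`.

```ruby
require "benchmark"

# Hypothetical harness: runs each labelled callable `runs` times and
# returns a Hash of label => best wall-clock time in seconds.
def compare_timings(candidates, runs = 3)
  candidates.transform_values do |work|
    Array.new(runs) { Benchmark.realtime { work.call } }.min
  end
end

# Placeholder workloads (assumptions for illustration only); replace them
# with the real work being compared, e.g. building the same LSI index
# under each backend.
results = compare_timings({
  "candidate A" => -> { (1..50_000).sum { |i| i * i } },
  "candidate B" => -> { (1..50_000).reduce(0) { |acc, i| acc + i * i } }
})

# Print the fastest candidate first.
results.sort_by { |_label, seconds| seconds }.each do |label, seconds|
  puts format("%-12s %.6fs", label, seconds)
end
```

Taking the minimum over several runs reduces noise from other processes, though a build-sized workload would give more meaningful numbers than these toy loops.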

I did notice that the old [narray GitHub repo][7] mentions that the package is no longer maintained, and instead links to a new version: [Ruby/Numo::NArray][8]

> Numo::NArray is a Numerical N-dimensional Array class for fast processing and easy manipulation of multi-dimensional numerical data, similar to numpy.ndarray. This project is the successor to Ruby/NArray.

Numo::NArray also links to [`numo-gsl`][9], which appears to provide the corresponding GSL bindings:

> GSL interface for Ruby/Numo::NArray

At this stage I'm not sure whether `classifier-reborn` can make use of any of these numo dependencies, but if it can, my guess is that they will be faster and more actively maintained.

  [1]: https://jekyll.github.io/classifier-reborn/
  [2]: https://jekyll.github.io/classifier-reborn/#dependencies
  [3]: http://www.gnu.org/software/gsl
  [4]: https://rubygems.org/gems/gsl
  [5]: https://github.com/SciRuby/rb-gsl#installation
  [6]: https://github.com/SciRuby/rb-gsl#nmatrix-and-narray-usage
  [7]: https://github.com/masa16/narray#new-version-is-under-development---rubynumonarray
  [8]: https://github.com/ruby-numo/narray
  [9]: https://github.com/ruby-numo/numo-gsl
