Introduction

A Hopf, Skip and a Jump is a musicological tool which finds note onset times and frequencies in music.

This document covers the following components:

DetectorBank: a class for running many note-onset detectors in parallel;
slidingbuffer::SlidingBuffer: a template class which can call upon a data producer to generate data on demand caching its recent history thus permitting backtracking; and
detectorcache::DetectorCache: a derived class of SlidingBuffer which is tailored to the detection of note-onsets by permitting pseudo-2D operation (channel number and sample number).

DetectorBank

Suppose we have an 1d (numpy) array of input samples. Because the background of DetectorBank is audio analysis, the type of the input array is inputSample_t (as distributed: float). We want to run a plethora of similar tasks ("detectors") each with possibly different parameters on the input samples either in part or in their entirety. The output of each task is independent, and is written to a given 2d array of type discriminator_t (complex<double>) when it is supplied to the method getZ(). The detector object responsible for applying the discriminator will maintain the discriminator's states between calls, permitting piece-wise or real-time usage.

The work is divided as equally as possible among a given number of threads. Repeated calls cause results to be generated from subsequent input values. So if there are 2 discriminators, and results are to be written on the first call into a 4 x 2 array the subsequently into a 3 x 2 array:

Behaviour of a DetectorBank on successive calls

and the next call will start processing at the 7th sample.

As well as writing into the given output array, getZ() returns the number of input samples actually passed to the detectors. It will not attempt to read pass the end of the input array, so the number of samples converted may be less than the number implicitly requested by supplying an output array of given size, and will be 0 once all of the input samples have been processed.

A new input array may be supplied at any time using the setInputBuffer method. Doing so causes the next call to getZ to start from the beginning of the new input array.

See also: DetectorBank

SlidingBuffer

We provide a generic method of continually holding in memory the recent history of data read from a stream. For example, if a large number of pitches are being detected by a DetectorBank object in order to analyse note onsets in a recording lasting even a few minutes, the amount of system resource consumed to hold the (complex) output of each detector in addition to the audio input becomes infeasibly large for even modestly-sized input files.

In almost all cases, it is sufficient to cache only the recent samples emerging from the analysis, which reduced the total storage, including temporary storage required by the DetectorBank to produce the results, substantially. Using a SlidingBuffer, it becomes possible to read an entire audio sample file into memory, but retain only recenlty used analysis results. Hence one an onset is detected, it is possible to back-track for a time through recent output samples without the requirement to store the entire history of the detector output. As the samples in the sliding buffer exceed a specified age, they are silently discarded.

The SlidingBuffer class exposes an interface which accesses historical entities though the index operator [].

Greater detail on the internal workings of the SlidingBuffer can be found here.

See also: slidingbuffer::SlidingBuffer

DetectorCache

slidingbuffer::SlidingBuffer generically caches the recent history of a (one-dimensional) stream, but DetectorBank objects produce several concurrent output streams from a single audio input. A DetectorCache is a specialisation of a SlidingBuffer which is accessed through both its channel number and sample number.

The result history of a single detector channel may be copied into a contiguous block of memory, regardless of the underlying implementation within SlidingBuffer, provided that sufficient history of the channel output is available.

See also: detectorcache::DetectorCache

Installation

On a Debian system, the following packages should be installed:

python3-dev
swig (to build python3 bindings)
build-essential
autoconf-archive
libcereal-dev
librapidxml-dev
libfftw3-dev
doxygen (for the documentation)
python (for the documentation)
graphviz (for enhanced documentation)

Invoke setup.py as follows:

mkdir -p build
cd build
../configure
make
# Optionally build and run the unit tests
make check
# The results of the checks are written in test/test-suite.log
sudo make install

Full documentation in LaTeX and HTML are generated at build time. PDF format can by obtained by running

make doxygen-docs

Python Usage

Please see Using the DetectorBank in Python for a full description of calling the DetectorBank from Python.

Some short Python examples can be found here.

Customisation

As distributed, the package prints out debugging messages. These can be disabled by altering setup.py or the Makefile depending on which method you are using for building.

To you ''have'' to change the type of the input or output data, edit the typedefs in detectorbank.h to reflect the desired types, and the apply statements in detectorbank.i to make sure the SWIG binding still works, then rebuild.

You'll also have to persuade the build system to link the fftw library with a resolution appropriate to new input sample precision (used in the FFT version of the hilbert transform). Good luck with that.