The Arborist

We offer accelerated, scalable decision-tree solutions, either in the cloud or in the user’s own environment. Our implementation of the Random Forest® algorithm is accelerated on both multicore and GPU hardware. Rapidics employs a data-driven approach which, in a nutshell, maps each problem to the hardware best suited to the characteristics of the data at hand.

Our debut product, the Arborist, is available through Nimbix, the high-performance cloud provider. The Arborist is tuned for training on data sets with large row counts, especially those with thousands to hundreds of millions of rows. Training time scales very nearly linearly with row count.

The Arborist can be invoked from the R statistical language through a standard package interface we provide. The implementation is highly versatile and extends easily to other numerical-language front ends, such as Python. The Arborist is also available in library form, embeddable within users’ own custom solutions. Finally, the Arborist can itself be customized to user specification, including, for example, support for proprietary data types and specialized decision methods.

The Arborist supports both regression and classification trees, with both numerical and factor data. There is no limit on the number of categories in either the response or the predictors. In addition to the standard options, the Arborist offers user-specified predictor and sample weighting. Quantile regression is an integral part of the software and can be enabled by option during training and prediction.
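To make the weighting and quantile options above more concrete, here is a minimal Python sketch of the general ideas. It is not the Arborist's interface or implementation: the function names (weighted_bootstrap, sample_predictors, leaf_quantile) are hypothetical, and the nearest-rank quantile rule is an assumption chosen for simplicity.

```python
import math
import random
from typing import List, Sequence


def weighted_bootstrap(n_rows: int, weights: Sequence[float],
                       rng: random.Random) -> List[int]:
    """Draw n_rows row indices with replacement, in proportion to weights.

    Illustrates sample weighting: heavier rows appear more often in the
    sample used to train a single tree.
    """
    if len(weights) != n_rows:
        raise ValueError("one weight per row is required")
    return rng.choices(range(n_rows), weights=weights, k=n_rows)


def sample_predictors(pred_weights: Sequence[float], k: int,
                      rng: random.Random) -> List[int]:
    """Pick up to k distinct candidate predictors, favoring heavier weights.

    Illustrates predictor weighting: at each split, only a weighted random
    subset of predictors is considered.
    """
    remaining = list(range(len(pred_weights)))
    chosen: List[int] = []
    while remaining and len(chosen) < k:
        w = [pred_weights[i] for i in remaining]
        if sum(w) <= 0:  # only zero-weight predictors are left
            break
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def leaf_quantile(responses: Sequence[float], q: float) -> float:
    """Nearest-rank empirical quantile of the training responses in a leaf.

    Illustrates quantile regression: instead of the leaf mean, report a
    quantile of the responses that landed in the same leaf.
    """
    if not responses:
        raise ValueError("leaf has no responses")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    ordered = sorted(responses)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    rng = random.Random(0)
    print(weighted_bootstrap(5, [1.0, 1.0, 5.0, 1.0, 1.0], rng))
    print(sample_predictors([1.0, 0.0, 1.0], 2, rng))
    print(leaf_quantile([3.0, 1.0, 2.0, 4.0], 0.5))
```

Setting a row's or a predictor's weight to zero excludes it from sampling entirely, which is one common reason to expose such weights as user options.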

We are eager to make the Arborist available in the way best suited to your needs. Please contact us to learn how we can help.

Random Forest® is a registered trademark of Leo Breiman and Adele Cutler.