High-Performance Computing at the NIH

Random Jungle on Helix

Random Jungle is a fast implementation of Random Forests (TM) for high-dimensional data. Random Forests, the underlying method developed by Leo Breiman and Adele Cutler, is a powerful machine learning technique. In genetics, Random Jungle can be used to analyse large Genome Wide Association (GWA) data sets. Its most interesting features are variable selection, missing value imputation, classifier creation, generalization error estimation, and sample proximities between pairs of cases.

Random Jungle website.

Usage

Type 'rjungle' or 'rjunglesparse' at the command line to run the programs. Adding the '-h' switch prints a list of the available options.
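
As a minimal sketch of the step above, the following shell snippet prints the option list for each program. The helper name 'show_help' is hypothetical and used only for illustration; the only switch assumed is '-h', as described above. The check with 'command -v' is there so the snippet fails gently on a machine where the programs are not on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: print the option list of a program if it is installed.
show_help() {
    if command -v "$1" >/dev/null 2>&1; then
        # '-h' is the help switch mentioned in the Usage text.
        "$1" -h
    else
        echo "$1: not found on PATH" >&2
        return 1
    fi
}

# Show the available options for both Random Jungle programs.
# '|| true' keeps the script going if a program is missing.
show_help rjungle || true
show_help rjunglesparse || true
```

On Helix, both calls should print the programs' own help text; elsewhere, the snippet only reports that the programs were not found.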

For large numbers of rjungle jobs (more than 3 simultaneous jobs), use Random Jungle on the Biowulf cluster.

Documentation

Random Jungle documentation