H2O Neural Network

H2O Deep Learning is based on a multi-layer feedforward artificial neural network that is trained with stochastic gradient descent using back-propagation.

Input

It takes a DataFrame as input.

Type

ml-estimator

Class

fire.nodes.h2o.NodeH2ONeuralNetwork

Fields

Name

Title

Description

isResponseIsCategorical

isResponseColIsCategorical

Specifies the response column type (numeric or categorical). This determines whether the model performs classification or regression.

labelCol

Label Column

Response variable column.

ignoredCols

Ignored Columns

Features to be ignored for Modelling

splitRatio

Split Ratio

Ratio in which the input data is split for training and evaluation.

columnsToCategorical

Columns to Categorical

Columns to be encoded as categorical.

seed

Seed

Seed for pseudo random number generator (if applicable).

balanceClasses

Balance Classes

Balance training data class counts via over/under-sampling (for imbalanced data).

maxAfterBalanceSize

Max After Balance Size

Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.

activation

Activation

Activation function.

hidden

Hidden

Specify the hidden layer sizes (value must be positive)

epochs

Epochs

How many times the dataset should be iterated (streamed); can be fractional.

nfolds

Number of Folds

Number of folds for K-fold cross-validation (0 to disable or >= 2).

trainSamplesPerIteration

Train Samples Per Iteration

Number of training samples (globally) per MapReduce iteration. Special values are 0: one epoch, -1: all available data (e.g., replicated training data), -2: automatic.

targetRatioCommToComp

Target ratio comm to comp

Target ratio of communication overhead to computation. Only for multi-node operation and train_samples_per_iteration = -2 (auto-tuning).

categoricalEncoding

Categorical Encoding

Specify one of the various encoding schemes for handling categorical features

ignoreConstCols

Ignore Const Columns

Ignore constant columns.

scoreEachIteration

Score Each Iteration

Whether to score during each iteration of model training.

stoppingRounds

Stopping Rounds

Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable).

maxRuntimeSecs

Max Runtime Secs

Maximum allowed runtime in seconds for model training (0 to disable).

stoppingMetric

StoppingMetric

Metric to use for early stopping (AUTO: logloss for classification, deviance for regression)

stoppingTolerance

StoppingTolerance

Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)

standardize

Standardize

If enabled, automatically standardize the data. If disabled, the user must provide properly scaled input data.

loss

Loss

Loss function.

adaptiveRate

Adaptive Rate

Adaptive learning rate.

rho

Rho

Adaptive learning rate time decay factor (similarity to prior updates).
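
The early-stopping rule described for stoppingRounds and stoppingTolerance can be illustrated in plain Python. This is a minimal sketch, not H2O's implementation: the function name should_stop, the lower_is_better flag, and the exact comparison against a reference window are assumptions made for illustration, and it assumes a positive metric such as logloss.

```python
def should_stop(history, stopping_rounds, stopping_tolerance, lower_is_better=True):
    """Return True when the moving average of the metric has stopped improving.

    history: metric values, one per scoring event, oldest first.
    (Hypothetical helper; names and details are illustrative assumptions.)
    """
    k = stopping_rounds
    if k <= 0:
        return False  # 0 disables early stopping
    if len(history) < 2 * k:
        return False  # not enough scoring events for k averages plus a reference

    # Simple moving averages of length k, one per window end position.
    averages = [sum(history[i - k:i]) / k for i in range(k, len(history) + 1)]
    reference = averages[-(k + 1)]  # moving average from k scoring events ago
    recent = averages[-k:]          # the k most recent moving averages

    if lower_is_better:
        # Stop unless the best recent average beats the reference by the tolerance.
        return min(recent) > reference * (1 - stopping_tolerance)
    return max(recent) < reference * (1 + stopping_tolerance)


# Example: logloss flattens out, so training would stop with k = 2.
flat = [1.0, 0.8, 0.6, 0.5, 0.45, 0.44, 0.44, 0.44, 0.44, 0.44]
improving = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]
stop_flat = should_stop(flat, stopping_rounds=2, stopping_tolerance=0.01)
stop_improving = should_stop(improving, stopping_rounds=2, stopping_tolerance=0.01)
```

A larger stoppingRounds smooths out noisy scoring events at the cost of training longer before the rule can trigger.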

advanced

Advanced

epsilon

Epsilon

Adaptive learning rate smoothing factor (to avoid divisions by zero and allow progress).

rate

Rate

Learning rate (higher => less stable, lower => slower convergence).

rateAnnealing

Rate Annealing

Learning rate annealing: rate / (1 + rate_annealing * samples).

rateDecay

Rate Decay

Learning rate decay factor between layers (N-th layer: rate * rate_decay ^ (N - 1)).

momentumStart

Momentum Start

Specify the initial momentum at the beginning of training; we suggest 0.5 (Applicable only if adaptive_rate is disabled)

momentumRamp

Momentum Ramp

Number of training samples for which momentum increases.

momentumStable

Momentum Stable

Final momentum after the ramp is over (try 0.99).

nesterovAcceleratedGradient

Nesterov Accelerated Gradient

Use Nesterov accelerated gradient (recommended).

inputDropoutRatio

Input Dropout Ratio

Input layer dropout ratio (can improve generalization, try 0.1 or 0.2).

inputDropoutRatio

Hidden Dropout Ratios

Hidden layer dropout ratios (can improve generalization), specify one value per hidden layer, defaults to 0.5.

l1

L1

L1 regularization (can add stability and improve generalization, causes many weights to become 0).

l2

L2

L2 regularization (can add stability and improve generalization, causes many weights to be small).

maxW2

Max W2

Constraint for squared sum of incoming weights per unit (e.g. for Rectifier).

initialWeightDistribution

Initial Weight Distribution

Initial weight distribution.

initialWeightScale

Initial Weight Scale

Uniform: -value…value, Normal: stddev.

scoreInterval

Score interval

Shortest time interval (in seconds) between model scoring.

scoreTrainingSamples

Score Training Samples

Number of training set samples for scoring (0 for all).

scoreValidationSamples

Score Validation Samples

Number of validation set samples for scoring (0 for all).

scoreDutyCycle

Score Duty Cycle

Maximum duty cycle fraction for scoring (lower: more training, higher: more scoring).

classificationStop

Classification Stop

Stopping criterion for classification error fraction on training data (-1 to disable).

regressionStop

Regression Stop

Stopping criterion for regression error (MSE) on training data (-1 to disable).

quietMode

Quiet mode

Enable quiet mode for less output to standard output.

scoreValidationSampling

Score Validation Sampling

Method used to sample validation dataset for scoring.

overwriteWithBestModel

Overwrite With Best Model

If enabled, override the final model with the best model found during training.

useAllFactorLevels

Use All Factor Levels

Use all factor levels of categorical variables. Otherwise, the first factor level is omitted (without loss of accuracy). Useful for variable importances and auto-enabled for autoencoder.

diagnostics

Diagnostics

Enable diagnostics for hidden layers.

calculateFeatureImportances

Calculate Feature Importances

Compute variable importances for input features (Gedeon method) - can be slow for large networks.

fastMode

Fast Mode

Enable fast mode (minor approximation in back-propagation).

forceLoadBalance

Force Load Balance

Force extra load balancing to increase training speed for small datasets (to keep all cores busy).

replicateTrainingData

Replicate Training Data

Replicate the entire training dataset onto every node for faster training on small datasets.

singleNodeMode

Single Node Mode

Run on a single node for fine-tuning of model parameters.

shuffleTrainingData

Shuffle Training Data

Enable shuffling of training data (recommended if training data is replicated and train_samples_per_iteration is close to #nodes x #rows, or if using balance_classes).

missingValuesHandling

Missing Values Handling

Handling of missing values. Either MeanImputation or Skip.

sparse

Sparse

Sparse data handling (more efficient for data with lots of 0 values).

averageActivation

Average Activation

Average activation for sparse auto-encoder. #Experimental.

sparsityBeta

Sparsity Beta

Sparsity regularization. #Experimental.

maxCategoricalFeatures

Max Categorical Features

Max. number of categorical features, enforced via hashing. #Experimental.

reproducible

Reproducible

Force reproducibility on small data (will be slow - only uses 1 thread).

exportWeightsAndBiases

Export Weights And Biases

Whether to export Neural Network weights and biases to H2O Frames.

miniBatchSize

Mini Batch Size

Mini-batch size (smaller leads to better fit, larger can speed up and generalize better).

elasticAveraging

Elastic Averaging

Elastic averaging between compute nodes can improve distributed model convergence. #Experimental.

elasticAveragingMovingRate

Elastic Averaging Moving Rate

Elastic averaging moving rate (only if elastic averaging is enabled).

elasticAveragingRegularization

Elastic Averaging Regularization

Elastic averaging regularization strength (only if elastic averaging is enabled).

keepCrossValidationModels

Keep Cross Validation Models

Whether to keep the cross-validated models. Keeping cross-validation models may consume significantly more memory in the H2O cluster.

keepCrossValidationPredictions

Keep Cross Validation Predictions

Whether to keep the predictions of the cross-validation models. This needs to be set to TRUE if running the same AutoML object for repeated runs because CV predictions are required to build additional Stacked Ensemble models in AutoML.

keepCrossValidationFoldAssignment

Keep Cross Validation Fold Assignment

Whether to keep cross-validation assignments.

distribution

Distribution

Distribution function.

tweediePower

Tweedie Power

Tweedie power for Tweedie regression, must be between 1 and 2.

quantileAlpha

Quantile Alpha

Desired quantile for Quantile regression, must be between 0 and 1.

huberAlpha

Huber Alpha

Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).

weightCol

Weight Column

Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.

offsetCol

Offset Column

Offset column. This will be added to the combination of columns before applying the link function.

foldCol

Fold Column

Column with cross-validation fold index assignment per observation.

foldAssignment

Fold Assignment

Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.

aucType

AUC Type

Set default multinomial AUC type.
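
The formulas given for rateAnnealing and rateDecay, together with the momentum fields (momentumStart, momentumRamp, momentumStable), can be worked through in a few lines of Python. This is only a sketch of the arithmetic: the function names are hypothetical, and the linear shape of the momentum ramp is an assumption rather than something stated in the field descriptions.

```python
def annealed_rate(rate, rate_annealing, samples):
    """Learning rate after `samples` training samples: rate / (1 + rate_annealing * samples)."""
    return rate / (1.0 + rate_annealing * samples)


def layer_rate(rate, rate_decay, n):
    """Learning rate of the n-th layer (1-based): rate * rate_decay ^ (n - 1)."""
    return rate * rate_decay ** (n - 1)


def momentum_at(samples, momentum_start, momentum_ramp, momentum_stable):
    """Momentum after `samples` training samples.

    Assumption: momentum grows linearly from momentum_start to
    momentum_stable over momentum_ramp samples, then stays constant.
    """
    if momentum_ramp <= 0 or samples >= momentum_ramp:
        return momentum_stable
    fraction = samples / momentum_ramp
    return momentum_start + (momentum_stable - momentum_start) * fraction


# With rate_annealing = 1e-6 the rate is halved after one million samples.
halved = annealed_rate(0.005, 1e-6, 1_000_000)
# With rate_decay = 0.5 the third layer trains at a quarter of the base rate.
third_layer = layer_rate(0.01, 0.5, 3)
# Halfway through the ramp, momentum sits midway between start and stable.
midway = momentum_at(500_000, 0.5, 1_000_000, 0.99)
```

As the momentumStart description notes, the momentum settings only apply when the adaptive rate is disabled.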

confusionMatrix

Confusion Matrix

output_confusion_matrix_chart

Output Confusion Matrix Chart

Whether to display the confusion matrix chart.

cm_chart_title

Confusion Matrix Chart Title

Title name to display in Confusion Matrix Chart

cm_chart_description

Confusion Matrix Chart Description

Description to display in Confusion Matrix Chart

confusionMatrixTargetLegend

Confusion Matrix Target Legend

Legend name to display for Target in Confusion Matrix

confusionMatrixPredictedLabelLegend

Confusion Matrix PredictedLabel Legend

Legend name to display for Predicted Label in Confusion Matrix
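
For readers unfamiliar with the chart, the following sketch shows how a confusion matrix relates the target and predicted label legends. It is plain Python for illustration only; the function name confusion_matrix and the row/column orientation (rows for target, columns for predicted label) are assumptions, not a description of how the node builds its chart.

```python
from collections import Counter


def confusion_matrix(targets, predictions):
    """Count (target, predicted) pairs.

    Returns the sorted label list and a matrix where rows are target
    labels and columns are predicted labels (an illustrative convention).
    """
    labels = sorted(set(targets) | set(predictions))
    counts = Counter(zip(targets, predictions))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


targets = ["cat", "dog", "dog", "cat", "dog"]
predictions = ["cat", "dog", "cat", "cat", "dog"]
labels, matrix = confusion_matrix(targets, predictions)
# labels == ["cat", "dog"]
# matrix == [[2, 0], [1, 2]]: one "dog" row was predicted as "cat"
```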

ROC Curve

ROC Curve

output_roc_curve

Output ROC Curve

Whether to display the ROC curve chart.

roc_title

ROC Curve Chart Title

Title name to display in ROC Curve Chart

roc_description

ROC Curve Chart Description

Add Description for ROC Curve Chart

xlabel

X Label

Label for the X axis of the ROC curve chart.

ylabel

Y Label

Label for the Y axis of the ROC curve chart.
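
As background for the ROC curve options, the sketch below computes the points of an ROC curve (false positive rate on the X axis, true positive rate on the Y axis) and the area under it for a binary problem. The function names are hypothetical, the code assumes both classes are present in the labels, and it is not how the node itself computes the chart.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    labels: 0/1 ground truth; scores: predicted probability of class 1.
    Assumes at least one positive and one negative label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        # Emit a point only after the last row of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((false_pos / negatives, true_pos / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)  # 8 of the 9 positive/negative pairs are ranked correctly
```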

Grid Search

Grid Search

paramKeys

Param Name

Parameter names, e.g. l1, hidden

paramValues

Param Value

Enter comma-separated values, e.g. 0, 1e-5 or 50,50;100,100

gridStrategy

Grid Search Strategy

Strategy to use for model hyperparameter search. Cartesian does exhaustive search; RandomDiscrete searches randomly within given time or model limits.

gridMaxModels

Grid Max Models

Maximum number of models to build in the grid search (0 for unlimited).

gridMaxRuntimeSecs

Grid Max Runtime Seconds

Maximum runtime in seconds for the grid search (0 for unlimited).

gridStoppingRounds

Grid Stopping Rounds

Early stopping based on convergence of the metric during grid search (0 to disable).

gridStoppingTolerance

Grid Stopping Tolerance

Tolerance for metric-based stopping criterion during grid search.

gridStoppingMetric

Grid Stopping Metric

Metric to use for early stopping during grid search (AUTO: logloss for classification, deviance for regression).

gridParallelism

Grid Parallelism

Level of parallelism to use when building models in the grid.

gridSelectBestModelBy

Grid Select Best Model By

Metric used to select the best model from the grid.
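
The difference between the Cartesian and RandomDiscrete strategies can be sketched in plain Python. This is an illustration under assumptions: the function names are hypothetical, the candidate values are passed in already parsed (the exact parsing of paramKeys and paramValues is not shown), and only the model limit of RandomDiscrete is modelled, not the runtime limit.

```python
import itertools
import random


def cartesian_grid(hyper_params):
    """Every combination of the candidate values (exhaustive search)."""
    keys = list(hyper_params)
    value_lists = [hyper_params[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


def random_discrete_grid(hyper_params, max_models, seed):
    """Combinations in random order, capped at max_models (0 for unlimited)."""
    grid = cartesian_grid(hyper_params)
    random.Random(seed).shuffle(grid)
    return grid if max_models <= 0 else grid[:max_models]


# Candidate values matching the examples in the field descriptions.
hyper_params = {
    "l1": [0.0, 1e-5],
    "hidden": [[50, 50], [100, 100]],
}
all_models = cartesian_grid(hyper_params)              # 2 x 2 = 4 combinations
some_models = random_discrete_grid(hyper_params, 3, seed=42)
```

Cartesian search grows multiplicatively with each added parameter, which is why a model or runtime limit is usually paired with RandomDiscrete on larger grids.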

Details

H2O’s Deep Learning is based on a multi-layer feedforward artificial neural network that is trained with stochastic gradient descent using back-propagation. The network can contain a large number of hidden layers consisting of neurons with tanh, rectifier, and maxout activation functions.

More details are available at: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/deep-learning.html
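
To make the description above concrete, here is a minimal pure-Python sketch of a forward pass through a multi-layer feedforward network using the tanh, rectifier, and maxout activations mentioned in the details. It is illustrative only: the function names, the layer representation, and the generic form of maxout are assumptions, and training by stochastic gradient descent with back-propagation is not shown.

```python
import math


def dense(weights, biases, inputs):
    """Weighted sum plus bias for each neuron in a layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def tanh(values):
    return [math.tanh(v) for v in values]


def rectifier(values):
    return [max(0.0, v) for v in values]


def maxout(weight_sets, bias_sets, inputs):
    """Each neuron outputs the maximum over several linear pieces."""
    pieces = [dense(w, b, inputs) for w, b in zip(weight_sets, bias_sets)]
    return [max(candidates) for candidates in zip(*pieces)]


def forward(layers, inputs):
    """Feed inputs through (weights, biases, activation) layers in order."""
    activation = inputs
    for weights, biases, activation_fn in layers:
        activation = activation_fn(dense(weights, biases, activation))
    return activation


# Two inputs -> two rectifier neurons -> one tanh output neuron.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], rectifier),
    ([[1.0, 1.0]], [0.0], tanh),
]
output = forward(layers, [2.0, 1.0])

# A single maxout neuron with two linear pieces.
maxout_output = maxout([[[1.0, 0.0]], [[0.0, 1.0]]], [[0.0], [0.0]], [2.0, 3.0])
```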