R Formula¶
RFormula feature selection, RFormula selects columns specified by an R model formula. Currently we support a limited subset of the R operators, including ‘~’, ‘.’, ‘:’, ‘+’, and ‘-‘
Type¶
ml-transformer
Class¶
fire.nodes.ml.NodeRFormula
Fields¶
Name |
Title |
Description |
|---|---|---|
featuresCol |
Features Column |
The features column name |
formula |
Formula |
formula |
labelCol |
Label Column |
The label column name |
Details¶
RFormula selects columns specified by an R model formula. Currently we support a limited subset of the R operators, including ‘~’, ‘.’, ‘:’, ‘+’, and ‘-‘.
More details are available at : https://spark.apache.org/docs/latest/ml-features.html#rformula
Examples¶
The below example is available at : https://spark.apache.org/docs/latest/ml-features.html#rformula¶
import org.apache.spark.ml.feature.RFormula
val dataset = spark.createDataFrame(Seq(
(7, “US”, 18, 1.0),
(8, “CA”, 12, 0.0),
(9, “NZ”, 15, 0.0)
)).toDF(“id”, “country”, “hour”, “clicked”)
val formula = new RFormula()
.setFormula(“clicked ~ country + hour”)
.setFeaturesCol(“features”)
.setLabelCol(“label”)
val output = formula.fit(dataset).transform(dataset)
output.select(“features”, “label”).show()