Gradient-Boosted Trees (GBTs) learning algorithm for regression
# Gradient-Boosted Trees (GBTs) learning algorithm for regression
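# Assumes an active SparkSession named `spark` and a writable path prefix `temp_path`.
from numpy import allclose
from pyspark.ml.linalg import Vectors
from pyspark.ml.regression import GBTRegressor, GBTRegressionModel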
df = spark.createDataFrame([
(1.0, Vectors.dense(1.0)),
(0.0, Vectors.sparse(1, [], []))], ["label", "features"])
gbt = GBTRegressor(maxIter=5, maxDepth=2, seed=42)
print(gbt.getImpurity())
# variance
model = gbt.fit(df)
model.featureImportances
# SparseVector(1, {0: 1.0})
model.numFeatures
# 1
allclose(model.treeWeights, [1.0, 0.1, 0.1, 0.1, 0.1])
# True
test0 = spark.createDataFrame([(Vectors.dense(-1.0),)], ["features"])
model.transform(test0).head().prediction
# 0.0
test1 = spark.createDataFrame([(Vectors.sparse(1, [0], [1.0]),)], ["features"])
model.transform(test1).head().prediction
# 1.0
gbtr_path = temp_path + "gbtr"
gbt.save(gbtr_path)
gbt2 = GBTRegressor.load(gbtr_path)
gbt2.getMaxDepth()
# 2
model_path = temp_path + "gbtr_model"
model.save(model_path)
model2 = GBTRegressionModel.load(model_path)
model.featureImportances == model2.featureImportances
# True
model.treeWeights == model2.treeWeights
# True
model.trees
# [DecisionTreeRegressionModel (uid=...) of depth..., DecisionTreeRegressionModel...]
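
The `treeWeights` check above reflects how the ensemble combines its trees: the prediction is the weighted sum of the individual tree predictions, with the first tree weighted 1.0 and each later tree weighted by the step size (0.1 by default). The following is a minimal plain-Python sketch of that combination rule, not Spark code. The names `ensemble_predict`, `first_tree`, and `zero_tree` are hypothetical stand-ins, and the assumption that the four later trees contribute nothing on this toy data is for illustration only.

```python
# Illustrative sketch only: plain-Python stand-ins, not Spark APIs.

def ensemble_predict(trees, tree_weights, x):
    """Weighted sum of the per-tree predictions for a single feature value x."""
    return sum(w * tree(x) for tree, w in zip(trees, tree_weights))

# Hypothetical stand-in for the first tree: a single split on the feature.
def first_tree(x):
    return 1.0 if x > 0.5 else 0.0

# Hypothetical stand-in for the later trees, assumed to fit zero residuals.
def zero_tree(x):
    return 0.0

trees = [first_tree, zero_tree, zero_tree, zero_tree, zero_tree]
tree_weights = [1.0, 0.1, 0.1, 0.1, 0.1]  # same weights as in the check above

pred_neg = ensemble_predict(trees, tree_weights, -1.0)
pred_pos = ensemble_predict(trees, tree_weights, 1.0)
print(pred_neg, pred_pos)
```

Under these assumptions the sketch yields the same two predictions as `test0` and `test1` above; the trees in a real fitted model are available through `model.trees`.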