FastTreeRegressionTrainer allocates memory for leaves before knowing how many leaves the model actually needs #7435

Open
superichmann opened this issue Apr 6, 2025 · 0 comments
Labels: untriaged (New issue has not been triaged)

superichmann commented Apr 6, 2025

My goal is to let the model create as many leaves as it needs. Is there a way to do that without allocating 8 GB of memory for each tree trainer? Setting NumberOfLeaves to int.MaxValue throws during trainer initialization:

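using Microsoft.ML.Trainers.FastTree;

// Request the largest possible leaf count so the tree size is not capped.
var options = new FastTreeRegressionTrainer.Options
{
    NumberOfLeaves = int.MaxValue
};
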
System.OutOfMemoryException: Array dimensions exceeded supported range.
   at Microsoft.ML.Trainers.FastTree.DocumentPartitioning..ctor(Int32 numDocuments, Int32 maxLeaves)
   at Microsoft.ML.Trainers.FastTree.TreeLearner..ctor(Dataset trainData, Int32 numLeaves)
   at Microsoft.ML.Trainers.FastTree.LeastSquaresRegressionTreeLearner..ctor(Dataset trainData, Int32 numLeaves, Int32 minDocsInLeaf, Double entropyCoefficient, Double featureFirstUsePenalty, Double featureReusePenalty, Double softmaxTemperature, Int32 histogramPoolSize, Int32 randomSeed, Double splitFraction, Boolean filterZeros, Boolean allowEmptyTrees, Double gainConfidenceLevel, Int32 maxCategoricalGroupsPerNode, Int32 maxCategoricalSplitPointPerNode, Double bsrMaxTreeOutput, IParallelTraining parallelTraining, Double minDocsPercentageForCategoricalSplit, Bundle bundling, Int32 minDocsForCategoricalSplit, Double bias, IHost host)
   at Microsoft.ML.Trainers.FastTree.BoostingFastTreeTrainerBase`3.ConstructTreeLearner(IChannel ch)
   at Microsoft.ML.Trainers.FastTree.BoostingFastTreeTrainerBase`3.ConstructOptimizationAlgorithm(IChannel ch)
   at Microsoft.ML.Trainers.FastTree.FastTreeRegressionTrainer.ConstructOptimizationAlgorithm(IChannel ch)
   at Microsoft.ML.Trainers.FastTree.FastTreeTrainerBase`3.Initialize(IChannel ch)
   at Microsoft.ML.Trainers.FastTree.FastTreeTrainerBase`3.TrainCore(IChannel ch)
   at Microsoft.ML.Trainers.FastTree.FastTreeRegressionTrainer.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
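
A possible workaround, sketched under assumptions: the trace suggests that DocumentPartitioning sizes its arrays from maxLeaves up front, and int.MaxValue is larger than the longest array .NET will allocate, which would explain the exception. Since a tree cannot have more leaves than there are training rows divided by the minimum rows per leaf, passing that bound instead should leave the tree effectively uncapped while keeping the allocation proportional to the data. LeafBudget, MaxUsefulLeaves, and BuildOptions are hypothetical names, not part of ML.NET; rowCount is assumed to be known (for example from the loaded data).

```csharp
using System;
using Microsoft.ML.Trainers.FastTree;

static class LeafBudget
{
    // Hypothetical helper (not an ML.NET API): the largest leaf count one tree
    // can reach when every leaf must hold at least minExamplesPerLeaf rows.
    public static int MaxUsefulLeaves(long rowCount, int minExamplesPerLeaf)
    {
        if (rowCount < 1) throw new ArgumentOutOfRangeException(nameof(rowCount));
        if (minExamplesPerLeaf < 1) throw new ArgumentOutOfRangeException(nameof(minExamplesPerLeaf));

        long bound = rowCount / minExamplesPerLeaf;
        // A tree needs at least two leaves to contain a single split.
        return (int)Math.Max(2L, Math.Min(bound, int.MaxValue));
    }

    // Builds trainer options whose leaf cap follows the data size instead of
    // int.MaxValue. Assumes the trainer sizes its buffers from NumberOfLeaves.
    public static FastTreeRegressionTrainer.Options BuildOptions(long rowCount, int minExamplesPerLeaf)
    {
        return new FastTreeRegressionTrainer.Options
        {
            MinimumExampleCountPerLeaf = minExamplesPerLeaf,
            NumberOfLeaves = MaxUsefulLeaves(rowCount, minExamplesPerLeaf)
        };
    }
}
```

With 1,000 rows and a minimum of 10 rows per leaf, the cap becomes 100 leaves. This only shrinks the up-front allocation; it does not make the trainer grow its buffers lazily, which is what the issue title asks for.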
dotnet-policy-service bot added the untriaged (New issue has not been triaged) label Apr 6, 2025
superichmann changed the title from "FastTreeRegressionTrainer Array dimensions exceeded supported range Exception" to "FastTreeRegressionTrainer allocates memory for leaves before knowing how many leaves the model actually needs" Apr 6, 2025