Welcome to rustrees’s documentation!

Decision trees

class rustrees.decision_tree.DecisionTree(min_samples_leaf=1, max_depth: int = 10, max_features: int = None, random_state=None)

Bases: BaseEstimator

Base decision tree model implemented in Rust. Regression and classification variants are available as DecisionTreeRegressor and DecisionTreeClassifier.

Parameters:

min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.
max_depth (int, optional) – The maximum depth of the tree. The default is 10.
max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.
random_state (int, optional) – The seed used by the random number generator. The default is None.
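
To make max_features and random_state concrete, the sketch below picks the candidate features for a single split in plain Python. The helper candidate_features is hypothetical and not part of rustrees; how the Rust implementation samples features internally is an assumption here.

```python
import random
from typing import List, Optional


def candidate_features(
    n_features: int, max_features: Optional[int], rng: random.Random
) -> List[int]:
    """Return the feature indices considered at one split (hypothetical helper)."""
    # max_features=None means every feature is a candidate.
    if max_features is None or max_features >= n_features:
        return list(range(n_features))
    # Otherwise draw max_features distinct indices without replacement.
    return sorted(rng.sample(range(n_features), max_features))


# Seeding the generator (the role random_state plays) makes the draw repeatable.
first = candidate_features(6, 2, random.Random(42))
second = candidate_features(6, 2, random.Random(42))
all_features = candidate_features(6, None, random.Random(42))
```

With the same seed the two draws are identical, which is the reproducibility that a fixed random_state is meant to provide.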

fit(X, y)

Fit the model according to the given training data.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
y (list, Numpy array, or Pandas Series) – The target.

predict(X) → List

Predict values (regression) or class (classification) for X.

Parameters: X (pd.DataFrame or 2D array-like object) – The features.
Returns: The predicted values or classes.
Return type: List

predict_proba(X) → List

Predict class probabilities for X.

Parameters: X (pd.DataFrame or 2D array-like object) – The features.
Returns: The predicted class probabilities.
Return type: List

class rustrees.decision_tree.DecisionTreeClassifier(**kwargs)

Bases: DecisionTree, ClassifierMixin

Decision tree classifier implemented using Rust. Usage should be similar to scikit-learn’s DecisionTreeClassifier.

Parameters:

min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.
max_depth (int, optional) – The maximum depth of the tree. The default is 10.
max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.
random_state (int, optional) – The seed used by the random number generator. The default is None.

fit(X, y) → DecisionTreeClassifier

Fit the model according to the given training data.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
y (list, Numpy array, or Pandas Series) – The target.

predict(X, threshold: float = 0.5) → List

Predict classes for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
threshold (float, optional) – The classification threshold. The default is 0.5.

Returns:

The predicted classes.

Return type:

List
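
This page does not spell out exactly how threshold is applied. A plausible reading, stated here as an assumption, is that a sample is labelled 1 when its predicted probability for the positive class exceeds threshold. The helper apply_threshold below is hypothetical and only illustrates that reading for binary classification.

```python
from typing import List


def apply_threshold(positive_probs: List[float], threshold: float = 0.5) -> List[int]:
    """Turn positive-class probabilities into 0/1 labels (assumed behaviour)."""
    # Assumption: strictly greater than the threshold means class 1.
    return [1 if p > threshold else 0 for p in positive_probs]


probs = [0.1, 0.4, 0.6, 0.9]
default_labels = apply_threshold(probs)      # uses threshold = 0.5
strict_labels = apply_threshold(probs, 0.8)  # a higher threshold yields fewer positives
```

Raising the threshold trades recall for precision, which is the usual reason to expose it as an argument.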

predict_proba(X) → List

Predict class probabilities for X.

Parameters: X (pd.DataFrame or 2D array-like object) – The features.
Returns: The predicted class probabilities.
Return type: List
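
Because the classifier follows the fit/predict pattern, it can be driven by generic code. The sketch below defines a hypothetical holdout_accuracy helper and exercises it with a tiny stand-in model so that it runs without rustrees installed. In practice one would pass an instance such as DecisionTreeClassifier(max_depth=3) instead, on the assumption that it accepts the same list inputs.

```python
from collections import Counter
from typing import Any, List, Sequence


class MajorityClassifier:
    """Stand-in with a fit/predict surface; always predicts the most common label."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "MajorityClassifier":
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[int]:
        return [self.label_ for _ in X]


def holdout_accuracy(model: Any, X_train, y_train, X_test, y_test) -> float:
    """Fit on the training split and return the fraction of correct test predictions."""
    model.fit(X_train, y_train)
    predictions = list(model.predict(X_test))
    correct = sum(int(p == t) for p, t in zip(predictions, y_test))
    return correct / len(y_test)


X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0, 1, 1, 1]
X_test = [[0.5], [2.5]]
y_test = [0, 1]
score = holdout_accuracy(MajorityClassifier(), X_train, y_train, X_test, y_test)
```

The stand-in always predicts the majority label, so it serves as a baseline that a fitted tree should beat.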

class rustrees.decision_tree.DecisionTreeRegressor(**kwargs)

Bases: DecisionTree, RegressorMixin

Decision tree regressor implemented using Rust. Usage should be similar to scikit-learn’s DecisionTreeRegressor.

Parameters:

min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.
max_depth (int, optional) – The maximum depth of the tree. The default is 10.
max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.
random_state (int, optional) – The seed used by the random number generator. The default is None.
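
To illustrate how min_samples_leaf restricts splits, the sketch below searches for the best split of a single numeric feature for regression, skipping any split that would leave fewer than min_samples_leaf samples on either side. It scores splits by the sum of squared errors, a common criterion that is an assumption here; best_split is a hypothetical name, not a rustrees function.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(
    x: List[float], y: List[float], min_samples_leaf: int = 1
) -> Optional[Tuple[float, float]]:
    """Return (split_value, total_sse) for the best allowed split, or None."""
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best: Optional[Tuple[float, float]] = None
    # i is the size of the left leaf; both leaves need min_samples_leaf samples.
    for i in range(min_samples_leaf, len(xs) - min_samples_leaf + 1):
        if xs[i - 1] == xs[i]:
            continue  # identical feature values cannot be separated
        cost = sse(ys[:i]) + sse(ys[i:])
        if best is None or cost < best[1]:
            best = ((xs[i - 1] + xs[i]) / 2, cost)
    return best


x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 0.0, 10.0, 10.0]
split = best_split(x, y, min_samples_leaf=1)       # splits between 2.0 and 3.0
too_strict = best_split(x, y, min_samples_leaf=3)  # no split leaves 3 samples per side
```

When no split satisfies the constraint, the node stays a leaf, which is why larger values of min_samples_leaf produce smaller trees.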

fit(X, y) → DecisionTreeRegressor

Fit the model according to the given training data.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
y (list, Numpy array, or Pandas Series) – The target.

predict(X) → List

Predict values for X.

Parameters: X (pd.DataFrame or 2D array-like object) – The features.
Returns: The predicted values.
Return type: List

Random forests

class rustrees.random_forest.RandomForest(n_estimators: int = 100, min_samples_leaf=1, max_depth: int = 10, max_features: int = None, random_state=None)

Bases: BaseEstimator

Base random forest model implemented in Rust. Regression and classification variants are available as RandomForestRegressor and RandomForestClassifier.

Parameters:

n_estimators (int, optional) – The number of trees in the forest. The default is 100.
min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.
max_depth (int, optional) – The maximum depth of the tree. The default is 10.
max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.
random_state (int, optional) – The seed used by the random number generator. The default is None.
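
A random forest typically trains each of its n_estimators trees on a bootstrap sample of the rows. The sketch below shows that sampling step in plain Python; whether rustrees draws its samples exactly this way is an assumption, and bootstrap_indices is a hypothetical helper.

```python
import random
from typing import List


def bootstrap_indices(n_samples: int, rng: random.Random) -> List[int]:
    """Draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(0)  # plays the role of random_state
n_estimators = 3
# One index list per tree; rows may repeat or be left out of a given sample.
samples = [bootstrap_indices(5, rng) for _ in range(n_estimators)]
```

Because every tree sees a different resample, their errors are less correlated, and combining them reduces variance.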

fit(X, y)

Fit the model according to the given training data.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
y (list, Numpy array, or Pandas Series) – The target.

predict(X) → List

Predict values (regression) or class (classification) for X.

Parameters: X (pd.DataFrame or 2D array-like object) – The features.
Returns: The predicted values or classes.
Return type: List

predict_proba(X) → List

Predict class probabilities for X.

Parameters: X (pd.DataFrame or 2D array-like object) – The features.
Returns: The predicted class probabilities.
Return type: List

class rustrees.random_forest.RandomForestClassifier(**kwargs)

Bases: RandomForest, ClassifierMixin

A random forest classifier implemented using Rust. Usage should be similar to scikit-learn’s RandomForestClassifier.

Parameters:

n_estimators (int, optional) – The number of trees in the forest. The default is 100.
min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.
max_depth (int, optional) – The maximum depth of the tree. The default is 10.
max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.
random_state (int, optional) – The seed used by the random number generator. The default is None.

fit(X, y) → RandomForestClassifier

Fit the model according to the given training data.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
y (list, Numpy array, or Pandas Series) – The target.

predict(X, threshold: float = 0.5) → List

Predict classes for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
threshold (float, optional) – The classification threshold. The default is 0.5.

Returns:

The predicted classes.

Return type:

List

predict_proba(X) → List

Predict class probabilities for X.

Parameters: X (pd.DataFrame or 2D array-like object) – The features.
Returns: The predicted class probabilities.
Return type: List
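
One common way for a forest to produce class probabilities is to average the probabilities predicted by its individual trees. The sketch below shows that aggregation for the positive class; it is an illustration under that assumption, not a description of the rustrees internals, and average_probabilities is a hypothetical name.

```python
from typing import List


def average_probabilities(per_tree: List[List[float]]) -> List[float]:
    """Average positive-class probabilities across trees, one value per sample."""
    n_trees = len(per_tree)
    # zip(*per_tree) groups the predictions of all trees for each sample.
    return [sum(column) / n_trees for column in zip(*per_tree)]


# Three trees, two samples each.
per_tree = [
    [0.0, 1.0],
    [0.5, 1.0],
    [1.0, 1.0],
]
forest_probs = average_probabilities(per_tree)
```

The trees disagree on the first sample and agree on the second, so the averaged probability is intermediate for the first and certain for the second.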

class rustrees.random_forest.RandomForestRegressor(**kwargs)

Bases: RandomForest, RegressorMixin

A random forest regressor implemented using Rust. Usage should be similar to scikit-learn’s RandomForestRegressor.

Parameters:

n_estimators (int, optional) – The number of trees in the forest. The default is 100.
min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.
max_depth (int, optional) – The maximum depth of the tree. The default is 10.
max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.
random_state (int, optional) – The seed used by the random number generator. The default is None.

fit(X, y) → RandomForestRegressor

Fit the model according to the given training data.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
y (list, Numpy array, or Pandas Series) – The target.

predict(X) → List

Predict values for X.

Parameters: X (pd.DataFrame or 2D array-like object) – The features.
Returns: The predicted values.
Return type: List

Utils

rustrees.utils.from_pandas(df: DataFrame) → Dataset

Convert a Pandas DataFrame to a Rustrees Dataset.

Parameters: df (pd.DataFrame) – The DataFrame to convert.
Returns: The Rustrees Dataset.
Return type: rt.Dataset

rustrees.utils.prepare_dataset(X, y=None) → Dataset

Prepare a Rustrees Dataset from a Pandas DataFrame or a 2D array-like object.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.
y (list, Numpy array, or Pandas Series, optional) – The target. The default is None.

Returns:

The Rustrees Dataset.

Return type:

rt.Dataset

Raises:

ValueError – If X is not a Pandas DataFrame or a 2D array-like object, or if y is not a list, Numpy array, or Pandas Series.
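
The ValueError conditions can be pictured with a small validator. The sketch below is a simplified stand-in that handles only nested lists and tuples, not DataFrames or NumPy arrays, and check_inputs is a hypothetical name; the actual checks in prepare_dataset may differ.

```python
from typing import Optional, Sequence


def check_inputs(
    X: Sequence[Sequence[float]], y: Optional[Sequence[float]] = None
) -> None:
    """Raise ValueError unless X is a rectangular 2D list and y, if given, is list-like."""
    is_2d = (
        isinstance(X, (list, tuple))
        and len(X) > 0
        and all(isinstance(row, (list, tuple)) for row in X)
        and len({len(row) for row in X}) == 1  # every row has the same length
    )
    if not is_2d:
        raise ValueError("X must be a 2D array-like object")
    if y is not None and not isinstance(y, (list, tuple)):
        raise ValueError("y must be a list-like object")


check_inputs([[1.0, 2.0], [3.0, 4.0]], [0, 1])  # valid input passes silently

try:
    check_inputs([1.0, 2.0, 3.0])  # 1D input is rejected
    rejected = False
except ValueError:
    rejected = True
```

Failing early with a clear message is preferable to passing malformed data across the Python/Rust boundary.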