Welcome to rustrees’s documentation!

Decision trees

class rustrees.decision_tree.DecisionTree(min_samples_leaf=1, max_depth: int = 10, max_features: int = None, random_state=None)

Bases: BaseEstimator

A decision tree model implemented using Rust. Options for regression and classification are available.

Parameters:
  • min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.

  • max_depth (int, optional) – The maximum depth of the tree. The default is 10.

  • max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.

  • random_state (int, optional) – The seed used by the random number generator. The default is None.
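To make the effect of max_depth and min_samples_leaf concrete, here is a minimal pure-Python sketch of a regression tree on a single feature. It is an illustration only, not the rustrees implementation: the helpers grow, depth_of and leaf_sizes are hypothetical names, and the squared-error split criterion is an assumption.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def grow(xs, ys, max_depth=10, min_samples_leaf=1, depth=0):
    """Grow a toy regression tree on one numeric feature (illustrative only)."""
    leaf = {"value": mean(ys), "n": len(ys)}
    # max_depth caps the number of splits along any root-to-leaf path.
    if depth >= max_depth:
        return leaf
    pairs = sorted(zip(xs, ys))
    best = None
    # Only cuts that leave min_samples_leaf samples on each side are tried.
    for cut in range(min_samples_leaf, len(pairs) - min_samples_leaf + 1):
        if pairs[cut - 1][0] == pairs[cut][0]:
            continue  # equal feature values cannot be separated
        left = [y for _, y in pairs[:cut]]
        right = [y for _, y in pairs[cut:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, cut)
    if best is None:
        return leaf  # no admissible split: stay a leaf
    cut = best[1]
    lx, ly = zip(*pairs[:cut])
    rx, ry = zip(*pairs[cut:])
    return {
        "threshold": (pairs[cut - 1][0] + pairs[cut][0]) / 2,
        "left": grow(lx, ly, max_depth, min_samples_leaf, depth + 1),
        "right": grow(rx, ry, max_depth, min_samples_leaf, depth + 1),
    }


def depth_of(node):
    """Number of splits on the longest root-to-leaf path."""
    if "value" in node:
        return 0
    return 1 + max(depth_of(node["left"]), depth_of(node["right"]))


def leaf_sizes(node):
    """Training-sample count of every leaf, left to right."""
    if "value" in node:
        return [node["n"]]
    return leaf_sizes(node["left"]) + leaf_sizes(node["right"])
```

Lowering max_depth yields fewer, larger leaves; raising min_samples_leaf rules out splits that would isolate a handful of samples. Both act as regularizers.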

fit(X, y)

Fit the model according to the given training data.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • y (list, NumPy array, or Pandas Series) – The target.

predict(X) → List

Predict values (regression) or class (classification) for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.

Returns:

The predicted values or classes.

Return type:

List

predict_proba(X) → List

Predict class probabilities for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.

Returns:

The predicted class probabilities.

Return type:

List

class rustrees.decision_tree.DecisionTreeClassifier(**kwargs)

Bases: DecisionTree, ClassifierMixin

Decision tree classifier implemented using Rust. Usage should be similar to scikit-learn’s DecisionTreeClassifier.

Parameters:
  • min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.

  • max_depth (int, optional) – The maximum depth of the tree. The default is 10.

  • max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.

  • random_state (int, optional) – The seed used by the random number generator. The default is None.

fit(X, y) → DecisionTreeClassifier

Fit the model according to the given training data.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • y (list, NumPy array, or Pandas Series) – The target.

predict(X, threshold: float = 0.5) → List

Predict the class for X.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • threshold (float, optional) – The probability threshold for predicting the positive class. The default is 0.5.

Returns:

The predicted classes.

Return type:

List
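The role of threshold in predict can be sketched as follows. This assumes a binary problem in which threshold is compared with the predicted probability of the positive class; the reference does not spell this out, and apply_threshold is a hypothetical helper, not part of rustrees.

```python
from typing import List, Sequence


def apply_threshold(positive_proba: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map positive-class probabilities to 0/1 labels."""
    # Assumption: a probability exactly equal to the threshold counts as positive.
    return [1 if p >= threshold else 0 for p in positive_proba]


probs = [0.10, 0.45, 0.50, 0.80]
print(apply_threshold(probs))                  # [0, 0, 1, 1]
print(apply_threshold(probs, threshold=0.4))   # [0, 1, 1, 1]
```

Lowering the threshold trades precision for recall on the positive class, which can help with imbalanced targets.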

predict_proba(X) → List

Predict class probabilities for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.

Returns:

The predicted class probabilities.

Return type:

List
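A minimal usage sketch for the classifier, relying only on the fit and predict signatures documented above. holdout_accuracy and demo are hypothetical helpers, and the sketch assumes that plain nested lists count as a 2D array-like object and that the returned labels compare equal to the labels passed to fit.

```python
def holdout_accuracy(model, X_train, y_train, X_test, y_test, threshold=0.5):
    """Fit a classifier that follows the documented interface and score it."""
    model.fit(X_train, y_train)
    predicted = model.predict(X_test, threshold=threshold)
    hits = sum(1 for p, t in zip(predicted, y_test) if p == t)
    return hits / len(y_test)


def demo():
    """Run the helper with rustrees (requires the package to be installed)."""
    # Import path and keyword arguments are taken from the reference above.
    from rustrees.decision_tree import DecisionTreeClassifier

    X = [[0.0, 1.0], [0.2, 0.9], [0.8, 0.1], [1.0, 0.0]]
    y = [0, 0, 1, 1]
    model = DecisionTreeClassifier(max_depth=3, min_samples_leaf=1)
    # Scoring on the training rows keeps the toy example short.
    return holdout_accuracy(model, X, y, X, y)
```

Call demo() in an environment where rustrees is installed; holdout_accuracy itself works with any object that exposes the same two methods.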

class rustrees.decision_tree.DecisionTreeRegressor(**kwargs)

Bases: DecisionTree, RegressorMixin

Decision tree regressor implemented using Rust. Usage should be similar to scikit-learn’s DecisionTreeRegressor.

Parameters:
  • min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.

  • max_depth (int, optional) – The maximum depth of the tree. The default is 10.

  • max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.

  • random_state (int, optional) – The seed used by the random number generator. The default is None.

fit(X, y) → DecisionTreeRegressor

Fit the model according to the given training data.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • y (list, NumPy array, or Pandas Series) – The target.

predict(X) → List

Predict values for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.

Returns:

The predicted values.

Return type:

List

Random forests

class rustrees.random_forest.RandomForest(n_estimators: int = 100, min_samples_leaf=1, max_depth: int = 10, max_features: int = None, random_state=None)

Bases: BaseEstimator

A random forest model implemented using Rust. Options for regression and classification are available.

Parameters:
  • n_estimators (int, optional) – The number of trees in the forest. The default is 100.

  • min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.

  • max_depth (int, optional) – The maximum depth of the tree. The default is 10.

  • max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.

  • random_state (int, optional) – The seed used by the random number generator. The default is None.
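How n_estimators, max_features and random_state interact can be illustrated with a small bagging sketch. It uses one-split regression stumps as base learners and is not the rustrees implementation; fit_stump, fit_forest and predict_forest are hypothetical names, and bootstrap sampling with prediction averaging is assumed as the standard random forest recipe.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(X, y, features):
    """Fit a one-split regression stump, searching only the given features."""
    best = None
    for f in features:
        pairs = sorted((row[f], target) for row, target in zip(X, y))
        for cut in range(1, len(pairs)):
            if pairs[cut - 1][0] == pairs[cut][0]:
                continue  # equal feature values cannot be separated
            left = [t for _, t in pairs[:cut]]
            right = [t for _, t in pairs[cut:]]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                threshold = (pairs[cut - 1][0] + pairs[cut][0]) / 2
                best = (cost, f, threshold, mean(left), mean(right))
    if best is None:
        fallback = mean(y)  # no usable split: predict the sample mean
        return lambda row: fallback
    _, f, threshold, low, high = best
    return lambda row: low if row[f] <= threshold else high


def fit_forest(X, y, n_estimators=100, max_features=None, random_state=None):
    """Train n_estimators stumps, each on its own bootstrap sample."""
    rng = random.Random(random_state)  # a fixed seed makes the forest reproducible
    n_features = len(X[0])
    k = n_features if max_features is None else min(max_features, n_features)
    trees = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample rows with replacement
        features = rng.sample(range(n_features), k)  # random feature subset for the split
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return trees


def predict_forest(trees, X):
    """Average the predictions of all trees for every row."""
    return [mean([tree(row) for tree in trees]) for row in X]
```

Because every random draw goes through one seeded generator, two forests built with the same random_state produce identical predictions.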

fit(X, y)

Fit the model according to the given training data.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • y (list, NumPy array, or Pandas Series) – The target.

predict(X) → List

Predict values (regression) or class (classification) for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.

Returns:

The predicted values or classes.

Return type:

List

predict_proba(X) → List

Predict class probabilities for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.

Returns:

The predicted class probabilities.

Return type:

List

class rustrees.random_forest.RandomForestClassifier(**kwargs)

Bases: RandomForest, ClassifierMixin

A random forest classifier implemented using Rust. Usage should be similar to scikit-learn’s RandomForestClassifier.

Parameters:
  • n_estimators (int, optional) – The number of trees in the forest. The default is 100.

  • min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.

  • max_depth (int, optional) – The maximum depth of the tree. The default is 10.

  • max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.

  • random_state (int, optional) – The seed used by the random number generator. The default is None.

fit(X, y) → RandomForestClassifier

Fit the model according to the given training data.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • y (list, NumPy array, or Pandas Series) – The target.

predict(X, threshold: float = 0.5) → List

Predict the class for X.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • threshold (float, optional) – The probability threshold for predicting the positive class. The default is 0.5.

Returns:

The predicted classes.

Return type:

List

predict_proba(X) → List

Predict class probabilities for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.

Returns:

The predicted class probabilities.

Return type:

List
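A common way for a forest to produce class probabilities is to average the probabilities of its individual trees (soft voting). Whether rustrees does exactly this is not stated in this reference, so the sketch below is an assumption; average_proba is a hypothetical helper that takes one list of positive-class probabilities per tree.

```python
def average_proba(per_tree_proba):
    """Average positive-class probabilities across trees, row by row."""
    n_trees = len(per_tree_proba)
    # zip(*...) regroups the values so each tuple holds one row's votes.
    return [sum(votes) / n_trees for votes in zip(*per_tree_proba)]


tree_outputs = [
    [0.0, 1.0, 0.5],  # probabilities from the first tree, one per row
    [0.5, 1.0, 0.5],  # probabilities from the second tree
]
print(average_proba(tree_outputs))  # [0.25, 1.0, 0.5]
```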

class rustrees.random_forest.RandomForestRegressor(**kwargs)

Bases: RandomForest, RegressorMixin

A random forest regressor implemented using Rust. Usage should be similar to scikit-learn’s RandomForestRegressor.

Parameters:
  • n_estimators (int, optional) – The number of trees in the forest. The default is 100.

  • min_samples_leaf (int, optional) – The minimum number of samples required to be at a leaf node. The default is 1.

  • max_depth (int, optional) – The maximum depth of the tree. The default is 10.

  • max_features (int, optional) – The maximum number of features per split. Default is None, which means all features are considered.

  • random_state (int, optional) – The seed used by the random number generator. The default is None.

fit(X, y) → RandomForestRegressor

Fit the model according to the given training data.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • y (list, NumPy array, or Pandas Series) – The target.

predict(X) → List

Predict values for X.

Parameters:

X (pd.DataFrame or 2D array-like object) – The features.

Returns:

The predicted values.

Return type:

List

Utils

rustrees.utils.from_pandas(df: DataFrame) → Dataset

Convert a Pandas DataFrame to a Rustrees Dataset.

Parameters:

df (pd.DataFrame) – The DataFrame to convert.

Returns:

The Rustrees Dataset.

Return type:

rt.Dataset

rustrees.utils.prepare_dataset(X, y=None) → Dataset

Prepare a Rustrees Dataset from a Pandas DataFrame or a 2D array-like object.

Parameters:
  • X (pd.DataFrame or 2D array-like object) – The features.

  • y (list, NumPy array, or Pandas Series, optional) – The target. The default is None.

Returns:

The Rustrees Dataset.

Return type:

rt.Dataset

Raises:

ValueError – If X is not a Pandas DataFrame or a 2D array-like object, or if y is not a list, NumPy array, or Pandas Series.
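The kind of input check described under Raises can be sketched in plain Python. This is not the rustrees code: validate and to_nested_list are hypothetical helpers, and the duck-typed conversion through to_numpy() and tolist() is an assumption about how DataFrame, Series and array inputs might be normalized.

```python
def to_nested_list(obj):
    """Convert pandas or NumPy containers to plain Python lists."""
    # pandas objects expose to_numpy(); NumPy arrays expose tolist().
    if hasattr(obj, "to_numpy"):
        obj = obj.to_numpy()
    if hasattr(obj, "tolist"):
        obj = obj.tolist()
    return obj


def validate(X, y=None):
    """Return (rows, target) as lists or raise ValueError on bad input."""
    rows = to_nested_list(X)
    is_2d = (
        isinstance(rows, (list, tuple))
        and len(rows) > 0
        and all(isinstance(r, (list, tuple)) for r in rows)
        and len({len(r) for r in rows}) == 1  # every row has the same width
    )
    if not is_2d:
        raise ValueError("X must be a DataFrame or a 2D array-like object")
    if y is None:
        return rows, None
    target = to_nested_list(y)
    # The target must be flat: a list whose items are not themselves lists.
    if not isinstance(target, list) or any(isinstance(t, (list, tuple)) for t in target):
        raise ValueError("y must be a list, NumPy array, or Pandas Series")
    return rows, target
```

Failing early with a ValueError keeps malformed input from reaching the compiled Rust code, where errors are usually harder to interpret.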