Memory Profiler Instruments

To use memory-profiler instrumentation, install with the memory-profiler extra:

pip install sklearn-instrumentation[memory-profiler]

Example usage:

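from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

from sklearn_instrumentation import SklearnInstrumentor
from sklearn_instrumentation.instruments.memory_profiler import MemoryProfiler

# Illustrative data and model (assumed here); substitute your own estimator.
X, y = load_iris(return_X_y=True)
classification_model = RandomForestClassifier()

profiler = MemoryProfiler()
instrumentor = SklearnInstrumentor(instrument=profiler)

# Instrument this estimator instance so that its methods are profiled when called.
instrumentor.instrument_instance(classification_model)

# Each instrumented call emits a line-by-line memory report.
classification_model.fit(X, y)
classification_model.predict(X)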

Example output (partial). Only predict_proba is shown; for forest classifiers, predict calls predict_proba internally:

ForestClassifier.predict_proba
Filename: /Users/user/projects/sklearn-instrumentation/.venv/lib/python3.8/site-packages/sklearn/ensemble/_forest.py

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
   648     70.3 MiB     70.3 MiB           1       def predict_proba(self, X):
   649                                                 """
   650                                                 Predict class probabilities for X.
   651
   652                                                 The predicted class probabilities of an input sample are computed as
   653                                                 the mean predicted class probabilities of the trees in the forest.
   654                                                 The class probability of a single tree is the fraction of samples of
   655                                                 the same class in a leaf.
   656
   657                                                 Parameters
   658                                                 ----------
   659                                                 X : {array-like, sparse matrix} of shape (n_samples, n_features)
   660                                                     The input samples. Internally, its dtype will be converted to
   661                                                     ``dtype=np.float32``. If a sparse matrix is provided, it will be
   662                                                     converted into a sparse ``csr_matrix``.
   663
   664                                                 Returns
   665                                                 -------
   666                                                 p : ndarray of shape (n_samples, n_classes), or a list of n_outputs
   667                                                     such arrays if n_outputs > 1.
   668                                                     The class probabilities of the input samples. The order of the
   669                                                     classes corresponds to that in the attribute :term:`classes_`.
   670                                                 """
   671     70.3 MiB      0.0 MiB           1           check_is_fitted(self)
   672                                                 # Check data
   673     70.3 MiB      0.0 MiB           1           X = self._validate_X_predict(X)
   674
   675                                                 # Assign chunk of trees to jobs
   676     70.3 MiB      0.0 MiB           1           n_jobs, _, _ = _partition_estimators(self.n_estimators, self.n_jobs)
   677
   678                                                 # avoid storing the output of every estimator by summing them here
   679     70.3 MiB      0.0 MiB           6           all_proba = [np.zeros((X.shape[0], j), dtype=np.float64)
   680     70.3 MiB      0.0 MiB           2                        for j in np.atleast_1d(self.n_classes_)]
   681     70.3 MiB      0.0 MiB           1           lock = threading.Lock()
   682     70.4 MiB      0.0 MiB           3           Parallel(n_jobs=n_jobs, verbose=self.verbose,
   683     70.4 MiB      0.0 MiB         105                    **_joblib_parallel_args(require="sharedmem"))(
   684     70.4 MiB      0.0 MiB         300               delayed(_accumulate_prediction)(e.predict_proba, X, all_proba,
   685     70.4 MiB      0.0 MiB         100                                               lock)
   686     70.4 MiB      0.0 MiB         101               for e in self.estimators_)
   687
   688     70.4 MiB      0.0 MiB           2           for proba in all_proba:
   689     70.4 MiB      0.0 MiB           1               proba /= len(self.estimators_)
   690
   691     70.4 MiB      0.0 MiB           1           if len(all_proba) == 1:
   692     70.4 MiB      0.0 MiB           1               return all_proba[0]
   693                                                 else:
   694                                                     return all_proba

class sklearn_instrumentation.instruments.memory_profiler.MemoryProfiler

Instrument which measures memory usage over function calls.

Uses the memory-profiler library. Outputs line-by-line memory usage for each instrumented function.

dkwargs are passed to the memory_profiler.profile function decorator, which accepts keyword arguments such as stream and precision.
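To make the idea of a decorator that measures memory over function calls and is configured through keyword arguments more concrete, here is a minimal sketch using only the standard library. It is not how MemoryProfiler is implemented: it swaps in tracemalloc, which tracks Python-level allocations per call rather than line-by-line process memory, and the names memory_instrument and sink are hypothetical, chosen only for illustration.

```python
import functools
import tracemalloc


def memory_instrument(func=None, **dkwargs):
    """Hypothetical stand-in for an instrument: report memory used by each call."""
    if func is None:
        # Called with keyword arguments only: return a configured decorator.
        return functools.partial(memory_instrument, **dkwargs)

    # `sink` is an assumed keyword argument: a callable that receives the report text.
    sink = dkwargs.get("sink", print)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Sketch only: assumes tracemalloc is not already tracing elsewhere.
        tracemalloc.start()
        try:
            return func(*args, **kwargs)
        finally:
            # Bytes still allocated, and the peak, since tracing started.
            current, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            sink(f"{func.__name__}: current={current} B, peak={peak} B")

    return wrapper


messages = []


@memory_instrument(sink=messages.append)
def build_squares(n):
    return [i * i for i in range(n)]


squares = build_squares(10_000)
print(messages[0])
```

The keyword arguments configure the decorator once, and every later call to the wrapped function produces one report, which mirrors how dkwargs parameterize the profiling decorator described above.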