Model Classes

DeepChem maintains an extensive collection of models for scientific applications. Because different machine learning frameworks suit different scientific applications, DeepChem supports a broad range of them (currently scikit-learn, xgboost, TensorFlow, and PyTorch).

Model Cheatsheet

If you’re just getting started with DeepChem, the best place to begin is this “model cheatsheet”, which lists the custom DeepChem models. Wrappers around external machine learning libraries, such as SklearnModel and GBDTModel, are excluded; the table is otherwise complete.

Each row of the table describes what’s needed to invoke a given model. Some models must be used with specific Transformer or Featurizer objects, and some have custom training methods, so you can read off everything needed to train a model from its row.

| Model | Type | Input Type | Transformations | Acceptable Featurizers | Fit Method |
|---|---|---|---|---|---|
| AtomicConvModel | Classifier/Regressor | Tuple | | ComplexNeighborListFragmentAtomicCoordinates | fit |
| ChemCeption | Classifier/Regressor | Tensor of shape (N, M, c) | | SmilesToImage | fit |
| CNN | Classifier/Regressor | Tensor of shape (N, c) or (N, M, c) or (N, M, L, c) | | | fit |
| DTNNModel | Classifier/Regressor | Matrix of shape (N, N) | | CoulombMatrix | fit |
| DAGModel | Classifier/Regressor | ConvMol | DAGTransformer | ConvMolFeaturizer | fit |
| GraphConvModel | Classifier/Regressor | ConvMol | | ConvMolFeaturizer | fit |
| MPNNModel | Classifier/Regressor | WeaveMol | | WeaveFeaturizer | fit |
| MultitaskClassifier | Classifier | Vector of shape (N,) | | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| MultitaskRegressor | Regressor | Vector of shape (N,) | | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| MultitaskFitTransformRegressor | Regressor | Vector of shape (N,) | Any | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| MultitaskIRVClassifier | Classifier | Vector of shape (N,) | IRVTransformer | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| ProgressiveMultitaskClassifier | Classifier | Vector of shape (N,) | | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| ProgressiveMultitaskRegressor | Regressor | Vector of shape (N,) | | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| RobustMultitaskClassifier | Classifier | Vector of shape (N,) | | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| RobustMultitaskRegressor | Regressor | Vector of shape (N,) | | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| ScScoreModel | Classifier | Vector of shape (N,) | | CircularFingerprint, RDKitDescriptors, CoulombMatrixEig, RdkitGridFeaturizer, BindingPocketFeaturizer, ElementPropertyFingerprint | fit |
| SeqToSeq | Sequence | Sequence | | | fit_sequences |
| Smiles2Vec | Classifier/Regressor | Sequence | | SmilesToSeq | fit |
| TextCNNModel | Classifier/Regressor | String | | | fit |
| WGAN | Adversarial | Pair | | | fit_gan |
| CGCNNModel | Classifier/Regressor | GraphData | | CGCNNFeaturizer | fit |
| GATModel | Classifier/Regressor | GraphData | | MolGraphConvFeaturizer | fit |
| GCNModel | Classifier/Regressor | GraphData | | MolGraphConvFeaturizer | fit |
| AttentiveFPModel | Classifier/Regressor | GraphData | | MolGraphConvFeaturizer | fit |
| LCNNModel | Regressor | GraphData | | LCNNFeaturizer | fit |
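To illustrate how a row of the cheatsheet is read, here is a minimal sketch that transcribes a few rows into a plain Python dictionary and summarizes what each model needs. The `CHEATSHEET` dictionary and the `describe` helper are hypothetical names introduced for illustration; they are not part of DeepChem.

```python
# A few rows transcribed from the cheatsheet. This dictionary layout is an
# illustrative assumption, not a DeepChem data structure.
CHEATSHEET = {
    "GraphConvModel": {
        "type": "Classifier/Regressor",
        "input": "ConvMol",
        "transformer": None,
        "featurizers": ["ConvMolFeaturizer"],
        "fit_method": "fit",
    },
    "DAGModel": {
        "type": "Classifier/Regressor",
        "input": "ConvMol",
        "transformer": "DAGTransformer",
        "featurizers": ["ConvMolFeaturizer"],
        "fit_method": "fit",
    },
    "SeqToSeq": {
        "type": "Sequence",
        "input": "Sequence",
        "transformer": None,
        "featurizers": [],
        "fit_method": "fit_sequences",
    },
    "WGAN": {
        "type": "Adversarial",
        "input": "Pair",
        "transformer": None,
        "featurizers": [],
        "fit_method": "fit_gan",
    },
}


def describe(model_name):
    """Summarize what one cheatsheet row says is needed to train a model."""
    row = CHEATSHEET[model_name]
    parts = [f"{model_name} ({row['type']}) takes {row['input']} input"]
    if row["featurizers"]:
        parts.append("featurize with " + " or ".join(row["featurizers"]))
    if row["transformer"]:
        parts.append(f"apply {row['transformer']}")
    parts.append(f"train with {row['fit_method']}()")
    return "; ".join(parts)


for name in CHEATSHEET:
    print(describe(name))
```

An empty Transformations or Acceptable Featurizers cell is represented here as `None` or an empty list, meaning the table lists no specific requirement for that model.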

Model

Scikit-Learn Models

Scikit-learn’s models can be wrapped so that they interact conveniently with DeepChem. Scikit-learn models are often more robust and easier to train than deep learning models, which makes them a good first model to try.

SklearnModel

Gradient Boosting Models

Gradient Boosting Models (LightGBM and XGBoost) can be wrapped so they can interact with DeepChem.

GBDTModel

Deep Learning Infrastructure

DeepChem maintains a lightweight layer of common deep learning model infrastructure that can be used for models built with different underlying frameworks. The losses and optimizers can be used for both TensorFlow and PyTorch models.

Losses

Optimizers

Keras Models

DeepChem extensively uses Keras to build deep learning models.

KerasModel

Training loss and validation metrics can be automatically logged to Weights & Biases with the following commands:

# Install wandb (run in a shell)
pip install wandb

# Log in (run in a shell; required only once)
wandb login

# Start a W&B run in your Python script (refer to the W&B docs for optional parameters)
import wandb
wandb.init(project="my project")

# Set the `wandb` argument when creating a `KerasModel`
from deepchem.models import KerasModel
model = KerasModel(…, wandb=True)

MultitaskRegressor

MultitaskFitTransformRegressor

MultitaskClassifier

TensorflowMultitaskIRVClassifier

RobustMultitaskClassifier

RobustMultitaskRegressor

ProgressiveMultitaskClassifier

ProgressiveMultitaskRegressor

WeaveModel

DTNNModel

DAGModel

GraphConvModel

MPNNModel

BasicMolGANModel

ScScoreModel

SeqToSeq

GAN

WGAN

CNN

TextCNNModel

AtomicConvModel

Smiles2Vec

ChemCeption

NormalizingFlowModel

The purpose of a normalizing flow is to map a simple distribution (that is easy to sample from and evaluate probability densities for) to a more complex distribution that is learned from data. Normalizing flows combine the advantages of autoregressive models (which provide likelihood estimation but do not learn features) and variational autoencoders (which learn feature representations but do not provide marginal likelihoods). They are effective for any application requiring a probabilistic model with these capabilities, e.g. generative modeling, unsupervised learning, or probabilistic inference.
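The idea of mapping a simple distribution to a more complex one while keeping exact densities can be sketched with the change-of-variables formula. The example below is a hand-written one-dimensional affine flow in pure Python, not DeepChem's NormalizingFlowModel; the `AffineFlow` class and its method names are hypothetical, and a real flow would learn its parameters from data rather than fix them.

```python
import math
import random


def standard_normal_logpdf(z):
    """Log density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


class AffineFlow:
    """Invertible map x = scale * z + shift applied to a standard normal z."""

    def __init__(self, scale, shift):
        if scale == 0:
            raise ValueError("scale must be non-zero so the map is invertible")
        self.scale = scale
        self.shift = shift

    def forward(self, z):
        return self.scale * z + self.shift

    def inverse(self, x):
        return (x - self.shift) / self.scale

    def log_prob(self, x):
        # Change of variables: log p_x(x) = log p_z(f^{-1}(x)) - log|df/dz|.
        z = self.inverse(x)
        return standard_normal_logpdf(z) - math.log(abs(self.scale))

    def sample(self, rng):
        # Sampling is cheap: draw from the base distribution, then transform.
        return self.forward(rng.gauss(0.0, 1.0))


flow = AffineFlow(scale=2.0, shift=1.0)
rng = random.Random(0)
samples = [flow.sample(rng) for _ in range(5)]
print(samples)
print(flow.log_prob(1.0))
```

With these fixed parameters the flow reproduces a normal distribution with mean 1 and standard deviation 2, so `log_prob` can be checked against the closed-form density. Practical flows stack many such invertible layers to reach more complex distributions.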

PyTorch Models

DeepChem supports the use of PyTorch to build deep learning models.

TorchModel

You can wrap an arbitrary torch.nn.Module in a TorchModel object.

CGCNNModel

GATModel

GCNModel

AttentiveFPModel

MPNNModel

Note that this is an alternative implementation of MPNN; currently it can only be imported from deepchem.models.torch_models.

LCNNModel