Object classification

Object classification#

What is classification?#

Simply put, phenotypic classification is about categorizing objects into different groups based on their features (aka measurements).

⚠️ Where can things go wrong?

Valid measurements are still important Classification can be simple or complex, but results always depend on the validity of your measurements. For this reason, all the caveats of earlier measurement sections also apply here.
Machines are lazy Machine learning classifiers aren’t necessarily going to learn the biologically relevant features that distinguish objects from distinct groups. Confounding features, or features that vary with your phenotype but are not biologically related to it, can limit the usefulness of your classifier and lead to incorrect conclusions. For instance, if clinicians often put rulers next to malignant looking moles and not next to benign moles and try to train a machine learning classifier to distinguish malignant vs. benign, the model might learn to classify images with rulers as malignant without tapping into any of the relevant features of the moles. This is a real example ²⁴. If possible, examining which features your model is relying on to classify objects can be a way to check for this. It’s also important to standardize how you capture images of your different classes of objects and include a large enough training set with images with lots of variation. You wouldn’t want all your positive cells to come from samples you imaged in March and all your negative cells from samples you imaged in January, for example.
Violating model assumptions If using a machine learning classifier, different models come with different baked-in assumptions. If you’re starting out, it can be difficult to know which to pick. There are interactive tools such as CellProfiler Analyst ²⁵ and Piximi that make training a classifier easier, especially if you don’t know how to code.
Messy boundaries Most methods of supervised classification, where the user assigns objects to a score or to a bin, ultimately treat each bin as a totally separate entity; biology is rarely so neat. For example, a supervised classifier for cell cycle phase must assign a cell to one phase, but in fact progression through the cell cycle is not a perfectly switch-like process, as can be visualized by measurements of individual cells (colored by their class given by a human observer). More sophisticated methods may be needed to classify more continuous phenotypes

a continuous distribution of cell cycle states — Fig. 7 **Strict division into supervised classes can be tricky for continuous biological processes**. Adapted from Eulenberg, P., Köhler, N., Blasi, T. *et al*. Reconstructing cell cycle and disease progression using deep learning. *Nat Commun* 8, 463 (2017) ²⁶#

Object classification

Contents

Object classification#

What is classification?#