Paint by numbers: Algorithm reconstructs processes from individual images

Researchers at the Helmholtz Zentrum München have developed a new method for reconstructing continuous biological processes, such as disease progression, using image data. The study was published in ‘Nature Communications’.

The new method is able to reconstruct biological processes using image data. Source: Helmholtz Zentrum München

Modern life sciences generate a constantly growing amount of data in shorter and shorter cycles. Making such data manageable and suitable for evaluation is the objective of Dr. Dr. Alexander Wolf and his colleagues at the Helmholtz Zentrum München’s Institute of Computational Biology (ICB). To this end, the researchers are developing software that handles this evaluation. But there are various hurdles to clear.

“In the current study, we dealt with the problem that software cannot assign image data to continuous processes,” explains study leader Wolf. “For example, it is possible to classify image information according to clearly defined categories, but in disease progression and developmental biology, this approach quickly reaches its limits because the processes are continuous rather than made up of individual steps.”

In order to take this into account, the Helmholtz team employed methods from so-called Deep Learning* (i.e. machine learning processes). “Using artificial neural networks, we can now combine individual pictures into processes and additionally display them in a way that humans understand,” say Philipp Eulenberg and Niklas Köhler, former Master’s students at the ICB and the study’s first authors.

Blood cells and retinas as sparring partners

In order to demonstrate the method’s capability, the scientists selected two examples. In the first, the software reconstructed the continuous cell cycle of white blood cells using images from an imaging flow cytometer (which produces fluorescence microscope images). “A further advantage of this examination is that our software is so fast that it is possible to extract the cell development on the fly, meaning while the analysis in the cytometer is still running,” explains Wolf. “In addition, our software makes six times fewer errors than previous approaches.”

In the second experiment, the researchers reconstructed the progress of diabetic retinopathy.** “We did this by feeding our software 30,000 individual images of retinas as sparring partners, so to speak,” explains Niklas Köhler. “Since it automatically compiles these data into a continuous process, the software allows us to predict the disease progression on a continuous scale.”

And if the data are not part of a continuous biological process? “In such a case, the software recognizes that individual categories are involved and assigns the measured data to individual clusters,” Wolf explains. In addition to further applications for the method, in the future Wolf and his colleagues want to solve other problems involving the evaluation of biological data using machine learning.


Further Information

* Deep Learning algorithms simulate the learning processes in people using artificial neural networks. The principle works particularly well when large quantities of data (Big Data) are available for training. Image recognition is one of Deep Learning's strengths. More decision layers are placed between the input and the output than are usually found in neural networks, which is why the term "deep" is used.

** Diabetic retinopathy is the main cause of early vision loss in the Western world. The diagnosis is usually made by an expert, who assigns each case to one of four stages: healthy, mild, medium and severe. Working with 8,000 images, the software was able to describe the progression or increasing severity of the disease without being provided with the ordering information.


Alex Wolf and the team recently took one of the top places in the Data Science Bowl, one of the world’s most highly endowed Big Data competitions. For their entry, the team programmed an algorithm that recognizes lung cancer on the basis of 300 slices from a three-dimensional computed tomography scan within a few milliseconds, a process that can take a radiologist several hours in the worst case.

The ICB also deals with the subject of Deep Learning in other contexts: The scientists recently introduced an algorithm in ‘Nature Methods’ that predicts hematopoietic stem cell development. In the video “Deep Learning Predicts Stem Cell Development”, they explain how this works.

Original Publication:

Eulenberg, P. et al. (2017): Reconstructing cell cycle and disease progression using deep learning. Nature Communications, DOI: 10.1038/s41467-017-00623-3

As the German Research Center for Environmental Health, Helmholtz Zentrum München pursues the goal of developing personalized medical approaches for the prevention and therapy of major common diseases such as diabetes mellitus and lung diseases. To achieve this, it investigates the interaction of genetics, environmental factors and lifestyle. The Helmholtz Zentrum München has about 2,300 staff members and is headquartered in Neuherberg in the north of Munich. Helmholtz Zentrum München is a member of the Helmholtz Association, a community of 18 scientific-technical and medical-biological research centers with a total of about 37,000 staff members.

The Institute of Computational Biology (ICB) develops and applies methods for the model-based description of biological systems, using a data-driven approach by integrating information on multiple scales ranging from single-cell time series to large-scale omics. Given the fast technological advances in molecular biology, the aim is to provide and collaboratively apply innovative tools with experimental groups in order to jointly advance the understanding and treatment of common human diseases.