Our deep neural network approach for phase retrieval and holographic image reconstruction is schematically described in Figure 1 (see also Supplementary Figs. S1–S4). In this work, we chose to demonstrate the proposed framework using lens-free digital in-line holography of transmissive samples, including human tissue sections, blood smears and Pap smears (see Materials and Methods). Due to the dense and connected nature of the samples that we imaged, their in-line holographic imaging normally requires the acquisition of multiple holograms for accurate and artifact-free object recovery52. A schematic of our experimental setup is shown in Supplementary Fig. S5: the sample is positioned very close to a CMOS sensor chip, with a < 1 mm sample-to-sensor distance, which provides an important advantage in terms of the sample field of view that can be imaged. However, because of this relatively short sample-to-sensor distance, the twin-image artifact of in-line holography, which results from the lost phase information, is strong and severely obstructs the spatial features of the sample in both the amplitude and phase channels, as illustrated in Figures 1 and 2.
Fig. 1
Following its training phase, the deep neural network blindly outputs artifact-free phase and amplitude images of the object using only one hologram intensity. This deep neural network is composed of convolutional layers, residual blocks and upsampling blocks (see Supplementary Information for additional details) and rapidly processes a complex-valued input image in a parallel, multi-scale manner.
Fig. 2
Comparison of the holographic reconstruction results for different types of samples: (a–h) Pap smear, (i–p) breast tissue section. (a, i) Zoomed-in regions of interest from the acquired holograms. (b, c, j, k) Amplitude and phase images resulting from free-space back-propagation of a single hologram intensity, shown in a and i, respectively. These images are contaminated with twin-image and self-interference-related spatial artifacts due to the missing phase information in the hologram detection process. (d, e, l, m) Corresponding amplitude and phase images of the same samples obtained by the deep neural network, demonstrating the blind recovery of the complex object image, without twin-image and self-interference artifacts, using a single hologram. (f, g, n, o) Amplitude and phase images of the same samples reconstructed using multi-height phase retrieval with 8 holograms acquired at different sample-to-sensor distances. (h, p) Corresponding bright-field microscopy images of the same samples, shown for comparison. The yellow arrows point to artifacts in f, g, n, o (due to out-of-focus dust particles or other unwanted objects) that are significantly suppressed by the network reconstruction, as shown in d, e, l, m.

The first step in our deep learning-based phase retrieval and holographic image reconstruction framework consists of 'training' the neural network. This training involves learning the statistical transformation between a complex-valued image that results from the back-propagation of a single intensity-only hologram of the object and the same object's image reconstructed using a multi-height phase retrieval algorithm (treated as the gold standard for the training phase). This algorithm uses 8 hologram intensities acquired at different sample-to-sensor distances (see Materials and Methods as well as Supplementary Information).
As illustrated in Figures 1, 2 and 3, a simple back-propagation of the object's hologram, without phase retrieval, contains severe twin-image and self-interference-related artifacts that obscure the phase and amplitude information of the object. This training/learning process (performed only once) results in a fixed deep neural network that is then used to blindly reconstruct the phase and amplitude images of any object, free from twin-image and other undesired interference-related artifacts, using a single hologram intensity.
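As a toy illustration of this training step, the sketch below fits a widely linear pixel model y = w1·x + w2·conj(x) to a synthetic back-propagated field corrupted by a conjugate (twin-image-like) term, with the clean field playing the role of the Nholo=8 gold standard. This is not the paper's actual architecture (a multi-scale CNN with residual blocks); the corruption model and all parameter values are hypothetical, chosen only to show why learning the input-to-target mapping can undo a conjugated artifact term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "gold standard" complex object field (stand-in for the
# Nholo=8 multi-height reconstruction used as the training target).
target = rng.standard_normal(4096) + 1j * rng.standard_normal(4096)

# Crude twin-image-like corruption (hypothetical): the back-propagated
# single-hologram field carries a conjugate copy on top of the true field.
x = target + 0.5 * np.conj(target)

# Widely linear model y = w1*x + w2*conj(x); the conjugate branch is what
# allows the anti-linear (conjugated) twin term to be removed.
w1 = w2 = 0.0 + 0.0j
lr = 0.1
for _ in range(2000):
    y = w1 * x + w2 * np.conj(x)
    err = y - target
    # Wirtinger gradients of the mean squared error
    w1 -= lr * np.mean(err * np.conj(x))
    w2 -= lr * np.mean(err * x)

mse = np.mean(np.abs(w1 * x + w2 * np.conj(x) - target) ** 2)
# Training drives mse toward 0; the exact solution is w1 = 4/3, w2 = -2/3.
```

The CNN in the paper plays the same role as this toy regression, but learns a far richer, spatially varying and nonlinear mapping from the training image pairs.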
Fig. 3
Red blood cell volume estimation using our deep neural network-based phase retrieval. The deep neural network output (e, f), given the input (c, d) obtained from a single hologram intensity (b), shows a good match with the multi-height phase recovery-based cell volume estimation results (a), calculated using Nholo=8 (g, h). Similar to the yellow arrows shown in Figure 2f, 2g, 2n and 2o, the multi-height phase recovery results exhibit an out-of-focus fringe artifact at the center of the field of view in (g, h). Refer to Supplementary Information for the calculation of the effective refractive volume of cells.

In our holographic imaging experiments, we used three different types of samples: blood smears, Pap smears and breast tissue sections, and we separately trained a convolutional neural network for each sample type, although the network architecture was identical in each case, as shown in Figure 1. To avoid over-fitting, we stopped the training when the deep neural network's performance on the validation image set (which is different from both the training image set and the blind testing image set) began to decline. We also accordingly kept the network compact and applied pooling approaches53. Following this training process, each deep neural network was blindly tested with different objects that were not used in the training or validation image sets. Figures 1, 2 and 3 show the neural network-based blind reconstruction results for the Pap smears, breast tissue sections and blood smears. These reconstructed phase and amplitude images clearly demonstrate the success of our deep neural network-based holographic image reconstruction approach in blindly inferring artifact-free phase and amplitude images of the objects, matching the performance of multi-height phase recovery.
Table 1 further compares the structural similarity54 (SSIM) of our neural network output images (using a single input hologram, that is, Nholo=1) against the results obtained with a traditional multi-height phase retrieval algorithm using multiple holograms (that is, Nholo=2, 3, …, 8) acquired at different sample-to-sensor distances. A comparison of the SSIM index values reported in Table 1 suggests that the imaging performance of the deep neural network using a single hologram is comparable to that of multi-height phase retrieval, closely matching the SSIM performance of Nholo=2 for both Pap smear and breast tissue samples and the SSIM performance of Nholo=3 for blood smear samples. The deep neural network-based reconstruction approach thus reduces the number of required holograms two- to three-fold. In addition to this reduction in the number of holograms, the computation time of the neural network-based holographic reconstruction is also more than three- and four-fold shorter than that of the multi-height phase retrieval using Nholo=2 and Nholo=3, respectively (see Table 2).
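The SSIM metric used in Table 1 can be written out explicitly. The sketch below implements a single-window (global) variant of the SSIM formula in Python on hypothetical test images; the published values were presumably computed with the standard locally windowed SSIM54, so this is only an illustration of the metric itself.

```python
import numpy as np

def ssim_index(x, y, data_range=1.0):
    """Global (single-window) SSIM index between two images.

    Simplified variant: the standard SSIM averages this quantity over
    local Gaussian-weighted windows rather than the whole image."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return (((2 * mx * my + c1) * (2 * cxy + c2)) /
            ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

rng = np.random.default_rng(1)
ref = rng.random((64, 64))                 # stand-in for the Nholo=8 result
noisy = np.clip(ref + 0.1 * rng.standard_normal(ref.shape), 0, 1)

perfect = ssim_index(ref, ref)             # identical images -> index of 1
degraded = ssim_index(noisy, ref)          # artifacts lower the index
```

As in the table, an image compared against itself yields an SSIM index of 1 by definition, and artifacts pull the index below 1.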
Table 1. Comparison of the SSIM index values between the deep neural network output images obtained with a single hologram intensity (for both the sample-type-specific (STS) and universal networks) and the multi-height phase retrieval results for different numbers of input holograms (Nholo), corresponding to Pap smear samples, breast tissue histopathology slides and blood smear samples

                                              Pap smear        Blood smear      Breast tissue
Reconstruction method                         real    imag     real    imag     real    imag
Deep network input (Nholo=1)                  0.726   0.431    0.701   0.048    0.826   0.428
Deep network output, STS (Nholo=1)            0.895   0.870    0.942   0.930    0.916   0.912
Deep network output, universal (Nholo=1)      0.893   0.870    0.951   0.925    0.921   0.916
Multi-height phase recovery (Nholo=2)         0.875   0.840    0.890   0.46     0.931   0.911
Multi-height phase recovery (Nholo=3)         0.922   0.900    0.942   0.849    0.955   0.943
Multi-height phase recovery (Nholo=4)         0.954   0.948    0.962   0.907    0.975   0.970
Multi-height phase recovery (Nholo=5)         0.979   0.979    0.970   0.935    0.981   0.979
Multi-height phase recovery (Nholo=6)         0.985   0.986    0.975   0.938    0.983   0.981
Multi-height phase recovery (Nholo=7)         0.986   0.987    0.977   0.955    0.984   0.982
Multi-height phase recovery (Nholo=8)         1       1        1       1        1       1

In each case, the SSIM index is separately calculated for the real and imaginary parts of the resulting complex-valued image with respect to the multi-height phase recovery result for Nholo=8; thus, by definition, the last row has an SSIM index of 1. Due to the presence of twin-image and self-interference artifacts, the input images (first row) have the worst performance.
Table 2. Comparison of the holographic image reconstruction runtime for a field of view of ~1 mm2 for different phase recovery approaches

Reconstruction method                         Runtime (s)
Deep network output, STS (Nholo=1)            6.45
Deep network output, universal (Nholo=1)      7.85
Multi-height phase recovery (Nholo=2)         23.20
Multi-height phase recovery (Nholo=3)         28.32
Multi-height phase recovery (Nholo=4)         32.11
Multi-height phase recovery (Nholo=5)         35.89
Multi-height phase recovery (Nholo=6)         38.28
Multi-height phase recovery (Nholo=7)         43.13
Multi-height phase recovery (Nholo=8)         47.43

All the reconstructions were performed on a laptop using a single GPU (see Supplementary Information for details). Of the 6.45 s and 7.85 s required for image reconstruction from a single hologram intensity using the sample-type-specific and universal neural networks, respectively, the deep neural network processing time is 3.11 s for the sample-type-specific network and 4.51 s for the universal network; the remainder (that is, 3.34 s for the preprocessing stages) is spent on other operations such as pixel super-resolution, auto-focusing and free-space back-propagation.
The phase retrieval performance of our neural network is further demonstrated by imaging red blood cells (RBCs) in a whole blood smear. Using the reconstructed phase images of RBCs, the relative phase delay with respect to the background (where no cells are present) is calculated to reveal the phase integral per RBC (given in units of rad·μm2—see Supplementary Information for details), which is directly proportional to the volume of each cell, V. In Figure 3a, we compare the phase integral values of 127 RBCs in a given region of interest, which were calculated using the phase images of the network input, the network output, and the multi-height phase recovery image obtained with Nholo=8. Due to the twin-image and other self-interference-related spatial artifacts, the effective cell volume and the phase integral values calculated using the network input image demonstrated a highly random behavior. This behavior is shown as the scattered blue dots in Figure 3a and is significantly improved by the network output, shown as the red dots in the same figure.
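The phase-integral-to-volume relation can be made concrete. For a cell of thickness h(x, y) and refractive index contrast Δn, the phase delay is φ = 2πΔn·h/λ, so integrating φ over the cell area gives a quantity proportional to its volume. The sketch below computes this on a synthetic phase image; the wavelength, Δn and pixel pitch are illustrative values, not the paper's parameters, and the disc-shaped "cell" is purely synthetic.

```python
import numpy as np

# Illustrative parameters (assumptions, not taken from the paper)
wavelength_um = 0.53      # illumination wavelength
delta_n = 0.06            # assumed cell-vs-medium refractive index contrast
pixel_area_um2 = 0.373 ** 2  # hypothetical effective pixel pitch, squared

# Synthetic phase image: a disc-shaped "cell" with a 1.2 rad uniform delay;
# the background (no cells) is taken as the zero-phase reference.
yy, xx = np.mgrid[:128, :128]
cell = (xx - 64) ** 2 + (yy - 64) ** 2 < 15 ** 2
phase = np.where(cell, 1.2, 0.0)

# Phase integral per cell, in rad * um^2
phase_integral = phase.sum() * pixel_area_um2

# Effective volume: V = lambda * phase_integral / (2 * pi * delta_n)
volume_um3 = wavelength_um * phase_integral / (2 * np.pi * delta_n)
```

Because V is directly proportional to the phase integral, any twin-image contamination of the phase channel (as in the network input) propagates straight into the volume estimate, which is why the scattered blue dots in Figure 3a are so dispersed.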
Next, to evaluate the tolerance of the deep neural network and its holographic reconstruction framework to axial defocusing, we digitally back-propagated the hologram intensity of a breast tissue section to different depths, that is, to defocusing distances spanning z=[−20 μm, +20 μm] in Δz=1 μm increments. We then fed each resulting complex-valued image as input into the same fixed neural network, which was trained using in-focus images at z=0 μm. The amplitude SSIM index of each network output was evaluated with respect to the multi-height phase recovery image with Nholo=8 used as the reference (Figure 4). Although the deep neural network was trained with in-focus images, Figure 4 demonstrates its ability to blindly reconstruct defocused holographic images with a negligible drop in image quality across the imaging system's depth of field, which is ~4 μm.
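The digital defocusing in this test is free-space propagation, conventionally implemented with the angular spectrum method. A minimal numpy sketch is given below; the grid size, wavelength and pixel pitch are illustrative values rather than the paper's exact parameters.

```python
import numpy as np

def angular_spectrum_propagate(field, z_um, wavelength_um, dx_um):
    """Propagate a complex field by a distance z (um) in free space
    using the angular spectrum method."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx_um)     # spatial frequencies (cycles/um)
    fy = np.fft.fftfreq(ny, d=dx_um)
    fxx, fyy = np.meshgrid(fx, fy)
    kz2 = (1.0 / wavelength_um) ** 2 - fxx ** 2 - fyy ** 2
    prop = np.where(kz2 > 0,
                    np.exp(2j * np.pi * z_um * np.sqrt(np.maximum(kz2, 0.0))),
                    0)                    # evanescent components are dropped
    return np.fft.ifft2(np.fft.fft2(field) * prop)

rng = np.random.default_rng(2)
field0 = rng.standard_normal((128, 128)) + 1j * rng.standard_normal((128, 128))

# Defocus by +20 um, then back-propagate by -20 um: the round trip
# recovers the original field (no evanescent content at this sampling).
defocused = angular_spectrum_propagate(field0, 20.0, 0.53, 1.0)
refocused = angular_spectrum_propagate(defocused, -20.0, 0.53, 1.0)
```

In the defocus-tolerance experiment, each propagated complex field of this kind is fed to the fixed network in place of the in-focus back-propagation it was trained on.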
Fig. 4
Estimation of the depth defocusing tolerance of the deep neural network. (a) SSIM index for the neural network output images when the input image is defocused (that is, deviates from the optimal focus used in the training of the network). The SSIM index compares the network output images in d, f and h with the image obtained using the multi-height phase recovery algorithm with Nholo=8, shown in b.

In a digital in-line hologram, the intensity of the light incident on the sensor array can be written as
$$ \begin{aligned} I(x, y) &=|A+a(x, y)|^{2} \\ &=|A|^{2}+|a(x, y)|^{2}+A^{*} a(x, y)+A a^{*}(x, y) \end{aligned} $$ (1) where A is the uniform reference wave that is directly transmitted and a(x, y) is the complex-valued light wave scattered by the sample. Under plane wave illumination, we can assume, without loss of generality, that A has zero phase at the detection plane, that is, A=|A|. For a weakly scattering object, the self-interference term $|a(x, y)|^{2}$ can be ignored compared with the other terms in Equation (1) because $|a(x, y)| \ll |A|$. As detailed in our Supplementary Information, none of the samples that we imaged in this work satisfies this weakly scattering assumption. More specifically, the root-mean-squared (RMS) modulus of the scattered wave was measured to be approximately 28%, 34% and 37% of the reference wave's RMS modulus for the breast tissue, Pap smear and blood smear samples, respectively. This is why, for in-line holographic imaging of such strongly scattering and structurally dense samples, self-interference-related terms, in addition to twin-image terms, form strong image artifacts in both the phase and amplitude channels of the sample, making it difficult to apply object support-based constraints for phase retrieval. This necessitates additional holographic measurements for traditional phase recovery and holographic image reconstruction methods, such as the multi-height phase recovery approach that we used for comparison in this work. Without increasing the number of holographic measurements, our deep neural network-based phase retrieval technique can learn to separate/clean the phase and amplitude images of the objects from twin-image and self-interference-related spatial artifacts, as illustrated in Figures 1, 2 and 3. In principle, one could also use off-axis interferometry55-57 to image strongly scattering samples.
However, this would create a penalty in the resolution or field of view of the reconstructed images due to the reduction in the space-bandwidth product of an off-axis imaging system.
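The decomposition in Equation (1) can be checked numerically: the recorded intensity equals the sum of the DC, self-interference and cross (object plus twin-image) terms exactly. The sketch below uses a synthetic scattered field whose RMS modulus is roughly 30% of the reference, comparable to the strongly scattering samples above; all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

A = 1.0  # uniform reference wave with zero phase at the detection plane
# Synthetic scattered field; 0.21 per quadrature gives an RMS modulus of
# roughly 0.3*|A|, i.e. a strongly scattering sample (illustrative value).
a = 0.21 * (rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64)))

intensity = np.abs(A + a) ** 2                   # what the sensor records

dc = np.abs(A) ** 2                              # |A|^2
self_interference = np.abs(a) ** 2               # |a|^2 (not negligible here)
cross_terms = np.conj(A) * a + A * np.conj(a)    # object + twin-image terms

# Equation (1): the four terms sum exactly to the recorded intensity
reconstructed = dc + self_interference + cross_terms.real

# Relative strength of the scattered wave (cf. the ~28-37% reported)
rms_ratio = np.sqrt(np.mean(np.abs(a) ** 2)) / np.abs(A)
```

At this scattering strength the |a|² term is comparable to the cross terms, which is why it cannot simply be dropped and why the network must learn to remove both artifact types.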
Another important property of this deep neural network-based holographic reconstruction framework is that it significantly suppresses out-of-focus interference artifacts, which frequently appear in holographic images due to dust particles or other imperfections in the surfaces or optical components of the imaging setup. These naturally occurring artifacts are highlighted with yellow arrows in Figure 2f, 2g, 2n and 2o and are cleaned in the corresponding network output images in Figure 2d, 2e, 2l and 2m. From the perspective of our trained neural network, this ability to suppress out-of-focus interference artifacts stems from the fact that, owing to the spatial defocusing operation, such holographic artifacts fall into the same category as twin-image artifacts, which helps the trained network reject them in the reconstruction process. This is especially important for coherent imaging systems because various unwanted particles and features form holographic fringes on the sensor plane that superimpose on the object's hologram, degrading the perceived image quality after image reconstruction.
In this study, we used the same neural network architecture depicted in Figure 1 and Supplementary Figs. S1–S2 for all object types, and based on this design, we separately trained the convolutional neural network for different types of objects (for example, breast tissue vs Pap smear). The neural network was then fixed after the training process to blindly reconstruct the phase and amplitude images of any object of the same type. If a different type of sample (for example, a blood smear image) was used as an input for a specific network trained on a different sample type (for example, Pap smear images), reconstruction artifacts would appear, as exemplified in Supplementary Fig. S6. However, this does not pose a limitation because in most imaging experiments, the type of the sample is known, although its microscopic features are unknown and must be revealed with a microscope. This is the case for biomedical imaging and pathology since the samples are prepared (for example, stained and fixed) with the correct procedures, tailored for the type of the sample. Therefore, the use of an appropriately trained neural network for a given type of sample can be considered well aligned with traditional uses of digital microscopy tools.
We also created and tested a universal neural network that can reconstruct different types of objects after its training, based on the same architecture used in our earlier networks. To handle different object types using a single neural network, we increased the number of feature maps in each convolutional layer from 16 to 32 (Supplementary Information), which also increased the complexity of the network, leading to increased training times. However, the reconstruction runtime (after the network was fixed) increased marginally from approximately 6.45 s to 7.85 s for a field of view of 1 mm2 (Table 2). Table 1 also compares the SSIM index values achieved using this universal network, which performed similarly to the individual object-type-specific networks. A further comparison between the holographic image reconstructions achieved by this universal network and the object-type-specific networks is also provided in Figure 5, confirming the same conclusion as in Table 1.
Fig. 5
Comparison of the holographic image reconstruction results for the sample-type-specific and universal deep networks for different types of samples. The deep neural network used a single hologram intensity as input, whereas Nholo=8 was used in the column on the right. (a–f) Blood smear. (g–l) Papanicolaou smear. (m–r) Breast tissue section.