Mark Pickering

Research Portfolio

Welcome to my research portfolio!

My research interests are in the areas of image and video processing, computer vision, and artificial intelligence.

In this portfolio, you will find a collection of recent research projects that demonstrate my ability to find solutions to complex challenges through innovative problem-solving.

Research Projects

OrthoVis (2D–3D alignment of CT scans and X-rays)

I developed the OrthoVis software package for a long-standing collaboration with the Canberra Hospital. The main aim of this collaboration is to develop a more effective means of measuring the relative motion of the bones in human joints using standard hospital imaging equipment. This package allows the user to import a CT scan and a video x-ray of the joint to be analysed and produces 3D motion data for the motion of the joint depicted in the video x-ray.

An example of the results of the multi-modal registration procedure is shown below. In the figure on the left, the edges from the 3D CT scan, at its estimated 3D position, are projected onto the corresponding fluoroscopy frame. The figure on the right shows an example of the 3D surface rendering produced by the code.
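At its core, this type of 2D–3D registration searches over candidate rigid poses of the CT volume: for each candidate, edge points from the CT scan are projected onto the fluoroscopy image and scored against the detected edges. The sketch below illustrates that scoring step with a simple pinhole camera model; the function names and the similarity measure are illustrative only, not the actual OrthoVis implementation.

```python
import numpy as np

def project_points(points_3d, pose, focal_length=1000.0):
    """Apply a rigid pose (R, t) to 3D edge points and project them
    onto the 2D fluoroscopy image plane with a pinhole camera model."""
    R, t = pose
    transformed = points_3d @ R.T + t
    # Perspective division: u = f * X / Z, v = f * Y / Z
    return focal_length * transformed[:, :2] / transformed[:, 2:3]

def edge_similarity(projected, edge_map):
    """Score a candidate pose by how well the projected CT edge points
    land on detected edges in the fluoroscopy frame (1.0 = all points
    fall on an edge pixel)."""
    h, w = edge_map.shape
    cols = np.clip(np.round(projected[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.round(projected[:, 1]).astype(int), 0, h - 1)
    return edge_map[rows, cols].mean()
```

An optimizer would then adjust the six pose parameters to maximize this score for each fluoroscopy frame, yielding the 3D motion of the bone over the video sequence.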

The OrthoVis software is now an intrinsic part of a major clinical trial to evaluate the performance of three different artificial knee designs. The software package has been used to measure the knee motion data for over 170 patients and, when completed, the data collected from the trial will form the largest ever study of knee motion.

The following publications describe methods and experimental results that were generated using the OrthoVis software:

  • Saadat S., Perriman D., Scarvell J. M., Smith P. N., Galvin C. R., Lynch J. & Pickering M. R. (2022), An efficient hybrid method for 3D to 2D medical image registration, International Journal of Computer Assisted Radiology and Surgery, Vol. 17, pp. 1313 – 1320, doi: 10.1007/s11548-022-02624-0.
  • Saadat S., Asikuzzaman M., Pickering M. R., Perriman D. M., Scarvell J. M. & Smith P. N. (2021)  A Fast and Robust Framework for 3D/2D Model to Multi-Frame Fluoroscopy Registration, IEEE Access, Vol. 9, pp. 134223 – 134239, doi: 10.1109/ACCESS.2021.3114366.
  • Lynch J. T., Perriman D.M., Scarvell J.M., Pickering M. R., Galvin C. R., Neeman T. & Smith P. N. (2021) The influence of total knee arthroplasty design on kneeling kinematics: A prospective randomized clinical trial, Bone and Joint Journal, Vol. 103, pp. 105 – 112, doi: 10.1302/0301-620X.103B1.BJJ-2020-0958.R1.
  • Ward T. R., Hussain M. M., Pickering M. R., Perriman D., Burns A., Scarvell J. & Smith P. N. (2021) Validation of a method to measure three-dimensional hip joint kinematics in subjects with femoroacetabular impingement, HIP International, Vol. 31, pp. 133 – 139, doi: 10.1177/1120700019883548.
  • Lynch J. T., Perriman D. M., Scarvell J. M., Pickering M. R., Warmenhoven J., Galvin C. R., Neeman T., Besier T. F. & Smith P. N. (2020) Shape is only a weak predictor of deep knee flexion kinematics in healthy and osteoarthritic knees, Journal of Orthopaedic Research, Vol. 38, pp. 2250 – 2261, doi: 10.1002/jor.24622.
  • Galvin, C. R., Perriman, D., Scarvell, J. M., Lynch, J. T., Pickering, M. R., Smith, P. N., & Newman, P. (2019). Age has a minimal effect on knee kinematics: a cross-sectional 3D/2D image-registration study of kneeling. The Knee, accepted 20 July.
  • Scarvell, J. M., Hribar, N., Galvin, C. R., Pickering, M. R., Perriman, D. M., Lynch, J. T., & Smith, P. N. (2019). Analysis of kneeling by medical imaging shows the femur moves back to the posterior rim of the tibial plateau, prompting review of the concave-convex rule. Physical Therapy, 99(3), 311-318. doi: 10.1093/ptj/pzy144.
  • Akter, M., Lambert, A. J., Pickering, M. R., Scarvell, J. M., & Smith, P. N. (2014). Robust initialisation for single-plane 3D CT to 2D fluoroscopy image registration. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, Taylor & Francis. doi: 10.1080/21681163.2014.897649.
  • Scarvell, J. M., Pickering, M. R., & Smith, P. N. (2009). New registration algorithm for determining 3D knee kinematics using CT and single-plane fluoroscopy with improved out-of-plane translation accuracy. Journal of Orthopaedic Research, 28(3), 334-340.

Medical Image Segmentation using Convolutional Neural Networks

Convolutional neural networks (CNNs) have achieved expert-level performance in many image processing applications. However, CNNs face the vanishing gradient problem when the number of layers is increased beyond a certain threshold. In this project, a new two-stage U-Net++ (TS-UNet++) architecture was developed to address the vanishing gradient problem.

Rather than repeating the same network in each stage, as in a traditional multi-stage design, the new architecture uses two different types of deep CNN: U-Net++ in the first stage and U-Net in the second. An extra convolutional block was added before the output layer of the multi-stage network to better extract high-level features.

A new concatenation-based fusion structure was incorporated in this architecture to enable deep supervision. More convolutional layers were added after each concatenation of the fusion structure to extract more representative features. An example of the results provided by the new network is shown below.
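The two-stage composition, the concatenation-based fusion and the extra convolutional block can be sketched in PyTorch as follows. The class and parameter names are illustrative, and the two stage networks are reduced to placeholder blocks; in the actual architecture they are full U-Net++ and U-Net models.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Two 3x3 convolutions with ReLU: a stand-in for the extra
    convolutional block added before the output layer."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.block(x)

class TwoStageNet(nn.Module):
    """Skeleton of the two-stage idea: stage 1 (U-Net++ in the paper)
    produces an initial segmentation; its output is concatenated with
    the input image and refined by stage 2 (U-Net in the paper)."""
    def __init__(self, stage1, stage2, features=16, n_classes=1):
        super().__init__()
        self.stage1, self.stage2 = stage1, stage2
        self.refine = ConvBlock(features, features)   # extra conv block
        self.head = nn.Conv2d(features, n_classes, 1) # output layer

    def forward(self, x):
        s1 = self.stage1(x)                   # coarse segmentation
        fused = torch.cat([x, s1], dim=1)     # concatenation-based fusion
        s2 = self.stage2(fused)
        return self.head(self.refine(s2))
```

Because the second stage sees both the raw image and the first stage's prediction, gradients reach the early layers along a much shorter path, which is the mechanism used to mitigate the vanishing gradient problem.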

The first row shows the original MRI images, the second row the manual segmentations, and the third row the automatic segmentations produced by the two-stage U-Net++ (TS-UNet++) model, with contours extracted from the ground truth shown as grey lines superimposed on the automatic segmentation results.

The following publication describes the algorithm and experimental results in more detail:

  1. Suman AA; Khemchandani Y; Asikuzzaman M; Webb AL; Perriman DM; Tahtali M; Pickering MR, 2020, ‘Evaluation of U-Net CNN Approaches for Human Neck MRI Segmentation’, in 2020 Digital Image Computing: Techniques and Applications, DICTA 2020, doi: 10.1109/DICTA51227.2020.9363385.

Deformable 3D-3D Registration for Neck MRI Volumes

Neck pain is one of the most common symptoms of cervical spine disease, and segmenting the neck muscles to obtain volumetric measurements may assist clinical diagnosis. While image registration is widely used to segment medical images, registration is highly challenging for neck muscles due to their tight proximity, their variation in shape and size among subjects, and their similar appearance. These challenges cause conventional multi-resolution registration methods to become trapped in local minima because of the low-degree-of-freedom geometric transforms they employ.

A novel object-constrained hierarchical registration algorithm for aligning inter-subject neck muscles was developed for this project. First, to handle large-scale local minima, the algorithm uses a coarse registration technique that aligns large mismatches by optimizing the new edge position difference (EPD) similarity measure. A new transformation scheme is also exploited, combining the discrete periodic spline wavelet (DPSW), affine and free-form deformation (FFD) transformations.

Second, to avoid the monotonous behaviour of applying the same transformations over multiple stages, a fine registration technique was designed to align small mismatches. This technique uses a double-pushing system that changes the edges used in the EPD and switches transformation resolutions. In both the coarse and fine techniques, the EPD enables object-constrained registration by allowing control over which edges drive the alignment, which is not possible with traditional similarity measures. The new method achieves better accuracy, robustness and consistency than existing methods.
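The published EPD measure is more elaborate than can be reproduced here, but its core idea, penalizing the distance between corresponding edge positions in the two images, can be illustrated with a distance-transform-based simplification. The function below is a sketch under that assumption, not the algorithm from the papers.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def edge_position_difference(fixed_edges, moving_edges):
    """Simplified edge-position mismatch: for each edge voxel in the
    moving image, look up its Euclidean distance to the nearest edge
    voxel in the fixed image and average.  Perfectly aligned edge maps
    score 0; the score grows with edge displacement."""
    # Distance from every voxel to the nearest fixed-image edge
    dist_to_fixed = distance_transform_edt(~fixed_edges.astype(bool))
    moving = moving_edges.astype(bool)
    if not moving.any():
        return np.inf
    return dist_to_fixed[moving].mean()
```

Because the measure is computed only on edge voxels, restricting the edge maps to a particular muscle restricts the registration to that object, which is the sense in which an edge-based measure supports object-constrained registration.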

The figure below shows example registration results for each stage of the new algorithm. The two images on the top left show the fixed image and the moving image before registration; the remaining images show the registered moving image, with the edges of the fixed image superimposed, after the Affine-EPD, DPSW-EPD, Coarse-EPD, Fine-EPD and final stages respectively.

The following publications describe the algorithm and experimental results in more detail:

  1. Al Suman, A., Asikuzzaman, M., Webb, A. L., Perriman, D. M., Tahtali, M. & Pickering, M. R. (2020). A Deformable 3D-3D Registration Framework Using Discrete Periodic Spline Wavelet and Edge Position Difference. IEEE Access, 8, 146116-146133. doi: 10.1109/ACCESS.2020.3015504.
  2. Suman, A. A., Aktar, M. N., Asikuzzaman, M., Webb, A. L., Perriman, D. M., & Pickering, M. R. (2019). Segmentation and reconstruction of cervical muscles using knowledge-based grouping adaptation and new step-wise registration with discrete cosines. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 7(1), 12-25. doi:10.1080/21681163.2017.1356751

Cursive Text Recognition in Natural Scene Images Using Deep Learning

Text recognition in natural scene images is a challenging problem in computer vision. It is more complex than optical character recognition (OCR) due to variations in text size, colour, font and orientation, complex backgrounds, occlusion and uneven illumination.

In this project, a segmentation-free method based on a deep convolutional recurrent neural network was developed to solve the problem of cursive text recognition, particularly focusing on Urdu text in natural scenes. Compared to non-cursive scripts, Urdu text recognition is more complex due to variations in the writing styles, the occurrence of different shapes for the same character, connected text, ligature overlapping, stretched, diagonal and condensed text.

The new model is based on three components: a deep convolutional neural network (CNN) with shortcut connections to extract and encode the features, a recurrent neural network (RNN) to decode the convolutional features, and a connectionist temporal classification (CTC) layer to map the predicted sequences to the target labels.
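The CNN–RNN–CTC pipeline can be sketched in PyTorch as follows. The real network uses a deeper CNN with shortcut connections; here a small convolutional stack stands in for the encoder, and the class and parameter names are illustrative only.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Minimal CNN -> RNN pipeline producing per-time-step class scores,
    to be trained with nn.CTCLoss.  Each column of the convolutional
    feature map becomes one time step of the sequence."""
    def __init__(self, n_classes, img_h=32, channels=64, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, 2),                  # halve H and W
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)))                # reduce height only
        self.rnn = nn.LSTM(channels * (img_h // 4), hidden,
                           bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, n_classes + 1)  # +1 for CTC blank

    def forward(self, x):                 # x: (N, 1, H, W)
        f = self.cnn(x)                   # (N, C, H/4, W/2)
        n, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(n, w, c * h)  # one step per column
        out, _ = self.rnn(f)
        return self.fc(out)               # (N, W/2, n_classes + 1)
```

At training time the per-step scores are log-softmaxed and fed to `nn.CTCLoss`, which aligns the variable-length prediction sequence with the target label sequence without requiring character-level segmentation, the property that makes this approach segmentation-free.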

An example of the results provided by the new network is shown below. For each word image, the annotations at the bottom left are the ground truths, while those at the bottom right are the predicted text.

The following publications describe the algorithm and experimental results in more detail:

  • Chandio A. A., Asikuzzaman M., Pickering M. R. & Leghari M. (2022) Cursive Text Recognition in Natural Scene Images Using Deep Convolutional Recurrent Neural Network, IEEE Access, Vol. 10, pp. 10062 – 10078, doi: 10.1109/ACCESS.2022.3144844.
  • Chandio, A. A., Asikuzzaman, M., Pickering, M. R. & Leghari, M. (2020). Cursive-Text: A Comprehensive Dataset for End-to-End Urdu Text Recognition in Natural Scene Images. Data in Brief, 31, 105749. doi: 10.1016/j.dib.2020.105749
  • Chandio, A. A., Asikuzzaman, M. & Pickering, M. R. (2020). Cursive Character Recognition in Natural Scene Images Using a Multilevel Convolutional Neural Network Fusion. IEEE Access, 8, 109054-109070.

Instance Segmentation Using Deep Learning

In this project, AI technology was developed to monitor how koalas are using under-road tunnels or above-road crossings with the ultimate goal of providing information to help protect the declining population.

Previously, captured images had to be checked manually to determine whether the animals filmed using the crossings were koalas or other species. Deep learning networks are well suited to tasks such as image recognition, speech recognition and natural language processing. A convolutional neural network (CNN) is a type of deep learning architecture commonly used for image classification and recognition; it contains multiple convolutional layers that apply filters to the input image to extract features.

The modified Mask R-CNN network was able to accurately detect koalas in the trap camera images when trained with a limited number of training images.

An example of the results provided by the new network is shown in the figure below. Each row contains the detection results on different frames extracted from the same video. Our method detected koalas more accurately than the baseline method that uses only semantic information.