MRI images are understandably complex and data-heavy. Because of this, developers training large language models (LLMs) for MRI analysis have had to slice captured images into 2D sections. But this results in just an approximation of the original image, thus limiting the model’s ability to analyze intricate anatomical structures. This creates challenges in complex cases involving brain tumors, skeletal disorders or cardiovascular diseases.

But GE Healthcare appears to have overcome this massive hurdle, introducing the industry’s first full-body 3D MRI research foundation model (FM) at this year’s AWS re:Invent. For the first time, models can use full 3D images of the entire body.

GE Healthcare’s FM was built on AWS from the ground up (there are very few models specifically designed for medical imaging like MRIs) and is based on more than 173,000 images from over 19,000 studies. Developers say they have been able to train the model with five times less compute than previously required.

GE Healthcare has not yet commercialized the foundation model; it is still in an evolutionary research phase. An early evaluator, Mass General Brigham, is set to begin experimenting with it soon.

“Our vision is to put these models into the hands of technical teams working in healthcare systems, giving them powerful tools for developing research and clinical applications faster, and also more cost-effectively,” GE HealthCare chief AI officer Parry Bhatia told VentureBeat.

Enabling real-time analysis of complex 3D MRI data

While this is a groundbreaking development, generative AI and LLMs are not new territory for the company. The team has been working with advanced technologies for more than 10 years, Bhatia explained.
One of its flagship products is AIR Recon DL, a deep learning-based reconstruction algorithm that allows radiologists to more quickly achieve crisp images. The algorithm removes noise from raw images and improves signal-to-noise ratio, cutting scan times by up to 50%. Since 2020, 34 million patients have been scanned with AIR Recon DL.

GE Healthcare began working on its MRI FM at the beginning of 2024. Because the model is multimodal, it can support image-to-text searching, link images and words, and segment and classify diseases. The goal is to give healthcare professionals more details in one scan than ever before, said Bhatia, leading to faster, more accurate diagnosis and treatment.

“The model has significant potential to enable real-time analysis of 3D MRI data, which can improve medical procedures like biopsies, radiation therapy and robotic surgery,” Dan Sheeran, GM for health care and life sciences at AWS, told VentureBeat.

Already, it has outperformed other publicly available research models in tasks including classification of prostate cancer and Alzheimer’s disease. It has exhibited accuracy of up to 30% in matching MRI scans with text descriptions in image retrieval. That might not sound all that impressive, but it’s a big improvement over the 3% capability exhibited by similar models.

“It has come to a stage where it’s giving some really robust results,” said Bhatia. “The implications are huge.”

Doing more with (much less) data

The MRI process requires a few different types of datasets to support various techniques that map the human body, Bhatia explained. What’s known as a T1-weighted imaging technique, for instance, highlights fatty tissue and decreases the signal of water, while T2-weighted imaging enhances water signals. The two methods are complementary and create a full picture of the brain to help clinicians detect abnormalities like tumors, trauma or cancer.
“MRI images come in all different shapes and sizes, similar to how you would have books in different formats and sizes, right?” said Bhatia.

To overcome challenges presented by diverse datasets, developers introduced a “resize and adapt” strategy so that the model could process and react to different variations. Also, data may be missing in some areas (an image may be incomplete, for instance), so they taught the model simply to ignore those instances.

“Instead of getting stuck, we taught the model to skip over the gaps and focus on what was available,” said Bhatia. “Think of this as solving a puzzle with some missing pieces.”

The developers also employed semi-supervised student-teacher learning, which is particularly helpful when there is limited data. With this method, two different neural networks are trained on both labeled and unlabeled data, with the teacher creating labels that help the student learn and predict future labels.

“We’re now using a lot of these self-supervised technologies, which don’t require huge amounts of data or labels to train large models,” said Bhatia. “It reduces the dependencies, where you can learn more from these raw images than in the past.”

This helps to ensure that the model performs well in hospitals with fewer resources, older machines and different kinds of datasets, Bhatia explained.

He also underscored the importance of the models’ multimodality. “A lot of technology in the past was unimodal,” said Bhatia. “It would look only into the image, into the text. But now they’re becoming multimodal, they can go from image to text, text to image, so that you can bring in a lot of things that were done with separate models in the past and really unify the workflow.”

He emphasized that researchers only use datasets that they have rights to; GE Healthcare has partners who license de-identified datasets, and they’re careful to adhere to compliance standards and policies.
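For readers curious how the student-teacher idea described above works mechanically, here is a minimal sketch of one common variant, pseudo-labeling. It is not GE Healthcare's method: the nearest-centroid "model", the function names, the 0.9 confidence threshold and the synthetic 2D data are all assumptions chosen to keep the example small, and real systems would use neural networks on image volumes.

```python
import numpy as np


def fit_centroids(x, y, n_classes):
    """Toy 'model' (an assumption): one mean feature vector per class."""
    return np.stack([x[y == c].mean(axis=0) for c in range(n_classes)])


def predict_proba(centroids, x):
    """Softmax over negative squared distances to each class centroid."""
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def train_student(x_lab, y_lab, x_unlab, n_classes, threshold=0.9):
    # 1. The teacher learns from the small labeled set only.
    teacher = fit_centroids(x_lab, y_lab, n_classes)
    # 2. The teacher proposes pseudo-labels for the unlabeled data.
    proba = predict_proba(teacher, x_unlab)
    pseudo = proba.argmax(axis=1)
    confident = proba.max(axis=1) >= threshold  # keep confident guesses only
    # 3. The student learns from real labels plus confident pseudo-labels.
    x_all = np.concatenate([x_lab, x_unlab[confident]])
    y_all = np.concatenate([y_lab, pseudo[confident]])
    student = fit_centroids(x_all, y_all, n_classes)
    return student, int(confident.sum())


# Synthetic stand-in data: two well-separated clusters, four labeled points.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 6.0]])
y_lab = np.array([0, 0, 1, 1])
x_lab = centers[y_lab] + rng.normal(size=(4, 2))
y_true = rng.integers(0, 2, size=200)  # hidden from training, used for scoring
x_unlab = centers[y_true] + rng.normal(size=(200, 2))

student, n_used = train_student(x_lab, y_lab, x_unlab, n_classes=2)
accuracy = (predict_proba(student, x_unlab).argmax(axis=1) == y_true).mean()
print(f"pseudo-labeled examples used: {n_used}, student accuracy: {accuracy:.2f}")
```

The point of the sketch is the data flow: only four examples carry real labels, yet the student is fit on a much larger set because the teacher's confident predictions stand in for the missing labels.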
Using AWS SageMaker to tackle computation, data challenges

Undoubtedly, there are many challenges when building such sophisticated models, such as limited computational power for 3D images that are gigabytes in size.

“It’s a massive 3D volume of data,” said Bhatia. “You need to bring it into the memory of the model, which is a really complex problem.”

To help overcome this, GE Healthcare built on Amazon SageMaker, which provides high-speed networking and distributed training capabilities across multiple GPUs, and leveraged Nvidia A100 and Tensor Core GPUs for large-scale training.

“Because of the size of the data