Case Study: European Research Institute
Solving Scientific Research Bottlenecks with AI
Industry: Life sciences, research, academia
Customer: a prominent scientific research institute in Europe
Products: NF5488A5
Introduction
The customer is a leading scientific research institute in Germany, known for its ongoing research and contributions to modern life sciences and medicine. The Institute is committed to integrating knowledge from chemistry, biology, physics, and informatics, using advanced analytical technology to provide multi-parameter analysis methods for biomaterials. It fosters interdisciplinary research to promote innovative solutions in life sciences, medicine, and disease prevention.
Challenge
1) Data collection and processing
Pre-processing the many varieties of medical sample data at the Institute is extremely I/O intensive. Users faced the challenge of quickly obtaining a sufficient quantity of labeled data and improving the efficiency of data pre-processing. A large proportion of the original data consists of medical images, gene maps, and other image data, and screening and analyzing it is especially time-consuming.
2) Difficulties of data analysis
A common research method employed by the Institute is to build mathematical models for simulation and processing using deep learning. Deep learning training entails matrix multiplication over millions of parameters and iterations, a highly compute-intensive undertaking that can take up to several months without adequate compute power.
To meet this demand and optimize its research processes, the Institute needed a system that could provide supercomputing performance together with image processing and deep learning capabilities.
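To give a feel for why the training workload in challenge 2 is so compute-intensive, the sketch below multiplies two small matrices and estimates the multiply-add count for dense layers. It is an illustration only: the function names, layer sizes, and the "2 x batch x inputs x outputs" cost estimate are assumptions chosen for this example, not figures from the Institute's models, and the backward pass is not counted.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
        for i in range(n)
    ]


def dense_layer_flops(batch_size, n_inputs, n_outputs):
    """Approximate floating-point operations for one forward pass of a dense layer.

    Each output value needs n_inputs multiplications and about as many additions.
    """
    return 2 * batch_size * n_inputs * n_outputs


def training_flops(layer_sizes, batch_size, steps):
    """Rough forward-pass cost over all layers and training steps (hypothetical model)."""
    per_step = sum(
        dense_layer_flops(batch_size, n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )
    return per_step * steps
```

Even a modest hypothetical network with layer sizes 1000, 1000, and 10, a batch of 32, and 100 steps comes to several billion operations in the forward pass alone, which is why this kind of work is usually offloaded to GPUs.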
Solution
1) High-performance compute improves the efficiency of image analysis
The images needed for analysis and research include anatomical structure imaging, pathological images, molecular structures, gene maps, and other medical reports. Processing and analyzing these images is a major test of the GPU’s performance. Aivres’ NF5488A5 uses an optimized topology, NUMA support, and PCIe 4.0 interconnect support, and it greatly improves data transmission efficiency by having each CPU communicate with its nearest GPU. The result is faster analysis of medical and biological images: users can recognize the main features of an image more quickly and derive critical details such as pathological stage, body structure data, molecular structure, and gene arrangement. These data are then further screened and classified using Lasso analysis, univariate analysis, and other commonly used data analysis methods.
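The screening step mentioned above can be sketched in a few lines. The example below is a minimal illustration, assuming centered features and no intercept: it scores each feature by absolute Pearson correlation (one simple form of univariate analysis) and fits a Lasso model by coordinate descent with soft thresholding. The function names and the penalty value are hypothetical, and this is not the Institute's actual pipeline.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def univariate_scores(X, y):
    """Absolute correlation of each feature column with the target."""
    return [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1 / 2n) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(d):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # a constant-zero feature carries no signal
            # Correlation of feature j with the residual, with its own effect added back
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                w[j] = new_w
    return w
```

Features whose Lasso weight is driven exactly to zero are the ones the penalty screens out, which is what makes the method useful for narrowing down a large set of image-derived measurements.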
2) Deep learning capability assists model training and evaluation
After the preliminary analysis of the data, the NF5488A5 applies modeling principles to further evaluate and improve model accuracy through self-optimization over a large number of calculations. For example, once the relevant gene map data is fed into the model, the model continuously adjusts and optimizes itself by comparing its calculation results with the labels of the original data, producing a more accurate gene map analysis model. After working with methods such as random forest, KNN, and NLP, the Institute went on to build its own protein mass spectrometry analysis model tailored to its needs for customized analysis and research, further improving the efficiency of the model.
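The idea of checking a model's results against labeled data can be illustrated with KNN, one of the methods named above. The sketch below is a minimal, assumption-laden example: it classifies points by majority vote among the k nearest labeled neighbors, measures accuracy on held-out labeled samples, and picks the k that scores best. The function names and candidate values of k are hypothetical and do not describe the Institute's own models.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of held-out samples whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == label
        for x, label in zip(test_X, test_y)
    )
    return correct / len(test_y)


def pick_best_k(train_X, train_y, val_X, val_y, candidates=(1, 3, 5)):
    """Choose the candidate k with the highest validation accuracy."""
    return max(
        candidates,
        key=lambda k: accuracy(train_X, train_y, val_X, val_y, k),
    )
```

Comparing predictions with known labels on data the model has not seen is the basic loop behind the evaluation and tuning described in this section; larger models replace the vote with learned parameters but keep the same feedback step.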
Moreover, the NF5488A5 also meets user needs for energy efficiency and heat dissipation: its 4U form factor suits a wide range of data center deployment environments; an optimized power supply design improves power supply stability while reducing energy use and TCO; and its advanced heat dissipation system ensures stable, reliable operation at an operating temperature of 35 ℃.
Results and Impact
The NF5488A5’s capabilities in compute performance, deep learning, and inference greatly exceeded the customer’s expectations. The system gave the Institute time savings of over 30% in data processing and image analysis, aiding its research in protein molecular structure analysis, gene mapping analysis, and other areas. Furthermore, its high compute performance and deep learning capability accelerate the training of KNN, XGBoost, and other models, allowing researchers to develop a variety of calculation programs and run calculation and analysis simultaneously, improving the accuracy and reliability of their evaluation models.