Case Study: Upstage

High Performance Computing Server Accelerates AI Solution Development

Industry: Artificial intelligence, enterprise

Customer: Upstage, a Korean AI solutions provider

Products: NF5488A5

Introduction

Upstage is a leading AI solutions provider in Korea that develops state-of-the-art AI systems for enterprises and government agencies. The company also offers AI education and training, along with corporate AI consulting services.

Upstage is currently developing AI Pack, an AI-based business-to-business (B2B) no-code or low-code software solution. AI Pack is a total AI solution that provides the system essentials a company needs to apply AI technology, allowing enterprises to quickly implement practical AI solutions that improve business efficiency. OCR Pack, a core component of AI Pack, makes it easy to adopt a high-performance optical character recognition (OCR) engine that extracts the information customers need from text in documents and images. With OCR Pack, customers can use high-performance AI services at a cost roughly 80% or more below that of building an OCR system themselves. To support the continued research and development of OCR Pack, Upstage required a high-performance, reliable AI server platform.

Challenges

1. Program production efficiency

With the rapid advancement of standardization and automation in the AI field, no-code and low-code software has become practical. As an AI software solution provider, Upstage has invested heavily in researching an AI-based no-code or low-code file recognition system as part of developing AI Pack. Developing such software requires massive computing capacity, so hardware performance plays a significant role in development.

2. Programs can be affected by security vulnerabilities

Software built with no-code or low-code tools relies on pre-built code, which makes it more exposed to security and reliability vulnerabilities. Detecting and eliminating these vulnerabilities during development requires a large amount of testing. Studying optimizations that address the vulnerabilities found also means following up continuously and running a wide range of tests. Both activities depend on a high-performance and very stable hardware platform.

Solution

To meet these demands, Upstage selected the Aivres NF5488A5 AI server, equipped with AMD EPYC™ 7002 Series ("Rome") CPUs with PCIe 4.0 support and the latest NVIDIA A100 NVLink GPUs. A single machine can deliver 5 petaflops of AI performance. A 200 Gb/s InfiniBand interconnect maximizes computing, data access, and communication efficiency between servers, significantly improving training efficiency for AI models in the GPU cluster.

1. Build high-performance computing clusters to improve the overall efficiency of application operation.

To meet the high-performance computing requirements, the deep learning platform was built on the Aivres NF5488A5 AI server, equipped with AMD EPYC™ 7002 Series CPUs with PCIe 4.0 support and the latest NVIDIA A100 NVLink GPUs for maximum single-machine computing performance. The higher the AI computing performance, the faster OCR Pack can be developed. A single machine can deliver 5 petaflops of AI performance, a key reason Upstage selected the Aivres solution. To make this power available across the cluster, a 200 Gb/s InfiniBand interconnect within and across multiple GPU clusters maximizes computing, data access, and communication efficiency between servers, significantly improving the training efficiency of AI models.
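The single-machine figure of 5 petaflops can be roughly reproduced with a short calculation. This is only a sketch under two assumptions that the case study does not state: that each server holds 8 A100 GPUs, and that the figure refers to NVIDIA's published peak of 624 TFLOPS per A100 for FP16 Tensor Core math with structured sparsity. The function name is illustrative.

```python
# Assumptions (not stated in the case study): 8 GPUs per server, and a
# peak of 624 TFLOPS per A100 (FP16 Tensor Core, structured sparsity).
GPUS_PER_SERVER = 8
PEAK_TFLOPS_PER_GPU = 624.0


def server_peak_petaflops(gpus=GPUS_PER_SERVER,
                          tflops_per_gpu=PEAK_TFLOPS_PER_GPU):
    """Return the theoretical peak of one server in petaflops."""
    return gpus * tflops_per_gpu / 1000.0


if __name__ == "__main__":
    # Theoretical peak only; sustained training throughput is lower.
    print(f"{server_peak_petaflops():.3f} PFLOPS")
```

Under these assumptions the result rounds to the 5 petaflops quoted above. It is a theoretical peak, not a measured training throughput.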

2. Provide a unified hardware management platform to reduce maintenance costs.

Aivres has built an automated operation and maintenance solution center around the Aivres physical infrastructure management platform (ISPIM), which provides unified deployment, monitoring, operation and maintenance, and alarm management for devices of different brands in customer data centers. ISPIM’s batch configuration and out-of-band operating system deployment functions greatly improve the efficiency of bringing devices into service. Its 3D equipment room function faithfully reproduces the space and device layout of the data center and integrates power consumption and alarm information. This makes the operating status and parameters of the data center visible at a glance, improving the efficiency of fault prediction and root cause analysis as well as overall operation and maintenance.

Results and Impact

Aivres’ AI architecture solution helped Upstage meet the computing power requirements for developing OCR Pack while reducing overall costs. With a cluster of NF5488A5 servers, the Aivres solution delivers up to 50 petaflops, maximizing computing power while remaining efficient and stable. OCR Pack models now run faster than before, which improves development efficiency for R&D personnel, and operation and maintenance efficiency has improved as well. Upstage’s overall total cost of ownership (TCO) has also been significantly reduced. Together, these gains allow Upstage to work with industry partners to expand its AI ecosystem and support more companies planning AI transformation.
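The two performance figures in this case study, 5 petaflops per server and up to 50 petaflops per cluster, imply a cluster size that the text does not state. The sketch below derives it, assuming that peak performance simply adds up across servers. The function name is illustrative, and real training throughput usually scales less than linearly because of communication overhead.

```python
import math

PER_SERVER_PFLOPS = 5.0   # single-machine figure from the case study
CLUSTER_PFLOPS = 50.0     # "up to" cluster figure from the case study


def servers_needed(cluster_pflops, per_server_pflops):
    """Smallest whole number of servers whose summed peak reaches the target.

    Assumes peak performance adds linearly across servers.
    """
    if per_server_pflops <= 0:
        raise ValueError("per_server_pflops must be positive")
    return math.ceil(cluster_pflops / per_server_pflops)


if __name__ == "__main__":
    print(servers_needed(CLUSTER_PFLOPS, PER_SERVER_PFLOPS))  # prints 10
```

Under this assumption, the quoted cluster figure corresponds to about ten NF5488A5 servers.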