Deluxe Corporation boosts data processing efficiency and decision-making capabilities with AWS-based MLOps

Streamlined workflows and automated Machine Learning processes to drive accuracy, scalability, and continuous improvement

01

Business needs

Deluxe Corporation, a leading financial services technology company, wanted to modernize its legacy Machine Learning (ML) workflows. The company sought a cloud-based approach to migrate its legacy R-based solutions to Python, unlocking new levels of efficiency and scalability. The objective was to establish an end-to-end Machine Learning Operations (MLOps) solution encompassing data transformation, feature selection, model training, deployment, inference, and monitoring.

To address these business needs while achieving automation and customization objectives, Deluxe was looking for a partner who would help them:

  • Ensure a seamless transition from the legacy R code to the SageMaker environment while maintaining functional equivalence and performance benchmarks
  • Automate data preprocessing and transformation pipelines, minimizing manual intervention
  • Build a customizable model training and deployment pipeline
  • Deploy the MLOps pipeline onto the AWS cloud


02

Solution

After thoroughly analyzing Deluxe’s business needs, the Impetus team developed a scalable and cost-effective solution leveraging AWS SageMaker to manage ML life cycle phases, focusing on feature engineering, model development, deployment, and monitoring.

Key steps in the solution, tailored to optimize Deluxe's ML workflows:

  • Migrated legacy R code to Python for data transformation.
  • Utilized Amazon SageMaker processing jobs for data transformation, feature selection, and model evaluation.
  • Built a custom Docker container with the training script, environment, and dependencies, and uploaded it to Amazon Elastic Container Registry (ECR).
  • Configured and ran training jobs on Amazon SageMaker, specifying the ECR container.
  • Employed Amazon SageMaker Pipelines for workflow orchestration.
  • Employed AWS Lambda functions to seamlessly trigger model training pipelines and inference jobs, enhancing automation and efficiency.
  • Utilized Amazon Simple Notification Service (SNS) to monitor the model training pipeline, providing real-time notifications to users regarding pipeline execution status.
  • Deployed the pipeline across three distinct environments: development, pre-production, and production, facilitating rigorous testing and seamless transition to live production environments.
  • Engineered both synchronous and asynchronous FastAPI services to cater to real-time and batch processing needs, enhancing scalability and responsiveness.
  • Developed a robust model monitoring framework, enabling dynamic or batch scoring using Apache Airflow.
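As an illustrative sketch of the Lambda trigger step above (the pipeline name, parameter names, and event fields are hypothetical, not Deluxe's actual configuration), a handler can start a SageMaker Pipelines execution through boto3's `start_pipeline_execution` API. The client is injected here so the logic can be exercised without AWS access:

```python
import json


def lambda_handler(event, context, sm_client=None):
    """Trigger a SageMaker pipeline run.

    `sm_client` is injectable for local testing; inside Lambda it
    defaults to the real boto3 SageMaker client.
    """
    if sm_client is None:
        import boto3  # imported lazily so the module loads without AWS deps
        sm_client = boto3.client("sagemaker")

    # Pipeline name and parameters below are illustrative placeholders.
    response = sm_client.start_pipeline_execution(
        PipelineName=event.get("pipeline_name", "deluxe-ml-pipeline"),
        PipelineParameters=[
            {"Name": "InputDataS3Uri", "Value": event["input_s3_uri"]},
        ],
    )
    return {
        "statusCode": 200,
        "body": json.dumps({"execution_arn": response["PipelineExecutionArn"]}),
    }


class _FakeSageMakerClient:
    """Minimal stub standing in for the boto3 SageMaker client."""

    def start_pipeline_execution(self, **kwargs):
        self.last_call = kwargs
        return {"PipelineExecutionArn": "arn:aws:sagemaker:::pipeline/demo/execution/abc"}


fake = _FakeSageMakerClient()
result = lambda_handler({"input_s3_uri": "s3://example-bucket/input.csv"}, None, sm_client=fake)
```

In Lambda itself, the handler receives only `event` and `context`, so the default boto3 client is used; the stub exists purely to show the call shape.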

The architecture diagram of the ML model pipeline is given below:

[Architecture diagram: ML model pipeline]

Solution highlights

  • Automated data preprocessing, data transformation, and feature selection using Amazon SageMaker Pipelines
  • Developed custom training scripts for model building, performance evaluation, and deployment
  • Leveraged Amazon Elastic Container Registry (ECR) for building and deploying Docker containers with the requisite Python packages
  • Automated Amazon SageMaker's hyperparameter tuning jobs to train multiple models in parallel
  • Deployed the best-performing model using customized metrics and stored the required artifacts in Amazon S3 and metadata in Amazon DynamoDB
  • Utilized the metadata stored in Amazon DynamoDB for model deployment, scoring, and lineage tracking
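The promotion logic implied by the highlights above — train multiple candidates in parallel, then deploy the best performer according to customized metrics kept as metadata — can be sketched service-agnostically. The record fields below are illustrative stand-ins for the items a tuning job might write to DynamoDB, not Deluxe's actual schema:

```python
def select_best_model(candidates, primary="auc", tie_breaker="latency_ms"):
    """Pick the candidate with the highest primary metric;
    break ties by preferring lower latency (a 'customized metric' policy)."""
    return max(
        candidates,
        key=lambda c: (c["metrics"][primary], -c["metrics"][tie_breaker]),
    )


# Illustrative metadata records, shaped like DynamoDB items.
candidates = [
    {"model_id": "m-001", "artifact_s3": "s3://models/m-001/model.tar.gz",
     "metrics": {"auc": 0.91, "latency_ms": 42}},
    {"model_id": "m-002", "artifact_s3": "s3://models/m-002/model.tar.gz",
     "metrics": {"auc": 0.94, "latency_ms": 55}},
    {"model_id": "m-003", "artifact_s3": "s3://models/m-003/model.tar.gz",
     "metrics": {"auc": 0.94, "latency_ms": 31}},
]

best = select_best_model(candidates)  # m-003: tied on AUC, lower latency
```

Keeping the artifact location alongside the metrics in one record is what makes the later deployment, scoring, and lineage-tracking lookups a single metadata read.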

Technologies used

AWS Lambda, Amazon SageMaker Pipelines, Amazon SNS, Amazon ECR, AWS CodeBuild, Amazon S3, Amazon DynamoDB, Apache Airflow, AWS Glue

03

Impact

The implementation of the AutoML pipeline resulted in significant impacts across various dimensions of Deluxe’s operations, including:

  • Accelerated ML projects: The streamlined ML workflow accelerated development time and effortlessly adapted to changing requirements, driving faster development of AI applications
  • Improved accuracy and efficiency: Automation minimized manual effort and errors, resulting in improved accuracy and efficiency in data processing and decision-making
  • Ensured adaptability and scalability: Deployment of models for real-time, asynchronous, and batch inference ensured adaptability, scalability, and performance optimization for AI systems
  • Continuous improvement: The model monitoring framework continuously evaluates model performance, enabling timely intervention in cases of drift and ensuring reliability, accountability, and continuous improvement in ML applications
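Drift detection of the kind described above is often implemented with the Population Stability Index (PSI), which compares a baseline (training-time) score distribution against live scores. The following is a minimal, dependency-free sketch; the 0.2 alert threshold is a common rule of thumb, not Deluxe's actual policy:

```python
import math
from collections import Counter


def psi(expected, actual, bins=10):
    """Population Stability Index between baseline and live score samples.
    Rule of thumb: PSI > 0.2 signals notable drift (heuristic)."""
    lo = min(min(expected), min(actual))
    width = (max(max(expected), max(actual)) - lo) / bins or 1.0

    def fractions(values):
        counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
        n = len(values)
        # Floor empty bins at a tiny fraction to avoid log(0).
        return [max(counts.get(b, 0) / n, 1e-6) for b in range(bins)]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]        # training-time score distribution
live_ok  = [i / 100 for i in range(100)]        # same shape: no drift
drifted  = [0.5 + i / 200 for i in range(100)]  # scores shifted upward: drift
```

A monitoring job (e.g., an Airflow task, as in the solution above) would compute this on each scoring batch and raise an alert when the index crosses the chosen threshold.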
