Deluxe Corporation boosts data processing efficiency and decision-making capabilities with AWS-based MLOps

Streamlined workflows and automated Machine Learning processes to drive accuracy, scalability, and continuous improvement

01

Business needs

Deluxe Corporation, a leading financial services technology company, wanted to modernize its legacy Machine Learning (ML) workflows. The company sought a cloud-based approach to migrate its legacy R-based solutions to Python, unlocking new levels of efficiency and scalability. The objective was to establish an end-to-end Machine Learning Operations (MLOps) solution encompassing data transformation, feature selection, model training, deployment, inference, and monitoring.

To address these business needs while achieving automation and customization objectives, Deluxe was looking for a partner who would help them:

  • Ensure a seamless transition from the legacy R code to the SageMaker environment while maintaining functional equivalence and performance benchmarks
  • Automate data preprocessing and transformation pipelines, minimizing manual intervention
  • Build a customizable model training and deployment pipeline
  • Deploy the MLOps pipeline onto the AWS cloud


02

Solution

After thoroughly analyzing Deluxe’s business needs, the Impetus team developed a scalable and cost-effective solution leveraging AWS SageMaker to manage ML life cycle phases, focusing on feature engineering, model development, deployment, and monitoring.

Key steps in the solution, tailored to optimize Deluxe's ML workflows:

  • Migrated legacy R code to Python for data transformation.
  • Utilized Amazon SageMaker processing jobs for data transformation, feature selection, and model evaluation.
  • Built a custom Docker container with the training script, environment, and dependencies, and uploaded it to Amazon Elastic Container Registry (ECR).
  • Configured and ran training jobs on Amazon SageMaker, specifying the ECR container.
  • Employed Amazon SageMaker Pipelines for workflow orchestration.
  • Employed AWS Lambda functions to seamlessly trigger model training pipelines and inference jobs, enhancing automation and efficiency.
  • Utilized Amazon Simple Notification Service (SNS) to monitor the model training pipeline, providing real-time notifications to users regarding pipeline execution status.
  • Deployed the pipeline across three distinct environments: development, pre-production, and production, facilitating rigorous testing and seamless transition to live production environments.
  • Engineered both synchronous and asynchronous FastAPI services to cater to real-time and batch processing needs, enhancing scalability and responsiveness.
  • Developed a robust model monitoring framework, enabling dynamic or batch scoring using Apache Airflow.
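As an illustrative sketch of the Lambda trigger step above (the pipeline name, parameter names, and event fields are hypothetical, not Deluxe's actual configuration), a handler can start a SageMaker Pipelines execution through boto3's `start_pipeline_execution` API. The client is injected here so the logic can be exercised without AWS access:

```python
import json


def lambda_handler(event, context, sm_client=None):
    """Trigger a SageMaker pipeline run.

    `sm_client` is injectable for local testing; inside Lambda it
    defaults to the real boto3 SageMaker client.
    """
    if sm_client is None:
        import boto3  # imported lazily so the module loads without AWS deps
        sm_client = boto3.client("sagemaker")

    # Pipeline name and parameters below are illustrative placeholders.
    response = sm_client.start_pipeline_execution(
        PipelineName=event.get("pipeline_name", "deluxe-ml-pipeline"),
        PipelineParameters=[
            {"Name": "InputDataS3Uri", "Value": event["input_s3_uri"]},
        ],
    )
    return {
        "statusCode": 200,
        "body": json.dumps({"execution_arn": response["PipelineExecutionArn"]}),
    }


class _FakeSageMakerClient:
    """Minimal stub standing in for the boto3 SageMaker client."""

    def start_pipeline_execution(self, **kwargs):
        self.last_call = kwargs
        return {"PipelineExecutionArn": "arn:aws:sagemaker:::pipeline/demo/execution/abc"}


fake = _FakeSageMakerClient()
result = lambda_handler({"input_s3_uri": "s3://example-bucket/input.csv"}, None, sm_client=fake)
```

In Lambda itself, the handler receives only `event` and `context`, so the default boto3 client is used; the stub exists purely to show the call shape.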

The architecture diagram of the ML model pipeline is given below:

[Architecture diagram: ML model pipeline]

Solution highlights

  • Automated data preprocessing, data transformation, and feature selection using Amazon SageMaker Pipelines
  • Developed custom training scripts for model building, performance evaluation, and deployment
  • Leveraged Amazon Elastic Container Registry (ECR) for building and deploying Docker containers with the requisite Python packages
  • Automated Amazon SageMaker's hyperparameter tuning jobs to train multiple models in parallel
  • Deployed the best-performing model using customized metrics and stored the required artifacts in Amazon S3 and metadata in Amazon DynamoDB
  • Utilized the metadata stored in Amazon DynamoDB for model deployment, scoring, and lineage tracking
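The promotion logic implied by the highlights above — train multiple candidates in parallel, then deploy the best performer according to customized metrics kept as metadata — can be sketched service-agnostically. The record fields below are illustrative stand-ins for the items a tuning job might write to DynamoDB, not Deluxe's actual schema:

```python
def select_best_model(candidates, primary="auc", tie_breaker="latency_ms"):
    """Pick the candidate with the highest primary metric;
    break ties by preferring lower latency (a 'customized metric' policy)."""
    return max(
        candidates,
        key=lambda c: (c["metrics"][primary], -c["metrics"][tie_breaker]),
    )


# Illustrative metadata records, shaped like DynamoDB items.
candidates = [
    {"model_id": "m-001", "artifact_s3": "s3://models/m-001/model.tar.gz",
     "metrics": {"auc": 0.91, "latency_ms": 42}},
    {"model_id": "m-002", "artifact_s3": "s3://models/m-002/model.tar.gz",
     "metrics": {"auc": 0.94, "latency_ms": 55}},
    {"model_id": "m-003", "artifact_s3": "s3://models/m-003/model.tar.gz",
     "metrics": {"auc": 0.94, "latency_ms": 31}},
]

best = select_best_model(candidates)  # m-003: tied on AUC, lower latency
```

Keeping the artifact location alongside the metrics in one record is what makes the later deployment, scoring, and lineage-tracking lookups a single metadata read.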

Technologies used

AWS Lambda, Amazon SageMaker Pipelines, Amazon SNS, Amazon ECR, AWS CodeBuild, Amazon S3, Amazon DynamoDB, Apache Airflow, AWS Glue

03

Impact

The implementation of the AutoML pipeline resulted in significant impacts across various dimensions of Deluxe’s operations, including:

  • Accelerated ML projects: The streamlined ML workflow accelerated development time and effortlessly adapted to changing requirements, driving faster development of AI applications
  • Improved accuracy and efficiency: Automation minimized manual effort and errors, resulting in improved accuracy and efficiency in data processing and decision-making
  • Ensured adaptability and scalability: Deployment of models for real-time, asynchronous, and batch inference ensured adaptability, scalability, and performance optimization for AI systems
  • Continuous improvement: The model monitoring framework continuously evaluates model performance, enabling timely intervention in cases of drift and ensuring reliability, accountability, and continuous improvement in ML applications
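Drift detection of the kind described above is often implemented with the Population Stability Index (PSI), which compares a baseline (training-time) score distribution against live scores. The following is a minimal, dependency-free sketch; the 0.2 alert threshold is a common rule of thumb, not Deluxe's actual policy:

```python
import math
from collections import Counter


def psi(expected, actual, bins=10):
    """Population Stability Index between baseline and live score samples.
    Rule of thumb: PSI > 0.2 signals notable drift (heuristic)."""
    lo = min(min(expected), min(actual))
    width = (max(max(expected), max(actual)) - lo) / bins or 1.0

    def fractions(values):
        counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
        n = len(values)
        # Floor empty bins at a tiny fraction to avoid log(0).
        return [max(counts.get(b, 0) / n, 1e-6) for b in range(bins)]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]        # training-time score distribution
live_ok  = [i / 100 for i in range(100)]        # same shape: no drift
drifted  = [0.5 + i / 200 for i in range(100)]  # scores shifted upward: drift
```

A monitoring job (e.g., an Airflow task, as in the solution above) would compute this on each scoring batch and raise an alert when the index crosses the chosen threshold.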
