
Welcome to the Training Project space. 

Project Name:

  • Proposed name for the project: Training

Project description:

The Training project defines and implements the processing pipelines that continuously and iteratively process new raw data, create the datasets required for training, and create new models or update existing ones.


This project will:

  1. Develop the high-level requirements of the data pipeline (DP) and model training pipeline (MTP).
  2. Develop the requirements of each stage of those pipelines. 
  3. Develop information models associated with those pipelines. 
  4. Develop APIs that support data augmentation and data or model versioning. 
  5. Work within the Architecture subcommittee to translate these requirements into architecture design documents and the associated E6A - Training APIs.
  6. Develop the specification and test specification of the produced APIs. 
  7. Develop and test the APIs and SDKs associated with those pipelines.
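
As an illustration of what item 4 (data or model versioning) could look like, the following is a minimal sketch, not a proposed Acumos API. It assumes versions are derived from a content hash of a JSON-serialisable payload; the names `version_id` and `VersionRegistry` are hypothetical and do not come from this page.

```python
import hashlib
import json
from typing import Any, Dict, List


def version_id(payload: Any) -> str:
    """Derive a deterministic version ID from JSON-serialisable content.

    Assumption: a truncated SHA-256 of the canonical JSON form is an
    acceptable identifier for this illustration.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class VersionRegistry:
    """Hypothetical in-memory registry of versions per dataset or model name."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def register(self, name: str, payload: Any) -> str:
        """Record a new version unless it matches the latest one."""
        vid = version_id(payload)
        history = self._versions.setdefault(name, [])
        if not history or history[-1] != vid:
            history.append(vid)
        return vid

    def history(self, name: str) -> List[str]:
        """Return the ordered list of version IDs recorded for a name."""
        return list(self._versions.get(name, []))


registry = VersionRegistry()
v1 = registry.register("dataset", [[1, 2], [3, 4]])
v2 = registry.register("dataset", [[1, 2], [3, 4], [5, 6]])
v2_again = registry.register("dataset", [[1, 2], [3, 4], [5, 6]])
```

Because the ID depends only on content, registering unchanged data twice in a row yields the same version and does not grow the history. A real implementation would need persistent storage and a hashing scheme suited to large datasets.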

Within the scope of the Training project is the specification and development of the following pipelines:

  1. Data Pipeline (DP)
    1. The data pipeline covers the creation of the training, validation and testing datasets.
    2. The input to the DP is the raw data sources (held in some storage technology, e.g. S3 buckets or streaming data sources).
    3. The output is the further data sources that the MTP requires to train on these datasets. We need to be able to accommodate DQN architectures and the way these architectures learn.
  2. Model Training Pipeline (MTP)
    1. The input is the datasets created by the DP.
    2. The output is the trained model (predictor) together with metadata on its performance KPIs. The produced output is then ready for the subsequent on-boarding phase.
  3. Model Evaluation & Validation Pipeline (MEVP)
  4. Serving Pipeline (SP)
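
The hand-off between the DP and the MTP described above can be sketched as two composable functions. This is a minimal illustration under stated assumptions, not the project's design: the record shape, the 80/10/10 split, the one-parameter linear model, and all names (`Datasets`, `TrainedModel`, `data_pipeline`, `model_training_pipeline`) are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Record = Tuple[float, float]  # (feature, label); hypothetical record shape


@dataclass
class Datasets:
    """Output of the DP: the three splits consumed by the MTP."""
    train: List[Record]
    validation: List[Record]
    test: List[Record]


@dataclass
class TrainedModel:
    """Output of the MTP: a predictor plus performance KPI metadata."""
    predict: Callable[[float], float]
    kpis: Dict[str, float]


def data_pipeline(raw: Sequence[Record], seed: int = 0) -> Datasets:
    """Shuffle raw records and split them 80/10/10 (ratios are an assumption)."""
    records = list(raw)
    random.Random(seed).shuffle(records)
    n_train = int(len(records) * 0.8)
    n_val = int(len(records) * 0.1)
    return Datasets(
        train=records[:n_train],
        validation=records[n_train:n_train + n_val],
        test=records[n_train + n_val:],
    )


def mean_squared_error(predict: Callable[[float], float],
                       records: Sequence[Record]) -> float:
    """Average squared difference between predictions and labels."""
    return sum((predict(x) - y) ** 2 for x, y in records) / len(records)


def model_training_pipeline(data: Datasets) -> TrainedModel:
    """Fit y = w * x by least squares through the origin, then score it."""
    w = sum(x * y for x, y in data.train) / sum(x * x for x, _ in data.train)

    def predict(x: float) -> float:
        return w * x

    kpis = {
        "validation_mse": mean_squared_error(predict, data.validation),
        "test_mse": mean_squared_error(predict, data.test),
    }
    return TrainedModel(predict=predict, kpis=kpis)


# Synthetic raw data standing in for a real source such as an S3 bucket.
raw = [(float(x), 2.0 * x) for x in range(1, 101)]
datasets = data_pipeline(raw)
model = model_training_pipeline(datasets)
```

Keeping each pipeline as a function with a typed input and output is one way to make the stages composable, so that an evaluation or serving stage could later consume `TrainedModel` in the same manner.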

Architecture alignment and dependencies:

  • How does this integrate with and enhance the Acumos platform architecture?
    • What other Acumos platform components would this project depend on or provide services to?
    • As applicable, include architecture/dataflow diagrams with reference points (for internal and external interfaces/APIs)
  • How does this align with external standards/specifications?
    • APIs/Interfaces
    • Information/data models
  • How does this relate to other open source projects?
    • Functions in scope, as used/enhanced/redeveloped based upon external open source
    • Collaborative/integrative cross-community opportunities
    • etc.

Proposed Release schedule:

  • The Training project will produce requirements and API specifications for the Athena release.
  • We are not planning to release Training in the Athena release, but we may develop a PoC that includes a seed code implementation of the designed E6A Training APIs and/or client. For example, we already have seed code for the data broker and other code associated with the deployment APIs.

Release Components Name:

Components Name | Components Repository name | Maven Group ID | Components Description



1 Comment

  1. Here are a couple of examples of how we could take the current design of a dedicated training process into the concept of a composable set of pipelines.