architecture

The primary components of the solution are covered in the sections below: data ingestion, the input and output schemas, and the data flow.

Data ingestion

Getting the data from AWS into our own S3 buckets is just a matter of configuration, so I won't cover that part of the project.

Inputs and outputs

The SES data format differs slightly for each event class, so we read everything into one "uber" SES Input Schema that carries the class-specific differences into our transformer as raw strings. From there, we transform the data into a simple, flat Required Schema output.
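
As a rough sketch of what those two schemas might look like in Spark (the field names here are illustrative, not the project's actual definitions):

```scala
import org.apache.spark.sql.types._

// "Uber" SES Input Schema: one schema wide enough for every event class.
// Class-specific payloads stay as raw JSON strings for the transformer.
val sesInputSchema = StructType(Seq(
  StructField("eventType", StringType, nullable = false),
  StructField("mail", StringType, nullable = false),     // common envelope, raw JSON
  StructField("bounce", StringType, nullable = true),    // present only on bounce events
  StructField("complaint", StringType, nullable = true), // present only on complaint events
  StructField("delivery", StringType, nullable = true)   // present only on delivery events
))

// Flat Required Schema produced by the transformer.
val requiredSchema = StructType(Seq(
  StructField("messageId", StringType, nullable = false),
  StructField("eventType", StringType, nullable = false),
  StructField("recipient", StringType, nullable = false),
  StructField("timestamp", TimestampType, nullable = false),
  StructField("detail", StringType, nullable = true)     // flattened event-specific detail
))
```

Keeping the event-class payloads as strings means one schema can absorb every SES event class; the cost is that the transformer has to parse those strings itself when it flattens them.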

Data flow

[Data flow diagram: SES Input Schema → transformer → Required Schema]
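
A minimal end-to-end sketch of that flow with Spark Structured Streaming, reusing the schema definitions above; the bucket names and JSON paths are hypothetical stand-ins, not the project's real configuration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("ses-transform").getOrCreate()

// Pick up new SES event files as they land in the ingestion bucket.
val events = spark.readStream
  .schema(sesInputSchema) // the "uber" SES Input Schema sketched earlier
  .json("s3a://example-ses-events/")

// Flatten into the Required Schema: pull common fields out of the raw
// JSON envelope and keep whichever class-specific payload is present.
val flattened = events.select(
  get_json_object(col("mail"), "$.messageId").as("messageId"),
  col("eventType"),
  get_json_object(col("mail"), "$.destination[0]").as("recipient"),
  get_json_object(col("mail"), "$.timestamp").cast("timestamp").as("timestamp"),
  coalesce(col("bounce"), col("complaint"), col("delivery")).as("detail")
)

// Write the flat output where downstream queries can read it.
flattened.writeStream
  .format("parquet")
  .option("path", "s3a://example-required-output/")
  .option("checkpointLocation", "s3a://example-required-output/_checkpoints/")
  .start()
```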