transform-design

ses-transformer proof of concept (POC)

We'll build the solution on a laptop for development, with the goal of porting it to Parquet files on Amazon S3 and using Amazon Athena as our database.

For the development environment today, we'll need:

POC code

The code is available here: GitHub

The files for this project are organized as follows:

  • An input directory for some "input" data that we can test with

  • A sql directory for SQL statements we'll use to query the final data

  • A src directory for the PySpark code that will transform the raw SES events into our records

The program will create:

  • An output_local directory for the non-streaming version

  • An output_streaming directory for the streaming examples
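
As a rough illustration of the layout above, here is a minimal sketch that checks for the three source directories and picks the output directory for a run. It uses only the Python standard library. The function names and the "batch" and "streaming" mode labels are hypothetical and are not taken from the project code; only the directory names come from the list above.

```python
from pathlib import Path
from typing import List

# Directory names taken from the layout described above.
SOURCE_DIRS = ("input", "sql", "src")

# Hypothetical mode labels mapped to the directories the program creates.
OUTPUT_DIRS = {"batch": "output_local", "streaming": "output_streaming"}


def output_dir(project_root: Path, mode: str) -> Path:
    """Return the output directory for a run mode ("batch" or "streaming")."""
    if mode not in OUTPUT_DIRS:
        raise ValueError(f"unknown mode: {mode!r}")
    return project_root / OUTPUT_DIRS[mode]


def missing_source_dirs(project_root: Path) -> List[str]:
    """List the expected source directories that are not present."""
    return [name for name in SOURCE_DIRS if not (project_root / name).is_dir()]
```

Calling missing_source_dirs(Path(".")) from the project root should return an empty list when input, sql, and src are all in place.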
