ses-transformer proof of concept (POC)
We'll build the solution on a laptop for development, with the goal of porting it to Parquet files on Amazon S3 and using Amazon Athena as our database.
For the development environment today, we'll need:
POC code
The code is available here: GitHub
The files for this project are organized as follows:
An input directory for some "input" data that we can test with
A sql directory for SQL statements we'll use to query the final data
A src directory for the PySpark code that will transform the raw SES events into our records
The program will create:
An output_local directory when we run the non-streaming version
An output_streaming directory for the streaming examples
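As a rough illustration of the layout described above, the following sketch checks that a project root contains the three source directories and reports which of the output directories already exist. The helper name `check_layout` and the idea of checking the layout programmatically are assumptions made for illustration; they are not part of the POC code.

```python
from pathlib import Path

# Directory names taken from the layout described above.
SOURCE_DIRS = ("input", "sql", "src")
OUTPUT_DIRS = ("output_local", "output_streaming")


def check_layout(root):
    """Hypothetical helper: list missing source dirs and existing output dirs."""
    root = Path(root)
    # Source directories are expected to be present in the project.
    missing = [name for name in SOURCE_DIRS if not (root / name).is_dir()]
    # Output directories only exist after the program has been run.
    outputs = [name for name in OUTPUT_DIRS if (root / name).is_dir()]
    return {"missing": missing, "outputs": outputs}


if __name__ == "__main__":
    report = check_layout(".")
    print("missing source directories:", report["missing"])
    print("output directories present:", report["outputs"])
```

Run from the project root, this should report no missing source directories, and the output list should grow as the non-streaming and streaming versions are run.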