Spectrify

A simple yet powerful tool to move your data from Redshift to Redshift Spectrum.

Features

One-liners to:

  • Export a Redshift table to S3 (CSV)
  • Convert exported CSVs to Parquet files in parallel
  • Create the Spectrum table on your Redshift cluster
  • Perform all 3 steps in sequence, essentially “copying” a Redshift table to Spectrum in one command

S3 credentials are read from your boto3 configuration. See http://boto3.readthedocs.io/en/latest/guide/configuration.html

Redshift credentials are supplied via environment variables, command-line parameters, or an interactive prompt.

Install

$ pip install spectrify

Command-line Usage

Export Redshift table my_table to a folder of CSV files on S3:

Convert exported CSVs to Parquet:

Create Spectrum table from S3 folder:

Transform Redshift table by performing all 3 steps in sequence:

Python Usage

Currently, you’ll have to supply your own SQLAlchemy engine to each of the functions below (pull requests to make this easier are welcome).
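As a minimal sketch of building such an engine, the helper below assembles a connection URL from environment variables and passes it to SQLAlchemy's create_engine. The variable names (REDSHIFT_USER, REDSHIFT_PASSWORD, REDSHIFT_HOST, REDSHIFT_PORT, REDSHIFT_DB) and the helper names are hypothetical and not part of Spectrify. The postgresql+psycopg2 driver is also an assumption, so adjust the URL scheme if you use a dedicated Redshift dialect.

```python
import os
from urllib.parse import quote


def build_redshift_url(env=None):
    """Build a SQLAlchemy URL from environment variables (names are hypothetical)."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters like '@' or '/' do not break the URL.
    user = quote(env["REDSHIFT_USER"], safe="")
    password = quote(env["REDSHIFT_PASSWORD"], safe="")
    host = env["REDSHIFT_HOST"]
    port = env.get("REDSHIFT_PORT", "5439")  # 5439 is Redshift's default port
    database = env["REDSHIFT_DB"]
    # Assumption: the psycopg2 PostgreSQL driver is used to talk to Redshift.
    return f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"


def make_engine(env=None):
    """Create the engine; SQLAlchemy is imported lazily so the URL helper works without it."""
    from sqlalchemy import create_engine

    return create_engine(build_redshift_url(env))
```

The result of make_engine() can then be passed as sa_engine in the examples that follow.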

Export to S3:

from spectrify.export import export_to_csv
export_to_csv(sa_engine, table_name, s3_csv_dir)

Convert exported CSVs to Parquet:

from spectrify.convert import convert_redshift_manifest_to_parquet
from spectrify.utils.schema import get_table_schema
sa_table = get_table_schema(sa_engine, source_table_name)
convert_redshift_manifest_to_parquet(s3_csv_manifest_path, sa_table, s3_spectrum_dir)

Create Spectrum table from S3 Parquet folder:

from spectrify.create import create_external_table
from spectrify.utils.schema import get_table_schema
sa_table = get_table_schema(sa_engine, source_table_name)
create_external_table(sa_engine, dest_schema, dest_table_name, sa_table, s3_spectrum_path)

Transform Redshift table by performing all 3 steps in sequence:

from spectrify.transform import transform_table
transform_table(sa_engine, table_name, s3_base_path, dest_schema, dest_table, num_workers)

Contribute

Contributions are always welcome! Read our contributing guide here: http://spectrify.readthedocs.io/en/latest/contributing.html

License

MIT License. Copyright (c) 2017, The Narrativ Company, Inc.