Data Export

Overview

Hydra can export your data to Amazon S3. You choose which models to export and how often, and Hydra automatically writes all updated records to S3 at that interval.

AWS and S3

You’ll first need to set up your AWS account and credentials for an S3 bucket.

After signing in, navigate to “S3” in the services menu.

_images/aws-services.png

You’ll be prompted to create your first S3 bucket. You need at least one bucket to receive the exported data.

_images/s3-create-bucket.png

Now that you have your bucket created, you need to set up an AWS user with permissions to upload content to the bucket. From the services menu, choose “IAM”.

_images/iam-landing.png

From the users tab, create a new user.

_images/iam-create-user.png

After creating the user you’ll be presented with the “Access Key” and “Secret Access Key” credentials you’ll need to provide to Hydra.

_images/iam-user-created.png

With the user created you can set up permissions to upload content to your S3 bucket. Select the user, and in the lower pane, select “Attach User Policy” from the “Permissions” tab. This will open a dialog from which you should select “Policy Generator”.

_images/iam-attach-user-policy.png

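You need to give this user only the PutObject permission on your bucket. Enter arn:aws:s3:::<bucket_name>/* as the Amazon Resource Name (ARN). PutObject is an object-level action, so the ARN needs the trailing /* to match the objects inside the bucket.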

_images/iam-edit-permissions.png _images/iam-edit-permissions1.png
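The policy generator produces a JSON policy document. As a rough sketch of what a PutObject-only policy might look like, the following Python snippet builds one for one or more buckets. The helper name build_put_object_policy and the bucket name my-export-bucket are illustrative and do not come from Hydra or AWS. Each resource ends in /* because PutObject applies to the objects in a bucket, not to the bucket itself.

```python
import json


def build_put_object_policy(bucket_names):
    """Return an IAM policy document that allows s3:PutObject on the given buckets.

    Hypothetical helper for illustration; the policy generator in the AWS
    console produces an equivalent JSON document.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # PutObject acts on objects, so each resource ARN ends in "/*".
                "Resource": [f"arn:aws:s3:::{name}/*" for name in bucket_names],
            }
        ],
    }


# "my-export-bucket" is a placeholder; substitute your own bucket name.
print(json.dumps(build_put_object_policy(["my-export-bucket"]), indent=2))
```

Passing several bucket names yields a single policy whose resource list covers all of them.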

You have now finished setting up your AWS account with an S3 bucket. If you’re going to export data to several S3 buckets, you can grant the same user access to each additional bucket with a new policy. In that case, name each policy after the bucket it corresponds to. Alternatively, you can add more bucket names to the resources list of the same policy.

_images/iam-add-resources.png

Configuration

With your S3 configuration ready, navigate to the environment you want to export. Under the “Data Tools” section you can access the “Data Export” page.

_images/data-export-page.png

Enter the access credentials in the AWS access key and AWS secret key fields, as well as the S3 bucket name where the exports will be saved. Choose the frequency of exports; you can pick an hourly increment from 1 hour to 168 hours (1 week). Next, choose the models you want to export, and click Save settings.
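To make the constraints on these fields concrete, here is a minimal sketch of the checks described above. The ExportSettings class, its field names, and the validate function are hypothetical and are not part of Hydra; only the 1 to 168 hour range and the requirement for at least one model come from this page.

```python
from dataclasses import dataclass, field

MIN_FREQUENCY_HOURS = 1
MAX_FREQUENCY_HOURS = 168  # 1 week


@dataclass
class ExportSettings:
    """Hypothetical container mirroring the fields on the Data Export page."""
    aws_access_key: str
    aws_secret_key: str
    bucket_name: str
    frequency_hours: int
    models: list = field(default_factory=list)


def validate(settings):
    """Return a list of problems; an empty list means the settings look complete."""
    problems = []
    if not settings.aws_access_key or not settings.aws_secret_key:
        problems.append("AWS access key and AWS secret key are both required")
    if not settings.bucket_name:
        problems.append("S3 bucket name is required")
    if not MIN_FREQUENCY_HOURS <= settings.frequency_hours <= MAX_FREQUENCY_HOURS:
        problems.append("frequency must be between 1 and 168 hours")
    if not settings.models:
        problems.append("choose at least one model to export")
    return problems
```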

Usage

The first export occurs a few minutes after you save a configuration that includes at least one model. Subsequent exports run at the frequency you specified, measured from the time of the first export.
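As an illustration of this timing, the sketch below computes when later exports would be expected, given the time of the first export and the chosen frequency. The function name upcoming_exports is made up for this example, and the actual start times may differ by a few minutes.

```python
from datetime import datetime, timedelta


def upcoming_exports(first_export, frequency_hours, count):
    """Return the expected start times of the next `count` exports.

    Each export is offset from the first one by a whole multiple of the
    configured frequency.
    """
    step = timedelta(hours=frequency_hours)
    return [first_export + step * n for n in range(1, count + 1)]


# Example: first export at 04:00 on 2016-01-31, exporting every 6 hours.
for when in upcoming_exports(datetime(2016, 1, 31, 4, 0), 6, 3):
    print(when.isoformat())
```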

The first export of a model includes all of its records; subsequent exports include only the records updated since the previous export. An export can take several hours if the model has many existing records (for a first export) or if many records have been updated since the previous export.

Each export creates a top-level folder in the S3 bucket named after the date and hour at which the export started, e.g. “201601310400”. That folder contains one subfolder per exported model, named after the model. Each subfolder contains files in the following format:

  • Multiple objects per file
  • Each line is a JSON-formatted string representing one object
  • The file name follows this pattern: <id of the first object>-<id of the last object>.json
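The layout above can be read with a few lines of Python. This is a minimal sketch under two assumptions: that the folder name follows the pattern YYYYMMDDHHMM (inferred from the example “201601310400”; the time zone is not stated here), and that the file contents have already been downloaded as text. The function names and the sample fields are illustrative only.

```python
import json
from datetime import datetime


def parse_export_folder(folder_name):
    """Parse a top-level export folder name such as "201601310400".

    Assumes the pattern YYYYMMDDHHMM, inferred from the example above.
    """
    return datetime.strptime(folder_name, "%Y%m%d%H%M")


def parse_export_file(text):
    """Parse the contents of one export file: one JSON object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Sample content with made-up fields; real records depend on the model.
sample = '{"id": 1, "name": "first"}\n{"id": 2, "name": "second"}\n'
records = parse_export_file(sample)
print(parse_export_folder("201601310400"), len(records))
```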

Once exports have run, you will see a listing of the recent exports on the “Data Export” page. If there are any errors they will be listed in this view.

_images/data-export-list.png
