aurora
An open source enterprise data warehousing and analysis platform.
Aurora - An Enterprise Data Platform
Description: This repository is a collection of Ansible scripts and other supporting code required to build a scalable, secure, and powerful data processing platform.
- Technology stack: Ansible is used for deployment.
- Status: Under active development. Once we've reached "Alpha", further changes will be tracked in the CHANGELOG.
Dependencies
The Aurora data platform was designed to work on a network of RHEL 6.5 servers, and has only been tested in that environment. Additionally, you must have Ansible installed to deploy, and Vagrant to run locally.
Installation
To install locally, run "vagrant up" from the /deploy directory. To deploy to a remote environment, add a custom inventory file and a matching group_vars file, then run "ansible-playbook site.yml -i inventories/{{ your_environment }}", replacing {{ your_environment }} with the name of your inventory.
- Note: if there isn't a PostgreSQL instance running on your machine, you'll need to pass an environment variable to install it.
- Ex:
EXTRA_VARS='{pp_install:true}' vagrant up [server_name]
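The remote deployment step can be wrapped in a small pre-flight check. The sketch below assumes it is run from the deploy directory and that inventories live under inventories/, as in the command above. The function name aurora_deploy_cmd and the environment name "staging" are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Sketch: print the ansible-playbook command for an environment, refusing
# to continue when the inventory is missing.
# Assumption: inventories live in ./inventories/<environment>.

aurora_deploy_cmd() {
    env_name="${1:-}"
    inventory="inventories/${env_name}"

    if [ -z "$env_name" ]; then
        echo "usage: aurora_deploy_cmd <environment>" >&2
        return 2
    fi
    if [ ! -e "$inventory" ]; then
        echo "missing inventory: $inventory" >&2
        return 1
    fi

    # Print rather than execute, so the command can be reviewed first.
    echo "ansible-playbook site.yml -i $inventory"
}
```

For example, "aurora_deploy_cmd staging" prints the command to run when inventories/staging exists, and returns a non-zero status otherwise.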
Configuration
As mentioned above, you can configure the deployment using Ansible's inventory and group_vars functionality.
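As an illustration of how an inventory and its group_vars fit together, the sketch below scaffolds a minimal pair of files. Everything in it is hypothetical: the group name aurora_example, the host name, the variable, and the choice to place group_vars next to the inventory are assumptions for illustration, not this project's actual layout. Ansible matches a group_vars file to an inventory group by name and looks for the group_vars directory next to the inventory or the playbook.

```shell
#!/bin/sh
# Sketch: scaffold a hypothetical inventory file and a matching group_vars
# file. All names below (group, host, variable) are placeholders.

scaffold_environment() {
    base_dir="${1:-}"
    env_name="${2:-}"
    if [ -z "$base_dir" ] || [ -z "$env_name" ]; then
        echo "usage: scaffold_environment <dir> <environment>" >&2
        return 2
    fi

    mkdir -p "$base_dir/inventories/group_vars"

    # INI-style inventory: one group containing one placeholder host.
    cat > "$base_dir/inventories/$env_name" <<EOF
[aurora_example]
server1.example.com
EOF

    # Variables for the group; the file name must match the group name.
    cat > "$base_dir/inventories/group_vars/aurora_example.yml" <<EOF
---
example_setting: "value for $env_name"
EOF
}
```

Calling "scaffold_environment /tmp/aurora-sketch staging" writes both files under /tmp/aurora-sketch/inventories, which can then be edited by hand.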
Usage
TBD. More substantial documentation will likely follow, defining what each server is for and how it is meant to be used.
How to test the software
Running Docker on a MacBook
- brew cask install docker-toolbox
- docker-machine create --driver "virtualbox" default (first run only, if the "default" machine does not exist yet)
- docker-machine start default
- eval "$(docker-machine env default)"
- docker ps (to validate it works)
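The create and start steps above can be combined so that the machine is only created when it is missing. This is a minimal sketch, assuming docker-machine is on the PATH; the function name ensure_docker_machine and the DOCKER_MACHINE override variable are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Sketch: create the named docker-machine VM only if it does not exist yet,
# otherwise start the existing one.
# DOCKER_MACHINE can be overridden, e.g. to substitute a stub when testing.
DOCKER_MACHINE="${DOCKER_MACHINE:-docker-machine}"

ensure_docker_machine() {
    name="${1:-default}"

    # "ls -q" prints one machine name per line; match the whole line exactly.
    if "$DOCKER_MACHINE" ls -q | grep -qxF "$name"; then
        "$DOCKER_MACHINE" start "$name"
    else
        "$DOCKER_MACHINE" create --driver virtualbox "$name"
    fi
}
```

After "ensure_docker_machine default" returns, the remaining steps are unchanged: run eval "$(docker-machine env default)" in the current shell, then docker ps.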
If Docker starts running out of disk space, connect to the boot2docker VM (or a Mac terminal) and run this to remove stopped containers:
docker ps -a -q | xargs -n 1 -I {} docker rm {}
To make sure exited containers, along with their volumes, are deleted:
docker rm -v $(docker ps -a -q -f status=exited)
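When no exited containers exist, the command above passes an empty argument list to docker rm, which then reports an error. A minimal sketch of a guarded version follows; the function name remove_exited_containers and the DOCKER override variable are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch: remove exited containers and their anonymous volumes, skipping
# the "docker rm" call when there is nothing to remove.
# DOCKER can be overridden, e.g. to substitute a stub when testing.
DOCKER="${DOCKER:-docker}"

remove_exited_containers() {
    ids=$("$DOCKER" ps -a -q -f status=exited)

    if [ -z "$ids" ]; then
        echo "no exited containers to remove"
        return 0
    fi

    # $ids is intentionally unquoted so each container ID becomes its own
    # argument to "docker rm".
    "$DOCKER" rm -v $ids
}
```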
Setting up Test Environment
When developing the Travis CI file, it can be helpful to test in Travis's environment, as described here: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
- Note: you'll need to run the Travis image with --privileged
- docker run --privileged -it quay.io/travisci/travis-ruby /bin/bash
To do this, follow the steps described there up to the point of actually running your commands. Before running them, Docker must be installed in the Travis CI image, like so:
- sudo apt-get install apt-transport-https ca-certificates
- sudo apt-key adv --keyserver hkp://ha.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
- echo "deb https://apt.dockerproject.org/repo ubuntu-precise main" | sudo tee /etc/apt/sources.list.d/docker.list
- sudo apt-get update
- apt-cache policy docker-engine
- sudo apt-get install docker-engine (may need --force-yes)
- sudo ln -s /bin/true /sbin/initctl
- sudo service docker start (or start the daemon directly, as in the next step)
- docker daemon -H unix:///var/run/docker.sock &>/var/log/docker.log &
- git clone https://github.com/[githubfork]/aurora /aurora
- cd /aurora
- git checkout travis
- Run the commands from the .travis.yml file
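Because the daemon is started in the background in the steps above, commands that follow may run before it is ready. Below is a minimal sketch of a readiness check, assuming that "docker info" succeeds once the daemon accepts connections. The function name wait_for_docker, the DOCKER override variable, and the retry defaults are illustrative choices, not part of this project.

```shell
#!/bin/sh
# Sketch: poll "docker info" until the daemon responds or the attempts
# run out.
# DOCKER can be overridden, e.g. to substitute a stub when testing.
DOCKER="${DOCKER:-docker}"

wait_for_docker() {
    attempts="${1:-30}"   # maximum number of polls
    delay="${2:-1}"       # seconds to sleep between polls
    i=0

    while [ "$i" -lt "$attempts" ]; do
        if "$DOCKER" info >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done

    echo "docker daemon not ready after $attempts attempts" >&2
    return 1
}
```

A call such as "wait_for_docker 30 1" could be placed between starting the daemon and cloning the repository, so later steps only run once Docker is reachable.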
TBD
Role-specific documentation
Some Ansible roles in this project have role-specific documentation:
- [postgresql-server](deploy/roles/postgresql-server/README.md)
- [python27-scl](deploy/roles/python27-scl/README.md)
- [python36-scl](deploy/roles/python36-scl/README.md)
Known issues
- Travis-CI hangs when jobs complete - resolution
- R package installation takes too long (currently skipped)
Getting help
Open an issue on GitHub if you need help, have a feature request, or have code to contribute.
Getting involved
Refer to CONTRIBUTING if you'd like to help!
Open source licensing info
- TERMS
- LICENSE
- CFPB Source Code Policy