PostgreSQL¶
PostgreSQL can be very effective as a data warehouse for smaller to medium data volumes. For enterprise-class data warehouses, PostgreSQL is not typically used as a data warehouse because it is a row-based database better suited for transactional needs.
Prerequisites¶
To connect your PostgreSQL as a destination database in ETLrobot, you need:
- PostgreSQL version 7.3 or above (8.4+ preferred).
- Your database's IP address (e.g.,
1.2.3.4
) or domain (your.server.com
). - Your database's port (usually
5432
). - Database privileges that allow you to create users and grant privileges. This is required to create a database user and role for ETLrobot.
- Note: The default_transaction_read_only parameter value must be set to
0
(off) to be able to create the database and tables in your destination. Off is the default. See PostgreSQL's documentation for more information. - It is recommended to set the database timezone to UTC.
Common supported implementations:¶
- Generic PostgreSQL
- Amazon RDS PostgreSQL
- Amazon Aurora PostgreSQL
- Google Cloud PostgreSQL
- Azure PostgreSQL
Setup instructions¶
-
Database setup¶
-
Connect to your PostgreSQL instance as an admin user using your SQL client.
-
Create a database to use for this data warehouse if you don't have one already.
Possible name:dw
(for data warehouse)CREATE DATABASE <database_name> ENCODING 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' TEMPLATE template0;
-
After connecting, execute the following query to create a user for ETLrobot. Choose a memorable username (for example,
etlrobot
). Replace <password> with a strong password of your choice.CREATE USER etlrobot WITH PASSWORD '<password>';
-
Execute the following command to grant the
etlrobot
user the following privileges. The database you specify here must be a pre-existing database.GRANT CREATE, TEMPORARY ON DATABASE <database_name> TO etlrobot;
- CREATE: Allows the user to create new schemas in the database. ETLrobot will run a
CREATE SCHEMA IF NOT EXISTS
command to create the schema you specify. - TEMPORARY: Allows the user to create temporary tables while using the database.
- CREATE: Allows the user to create new schemas in the database. ETLrobot will run a
-
If you restricted access to the system tables, you’ll also need to execite the following commands to grant the
etlrobot
user SELECT permissions.GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO etlrobot; GRANT SELECT ON pg_catalog.pg_constraint, pg_catalog.pg_class, pg_catalog.pg_namespace, pg_catalog.pg_attribute, pg_catalog.pg_proc, pg_catalog.pg_available_extensions, pg_catalog.pg_statio_all_tables, pg_catalog.pg_description TO etlrobot;
prior to loading, ETLrobot reads (always read only) from these system schemas to verify the existence and structure of the integration schemas and tables.
- information_schema: metadata information about the tables and columns.
- pg_catalog: metadata information about the database and cluster.
-
-
Choose your connection method¶
Decide whether to connect to your PostgreSQL database directly or via an SSH tunnel.
Connect directly
If your database is publicly accessible you may connect directly. You must whitelist ETLrobot's IPs by creating a rule in a security group that allows ETLrobot access to your database port and instance.
Configure your firewall and/or other access control systems to allow:
- incoming connections to your host and port (usually
5432
) from ETLrobot's IPs for your database's region - outgoing connections from all ports (
1024
to65535
) to ETLrobot's IPs
Connect via an SSH tunnel
If your database is not publicly accessible you will need to connect using an SSH tunnel.
SSH requires a publicly accessible SSH server to act as an intermediary between ETLrobot and your private database. The SSH server will forward traffic from ETLrobot through a secured encrypted tunnel to the private database.
To connect using SSH, do the following:
- Select the SSH option.
- Copy ETLrobot's public SSH key.
- Add the public key to the
authorized_keys
(usually $user/.ssh/authorized_keys) file of your SSH server. The key must be all on one line, so make sure that you don't introduce any line breaks when cutting and pasting.
Click here detailed instructions on connecting via an SSH tunnel.
- incoming connections to your host and port (usually
-
Complete the destination setup¶
Be sure to fill out all the required fields.
ETLrobot tests and validates your PostgreSQL destination connection and database setup. If successful, the sync to your database will automatically start.