Skip to content

PostgreSQL

PostgreSQL can be very effective as a data warehouse for smaller to medium data volumes. For enterprise-class data warehouses, PostgreSQL is not typically used as a data warehouse because it is a row-based database better suited for transactional needs.

Prerequisites

To connect your PostgreSQL as a destination database in ETLrobot, you need:

  • PostgreSQL version 7.3 or above (8.4+ preferred).
  • Your database's IP address (e.g., 1.2.3.4) or domain (your.server.com).
  • Your database's port (usually 5432).
  • Database privileges that allow you to create users and grant privileges. This is required to create a database user and role for ETLrobot.
  • Note: The default_transaction_read_only parameter value must be set to 0 (off) to be able to create the database and tables in your destination. Off is the default. See PostgreSQL's documentation for more information.
  • It is recommended to set the database timezone to UTC.

Common supported implementations:


Setup instructions

  1. Database setup

    1. Connect to your PostgreSQL instance as an admin user using your SQL client.

    2. Create a database to use for this data warehouse if you don't have one already.
      Possible name: dw (for data warehouse)

      CREATE DATABASE <database_name>
      ENCODING 'UTF8'
      LC_COLLATE = 'en_US.UTF-8'
      LC_CTYPE = 'en_US.UTF-8'
      TEMPLATE template0;
      
    3. After connecting, execute the following query to create a user for ETLrobot. Choose a memorable username (for example, etlrobot). Replace <password> with a strong password of your choice.

      CREATE USER etlrobot WITH PASSWORD '<password>';
      
    4. Execute the following command to grant the etlrobot user the following privileges. The database you specify here must be a pre-existing database.

      GRANT CREATE, TEMPORARY ON DATABASE <database_name> TO etlrobot;
      
      • CREATE: Allows the user to create new schemas in the database. ETLrobot will run a CREATE SCHEMA IF NOT EXISTS command to create the schema you specify.
      • TEMPORARY: Allows the user to create temporary tables while using the database.
    5. If you restricted access to the system tables, you’ll also need to execite the following commands to grant the etlrobot user SELECT permissions.

      GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO etlrobot;
      GRANT SELECT ON pg_catalog.pg_constraint, pg_catalog.pg_class, pg_catalog.pg_namespace, pg_catalog.pg_attribute, pg_catalog.pg_proc, pg_catalog.pg_available_extensions, pg_catalog.pg_statio_all_tables, pg_catalog.pg_description TO etlrobot;
      

      prior to loading, ETLrobot reads (always read only) from these system schemas to verify the existence and structure of the integration schemas and tables.

      • information_schema: metadata information about the tables and columns.
      • pg_catalog: metadata information about the database and cluster.
  2. Choose your connection method

    Decide whether to connect to your PostgreSQL database directly or via an SSH tunnel.

    Connect directly

    If your database is publicly accessible you may connect directly. You must whitelist ETLrobot's IPs by creating a rule in a security group that allows ETLrobot access to your database port and instance.

    Configure your firewall and/or other access control systems to allow:

    1. incoming connections to your host and port (usually 5432) from ETLrobot's IPs for your database's region
    2. outgoing connections from all ports (1024 to 65535) to ETLrobot's IPs
    Connect via an SSH tunnel

    If your database is not publicly accessible you will need to connect using an SSH tunnel.

    SSH requires a publicly accessible SSH server to act as an intermediary between ETLrobot and your private database. The SSH server will forward traffic from ETLrobot through a secured encrypted tunnel to the private database.

    To connect using SSH, do the following:

    1. Select the SSH option.
    2. Copy ETLrobot's public SSH key.
    3. Add the public key to the authorized_keys (usually $user/.ssh/authorized_keys) file of your SSH server. The key must be all on one line, so make sure that you don't introduce any line breaks when cutting and pasting.

    Click here detailed instructions on connecting via an SSH tunnel.

  3. Complete the destination setup

    Be sure to fill out all the required fields.

    ETLrobot tests and validates your PostgreSQL destination connection and database setup. If successful, the sync to your database will automatically start.