top of page

Data Analysis on RocketBI with PostgreSQL

  • Writer: Duc Pham
    Duc Pham
  • Jun 17, 2024
  • 4 min read

Before We Begin:

Before connecting PostgreSQL to RocketBI, you should have the necessary information at hand. This includes:

  1. Hostname

  2. Port number: default port is 5432

  3. Username & password: with the read & write authority to your PostgreSQL database.

Additionally, ensure that your PostgreSQL server is reachable from the machine where RocketBI is running. If you are using the SaaS version of RocketBI, it might be necessary to configure your server's firewall to allow access from the specific IP address: 35.185.190.64 . If you're unsure about any of these steps, don't hesitate to seek assistance from your IT or technical team.


Setting Up the Connection:

To establish a connection between your PostgreSQL database and RocketBI, follow these straightforward steps:

  1. Navigating to the top left corner & selecting Organization Settings.

  2. In the right-hand panel, choose DataSource Config & click on Edit.

  3. Within the pop-up panel, select PostgreSQL as your database type.

  4. In the subsequent panel, complete the PostgreSQL database connection details as follows:

    • Host: This is the IP address or hostname of your PostgreSQL database.

    • Port: Specify the port number where your PostgreSQL database operates.

    • Username: Provide the username of the database user that has read and write permissions for RocketBI to access your data.

    • Password: Enter the password associated with the specified database user.


Using an SSH Tunnel:

In cases where direct access to your database is not feasible, such as due to company policies or security concerns, RocketBI offers the option to connect via an SSH tunnel.


To use an SSH tunnel for PostgreSQL, you need to config on your Data Warehouse first:

  1. Install the ssh and psql clients on your server or local machine.

  2. Generate an SSH key pair.

  3. Copy the public SSH key to the PostgreSQL server.

  4. Create an SSH tunnel to the PostgreSQL server.

  5. Connect to the PostgreSQL server through the SSH tunnel.

Example for Data Warehouse setup:

The following example shows how to use an SSH tunnel to connect to a PostgreSQL server on the IP address 192.168.1.100:

# Generate an SSH key pair
ssh-keygen -t rsa -b 4096 -C "your_email@example.com" 
# Copy the public SSH key to the PostgreSQL server
ssh-copy-id postgres@192.168.1.100 
# Create an SSH tunnel to the PostgreSQL server
ssh -L 5432:localhost:5432 postgres@192.168.1.100  
# Connect to the PostgreSQL server through the SSH tunnel 
psql -h localhost -p 5432 -U postgres

Configure RocketBI to use SSH tunnel for PostgreSQL server


Follow these steps to enable this feature:

  • Create a user on your jump server and grant them access by adding the value found in the 'Public Key' field to the jump server. This allows the SSH service to authorize the connection.

  • The process for creating this user varies based on your server's operating system. Generally, the 'Public Key' should be saved as a plain text file with the '.pub' extension in the '/home/<username>/.ssh' directory, assuming your server is Linux-based (try to search for id_rsa.pub)

To establish an SSH tunnel in RocketBI, follow these steps:

  • Click the dropdown arrow next to the SSH tunnel option.

  • Fill in the required fields as follows:

    • SSH Host: This should be the address of the jump server that has access to your destination PostgreSQL database. It can be a hostname or an IP address.

    • SSH Port: Specify the port number of the SSH service on the jump server. The default is often 22, but it may be reconfigured.

    • SSH Username: Enter the username created for RocketBI to authorize the connection.

    • Public Key: Provide the key that needs to be added to the jump server for RocketBI to authorize the SSH connection.


Troubleshooting:

After completing the necessary fields, click the 'Test Connection' button to verify whether RocketBI can successfully connect to your PostgreSQL database.


Common Issues:

If you encounter an error message while connecting or syncing data from your data source, it may indicate the need for additional configurations to ensure that the connectors/jobs work correctly.


Connection Time Out:

This error can occur for several reasons:

  • The cloud provider's firewall may block RocketBI's connection request to PostgreSQL.

  • The hostname or IP address may be inaccessible from the RocketBI SaaS cloud server, often due to company security policies (e.g., VPN, private network).

  • The database connection might require specific properties for proper functionality. These properties can be added in the "Properties" section.


Communication Failure:

Similar to the previous error, this issue occurs when RocketBI can access the server hosting your PostgreSQL database but cannot communicate or retrieve information from the Data Warehouse. Typically, it requires specific properties for proper operation (e.g., useSSL, allowPublicKeyRetrieval).

If you encounter any of these scenarios, please share the error message with your IT or technical team to thoroughly examine the database connection status.


Server Time Error:


What to Do:

  1. Check the connector settings:

    • Go to Settings > Organization Settings > DataSource Config and click the 'Edit' button.

    • Look for the 'Properties' section in the pop-up panel and click 'Add.'

    • Depending on your connector type, you may need to configure the timezone of your database by providing a key-value pair setting.

    • For example, click 'Add' in the properties section, enter 'serverTimezone' as the key, and 'UTC' as the value (assuming your database operates on UTC timezone), then press Enter.

    • Click `Test Connection` to verify if the settings are correct. If the test is successful, click 'Apply' to save your changes.


Following these steps, you can seamlessly connect your PostgreSQL database to RocketBI, enabling efficient data analysis and reporting for your business needs.


Frequently Asking Questions:

1. Why is my system (website/ ERP/ CRM/ any other internal system) slowing down?

NEVER connect your transaction database to RocketBI for business intelligence.A transaction database, or record-keeping system, is where your internal systems store crucial records of everyday activities like sales, purchases, payments, and other transactions. It's similar to how your accountant can't simultaneously file an invoice and present the sales performance on your company's monthly meeting.

2. What should I do if I encounter connection issues or errors when connecting to MySQL?

  • Double-check Credentials: Ensure you've entered the correct server address, username, password, and database name. Pay attention to typos and case sensitivity.

  • Network Connectivity: Ensure you can access your PostgreSQL.

  • Firewall and Permissions: Make sure your PostgreSQL server allows connections from RocketBI's IP address ( 144.217.255.203 ). Check firewall settings and user permissions in PostgreSQL.

  • Port Configuration: Verify that PostgreSQL is running on the default port (5432) or the port you specified and that a firewall does not block it.

  • Security Settings: Check if PostgreSQL security settings (such as SSL) are configured correctly

3. Can I schedule data refreshes or automate data updates from PostgreSQL to RocketBI?

You don't need to do anything. Your data is sync in real-time.

4. What kind of access should I tell my IT/ Technical team to grant RocketBI?

For a completed experience, the user credential that was used to connect your PostgreSQL and RocketBI must have full Read and Write authority. We don't need your PostgreSQL user creation or full administration rights.

5. Is there a risk of data corruption when connecting PostgreSQL to RocketBI?

No, RocketBI syncs your data and allows you to do exploration, transformation & visualization. To avoid unnecessary errors, create new databases for your ETL & analysis.

6. Can I connect multiple PostgreSQL databases to RocketBI simultaneously?

No, only one data warehouse can be connected to RocketBI at a time. You can unplug and connect to a new PostgreSQL, but all your analysis results would stay in the old warehouse.

7. Why does the analyzing process is so slow?

PostgreSQL is a hybrid database capable of performing both read and write operations. However, for analytically intensive tasks, such as automated data processing on a large scale, the read performance may be impacted. We recommend consulting with your IT/Tech team and considering various strategies, such as transitioning your PostgreSQL data warehouse to a fully managed cloud solution or migrating to a specialized analytics database (for instance, we offer native support for ClickHouse).

8. Why does RocketBI become slower with time?

In most cases, it's not RocketBI but your Database server. It can become slow for various reasons, ranging from improper index usage to bugs in the storage engine. However, in most cases, they need to be maintain & clean-up, so monitor your data warehouse and keep an eye on their performance.

9. Which database is recommend for a data warehouse ?

For analytical data warehouse, we recommend you to use Clickhouse. it's fast & open source.


 
 
 

Comments


bottom of page