Analyze ClickHouse Data With Apache Superset
Updated: Dec 19, 2022
Apache Superset is an increasingly popular BI tool for ClickHouse users. There are several ways to connect ClickHouse data to Superset; this article gives a brief overview of how to connect through the Python-based clickhouse-connect driver and then analyze ClickHouse data in Superset.
1. What is Apache Superset?
Apache Superset is an open-source business intelligence web application. It lets users of various skill levels explore and visualize their data, from simple pie charts to detailed geospatial charts. It is fast, lightweight, and intuitive, and it can help a company collect and analyze data to support strategy development and execution.
2. Connect ClickHouse to Superset
Gather connection details
To connect to ClickHouse with HTTP(S), you need this information:
HOST and PORT: typically, the port is 8443 when using TLS or 8123 when not using TLS.
DATABASE NAME: the name of the database you want to connect to.
USERNAME and PASSWORD: the credentials appropriate for your use case.
The details for your ClickHouse Cloud service are available in the ClickHouse Cloud console. Select the service that you will connect to and click Connect.
Choose HTTPS, and the details are shown in an example curl command.
If you use self-managed ClickHouse, the connection details are set by your ClickHouse administrator.
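Before configuring Superset, it can help to confirm that the details you gathered actually work. The sketch below mirrors the kind of check a curl command performs, using only the Python standard library. It assumes the ClickHouse HTTP interface accepts basic authentication and a `query` URL parameter; the function names are illustrative and not part of any library.

```python
import base64
import urllib.parse
import urllib.request
from typing import Optional


def default_port(secure: bool) -> int:
    """Return the usual ClickHouse HTTP(S) port: 8443 with TLS, 8123 without."""
    return 8443 if secure else 8123


def base_url(host: str, secure: bool = True, port: Optional[int] = None) -> str:
    """Build the base URL of the ClickHouse HTTP interface."""
    scheme = "https" if secure else "http"
    chosen_port = port if port is not None else default_port(secure)
    return f"{scheme}://{host}:{chosen_port}/"


def run_query(url: str, username: str, password: str, query: str = "SELECT 1") -> str:
    """Send one read-only query over HTTP and return the response body."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        url + "?" + urllib.parse.urlencode({"query": query}),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode().strip()
```

Calling `run_query(base_url("my-service.example.com"), "default", "my-password")` with your own host and credentials (the values here are placeholders) should return a short result if the details are correct, and raise an error otherwise.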
Install the Driver
Superset uses the clickhouse-connect driver to connect to ClickHouse. The details of clickhouse-connect are at https://pypi.org/project/clickhouse-connect/, and it can be installed with the following command:
pip install clickhouse-connect
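The driver has to be installed in the same Python environment that runs Superset. A quick way to check this, and to test the driver against your server, is sketched below. It assumes the `clickhouse_connect.get_client` and `client.command` calls from the clickhouse-connect package; the helper names `driver_version` and `server_version` are made up for this example.

```python
from importlib import metadata
from typing import Optional


def driver_version() -> Optional[str]:
    """Return the installed clickhouse-connect version, or None if it is missing."""
    try:
        return metadata.version("clickhouse-connect")
    except metadata.PackageNotFoundError:
        return None


def server_version(host: str, port: int, username: str, password: str,
                   secure: bool = True):
    """Connect with clickhouse-connect and ask the server for its version."""
    # Imported lazily so that driver_version() still works without the driver.
    import clickhouse_connect

    client = clickhouse_connect.get_client(
        host=host, port=port, username=username, password=password, secure=secure
    )
    return client.command("SELECT version()")
```

If `driver_version()` returns `None` when run with Superset's Python interpreter, the driver was probably installed into a different environment.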
Add ClickHouse as a database in Superset
1. Select Data from the top menu and then Databases from the drop-down menu. Add a new database by clicking the + Database button
2. Select ClickHouse Connect as the type of database
3. Enter the connection information that you collected earlier
4. Click the CONNECT and then FINISH buttons to complete the setup wizard, and you should see your database in the list of databases.
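Superset can also take the same details as a single SQLAlchemy URI instead of separate form fields. The sketch below assembles such a URI. It assumes the `clickhousedb://` scheme and the `secure=true` option of the clickhouse-connect SQLAlchemy dialect, so check the driver documentation for your version; `superset_uri` is a hypothetical helper name.

```python
from urllib.parse import quote_plus


def superset_uri(host: str, port: int, database: str,
                 username: str, password: str, secure: bool = True) -> str:
    """Assemble a SQLAlchemy URI for the clickhouse-connect dialect."""
    # Percent-encode credentials so characters such as '@' or '/' stay valid.
    user = quote_plus(username)
    pwd = quote_plus(password)
    uri = f"clickhousedb://{user}:{pwd}@{host}:{port}/{database}"
    if secure:
        # Assumed option name for enabling TLS in the dialect.
        uri += "?secure=true"
    return uri
```

Encoding the credentials matters in practice: a password containing `@` or `/` would otherwise be misread as part of the host or path.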
3. Superset as a BI Tool for ClickHouse
Advantages
Dashboard exploration: you can filter and organize data in Superset while focusing on each individual chart or metric in a dashboard.
Suitable for production deployments and easy to integrate with third-party tools.
A no-code chart builder for quick visualizations, plus SQL Lab, a SQL IDE for writing and analyzing queries directly.
A lightweight semantic layer that lets you control how data sources are exposed in the UI.
Limitations
Each chart is built on a single dataset, so tables cannot be joined in the chart builder. Joins have to be written as SQL (for example in SQL Lab) or prepared as views in the database, and combining data view by view means running several queries, which affects performance.
Although it has extensive connection support, it doesn't support NoSQL data sources.
It requires technical expertise to install and set up, which makes it less suitable for small businesses and startups.
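One common workaround for the single-dataset limitation is to prepare the join in ClickHouse itself and expose the result to Superset as one dataset. The sketch below only generates the statement. The table, column, and view names are hypothetical, and the helper interpolates identifiers directly, so it should only be used with trusted names.

```python
from typing import Sequence


def joined_view_sql(view: str, left: str, right: str, key: str,
                    columns: Sequence[str]) -> str:
    """Build a CREATE VIEW statement that pre-joins two tables on a shared key."""
    cols = ", ".join(columns)
    return (
        f"CREATE VIEW IF NOT EXISTS {view} AS "
        f"SELECT {cols} FROM {left} INNER JOIN {right} USING ({key})"
    )


# Hypothetical example: orders joined to customers on customer_id.
statement = joined_view_sql(
    "orders_with_customers", "orders", "customers", "customer_id",
    ["customer_id", "order_total", "country"],
)
```

After running the statement against ClickHouse, the view can be added in Superset like any other table.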
4. Build a visualization with Superset
1. Select Charts from the top menu and click the button to add a new chart, then select a dataset to start
2. Select fields for the dimension and metric
3. Click the SAVE button to save the chart
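It may help to think of the chart in SQL terms: the dimension becomes a GROUP BY column and the metric becomes an aggregate expression. The sketch below builds a query of roughly that shape. It is an illustration of the idea, not the exact SQL Superset generates, and the table and column names are hypothetical.

```python
def chart_query(table: str, dimension: str, metric: str,
                alias: str = "value", limit: int = 10) -> str:
    """Build an aggregate query: one row per dimension value, ordered by the metric."""
    return (
        f"SELECT {dimension}, {metric} AS {alias} "
        f"FROM {table} GROUP BY {dimension} "
        f"ORDER BY {alias} DESC LIMIT {limit}"
    )


# Hypothetical example: number of trips per payment type.
query = chart_query("trips", "payment_type", "count()", alias="trip_count")
```

Running a query like this in SQL Lab first is a quick way to check that a dimension and metric return what you expect before building the chart.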
Looking for an alternative?
You can use Rocket.BI as a one-stop alternative. It is also an open-source BI platform for collecting, storing, analyzing, and visualizing data, and it supports both SQL and NoSQL data sources. You can join multiple tables and create relationships between different databases. Because it integrates natively with ClickHouse, you can set up BI on ClickHouse in a few simple steps.
Did you know? Data Insider Rocket.BI offers free trials with unlimited features, integrations, and visualization capabilities. Visit https://www.datainsider.co/register to sign up for a free trial and try the full feature set.