Connecting to Apache Drill
Apache Drill is an open-source SQL query engine for high-performance analysis of semi-structured data from Big Data applications.
Note
Analytics provides Drill as an optional connector. If it is not available in your Data Access window, the connector was probably not selected during installation. For more information, see Install optional Analytics data connectors and Python engine.
Before you start
To connect to Drill, you must gather the following:
- the correct authentication type (basic or none)
- username and password if using basic authentication
- the correct connection type (Direct to Drillbit embedded mode or ZooKeeper Quorum distributed mode)
- host and port of the Drillbit, or of each node if using distributed mode
In distributed mode, host and port information must be entered in the Quorum field in comma-delimited format:
<host name/ip address> : <port number>, <host name/ip address> : <port number>, . . .
- name of the Drillbit cluster if using distributed mode
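The comma-delimited Quorum format described above can be assembled programmatically before you paste it into the Quorum field. The following is a minimal sketch, not part of Analytics: the function name `format_quorum` and the host names are hypothetical, and it assumes the spaces shown in the template are typographic and not required.

```python
def format_quorum(nodes):
    """Join (host, port) pairs into a comma-delimited host:port list."""
    parts = []
    for host, port in nodes:
        port = int(port)  # accept ports given as strings or integers
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range for {host!r}: {port}")
        parts.append(f"{host}:{port}")
    return ",".join(parts)


# Hypothetical nodes for illustration only; use your own hosts and ports.
quorum = format_quorum([("zk1.example.com", 2181), ("192.168.0.12", 2181)])
print(quorum)  # zk1.example.com:2181,192.168.0.12:2181
```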
For help gathering the connection prerequisites, contact the Apache Drill administrator in your organization. If your administrator cannot help you, you or your administrator should contact Apache Drill Support.
For advanced connection properties, see the Apache Drill online help on Configuring ODBC on Windows.
Create a Drill connection
- From the Analytics main menu, select Import > Database and application.
- From the New Connections tab, in the ACL Connectors section, select Apache Drill.
Tip
You can filter the list of available connectors by entering a search string in the Filter connections box. Connectors are listed alphabetically.
- In the Data Connection Settings panel, enter the connection settings, and then click Save and Connect at the bottom of the panel.
You can accept the default Connection Name, or enter a new one.
The connection for Apache Drill is saved to the Existing Connections tab. In the future, you can reconnect to Apache Drill from the saved connection.
Once the connection is established, the Data Access window opens to the Staging Area and you can begin importing data. For help importing data from Apache Drill, see Working with the Data Access window.
Connection settings
Basic settings
Setting | Description | Example |
---|---|---|
Connection Type | Specifies the driver connection type. The options available are Direct to Drillbit and ZooKeeper Quorum. | Direct to Drillbit |
Quorum | Specify the server(s) in your ZooKeeper cluster. Separate multiple servers using a comma (,). | |
Cluster ID | Name of the ZooKeeper cluster that the driver connects to. | drillbits1 |
Host | The IP address or host name of the Drill server. | localhost |
Port | The TCP port that the Drill server uses to listen for client connections. | 31010 |
Authentication Type | Specifies how the driver authenticates the connection to Drill. The options available are no authentication and basic authentication (user name and password). | No Authentication |
User | User name to authenticate to the Drill server. | |
Password | Password to authenticate to the Drill server. | |
Catalog | Name of the synthetic catalog under which all of the schemas/databases are organized. This catalog name is used as the value for SQL_DATABASE_NAME or CURRENT CATALOG. | DRILL |
Default Schema | Name of the database schema to use when a schema is not explicitly specified in a query. You can issue queries on other schemas by explicitly specifying the schema in the query. | |
Disable Async | Specifies whether the driver supports asynchronous queries. | |
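
Which basic settings you need depends on the connection type and authentication type you choose. The following sketch is one way to check a set of values before entering them. It is illustrative only: the function `missing_settings` is hypothetical, the dictionary keys mirror the setting names in the table, and it assumes any connection type other than "Direct to Drillbit" is the ZooKeeper Quorum option and any authentication type other than "No Authentication" needs a user name and password.

```python
DIRECT = "Direct to Drillbit"   # connection type shown in the table example
NO_AUTH = "No Authentication"   # authentication type shown in the table example


def missing_settings(settings):
    """Return the names of required settings that are absent or empty."""
    required = ["Connection Type", "Authentication Type"]
    if settings.get("Connection Type") == DIRECT:
        required += ["Host", "Port"]
    else:
        # Assumption: every other value means ZooKeeper Quorum mode.
        required += ["Quorum", "Cluster ID"]
    if settings.get("Authentication Type") != NO_AUTH:
        # Assumption: every other value needs credentials.
        required += ["User", "Password"]
    return [name for name in required if not settings.get(name)]


# Values taken from the Example column of the table.
example = {
    "Connection Type": DIRECT,
    "Host": "localhost",
    "Port": 31010,
    "Authentication Type": NO_AUTH,
}
print(missing_settings(example))  # []
```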
Advanced settings
Setting | Description | Example |
---|---|---|
Advanced Properties | Properties for configuring the driver. Separate the advanced properties with semicolons (;), and then surround all the advanced properties in a connection string with braces ({ }). | CastAnyToVarchar=true; HandshakeTimeout=5; QueryTimeout=180; TimestampTZDisplayTimezone=local; ExcludedSchemas=sys,INFORMATION_SCHEMA; NumberOfPrefetchBuffers=5 |
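
The semicolon-separated format of the Advanced Properties value can be sketched as follows. The helper `format_advanced_properties` is hypothetical, and the property names come from the Example column above. It assumes the spaces after the semicolons in the example are optional, and it adds the braces only when requested, for use inside a connection string.

```python
def format_advanced_properties(props, in_connection_string=False):
    """Join key=value pairs with semicolons; optionally wrap in braces."""
    pairs = []
    for key, value in props.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # lowercase, as in the example
        pairs.append(f"{key}={value}")
    joined = ";".join(pairs)
    return "{" + joined + "}" if in_connection_string else joined


props = {
    "CastAnyToVarchar": True,
    "HandshakeTimeout": 5,
    "QueryTimeout": 180,
}
print(format_advanced_properties(props))
# CastAnyToVarchar=true;HandshakeTimeout=5;QueryTimeout=180
print(format_advanced_properties(props, in_connection_string=True))
# {CastAnyToVarchar=true;HandshakeTimeout=5;QueryTimeout=180}
```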