Connecting to Couchbase

Concept Information

ACCESSDATA command

Couchbase is a NoSQL document-oriented database. Use the Couchbase data connector to import your organization's Couchbase data.

Note

Analytics provides Couchbase as an optional connector and if it is not available in your Data Access window, it is likely that the connector was not selected during installation. For more information, see Install optional Analytics data connectors and Python engine.

Before you start

To connect to Couchbase, you must gather the following:

  • at least one node in your Couchbase instance with the query and index services enabled
  • the database server's host name
  • the correct connection port
  • appropriate credentials for your chosen authentication method

    If your Couchbase instance requires authentication, you must provide either a JSON string or a JSON file that specifies the name and password for one or more buckets. For more information, see JSON authentication format.

  • a valid SSL certificate if connecting over SSL

For help gathering the connection prerequisites, contact the Couchbase administrator in your organization. If your administrator cannot help you, you or your administrator should contact Couchbase Support.

Create a Couchbase connection

  1. From the Analytics main menu, select Import > Database and application.
  2. From the New Connections tab, in the ACL Connectors section, select Couchbase.

    Tip

    You can filter the list of available connectors by entering a search string in the Filter connections box. Connectors are listed alphabetically.

  3. In the Data Connection Settings panel, enter the connection settings and at the bottom of the panel, click Save and Connect.

    You can accept the default Connection Name, or enter a new one.

The connection for Couchbase is saved to the Existing Connections tab. In the future, you can reconnect to Couchbase from the saved connection.

Once the connection is established, the Data Access window opens to the Staging Area and you can begin importing data. For help importing data from Couchbase, see Working with the Data Access window.

Connection settings

Setting Description Example
Server

The host name or IP address of the Couchbase server. This value can be set to an HTTP or HTTPS URL.

couchbase-server.com
Port The port the Couchbase server is running on. 8093
Authentication Mechanism

Authentication mechanism to use when connecting to Couchbase server.

  • No Authentication - Driver does not authenticate the connection.
  • Use Credentials - Driver authenticates the connection using the names and passwords specified in the credentials string.
  • Use Credential File - Driver authenticates the connection using the names and passwords specified in the credentials file.
 
Credentials

A JSON string that specifies the name and password of one or more users or buckets, for authenticating to a Couchbase server instance. The credentials string must use the following format:

[{"user": "[UserName1]", "pass":"[Password1]"},
{"user": "[UserName2]", "pass":"[Password2]"}]

 
Credentials File

Full path to a JSON file that contains credentials for authenticating to a Couchbase server instance.

The JSON file must contain the names and passwords of multiple users or buckets.

 
Enable SSL Specifies whether to use SSL when connecting to the Couchbase server. false
SSL Certificate

The certificate to be accepted from the server when Enable SSL is set to true.

You can provide any of the following:

  • a full PEM certificate
  • path to a local file containing the certificate
  • public key
  • MD5 or SHA1 thumbprint (hex values can also be either space or colon separated)

Any other certificate that is not trusted by the machine is rejected.

C:\cert.cer

Advanced settings

Setting Description Example
Query Mode

Specifies the query mode to use when sending queries to Couchbase server.

  • SQL- The driver executes all queries in SQL.
  • N1QL - The driver executes all queries in N1QL.
SQL
Consistency

The level of data consistency to enforce during index scans. Set this property to one of the following values:

  • NOT_BOUNDED
  • AT_PLUS
  • REQUEST_PLUS
  • STATEMENT_PLUS
REQUEST_PLUS
Enable Load Balancing Specifies whether the driver supports load balancing and failover between nodes in a Couchbase cluster.  
Sample Size The number of documents that the driver samples to detect the structure of the data when generating a schema definition using the SchemaMapOperation property. 100
Type Name List A comma-separated list of the attributes that the buckets use to specify document types. Each list item must be a bucket name surrounded by back quotes (`), a colon (:), and an attribute name surrounded by back quotes (`). `product`:`type`,`store`:`type`, `customer`:`jsonType`,`sales`:`type`

Querying the Couchbase instance

JSON authentication format

[{"user" : "userName1", "pass" : "passWord1"},
{"user" : "userName2", "pass" : "passWord2"}]

SQL queries vs N1QL API calls

The connector uses normal SQL queries against the Couchbase Server, translating standard SQL-92 queries into equivalent N1QL client API calls. This translation allows standard queries to run against your Couchbase Server instance. If a query cannot be fully translated, then the translated parts of the query are passed down as one or more N1QL queries to the Couchbase Server instance for processing while the untranslated parts of the query are processed by the Connector.

Note

The names of data structures are case-sensitive, therefore you must ensure the casing of structures such as tables, columns, or buckets in your queries match the structures in the database.

Schema definition

Couchbase is able to store data that follows different rules of data typing and structure compared to traditional relational tables and columns. Couchbase data is organized into buckets and documents, which can contain nested arrays or arrays of differently typed elements. This data needs to be mapped to a relational form. To achieve this, the connector generates a schema definition that maps the Couchbase data to an ODBC-compatible format.

When you connect to a database that does not already have the necessary schema definition, the connector automatically generates one by doing the following:

  1. For each document type identified in the database, the connector samples data from multiple documents to detect the structure of the data.
  2. The connector organizes all the documents into collections based on their type, and saves these collections as part of the schema definition. Using the schema definition, the driver exposes collections as tables.
  3. For each array detected in the database, the connector generates a virtual table to expand the data, and saves these virtual tables as part of the schema. Using the schema, the driver exposes virtual tables as normal tables.
  4. The connector defines a Couchbase data type for each column and maps each Couchbase data type to the SQL data type that is best able to represent the greatest number of values.

Base tables

Base tables represent data from collections of Couchbase documents. Documents appear as rows, and all attributes that are not arrays appear as columns. In each base table, the connector creates a primary key column named PK that identifies which Couchbase document each row comes from.

In the connector, the name of the base table is the document type that it represents. In Couchbase, the name of the base table is the bucket that the data comes from.

Virtual tables

Virtual tables provide support for arrays. Each virtual table contains the data from one array, and each row in the table represents an element from the array. If an element contains an array, then the connector creates additional virtual tables as needed to expand the nested data.

In each virtual table, the connector creates a primary key column name that identifies the document that the array comes from and references the column from the related base table. The connector also creates an index column (with the suffix _IDX in its name) to indicate the position of the element within the array.

Example

The following example shows the base tables and virtual tables that the connector would generate if it connected to a Couchbase database named ExampleDatabase, which contains two documents named Customer_123221 and Order_221354.

The Customer_123221 document is of type Customer and contains the following attributes. The SavedAddresses attribute is an array:

{
  "Type": "Customer",
  "Name": "John Doe",
  "SavedAddresses": ["123 Main St.", "456 1st Ave"]
}

The Order_221354 document is of type Order and contains the following attributes. The CreditCard attribute is an object, and the Items attribute is an array of objects:

{
  "Type": "Order",
  "CustomerID":"Customer_123221",
  "CreditCard":
    {
      "Type":"Visa",
       "CardNumber":"4111 1111 1111 1111",
       "Expiry":"12/12",
       "CVN":"123"
    },
  "Items":
    [
      {"ItemID":89123, "Quantity":1},
      {"ItemID":92312, "Quantity":5}
    ]
}

When Analytics connects to ExampleDatabase and generates the schema, the connector creates a collection for each document type and exposes these collections as two base tables, which are shown below:

Base table Customer

PK Name
"Customer_123221" John Doe

Base table Order

PK Customer
ID
CreditCard
_Type
CreditCard
_Number

CreditCard
_Expiry

CreditCard
_CVN
"Order_
221354"
"Customer_
123221"
"Visa" "4111 1111
1111 1111"
"12/12" "123"

The SavedAddresses array from the Customer_123221 document and the Items array from the Order_221354 document do not appear in these base tables. Instead, the connector generates a virtual table for each array:

SavedAddresses table

PK SavedAddresses_IDX SavedAddresses
"Customer_123221" 0 "123 Main St."
"Customer_123221" 1 "456 1st Ave"

Items table

PK Items_IDX ItemID Quantity
"Order_221354" 0 89123 1
"Order_221354" 1 92312 5