Connecting to MongoDB

This section contains details for MongoDB connections.

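Some values in this section depend on the driver class you select, mongodb.jdbc.MongoDriver or cdata.jdbc.mongodb.MongoDBDriver. Where a field lists two values, each value applies to one of these driver classes.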

General information

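Field                  Description
Data source            MongoDB
Supported version      3.12.11 or 22.0.8326.0, depending on the driver class
Connection string      jdbc:mongodb://
Packaged?              No or Yes, depending on the driver class
Certified?             No or Yes, depending on the driver class

Supported features
Estimate job           Yes
Analyze data           Yes
Schedule               Yes or No, depending on the driver class

Processing capabilities
Pushdown               No
Spark agent            Yes
Yarn agent             Yes or No, depending on the driver class
Parallel JDBC          No or Yes, depending on the driver class

Java Platform version compatibility
JDK 8                  Yes
JDK 11                 Yes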

Minimum user permissions

To bring your MongoDB data into Collibra Data Quality & Observability, you need the following permissions.

  • Read access on your MongoDB tables.
  • ROLE_ADMIN assigned to your user in Collibra DQ.

Recommended and required connection properties

Connection Property: Name
Required: Yes
Type: Text
Value: The unique name of your connection. Ensure that there are no spaces in your connection name.

Connection Property: Connection URL
Required: Yes
Type: String
Value: The connection string path of your MongoDB connection.

For successful job execution, you must first configure a Spark Scratch Directory to allocate space for a schema cache. After you configure the Spark Scratch Directory, add the temporary writable location schema=/tmp/scratch/${mongo_example_file_name.xml} to the connection string so that the schema cache can be written there, as shown in the first example below. Replace ${mongo_example_file_name.xml} with the file name corresponding to the database name.

When referring to the examples below, replace the ${value} sections of the connection URL with your actual values.

Examples:
jdbc:mongodb://${host}.mongodb.net?AuthDatabase=admin&AuthScheme=SCRAM-SHA-1&UseSSL=true&schema=/tmp/scratch/${mongo_example_file_name.xml}
jdbc:mongodb://${host}.mongodb.net?AuthDatabase=admin&AuthScheme=SCRAM-SHA-1&UseSSL=true&database=${dbName}

Connection Property: Driver Name
Required: Yes
Type: String
Value: The driver class name used for your connection, either mongodb.jdbc.MongoDriver or cdata.jdbc.mongodb.MongoDBDriver.

Connection Property: Port
Required: Yes
Type: Integer
Value: The port number used to establish a connection to the data source. The default port is 443.

Connection Property: Source Name
Required: No
Type: String
Value: N/A

Connection Property: Target Agent
Required: No
Type: Option
Value: The Agent used to submit your DQ Job.

Connection Property: Auth Type
Required: Yes
Type: Option
Value: The method to authenticate your connection.

Note: The configuration requirements differ depending on the Auth Type you select. See Authentication for more details on available authentication types.

Connection Property: Properties
Required: No
Type: String
Value: The configurable driver properties for your connection. Multiple properties must be comma delimited. For example, abc=123,test=true.
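The Connection URL and Properties values described above can be assembled programmatically before you paste them into the connection form. The following is a minimal sketch, not part of Collibra DQ: the function names, the host cluster0, and the file name sales.xml are hypothetical, and it assumes that /tmp/scratch is the Spark Scratch Directory you configured and that the schema-cache form of the example URL applies to your driver.

```python
def build_connection_url(host: str, schema_file: str,
                         scratch_dir: str = "/tmp/scratch") -> str:
    """Fill in the ${host} and schema placeholders of the example URL."""
    params = [
        ("AuthDatabase", "admin"),
        ("AuthScheme", "SCRAM-SHA-1"),
        ("UseSSL", "true"),
        # Writable schema-cache location; assumed to sit inside the
        # configured Spark Scratch Directory.
        ("schema", f"{scratch_dir}/{schema_file}"),
    ]
    query = "&".join(f"{key}={value}" for key, value in params)
    return f"jdbc:mongodb://{host}.mongodb.net?{query}"


def format_driver_properties(properties: dict) -> str:
    """Join driver properties into the comma-delimited form abc=123,test=true."""
    return ",".join(f"{key}={value}" for key, value in properties.items())


if __name__ == "__main__":
    # Hypothetical host and schema file name, for illustration only.
    print(build_connection_url("cluster0", "sales.xml"))
    print(format_driver_properties({"abc": "123", "test": "true"}))
```

The helpers only concatenate strings, so they do not check that the host is reachable or that the scratch directory exists and is writable.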

Authentication

Select an authentication type from the dropdown menu. The options available in the dropdown menu are the currently supported authentication types for this data source.

The fields below are listed in the order they appear across the available authentication types.

Field: Username
Required: Yes
Description: The username of your MongoDB account.

Field: Password
Required: Yes
Description: The password of your MongoDB account.

Field: Script
Required: Yes
Description: The file path of the script file that the password manager uses to interact with and authenticate a user account.
Example: /tmp/keytab/mongodb_pwd_mgr.sh

Field: Param $1
Required: No
Description: Optional. An additional parameter to authenticate your MongoDB connection.

Field: Param $2
Required: No
Description: Optional. An additional parameter to authenticate your MongoDB connection.

Field: Param $3
Required: No
Description: Optional. An additional parameter to authenticate your MongoDB connection.

Field: Principal
Required: Yes
Description: The Kerberos entity to authenticate and grant access to your connection.

Field: Keytab
Required: Yes
Description: The file path of the keytab file that contains the encrypted key for a Kerberos principal.
Example: /tmp/keytab/hive_user.keytab

Field: Password
Required: Yes
Description: The secret credential associated with your Kerberos principal.

Field: Script
Required: Yes
Description: The file path of the script file used to interact with and authenticate a Kerberos user.
Example: /tmp/keytab/mongodb_pwd_mgr.sh

Field: Param $1
Required: No
Description: Optional. Additional Kerberos parameter.

Field: Param $2
Required: No
Description: Optional. Additional Kerberos parameter.

Field: Param $3
Required: No
Description: Optional. Additional Kerberos parameter.

Field: TGT
Required: Yes
Description: The ticket-granting ticket cache that stores the TGT to authenticate your connection.

Command line example for basic spark-submit job

-lib "/opt/owl/drivers/mongodb/" 
-h localhost:5432/postgres 
-master local[*] 
-ds tpch.lineitem_7 
-br 10 -deploymode client 
-q "select * from tpch.lineitem where l_shipdate between '${rd} 00:00:00.000+0000' 
and '${rdEnd} 00:00:00.000+0000' " 
-bhlb 10 -rd "1998-12-01" 
-driver "mongodb.jdbc.MongoDriver" 
-loglevel INFO -cxn MongoDB -rdEnd "1998-12-02"
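In the command above, the -q query contains the placeholders ${rd} and ${rdEnd}, while -rd and -rdEnd supply the run date and its end. The following is a minimal sketch of what the query might look like once those values are filled in. It assumes plain textual substitution and uses a hypothetical helper named render_query; the way the DQ job actually resolves these placeholders may differ.

```python
from string import Template

# Query text taken from the -q argument of the command-line example.
QUERY = (
    "select * from tpch.lineitem where l_shipdate between "
    "'${rd} 00:00:00.000+0000' and '${rdEnd} 00:00:00.000+0000'"
)


def render_query(query: str, rd: str, rd_end: str) -> str:
    """Substitute ${rd} and ${rdEnd}; unknown placeholders are left untouched."""
    return Template(query).safe_substitute(rd=rd, rdEnd=rd_end)


if __name__ == "__main__":
    # Values taken from the -rd and -rdEnd arguments of the example.
    print(render_query(QUERY, "1998-12-01", "1998-12-02"))
```

With the example values, the rendered range runs from 1998-12-01 to 1998-12-02, which is useful for checking that the date window matches what you intend before submitting the job.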

Note: For more details about the various Create and Alter SQL statements and table-level actions, see the official MongoDB documentation.