
You are viewing documentation for Immuta version 2022.4.

For the latest version, view our documentation for Immuta SaaS or the latest self-hosted version.

Cloudera Native Workspace Configuration

Audience: System Administrators

Content Summary: This page describes how to configure Native Workspaces for Immuta-enabled CDH clusters. The Native Workspace requires a CDH cluster with the Immuta parcel installed and configured. For more information about CDH deployments, please see the main installation guide.

Overview

This workspace allows native access to data on the cluster without going through the Immuta SparkSession or Immuta Query Engine.

Accessing Data

Users can access the directory and database created for the workspace only when acting under the project. The Immuta Spark SQL Session applies policies to the data, so any data written to the workspace is already compliant with the restrictions of the equalized project, where all members see data at the same level of access. When users are ready to write data back to Immuta, they should use the Immuta Spark SQL Session to copy data into the workspace.

Workspace Configuration Options:

  • Cloudera HDFS
  • Cloudera S3A

Available Data Source Types:

  • Amazon S3 (Cloudera S3A)

Immuta Web Configuration

The native workspace must be enabled from the App Settings page.

Hive Configuration

If your workspace storage is located in S3, the AWS key pair configuration snippet below must be set in Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml. Note that the property names differ between CDH 5.x and CDH 6.x clusters.

<property>
    <!-- <name>fs.s3a.awsAccessKeyId</name> Use this property name for CDH 5.x clusters -->
    <name>fs.s3a.access.key</name>
    <value>(Your access key)</value>
</property>
<property>
    <!-- <name>fs.s3a.awsSecretAccessKey</name> Use this property name for CDH 5.x clusters -->
    <name>fs.s3a.secret.key</name>
    <value>(Your secret key)</value>
</property>
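For CDH 5.x clusters, the property names shown in the comments above take the place of the default names. As a sketch, assuming nothing else in the snippet changes, the CDH 5.x form would look like this (the values remain placeholders):

```xml
<!-- CDH 5.x variant: uses the alternate property names noted in the comments above -->
<property>
    <name>fs.s3a.awsAccessKeyId</name>
    <value>(Your access key)</value>
</property>
<property>
    <name>fs.s3a.awsSecretAccessKey</name>
    <value>(Your secret key)</value>
</property>
```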

The Immuta System API Key configuration snippet below must be set in Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for core-site.xml.

<property>
    <name>immuta.system.api.key</name>
    <value>(Your Immuta System API Key)</value>
</property>

The Immuta Group Mapping and Immuta System API Key configuration snippet below must be set in HiveServer2 Advanced Configuration Snippet (Safety Valve) for core-site.xml. For more information on Immuta Group Mapping configuration, see Enabling Immuta Group Mapping.

<property>
    <name>hadoop.security.group.mapping</name>
    <value>org.apache.hadoop.security.CompositeGroupsMapping</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.security.group.mapping.providers</name>
    <value>jni,immuta</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.security.group.mapping.providers.combined</name>
    <value>true</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.security.group.mapping.provider.jni</name>
    <value>org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.security.group.mapping.provider.immuta</name>
    <value>com.immuta.security.ImmutaGroupsMapping</value>
    <final>true</final>
</property>
<property>
    <name>immuta.system.api.key</name>
    <value>(Your Immuta System API Key)</value>
    <final>true</final>
</property>

Impala Configuration

The AWS key pair and Immuta System API Key configuration snippet below must be set in Impala Catalog Server Advanced Configuration Snippet (Safety Valve) for core-site.xml. Note that the property names may differ between CDH 5.x and CDH 6.x clusters.

<property>
    <!-- <name>fs.s3a.awsAccessKeyId</name> Use this property name for CDH 5.x clusters -->
    <name>fs.s3a.access.key</name>
    <value>(Your access key)</value>
</property>
<property>
    <!-- <name>fs.s3a.awsSecretAccessKey</name> Use this property name for CDH 5.x clusters -->
    <name>fs.s3a.secret.key</name>
    <value>(Your secret key)</value>
</property>
<property>
    <name>immuta.system.api.key</name>
    <value>(Your Immuta System API Key)</value>
</property>
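On a CDH 5.x cluster, the Impala Catalog Server snippet would presumably use the alternate key-pair property names given in the comments, with the Immuta System API Key property unchanged. This is an untested sketch under that assumption:

```xml
<!-- CDH 5.x variant for the Impala Catalog Server safety valve (assumed form) -->
<property>
    <name>fs.s3a.awsAccessKeyId</name>
    <value>(Your access key)</value>
</property>
<property>
    <name>fs.s3a.awsSecretAccessKey</name>
    <value>(Your secret key)</value>
</property>
<property>
    <!-- Same property name on CDH 5.x and CDH 6.x -->
    <name>immuta.system.api.key</name>
    <value>(Your Immuta System API Key)</value>
</property>
```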

The AWS key pair, Immuta System API Key, and Immuta Group Mapping configuration snippet below must be set in Impala Daemon Advanced Configuration Snippet (Safety Valve) for core-site.xml. For more information on Immuta Group Mapping configuration, see Enabling Immuta Group Mapping.

<property>
    <name>hadoop.security.group.mapping</name>
    <value>org.apache.hadoop.security.CompositeGroupsMapping</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.security.group.mapping.providers</name>
    <value>jni,immuta</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.security.group.mapping.providers.combined</name>
    <value>true</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.security.group.mapping.provider.jni</name>
    <value>org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.security.group.mapping.provider.immuta</name>
    <value>com.immuta.security.ImmutaGroupsMapping</value>
    <final>true</final>
</property>
<property>
    <name>immuta.system.api.key</name>
    <value>(Your Immuta System API Key)</value>
    <final>true</final>
</property>
<property>
    <!-- <name>fs.s3a.awsAccessKeyId</name> Use this property name for CDH 5.x clusters -->
    <name>fs.s3a.access.key</name>
    <value>(Your access key)</value>
</property>
<property>
    <!-- <name>fs.s3a.awsSecretAccessKey</name> Use this property name for CDH 5.x clusters -->
    <name>fs.s3a.secret.key</name>
    <value>(Your secret key)</value>
</property>

Sentry Configuration

If you want users to be able to create derived data sources and/or native Hive or Impala tables within Immuta's native project workspaces, you will need to grant a Sentry admin role to the immuta user. This requires adding the immuta user to Admin Groups and Allowed Connecting Users under Sentry's configuration in Cloudera Manager.

You should also create a new Sentry role for immuta, with all privileges granted. Run the SQL snippet below in beeline or impala-shell as either the immuta user or as any user with Sentry admin privileges.

CREATE ROLE immuta;
GRANT ALL ON SERVER <server name> TO ROLE immuta WITH GRANT OPTION;
GRANT ROLE immuta TO GROUP immuta;

You will also need to enable the ImmutaGroupsMapping service in Hive and/or Impala's configuration to allow Immuta to manage Sentry permissions for Immuta users. For instructions on how to do this, please see Enabling ImmutaGroupsMapping.

Create a Cloudera or EMR Workspace

  1. Navigate to the Policies tab and enable Project Equalization by toggling the Project Equalization slider to on.
  2. Scroll to the Native Workspace section and click Create.
  3. Select the Cloudera or EMR Workspace Configuration from the dropdown menu.
  4. Select the Cluster Name from the subsequent dropdown menu.
  5. Opt to edit the Workspace Directory field or add a Hive Connection (if available).
  6. Click Create to enable the workspace.