SAP HANA (public preview)
What is SAP HANA?
SAP HANA is an in-memory relational database and application development platform designed for high performance and real-time data processing. It allows businesses to analyze large volumes of data quickly and efficiently, supporting advanced analytics, transactional processing, and complex queries.
Why Connect SAP HANA to Monte Carlo?
Integrating Monte Carlo with SAP HANA allows you to monitor your data and tables. Monte Carlo can detect anomalous volume and freshness changes out of the box. With the ability to run SQL queries on your SAP HANA tables and views, you can create additional advanced monitors to analyze the data in your tables. For a full list of supported monitor types see Monitor Support.
After connecting SAP HANA here is what you can expect to see out of the box:
Freshness and Volume:
Our freshness and volume monitors are active out of the box. By collecting metadata on your tables, Monte Carlo’s machine learning algorithm can set automatic thresholds to alert when a table has gone too long without an update, and when an abnormal amount of rows are added or deleted.
Monitor & Lineage Support
Below are the monitors & lineage support for the SAP HANA integration today. Please reach out to your Monte Carlo representative if you have requirements for additional monitors.
Category | Monitor / Lineage Capabilities | Support |
---|---|---|
Table Monitor | Freshness | ✅ |
Table Monitor | Volume | ✅ |
Table Monitor | Schema Changes | |
Metric Monitor | Metric | |
Validation Monitor | Custom SQL | ✅ |
Validation Monitor | Comparison | ✅ |
Validation Monitor | Validation | ✅ |
Job Monitor | Query performance | |
Lineage | Lineage |
Connecting SAP HANA
Now that we know the value connection SAP HANA to Monte Carlo can bring, here are the necessary steps to set up the integration.
Prerequisites
To create a database user, you will need to have USER ADMIN privileges.
This guide explains how to create a read-only service account for Monte Carlo on SAP HANA.
To review all steps necessary to integrate a data warehouse with Monte Carlo, please see here.
Create a read-only database user
Please run the SQL snippets below on your SAP HANA tenant database to create a read-only database user for Monte Carlo.
-- Creates a restricted user. Turns off requirement that this user change their password.
CREATE RESTRICTED USER <username> PASSWORD <password> NO FORCE_FIRST_PASSWORD_CHANGE;
-- Allows user to connect to the database through a client.
ALTER USER <username> ENABLE CLIENT CONNECT;
-- Grants user SELECT access on system views needed for metadata collection.
GRANT CATALOG READ TO <username>;
GRANT SELECT ON SYS.M_DATABASE TO <username>;
GRANT SELECT ON SYS.SCHEMAS TO <username>;
GRANT SELECT ON SYS.M_TABLES TO <username>;
GRANT SELECT ON SYS.M_TABLE_STATISTICS TO <username>;
GRANT SELECT ON SYS.VIEWS TO <username>;
-- For custom monitors, user needs SELECT access on the individual tables.
-- To give the user SELECT privileges on every table in a schema, run:
GRANT SELECT ON SCHEMA <schema_name> TO <username>;
-- To give the user SELECT privileges on a specific table only, run:
GRANT SELECT ON <schema_name>.<table_name> TO <username>;
FAQs
What hosting solutions are supported?
This integration supports any instance that is reachable from one of our hosting options. Please see the networking docs for more detail.
Do you support multi-tenant deployments?
If your SAP HANA deployment has multiple tenant databases that you would like Monte Carlo to monitor, you will need to complete steps Create a read-only database user and Provide service account credentials for each tenant database. Monte Carlo will create a connection for each tenant database and collect metadata on them separately.
How can I find the port number of my tenant databases?
Monte Carlo needs the SQL_port number of each tenant database you want to monitor. The easiest way to find this port is by running this query in the systemDB
:
SELECT DATABASE_NAME, SQL_PORT
FROM SYS_DATABASES.M_SERVICES
WHERE ((SERVICE_NAME='indexserver' and COORDINATOR_TYPE= 'MASTER')
or (SERVICE_NAME='xsengine'));
Which tables does Monte Carlo query and why?
See below for the list of tables MC queries:
SYS.M_DATABASE
: Used to get the name of the tenant database.SYS.SCHEMAS
: Used to list schemas in the tenant database.SYS.M_TABLES
: Used to get a list of tables in a schema and their current row count (volume).SYS.M_TABLE_STATISTICS
: Used to get the last_modify_time of tables in a schema (freshness).SYS.VIEWS
: Used to get a list of views in a schema.
What do the queries Monte Carlo issue look like?
Database Query
SELECT DATABASE_NAME FROM SYS.M_DATABASE;
Schema Query
SELECT SCHEMA_NAME FROM SYS.SCHEMAS
WHERE SCHEMA_NAME NOT IN (<ignored_schemas, see list below>)
AND HAS_PRIVILEGES = 'TRUE';
** Schemas we ignore
_SYS_DATA_ANONYMIZATION
_SYS_EPM
_SYS_LDB
_SYS_PLAN_STABILITY
_SYS_REPO
_SYS_RT
_SYS_SECURITY
_SYS_SQL_ANALYZER
_SYS_STATISTICS
_SYS_TABLE_REPLICAS
_SYS_TASK
_SYS_TELEMETRY
_SYS_WORKLOAD_REPLAY
_SYS_XS
HANA_XS_BASE
SAP_REST_API
SAP_XS_LM
SAP_XS_LM_PE
SAP_XS_USAGE
SYS
SYSTEM
UIS
_SYS_ADVISOR
_SYS_AFL
_SYS_AUDIT
_SYS_BI
_SYS_BIC
Tables and Views Query
SELECT
'{tenant_db_name}' AS catalog_name,
t.SCHEMA_NAME AS schema_name,
t.TABLE_NAME AS table_name,
'table' as table_type,
t.RECORD_COUNT as row_count,
ts.LAST_MODIFY_TIME as last_updated_on
FROM SYS.M_TABLES t
LEFT JOIN SYS.M_TABLE_STATISTICS ts
ON t.SCHEMA_NAME = ts.SCHEMA_NAME
AND t.TABLE_NAME = ts.TABLE_NAME
WHERE t.SCHEMA_NAME IN ({list of schemas})
ORDER BY t.SCHEMA_NAME, t.TABLE_NAME
LIMIT ?
OFFSET ?;
SELECT
'{tenant_db_name}' AS catalog_name,
SCHEMA_NAME as schema_name,
VIEW_NAME as table_name,
'view' as table_type
FROM SYS.VIEWS
WHERE SCHEMA_NAME IN ({list of schemas});
Updated 2 days ago