Stored Procedures - Guidelines

Introduction

This document is not intended to teach how to write Stored Procedures but aimed at those already familiar with SQL that should observe recommended best practices when creating Stored Procedures for Selligent purposes.

Hence focus is on:

Syntax, style and structure of SQL
Typical situations in which SPs will be used
Situations when SPs should be avoided
Recommendations for structure and style when writing SPs
Things to avoid when writing SPs
Some example SPs for common use cases.

SQL Styles Rules

Following these rules when writing SQL ensures consistency and uniformity across Marigold toolsets, improving maintainability when code is amended by different developers. An added benefit is improved performance when cached plans are reused.

Upper casing of keywords

Always capitalize T-SQL keywords (SELECT, FROM, …) and built-in functions (COUNT, MAX, MIN, …). SSMS users can adjust IntelliSense settings for built-in functions

Naming conventions

Use a consistent naming scheme for tables. A common scheme is: TYPE _ENVIRONMENT where:

ENVIRONMENT is to differentiate between test (_TST) and production (_PRD) for situations where a platform is used for both
‘TYPE’ depends on the object.

Table names

For lists created in the Selligent solution, a default naming prefix exists for the underlying tables:

Audience List: USERS_
Data List: DATA_
Option Lists: OPTION_
Data Selection Lists: ARTICLES_
Device Lists: DEVICES_

It is recommended that additional tables follow a standard naming convention during creation, e.g.:

Data Lists used for extended profiles (1-1) usually begin with: DATA_EXT_
DataLoader tables: SYNC_
DataExporter tables: EXP_
Temporary tables (physical): TMP_
Configuration tables: CFG_

Stored Procedures

When created in Selligent, all Stored Procedures begin with an ST_ prefix. Where possible it is encouraged to add further prefixes to help identify the purpose of the stored procedure:

Dataloader: ST_SYNC_
Dataexporter: ST_EXP_
Processing scripts (e.g.: procedures called by the Job Agent for nightly processing of new data or aggregate calculations): ST_PROCESS_
Report retrieval (e.g.: procedures linked to Reporting applications): ST_REPORT_
Logging: ST_LOGGING_

Other objects

PRIMARY KEYS: use the prefix PK_ (e.g.: PK_tablename for manually-created primary keys)
FOREIGN KEYS: use the prefix FK_
INDEXES:

Use IDX_ as the prefix

Use the key columns in the name of the index, e.g. an index on MAIL uses IDX_MAIL. For combined indexes, list all the key columns in the correct order.

Add _INC if there are included columns

Add _FILTERED if it is a filtered index
Add _UNQ if it is a unique index

CONSTRAINTS: DF_ as prefix (add _UNQ to indicate a unique constraint)

Schema qualified object names

Using schema-qualified object names avoids potential object mismatch by providing the database engine with information about which specific object to call, so will not presume the schema of the current logged-in user (SaaS accounts run under their own schema). For example, instead of:

SELECT name, mobile FROM USERS_CUSTOMERS;

Write:

SELECT name, mobile FROM dbo.USERS_CUSTOMERS;

An additional benefit is plan reuse, which avoids creating multiple plans for the same statement or batch to be executed by users with a different default schema.

Field naming conventions

There are plenty of articles written about the importance of appropriate and meaningful choices of field and object names (columns, stored procedures, parameters, etc). Names should be:

Legal: names should begin with a letter and can only contain letters, numbers and underscores - no spaces. To avoid confusion, it is recommended to avoid using keywords as names. https://msdn.microsoft.com/en-us/library/ms189822.aspx has a list of keywords.
Descriptive: the name should be indicative of its purpose. ADDRESS could refer to home, work, email, IP, MAC etc, so may need the table name to qualify it. Furthermore, it is encouraged to populate that column’s Description with annotation to clarify its purpose.
Succinct: although the maximum length of a name is 64 characters, a longer and more descriptive name introduces a greater risk of mistyping and may be truncated in presentation. Brevity should be observed, but not at the expense of clarity.
Independent of type: do not include the data type in the name. A column called REDEEMED is more obvious than BOOL_REDEEMED, and decoupling the type from the purpose permits the data type to be modified without having to update related references.
List-contextual: columns have meaning appropriate to that list, so a POINTS profile extension can simply use ARCHERY and GOLF rather than ARCHERYPOINTS and GOLF_POINTS.

Aliases

When using Common Table Expressions or subqueries, choose meaningful alias names (rather than aliases of ‘a’, ‘t1’, etc) and ensure they are clear and obvious. For example, a table is called ‘USERS_CONTACTS_PROD’ could use an alias of ‘CONTACTS’.

The same applies for CTEs and subqueries: use descriptive text. For example, use PURCHASES if the tale represents all purchases.

Firstly, identify column aliases with the ‘AS’ syntax, e.g. use:

Copy

SELECT last_purchase_date AS lastbuy;

Instead of:

Copy

SELECT last_purchase_date = lastbuy;

Secondly, all references to table columns should then use any declared table aliases to make it clear which table the field belongs to. For example:

Copy

SELECT customers.id AS customer_identifier
, optin.value AS consent
FROM dbo.USERS_CUSTOMERS AS customers WITH (NOLOCK)
LEFT JOIN dbo.DATA_CUSTOMERS_OPTIN AS optin WITH (NOLOCK)
ON customers.id = optin.userid;

Table aliases are mandatory when statements incorporate more than one object. For statements featuring single object operations, aliases are still recommended to future-proof the statement for later expansion to include multiple objects.

Statement terminators

Future-proof statements with a statement terminator (;). Some current options (e.g.: Common Table Expressions) already require that the previous statement has been properly closed:

Copy

; WITH EXAMPLE_CTE (
SELECT customers.id AS userid
FROM dbo.USERS_CUSTOMERS customers
WHERE created_dt IS NOT NULL
)
SELECT USERID
FROM EXAMPLE_CTE
WHERE CREATED_DT > DATEADD(MONTH, -1, CURRENT_TIMESTAMP);

Joins

Do not use old-style join syntax (using the WHERE clause to perform a join). For example, instead of

Copy

SELECT USERS.MAIL, ORDERS.ORDERDATE
FROM dbo.USERS_CUSTOMERS USERS, dbo.DATA_ORDERS ORDERS
WHERE USERS.ID = ORDERS.USERID;

Instead, adhere to ANSI/ISO 1999 syntax with the following:

Copy

SELECT USERS.MAIL, ORDERS.ORDERDATE
FROM dbo.USERS_CUSTOMERS USERS WITH (NOLOCK) INNER JOIN dbo.DATA_ORDERS ORDERS WITH (NOLOCK)
ON USERS.ID = ORDERS.USERID;

Avoid keyword shortcuts

Using shorthand gives no performance gain and retards readability, so always use fully qualified keywords instead of shorthand equivalents, e.g.:

In datetime functions, use DAY instead of 'DD'
Write WEEKDAY instead of ‘DW’
Use CREATE PROCEDURE instead of ‘CREATE PROC’
Use BEGIN TRANSACTION instead of ‘BEGIN TRAN’
Write INNER JOIN instead of just ‘JOIN’ - make all joins explicit
Write WITH (NOLOCK) instead of just ‘(NOLOCK)’

SQL Data Types

Although all data types could be used within SQL routines (tasks, stored procedures, etc) only certain types are recognised by Selligent. Consequently, only use compatible data types in the situation where database fields are exposed to Selligent functionality (e.g.: a return value from a Stored Procedure or SQL Task).

The SQL data types used by Selligent are:

MMP	MS-SQL Server type
Boolean	BIT
Numeric	INT
Long	BIGINT
Float	FLOAT
Date	DATE
DateTime	DATETIME
Text	NVARCHAR
LongText	NVARCHAR

The types that can be safely exposed to Selligent are:

INT, NVARCHAR, DATETIME, DATE, FLOAT and BIT
NCHAR is permitted but with caveats:
- These fields are safe for PERSONALISATION purposes
- SELECTIONs against these data types adversely impacts SQL plans and introduces performance delays as the platform is unaware of those data types (so performs implicit converts during filtering)

Data types that should not be exposed to Selligent are:

SMALLINT/TINYINT – the platform does not perform boundary-checking
VARCHAR/CHAR/NCHAR are permitted but with caveats:
- These fields are safe for PERSONALISATION purposes
- SELECTIONs against these data types adversely impacts SQL plans and introduces performance delays as the platform is unaware of those data types (so performs implicit converts during filtering)

Data types that should be avoided are:

Deprecated data types (e.g.: NTEXT or TEXT)

Data width and constraints

Make field as narrow as possible, e.g.:

NUMERIC rather than LONG
TEXT rather than LONGTEXT Be accurate where possible about the length and data type of fields, for example: a field that will always be populated should be set to NOT NULL

Syntax formatting

These basic formatting rules improves readability of statements and thus assists with faster understanding and quick amendments:

List each column on a new row
Commas should be added at the beginning of a row and not at the end of a previous row (which permits commenting out specific elements more easily).
Use spaces liberally: for example, add spaces before and after each comparison
Use line breaks liberally: for example, add line breaks after each WHERE clause element, separate multiple statements with at least one blank line, etc

An example of a well-formatted statement would be:

Copy

SELECT customers.ID AS USERID
, promo.CODE
FROM dbo.USERS_CUSTOMERS customers WITH (NOLOCK)
INNER JOIN dbo.DATA_EXT_CUSTOMERS_PROMO promo
ON customers.ID = promo.USERID
WHERE EXISTS (
SELECT 1
FROM dbo.ACTION_CUSTOMERS_PROMO actionpromo
WHERE actionpromo.USERID = customer.ID
AND (
actionpromo.ACTIONCODE = 'NEWYEAR'
OR
actionpromo.EXEC_DT < DATEADD(DAY, -7, CURRENT_TIMESTAMP)
)
)
AND customers.[NAME] = 'Fireworks';

Permitted database objects in Selligent

There are particularly good reasons (mostly related to performance) not to use any of these database objects - by policy, Selligent only permits Stored Procedures.

By default, the Selligent solution does not provide any functionality to develop and test stored procedures, therefore it is advisable to use a local MS-SQL install for development and debugging work. Contact Zeta’s support for assistance in setting up a suitable environment.

Due to the risk that poor code presents to all organisations using that platform, management of stored procedures is restricted to those holding System admin privilege for that install.

Stored procedures must adhere to specific recommended Zeta practises, which include (but are not limited to):

Naming – all Stored Procedures follow a standard naming convention
Documentation – the procedure should be annotated with a standard header
Variable and parameter declaration – definition and structure
BEGIN / END blocks – denoting self-contained grouped statements
TRY / CATCH blocks – handling errors gracefully
OUTPUT and RAISERROR – return values and exit status
Logging – recommended to use ST_LOGGING_SELLIGENT_ROUTINE

Naming convention

The name of the stored procedure should begin with ST_ and can contain only alphanumerical characters and an underscore (_). Attempting to use a name that differs from this convention will report an error and disable the SAVE icon.

Documentation header

Upon creation each stored procedure is furnished with a standard documentation header:

This information should not only be completed but also updated as amendments are made, bringing maintainability benefits such as:

Identifying the procedure’s original author.
Assisting new developers to quickly understand the purpose of the routine.

Lines beginning with a Data Integration Documentation identifier (-- DID -) are parsed using a custom SQL Server Management Studio script which captures information following each tag (header attribute). To avoid truncation, each line should not exceed 200 characters.

Tags serve to document and annotate code blocks, so may appear multiple times (with content concatenated by the script based on the order in which they appear) and can appear anywhere in the code. It is strongly recommended additional tags being placed closer to code blocks rather than collating everything in the header.

The following tags are recognized:

Author — Specifies the code’s original author.
CreationDate — Should contain the original date when the routine was created.
Version — Indicates a X.Y.Z version number (format mentioned later)
Description — Describe the purpose and goal of the routine (without reading the code)
Exceptions — Description of the specific error(s) thrown. Use multiple lines (with tags) for multiple errors thrown.
BusinessRules — Describe the functional purpose of the routine, i.e.: the overall goal to be achieved by this code. Additional business rules can be specified by repeating the ‘DID’ part; again, consider using this tag extensively to document distinct parts of the routine within the code itself rather than to try and explain everything in the header.
LastModifiedBy — Identifies who last made amendments. It is recommended that this developer also annotates their changes accordingly closer to the code.

Variable and parameter declaration

Variables used in stored procedures should be defined at the top of the routine, providing a quick review of which are in use. Each variable should be placed on a new line with their datatype and length (even if optional) as length specification avoids the issue of silent truncation errors. In cases where a variable has a default value, explicitly specify this in the declarations.

For example:

Copy

DECLARE @NAME VARCHAR(50);
DECLARE @AGE INT;
DECLARE @NOW DATETIME = CURRENT_TIMESTAMP;

Parameters should also be treated in the same way:

Properly aligned on a new line each with the coma separator at the front
A data type and length explicitly defined and all properly aligned.
Provide default values for all parameters and align them for easy review.

Any output parameters should be placed at the end of the list. For example:

Copy

CREATE PROCEDURE dbo.ST_SYNC_MYDATALOADER
@FILENAME NVARCHAR(500)
, @CAMPAIGNID INT = NULL
, @RESULT INT = NULL
, @INSERT INT OUTPUT
, @UPDATE INT OUTPUT
, @REJECT INT OUTPUT
, @MSG NVARCHAR(4000) OUTPUT

Try/Catch blocks

Some SQL errors will not halt procedure execution and continue to cause more widespread damage with further processing, for example: a variable specified a data type of INT but is assigned a BIGINT value - an error will be logged but the variable will contain the value NULL.

For that reason, always use TRY / CATCH blocks to stop the routine upon encountering an error so that the error can be gracefully handled. For example:

Copy

DECLARE @LOG NVARCHAR(4000);
BEGIN TRY
-- === write procedure logic here ===
DECLARE @NUMBER INT;
-- !!! Oops, Arithmetic overflow error !!!
SET @NUMBER = 4654651321564163131;
END TRY
BEGIN CATCH
DECLARE @ERROR_MSG NVARCHAR(2000);
DECLARE @ERROR_SEVERITY INT;
DECLARE @ERROR_STATE INT;
-- get all error information
SET @ERROR_MSG = ERROR_MESSAGE();
SET @ERROR_SEVERITY = ERROR_SEVERITY();
SET @ERROR_STATE = ERROR_STATE();
-- log the error first
SET @LOG = '{"event_type":"ERROR","MSG":"' + @ERROR_MSG + '"}';
EXEC [ST_LOGGING_SELLIGENT_ROUTINE] @LOG, @@PROCID;
-- then throw error back up
RAISERROR(@ERROR_MSG, @ERROR_SEVERITY, @ERROR_STATE);
END CATCH

Output, Return and Raiserror

Use the RETURN and OUTPUT options in stored procedures correctly:

OUTPUT – use this parameter to identify data returned from a procedure
RETURN – only use this to provide status information, e.g.: ERROR_NUMBER() / @@ERROR.
RAISERROR – use this function (instead of using the return value of the processor) to raise errors. It is better to use CATCH to trap the error so it can be logged, then RAISERROR to handle it gracefully.

More information can be found at:

https://docs.microsoft.com/en-us/sql/t-sql/language-elements/raiserror-transact-sql

For Tasks in Selligent, note that:

OUTPUT parameters can be used in the next task stage
RETURN values can be used to determine if the procedure fails or not
A hard error in the procedure will make the task fail; RAISERROR will show that exact message.

For Custom Components that use Stored Procedures:

OUTPUT parameters can be scoped and used further in the journey
RETURN values can also be fetched and used further in the journey
A hard error in the procedure will be exposed to the journey, causing that to fail as well (with a default error message).

Logging

Selligent provides ST_LOGGING_SELLIGENT_ROUTINE, a logging framework which writes logging information to the LOG_SELLIGENT_ROUTINES table. Typical examples of usage from within a stored procedure include:

Copy

EXEC dbo.[ST_LOGGING_SELLIGENT_ROUTINE] 'THIS IS A LOG TEST', @@PROCID
SET @MSG = '{"event_type":"ERROR","MSG":"' + ERROR_MESSAGE() + '"}'; 
EXEC dbo.[ST_LOGGING_SELLIGENT_ROUTINE] @MSG, @@PROCID

A simple Stored Procedure should illustrate how logging is performed, for example:

Copy

CREATE PROCEDURE dbo.ST_ProofOfConcept
AS
BEGIN
/* ****** Documentation Template (max 200 chars per line) ************
-- DID - Author: John Doe
-- DID - CreationDate: Today
-- DID - Version: 0.1.0
-- DID - Description: A proof-of-concept to test logging routines
-- DID - Exceptions: Exceptional
-- DID - BusinessRules: TwelveInch Rule
-- DID - LastModifiedBy: ChangedByThem
********* Documentation Template (max 200 chars per line) ******** */
SET NOCOUNT ON;
-- writes 'This is a Log Test' into LOG_SELLIGENT_ROUTINES:
EXEC [ST_LOGGING_SELLIGENT_ROUTINE] 'This is a Log Test', @@PROCID
END;

Invocation

If possible (from Campaign) invoke the stored procedure with:

Copy

EXEC [ST_ProofOfConcept]

From Selligent, this will require an SQL task containing this Stored Procedure, or a Custom Component using this stored procedure added to a custom journey which can be invoked manually.

Checking logs

The LOG_SELLIGENT_ROUTINES table can then be queried using a routine like:

Copy

SELECT TOP 1000 L.ID
, L.SESSIONID
, OBJECT_NAME(L.PROC_OBJECTID) AS 'PROC_OBJECTNAME'
, L.EVENTTYPE
, CONVERT(DATETIME2(0),L.EVENTDT) AS 'EVENTDT'
, OA.CALLSTACK
, CONVERT(NVARCHAR(MAX), DECOMPRESS(L.[LOG])) AS 'LOGMSG' 
FROM DBO.LOG_SELLIGENT_ROUTINES AS L WITH (NOLOCK) 
OUTER APPLY 
( 
SELECT (STUFF(( SELECT '>' + CONVERT(NVARCHAR(256), 
ISNULL(OBJECT_NAME(TRY_CONVERT(INT, S.[VALUE])),'')) AS [text()] 
FROM string_split(L.CALL_STACK,'|') S 
FOR XML PATH ,TYPE).value('.[1]','nvarchar(max)') , 1, 2, '')
) 
) OA(CALLSTACK) 
WHERE 1 = 1 
-- specify the name of the Stored Procedure here: 
AND OBJECT_NAME(L.PROC_OBJECTID) = 'ST_ProofOfConcept' 
ORDER BY L.ID DESC

Those with access to Campaign may use the SQL Pane to conduct such a query. As this feature does not yet exist in Selligent, the only way to surface this information is:

Create a Data Selection List containing a text field sufficiently large to contain the logging information
Create a Stored Procedure that queries the DBO.LOG_SELLIGENT_ROUTINES table then inserts relevant data into that Data Selection List
Create a Custom Component that calls this Stored Procedure
Create a page with a Data Selection that uses the Data Selection List as a data source
Add a repeater to the page to surface this information
Create a Custom Journey that calls the Custom Component (to populate the Data Selection List) then uses this page to display rows of logging information.

An alternative approach is to use an EXPORT TASK to export the logging information to a separate file for analysis, but this may not be a more time-effective method than the journey to expose the information.

Writing T-SQL statements

The following section describes several guidelines to follow when creating T-SQL statements. These provide some best practices on how to use specific options and which methods are best suited in which case.

Cursors

Avoid at all costs; they are unnecessary for 99% of cases and cause overhead.

With (Nolock)

Add this clause to queries (SELECT statements) to ensure table locking is reduced.

Bear in mind this still allows dirty reads (meaning uncommitted data can be returned by this hint). Hence, omit this hint for situations when updated values must be retrieved.

Select into

This is a common approach to copy data from one table to a new one – but only data is copied: constraints, indexes and keys will not be duplicated. Consequently, it is advised to create the table first with the correct structure then use ‘INSERT INTO’ to copy data over.

Similarly, it is wasteful to use ‘SELECT * INTO’ to duplicate all fields when only a few are required: be economical with data retrievals!

Search arguments

Always use proper search arguments as predicates in queries, which benefits plan creation and performance:

Do not use manipulations (functions) on columns in the WHERE clause
Do not use wildcard matches at the start of a string compare (e.g.: avoid LIKE '%.com').
Make use of variables (e.g.: today’s date) for better plan caching.
Parameterize values rather than CONCAT
Use QUOTENAME to prevent possible SQL injection errors

Update

When using the UPDATE statement, ensure:

Use a WHERE clause to modify only specific rows
When working on intermediate result sets, there can be performance gains by creating a new table then inserting records rather than updating an intermediate table.

Delete

Avoid DELETE statements when manipulating data through ‘temporary’ or ‘real’ tables; I/O overhead and fragmentation means benefits of using smaller intermediate tables are lost.

Ensure the initial selection considers the proper filters directly or insert the result set into a new temporary table.

Merge

Although a powerful way to upsert data, the equivalent UPDATE/INSERT/DELETE may still run faster – in most cases, it is better to use UPDATE and INSERT operations instead. However if MERGE is to be used, consider the following points:

The source table has proper datatypes - avoid NVARCHAR(MAX) types.
The target selection should be proper, so only the fields needed for the MERGE
The join predicate between the TARGET and SOURCE table should be based on indexed fields, preferably UNIQUE indexes or PRIMARY KEYs – else SQLServer will not be able to generate a fast SQL execution plan.
Any transformation of data must be done in the USING clause for the SOURCE selection.
The data width of the target selection should not exceed the SQL server 8060 bytes block limit, else spills and table spools for the selection with MERGE will slow down performance significantly.
NEVER add extra constraints in the WHEN MATCHED or WHEN NOT MATCHED clause of the MERGE – this causes the MERGE statement to treat the results like a cursor, degrading performance.

Copy

MERGE dbo.USERS_CUSTOMERS AS TARGET 
USING (SELECT MAIL, [NAME] 
FROM SYNC_USERS_CUSTOMERS WITH (NOLOCK) WHERE MAIL IS NOT NULL) AS SOURCE 
ON (TARGET.MAIL = SOURCE.MAIL) 
--When records are matched, update the records if there is any change 
WHEN MATCHED 
THEN UPDATE SET TARGET.MAIL = SOURCE.MAIL , 
TARGET.CREATED_DT = CURRENT_TIMESTAMP 
--When no records are matched, insert the incoming records from source table to target table 
WHEN NOT MATCHED 
THEN INSERT (MAIL, [NAME]) VALUES (SOURCE.MAIL, SOURCE.[NAME]);

Dynamic SQL

Dynamic SQL plans cannot be cached, so they cause a recompile of the SQL execution plan for each invocation – cached plans are better for performance. However, if Dynamic SQL is needed, adhere to the following practice:

Always invoke the dynamic SQL through the following syntax: EXEC SP_EXECUTESQL …
All parameters should be provided using variables passed to the ‘SP_EXECUTESQL’ syntax: EXEC SP_EXECUTESQL @SQL, N’@VALUE INT’, @VALUE = @VALUE; So no concatenation of the SQL string just to add a value.
If concatenation is needed for column names or table names, use the ‘QUOTENAME’ function to mitigate the risk of erroneous SQL, for example:

Copy

SET @SQL = 'SELECT @MINID = MIN(ID) 
FROM dbo.' + QUOTENAME(@TABLENAME) + ' WITH(NOLOCK)';

CTE

Common Table Expressions have their benefits, such as:

Improves readability and maintainability
Provides recursive programming via self-referring tables – e.g.: queries walking itself via self-joins
Quickly narrowing down data used with windowing functions (ROW_NUMBER, RANK, …), to calculate rolling averages

However, a CTE will not persist or pre-calculate any data; although it provides similar functionality to a view, the definition is nor stored in metadata so the SQL engine re-executes the same code each time it is used. If a CTE is to be referenced multiple times in the same statement, consider instead intermediate (temporary) tables to hold results, which can be optimized for multiple uses.

Computed columns

The two approaches are:

Non-persisted — the value is not stored but calculated at the moment of retrieval (I.e.: evaluated as late as possible).
Persisted — the value is calculated during a data change, then stored in the table (i.e.: evaluated as early as possible). Due to the additional performance overhead required by non-persisted, this approach is preferred.

Computed columns cause an overhead when manipulating data but can simplify queries, even removing the need for Views. Consider the following situation, in which a filter on email domain names takes the form of LIKE comparisons in a segment, or specific RIGHT/SUBSTRING operations - meaning a performance overhead each time a segment is executed as the comparison or string operation is unnecessarily repeated.

An alternative approach is to make a persisted computed column, for example taking the domain part of an email address:

Copy

ALTER TABLE dbo.users_vouchers 
ADD domain AS RIGHT(mail, CHARINDEX('@', REVERSE(mail))-1) PERSISTED;

In this situation, the domain is determined when the MAIL address is created (or altered), performing the string splitting operation once up-front – so that the value is available for successive queries and comparison operations. This approach means the initial overhead of a single additional write operation is a worthwhile investment to save time for frequent read operations in future. Note also that adding an index to a computed column makes it persisted.

Note that this computed field (DOMAIN) can even be indexed or included in an index, but the amount of expected read and write operations should justify the use of computed columns. Review the data model and typical business use of the data carefully to justify this approach.

Temporary tables / Table variables

Temporary Tables are best avoided where possible, as they are stored in TempDB which is heavily used by the platform as well as shared by many customers on one instance – so overuse of TempDB incurs a performance hit with a wide area of impact.

Table Variables are treated as Temporary Tables but SQL Server will always return an estimated row count of one record, and suffer from the same consequences as Temporary Tables - so best avoided for the reasons mentioned earlier.

Stored procedures for tasks

Every main type of task (SQL, IMPORT, EXPORT) for ETL processing requires a stored procedure at the core of its setup:

For an EXPORT TASK (extract) the stored procedure will be a query that defines the dataset to export.
For an IMPORT TASK (load), the stored procedure migrates newly-imported data from a source staging table to the destination production table.
For an SQL TASK (transformation), the stored procedure will process already-existing data to modify it according to business objectives.

Example export code

For a simple example, this procedure summarizes the number of contacts per language to give an idea of audience list demographics:

Copy

CREATE PROCEDURE ST_show_languages
@resultCode INT OUTPUT -- OUTPUT parameter = what's sent out
AS
BEGIN
/* ******************* Documentation Template (max 200 chars per line) ********************
-- DID - Author: Dave
-- DID - CreationDate: 2019-07-23
-- DID - Version: 0.1.0
-- DID - Description: Just used to group and total by language
-- DID - Exceptions: Exceptional
-- DID - BusinessRules: RulerOfBusiness
-- DID - LastModifiedBy: LastModifiedBy
******************* Documentation Template (max 200 chars per line) ******************** */
SET NOCOUNT ON;
SET @resultCode = 99; -- don't default to automatic success
BEGIN TRY -- try to query the audience list here
SELECT language, count(*) AS total
FROM users_newsletter_contacts
WHERE optout = 0
GROUP BY language
ORDER BY total DESC;
SET @resultCode = 0; -- success!
END TRY;
-- +++ OUCH! Something's gone wrong! +++
BEGIN CATCH
SET @resultCode = 4; -- SQL error somewhere
END CATCH;
-- report the results:
RETURN @resultCode;
END

Example import code

This is used for a data loader in which SYNC_IMPORT_TABLE is the list holding the newly-imported data, used to update both an audience list (USERS_CUSTOMERS) and a profile extension (DATA_EXT_ADDRESS) off that list.

Copy

CREATE PROCEDURE ST_SYNC_DataLoaderExample @FILENAME NVARCHAR(500) , 
@INSERT INT OUTPUT , 
@UPDATE INT OUTPUT , 
@REJECT INT OUTPUT , 
@MSG NVARCHAR(4000) OUTPUT AS BEGIN 
/* ******************* Documentation Template (max 200 chars per line) ******************** -- 
DID - Author: Author -- 
DID - CreationDate: Creation Date -- 
DID - Version: 0.1.0 -- 
DID - Description: Description -- 
DID - Exceptions: Exceptions -- 
DID - BusinessRules: Rule -- 
DID - LastModifiedBy: LastModifiedBy 
******************* Documentation Template (max 200 chars per line) ******************** */ 
SET NOCOUNT ON; 
DECLARE @LOG NVARCHAR(4000);
BEGIN TRY
SET @INSERT = 0; 
SET @UPDATE = 0; 
SET @REJECT = 0; 
SET @MSG = '';
SET @LOG = 'File [' + @FILENAME + '] is being processed.'; 
EXEC [ST_LOGGING_SELLIGENT_ROUTINE] @LOG, @@PROCID;
-- Create matching table 
IF OBJECT_ID('TEMPDB..#MATCH') IS NOT NULL 
BEGIN; 
DROP TABLE #MATCH; 
END;
CREATE TABLE #MATCH( 
SYNCID INT NOT NULL , 
USERID INT NOT NULL , 
PRIMARY KEY (SYNCID, USERID) 
-- Both fields as not just the audience list will be updated 
); 
-- Add existing matches on audience list to matching table 
INSERT INTO #MATCH(SYNCID, USERID) 
SELECT SYNC.ID, USERS.ID 
FROM dbo.SYNC_IMPORT_TABLE SYNC 
INNER JOIN dbo.USERS_CUSTOMERS USERS WITH (NOLOCK) 
ON USERS.MAIL = SYNC.MAIL ; 
-- AUDIENCE LIST: Add new records if it does not yet exist and output to matching table 
MERGE INTO dbo.USERS_CUSTOMERS TARGET 
USING dbo.SYNC_IMPORT_TABLE SYNC 
ON SYNC.MAIL = TARGET.MAIL 
WHEN NOT MATCHED 
THEN INSERT (MAIL, CREATED_DT) 
VALUES (SYNC.MAIL, CURRENT_TIMESTAMP)
OUTPUT SYNC.ID, INSERTED.ID 
INTO #MATCH (SYNCID, USERID);
-- === PROFILE EXTENSION: Update existing records === 
UPDATE ADR 
SET STREET = SYNC.STREET , HOUSENUMBER = SYNC.HOUSENUMBER , CITY = SYNC.CITY 
FROM DATA_EXT_ADDRESS ADR 
INNER JOIN #MATCH MAT 
ON ADR.USERID = MAT.USERID 
INNER JOIN dbo.SYNC_IMPORT_TABLE SYNC 
ON MAT.SYNCID = SYNC.ID 
; 
-- === PROFILE EXTENSION: Insert new records === 
INSERT INTO dbo.DATA_EXT_ADDRESS (STREET, HOUSENUMBER, CITY, USERID) 
SELECT SYNC.STREET, SYNC.HOUSENUMBER, SYNC.CITY, MAT.USERID 
FROM dbo.SYNC_IMPORT_TABLE SYNC 
INNER JOIN #MATCH MAT 
ON SYNC.ID = MAT.SYNCID 
WHERE NOT EXISTS ( 
SELECT 1 FROM dbo.DATA_EXT_ADDRESS ADR WITH (NOLOCK) 
WHERE ADR.USERID = MAT.USERID 
);
END TRY 
BEGIN CATCH
DECLARE @ERROR_MSG NVARCHAR(2000);
DECLARE @ERROR_SEVERITY INT; 
DECLARE @ERROR_STATE INT; 
SET @ERROR_MSG = ERROR_MESSAGE(); 
SET @ERROR_SEVERITY = ERROR_SEVERITY(); 
SET @ERROR_STATE = ERROR_STATE(); 
SET @LOG = '{"event_type":"ERROR","MSG":"' + ERROR_MESSAGE() + '"}'; 
EXEC [ST_LOGGING_SELLIGENT_ROUTINE] @LOG, @@PROCID; 
RAISERROR(@ERROR_MSG, @ERROR_SEVERITY, @ERROR_STATE); 
END CATCH;
END;

Example SQL tasks

Some internal manipulation, but how this is then added to a task, I.e.: create the SPs, then create a task and drag it onto the canvas with success/fail paths.

Copy

ALTER PROCEDURE ST_SQL_INT_FORMATION_EX1_SP1 (@ERROR INT OUTPUT, @EXEC_DT NVARCHAR(50) OUTPUT, @OMSG NVARCHAR(4000) OUTPUT) 
AS 
BEGIN 
SET NOCOUNT ON; 
SET @ERROR = 0 
SET @EXEC_DT = CONVERT(NVARCHAR,GETDATE(),111) 
SET @OMSG = 'The procedure SQL_INT_FORMATION_EX1_SP1 was started at ' + CONVERT(NVARCHAR,@EXEC_DT) 
SET @OMSG = ISNULL(@OMSG,'') + 'The procedure SQL_INT_FORMATION_EX1_SP1 ended at '+ 
CONVERT(NVARCHAR,GETDATE(),111) + ' AND HAS RETURN VALUE ' + CONVERT(NVARCHAR,ISNULL(@@ERROR,0)) 
SET @ERROR = @ERROR + ISNULL(@@ERROR,0) 
RETURN @ERROR 
END

Reporting journey based metrics

Copy

SELECT METRICS.CAMPAIGNID, METRICS.ACTIONID, CAMP.[NAME], MAIL.[NAME] , 
METRICS.TARGETCOUNT, METRICS.DELIVERYCOUNT, METRICS.BOUNCECOUNT 
FROM CAMPAIGNS CAMP WITH (NOLOCK) 
INNER JOIN SIM_REPORTING_FLOWMETRICS METRICS WITH (NOLOCK) 
ON CAMP.ID = METRICS.CAMPAIGNID 
INNER JOIN CAMPAIGN_ACTIONS ACTIONS WITH (NOLOCK) 
ON METRICS.ACTIONID = ACTIONS.ACTIONID AND METRICS.CAMPAIGNID = ACTIONS.CAMPAIGNID 
INNER JOIN MAILS MAIL WITH (NOLOCK) 
ON MAIL.ID = ACTIONS.MAILID 
WHERE CAMP.CREATED_DT < DATEADD(MONTH, -1, CURRENT_TIMESTAMP);

Transactional journey feed

One common use for SQL TASKS is to preload an (ARTICLE) list with pending communications then trigger a Transactional Journey to process those records. This is a preferable approach to trigger a mass-mail by batch-processing all the records first then triggering the journey once, rather than triggering the journey with numerous API calls.

Copy

ALTER PROCEDURE ST_SQL_TRG_TRAN_RECEIPT_JOURNEY 
AS 
BEGIN 
/* ******************* Documentation Template (max 200 chars per line) ******************** 
-- DID - Author: Author -- 
DID - CreationDate: Creation Date -- 
DID - Version: 0.1.0 -- 
DID - Description: Description -- 
DID - Exceptions: Exceptions -- 
DID - BusinessRules: Rule -- 
DID - LastModifiedBy: LastModifiedBy ******************* Documentation Template (max 200 chars per line) ******************** */ 
SET NOCOUNT ON;
;WITH LATEST_ORDERS ([USERID],[ORDERID],[ORDERCODE],[RN]) AS 
(
SELECT [USERID] 
, [ID] AS [ORDERID] 
, [ORDERCODE] 
, ROW_NUMBER() OVER (PARTITION BY [USERID] ORDER BY [ORDER_DATE] DESC) AS RN 
FROM DBO.[DATA_ACA_DATA_RETAILER_ORDERS] ORD 
) 
-- === this is ACTION_AL_TX_[orgIDstring]_[JourneyID] ... 
-- e.g.: the 32-character organisation of abcd1234-dead-beef-9999-ab12cd34ef56 
-- and FlowID of 9876 (read from the journey properties) makes 
-- "abcd1234deadbeef9999ab12cd34ef56_9876" 
INSERT INTO [ACTION_AL_TX_abcd1234deadbeef9999ab12cd34ef56_9876] 
(USERID,ACTIONCODE,CONTENTDATA,TXARRAYFIELD1) 
SELECT 
U.[ID] 
, 'RECEIPT' AS ACTIONCODE 
, '{"ORDERCODE":"' + [ORDERCODE] + '"}' AS CONTENTDATA 
, '[{' + 
( SELECT STUFF(( 
SELECT 
'{ "ID":"' + CONVERT(NVARCHAR(100),ISNULL(OL.ID,'')) + 
'","PARAM":"RECEIPT","CONTENT":{"ORDERCODE":"' + 
CONVERT(NVARCHAR(100),ISNULL(O.ORDERCODE,'')) +
'","ORDER_LINE_TOTAL":"' + CONVERT(NVARCHAR(100),ISNULL(OL.TOTAL_PRICE,'')) + 
'","PRODUCT_ID":"' + CONVERT(NVARCHAR(100),ISNULL(OL.PRODUCTID,'')) + 
'","AMOUNT_PRODUCTS":"' + CONVERT(NVARCHAR(100),ISNULL(OL.AMOUNT,'')) + 
'","PRODUCT_NAME":"' + CONVERT(NVARCHAR(100),ISNULL(PROD.[PRODUCTNAME],'') ) + 
'","PRODUCT_DESCRIPTION":"' + CONVERT(NVARCHAR(100),ISNULL(PROD.[PRODUCTDESCRIPTION] ,'') ) + 
'"}},' 
FROM DBO.[DATA_ACA_DATA_RETAILER_ORDERS] O 
INNER JOIN DBO.[DATA_ACA_RETAILER_ORDERLINES] OL ON O.[ID] = OL.[ORDERID] 
INNER JOIN DBO.[DATA_ACA_RETAILER_PRODUCTS] PROD ON PROD.[ID] = OL.[PRODUCTID] 
WHERE LORD.ORDERID = O.ID 
FOR XML PATH ('')), 1, 1, '')) + ']'
FROM dbo.[USERS_ACA_USR_RETAILER] U 
INNER JOIN LATEST_ORDERS LORD 
ON LORD.[USERID] = U.[ID] AND LORD.[RN] = 1 
WHERE U.TESTUSER = 1;
END

Raising custom events

Inserting records into a Scheduled Custom Event is a useful method used to drive notifications – all that is needed is the underlying table name which can be found in the Advanced Settings of the event queue's properties itself:

For example, a Bounce Stored Procedure to notify someone that an OPTOUT has taken place due to the mail quality settings:

Copy

ALTER PROCEDURE ST_exampleBounceNotification 
@LISTID INT, 
@USERID INT, 
@EMAIL NVARCHAR(255), 
@BOUNCE_THRESHOLD_REASON NVARCHAR(50), 
@INQUEUEID BIGINT = -1, 
@STATE INT = -1 
AS
BEGIN 
/* ******************* Documentation Template (max 200 chars per line) ******************** 
-- DID - Author: Dave 
-- DID - CreationDate: Back in 2019 
-- DID - Version: 0.1.0 
-- DID - Description: Responds to an optout bounce by raising a custom event 
-- DID - Exceptions: 
-- DID - BusinessRules: Note the Custom Event needs to be created first 
-- DID - LastModifiedBy: LastModifiedBy 
******************* Documentation Template (max 200 chars per line) ******************** */ 
SET NOCOUNT ON;
-- ==== let's try to inject the record here ============ 
BEGIN TRANSACTION;
BEGIN TRY 
INSERT INTO EVENTS_469122_ANNUALDAY 
/* Custom Event tablename */ 
( 
user_id, 
created_dt, 
bounced_address, 
/* what address bounced? */ 
bounce_reason /* what was the given reason? */ 
) 
VALUES 
( 
1, /* inform person 1 in MANAGERS list */ 
getdate(), /* today's date/time */ 
@EMAIL, /* bad email */ 
@BOUNCE_THRESHOLD_REASON /* bounce reason */ 
) 
END TRY
BEGIN CATCH 
SELECT ERROR_MESSAGE() AS ErrorMessage; 
ROLLBACK TRANSACTION; 
END CATCH;
COMMIT TRANSACTION;
END

Note: This Stored Procedure presumes there is a Journey that responds to events being raised, so will send a notification to whomever has MASTER.ID = 1 in the MANAGERS audience list.