Showing posts with label L1 DBA. Show all posts
Showing posts with label L1 DBA. Show all posts

Mar 14, 2024

Correlated Subquery

Correlated Subquery


Introduction-

 

A correlated subquery is a SQL query in which the values of the inner query (subquery) are derived from the outer query. Essentially, the subquery is conducted many times, one for each row processed by the outer query. This makes correlated subqueries very useful for complex data retrieval scenarios, but they can also be more performance-intensive than non-correlated subqueries due to repeated execution.

 

Here's a step-by-step explanation to understand correlated subqueries in a structured way:


1. Basic Structure-

A correlated subquery typically looks something like this:

 

SELECT column_name(s)

FROM table_name AS outer_table

WHERE column_name OPERATOR

    (SELECT column_name

     FROM table_name AS inner_table

     WHERE outer_table.column = inner_table.column);

 

2. Key Components


Outer Query-

This is the main query that first gets executed up to the point of the subquery. It retrieves data from the outer table.

 

Inner Query (Subquery)-

This query is nested inside the outer query. Unlike a regular subquery, it references columns from the outer query, creating a dependency.

 

Correlation-

The correlation happens through the condition that links the inner and outer queries (e.g., outer_table.column = inner_table.column). This condition is what makes the subquery "correlated" because it must reference a column from the outer query for each row it processes.

 

3. Execution Flow-

The outer query starts executing.

For each row in the outer query, the inner (correlated) subquery executes.

The subquery uses a value from the current row of the outer query to perform its operation.

The result of the subquery is used to complete the operation of the outer query for the current row.

Steps 2-4 repeat for each row processed by the outer query.

 

4. Use Cases


Correlated subqueries are useful for-


Row-by-row operations-

When you need to perform calculations or comparisons on each row individually.

 

Existence checks-

To check if some related data exists in another table for each row.

 

Aggregate calculations-

Such as finding the minimum, maximum, average, etc., of grouped data specific to each row in the outer query.

 

 

5. Performance Considerations-

Since the subquery maybe executed once for each row in the outer query, correlated subqueries can be slow, especially on large datasets. It's often recommended to look for
alternatives, such as JOINs or temporary tables, if performance becomes an issue.

 

Example

Consider two tables, employees and department, where you want to find the names of employees who earn more than the average salary in their respective departments-

 

SELECT e.name, e.salary

FROM employees e

WHERE e.salary > (

    SELECT AVG(salary)

    FROM employees

    WHERE department_id = e.department_id

);

 

In this example, for each employee in the outer query, the subquery calculates the average salary of their department and checks if their salary is above this average. The correlation between the outer and inner query is established through department_id.

Mar 7, 2024

Extended Events in SQL Server

Extended Events in SQL Server


Introduction-

Extended Events (XEvents) is a highly scalable and adaptable event-driven monitoring system that debuted with SQL Server 2008. It is intended to assist database administrators (DBAs) and developers in gathering extensive information on SQL Server behavior, such as performance issues, query execution, and system diagnostics. Extended Events' architecture provides for minimum performance effect, making it appropriate for use in production applications.

 

Let us break down the word "Extended Events" and explain each component:

 

Extended-

The name indicates that this feature goes beyond typical monitoring and tracing tools such as SQL Trace and SQL Server Profiler. It has more events, targets, and configuration choices, making it more adaptable and comprehensive.


Events-

Events are locations of interest within the SQL Server engine where activities take place. Examples include query execution, login attempts, and failures. When an event occurs, the Extended Events system can collect it, along with any pertinent data.


Extended Events are based on several fundamental components:


Events-

These are the happenings in SQL Server that you want to track. Each event is assigned a unique name and contains detailed data about the occurrence.


Event Sessions-

To gather data, you must create an event session. An event session specifies which events to monitor, what data to gather, and how to store the data.


Targets-

The locations where the information gathered during an event session is kept are known as targets. Targets include things like event files, which write the data to disk files, ring buffers, which store the data in memory, and histograms, which aggregate the data according to predetermined standards.


Actions-

Further components, such as the SQL Server process ID or the execution time, can be appended to session events in order to gather further context.


Predicates-

Predicates let you filter occurrences according to particular standards, ensuring that you only gather the information you're interested in. This lessens the effect on system performance and helps to reduce the amount of data that is gathered.


Create Extended Events in SQL Server-


Let us create an Event for 5 seconds. Which query takes 5 seconds to finish will be tracked by this event.

 

IF EXISTS (SELECT 1 FROM sys.server_event_sessions WHERE name = 'DurationOver5Sec')

       DROP EVENT SESSION [DurationOver5Sec] ON SERVER;

GO

CREATE EVENT SESSION [DurationOver5Sec]

ON SERVER

 

ADD EVENT sqlserver.rpc_completed(

       ACTION

       (

               sqlserver.client_app_name -- ApplicationName from SQLTrace

             , sqlserver.client_hostname -- HostName from SQLTrace

             , sqlserver.client_pid     -- ClientProcessID from SQLTrace

             , sqlserver.database_id     -- DatabaseID from SQLTrace

             , sqlserver.request_id     -- RequestID from SQLTrace

             , sqlserver.server_principal_name  -- LoginName from SQLTrace

             , sqlserver.session_id     -- SPID from SQLTrace

       )

       WHERE

       (

                    duration >= 5000000

       )

),

ADD EVENT sqlserver.sql_batch_completed(

       ACTION

       (

               sqlserver.client_app_name -- ApplicationName from SQLTrace

             , sqlserver.client_hostname -- HostName from SQLTrace

             , sqlserver.client_pid     -- ClientProcessID from SQLTrace

             , sqlserver.database_id     -- DatabaseID from SQLTrace

             , sqlserver.request_id     -- RequestID from SQLTrace

             , sqlserver.server_principal_name  -- LoginName from SQLTrace

             , sqlserver.session_id     -- SPID from SQLTrace

       )

       WHERE

       (

             duration >= 5000000

       )

),

ADD EVENT sqlos.wait_info(

       ACTION

       (

               sqlserver.client_app_name -- ApplicationName from SQLTrace

             , sqlserver.client_hostname -- HostName from SQLTrace

             , sqlserver.client_pid     -- ClientProcessID from SQLTrace

             , sqlserver.database_id     -- DatabaseID from SQLTrace

             , sqlserver.request_id     -- RequestID from SQLTrace

             , sqlserver.server_principal_name  -- LoginName from SQLTrace

             , sqlserver.session_id     -- SPID from SQLTrace

       )

       WHERE

       (

             duration > 5000 --This one is in milliseconds, and I'm not happy about that

            AND ((wait_type > 0 AND wait_type < 22) -- LCK_ waits

                    OR (wait_type > 31 AND wait_type < 38) -- LATCH_ waits

                    OR (wait_type > 47 AND wait_type < 54) -- PAGELATCH_ waits

                    OR (wait_type > 63 AND wait_type < 70) -- PAGEIOLATCH_ waits

                    OR (wait_type > 96 AND wait_type < 100) -- IO (Disk/Network) waits

                    OR (wait_type = 107) -- RESOURCE_SEMAPHORE waits

                    OR (wait_type = 113) -- SOS_WORKER waits

                    OR (wait_type = 120) -- SOS_SCHEDULER_YIELD waits

                    OR (wait_type = 178) -- WRITELOG waits

                    OR (wait_type > 174 AND wait_type < 177) -- FCB_REPLICA_ waits

                    OR (wait_type = 186) -- CMEMTHREAD waits

                    OR (wait_type = 187) -- CXPACKET waits

                    OR (wait_type = 207) -- TRACEWRITE waits

                    OR (wait_type = 269) -- RESOURCE_SEMAPHORE_MUTEX waits

                    OR (wait_type = 283) -- RESOURCE_SEMAPHORE_QUERY_COMPILE waits

                    OR (wait_type = 284) -- RESOURCE_SEMAPHORE_SMALL_QUERY waits

       --OR (wait_type = 195) -- WAITFOR

                )

       )

)

ADD TARGET package0.event_file

(

       SET filename = 'DurationOver5Sec.xel',

             max_file_size = 10,

             max_rollover_files = 5

)

WITH

(

       MAX_MEMORY = 10MB

       , MAX_EVENT_SIZE = 10MB

       , STARTUP_STATE = ON

       , MAX_DISPATCH_LATENCY = 5 SECONDS

       , EVENT_RETENTION_MODE = ALLOW_MULTIPLE_EVENT_LOSS

);

 

ALTER EVENT SESSION DurationOver5Sec

ON SERVER

STATE = START;


How to Check the server for events-


This query allows us to simply check the query.

 

 

DECLARE

       @SessionName SysName

       , @TopCount Int = 1000

 

--SELECT @SessionName = 'UserErrors'

SELECT @SessionName = 'DurationOver5Sec'

--SELECT @SessionName = 'system_health'

/*

SELECT * FROM sys.traces

 

SELECT  Session_Name = s.name, s.blocked_event_fire_time, s.dropped_buffer_count, s.dropped_event_count, s.pending_buffers

FROM sys.dm_xe_session_targets t

       INNER JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address

WHERE target_name = 'event_file'

--*/

 

SET STATISTICS IO, TIME ON

 

IF OBJECT_ID('tempdb..#Events') IS NOT NULL BEGIN

       DROP TABLE #Events

END

 

IF OBJECT_ID('tempdb..#Queries') IS NOT NULL BEGIN

       DROP TABLE #Queries

END

 

DECLARE @Target_File NVarChar(1000)

       , @Target_Dir NVarChar(1000)

       , @Target_File_WildCard NVarChar(1000)

 

SELECT @Target_File = CAST(t.target_data as XML).value('EventFileTarget[1]/File[1]/@name', 'NVARCHAR(256)')

FROM sys.dm_xe_session_targets t

       INNER JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address

WHERE s.name = @SessionName

       AND t.target_name = 'event_file'

 

SELECT @Target_Dir = LEFT(@Target_File, Len(@Target_File) - CHARINDEX('\', REVERSE(@Target_File)))

 

SELECT @Target_File_WildCard = @Target_Dir + '\'  + @SessionName + '_*.xel'

 

--SELECT @Target_File_WildCard

 

SELECT TOP (@TopCount) CAST(event_data AS XML) AS event_data_XML

INTO #Events

FROM sys.fn_xe_file_target_read_file(@Target_File_WildCard, null, null, null) AS F

ORDER BY File_name DESC

       , file_offset DESC

 

SELECT  EventType = event_data_XML.value('(event/@name)[1]', 'varchar(50)')

       , Duration_sec = CAST(event_data_XML.value ('(/event/data[@name=''duration'']/value)[1]', 'BIGINT')/CASE WHEN event_data_XML.value('(event/@name)[1]', 'varchar(50)') LIKE 'wait%' THEN 1000.0 ELSE 1000000.0 END as DEC(20,3))

       , CPU_sec = CAST(event_data_XML.value ('(/event/data[@name=''cpu_time'']/value)[1]', 'BIGINT')/1000000.0 as DEC(20,3))

       , physical_reads_k = CAST(event_data_XML.value ('(/event/data  [@name=''physical_reads'']/value)[1]', 'BIGINT')/1000.0 as DEC(20,3))

       , logical_reads_k = CAST(event_data_XML.value ('(/event/data  [@name=''logical_reads'']/value)[1]', 'BIGINT') /1000.0 as DEC(20,3))

       , writes_k = CAST(event_data_XML.value ('(/event/data  [@name=''writes'']/value)[1]', 'BIGINT')/1000.0 as DEC(20,3))

       , row_count = event_data_XML.value ('(/event/data  [@name=''row_count'']/value)[1]', 'BIGINT')

       , Statement_Text = ISNULL(event_data_XML.value ('(/event/data  [@name=''statement'']/value)[1]', 'NVARCHAR(4000)'), event_data_XML.value ('(/event/data  [@name=''batch_text''     ]/value)[1]', 'NVARCHAR(4000)'))

       , TimeStamp = DateAdd(Hour, DateDiff(Hour, GetUTCDate(), GetDate()) , CAST(event_data_XML.value('(event/@timestamp)[1]', 'varchar(50)') as DateTime2))

       , SPID = event_data_XML.value ('(/event/action  [@name=''session_id'']/value)[1]', 'BIGINT')

       , Username = event_data_XML.value ('(/event/action  [@name=''server_principal_name'']/value)[1]', 'NVARCHAR(256)')

       , Database_Name = DB_Name(event_data_XML.value ('(/event/action  [@name=''database_id'']/value)[1]', 'BIGINT'))

       , client_app_name = event_data_XML.value ('(/event/action  [@name=''client_app_name'']/value)[1]', 'NVARCHAR(256)')

       , client_hostname = event_data_XML.value ('(/event/action  [@name=''client_hostname'']/value)[1]', 'NVARCHAR(256)')

       , result = ISNULL(event_data_XML.value('(/event/data  [@name=''result'']/text)[1]', 'NVARCHAR(256)'),event_data_XML.value('(/event/data  [@name=''message'']/value)[1]', 'NVARCHAR(256)'))

       , Error = event_data_XML.value ('(/event/data  [@name=''error_number'']/value)[1]', 'BIGINT')

       , Severity = event_data_XML.value ('(/event/data  [@name=''severity'']/value)[1]', 'BIGINT')

       , EventDetails = event_data_XML

INTO #Queries

FROM #Events

 

SELECT q.EventType

       , q.Duration_sec

       , q.CPU_sec

       , q.physical_reads_k

       , q.logical_reads_k

       , q.writes_k

       , q.row_count

       , q.Statement_Text

       , q.TimeStamp

       , q.SPID

       , q.Username

       , q.Database_Name

       , client_app_name = CASE LEFT(q.client_app_name, 29)

                                 WHEN 'SQLAgent - TSQL JobStep (Job '

                                        THEN 'SQLAgent Job: ' + (SELECT name FROM msdb..sysjobs sj WHERE substring(q.client_app_name,32,32)=(substring(sys.fn_varbintohexstr(sj.job_id),3,100))) + ' - ' + SUBSTRING(q.client_app_name, 67, len(q.client_app_name)-67)

                                 ELSE q.client_app_name

                                 END 

       , q.client_hostname

       , q.result

       , q.Error

       , q.Severity

       , q.EventDetails

FROM #Queries q

--WHERE eventtype NOT IN /*rather typical filtering*/ ('security_error_ring_buffer_recorded', 'sp_server_diagnostics_component_result', 'scheduler_monitor_system_health_ring_buffer_record')

       --AND eventtype NOT IN /*specific troubleshooting filtering*/ ('connectivity_ring_buffer_recorded', 'wait_info')

ORDER BY TimeStamp DESC