30 th Sept 2011 Denali – Always On Deep Dive Bob Duffy Database Architect Prodata SQL Centre of Excellence Speaker Profile – Bob Duffy • • • • • • Database Architect at Prodata SQL Centre Excellence, Dublin One of about 25 MCA for SQL Server globally (aka SQL Ranger) SQL Server MVP, 2009+ MCM on SQL 2005 and 2008 20 years in database sector, 250+ projects Senior SQL Consultant with Microsoft 2005-2008 • Blog http://blogs.prodata.ie/bob Agenda • • • • • • • Typical HADR Requirements Always On Installation Migrating to Availability Groups Planned Failover Automated Failover Active Secondary Replicas Integration with Failover Clustering Case Study – Dell DVD Store Enterprises Need to accept 24 x 7 transactions For High Availability Zero down time and data loss (automatic) SLA to cover storage failure “I see web site unavailable and you are fired” “Reporting” Near real time. No more than 1 minute behind With minimal impact on live transactions Disaster Recovery Have second data centre. No more than 30 mins down time and 5 minutes data loss Upgrade Any upgrade needs to be with no down time or data loss Typical HA/DR/Reporting Architecture Application Server(s) DR Site Denali Always On Synchronous Secondary Replica (Automatic) Application Server(s) Asynchronous Secondary Replica Read Only DR Site Secondary Replica (Manual Async) Installation Always On relies on WSFC Always On is NOT a clustered install of SQL Does not need Shared Disk Windows 2008 Server+ EE needed Need to Enable HadrService Via configuration manager Or via PowerShell Some Patches and QFE Needed KB 976097 KB 2494036 Installation Cluster + HadrService Migrating to Always On / Denali Any normal migration is possible: Backup / Restore Attach / Detach Mirroring Log Shipping Replication Might need some post migration tasks Compatibility Use of new features such as Containment Migrating to Denali Creating your First Availability Group Failover to Secondary Quorum Changes Required for automatic failover Built in feature of WSFC Replaces the concept of “witness” to avoid split brain Two recommended Steps 1. Select nodes to vote 2. Select the Quorum Type If Odd number use “Node Majority” If even number use “Node Majority plus File Share Witness” Requires QFE KB KB 2494036 to enable Voting May need to adjust quorum after failover outside automatic target Flexible Failover Policy Flexible Failover Policy provides administrators control over the conditions when an automatic failover should be initiated. Diagnostics Configurable options eliminate false failover Improved logging for better diagnostics New Failure Condition Levels Level Condition Description 0 No Automatic Failure Maintenance Mode 1 Failover or restart on SQL Server down Service Down 2 Failover or restart on SQL Server unresponsive No data from sp_server_diagnostics 3* Failure of restart of critical Services sp_server_diagnostics Return ‘system error’ 4 Failover or restart on moderate server errors sp_server_diagnostics Return ‘resource error’ 5 Failover or restart on any qualified failure conditions sp_server_diagnostics Return ‘query processing error’ Automated Failover Secondary Replica Features Failover Backup \ DBCC Reporting Queries AlwaysOn AutoStat Automatic Page Repair Read Only Replicas Secondary Replica – Active Log Sync Log Capture Network Log Recieve DB1 DB1 Log Pool Commit Redo Thread Log Cache Log Cache Log Flush Log Harden DB1 Log DB1 Data Acknowledge Commit DB1 Log Redo Pages Page Updated DB1 Data Secondary read is behind primary Log is first hardened and then applied Redo thread is asynchronous and runs in the background Latency (typically seconds) can be larger for log intensive operations like bulk import or index create/rebuild Sync Replica minimizes latency due to network issues Auto-Stats on Readable Secondary Query Optimization and Statistics SQL Server uses cost based optimizer that heavily relies on object statistics If statistics are missing, SQL Server automatically creates them and persists Auto-stat on readable secondary will require physical Changes Example: Table T1 (C1, C2, C3) Query on primary with predicate (C3 > 10). SQL Server auto-create the statistics, if needed, on column C3 on primary replica Won’t work on Readable Secondary as it will cause physical changes to the database. Similar issue if the statistical information is stale on the secondary replica. Solution Auto-create statistics on the secondary replica but persist them in TempDB Existing catalog views (e.g. sys.stats) show temporary statistics Unplanned Disaster! Availability Groups and Failover Cluster Instance > Coming up… Speaker Quest Title Room Gruesome, Grotesque and Gory – Horrors Stories from the Aintree World of IT Idera Tricks of the Trade - DBA tools Attunity Data Replication Redefined – best practices for replicating Empire data to SQL Server Fusion-io Myths and Monsters of Flash SQLSentry Real Time and Historical Performance Troubleshooting with Pearce SQL Sentry #SQLBITS Lancaster Derby Questions ? AlwaysOn Resources “Denali” AlwaysOn Resource Center: http://msdn.microsoft.com/enus/sqlserver/gg490638(en-us,MSDN.10) CTP download Documentation MSDN forums Microsoft Connect AlwaysOn Blog Trace Flag 9532 TSQL features Important – Trace Flag 9532 Some Always On Availability Groups features require that the instances of SQL Server that host availability replicas have been enabled for trace flag 9532. For information about which Transact-SQL options require enabling this trace flag, CREATE AVAILABILITY GROUP Transact-SQL options: Using more than two REPLICA ON clauses FAILURE_CONDITION_LEVEL HEALTH_CHECK_TIMEOUT SESSION_TIMEOUT Using any of the following ALTER AVAILABILITY GROUP Transact-SQL options: ADD REPLICA ON REMOVE REPLICA ADD DATABASE REMOVE DATABASE ENDPOINT_URL FAILURE_CONDITION_LEVEL HEALTH_CHECK_TIMEOUT SESSION_TIMEOUT Thank You!