Running SAP Applications on the Microsoft Platform

Max degree of parallelism (MaxDOP)


The configuration option "Max degree of parallelism" (MaxDOP) defines how many processors SQL Server can use in parallel to execute a single SQL query. Setting this option to a value other than 1 may decrease the response time of a single, long-running SQL command. However, it will also decrease the overall throughput and concurrency of SQL Server, as parallel executions consume more system resources than serial executions. Running with MaxDOP = 1 can increase the runtime of an individual query (compared to a parallel execution) but will increase concurrency (more users can work in parallel) and overall throughput. For a long time SAP recommended setting MaxDOP to 1 to disable intra-query parallelism completely, the rationale being that the typical ERP query is small and does not return many rows. Another reason to force a MaxDOP of 1 was to provide predictable performance for queries that get executed thousands of times throughout the day under different load conditions of the underlying server infrastructure.

In order to give a bit more background, let's briefly touch on how SQL Server's query parallelism works. Before a potentially parallel query is executed, SQL Server checks the available resources: the number of available processors, the number of available SQL Server worker threads and the available memory determine the number of processors used to service the query. Based on those checks, a query might not be executed using all CPU threads available, but only a few of them (or even just one).
When executing a query in parallel, multiple streams of data are sorted in parallel and merged afterwards. This leads to higher buffer usage in cache and in addition, the CPU resource consumption by a single query usually increases with the degree of parallelism.
It is important to note that the MaxDOP value, whether set in the global configuration or via query hints as done by SAP BW, is applied to the individual execution branches (parallel operators) of a query. That can result in a much higher number of SQL Server worker threads engaged in the execution of a query than the MaxDOP setting. This is easy to imagine for more complex queries joining several tables: we might end up with a plan that scans all tables involved in parallel. In such a case the MaxDOP setting is applied to each scan, so the scans alone can allocate the number of tables multiplied by MaxDOP worker threads. MaxDOP can then be applied again for every single join, and so on. In the end, complex queries can allocate far more worker threads than the MaxDOP setting suggests.
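
To make this visible on a test system, the following sketch counts the worker tasks per running request while a query is executing; with a complex parallel plan the task count can noticeably exceed the configured MaxDOP. It only uses the standard dynamic management views.

SELECT r.session_id,
       r.request_id,
       COUNT(*) AS active_tasks   -- can exceed MaxDOP for complex parallel plans
FROM sys.dm_exec_requests AS r
JOIN sys.dm_os_tasks AS t
  ON t.session_id = r.session_id
 AND t.request_id = r.request_id
GROUP BY r.session_id, r.request_id
ORDER BY active_tasks DESC
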
Although an isolated query executed in parallel generally executes much faster, there is a point at which parallel execution becomes inefficient and can even extend the execution time. For example, parallel queries performing small joins and aggregations on small data sets can become inefficient when they run on all 64 CPU threads of a server. Because the degree of parallelism is chosen at execution time, response times for the same query can vary depending on resource availability. The most severe effect is the varying response times experienced by end users. Like administrators, end users want predictability in a system: they want predictable performance in the areas most important to them. Allowing SQL Server to execute queries in parallel can compromise this predictability, because run-time decisions on parallel execution can lead to different run times for the same query.

Things to consider when changing MaxDOP to a value other than 1 

  • SQL Server has to create multiple plans for each statement, one for the parallel and one for the serial execution. So the compilation time will be higher for all statements and the system needs more memory for storing SQL plans.
  • Increasing MaxDOP will increase the need for system resources (CPU and memory) for any parallel query. Modern hardware with many CPUs and huge amounts of RAM can provide these resources. See note 1612283 for more details.
  • The runtime of a statement can vary, depending on the type of execution (parallel or serial). The type of execution is determined through the SQL Server optimizer at runtime, depending on the load on the system and availability of system resources. A query that can run in parallel in the morning (as system load is low) might run in serial in the afternoon with a different runtime. This is especially true for small virtual or physical machines with limited system resources.

With the advent of increasing numbers of cores and therefore CPUs per physical socket (see this blog for details) and with the enhanced ability of SQL Server 2012 to handle parallel queries more gracefully, the recommendation to only use a MaxDOP setting of 1 has been relaxed for this and later releases of the product. The new recommendation from SAP regarding the MaxDOP setting is as follows:

  • It applies only to SQL Server 2012 or higher.
  • For SAP BW systems there is no change; the old recommendation of 1 is still valid. See note 1654613 for details.
  • If the server has fewer than 16 CPUs, the old recommendation of MaxDOP = 1 is still valid.
  • If the system has 16 to 63 CPUs, you might set it to 1 or 2.
  • If there are 64 or more CPUs available, valid settings are 1, 2, 3, or 4.
  • A value of zero (0) is not allowed at any time. With 0, SQL Server decides at run time how many threads to use for a query (from 1 up to the number of CPUs, but not necessarily all of them).

The SAP alerting functionality will leverage these new MaxDOP thresholds as of the basis support packages listed in SAP Note 2133120.

You can find this new MaxDOP setting documented in the SAP SQL Server Configuration notes 1702408 (SQL Server 2012) and 1986775 (SQL Server 2014) as well.

The SAP EarlyWatch Alert will change according to this new recommendation. The flow diagram below illustrates the new colored warning levels for the MaxDOP setting in the EarlyWatch report; read the diagram from left to right.

[Flow diagram: EarlyWatch Alert warning levels for the MaxDOP setting]

To change the MaxDOP setting you can execute the following SQL script via the SQL Server Management Studio:

use master
go
exec sp_configure 'show advanced options', 1     -- required to change this advanced option
reconfigure with override
go
exec sp_configure 'max degree of parallelism', 2 -- replace 2 with the desired value
reconfigure with override
go

The change will be active instantly without the necessity to restart SAP or SQL Server.
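
To verify which value is currently active, a quick check against sys.configurations can be used; value_in_use reflects the running value after the reconfigure:

SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'max degree of parallelism'
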

As you can see, we are careful with our recommendations for setting MaxDOP on SAP systems. Different DBMS products follow different implementations of query parallelism. Some DBMS products, especially systems that keep data memory-resident, claim that they want to provide all CPU resources to a single query to maximize its performance and have all other queries wait. The argument for such a strategy is that queries execute so fast when using all CPU resources that, even under concurrency, the wait time of the other queries ends up lower than it would be if CPU resources were split across different queries. This approach may work under certain circumstances for a very specific OLAP type of workload (BW), but it breaks badly for OLTP types of workload (ERP, CRM etc.). In SQL Server the idea is to maximize concurrency, something which definitely benefits OLTP or hybrid workloads. With the settings suggested above, we think we have a very good compromise, especially for SAP ERP workloads: on the one side we do not overload CPU resources by engaging many worker threads for a single query; on the other side, the queries that do get executed in parallel are of little complexity and should still be executed efficiently.

Eager to hear your feedback after changing the settings.


How to Size SAP Systems Running on Azure VMs


Rahat Ahmad from our Partner Architect Team has contributed this blog.

  

In one of the partner workshops last month, we had an interesting conversation around SAP sizing principles and options on Microsoft Azure.

 

As one of the prerequisites for SAP certification to run SAP applications on Azure, Microsoft benchmarked some of the virtual machine types that are certified by SAP for Azure (e.g. A6, D14). These certifications are publicly available here. SAP benchmarks are done using a specific combination of operating system, database and SAP application releases. For example, if you look here, you will find the operating system Windows Server 2012 DC Edition, SQL Server 2012 and SAP EhP 5 for SAP ERP 6.0. Note also the CPU utilization of the central server, which is 99%.

So, the question is: SAP recommends sizing to be conducted targeting a maximum of 60% - 70% CPU utilization. As a normal practice, do we need to scale down the benchmarked SAPS when sizing for Microsoft Azure? If we take any benchmark certification from SAP, let's say certification number 2014040, we get 18,770 SAPS at 99% CPU utilization according to this benchmark. If I need to map this to virtual machines (VMs) available on Azure, what are the recommendations?

 

Let me try to explain:

There are two recommended ways SAP sizing is done.

1st SAP Quicksizer: we have three options: 1. Quicksizer for Initial Sizing, 2. Quicksizer for Advanced Sizing and 3. Quicksizer for Expert Sizing. These options are used to perform user or throughput based sizing. The accuracy of SAP Quicksizer results is related to the quality of the input information.

2nd Reference Sizing: Sizing based on the comparison of ST03 and other actual productive customer performance data to another known customer system with the same or similar performance data and a known hardware configuration. 

 

Example: An on-premises customer system with a database size of 1TB and 1,000 users processes 3 million dialog steps per day total.  Out of this 3 million dialog steps, 1 million is Dialog, 1 million is RFC and the balance is other task types.  The month end peak dialog steps/hour is 400,000/hour.  ST03 data reports 400 low, 400 medium and 200 high usage users.  The workload is compared to a different known customer deployment and found to be nearly identical to another customer running a dedicated DS14 VM with SQL Server datafiles and log files on Premium Storage connected to 4 DS12 VMs running two SAP application instances per VM.  Each SAP application server instance has 50 workprocesses.  It is therefore possible to recommend a similar configuration to the on-premises customer after validating the annual growth rates etc.       

Sizing for SAP systems is usually an iterative process, and arriving at the actual compute resource requirements depends closely on the quantity and quality of information available at the time of sizing. The key is to identify at what stage a customer is in their SAP project lifecycle. If a customer is starting their journey with SAP, the SAP Quicksizer is used. However, if a customer is already running SAP in production, either expert sizing using the Quicksizer or reference-based sizing is done.

It is critical that we clearly distinguish between the sizing results derived using SAP Quick Sizer and the one derived out of existing systems on the underlying (existing) hardware (Reference sizing). When sizing exercise is done using SAP Quicksizer, the recommended CPU utilization is factored in. The CPU sizing results are calculated against an average target CPU utilization of 65% for throughput-based sizings (and 33.3% for user-based sizings) to achieve predictable server behavior. Ideally, you would observe 65% CPU utilization if you ran the same processes used in Quicksizer and purchased hardware to meet the CPU sizing recommendations, which are measured according to the SAP Application Performance Standard (SAPS).

As the Quicksizer tool calculates for 65% utilization, you can use this value to check against existing benchmark results. You do not have to do any further calculations.

Therefore, if we have visibility into how the sizing results (SAPS) were arrived at and what was sized using the SAP Quicksizer, we are in a good position to map to the SKUs (compute resources) available on Azure. However, if we do not have that visibility, we need to be conservative and add buffers before mapping to Azure VMs.

When looking into existing SAP systems, on the other hand, we need to see how the customer defines the SAPS. It is not recommended to do a simple 1:1 mapping of current on-premises CPU and RAM to Azure VM types; this will typically lead to significantly oversized solutions.

If we simply take the SAPS number of the existing hardware and map it to a VM, we need to know the CPU resource consumption of that hardware as well (the EWA can provide this information). If, on the other hand, we get a SAPS number where the customer has already scaled the SAPS down (the server has 40K SAPS, but we only use 50% CPU, hence we need 20K SAPS), then we need to take a buffer into account, since we don't want to run the servers at 100%.

 

Coming back to mapping the SAPS requirements to VMs on Azure, let us look at an example.

 

A company is running SAP EhP6 for SAP ERP 6.0. The SAPS required for this system is:

Total SAPS required: 36760

SAPS for DB: 3500

SAPS for App: 33260

Note: For simplicity, let us ignore IOPS, Storage and other typical requirements.

Following is a table with a few of the VMs available on Azure (not exhaustive) along with the benchmarked 2-tier SAPS.

VM Type | # of CPU cores | Memory size (GB RAM) | SAPS   | High-speed local SSD (GB) | # of attachable disks to maximize IOPS
--------|----------------|----------------------|--------|---------------------------|---------------------------------------
A5      | 2              | 14                   | 1,500  | -                         | 4
A6      | 4              | 28                   | 3,000  | -                         | 8
A7      | 8              | 56                   | 6,000  | -                         | 16
A8      | 8              | 56                   | 11,000 | -                         | 16
A9      | 16             | 112                  | 22,570 | -                         | 16
A10     | 8              | 56                   | 11,000 | -                         | 16
A11     | 16             | 112                  | 22,570 | -                         | 16
D11     | 2              | 14                   | 2,338  | 100                       | 4
D12     | 4              | 28                   | 4,675  | 200                       | 8
D13     | 8              | 56                   | 9,350  | 400                       | 16
D14     | 16             | 112                  | 18,770 | 800                       | 32
DS11    | 2              | 14                   | 2,338  | 28                        | 4
DS12    | 4              | 28                   | 4,675  | 56                        | 8
DS13    | 8              | 56                   | 9,350  | 112                       | 16
DS14    | 16             | 112                  | 18,770 | 224                       | 32

Table 1: Azure VMs for running SAP applications.

To identify suitable virtual machines on Azure, I need to know whether the SAPS were derived using the SAP Quicksizer or from existing hardware running the same SAP system.

If the SAPS are derived using SAP Quicksizer, I will assume CPU utilization (65%) is already factored in and will map the VMs accordingly.

Since the Quicksizer SAPS are already adjusted down for 65% CPU utilization we have two options, either to scale up the required SAPS output from Quicksizer or scale down the SAPS benchmarked for the VMs.

Option 1 – Scale-up SAP Quicksizer SAPS (@65%) and match to SAP SD 2 Tier Benchmark (@100%)

If we scale up the SAPS we get 3,500 -> 5,385 SAPS for DB and 33,260 -> 51,170 SAPS for App. This gives a straightforward mapping option:

Answer: A7 (6,000 SAPS) and 3 x D14 (56,310 SAPS).


 

Option 2 – Scale-down SAP SD 2 Tier Benchmark (@100%) and match to SAP Quicksizer SAPS (@65%)

Alternatively, taking the scale-down option, we see from table 1 that the D14 is benchmarked at 18,770 SAPS. Our requirement for the application server layer is 33,260 SAPS. Adding two D14s gives 2 * 18,770 = 37,540 SAPS, which is a little higher than our requirement, so at first glance we should be happy. However, as we know the SAPS benchmarks for VMs are done at 99% or near 100% CPU utilization, 2 x D14 is a challenge. Why? When we scale the utilization down to 65%, a D14 gives approximately 12,200 SAPS, and 2 x 12,200 = 24,400 SAPS, which is less than the 33,260 SAPS required for the application layer. Therefore, if we want to use D14 VMs, we have to provision 3, giving 12,200 * 3 = 36,600 SAPS.

Similarly, for the DB the required SAPS is 3,500. We have two options: the A7, which is benchmarked at 6,000 SAPS, and the D12, which is benchmarked at 4,675 SAPS. Scaling down to 65% we get approximately 3,900 SAPS for the A7 and 3,040 SAPS for the D12. The choice is clear: the A7 provides a little more SAPS than required and the D12 provides less. Hence, we go with the A7 VM.

Answer: 1 x A7 for DB, 3 x D14 for SAP application server

Note: This is just an illustration. It is possible that we choose different VMs altogether as the solution to this example for various reasons.
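
For readers who want to script this kind of check, here is a small, hypothetical T-SQL sketch of the scale-down arithmetic used above; the values are the ones from this example, and the 65% target utilization is an assumption you may want to adjust:

DECLARE @BenchmarkSAPS  int          = 18770,   -- 2-tier benchmark result, e.g. D14
        @TargetUtil     decimal(4,2) = 0.65,    -- assumed target CPU utilization
        @RequiredSAPS   int          = 33260;   -- application layer requirement

SELECT CAST(@BenchmarkSAPS * @TargetUtil AS int)               AS UsableSAPSPerVM,  -- ~12,200
       CEILING(@RequiredSAPS / (@BenchmarkSAPS * @TargetUtil)) AS VMsNeeded;        -- 3
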

 

If the SAPS were arrived at using existing hardware running the same SAP system, then in addition to the SAPS we need to consider the following:

  • CPU utilization for the system in question
  • Processor Speed of the current hardware
  • CPU Type

 

Azure VMs have different performance characteristics dependent on the different SKUs (A1-A7, A9, A10, A11, D Series, DS Series, G Series). It is advised to have a proper understanding on specs for these SKUs to perform a better sizing/compute mapping.

 

IOPS and disk latency

In addition to SAPS, IOPS and disk latency are critical factors for optimally sizing compute resources in the cloud.

It is important that we know the certified VMs, the type of storage supported with each VM and the number of attachable disks to maximize the IOPS. This gives us an understanding of the possible disk layouts and the maximum IOPS we can get for a specific VM. The following table provides these statistics together with example disk layouts.

 

[Table: certified VM types, supported storage, maximum attachable disks and example disk layouts - not reproduced here]

 

Best Practice Recommendation: Ensure database data files are distributed across multiple disks and consider using Premium Storage, especially for the database log. Typically, medium-sized databases should have 8-16 data files distributed across 4-8 disks (approximately 2 data files per disk). Larger databases will benefit from more disks.
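
As a minimal sketch of that recommendation (drive letters, file names and sizes below are placeholders, not a prescribed layout), a medium-sized database with 8 data files spread over 4 data disks and the log on its own disk could be created like this:

CREATE DATABASE SID
ON PRIMARY
    (NAME = SIDDATA1, FILENAME = 'F:\SIDDATA1\SIDDATA1.mdf', SIZE = 100GB),
    (NAME = SIDDATA2, FILENAME = 'F:\SIDDATA2\SIDDATA2.ndf', SIZE = 100GB),
    (NAME = SIDDATA3, FILENAME = 'G:\SIDDATA3\SIDDATA3.ndf', SIZE = 100GB),
    (NAME = SIDDATA4, FILENAME = 'G:\SIDDATA4\SIDDATA4.ndf', SIZE = 100GB),
    (NAME = SIDDATA5, FILENAME = 'H:\SIDDATA5\SIDDATA5.ndf', SIZE = 100GB),
    (NAME = SIDDATA6, FILENAME = 'H:\SIDDATA6\SIDDATA6.ndf', SIZE = 100GB),
    (NAME = SIDDATA7, FILENAME = 'I:\SIDDATA7\SIDDATA7.ndf', SIZE = 100GB),
    (NAME = SIDDATA8, FILENAME = 'I:\SIDDATA8\SIDDATA8.ndf', SIZE = 100GB)
LOG ON
    (NAME = SIDLOG1,  FILENAME = 'L:\SIDLOG1\SIDLOG1.ldf',  SIZE = 50GB);
-- F:, G:, H:, I: stand for four data disks (2 data files each);
-- L: stands for a separate (ideally Premium Storage) disk for the log.

In a real SAP installation the database is created by the SAP installation tools; the sketch only illustrates the file-to-disk layout.
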

 

This is illustrated in the below example:

Contoso Corporation is running an SAP ERP landscape on-premises and is interested in migrating the entire landscape, which consists of Development, Quality Assurance, Production and Disaster Recovery systems, to an Azure data center. Following are the technical details of the production system:

        SAP System : SAP ERP ECC 6.0 EHP 7

        Operating System : Windows Server 2012 R2

        Database : SQL Server 2014

        Production Database server : 8 cores, 32GB RAM x 1 node

        IOPS (of Production DB server) : 7000 IOPS

        Production Database size : 2TB

        Performance Characteristics : Batch input jobs and heavy batch reports

        Production Application server : 8 cores, 32GB RAM x 3 nodes

        Production Total SAPS : 30k SAPS

 

Let us focus on designing a solution for the production system only (no Dev, QA or DR, and no HA considered, to keep the solution example simple).

We need to perform two major steps:

1.       Select the deployment architecture of the SAP ERP system (3-tier) and choose appropriate VM types for the SAP ASCS/SCS, application server and database instances

2.      Choose an appropriate storage type and decide how many disks are needed for the database files and the log files

 

[Table: one possible solution design for the Contoso production system (VM types, storage and disk layout) - not reproduced here]

 

There are multiple options for designing a solution for the given production system on Azure. The table above shows one of them.

Let us check the requirements once again.

From the requirements statistics in the above example, the SAPS for Database and Application Server layers are not clear and hence we will have to design the solution based on certain assumptions.

The SAPS split between application and database is not provided explicitly; therefore, we can take a conservative rule of thumb of 70:30 for App and DB, and we arrive at 21000 SAPS for App and 9000 SAPS for DB.

The SAPS split between ASCS/SCS and App is not explicitly given either. From experience we know that an ASCS/SCS instance usually does not need much SAPS, therefore we can keep the SAPS for the ASCS/SCS instance to a bare minimum.

 

Solution - from the table:

1.       VM type D11 is considered for ASCS/SCS instance. D11 is benchmarked for 2338 SAPS at 99% utilization. Therefore scaling down to 65% utilization we get approximately 1520 SAPS.

2.      VM Type D13 is considered for the App instances. The D13 is benchmarked at 11000 SAPS at 99% utilization; scaling down to 65% utilization we get approximately 7150 SAPS per VM, and 3 * 7150 = 21450 SAPS. The total SAPS for the application layer therefore becomes 1520 + 21450 = 22970 (a bit higher than the 21000 SAPS required).

3.      VM Type DS14 is considered for the database instance. The DS14 is benchmarked at 18770 SAPS at 99% utilization; scaling down to 65% utilization we get approximately 12200 SAPS. Note that we also have other VMs that could be used for the database, e.g. the D13. However, in our solution we are considering Premium Storage (a high-performance storage option that uses SSDs, unlike Standard Storage, which uses HDDs), which is currently available only with DS-Series VMs. Although the required SAPS for the DB is 9000, the VM providing the nearest SAPS in the DS-Series is the DS14.

4.      The other requirement was 7000 IOPS for the database server. Standard blob storage provides a maximum of 500 IOPS per disk. Therefore, we can take 15 x 200 GB disks to achieve 15 * 500 = 7500 IOPS for the database files. For the database log files we can use 1 x Premium Storage P20 512 GB disk, which provides single-digit millisecond latency for the log writes.

SAPS and benchmarking.

Performance optimization is a continuous effort and the experience of running SAP systems helps in identifying the performance bottlenecks and scope for improvements. However, if we are preparing to setup a new SAP system or planning for migration from one hardware to another or from on-premises hardware to cloud platform, we need to carefully consider factors which can help in proper sizing. SAPS or SAP Application Performance Standard is the benchmarking done usually on a two-tier architecture with Sales & Distribution process (SD). This measurement is very important for sizing SAP workloads.

SAP applications are usually deployed in 2-tier or 3-tier architecture.

 

clip_image008[8]

A 2-tier deployment architecture is a central installation where the database as well as the application layer components are deployed on a single operating system, with multiple client systems accessing the SAP application.

A 3-tier deployment is a distributed architecture where a single database instance serves multiple application server instances. Each instance (DB, App1, App2, ..., Appn) is deployed on its own operating system and accessed by multiple client systems. In general, 3-tier deployments are recommended for all but the smallest SAP on Azure implementations.

Each topology has different characteristics in terms of flexibility, complexity, manageability and resiliency of the SAP system.

As the deployment topologies are different, it is imperative that we perform a detailed sizing exercise based on the deployment architecture (2-tier or 3-tier). We cannot and should not use a 2-tier benchmark to size a 3-tier deployment architecture. The simple reason is the SAPS required for the DB server instance. Typically, the SAP Quicksizer provides results based on the 2-tier benchmark, and I feel it is biased towards application server benchmarking and somewhat underestimates the DB server. Usually, SAP Quicksizer results show around 10 to 15% of the total SAPS for the DB (refer to the results of the examples given in the SAP Quicksizer). In real-world scenarios the ratio between application server and DB server is somewhere around 5:1 for OLTP workloads (e.g. SAP ERP) and up to a maximum of 1:1 for OLAP workloads (e.g. SAP BW).

Another important point we must understand when sizing SAP systems for public cloud infrastructure is buffer provisioning. In an on-premises scenario we usually factor in the user growth requirements (usually 3 to 5 years) before the sizing exercise is performed. In the cloud we always have the ability and flexibility to scale up and scale out.

Care should be taken not to “oversize” SAP on Azure solutions.  When procuring on-premises hardware SAP administrators normally add substantial performance buffer because hardware typically has a lifecycle of 3-5 years. Such a performance buffer is not required on Azure because resources can be added quickly and easily as needed.

References:

https://websmp205.sap-ag.de/~sapidb/011000358700000108102008E/QS_Best_Pract_V38_2.pdf

Required SQL Server patch for Columnstore


This week we increased the minimum required SQL Server 2014 patch for using the Columnstore with SAP BW. You should upgrade to at least SQL Server 2014 Service Pack 1 Cumulative Update 1 (SP1 CU1).

As a matter of course you might apply any newer SQL Server Patch instead. You can always download the latest SQL Server patches from: http://technet.microsoft.com/en-us/sqlserver/ff803383.aspx.

SQL Server 2014 SP1 CU1 solves particularly the following issues:

  1. Sporadic issues with SQL error 35377 do not occur any more.
    See SAP Note 2153188 - SQL Error 35377 for details
  2. BW cube compression of the Flat Cube is much faster:
    See SAP Note 2153188 - SQL Error 35377 for details
  3. A potential issue regarding a physical inconsistency is fixed.
    See https://support.microsoft.com/en-us/kb/3067257 for details. Although we have never seen this issue on an SAP system yet, we strongly recommend applying the SQL Server patch above.

We have updated SAP Note 2116639 - SQL Server 2014 Columnstore documentation, which always contains the latest news regarding SQL Server Columnstore on SAP BW.

Foodstuffs North Island Retire Legacy Technology and Lower Costs by Moving Decommissioned SAP Systems to Azure on Demand


SAP Retail Customer Saves Money by moving legacy Info Assets to Azure

Foodstuffs North Island (FSNI) is a $6 billion supermarket and retail chain in New Zealand with 400 stores distributed across the North Island of New Zealand.

A decommissioned SAP implementation ran ECC 6.0 and BW 7.0 on IBM AIX and DB2, running on p570 UNIX servers. In order to meet tax and legal compliance requirements, while at the same time retiring legacy end-of-life technologies and lowering costs, FSNI migrated these decommissioned SAP systems to Azure.

 

1.        Business Problem & Technological Obsolescence  

FSNI re-implemented a full SAP IS-RETAIL solution in 2013 running on Windows Server 2012, SQL Server 2012 and Hyper-V and has successfully deployed this solution to some of the 400 stores. The remaining stores are interfaced to the new solution and are being progressively converted to a new POS solution and simultaneously migrated to the new SAP IS-RETAIL solution. The modern deployment leverages all the latest SAP applications for Retail including ECC 6.0, FNR, PMR, RC, PI, EP and POSDM.

The legacy SAP systems are no longer live, but must be kept available for legal, audit and taxation reasons.  These legacy systems were expensive to maintain and did not align to global industry trends around consolidation onto commodity platforms

 

2.        SAP Azure Platform – Pay Per Minute Infrastructure & Software Licenses  

FSNI switched from DB2 to SQL Server during the migration to take advantage of features like database compression and better integration with cloud platforms.  FSNI did not wish to buy full SQL Server licenses for a system that might be running for just a few hours per week.

The Azure platform offers customers the ability to rent virtual machines that come with a SQL Server database license included.  As with most Azure services a VM with a rented copy of SQL Server is charged at one minute intervals.

In price sensitive industries such as Retail moving the cost base to a “Pay per Use” model is part of the way FSNI controls costs and remains competitive. 

 

3.        SAP OS/DB Migration Process  

SAP provides free-of-charge tools that allow a customer to export a system from one OS and DB combination and import it to a different OS and DB combination. The tools are well documented and simple to use. Only a licensed OS/DB Migration Consultant is permitted to migrate production systems; any SAP Basis consultant is permitted to perform an OS/DB migration on non-production systems.

 

FSNI engaged a local partner Syd Consulting to perform the SAP Migration and to setup the Azure resources required to support the SAP systems.  Syd Consulting took approximately 4 days to export the AIX/DB2 system and then import and upload to Azure.  Additional activities included setting up the Azure Site to Site VPN to allow seamless network connectivity between Foodstuffs and Azure. 

 

4.        Automating SAP on Azure – Scheduled Start Up and Shutdown  

FSNI required an automated Start Up and Shutdown schedule to minimize the cost and maximize the benefits of the pay as you go Azure model. During the migration from AIX/DB2 to Azure and SQL Server SYD had the Azure VMs running 24 hours per day, but once the migration was completed this was switched to a schedule which allowed the business to access the system as required.  Outside of core business hours the systems are automatically shut down and de-provisioned.  When the Azure virtual machines are offline the cost of the VMs and the SQL Server database license drops to zero.  Only the cost of storage is charged, a cost that is very inexpensive measured in cents per GB.

The users have seen no change in the operation of the system; they access it as they always have, and the change has been transparent to them.

 

5.        Business Benefits

FSNI were able to lower fixed costs by shutting down high cost UNIX servers and stop paying for the associated hardware and software maintenance. 

The legacy systems were migrated into the Azure Australia region. In order to ensure FSNI remained fully compliant with legal requirements, a full compressed backup of the databases is held on an on-premises server. This ensures that FSNI meets any requirement to physically hold the data within its own local jurisdiction.

FSNI also benefits from hosting resources in a world class data center with multiple layers of redundancy and security. 

 

The requirement to maintain Decommissioned SAP systems is frequently a cost and resource burden for companies:

1. Obsolete HPUX, Sun Solaris and IBM AIX Servers have extremely expensive hardware and software support costs

2. Large UNIX systems often require 3 phase power, produce a lot of heat and consume a lot of data center space

3. Experienced UNIX systems administrators are becoming scarcer

4. Almost all organizations have standardized on commodity Intel platforms and/or Public Cloud and UNIX vendors are terminating investments into UNIX platforms

 

Most importantly FSNI has been able to switch costs to a pay-per-use model.  “Moving the IT cost base to a pay-per-use rather than a ‘pay in full upfront regardless of use’ model is something FSNI will look to repeat in the future” states the customer.

 

6.        Technical Details 

 

Below is a summary of the technical details of the solution

Azure region = Australia East

VM sizes = A6 for database server and SAP application server

Database size on AIX/DB2 3.2TB was compressed to 1.2TB on SQL Server

The Migration took 4 days to complete

Local Redundant storage is used for the database and Geo Redundant for backups

Azure Site-2-Site VPN has been used to seamlessly link the Azure VNET to Foodstuffs network.  This allows finance users to logon to these systems as if they were running locally. 

 

The project followed the standard SAP migration methodology and the creation of the target server including the storage required was very straight forward. The most time consuming part was getting the networks integrated and all the firewall changes in place.

 

Links

Foodstuffs New Zealand www.foodstuffs.co.nz

SAP Migration Partner www.syd.co.nz

SAP Notes

1928533 - SAP Applications on Azure: Supported Products and Azure VM types

1999351 - Troubleshooting Enhanced Azure Monitoring for SAP

1380654 - SAP support in public cloud environments

2015553 - SAP on Microsoft Azure: Support prerequisites

2039619 - SAP Applications on Microsoft Azure using the Oracle Database: Supported Products and Versions

1329848 - Oracle Support for Microsoft Hyper-V

Documentation on SAP on Azure - http://go.microsoft.com/fwlink/p/?LinkId=397566

Information on UNIX platforms

http://www.intel.com/content/dam/doc/white-paper/performance-xeon-7500-next-gen-x86-paper.pdf

World Record SAP Sales and Distribution Standard Application Benchmark for SAP cloud deployments released using Azure IaaS VMs


In this article we want to talk about four SAP 3-Tier benchmarks that SAP and Microsoft released today. As of October 5th 2015, the best of these benchmarks sets the world record for the category of SAP Sales and Distribution Standard Application Benchmarks executed in the cloud deployment type. The benchmarks were conducted on Azure Virtual Machines, but the technical findings we discuss in this article are often applicable to private cloud deployments on Hyper-V as well. The idea of the four benchmarks was to use an SAP 3-Tier configuration with the Azure VM types GS1, GS2, GS3 and GS4 as dedicated DBMS VMs, in order to see how the GS VM-Series scales with the very demanding SAP SD benchmark workload. The results and detailed data of the benchmarks can be found here:

  • Dedicated DBMS VM of GS1: 7000 SAP SD benchmark users and 38850 SAPS
  • Dedicated DBMS VM of GS2: 14060 SAP SD benchmark users and 78620 SAPS
  • Dedicated DBMS VM of GS3: 24500 SAP SD benchmark users and 137520 SAPS

World Record for SAP SD Standard Application benchmarks executed in public cloud:

  • Dedicated DBMS VM of GS4: 45100 SAP SD benchmark users and 247880 SAPS

 

What do these benchmarks prove?

Let's start with the SAP Standard Sales and Distribution (SD) Application Benchmark. This benchmark executes six typical SAP transactions that are necessary to receive a customer order, check the customer order, create a delivery, book the delivery, look at past orders of a specific customer and create the invoice. The exact order of the SAP transactions as executed in the benchmark is listed here: http://global.sap.com/campaigns/benchmark/appbm_sd.epx. For non-SAP experts: the database workload of the benchmark can be characterized more as an OLTP workload than an OLAP workload. All the SAP benchmarks certified and released can be found here: http://www.sap.com/benchmark.

What do we call 2-Tier and 3-Tier in SAP?

We call a configuration SAP 2-Tier if the DBMS layer and the SAP application layer are running within the same VM or server on one operating system image. The second tier would be the user interface layer or, in the case of a benchmark configuration, the benchmark driver, as shown here:

[Diagram: SAP 2-Tier configuration with the benchmark driver as the second tier]

 

We are talking about an SAP 3-Tier configuration if the DBMS and the SAP application layer are running in separate VMs/servers and the User interface layer or benchmark driver in a 3rd tier as shown here:

[Diagram: SAP 3-Tier configuration with DBMS, SAP application layer and benchmark driver in separate tiers]

The purpose of the benchmark is to get as much business throughput as possible out of a configuration as shown above. In the 2-Tier configuration we want to push the CPU resources as hard as we can, ideally being as close to 100% CPU consumption as possible. On the other side, the benchmark requires an average response time of <1sec measured on the benchmark driver. The same rules apply for the 3-Tier benchmark, except that there the component we want to drive to maximum CPU utilization is the DBMS server or VM, whereas we can scale out the SAP application layer to produce more workload on top of the DBMS instance. This scale-out capability also means we are not forced to maximize CPU utilization in the application layer VMs/servers.

In order to provide you as a customer with sizing information, SAP requires us to certify our Azure VM Series with SAP. This usually is done by conducting a SAP Standard Sales and Distribution Standard Application Benchmark with a 2-Tier SAP ERP configuration. The reason we use this particular SAP SD benchmark in a 2-Tier configuration is the fact that this is the only SAP Standard Application Benchmark that provides a particular benchmark measure which is called 'SAPS'. The 'SAPS' unit is used for SAP sizing as explained here: http://global.sap.com/campaigns/benchmark/measuring.epx.

From our side, we usually don't certify each and every Azure VM for SAP purposes. We pre-select VMs that we find capable for SAP software. The principles we apply should prevent us from certifying VM types whose CPU-to-memory ratio is not suitable for the relatively memory-hungry SAP NetWeaver applications and related databases, or VM types where the DBMS cannot scale up because of the Azure storage you can combine with a certain VM type or series.

At the end the Azure VM types certified for SAP and supported by SAP are listed in SAP Note: 1928533 – SAP Applications on Azure: Supported Products and Sizing.

If you take a closer look into the SAP note and into your SAP landscape, you will realize that the note falls short in listing 3-Tier configurations. If we compare an SAP 2-Tier configuration using a certain VM type with a 3-Tier configuration where the same VM type is used as a dedicated DBMS server, we can identify that in such 3-Tier configurations the stress applied by the workload will differ in:

  • Much higher demands in IOPS writing into the disk that contains the DBMS transaction log.
  • Requirement of lower latency writing to the disk that contains the DBMS transaction log (ideally low single digit milliseconds).
  • High demand on network throughput since application instances in multiple VMs send tens of thousands of network requests per second to the dedicated DBMS VM.
  • Ability of the host running the VM as well as the VM to handle these tens of thousands of network requests per second.
  • Ability of the Operating System to deal with the tens of thousands of network requests per second
  • High storage bandwidth requirement to write frequent checkpoints by the database to the data files of the DBMS.

All the volumes listed in the areas above are orders of magnitude higher when running an SAP Standard Sales and Distribution benchmark in a 3-Tier configuration than when running in a 2-Tier configuration. This means VM configurations that prove capable of running fine when the SAP application instances and the DBMS are running in one VM might not necessarily scale in a 3-Tier configuration. Therefore, 3-Tier configuration benchmarks, in addition to the pure SAPS number derived from the 2-Tier configuration benchmarks, give you much more information about the capabilities and scalability of the offered VMs and the underlying public cloud infrastructure, like Microsoft Azure.

Therefore, we decided to publish a first series of benchmarks of the Azure VM GS-Series where we used the Azure GS1, GS2, GS3 and GS4 VMs as dedicated DBMS servers on the one side and scaled the application layer by using the Azure GS5 VM type.

Actual configurations of these benchmark configurations

Let’s take a closer look at the configurations we used for these four benchmarks. Let’s start with introducing the Azure GS VM Series. The detailed specifications of this VM Series can be found at the end of this article: https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-size-specs/

Basic capabilities are:

  • Leverage Azure Premium Storage – this is a requirement for really scaling load on these VMs. Please note the certification using the GS VM types requires you to use Azure Premium Storage for the data and transaction log files of the SAP database(s)
  • The different VM sizes of this series allow to mount a different number of VHDs. The larger the VM, the more VHDs can be attached. See the column ‘Max. Data Disks’ in the article mentioned above.
  • The larger the VM, the higher the storage throughput bandwidth to Azure Premium Storage. See the column ‘Max. Disk IOPS and Throughput’ in the article mentioned above.
  • The larger the VM the larger the read cache that can be used by Premium Storage disks on the local Azure compute node. See the column ‘Cache Size (GB)’ in the article mentioned above.

 

Configuration Details:

For all four configurations we used a GS2 VM (4 vCPUs and 56GB memory) to run a traditional SAP Central Instance with its Enqueue and Message services.

As building blocks for the SAP application instances we used GS5 VMs (32 vCPUs and 448GB memory). That kept the configuration handy and small with regard to the number of VMs. Hence we could limit the number of application server VMs to:

  • One GS5 VM running SAP dialog instances for the GS1 dedicated DBMS VM
  • Two GS5 VMs running SAP dialog instances for the GS2 dedicated DBMS VM
  • Four GS5 VMs running SAP dialog instances for the GS3 dedicated DBMS VM
  • Seven GS5 VMs running SAP dialog instances for the GS4 dedicated DBMS VM

As dedicated DBMS VMs we used the Azure GS1 to GS4 VMs. The SQL Server database we have been using for SAP Sales and Distribution benchmarks for years has 16 data files plus one log file.

 

Disk configuration on the dedicated DBMS Server

In order to sustain enough IOPS, we chose Azure Premium Storage P30 disks. These provide 5000 IOPS and 200MB/sec throughput each. Even for the transaction log we used a P30, despite the fact that we did not expect to get anywhere close to 5000 IOPS on this drive. However, the bandwidth and capacity of the P30 drive made it easier to 'misuse' the disk for other purposes, such as frequent database backups of the benchmark database.

In contrast to our guidance in this document: http://go.microsoft.com/fwlink/p/?LinkId=397965 we did not use Windows Storage Spaces at all to build stripes over the data disks. The reason is that the benchmark database is well maintained, with its 16 data files well balanced in allocated and free space. As a result, SQL Server always allocates and distributes data evenly amongst the 16 data files, so reads from the data files are balanced, as are the checkpoint writes into them. These are conditions we do not meet too often in customer systems; hence the customer guidance to use Windows Storage Spaces as expressed in our documentation.

We distributed the data files in equal numbers over two (GS1 and GS2) or four (GS3 and GS4) P30 data disks. These disks used the Azure Premium Storage read cache.

The log file, in contrast, was on its own disk, which did not use any Azure Premium Storage cache.
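
One simple way to check whether the data files really are filled evenly, which is the precondition for skipping Storage Spaces here, is to compare size and used space per data file inside the database (a rough sketch, to be run in the database in question):

SELECT name,
       physical_name,
       size / 128                              AS size_mb,   -- size is stored in 8 KB pages
       FILEPROPERTY(name, 'SpaceUsed') / 128   AS used_mb
FROM sys.database_files
WHERE type_desc = 'ROWS'
ORDER BY name;
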

 

Controlling I/O in a bandwidth limited environment

Independent of the deployment, whether bare-metal, private cloud or public cloud, you can encounter situations where hard limits on IOPS or I/O bandwidth are enforced for an OS image or VM. Looking at the specifications of our Azure GS Series, it is apparent that this is the case. This can be a problem for a DBMS, or here SQL Server specifically, because some of the data volumes SQL Server is trying to read or write are either dependent on the workload or simply hard to limit. But as soon as such an I/O bandwidth limitation is hit, the impact on the workload can be severe.

One of the scenarios customers have reported for a long time in such bandwidth-limited environments is the impact of SQL Server's checkpoint writer on the writes that need to happen into the transaction log. SQL Server's checkpoint mechanism only throttles back on issuing I/Os once the latency for writing to the data files hits a 20ms threshold. In most of the bandwidth-limited environments using SSDs, such as Azure Premium Storage, you only get to those 20ms when you hit the bandwidth limits of the VM/server or infrastructure. If all the bandwidth to the storage is eaten up by writing checkpoints, then write operations that need to persist changes in the database transaction log will suffer high latencies; something that, again, impacts the workload severely and instantaneously.

That problem of controlling the volume of data SQL Server's checkpoint writer can write per second led to two mechanisms that can be used with SQL Server: a startup option that caps checkpoint writes at a fixed MB/sec value (see https://support.microsoft.com/en-us/kb/929240) and, from SQL Server 2012 on, indirect checkpoints controlled through the TARGET_RECOVERY_TIME database option.

Both methods can be used to limit the volume that SQL Server writes during checkpoints, so that the maximum I/O bandwidth the infrastructure has available is not consumed by checkpoint writes alone. As a result of limiting the data volume per second that is written to the storage during a checkpoint, other read and write activities encounter a reliable latency and can therefore serve the workload in a reliable manner.
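
As a hedged illustration of the second mechanism (our benchmarks used the first one, as described below), indirect checkpoints are enabled per database; the database name is a placeholder and the target value needs to be tested for your workload:

-- SQL Server 2012 or later; <SID> stands for the database name
ALTER DATABASE [<SID>] SET TARGET_RECOVERY_TIME = 60 SECONDS;
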

In the case of our Azure benchmarks we decided to go with the first solution (https://support.microsoft.com/en-us/kb/929240), where we could express the limit for the checkpoint writer in MB/sec. Knowing the limit we had with each of the VM types, we usually gave the checkpoint writer 50-75% of the I/O bandwidth of the VM. For the SAP SD benchmark workload we ran, these were appropriate settings. No doubt, for other workloads the limit could look drastically different.

Just to give you an idea, in the case of the benchmark we used GS4 as dedicated DBMS server running 45100 SAP SD benchmark users, the write volume into the SQL Server transaction log was around 100MB/sec with 1700-1800 I/Os per second.

By limiting the data volume per second that the checkpoint writer could issue, we could:

  • Provide enough bandwidth for the writes into the transaction log
  • Achieve a reliable write latency of around 3ms for writing into the transaction log
  • Stay within the limit of the IOPS and data volume a single P30 disk could sustain while writing the checkpoints
  • Avoid having to spread the data files over more than 4 x P30 disks to sustain checkpoints

 

Dealing with Networking

The one resource in an SAP 2-Tier configuration that is not significantly stressed at all is networking, whereas in 3-Tier configurations this is the most challenging part, besides providing low enough disk latency. Just to give an idea: with the GS4-centered SD benchmark running 45100 SAP SD benchmark users, we are looking at around 550,000 network requests per second that the DBMS VM needs to deal with. The data volume exchanged over the network between the DBMS VM and its associated VMs running the dialog instances was around 450MB/sec or 3.6Gbit/sec.

Looking at those volumes, three challenges arise:

  • The network quotas assigned to VMs in such scenarios need to have enough bandwidth assigned to sustain e.g. 3.6Gbit/sec network throughput.
  • The Hypervisor used needs to be able to deal with around 550K network requests per second.
  • The Guest-OS needs to be able to handle 550K network requests.

In the VMs we tested, there was no doubt that they could handle the network volume. So point #1 was a given by the VM types.

Point #2 got resolved in October 2014, when we completed deployment of the Windows Server 2012 R2 Hyper-V to our hundreds of thousands of Azure nodes. With Windows Server 2012 R2, we introduced Virtual RSS (vRSS). This feature allows to distribute handling of the network requests over more than one logical CPU on the host.

Point #3 was addressed with Windows Server 2012 R2 as well, with the introduction of vRSS. Before vRSS, all the network requests arriving in the Guest-OS were usually handled on vCPU 0. In the case of our benchmark exercise with the GS4 as dedicated DBMS server, the throughput shown could not have been achieved that way, since vCPU 0 of the VM came close to saturation. In the other three SAP benchmarks using GS1, GS2 and GS3 we could work without leveraging vRSS within the DBMS VM, but with GS4 the CPU resources could not be fully leveraged without using vRSS within the VM.

The benchmark exercises with the different VM types showed that there can be an impact on network handling even if a vCPU handling the network requests is running SQL Server workload as well. If such a vCPU was required to leverage its resources mostly for network handling, plus having SQL Server workload on it, we could see that the network handling as well as the response time for SQL requests handled on this CPU got severely affected. This issue certainly got amplified by us using affinity mask settings of <> 0 in the benchmark configuration which blocks Windows from rescheduling requests on a different vCPU that could provide more resources.

In essence, one can state that:

  • For production scenarios, where you usually do not use affinity mask settings of <> 0, you need to use vRSS at least with DBMS VMs/servers that have 16 logical or virtual CPUs.
  • For scenarios where you are using an affinity mask for SQL Server instances, because you e.g. run multiple SQL Server instances and want to restrict CPU resources for each of them, use vRSS as well, and do not assign the (v)CPUs that handle vRSS queues to the SQL Server instance(s). That is what we did in the end in the benchmark that used the GS4 as DBMS VM: SQL Server used only 14 of the 16 vCPUs, and the two vCPUs not running SQL Server were assigned to handle one vRSS queue each (see the sketch right after this list).
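
A sketch of the SQL Server affinity part of such a configuration (the CPU range is an assumption for a 16-vCPU VM and needs to be adapted to your VM size; it is not an official recommendation):

-- Keep vCPU 0 and 1 free for handling the vRSS network queues and let
-- SQL Server use the remaining 14 vCPUs of a 16-vCPU VM
ALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU = 2 TO 15;
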

 

General remarks to vRSS

In order to use vRSS, your Guest-OS needs to be Windows Server 2012 R2 or later. Virtual RSS is off by default in VMs that are getting deployed on Hyper-V or Azure. You need to enable it when you want to use it. Rough recommendations can look like:

  • In cases where you have the affinity mask of SQL Server set to 0, you want to use at least 4 queues and 4 CPUs. The default start vCPU to handle RSS is usually 0; you can leave that setting. It is then expected that vCPUs 0 to 3 are used for handling the network requests.
  • In cases where you exclude vCPUs from SQL Server to handle network requests, you might want to distribute those over the different vNUMA nodes (if there are vNUMA nodes in your VM), so that you can still use the memory of the NUMA nodes in a balanced way.

So all in all this should give you some ideas on how we performed those benchmarks and what the principal pitfalls were and how we got around them.

The remainder of this article is listing some minimum data about the benchmarks we are obligated to list according to SAP benchmark publication rules (see also: http://global.sap.com/solutions/benchmark/PDF/benchm_publ_proc_3_8.pdf ).

 

Detailed Benchmark Data:

Benchmark Certification #2015042: Three-tier SAP SD standard application benchmark in Cloud Deployment type. Using SAP ERP 6.0 Enhancement Package 5, the results of 7000 SD benchmark users and 38415 SAPS were achieved using:

  • Deployment type: Cloud
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter as Central Instance
  • 1 x Azure VM GS5 with 32 CPUs and 448 GB memory running Windows Server 2012 R2 Datacenter for Dialog Instances
  • 1 x Azure VM GS1 with 2 CPUs and 28 GB memory running Windows Server 2012 R2 Datacenter and SQL Server 2012 as DBMS instance
  • All VMs hosted on Azure nodes with the capability of providing, dependent on configuration, a maximum of 2 processors / 32 cores / 64 threads based on Intel Xeon Processor E5-2698B v3 with 2.00 GHz
  • More details: Link to SAP benchmark webpage follows as soon as SAP updated site with this benchmark result

-------------------------------------------------------------------------------------------------------------------------------------------

Benchmark Certification #2015043: Three-tier SAP SD standard application benchmark in Cloud Deployment type. Using SAP ERP 6.0 Enhancement Package 5, the results of 14060 SD benchmark users and 78620 SAPS were achieved using:

  • Deployment type: Cloud
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter as Central Instance
  • 2 x Azure VM GS5 with 32 CPUs and 448 GB memory running Windows Server 2012 R2 Datacenter for Dialog Instances
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter and SQL Server 2012 as DBMS instance
  • All VMs hosted on Azure nodes with the capability of providing, dependent on configuration, a maximum of 2 processors / 32 cores / 64 threads based on Intel Xeon Processor E5-2698B v3 with 2.00 GHz
  • More details: Link to SAP benchmark webpage follows as soon as SAP updated site with this benchmark result

-------------------------------------------------------------------------------------------------------------------------------------------

Benchmark Certification #2015044: Three-tier SAP SD standard application benchmark in Cloud Deployment type. Using SAP ERP 6.0 Enhancement Package 5, the results of 24500 SD benchmark users and 137520 SAPS were achieved using:

  • Deployment type: Cloud
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running windows Server 2012 R2 Datacenter as Central Instance
  • 4 x Azure VM GS5 with 32 CPUs and 448 GB memory running Windows Server 2012 R2 Datacenter for Dialog Instances
  • 1 x Azure VM GS3 with 8 CPUs and 112 GB memory running Windows Server 2012 R2 Datacenter and SQL Server 2012 as DBMS instance
  • All VMs hosted on Azure nodes with the capability of providing, dependent on configuration, a maximum of 2 processors / 32 cores / 64 threads based on Intel Xeon Processor E5-2698B v3 with 2.00 GHz
  • More details: Link to SAP benchmark webpage follows as soon as SAP updated site with this benchmark result

-------------------------------------------------------------------------------------------------------------------------------------------

Benchmark Certification #2015045: Three-tier SAP SD standard application benchmark in Cloud Deployment type. Using SAP ERP 6.0 Enhancement Package 5, the results of 45100 SD benchmark users and 247880 SAPS were achieved using:

  • Deployment type: Cloud
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter as Central Instance
  • 7 x Azure VM GS5 with 32 CPUs and 448 GB memory running Windows Server 2012 R2 Datacenter for Dialog Instances
  • 1 x Azure VM GS4 with 16 CPUs and 224 GB memory running Windows Server 2012 R2 Datacenter and SQL Server 2012 as DBMS instance
  • All VMs hosted on Azure nodes with the capability of providing, dependent on configuration, a maximum of 2 processors / 32 cores / 64 threads based on Intel Xeon Processor E5-2698B v3 with 2.00 GHz
  • More details: Link to SAP benchmark webpage follows as soon as SAP updated site with this benchmark result

-------------------------------------------------------------------------------------------------------------------------------------------

For more details see: http://global.sap.com/campaigns/benchmark/index.epx and http://global.sap.com/campaigns/benchmark/appbm_cloud_awareness.epx .

Data distribution and SQL Plans


For customers of SAP (and others) it is very important that the runtime of different reports or programs is predictable, fast and stable. From time to time it happens that formerly fast query executions deteriorate in their runtime and need much more time than before. For most customers it is totally unclear why this happens and what the root cause may be. In this blog I will try to describe how changing query runtimes can be explained by uneven data distribution and query compilation.

SQL Server, like most other DBMS, caches query plans to reduce the CPU and memory overhead of compiling statements. The cache that is used for the plans is called the plan, statement or procedure cache. A plan for a query is compiled and stored when the query is executed for the first time. The SQL Server optimizer creates the access plan based on the parameter values of the WHERE clause of that first execution. This cached plan then has to be used for all subsequent executions of the same query, independently of the parameter sets of those subsequent executions. So the parameter values of the first execution determine the plan efficiency of the first and all subsequent executions.
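
You can watch this behavior in the plan cache itself; the following sketch lists cached prepared plans together with their reuse count (usecounts), so a plan that was compiled for the first parameter set and then reused thousands of times becomes visible:

SELECT cp.usecounts,       -- how often the cached plan has been reused
       cp.cacheobjtype,
       cp.objtype,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = N'Prepared'
ORDER BY cp.usecounts DESC;
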

I'd like to discuss the underlying problem with an easy example. I set up a table named ORDERS with customer orders. It has only 7 columns

[Screenshot: definition of the ORDERS table]

a primary key (ORDERNUMBER) and three nonclustered indexes, one for CUSTOMER_ID alone, one for CUSTOMER_ID with ITEMS and one for CUSTOMER_ID with REGION.

[Screenshot: indexes defined on the ORDERS table]

Side note: This set of indexes is not fully optimized, as the ORDERS_NC_CUSTOMER_ID index is redundant; either ORDERS_NC_ITEMS or ORDERS_NC_REGION can be used for the same purpose (searching for CUSTOMER_ID only). The indexes are only created to show the problem more clearly. In general, the order of the columns in an index should follow their selectivity, starting with the most selective field at the beginning down to the least selective field at the end.
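
Since the screenshots are not reproduced here, this is a minimal sketch of how such a table and its indexes could be defined; the three columns not named in the text and all data types are assumptions for illustration only:

CREATE TABLE ORDERS
(
    ORDERNUMBER  int           NOT NULL PRIMARY KEY,  -- clustered primary key
    CUSTOMER_ID  int           NOT NULL,
    REGION       nvarchar(20)  NOT NULL,
    ITEMS        int           NOT NULL,
    ORDERDATE    datetime      NOT NULL,              -- assumed column
    AMOUNT       decimal(15,2) NOT NULL,              -- assumed column
    STATUS       nvarchar(10)  NOT NULL               -- assumed column
);

CREATE NONCLUSTERED INDEX ORDERS_NC_CUSTOMER_ID ON ORDERS (CUSTOMER_ID);
CREATE NONCLUSTERED INDEX ORDERS_NC_ITEMS       ON ORDERS (CUSTOMER_ID, ITEMS);
CREATE NONCLUSTERED INDEX ORDERS_NC_REGION      ON ORDERS (CUSTOMER_ID, REGION);
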

I inserted 160.000 rows with fictional orders of 20 customers. They all have more or less the same amount of orders (around 4.700 orders each), only the customer with the ID number of 2 has only 250 orders and the busy customer with ID = 15 has 76000 orders. The graph of the distribution looks like:

Verteilung Excel Graph

This will be our starting point. As SAP uses at least since version 640 prepared statements for the access to the data in SQL Server databases, I'm using simple and prepared statements to select from this table:

SELECT REGION, ITEMS FROM ORDERS WHERE CUSTOMER_ID = @P1

It is searching for an CUSTOMER_ID and will give back the REGION and ITEMS of this customer. The complete execution looks like:

DBCCFREEPROCCACHE -- Clear the cache so that we get a new plan every time (only for testing)
GO
DECLARE@P2INT ;
EXECsp_prepexec@P2output,N'@P2 INT', N'SELECT REGION, ITEMS FROM ORDERS WHERE CUSTOMER_ID = @P1',@P1 = 2
EXECsp_unprepare@P1 ;
GO

Setting @P1 either to 2 (for the small customer), to 15 (for the busy customer with many orders) or to any other ID (e.g. 12), we might see different plans of the SQL Server to get the data. I ran all three execution in one batch to see the relative cost of each statement compared to the other two.

Lets get started with ID = 12, one of the normal customers with an average amount of orders. As we have an index on CUSTOMER_ID and we are selecting only roughly 4.700 rows out of 160.000 rows (2,9 %) we could expect that the optimizer will use an index.

Verteilung Query Plan 12

SQL Server is choosing two index intersections with a Hash Match, the first between the ITEMS and the CUSTOMER_ID index and with resulting row set with the REGION index. The cost of this query was 27 % out of 100 % for all three queries (CUSTOMER_ID = 2,12 and 15), so this query is cheaper as the expected average of 33 %.

In this plan SQL Server gives us an index recommendation to speed up this statement. The recommended index has the CUSTOMER_ID as the only key column, but includes the REGION and ITEMS as an included fields. This would avoid the lookup of the rows to get the REGION and ITEMS, as they would be included in the index directly. SAP supports included columns in index definitions.

This was a fairly complex plan for customer 12, so what can we expect from the big customer (number 15)?

Verteilung Query Plan 15

SQL Server was choosing a full table scan to retrieve the data, as we are selecting 74.000 rows out of 160.000 rows (46 %). Therefore the SQL Server is choosing the primary key as he has to retrieve REGION and ITEMS from the table and none of the indexes contain both columns. The cost of the query lies at 70 %, what was expected due to the table scan. This query would benefit from the recommended index as well, as the SQL Server then has to scan only the smaller index instead of the complete table. The tipping point between an Index Seek and a Table Scan lies somewhere between 2-5 % of the rows of the table. If SQL Server estimates more that this amount of rows to be returned, it will perform more likely a table scan. If there are less rows an Index Seek or Index Range Scan is more likely. There are many other factors beside the amount of rows that are considered by SQL Server when the plan is compiled.

After this expensive plan, what will customer number 2 with only 250 rows give us ?

Verteilung Query Plan 2

SQL Server uses again an index intersection, but this time only one between the REGION and the ITEMS index. The cost of this query is very low with only 3%.

So we got three different plans, depending on which parameter value was use to create the plan. In my example I created a new plan every time, but what will happen if we have to reuse the same plan all the time as it is the case for an SAP system ? The plan will be the same for all three executions !

 

Used value for compilationResulting plans
Customer 12Verteilung Query Plan Index Seek for all (12)
Customer 15Verteilung Query Plan Tablescan for all
Customer 2Verteilung Query Plan Index Seek for all (2)

 

When you look at the plans you will see that the plans that were created for customer 12 and 2 show a yellow warning for the customer 15 execution. This is because this plan is so expensive that it spools data into tempdb for the execution with customer 15, what will slow down the execution even more.

What we proved today is, that depending on the initial value that is used to compile the plan and the selectivity of the value, the plan for the exact same statement can be totally different and might be not optimal for all subsequent executions of the same statement.

But why can a plan change happen ? There are many causes for a plan eviction from the cache or a recompilation of the plan. Here only the most important and most frequent ones:

 

CauseWhen can it happen
Changes made to a table or view referenced by the query.Installation of a SAP Support package or transport
Changes to any indexes used by the execution planInstallation of a SAP Support package or transport

Updates on statistics used by the execution plan, either explicitly
with UPDATE STATISTICS or generated automatically

Manually or after a certain amount of row changes.
The Plan fell out of the LRU Plan CacheWhen many other new plans push out the plan out
of the plan cache
SQL Server Service restartOnly manually
An explicit call to sp_recompile to recreate the plan(s) for one
statement or table
Only manually
An explicit flush of the complete plan cacheOnly manually

 

Solutions:

What can be the solutions in a case like this?

  • Create a fitting index

SQL Server suggested a new index for two of the three executions, what will happen if we create this index ?

CREATE NONCLUSTERED INDEX ORDERS_GOOD ON [dbo].[ORDERS] ([CUSTOMER_ID])
INCLUDE ([REGION],[ITEMS])
GO

The plans for the three executions are like this:

Verteilung Query Plan Good

We will get a stable plan as the plan is the same for all three parameter values. You can see the different costs for the different values, but the plan is the optimal plan for all three statements. But sometimes it is not possible to create a good index for all possible values, what else can we do ?

  • Recompile the statement at each execution

    When we force SQL Server to compile the statement at each execution, we will get the best plan for all parameter sets, but with a CPU penalty as the compilations cost CPU. But based on experiences with many customer systems, CPU is not a limited resource anymore, especially with the new and powerful commodity server that available today.

    How can we force a recompilation from within the SAP System ? Therefore we have to change the ABAP report to add a database hint. Given a fictional ABAP statement:

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID.

    This statement uses a variable it_CU_ID to pass the customer ID to the statement and will run on the database ilk one of the above shown statements. To enable the recompilation you have to add a line with the database specific hint:

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID
    %_HINTS MSSQLNT '&REPARSE&'.

    The resulting SQL Server statement will include a WITH (RECOMPILE) clause. This statement is then compiled at every execution by SQL Server.
  • ABAP rewrite

    This is only a last-resort option and should only be used if any other solution didn't show a result. The idea behind this is, to use the unique ABAP statement ID to get separate plan. Each ABAP statement has a unique identification, which is part of the statement (Comment block at the end of the statement). So each ABAP statement will get its own plan. When we now separate the executions of the big and small customer from the "normal" ones, we will get a unique plan for each of them. The ABAP pseudo code might then look like:

    IF it_CU_ID = 15 THEN

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID.

    ELSE IF it_CU_ID = 2 THEN

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID.

    ELSE

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID.

    The selection are the same, but the underlying ABAP statement ID is different and so the plans are. There might be other ways to change the ABAP to work around distribution problems. Keep in mind that the maintenance effort and error rate might be higher when you implement something like this.

More information:

Execution Plan Caching and Reuse (MSDN article)
Parameters and Execution Plan Reuse (MSDN article)
SQL Execution Plans Part 1 or 3
SQL Execution Plans Part 2 or 3
SQL Execution Plans Part 3 or 3

New White Paper on Sizing SAP Solutions on Azure Public Cloud

$
0
0

We have released a new white paper detailing how to size SAP solutions on Microsoft Azure public cloud. The document is attached to this blog.

Initial work on this topic is mentioned in this blog: How to Size SAP Systems Running on Azure VMs.

 

Microsoft is Executing and Publishing SAP SD Standard Application Benchmark on Azure

Sizing is an important topic when deploying a SAP solution since we want to make sure we have enough computing resources to guarantee us the performance needed.

As Microsoft is a cloud provider, we are responsible for executing and publishing official SAP 2-tier and 3-tier benchmarks. All the SAP benchmarks certified and released on Azure can be found here: http://www.sap.com/benchmark and are also listed in SAP Note 1928533 - SAP Applications on Azure:Supported Products and Azure VM types

Recently we announced WorldRecord SAP Sales and Distribution Standard Application Benchmark for SAP cloud deploymentsreleased using Azure IaaS VMs

Each benchmark specifies the exact configuration, VM size, and storage type used for the benchmark.

The sizing process is nevertheless a complex topic. To help our customers and partners with this process, we have developed this document as a set of guidelines to help in the sizing process.

 

“T-Shirt” Sizing

This document provides “T-shirt” sizing primary for 3-tier SAP Solutions running on Azure, as 3-tier is more suitable for the productive SAP use case.  The document also discusses 2-tier configurations for SAP solutions on Azure.

 

Picture 1: T-shirt Sizing

 

 

Covered SAP Applications and Components

As different SAP applications and components can have very different workload characteristics and requirements, the document also reflects these different applications, such as SAP ECC 6.0, SAP Business Objects, SAP liveCache, SAP Content Server, SAP Solution Manager, SAP BI Java and Enterprise Portal, SAP XI/PI, SAP NetWeaver Gateway, CTM Optimizer, SAP TREX etc. The document also reflects different components of the SAP NetWeaver technology stack, such as SAP application servers, SAP database server, and SAP ASCS/SCS instance.

 

Azure Premium Storage

A special chapter is dedicated to Azure fast SSD premium storage since as fast storage is crucial for performant storage-intensive IO workloads like database servers.

 

Advantages of Dynamic SAP Resizing on Azure Public Cloud

Many customers who run their SAP systems on-premises are oversizing their infrastructure to make sure they can cover the peak load. Peak time can happen for example once in three months (during the quarter close) and can last for a few days. But most of the time, there is a much lower load on the system compared to peak time, which means that infrastructure resources dedicated to SAP system are underutilized.

One huge advantage of Azure cloud compared to on-premises infrastructure is the ability to dynamically increase resources (to cater for increased demand) or decrease resources if resources are underutilized (to save money). Either way, you are only paying for what is used and when it is used.

Here is an example of scaling out the SAP NetWeaver ABAP application server component, where the SAP workload is based on interactive SAP user workload, which is running on two SAP application servers running on two Azure VMs (the same principle can be applied on SAP batch, RFC, and update workload). SAP user workload is automatically balanced by logon (SMLG) groups.

As new users are added to an SAP application server, after a certain point the performance starts to degrade. Additional new workload further degrades application server performance, as the servers are too busy serving currently logged-on users. In extreme cases the application server appears to hang.




  
  

Picture 2: Scale Out of SAP Application Server (AS) - Initial State

 

The solution for this problem is to dynamically scale out SAP application server layer, e.g. to add an additional new third SAP application to handle new SAP users. We have already prepared additional application servers (AS3 and AS4), which are pre-deployed on two VMs. These two VMs are initially in an offline state and are not generating cost (except  a very minor storage cost).

We can start the new VM with SAP AS either manually or using Azure Autoscale in Azure for automatic scale out.

 

 

Picture 3: Scale Out of SAP Application Server (AS) – End State after the scale out

 

In a similar way, when SAP user load drops, we can scale in SAP application server(s), e.g. we can stop some SAP application server(s) and shut down VM(s), so reducing infrastructure costs.

Generally speaking, it is recommended to review utilization on infrastructure about every three months to determine whether the Azure resources provisioned match the demand from the SAP workload.

You can also actively monitor utilization, try to recognize the utilization patterns and identify peaks, and proactively resize Azure infrastructure for your SAP system.

World Record SAP Sales and Distribution Standard Application Benchmark for SAP cloud deployments released using Azure IaaS VMs


In this article we want to talk about four SAP 3-Tier benchmarks that SAP and Microsoft released today. As of October 5th 2015, the best of these benchmarks sets the World Record for the category of SAP Sales and Distribution Standard Application Benchmarks executed in the cloud deployment type. The benchmarks were conducted on Azure Virtual Machines, but the technical findings we discuss in this article are often applicable to private cloud deployments on Hyper-V as well. The idea of the four benchmarks was to use an SAP 3-Tier configuration with the Azure VM types GS1, GS2, GS3 and GS4 as dedicated DBMS VMs in order to see how the particular GS VM sizes scale with the very demanding SAP SD benchmark workload. The results and detailed data of the benchmarks can be found here:

  • Dedicated DBMS VM of GS1: 7000 SAP SD benchmark users and 38850 SAPS
  • Dedicated DBMS VM of GS2: 14060 SAP SD benchmark users and 78620 SAPS
  • Dedicated DBMS VM of GS3: 24500 SAP SD benchmark users and 137520 SAPS

World Record for SAP SD Standard Application benchmarks executed in public cloud:

  • Dedicated DBMS VM of GS4: 45100 SAP SD benchmark users and 247880 SAPS

 

What do these benchmarks prove?

Let’s start with the SAP Standard Sales and Distribution Standard Application Benchmark first. This is a benchmark that executes six typical SAP transactions that are necessary to receive a customer order, check the customer order, create a delivery, book a delivery and look at past orders of a specific customer. The exact order of the SAP transactions as executed in the benchmark is listed here: http://global.sap.com/campaigns/benchmark/appbm_sd.epx. For non-SAP experts, the database workload of the benchmark can be characterized more as an OLTP benchmark than an OLAP benchmark. All the SAP benchmarks certified and released can be found here: http://www.sap.com/benchmark.

What do we call 2-Tier and 3-Tier in SAP?

We speak of an SAP 2-Tier configuration if the DBMS layer and the SAP application layer are running within the same VM or server on one operating system image. The second tier is the user interface layer or, in the case of a benchmark configuration, the benchmark driver, as shown here:

image

 

We are talking about an SAP 3-Tier configuration if the DBMS and the SAP application layer are running in separate VMs/servers and the User interface layer or benchmark driver in a 3rd tier as shown here:

image

The purpose of the benchmark is to get as much business throughput as possible out of a configuration as shown above. In the 2-Tier configuration we want to push the CPU resources as hard as we can, ideally getting as close to 100% CPU consumption as possible. On the other side, the benchmark requires an average response time of <1 sec measured on the benchmark driver. The same rules apply for the 3-Tier benchmark, where the component we want to drive to maximum CPU utilization is the DBMS server or VM, while we can scale out the SAP application layer to produce more workload on top of the DBMS instance. This scale-out capability also means we are not forced to maximize CPU utilization in the application layer VMs/servers.

In order to provide you as a customer with sizing information, SAP requires us to certify our Azure VM Series with SAP. This usually is done by conducting a SAP Standard Sales and Distribution Standard Application Benchmark with a 2-Tier SAP ERP configuration. The reason we use this particular SAP SD benchmark in a 2-Tier configuration is the fact that this is the only SAP Standard Application Benchmark that provides a particular benchmark measure which is called ‘SAPS’. The ‘SAPS’ unit is used for SAP sizing as explained here: http://global.sap.com/campaigns/benchmark/measuring.epx.

From our side, we usually don't certify each and every Azure VM for SAP purposes. We pre-select VMs that we find capable of running SAP software. The principles we apply are meant to avoid certifying VM types whose CPU-to-memory ratio is not suitable for the relatively memory-hungry SAP NetWeaver applications and related databases, or cases where the DBMS in such a VM can't scale up because of the Azure storage you can combine with a certain VM type or series.

At the end the Azure VM types certified for SAP and supported by SAP are listed in SAP Note: 1928533 – SAP Applications on Azure: Supported Products and Sizing.

If you take a closer look into the SAP Note and into your SAP landscape, you will realize that the note falls short in listing 3-Tier configurations. If we compare an SAP 2-Tier configuration using a certain VM type with a 3-Tier configuration where the same VM type is used as a dedicated DBMS server, we can identify that in such 3-Tier configurations the stress applied by the workload differs in:

  • Much higher demands in IOPS writing into the disk that contains the DBMS transaction log.
  • Requirement of lower latency writing to the disk that contains the DBMS transaction log (ideally low single digit milliseconds).
  • High demand on network throughput since application instances in multiple VMs send tens of thousands of network requests per second to the dedicated DBMS VM.
  • Ability of the host running the VM as well as the VM to handle these tens of thousands of network requests per second.
  • Ability of the Operating System to deal with the tens of thousands of network requests per second
  • High storage bandwidth requirement to write frequent checkpoints by the database to the data files of the DBMS.

All the volumes listed in the areas above are orders of magnitude higher when running an SAP Sales and Distribution benchmark in a 3-Tier configuration than when running in a 2-Tier configuration. This means that VM configurations which prove capable of running fine in a configuration where the SAP application instances and the DBMS are running in one VM might not necessarily scale to a 3-Tier configuration. Therefore, 3-Tier configuration benchmarks, in addition to the pure SAPS number derived from the 2-Tier configuration benchmarks, give you far more information about the capabilities and scalability of the offered VMs and the underlying public cloud infrastructure, like Microsoft Azure.

Therefore, we decided to publish a first series of benchmarks of the Azure VM GS-Series where we used the Azure GS1, GS2, GS3 and GS4 VMs as dedicated DBMS servers on the one side and scaled the application layer by using the Azure GS5 VM type.

Actual configurations of these benchmark configurations

Let’s take a closer look at the configurations we used for these four benchmarks. Let’s start with introducing the Azure GS VM Series. The detailed specifications of this VM Series can be found at the end of this article: https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-size-specs/

Basic capabilities are:

  • Leverage Azure Premium Storage – this is a requirement for really scaling load on these VMs. Please note the certification using the GS VM types requires you to use Azure Premium Storage for the data and transaction log files of the SAP database(s)
  • The different VM sizes of this series allow a different number of VHDs to be mounted. The larger the VM, the more VHDs can be attached. See the column ‘Max. Data Disks’ in the article mentioned above.
  • The larger the VM, the higher the storage throughput bandwidth to Azure Premium Storage. See the column ‘Max. Disk IOPS and Throughput’ in the article mentioned above.
  • The larger the VM the larger the read cache that can be used by Premium Storage disks on the local Azure compute node. See the column ‘Cache Size (GB)’ in the article mentioned above.

 

Configuration Details:

For all four configurations we used a GS2 VM (4 vCPUs and 56GB memory) to run a traditional SAP Central Instance with its Enqueue and Message services.

As building blocks for the SAP application instances we used GS5 VMs (32 vCPUs and 448GB memory). That kept the configuration handy and small in regards to the number of VMs. Hence we could limit the number of application server VMs to:

  • One GS5 VM running SAP dialog instances for the GS1 dedicated DBMS VM
  • Two GS5 VM running SAP dialog instances for the GS2 dedicated DBMS VM
  • Four GS5 VM running SAP dialog instances for the GS3 dedicated DBMS VM
  • Seven GS5 VM running SAP dialog instances for the GS4 dedicated DBMS VM

As dedicated DBMS VMs we used the Azure GS1 to GS4 VMs. The SQL Server database we have been using for SAP Sales and Distribution benchmarks for years has 16 data files plus one log file.

 

Disk configuration on the dedicated DBMS Server

In order to sustain enough IOPS, we chose to use Azure Premium Storage P30 disks. These provide 5000 IOPS and 200MB/sec throughput by themselves. Even for the transaction log we used a P30, despite the fact that we did not expect to get anywhere close to 5000 IOPS on this drive. However, the bandwidth and capacity of the P30 drive made it easier to 'misuse' the disk for other purposes, such as frequent database backups of the benchmark database.

In contrast to our guidance in this document: http://go.microsoft.com/fwlink/p/?LinkId=397965, we did not use Windows Storage Spaces at all to build stripes over the data disks. The reason is that the benchmark database is well maintained with its 16 data files and is well balanced in allocated and free space. As a result, SQL Server always allocates and distributes data evenly among the 16 data files, so reads from the data files are balanced, as are checkpoint writes into the data files. These are conditions we do not meet too often in customer systems, hence the guidance to customers to use Windows Storage Spaces as expressed in our document.

We distributed the data files in equal numbers over two (GS1 and GS2) or four (GS3 and GS4) P30 data disks. These disks used the Azure Premium Storage read cache.

The log file was placed on its own disk, which did not use any Azure Premium Storage cache.
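
As a side note, how evenly the data is spread over the 16 data files can be checked with a simple query run in the SAP database; a minimal sketch:

-- Sketch: size and used space per data file of the current database.
SELECT name,
       size / 128 AS size_mb,                             -- size is reported in 8 KB pages
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb
FROM sys.database_files
WHERE type_desc = 'ROWS'
ORDER BY name;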

 

Controlling I/O in a bandwidth limited environment

Independent of the deployment type, whether bare-metal, private cloud or public cloud, you can encounter situations where hard limits on IOPS or I/O bandwidth are enforced for an OS image or VM. Looking at the specifications of the Azure GS Series, it is apparent that this is the case here. This can be a problem with a DBMS, or here SQL Server specifically, because some of the data volumes SQL Server is trying to read or write are either dependent on the workload or simply hard to limit. But as soon as such an I/O bandwidth limitation is hit, the impact on the workload can be severe.

One of the scenarios customers have reported for a long time in such bandwidth-limited environments is the impact of SQL Server's checkpoint writer on the writes that need to happen into the transaction log. SQL Server's checkpoint mechanism only throttles back on issuing I/Os once the latency for writing to the data files hits a 20 ms threshold. In most bandwidth-limited environments using SSDs, such as Azure Premium Storage, you only get to those 20 ms when you hit the bandwidth limits of the VM/server or the infrastructure. If all the bandwidth to the storage is eaten up by writing checkpoints, then write operations that need to persist changes in the database transaction log will suffer high latencies. That, in turn, impacts the workload severely and instantaneously.

That problem of controlling the volume of data SQL Server's checkpoint writer can write per second led to several solutions that can be used with SQL Server.

Both methods can be used to limit the volume that SQL Server writes during checkpoints, so that the maximum I/O bandwidth the infrastructure has available is not consumed by writing checkpoints alone. As a result of limiting the data volume per second that is written to the storage during a checkpoint, other read and write activities encounter a reliable latency and can therefore serve the workload in a reliable manner.

In the case of our Azure benchmarks we decided to go with the first solution (https://support.microsoft.com/en-us/kb/929240), where we could express the limit on what the checkpoint writer may write per second in MB/sec. Knowing the limit we had with each of the VM types, we usually gave the checkpoint writer 50-75% of the I/O bandwidth of the VM. For the SAP SD benchmark workload we ran, these were appropriate settings. No doubt, for other workloads the limit could look drastically different.

Just to give you an idea: in the benchmark where we used the GS4 as dedicated DBMS server running 45100 SAP SD benchmark users, the write volume into the SQL Server transaction log was around 100 MB/sec with 1700-1800 I/Os per second.
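
Write volume and write latency per database file can be tracked with SQL Server's virtual file statistics; a minimal sketch (values are cumulative since instance start):

-- Sketch: cumulative write volume and average write latency per file of the current database.
SELECT mf.name, mf.type_desc,
       vfs.num_of_writes,
       vfs.num_of_bytes_written / 1048576 AS written_mb,
       1.0 * vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id;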

By limiting the data volume per second that the checkpoint writer could issue, we could:

  • Provide enough bandwidth for the writes into the transaction log
  • Achieve a reliable write latency of around 3 ms for writes into the transaction log
  • Stay within the limit of the IOPS and data volume a single P30 disk can sustain while writing the checkpoints
  • Avoid having to spread the data files over more than 4 x P30 disks to sustain checkpoints

 

Dealing with Networking

The one resource that is not significantly stressed in an SAP 2-Tier configuration is networking. In 3-Tier configurations, however, this is the most challenging part, besides providing low enough disk latency. Just to give an idea: with the GS4-centered SD benchmark running 45100 SAP SD benchmark users, we are looking at around 550,000 network requests per second that the DBMS VM needs to deal with. The data volume exchanged over the network between the DBMS VM and its associated VMs running the dialog instances was around 450MB/sec or 3.6Gbit/sec.

Looking at those volumes, three challenges arise:

  • The network quotas assigned to VMs in such scenarios need to have enough bandwidth assigned to sustain e.g. 3.6Gbit/sec network throughput.
  • The Hypervisor used needs to be able to deal with around 550K network requests per second.
  • The Guest-OS needs to be able to handle 550K network requests per second.

In the VMs we tested, there was no doubt that they could handle the network volume. So point #1 was a given by the VM types.

Point #2 got resolved in October 2014, when we completed deployment of Windows Server 2012 R2 Hyper-V to our hundreds of thousands of Azure nodes. With Windows Server 2012 R2, we introduced Virtual RSS (vRSS). This feature allows the handling of network requests to be distributed over more than one logical CPU on the host.

Point #3 was addressed with Windows Server 2012 R2 as well, with the introduction of vRSS. Before vRSS, all the network requests arriving in the Guest-OS were usually handled on vCPU 0. In the case of our benchmark exercise with the GS4 as dedicated DBMS server, the throughput shown could not have been achieved without vRSS, since vCPU 0 of the VM came close to saturation. In the other three SAP benchmarks using GS1, GS2 and GS3 we could work without leveraging vRSS within the DBMS VM. But with GS4, the CPU resources could not be leveraged without using vRSS within the VM.

The benchmark exercises with the different VM types showed that there can be an impact on network handling even if a vCPU handling the network requests is running SQL Server workload as well. If such a vCPU was required to spend its resources mostly on network handling while also running SQL Server workload, we could see that the network handling as well as the response time for SQL requests handled on this CPU were severely affected. This issue certainly got amplified by us using affinity mask settings of <> 0 in the benchmark configuration, which blocks Windows from rescheduling requests on a different vCPU that could provide more resources.

In essence one can state that:

  • For production scenarios where you usually don’t use affinity mask settings of <> 0, you need to use vRSS at least with DBMS VMs/servers that have 16 logical or virtual CPUs.
  • For scenarios where you are using the affinity mask for SQL Server instances, because you e.g. run multiple SQL Server instances and want to restrict CPU resources for each of them, use vRSS as well. Don't assign (v)CPUs that handle vRSS queues to the SQL Server instance(s). That is what we did in the end in the benchmark that used the GS4 as DBMS VM: SQL Server used only 14 of the 16 vCPUs. The two vCPUs not running SQL Server were assigned to handle one vRSS queue each (a configuration sketch follows below).
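
One way to express such a CPU restriction on the SQL Server side is the process affinity setting; a minimal sketch for a 16-vCPU VM that keeps vCPUs 0 and 1 free for vRSS queues (the benchmark itself used the classic affinity mask configuration; the CPU numbers here are purely illustrative):

-- Sketch: bind the SQL Server instance to vCPUs 2-15, leaving vCPUs 0 and 1 for network handling.
ALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU = 2 TO 15;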

 

General remarks to vRSS

In order to use vRSS, your Guest-OS needs to be Windows Server 2012 R2 or later. Virtual RSS is off by default in VMs that are getting deployed on Hyper-V or Azure. You need to enable it when you want to use it. Rough recommendations can look like:

  • In cases where you have affinity mask of SQL Server set to 0, you want to use at least 4 queues and 4 CPUs. The default start vCPU to handle the RSS is usually 0. You can leave that setting. It is expected that vCPUs 0 to 4 are used then for handling the network requests.
  • In cases where you exclude vCPUs to handle network requests, you might want to distribute those over the different vNUMA nodes (if there are vNUMA nodes in your VM), so that you still can use the memory in the NUMA nodes in a balanced way.

So all in all this should give you some ideas on how we performed those benchmarks and what the principal pitfalls were and how we got around them.

The remainder of this article lists some minimum data about the benchmarks that we are obligated to publish according to the SAP benchmark publication rules (see also: http://global.sap.com/solutions/benchmark/PDF/benchm_publ_proc_3_8.pdf ).

 

Detailed Benchmark Data:

Benchmark Certification #2015042: Three-tier SAP SD standard application benchmark in Cloud Deployment type. Using SAP ERP 6.0 Enhancement Package 5, the results of 7000 SD benchmark users and 38415 SAPS were achieved using:

  • Deployment type: Cloud
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter as Central Instance
  • 1 x Azure VM GS5 with 32 CPUs and 448 GB memory running Windows Server 2012 R2 Datacenter for Dialog Instances
  • 1 x Azure VM GS1 with 2 CPUs and 28 GB memory running Windows Server 2012 R2 Datacenter and SQL Server 2012 as DBMS instance
  • All VMs hosted on Azure nodes with the capability of providing, dependent on configuration, a maximum of 2 processors / 32 cores / 64 threads based on Intel Xeon Processor E5-2698B v3 with 2.00 GHz
  • More details: A link to the SAP benchmark webpage will follow as soon as SAP has updated the site with this benchmark result

——————————————————————————————————————————————-

Benchmark Certification #2015043: Three-tier SAP SD standard application benchmark in Cloud Deployment type. Using SAP ERP 6.0 Enhancement Package 5, the results of 14060 SD benchmark users and 78620 SAPS were achieved using:

  • Deployment type: Cloud
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter as Central Instance
  • 2 x Azure VM GS5 with 32 CPUs and 448 GB memory running Windows Server 2012 R2 Datacenter for Dialog Instances
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter and SQL Server 2012 as DBMS instance
  • All VMs hosted on Azure nodes with the capability of providing, dependent on configuration, a maximum of 2 processors / 32 cores / 64 threads based on Intel Xeon Processor E5-2698B v3 with 2.00 GHz
  • More details: A link to the SAP benchmark webpage will follow as soon as SAP has updated the site with this benchmark result

——————————————————————————————————————————————-

Benchmark Certification #2015044: Three-tier SAP SD standard application benchmark in Cloud Deployment type. Using SAP ERP 6.0 Enhancement Package 5, the results of 24500 SD benchmark users and 137520 SAPS were achieved using:

  • Deployment type: Cloud
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter as Central Instance
  • 4 x Azure VM GS5 with 32 CPUs and 448 GB memory running Windows Server 2012 R2 Datacenter for Dialog Instances
  • 1 x Azure VM GS3 with 8 CPUs and 112 GB memory running Windows Server 2012 R2 Datacenter and SQL Server 2012 as DBMS instance
  • All VMs hosted on Azure nodes with the capability of providing, dependent on configuration, a maximum of 2 processors / 32 cores / 64 threads based on Intel Xeon Processor E5-2698B v3 with 2.00 GHz
  • More details: A link to the SAP benchmark webpage will follow as soon as SAP has updated the site with this benchmark result

——————————————————————————————————————————————-

Benchmark Certification #2015045: Three-tier SAP SD standard application benchmark in Cloud Deployment type. Using SAP ERP 6.0 Enhancement Package 5, the results of 45100 SD benchmark users and 247880 SAPS were achieved using:

  • Deployment type: Cloud
  • 1 x Azure VM GS2 with 4 CPUs and 56 GB memory running Windows Server 2012 R2 Datacenter as Central Instance
  • 7 x Azure VM GS5 with 32 CPUs and 448 GB memory running Windows Server 2012 R2 Datacenter for Dialog Instances
  • 1 x Azure VM GS4 with 16 CPUs and 224 GB memory running Windows Server 2012 R2 Datacenter and SQL Server 2012 as DBMS instance
  • All VMs hosted on Azure nodes with the capability of providing, dependent on configuration, a maximum of 2 processors / 32 cores / 64 threads based on Intel Xeon Processor E5-2698B v3 with 2.00 GHz
  • More details: A link to the SAP benchmark webpage will follow as soon as SAP has updated the site with this benchmark result

——————————————————————————————————————————————-

For more details see: http://global.sap.com/campaigns/benchmark/index.epx and http://global.sap.com/campaigns/benchmark/appbm_cloud_awareness.epx .


SAP GUI is supporting Windows 10 as operating system


As of 10/08/2015, SAP GUI is supported on the Windows 10 operating system. This includes the usage of IE11 in conjunction with SAP GUI as well. The new Edge browser will not be supported since it does not support ActiveX, which is essential for some of the SAP GUI scenarios. The exact SAP GUI versions and the important patch levels of those SAP GUI releases for running on Windows 10 are documented in SAP Note #66971 – Supported SAPGUI platforms (SAP logon required). This note also covers the exact SAP GUI releases and patch levels which support the usage of IE11 with earlier versions of the Windows operating system in conjunction with SAP GUI.

Data distribution and SQL Plans


For customers of SAP (and others) it is very important that the runtime of the different reports or programs is predictable, fast and stable. From time to time it happens that formerly fast query executions deteriorate in their runtime and need much more time than before. For most customers it is totally unclear why this happens and what the root cause may be. In this blog I will try to describe how changing query runtimes can be explained by uneven data distribution and query compilation.

SQL Server, like most other DBMS, caches query plans to reduce the CPU and memory overhead of compiling statements. The cache that is used for the plans is called the plan, statement or procedure cache. A plan of a query is compiled and stored when the query is executed the first time. The SQL Server optimizer will create the access plan based on the given parameters for the WHERE clause of the first execution of the query. This cached plan then has to be used for all subsequent executions of the same query, independently of the parameter sets of the subsequent executions. So the values of the first execution parameters determine the plan efficiency of the first and all subsequent executions.

I’d like to discuss the underlying problem with an easy example. I set up a table named ORDERS with customer orders. It has only 7 columns

Verteilung Table Orders

a primary key (ORDERNUMBER) and three nonclustered indexes, one for CUSTOMER_ID alone, one for CUSTOMER_ID with ITEMS and one for CUSTOMER_ID with REGION.

Verteilung Indexes ORDERS

Side Note: This set of indexes is not fully optimized, as the ORDERS_NC_CUSTOMER_ID index is redundant: either the ORDERS_NC_ITEMS or the ORDERS_NC_REGION index can be used for the same purpose (searching for CUSTOMER_ID only). The indexes are only created to show the problem more clearly. In general the order of the columns in an index should follow their selectivity, starting with the most selective field at the beginning down to the least selective field at the end.

I inserted 160,000 rows with fictional orders of 20 customers. Almost all of them have more or less the same number of orders (around 4,700 orders each); only the customer with ID 2 has just 250 orders and the busy customer with ID = 15 has 76,000 orders. The graph of the distribution looks like:

Verteilung Excel Graph
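
For readers who want to reproduce the example, here is a minimal T-SQL sketch of a comparable setup. Only ORDERNUMBER, CUSTOMER_ID, REGION and ITEMS are named in the text; the remaining three columns (ORDER_DATE, AMOUNT, NOTES) and the region values are hypothetical placeholders.

-- Sketch: demo table, indexes and skewed data distribution as described above.
CREATE TABLE [dbo].[ORDERS] (
    ORDERNUMBER INT           NOT NULL PRIMARY KEY,
    CUSTOMER_ID INT           NOT NULL,
    REGION      NVARCHAR(20)  NOT NULL,
    ITEMS       INT           NOT NULL,
    ORDER_DATE  DATETIME      NULL,    -- hypothetical column
    AMOUNT      DECIMAL(15,2) NULL,    -- hypothetical column
    NOTES       NVARCHAR(100) NULL     -- hypothetical column
);
CREATE NONCLUSTERED INDEX ORDERS_NC_CUSTOMER_ID ON [dbo].[ORDERS] (CUSTOMER_ID);
CREATE NONCLUSTERED INDEX ORDERS_NC_ITEMS ON [dbo].[ORDERS] (CUSTOMER_ID, ITEMS);
CREATE NONCLUSTERED INDEX ORDERS_NC_REGION ON [dbo].[ORDERS] (CUSTOMER_ID, REGION);
GO
-- Skewed population: 250 orders for customer 2, 76,000 for customer 15,
-- roughly 4,700 for each of the other 18 customers.
DECLARE @cust INT = 1, @orders INT, @base INT = 1;
WHILE @cust <= 20
BEGIN
    SET @orders = CASE @cust WHEN 2 THEN 250 WHEN 15 THEN 76000 ELSE 4700 END;
    INSERT INTO [dbo].[ORDERS] (ORDERNUMBER, CUSTOMER_ID, REGION, ITEMS)
    SELECT @base + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1,
           @cust,
           CHAR(65 + @cust % 5),               -- five fictional regions 'A' to 'E'
           1 + ABS(CHECKSUM(NEWID())) % 10     -- 1 to 10 items per order
    FROM (SELECT TOP (@orders) 0 AS dummy
          FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b) AS t;
    SET @base = @base + @orders;
    SET @cust = @cust + 1;
END;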

This will be our starting point. As SAP has been using prepared statements to access the data in SQL Server databases since at least release 6.40, I'm using simple prepared statements to select from this table:

SELECT REGION, ITEMS FROM ORDERS WHERE CUSTOMER_ID = @P1

It searches for a CUSTOMER_ID and returns the REGION and ITEMS for this customer. The complete execution looks like:

DBCC FREEPROCCACHE -- Clear the cache so that we get a new plan every time (only for testing)
GO
DECLARE @P2 INT;
EXEC sp_prepexec @P2 OUTPUT, N'@P1 INT', N'SELECT REGION, ITEMS FROM ORDERS WHERE CUSTOMER_ID = @P1', @P1 = 2
EXEC sp_unprepare @P2;
GO

Setting @P1 either to 2 (for the small customer), to 15 (for the busy customer with many orders) or to any other ID (e.g. 12), we might see different plans of SQL Server to get the data. I ran all three executions in one batch to see the relative cost of each statement compared to the other two.

Let's get started with ID = 12, one of the normal customers with an average number of orders. As we have an index on CUSTOMER_ID and we are selecting only roughly 4,700 rows out of 160,000 rows (2.9 %), we could expect that the optimizer will use an index.

Verteilung Query Plan 12

SQL Server chooses two index intersections with a Hash Match, the first between the ITEMS and the CUSTOMER_ID indexes, the second between the resulting row set and the REGION index. The cost of this query was 27 % of the total for all three queries (CUSTOMER_ID = 2, 12 and 15), so this query is cheaper than the expected average of 33 %.

In this plan SQL Server gives us an index recommendation to speed up this statement. The recommended index has CUSTOMER_ID as the only key column, but includes REGION and ITEMS as included columns. This would avoid the lookup of the rows to get REGION and ITEMS, as they would be included in the index directly. SAP supports included columns in index definitions.

This was a fairly complex plan for customer 12, so what can we expect from the big customer (number 15)?

Verteilung Query Plan 15

SQL Server chose a full table scan to retrieve the data, as we are selecting 74,000 rows out of 160,000 rows (46 %). SQL Server scans the primary key because it has to retrieve REGION and ITEMS from the table and none of the indexes contain both columns. The cost of the query lies at 70 %, which was expected due to the table scan. This query would benefit from the recommended index as well, as SQL Server would then only have to scan the smaller index instead of the complete table. The tipping point between an Index Seek and a Table Scan lies somewhere between 2-5 % of the rows of the table. If SQL Server estimates that more than this number of rows will be returned, a table scan becomes more likely; with fewer estimated rows, an Index Seek or Index Range Scan is more likely. There are many other factors besides the number of rows that SQL Server considers when the plan is compiled.
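
The row estimate that drives this tipping-point decision comes from the statistics of the CUSTOMER_ID index. A quick way to look at the underlying histogram (a sketch using the index name from this example):

-- Sketch: inspect the statistics histogram the optimizer uses to estimate rows per CUSTOMER_ID value.
DBCC SHOW_STATISTICS ('dbo.ORDERS', ORDERS_NC_CUSTOMER_ID) WITH HISTOGRAM;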

After this expensive plan, what will customer number 2 with only 250 rows give us?

Verteilung Query Plan 2

SQL Server again uses an index intersection, but this time only one, between the REGION and the ITEMS indexes. The cost of this query is very low at only 3 %.

So we got three different plans, depending on which parameter value was used to create the plan. In my example I created a new plan every time, but what will happen if we have to reuse the same plan all the time, as is the case in an SAP system? The plan will be the same for all three executions!
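
That the single cached plan really is reused for every parameter value can be verified in the plan cache; a minimal sketch:

-- Sketch: the prepared statement should show up once, with usecounts increasing on every execution.
SELECT cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%FROM ORDERS WHERE CUSTOMER_ID%'
  AND st.text NOT LIKE '%dm_exec_cached_plans%';   -- exclude this monitoring query itself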

 

Used value for compilation | Resulting plan
Customer 12 | Verteilung Query Plan Index Seek for all (12)
Customer 15 | Verteilung Query Plan Tablescan for all
Customer 2 | Verteilung Query Plan Index Seek for all (2)

 

When you look at the plans you will see that the plans created for customer 12 and customer 2 show a yellow warning for the customer 15 execution. This is because the plan is so expensive for customer 15 that it spools data into tempdb, which slows down the execution even more.

What we have shown is that, depending on the initial value used to compile the plan and the selectivity of that value, the plan for the exact same statement can be totally different and might not be optimal for all subsequent executions of the same statement.

But why can a plan change happen? There are many causes for a plan being evicted from the cache or recompiled. Here are only the most important and most frequent ones (a sketch of the manual triggers follows after the table):

 

Cause | When can it happen
Changes made to a table or view referenced by the query | Installation of an SAP Support Package or transport
Changes to any indexes used by the execution plan | Installation of an SAP Support Package or transport
Updates of statistics used by the execution plan, either explicitly with UPDATE STATISTICS or generated automatically | Manually or after a certain amount of row changes
The plan fell out of the LRU plan cache | When many other new plans push the plan out of the plan cache
SQL Server service restart | Only manually
An explicit call to sp_recompile to recreate the plan(s) for one statement or table | Only manually
An explicit flush of the complete plan cache | Only manually
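
For completeness, the manual triggers from the table can be issued as follows (a sketch against the demo table; do not flush the plan cache on a production system):

UPDATE STATISTICS [dbo].[ORDERS];   -- refresh statistics; dependent plans are recompiled on next use
EXEC sp_recompile N'dbo.ORDERS';    -- mark all plans referencing the table for recompilation
DBCC FREEPROCCACHE;                 -- flush the complete plan cache (test systems only)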

 

Solutions:

What can be the solutions in a case like this?

  • Create a fitting index

SQL Server suggested a new index for two of the three executions, so what will happen if we create this index?

CREATE NONCLUSTERED INDEX ORDERS_GOOD ON [dbo].[ORDERS] ([CUSTOMER_ID])
INCLUDE ([REGION],[ITEMS])
GO

The plans for the three executions are like this:

Verteilung Query Plan Good

We will get a stable plan, as the plan is the same for all three parameter values. You can see the different costs for the different values, but the plan is the optimal plan for all three statements. But sometimes it is not possible to create a good index for all possible values, so what else can we do?

  • Recompile the statement at each execution

    When we force SQL Server to compile the statement at each execution, we will get the best plan for every parameter set, but with a CPU penalty as the compilations cost CPU. Based on experience with many customer systems, however, CPU is not a limited resource anymore, especially with the new and powerful commodity servers that are available today.

    How can we force a recompilation from within the SAP system? We have to change the ABAP report to add a database hint. Given a fictional ABAP statement:

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID.

    This statement uses a variable it_CU_ID to pass the customer ID to the statement and will run on the database like one of the statements shown above. To enable the recompilation you have to add a line with the database-specific hint:

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID
    %_HINTS MSSQLNT '&REPARSE&'.

    The resulting SQL Server statement will include a WITH (RECOMPILE) clause. This statement is then compiled at every execution by SQL Server.
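
    Purely at the T-SQL level, the same per-execution compilation can be sketched with the standard OPTION (RECOMPILE) query hint (an illustration outside the SAP layer, not the exact statement generated by the SAP database interface):

    -- Sketch: force a fresh plan on every execution of the prepared statement.
    DECLARE @handle INT;
    EXEC sp_prepexec @handle OUTPUT, N'@P1 INT',
         N'SELECT REGION, ITEMS FROM ORDERS WHERE CUSTOMER_ID = @P1 OPTION (RECOMPILE)',
         @P1 = 15;
    EXEC sp_unprepare @handle;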

  • ABAP rewrite

    This is only a last-resort option and should only be used if no other solution showed a result. The idea behind it is to use the unique ABAP statement ID to get separate plans. Each ABAP statement has a unique identification, which is part of the statement (a comment block at the end of the statement). So each ABAP statement will get its own plan. When we now separate the executions for the big and the small customer from the "normal" ones, we will get a unique plan for each of them. The ABAP pseudo code might then look like:

    IF it_CU_ID = 15 THEN

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID.

    ELSE IF it_CU_ID = 2 THEN

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID.

    ELSE

    SELECT REGION, ITEMS FROM ORDERS
    INTO TABLE it_orders
    WHERE CUSTOMER_ID EQ it_CU_ID.

    ENDIF.

    The selections are the same, but the underlying ABAP statement IDs are different, and so are the plans. There might be other ways to change the ABAP to work around distribution problems. Keep in mind that the maintenance effort and error rate might be higher when you implement something like this.

More information:

Execution Plan Caching and Reuse (MSDN article)
Parameters and Execution Plan Reuse (MSDN article)
SQL Execution Plans Part 1 of 3
SQL Execution Plans Part 2 of 3
SQL Execution Plans Part 3 of 3

VMWare Vulnerability can cause Database Inconsistencies


SAP released a “Hot News” (SAP Note 2229228) on a significant problem using VMWare vSphere Storage APIs – Data Protection with VMWare ESXi 6.0.

Backing up a virtual machine (VM) with Changed Block Tracking (CBT) enabled fails after upgrading to or installing VMWare ESXi 6.0. You might encounter the following:

  • Databases running inside a VM can develop inconsistencies
  • Powering on the VM can fail
  • Expanding the size of the virtual disk can fail
  • Taking VM quiesced snapshots can fail

SQL Server does not contribute to this problem but it can be impacted by the misbehavior shown by VMWare vSphere Storage APIs – Data Protection. Your database or the backup images could become physically inconsistent, with severe impact on data availability.
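
If you suspect that a database has been affected, a physical consistency check is the first step; a minimal sketch (the database name is a placeholder for your SAP database):

-- Sketch: full physical and logical consistency check of the database.
DBCC CHECKDB (N'PRD') WITH NO_INFOMSGS, ALL_ERRORMSGS;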

 

To avoid inconsistencies caused by this problem, if you are using VMWare ESXi 6.0 move immediately to patch release ESXi600-201505001 (2116125) or later. More information on the patch can be found HERE. An upgradeable version should be available HERE.

SQL Server AlwaysOn – Summary


AlwaysOn is a very large feature in SQL Server and we needed to talk about all of its parts so that you could understand how important the different configuration steps are. But now that you want to configure an AlwaysOn landscape, going back through all 12 blog posts to compile the list of steps you need to perform would take a lot of work.

So we’ve summarized here the main steps you need to decide/perform when configuring AlwaysOn and each step links to the blog entry which describes it in detail.

Preparation:

Installation

Configuration

Production

Additional readings

You can bookmark this page so that whenever you need to configure an AlwaysOn landscape, you have this list of configuration steps. If we add any updates to the steps needed for AlwaysOn, we'll update this list as well.

If you need more in-depth explanations and information about the functionality of an AlwaysOn System, you will find it in this complete list of the blog series:

SAP MaxDB, SAP liveCache and SAP Content Server are certified to run in Microsoft Azure Cloud


We are pleased to announce certification and support for the three new SAP products on Microsoft Azure cloud:

  • SAP MaxDB database
  • SAP liveCache
  • SAP Content Server

 

SAP MaxDB Database

SAP MaxDB is one of the three databases owned by SAP. You can use the SAP MaxDB database in different contexts, for example as a standalone database or in connection with SAP NetWeaver based products such as those using:

  • SAP NetWeaver ABAP technology
  • SAP NetWeaver Java technology
  • SAP NetWeaver ABAP + Java technology

SAP currently supports SAP MaxDB version 7.9 for use with SAP NetWeaver-based products in Azure.

 

SAP liveCache

SAP liveCache product is used in connection with SAP SCM (Supply Chain Management) and SAP APO (Advanced Planner and Optimizer), which is part of SAP SCM.

SAP SCM provides robust and advanced functionality like forecasting, capacity planning, production scheduling, and so on. All this involves the execution of huge, algorithmically complex calculations. These calculations do not occur in the SAP SCM system but on an SAP liveCache system.

SAP liveCache holds a huge amount of data in memory, and in this way speeds up access to the data and optimizes the overall calculation process.

As SAP liveCache is an application that performs huge calculations, the amount and speed of RAM and CPU have a major influence on SAP liveCache performance.

For the Azure VM types supported by SAP (SAP Note 1928533 – SAP Applications on Azure: Supported Products and Azure VM types), all virtual CPU resources allocated to the VM are backed by dedicated physical CPU resources of the hypervisor. No overprovisioning (and therefore no competition for CPU resources) takes place.

Similarly, for all Azure VM instance types supported by SAP, the VM memory is backed by dedicated physical RAM resources of the hypervisor. No RAM overprovisioning takes place.

Because there is no overprovisioning of RAM and CPU, you obtain the best application CPU and RAM performance.

From this perspective it is highly recommended to use the new Azure D-series or DS-series (in combination with Azure Premium Storage) VM types, or the even bigger G-series or GS-series (in combination with Azure Premium Storage) with the latest Intel® Xeon® processor E5 v3 family.

Figure 1: SAP liveCache has to run on a dedicated Azure VM

Although SAP SCM is a SAP NetWeaver-based product that can use any supported database, SAP liveCache is based exclusively on SAP MaxDB technology.

The minimum supported version of SAP liveCache in Azure is 7.9.

 

SAP Content Server

The SAP Content Server is a separate, server-based component to store content such as electronic documents originating from different formats.

SAP Content Server is an application that runs on top of Microsoft IIS (Internet Information Server), using the SAP MaxDB database or file system to store files. An optional additional component is SAP Cache Server.

Typical content is training material and documentation from Knowledge Warehouse or technical drawings originating from the mySAP PLM Document Management System.

 

Figure 2: SAP Content Server in Azure

 

SAP Content Server in Azure is supported with version 6.50 (and higher), running on Microsoft IIS (Internet Information Server) version 8.0 (and higher). If you choose to store the documents in the SAP MaxDB database, you have to use SAP MaxDB version 7.9.

You can find more details on SAP MaxDB for SAP NetWeaver, SAP liveCache, and SAP Content Server in the SAP DBMS in Azure Deployment Guide

 

Useful SAP Notes & Links


SAP Notes

767598 – Available SAP MaxDB documentation

826037 – FAQ: SAP MaxDB/liveCache Support

1139904 – FAQ: SAP MaxDB/liveCache database parameters

1173395 – FAQ: SAP MaxDB and liveCache configuration

1619726 – FAQ: SAP MaxDB Content Server

1928533 – SAP Applications on Azure: Supported Products and Azure VM types

 

Links

http://scn.sap.com/community/maxdb

http://scn.sap.com/community/scm/apo/livecache

https://service.sap.com/contentserver 

 

 

New White Paper on Sizing SAP Solutions on Azure Public Cloud


We have released a new white paper detailing how to size SAP solutions on Microsoft Azure public cloud. The document is attached to this blog.

Initial work on this topic is mentioned in this blog: How to Size SAP Systems Running on Azure VMs.

 

Microsoft is Executing and Publishing SAP SD Standard Application Benchmark on Azure

Sizing is an important topic when deploying a SAP solution since we want to make sure we have enough computing resources to guarantee us the performance needed.

As Microsoft is a cloud provider, we are responsible for executing and publishing official SAP 2-tier and 3-tier benchmarks. All the SAP benchmarks certified and released on Azure can be found here: http://www.sap.com/benchmark and are also listed in SAP Note 1928533 – SAP Applications on Azure: Supported Products and Azure VM types

Recently we announced the World Record SAP Sales and Distribution Standard Application Benchmark for SAP cloud deployments, released using Azure IaaS VMs

Each benchmark specifies the exact configuration, VM size, and storage type used for the benchmark.

The sizing process is nevertheless a complex topic. To help our customers and partners with this process, we have developed this document as a set of guidelines to help in the sizing process.

 

“T-Shirt” Sizing

This document provides “T-shirt” sizing primarily for 3-tier SAP solutions running on Azure, as 3-tier is more suitable for the productive SAP use case. The document also discusses 2-tier configurations for SAP solutions on Azure.

 

Picture 1: T-shirt Sizing

 

 

Covered SAP Applications and Components

As different SAP applications and components can have very different workload characteristics and requirements, the document also reflects these different applications, such as SAP ECC 6.0, SAP Business Objects, SAP liveCache, SAP Content Server, SAP Solution Manager, SAP BI Java and Enterprise Portal, SAP XI/PI, SAP NetWeaver Gateway, CTM Optimizer, SAP TREX etc. The document also reflects different components of the SAP NetWeaver technology stack, such as SAP application servers, SAP database server, and SAP ASCS/SCS instance.

 

Azure Premium Storage

A special chapter is dedicated to Azure fast SSD premium storage, since fast storage is crucial for performant storage-intensive IO workloads like database servers.

 

Advantages of Dynamic SAP Resizing on Azure Public Cloud

Many customers who run their SAP systems on-premises are oversizing their infrastructure to make sure they can cover the peak load. Peak time can happen for example once in three months (during the quarter close) and can last for a few days. But most of the time, there is a much lower load on the system compared to peak time, which means that infrastructure resources dedicated to SAP system are underutilized.

One huge advantage of Azure cloud compared to on-premises infrastructure is the ability to dynamically increase resources (to cater for increased demand) or decrease resources if resources are underutilized (to save money). Either way, you are only paying for what is used and when it is used.

Here is an example of scaling out the SAP NetWeaver ABAP application server layer, where the SAP workload is based on interactive SAP user workload running on two SAP application servers on two Azure VMs (the same principle can be applied to SAP batch, RFC, and update workload). The SAP user workload is automatically balanced by logon (SMLG) groups.

As new users are added to an SAP application server, performance starts to degrade after a certain point. Additional new workload degrades application server performance further, as the servers are too busy serving the currently logged-on users. In extreme cases the application server appears to hang.


  
  

Picture 2: Scale Out of SAP Application Server (AS) – Initial State

 

The solution to this problem is to dynamically scale out the SAP application server layer, e.g. to add a third SAP application server to handle the new SAP users. We have already prepared additional application servers (AS3 and AS4), which are pre-deployed on two VMs. These two VMs are initially in an offline state and are not generating cost (except a very minor storage cost).

We can start the new VM with the SAP application server either manually or by using Azure Autoscale for automatic scale-out.

 

 

Picture 3: Scale Out of SAP Application Server (AS) – End State after the scale out

 

In a similar way, when the SAP user load drops, we can scale in the SAP application server layer, e.g. we can stop some SAP application server(s) and shut down the VM(s), thus reducing infrastructure costs.

Generally speaking, it is recommended to review infrastructure utilization about every three months to determine whether the provisioned Azure resources still match the demand of the SAP workload.

You can also actively monitor utilization, try to recognize the utilization patterns and identify peaks, and proactively resize Azure infrastructure for your SAP system.

SAP NetWeaver Sizing SAP Solutions on Azure Public Cloud.docx

Improvements of SAP (BW) System Copy


For performing an r3load-based system copy of an SAP system you should use the SAP Software Provisioning Manager (SWPM). For a BW system it is mandatory to run the report SMIGR_CREATE_DDL before the system copy and RS_BW_POST_MIGRATION after the system copy. Both reports have been improved with the new SAP BW code delivered for SQL Server 2014 Columnstore. These improvements also benefit system copies to SQL Server 2012.

The procedure of using SWPM for a homogeneous system copy using attach or restore has not changed. This blog post only covers an r3load-based homogeneous or heterogeneous system copy with SQL Server 2012 and 2014 (or higher) as the target database.

Overview: System Copy of SAP BW

For an r3load-based system copy of an SAP BW system, perform the following steps:

  • On the source system, install the necessary SAP BW support packages and correction instructions for SQL Server 2014 Columnstore as described in SAP note 2114876 – Release Planning SAP BW for SQL Server Column-Store. You should install this code even when using SQL Server 2012.
  • On the source system, install the latest patches of the new version of report SMIGR_CREATE_DDL as described in SAP Note 888210 – NW 7.**: System copy (supplementary note). This includes a patch for report RS_BW_POST_MIGRATION as described in SAP Note 2230491 – use parallelism in RS_BW_POST_MIGRATION.
  • On the source system, run SMIGR_CREATE_DDL and choose the target database. You have the following new options:

    You might create all cubes on the target system with an identical columnstore definition. This is the recommended way for a heterogeneous system copy. You have the following options:
    • SQL Server 2014 (all column-store)
      This option creates all cubes on the target system with a writeable columnstore.
      It requires SQL Server 2014 or higher on the target Server.
    • SQL Server 2012 (all column-store)
      This option creates all cubes on the target system with a read-only columnstore.
      It requires SQL Server 2012, 2014 or higher on the target Server.
    • SQL Server 2012 (all row-store)
      This option creates all cubes on the target system without any columnstore index.
      It requires SQL Server 2012, 2014 or higher on the target Server.

    Alternatively, you might keep the columnstore definition of the cubes as they were originally defined in report MSSCSTORE. You have the following options:

    • SQL Server 2014
      This option creates all cubes on the target system using the same columnstore definition as on the source system.
    • SQL Server 2012
      This option creates all cubes on the target system using the same columnstore definition as on the source system. However, cubes with writeable columnstore indexes on the source system will be created as cubes with read-only columnstore indexes on the target system.
  • Decide whether you want to use table splitting. Table splitting makes the best use of the available CPU threads during data export and import. However, it requires much more SQL Server memory and increases the likelihood of database errors (out of locks, out of memory). A system copy with table splitting is more complicated and has to be tested more intensively than a system copy without table splitting. Be aware that the data export of BW fact tables may take longer than expected when using table splitting, because BW fact tables do not have a clustered primary key.
  • Run SWPM as described in the SAP System Copy guide
    • On the target system, set SQL Server trace flag 610
    • Use at least as many r3load processes as available CPU threads on the target system
  • On the target system, do not forget to run report RS_BW_POST_MIGRATION in any case. The report may take several hours, since it creates all columnstore indexes.

SQL Server configuration

SQL Server Trace Flag 610

By default, R3load uses the SQL Server bulk copy interface, which results in minimally logged bulk inserts. However, for tables having a clustered index, minimal logging only works for the first batch (into an empty table). You can fully enable it by setting SQL Server trace flag 610. See SAP Note 1241751 – SQL Server minimal logging extensions and https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx. One of the benefits of using trace flag 610 is the reduced memory consumption for SQL Server locks.
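
As a minimal sketch, the trace flag can be enabled for the running instance as shown below (alternatively, add -T610 as a SQL Server startup parameter so it survives an instance restart):

-- Enable minimally logged bulk inserts into indexed tables (see SAP Note 1241751);
-- -1 applies the trace flag globally to the running instance
DBCC TRACEON (610, -1);
-- Verify that the trace flag is active
DBCC TRACESTATUS (610);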

SQL Server memory

SQL Server memory consumption can be very high during data import, in particular when creating indexes in parallel, loading tables in parallel (by using table splitting) or when using a high DB commit size. A memory shortage may result in SQL error 8645 (A timeout occurred while waiting for memory resources) or SQL error 1204 (The SQL Server cannot obtain a LOCK resource) during the data load. However, you can simply repeat the step “IMPORT ABAP” if these errors occur.
SWPM configures SQL Server memory to 40% of the physical server memory for a central SAP system. This is a good starting point for a productive system with SQL Server and the SAP application server running on the same box. However, the SAP system being loaded is not running during the data import. Therefore, you might want to increase SQL Server memory during data import to 75% (or more) of the physical server memory. As of SWPM 1.0 SP10 you can configure SQL Server memory as a custom configuration option.
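
For illustration, a sketch of temporarily raising the memory limit with sp_configure; the value of 196608 MB assumes a server with 256 GB of RAM, so adjust it to your hardware and set it back after the import:

-- 'max server memory (MB)' is an advanced option
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Allow SQL Server to use roughly 75% of a 256 GB server during the import
EXEC sp_configure 'max server memory (MB)', 196608;
RECONFIGURE;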

SQL Server parallelism

Using SQL Server parallelism decreases index creation time, in particular for columnstore indexes. However, parallel index creation consumes much more memory, which may result in a memory bottleneck under high workload. SMIGR_CREATE_DDL does not create a MAXDOP optimizer hint for index creation. Therefore, the default SQL Server configuration option max degree of parallelism is used during data import. SWPM sets max degree of parallelism to 1. Typically, this is the best configuration since it keeps memory consumption low during data import.
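
For reference, this is the instance-level setting SWPM applies; a sketch of how to set it manually:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Disable intra-query parallelism at the instance level, as recommended for SAP
EXEC sp_configure 'max degree of parallelism', 1;
RECONFIGURE;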

Frequency of DB commits

There are three parameters that impact the frequency of DB commits during the database load.

  • The R3load Batch Size is currently not configurable and has a default size of 1 MB. By dividing the R3load batch size by the size of a row (as defined in the SAP Data Dictionary), you get the number of rows within an R3load batch. The batch size has an impact on the R3load Commit Size, because R3load only sends DB commits between distinct batches.
  • The R3load Commit Size has a default value of 10,000. After each batch, R3load checks the number of rows processed since the last commit. If more than 10,000 rows have been processed, a DB commit is executed.
  • The BCP Batch Size has a default of 10,000. It is used by the SAP DBSL for Microsoft SQL Server. The DBSL sends additional DB commits within the same R3load batch if the number of uncommitted, processed rows is higher than the BCP batch size. You can configure the BCP batch size by setting the environment variable BCP_BATCH_SIZE.

You can increase the frequency of DB commits by reducing the BCP batch size. There is no need to change the R3load batch size or the R3load commit size. Decreasing the frequency of DB commits is normally counterproductive when loading data in parallel (which is always the case in SWPM). You can reduce the BCP batch size to 5,000 by setting the environment variable BCP_BATCH_SIZE to 5000.

A reduced BCP batch size reduces the likelihood of deadlocks if table splitting is used. Furthermore, it reduces the memory consumption if trace flag 610 is not used. When using trace flag 610 and no table splitting, the value of the BCP batch size should not matter.

Creating columnstore indexes

The following changes are implemented in SMIGR_CREATE_DDL and RS_BW_POST_MIGRATION (once you have applied all required SAP Notes, as described in SAP Note 888210 for Microsoft SQL Server).

Choosing columnstore index in SMIGR_CREATE_DDL

After performing a system copy, you typically want to have a columnstore index on all BW cubes on the target system. Therefore, we recommended the following procedure in the past: run report MSSCSTORE on the source system (e.g. Oracle) to define all BW cubes as columnstore cubes. Since the source system is not SQL Server, this has no impact on the source system. However, when performing a homogeneous system copy, the changes in MSSCSTORE would change the source system, which is typically not intended.

Therefore we changed SMIGR_CREATE_DDL. You now have the option to convert all BW cubes to columnstore without any impact on the source system. You simply select SQL Server 2014 (all column-store) or SQL Server 2012 (all column-store) as the target database. There is no need to run report MSSCSTORE anymore. For performing a system copy, you do not even need to know about report MSSCSTORE at all.

Creating columnstore index in RS_BW_POST_MIGRATION

Creating columnstore indexes is very memory intensive. It might result in out-of-memory errors during the high workload of the SWPM phase “IMPORT ABAP”. Therefore, columnstore indexes are not created by SWPM. They are created after the data import by running SAP report RS_BW_POST_MIGRATION. The report uses two kinds of parallelism for creating the columnstore indexes: it creates many columnstore indexes at the same time (by using SAP RFC calls), and it uses many CPU threads for creating a single columnstore index. The degree of parallelism can be changed with the RSADMIN parameter MSS_MAXDOP_INDEXING, but the default value of 8 is normally sufficient. RS_BW_POST_MIGRATION automatically repeats the creation of a columnstore index if it fails (with out-of-memory errors).
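
To illustrate the per-index parallelism, the statements executed per fact table are roughly of this shape; the table, column and index names below are made up for illustration, the real statements are generated by the report:

-- Hypothetical example: nonclustered columnstore index on a BW e-fact table,
-- limited to 8 CPU threads (mirroring RSADMIN parameter MSS_MAXDOP_INDEXING)
CREATE NONCLUSTERED COLUMNSTORE INDEX [/BIC/EMYCUBE~CS]
    ON [/BIC/EMYCUBE] ([KEY_MYCUBEP], [KEY_MYCUBET], [KEY_MYCUBEU], [KEY_MYCUBE1])
    WITH (MAXDOP = 8);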

Creating DB partitions

For SQL Server 2012, 2014 or higher you do not have to take care of DB partitions. SMIGR_CREATE_DDL automatically creates the required partitions up to the maximum of 15,000 partitions.

SQL Server 2008 (R2) originally had a limitation of a maximum of 1,000 partitions per table. This caused some issues when migrating a system to SQL Server. Typically this issue only occurs for the f-fact table of a BW cube. It is commonly assumed that this is caused by more than 1,000 partitions on the source system. In fact, the problem occurs when there are more than 1,000 loaded requests in the f-fact table. SMIGR_CREATE_DDL creates a partition for each loaded request, independent of the actual number of partitions on the source system. Therefore it does not help to analyze the number of partitions on the source system. Instead, simply look at the output of SMIGR_CREATE_DDL: it gives you a warning for all tables that will have more than 1,000 partitions on the target system.

This warning was intended for SQL Server 2008 (R2) and older as a target release of the system copy. Unfortunately, SMIGR_CREATE_DDL currently also gives this warning for SQL Server 2012 and 2014. This issue will be fixed in the next BW support packages.

Summary

Follow the SAP system copy guide and check for the newest bug fixes described in SAP Note 888210. Use the system copy to convert all cubes to columnstore: Choose the option SQL Server 2014 (all column-store) or SQL Server 2012 (all column-store) for system copies with SQL Server 2012 or 2014 as target release.

 


The Azure Monitoring Extension for SAP on Windows – Possible Error Codes and Their Solutions


The Azure Enhanced Monitoring Extension for SAP provides configuration information and performance data of the underlying Azure infrastructure and makes it available to the SAP application. It contains built-in self-diagnostics which enables identification of the health and completeness of the infrastructure data required by SAP services and support.

For details on the setup and for possible health checks of the Azure Monitoring infrastructure, see SAP NetWeaver on Azure Virtual Machines – Deployment Guide and check SAP Note 1999351 for known issues. For a list of metrics provided in SAP transaction ST06, see SAP Note 2178632.

After the Azure Enhanced Monitoring Extension for SAP has been set up, the Windows service AzureEnhancedMonitoring runs inside the VM and collects all Azure-related data. This data is then consumed by the SAP Host Agent and by saposcol.exe for further use in the SAP transaction ST06.

Picture: SAP Azure Monitoring Architecture

SAP Host Agent (e.g. saposcol.exe) and the SAP system consume data from the Azure Enhanced Monitoring service. The Azure Enhanced Monitoring service is based on the Azure Monitoring Extension for SAP. The Azure Enhanced Monitoring Extension for SAP collects data from different sources:

  • some local Windows performance and configuration data
  • configuration data (config.xml file) stored during setup of the VM
  • Azure storage analytics data (used if you use a standard storage account for your SAP deployment)
  • Windows Azure Diagnostics (WAD) data (by accessing the Azure storage account tables)

As both the Azure Enhanced Monitoring service and the SAP monitoring are based on the underlying Azure Enhanced Monitoring Extension, issues in the setup of the extension directly lead to an unhealthy status of the Azure Monitoring service as well as of the SAP monitoring.

Identify the Health Status of the Azure Enhanced Monitoring

To check the health status of the Azure Enhanced Monitoring service, log on to the VM and proceed as follows:

  1. Open the Windows services and select the service ‘AzureEnhancedMonitoring’.
  2. Right-click on the service name and choose ‘Properties’. In the upcoming window, check for the field ‘Path to executable’ (this field points to the installation folder of the Azure Monitoring Extension). This path is similar to
    ‘C:\Packages\Plugins\Microsoft.AzureCAT.AzureEnhancedMonitoring.AzureCATExtensionHandler\<version>\drop’ , where version indicates the current version of the extension.
  3. Open a command prompt and switch to the installation folder indicated above.
    In this folder, you can find the command line tool azperflib.exe which enables testing of the monitoring information provided by the Azure extension and its health status.
  4. Execute ‘azperflib.exe’. Note: azperflib.exe runs in a loop and updates the collected counters every 60 seconds. In order to finish the loop, you need to close the command window.
  5. Pay attention to the summary metrics ‘Health status’ and ‘Diagnostics’, which are shown at the end of each loop of azperflib.exe:
    Health status: OK
    Diagnostics: OK

    The Health status is either OK or Failed:
    – If the Health status is ‘OK’, there is nothing more to do.
    – If the Health status is ‘Failed’, the field ‘Diagnostics’ contains further details on the root causes of the health issues.

    Note: These root causes might be related to problems accessing WAD or the storage accounts used, to misconfigurations, or to changes in the configuration of the VM after the setup script was executed.

    Use the Error IDs indicated in the ‘Diagnostics’ metric and follow up with their solutions provided in the table below.

  6. Repeat steps 4 and 5 until azperflib.exe reports the ‘Health status’ ‘OK’.
  7. As the health status of the SAP Monitoring is directly influenced by the health status of the Azure extension, a healthy status of the Azure Enhanced Monitoring is always reflected in a healthy status of the SAP monitoring.

Error Codes of the Azure Extension and Their Interpretation
 
 

| Error ID | Error description | Solution |
|----------|-------------------|----------|
| cfg/018 | App configuration is missing. | run setup script |
| cfg/019 | No deployment ID in app config. | contact support |
| cfg/020 | No RoleInstanceId in app config. | contact support |
| cfg/022 | No RoleInstanceId in app config. | contact support |
| cfg/031 | Cannot read Azure configuration. | contact support |
| cfg/021 | App configuration file is missing. | run setup script |
| cfg/015 | No VM size in app config. | run setup script |
| cfg/016 | GlobalMemoryStatusEx counter failed. | contact support |
| cfg/023 | MaxHwFrequency counter failed. | contact support |
| cfg/024 | NIC counters failed. | contact support |
| cfg/025 | Disk mapping counter failed. | contact support |
| cfg/026 | Processor name counter failed. | contact support |
| cfg/027 | Disk mapping counter failed. | contact support |
| cfg/038 | The metric ‘Disk type’ is missing in the extension configuration file config.xml. ‘Disk type’, along with some other counters, was introduced in v2.2.0.68 on 12/16/2015. If you deployed the extension prior to 12/16/2015, it uses the old configuration file. The Azure extension framework automatically upgrades the extension to a newer version, but config.xml remains unchanged. To update the configuration, download and execute the latest PowerShell setup script. | run setup script |
| cfg/039 | No disk caching. | run setup script |
| cfg/036 | No disk SLA throughput. | run setup script |
| cfg/037 | No disk SLA IOPS. | run setup script |
| cfg/028 | Disk mapping counter failed. | contact support |
| cfg/029 | Last hardware change counter failed. | contact support |
| cfg/030 | NIC counters failed. | contact support |
| cfg/017 | Due to sysprep of the VM, your Windows SID has changed. You need to redeploy the Azure monitoring extension for SAP as described in section 3, “Redeploy after sysprep”. | redeploy after sysprep |
| str/007 | Access to the storage analytics failed. As population of storage analytics data on a newly created VM may need up to half an hour, the error might disappear after some time. If the error still appears, re-run the setup script. | run setup script |
| str/010 | No Storage Analytics counters. | run setup script |
| str/009 | Storage Analytics failed. | run setup script |
| wad/004 | Bad WAD configuration. | run setup script |
| wad/002 | Unexpected WAD format. | contact support |
| wad/001 | No WAD counters found. | run setup script |
| wad/040 | Stale WAD counters found. | contact support |
| wad/003 | Cannot read WAD table. There is no connection to the WAD table. Possible causes: 1) outdated configuration, 2) no network connection to Azure, 3) issues with the WAD setup. | run setup script / fix internet connection / contact support |
| prf/011 | Perfmon NIC metrics failed. | contact support |
| prf/012 | Perfmon disk metrics failed. | contact support |
| prf/013 | Some perfmon metrics failed. | contact support |
| prf/014 | Perfmon failed to create a counter. | contact support |
| cfg/035 | No metric providers configured. | contact support |
| str/006 | Bad Storage Analytics config. | run setup script |
| str/032 | Storage Analytics metrics failed. | run setup script |
| cfg/033 | One of the metric providers failed. | run setup script |
| str/034 | Provider thread failed. | contact support |

Detailed Guidelines on Solutions Provided

1. Run the setup script

Download the latest version of the SAP-specific PowerShell cmdlets.

For more information, read the SAP NetWeaver on Azure Virtual Machines – Deployment Guide

Please (re-)run the setup script.

Note that some counters might need up to 30 minutes for provisioning.

If the errors do not disappear, contact support.

2. Contact support

This is an unexpected error, or there is no generally known solution. Collect the AzureEnhancedMonitoring_service.log file located in the folder C:\Packages\Plugins\Microsoft.AzureCAT.AzureEnhancedMonitoring.AzureCATExtensionHandler\<version>\drop and contact support for further assistance.

3. Redeploy after sysprep

 

If you plan to build a generalized sysprep’ed OS image (which can include SAP software), it is recommended that this image does not include the Azure monitoring extension for SAP. You should install the Azure monitoring extension for SAP after the new instance of the generalized OS image has been deployed.

However, if your generalized sysprep’ed OS image already contains the Azure monitoring extension for SAP, you can apply the following workaround to reconfigure the Azure monitoring extension for SAP, on the newly deployed VM instance:

a) On the newly deployed VM instance delete the content of the following folders:

C:\Packages\Plugins\Microsoft.AzureCAT.AzureEnhancedMonitoring.AzureCATExtensionHandler\<Version>\RuntimeSettings
C:\Packages\Plugins\Microsoft.AzureCAT.AzureEnhancedMonitoring.AzureCATExtensionHandler\<Version>\Status

b) Run the SAP specific PowerShell script against the newly deployed VM instance.

4. Fix internet connection

The Microsoft Azure Virtual Machine running the SAP monitoring requires access to the Internet. If this Azure VM is part of an Azure Virtual Network or of an on-premises domain, make sure that the relevant proxy settings are set. These settings must also be valid for the LocalSystem account to access the Internet. For more details, check the SAP NetWeaver on Azure Virtual Machines – Deployment Guide.

In addition, if you need to set a fixed static IP address for your Azure VM, do not set it manually inside the Azure VM, but set it using Azure PowerShell or the Azure portal (https://azure.microsoft.com/en-us/blog/static-internal-ip-address-for-virtual-machines/). The fixed IP is propagated via the Azure DHCP service.

Manually setting a static IP address inside the Azure VM is not supported, and it might lead to problems with the SAP monitoring extension.

Always On – Synchronize SAP login, jobs and objects


SQL Server AlwaysOn is one of the high availability solutions available for an SAP system. It consists of two or more computers, each hosting a SQL Server instance with a copy of the SAP database. A listener points to the current primary copy and is used by the SAP system as the only connection point. For details on how to set up and configure an SAP system together with SQL Server AlwaysOn, see this blog post and its referenced blog posts.

During the setup, the SAP system is configured from the current primary node, and all non-database objects such as SQL Server Agent jobs, logins etc. are created only on the current primary SQL Server instance. In case of an (automatic) failover to one of the secondary nodes of AlwaysOn, these objects are then missing. Jürgen introduced a script (sap_helprevlogin) in his initial blog post about the database load after setting up AlwaysOn. This script will transfer only the logins, but falls short on transferring jobs, server level permissions and other assignments.

One of the SAP developers working in our team has built a comprehensive PowerShell script (sap_synchronize_always_on.ps1) to perform all these tasks and to transfer all the SAP objects from the initial installation to all other nodes of the AlwaysOn system. The script connects to the primary instance, reads the configuration of the secondary nodes and then synchronizes the objects and jobs with these nodes. The script must be executed by a domain administrator who has SQL Server sysadmin privileges on all AlwaysOn instances.

The script uses up to three input variables:

  1. The server name of the SQL Server instance or the listener name of the High-Availability group. The default is (local).
  2. The name of the SAP database, which must be in a High-Availability group on the given server.
  3. Single login (optional): Only one login gets copied along with SAP CCMS jobs owned by the login. By default all logins mapped to the database are copied.

The script will execute:

  1. Create a procedure CheckAccess in the master database (see this blog post for details about it)
  2. Discover which logins are mapped to the database
  3. Discover which SAP CCMS jobs belong to those logins
  4. If the job does not use CheckAccess then change the job step to use CheckAccess and run the job step in master
  5. Open a connection to each secondary and:
    1. Create the procedure CheckAccess in the master database
    2. Create the logins if they do not exist already, using the same SID (see the sketch right after this list)
    3. Create the jobs if they do not exist already
    4. If a job exists and the job does not use CheckAccess, change the job step to use CheckAccess and run it in master
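
As a minimal sketch of what step 5.2 does for logins; the login name, password and SID below are placeholders, not taken from the script:

-- Hypothetical example: recreate the SAP login on a secondary replica with the
-- same SID it has on the primary, so the database user mapping stays valid
IF NOT EXISTS (SELECT 1 FROM sys.server_principals WHERE name = N'prd')
    CREATE LOGIN [prd]
        WITH PASSWORD = N'<same password as on the primary>',
             SID = 0x912EC803B2CE49E4A541068D495AB570,
             CHECK_POLICY = OFF;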

If new SAP CCMS jobs are added because of remote monitoring from a different SAP system using DBACOCKPIT, the script can be re-executed. It will then copy only new objects which have not been copied before.

You can find this useful script attached; it makes the synchronization of the SAP systems in an AlwaysOn environment so much easier. Please ensure that you test the execution in your test environment first, before you run it in production. Neither SAP nor Microsoft takes any responsibility for the use of this script; you run it at your own risk.

Best regards | Bless!

Clas & Guðmundur

sap_synchronize_always_on

Whitepaper SAP on SQL Server 2012 and SQL Server 2014


It has been a long while since we published the last whitepaper about SAP workload on SQL Server. It has also been a long while that we have been working on a successor paper. Too long, because we needed to revise the document several times to include new product releases and new functionality. Since we feel that we have reached a good state now, we wanted to release the paper in this blog. We will also work with SAP to get the paper into SCN, but for now this is the latest and greatest. If there are already more recent releases of products like System Center Data Protection Manager or other products we introduce here, we apologize; we will try to correct that with the next release of the paper. You might miss references to SQL Server, or SQL Server and SAP deployments, on Azure Infrastructure as a Service. For this specific area, we have a whole lot of documentation collected here: https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-sap-getting-started/. Did the paper go a bit overboard and turn into more of a book? Well, we guess so. Let's see how we can provide the same information in a shorter cadence and in a less voluminous form in the future. This blog site could play a more important role in providing information that we have so far tried to bundle in papers like this one.

Have fun

SAP_SQL2012_SQL2014_Best Practices v1.0

 

SQL Server has encountered I/O requests taking longer than 15 seconds


Most of us have seen these errors in the SQL Server error log pointing to an IO subsystem problem:

2015/06/15 12:43:01 spid8s SQL Server has encountered 1016 occurrence(s) of I/O requests taking longer than 15 seconds
to complete on file [N:\ABCDATA8\ABCDATA8.ndf] in database [ABC] (5). The OS file handle
is 0x000000000000172C.  The offset of the latest long I/O is: 0x0000073234

For a customer case this message had to be analyzed in more detail, because the customer's storage team insisted on getting more information, as they did not see these high numbers on the storage system at that time. A deeper investigation revealed these findings:

  • the counter is collected and reported for each file separately
  • the message is printed by the lazy writer, the background process that kicks in every 5 seconds
  • it is printed at a minimum interval of 300 seconds (5 minutes) per file, meaning only one message per 5 minutes per file
  • it checks the number of ticks (milliseconds) IO requests have been active, and if this exceeds 15,000 (15 seconds), the IO request is counted
  • it does this for all parallel and asynchronous IO requests (per file)

The problem is that this is not the number of distinct IO requests, but the number of times long-running IO requests were seen during the reporting interval. Imagine you have only one very long IO request going on. The reporting kicks in every 5 seconds (by the lazy writer), and after 3 cycles (15 seconds) the IO request is counted for the first time, but no message is printed yet as the five-minute minimum per file has not been reached. The very same IO request is counted again at each reporting cycle (every 5 seconds) until the 300-second threshold is reached and the message is printed (and the counter is reset). Until then this one IO request has been counted 57 times (300 seconds / 5 seconds reporting interval = 60 occurrences, minus the 3 cycles of the first 15 seconds).

So 1,000 occurrences means that within the 60 reporting cycles of the 5-minute reporting interval, IO requests taking longer than 15 seconds were seen 1,000 times. That might have been only a handful of very long-running IO requests (at a minimum, 1000 / 57 ≈ 17.5 requests) or many shorter requests (e.g. 1,000 requests of 15–19 seconds each).

The message is misleading, as it talks of occurrences and not of IO requests, but it still points to a storage problem. Note well, however, that even a single such message means you had at least one event where an I/O took more than 15 seconds to complete. This is very slow and indicates a problematic I/O event. There are just not as many IO requests hanging as the message suggests, but there are at least some.
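
If you want to see what is pending right now while the problem occurs, a query along these lines (a sketch, requires VIEW SERVER STATE permission) shows the currently outstanding I/O requests and how long they have been waiting:

-- List I/O requests currently pending at the OS level, longest waiting first,
-- and map them to the affected database files
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       pio.io_pending,
       pio.io_pending_ms_ticks
FROM sys.dm_io_pending_io_requests AS pio
JOIN sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
     ON pio.io_handle = vfs.file_handle
JOIN sys.master_files AS mf
     ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
ORDER BY pio.io_pending_ms_ticks DESC;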

I hope this sheds some light on it.

SQL Server 2016 AlwaysOn for SAP


We know we are a bit late. But honestly, we were very busy, also with exploring the great SQL Server 2016 features. That is why we have not written anything about SQL Server 2016 yet, despite the fact that we contributed as a team to the last phase of development and to the SQL Server 2016 launch. Over the course of the next few months we will pick topics of SQL Server 2016 as they relate to SAP, or to the task of operating and administrating SQL Server underneath SAP systems.

The first topic we want to touch on is the changes to AlwaysOn in SQL Server 2016.

SQL Server AlwaysOn is meanwhile used by a lot of SAP customers for local high availability, but also to provide disaster recovery capabilities. In the past four years we got a lot of feedback from you as customers. Fortunately, the SQL Server 2016 development cycle gave us enough possibilities to incorporate many details of the feedback we received.

Feedback #1: SQL Server AlwaysOn is not checking the health of a single database within the Availability Group

When we developed AlwaysOn, we introduced a new unit of failover, the Availability Group (AG). In contrast to Database Mirroring, where all the mechanics and the unit of failover were strictly around a single database, an AlwaysOn Availability Group was introduced as a unit that can hold multiple databases, which in case of a SQL Server instance issue fail over together to a secondary replica. What was not covered by AlwaysOn Availability Groups at the point of introduction were failure scenarios that affected single databases within an AG. In other words, the health of each individual database within an Availability Group did not get checked. As a result, a database in an AG could close, e.g. because the LUN of one or multiple data files was not reachable anymore, without actually triggering a failover of the AG. This was the reason for some customers to stay with Database Mirroring, where such an unexpected close of a database would lead to a failover.

With SQL Server 2016, we introduced database health checks as well. These can be individually chosen when creating the AG, as shown below:

Picture: Database level health detection option when creating the Availability Group

Once an AlwaysOn AG is created, the setting can also be changed and activated later on the property page of the Availability Group, as shown here:

Picture: Database level health detection setting on the property page of an Availability Group

Enabling database level health checks makes it possible to trigger a failover of the whole AG in case one of the databases within the AG encounters problems that force SQL Server to close the database. This means the behavior is similar to Database Mirroring when it comes to reacting to an unexpected close of a database.
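
The same setting can also be applied with T-SQL; a minimal sketch for an existing Availability Group:

-- Enable database level health detection for an existing Availability Group
ALTER AVAILABILITY GROUP [<Name of AG>] SET (DB_FAILOVER = ON);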

Feedback #2: Redo can’t keep up with the changes transmitted to the secondary

There are several bottlenecks that can occur with AlwaysOn. Most of the ones relevant in typical SAP scenarios have been described in earlier articles on this blog.

The one issue some SAP customers were indeed reporting, during workloads where high data volumes are modified, was that the redo thread on secondary replicas could not keep up with the incoming stream of changes. Just a short background on what happens in such a situation:

  • AlwaysOn sends the transaction log records to the secondary replica.
  • In case of the synchronous availability mode, the acknowledgement to the primary replica is sent at the point the data is persisted in the transaction log of the secondary replica, not earlier and not later.
  • Though the data can then be regarded as persisted on the secondary replica, it is not yet applied to the data or index pages of the database itself.
  • Applying the data to the data and index pages is the task of a redo thread.
  • In contrast to Database Mirroring, AlwaysOn in SQL Server 2012 and 2014 uses only one (1) redo thread per database within an Availability Group to apply the incoming changes to the associated database. This was a design decision to enable read-only capabilities when AlwaysOn got designed.
  • As a result of heavy data manipulation workload on large servers, often with more than 100 CPU threads, situations could occur where all the changes got persisted in the transaction log of a database within an AG, but the single redo thread simply could not keep up with applying the changes persisted in the log.
  • With that, a redo log queue builds up.
  • As long as there is no read-only activity and no failover happening, this does not cause any impact. However, in case of a failover, the database on the secondary replica can only open after the redo log queue has been worked down and all the changes have been applied to the data. In the case of a read-only secondary, you would simply read a slightly obsolete state of the data.
  • Hence it is not desirable to get to a state where a redo log queue builds up on the secondary replica.

SQL Server 2016 solves this problem with parallel redo functionality on the secondary replicas. The number of redo threads per database is determined as a function of the number of CPU threads available on the secondary replica. In SAP scenarios we were sometimes confronted with this situation. Given the relatively large servers and VMs in use with SAP customers, parallel redo engaging several threads should resolve those rare occurrences as well.

Feedback #3: Seeding the databases is a tedious business

To make seeding of a secondary replica easier, SQL Server 2016 introduces the same mechanism that SQL Azure Database uses to create secondary replicas. This means NOT creating a backup and restoring that backup, but opening a network pipe to the new secondary replica, creating the secondary database on the replica and then transferring the data. Here is how it basically works. The setup of automatic seeding is not supported with the SQL Server management tools. Instead, automatic seeding needs to be applied to a replica when creating the AG, or it can be applied to a replica of the AG after it has been created.

This means the command to create an AG can look like:

CREATE AVAILABILITY GROUP [<Name of AG>] WITH (AUTOMATED_BACKUP_PREFERENCE = SECONDARY,
DB_FAILOVER = ON, DTC_SUPPORT = NONE)
FOR DATABASE [<database name>]
REPLICA ON N'<name of replica 1>' WITH (ENDPOINT_URL = N'TCP://<FQD name of replica 1>:5022',
FAILOVER_MODE = AUTOMATIC, AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, BACKUP_PRIORITY = 50,
SECONDARY_ROLE(ALLOW_CONNECTIONS = NO), SEEDING_MODE = AUTOMATIC),
N'<name of replica 2>' WITH (ENDPOINT_URL = N'TCP://<FQD name of replica 2>:5022',
FAILOVER_MODE = AUTOMATIC, AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, BACKUP_PRIORITY = 50,
SECONDARY_ROLE(ALLOW_CONNECTIONS = NO), SEEDING_MODE = AUTOMATIC);

The last option, SEEDING_MODE = AUTOMATIC, indicates that the seeding of the databases within the AG should be done automatically.

After executing the command above on the primary replica, you go to the secondary replica and join it to the AG with a command like:

ALTER AVAILABILITY GROUP [<Name of AG>] JOIN;

As a necessary third step you also need to grant the AG the permission to create a database on the SQL Server instance of the secondary replica. This is done with this command:

ALTER AVAILABILITY GROUP [<Name of AG>] GRANT CREATE ANY DATABASE;

When adding a new replica to an existing AG, it can happen that the creation of the database and its synchronization does not start after executing the commands above on the secondary replica. In such a case, go back to the primary replica and trigger the seeding with:

ALTER AVAILABILITY GROUP [<Name of AG>] MODIFY REPLICA ON '<name of replica to be seeded>'
WITH (SEEDING_MODE = AUTOMATIC);

On the primary replica you can query the DMV sys.dm_hadr_automatic_seeding to check whether an automatic seeding process is going on. In this view all seeding processes are listed; the expectation is to get one record per seeding process.
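
A small sketch of such a check; the column selection is an assumption of what is typically most interesting:

-- Show state and outcome of the automatic seeding operations known to this replica
SELECT start_time,
       completion_time,
       current_state,
       failure_state_desc,
       number_of_attempts
FROM sys.dm_hadr_automatic_seeding;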

In the DMV sys.dm_hadr_physical_seeding_stats you can check the status of each of the individual seeding processes. With the query:

SELECT * FROM sys.dm_hadr_physical_seeding_stats;

you can check on the progress of the seeding process to the replica.

The automatic seeding uses a fixed number of threads. On the primary side, the number of threads used for reading the database is the same as a backup would use (one reader thread per LUN); then there is a fixed number of threads for transferring what was read to the secondary. On the secondary, the number of threads writing the data to the data files on disk again follows the backup rule of one thread per LUN. The fixed number of threads performing the communication and data transfer results in a different CPU utilization depending on the overall CPU resources available on the server or VM. This means that on a VM with only 4 vCPUs, as much as half of the CPU resources can be consumed by the process, whereas on a VM with 16 vCPUs the impact we measured was already less than 15% of the CPU resources. We expect that with an even higher CPU count, the impact becomes hardly noteworthy.

Using trace flag 9567 on the AlwaysOn primary replica enables compression of the data stream that gets sent from the primary to the secondary. Using this compression can reduce the transfer times by factors. However, it also increases the CPU usage: in first measurements, compression increased the CPU utilization on the primary by roughly a factor of 3.
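
As a sketch, the trace flag can be switched on for the seeding phase and off again afterwards:

-- Compress the automatic seeding data stream (at the cost of extra CPU on the primary)
DBCC TRACEON (9567, -1);
-- After seeding has finished, turn the compression off again
DBCC TRACEOFF (9567, -1);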

We could easily provoke situations where the existing network bandwidth was fully consumed by the automatic seeding. If the disks are fast enough on the primary and secondary, it is not a problem at all to saturate a 1 Gbit Ethernet connection with this process. In tests where we ran two VMs on the same host, we got up to 2 Gbit transfer rates. If you run a workload that relies on high network bandwidth, keep in mind that the automatic seeding can have some impact on the availability of that bandwidth.

Feedback #4: Since we need to run SQL Server under a domain account, we need to change the passwords of those accounts and subsequently restart the SQL Server services

A solution to this problem can be Group Managed Service Accounts (https://technet.microsoft.com/en-us/library/hh831782.aspx). These are now supported with SQL Server 2016 and can be used with AlwaysOn configurations as well.

Feedback #5: It is very hard to run a Windows Cluster configuration that is spread over a whole continent

A lot of our customers recognized AlwaysOn as the ideal tool to cover local high availability AND disaster recovery scenarios. As a result, those customers were running a primary and a secondary replica in the main site and another secondary replica in the DR site, basically a scenario we already described in this article: https://blogs.msdn.microsoft.com/saponsqlserver/2012/02/20/sql-server-2012-alwayson-part-3-sap-configuration-with-two-secondary-replicas/

As you read that article, you realize that we recommended a few steps to make sure the DR-site part of the Windows Cluster configuration had no vote in the quorum, plus other settings to make sure that the quorum in the main site is not impacted by what is going on in the DR site. However, the larger the distance between the sites, the more challenging it becomes to have networking reliable enough for a cluster covering those two far-distant sites to work properly. For SQL Azure Database, which uses AlwaysOn functionality to keep replicas, we faced another issue that led the way to a different solution, one that avoids the usage of a single Windows Cluster configuration covering both the main site and the DR site. The solution is called Distributed Availability Groups. With this solution you can run separate Windows Server Failover Cluster configurations with two separate AlwaysOn Availability Groups and then link those AGs together with a so-called Distributed Availability Group. Something that looks like this:

Picture: A Distributed Availability Group linking two Availability Groups in two separate Windows Server Failover Clusters (taken from the Microsoft documentation)

You can find documentation and principles here: https://msdn.microsoft.com/en-US/library/mt651673.aspx
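
For illustration, creating such a Distributed Availability Group on top of two existing AGs could look roughly like this; the AG names, listener names and the DAG name are assumptions for the sketch, and a corresponding ALTER AVAILABILITY GROUP ... JOIN statement is then executed on the primary replica of the second AG:

-- Link AG1 (main site) and AG2 (DR site) into a Distributed Availability Group
CREATE AVAILABILITY GROUP [SAPDistAG]
   WITH (DISTRIBUTED)
   AVAILABILITY GROUP ON
      'AG1' WITH (
         LISTENER_URL = 'tcp://ag1-listener.contoso.com:5022',
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
         FAILOVER_MODE = MANUAL,
         SEEDING_MODE = AUTOMATIC
      ),
      'AG2' WITH (
         LISTENER_URL = 'tcp://ag2-listener.contoso.com:5022',
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
         FAILOVER_MODE = MANUAL,
         SEEDING_MODE = AUTOMATIC
      );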

As you certainly realized, we just copied the picture above from the documentation. How this applies to SAP configurations, we will cover in a separate article.

Other improvements of AlwaysOn that are more generic and also might have been in the feedback from SAP customers are listed here: https://blogs.technet.microsoft.com/dataplatforminsider/2015/12/15/enhanced-always-on-availability-groups-in-sql-server-2016/

and in subsequent SQL Server documentation.

Have fun
