Real World Migration Story from BizTalk 2006/2006R2 to BizTalk 2010


 

Writer:

Srikiran Tadikonda, Microsoft

Banupriya Thirumaran, Microsoft

 

Reviewed By:

Puneet Gupta, Microsoft

Shridhar Nadgaundi, Microsoft

Scott Zimmerman, Microsoft

Mandi Ohlinger, Microsoft

 

Contributors:

Gopinath P, Microsoft

Prethish Kumar Puthenparampil, Microsoft

Rachna Srivastava, Microsoft

Rajesh Ramamirtham, Microsoft

Sumant Kumar Pandey, Microsoft

Sneha Aswani, Microsoft

 

Published: January 2013

Applies to: BizTalk Server

Summary: The BizTalk Server infrastructure used by Microsoft IT (MSIT) included BizTalk Server 2006 R2 and BizTalk Server 2006 environments. Until recently, the platforms that hosted these applications were not Disaster Recovery enabled. Because BizTalk Server 2006 does not support the Applicability Statement 2 (AS2) protocol, AS2 transactions were sent to BizTalk Server 2006 R2 and then routed back to BizTalk Server 2006 for further processing. As a result, there were many cross-server dependencies, and with transactions passing through multiple BizTalk servers, the architecture of these applications was complex. The integration team wanted to remove these cross-server dependencies and simplify the application architecture.

To simplify the architecture, the BizTalk Server 2006 R2/2006 environments were migrated to BizTalk Server 2010. The objectives of the migration were:

  • Enable Disaster Recovery for business critical applications.
  • Reduce the number of servers required to deploy the applications and reduce the support costs for the servers.
  • Minimize the number of pipelines deployed by leveraging the Pipeline Consolidation Framework of BizTalk Server 2010.
  • Enable MSIT to support a broader range of internal applications.
  • Stay current with the technologies: Windows Server 2008 R2, SQL Server 2008 R2, Visual Studio 2010, and the .NET Framework 4.

 

The migration to BizTalk Server 2010 was a great success, with the following results:

  • Reduced the physical servers for 32 applications from 51 servers to 22 servers
  • Achieved uptime greater than 99.9% while processing about 400,000 messages per week
  • Utilized the benefits of Hyper-V
  • Zero major issues post-production

 

This white paper discusses the steps taken to migrate to BizTalk Server 2010.

Copyright

 

This document is provided “as-is”. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it.

Some examples depicted herein are provided for illustration only and are fictitious.  No real association or connection is intended or should be inferred.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

© 2013 Microsoft. All rights reserved.

 

 

 

Contents

Introduction

Introduction to BizTalk Server 2010

New Features of BizTalk Server 2010

Easier Integration of Enterprise Applications

Platform Support for Latest Microsoft Technologies

Simplify Solution Manageability for IT Pros

Enhanced Enterprise Interoperability

Deprecated Features of BizTalk Server 2010

Common Reasons for Migrating to BizTalk Server 2010

Microsoft IT (MSIT) BizTalk 2010 Migration Story

Problem Statement

Objectives

Migration Process

Development Strategy

Architecture – Before and After Migration

New Features used in Migration

Testing Strategy

Benefits of Migration

Challenges faced in Migration

Connect Bug: AS2 encryption and signing issue details

Host instance and hosts going down intermittently

BizTalk 2010 is stopping when input file is more than 5 MB

Best Practices

Appendix

Volume/ Throughput after migration

Weekly Volumes on BizTalk Servers

Patches applied to fix issues in BizTalk 2010

Conclusion

Acknowledgements

 

 

Introduction

This white paper describes the following:

  • Motivations for moving to BizTalk Server 2010
  • The process that Microsoft IT went through to migrate 32 mission-critical applications from BizTalk 2006/2006 R2 to BizTalk 2010
  • Suggestions for smooth administration of BizTalk Server 2010.

 

The new BizTalk installation went into production in phases from February 2011 to November 2011. Since then, uptime is greater than 99.9% while processing about 400,000 messages per week. This migration project allowed Microsoft IT to reduce the physical servers for the 32 applications from 51 to 22 servers, which includes eight 4-proc servers running Hyper-V to host BizTalk Server 2010. There have been zero major issues during post-production support. In many respects, the project has been a great success.

Introduction to BizTalk Server 2010

In most organizations today, there is a backlog of requirements to integrate different IT systems. Yet connecting software means more than just exchanging bytes. As organizations continue to move toward a service-oriented world, the real goal—creating effective business processes that unite disparate systems into a coherent whole—is now within reach. Microsoft BizTalk Server 2010 supports this goal.

Microsoft BizTalk Server is an integration server and connectivity solution. It helps organizations meet the challenge of integrating diverse systems. With over 25 multiplatform adapters and a robust messaging infrastructure, BizTalk Server provides connectivity between core systems inside and outside of your organization. In addition to integration functionality, BizTalk also provides strong durable messaging, a rules engine, EDI connectivity, an Enterprise Service Bus, Business Activity Monitoring (BAM), RFID capabilities, industry-standard SWIFT and HL7 support, and IBM Host/Mainframe connectivity.

New Features of BizTalk Server 2010

BizTalk Server 2010 introduces enhancements and new features in four main areas.

Easier Integration of Enterprise Applications

BizTalk 2010 accelerates time to solution with better and more accessible tools:

  • BizTalk Server 2010 includes easy to use tools that integrate Line-of-Business applications with Windows Server AppFabric Workflows. This includes Workflow activities that provide access to the BizTalk Transformation Engine and Mapper, and the BizTalk Line-of-Business adapters.
  • The BizTalk 2010 Mapper has been updated with a new user interface that improves development of large transformations and provides new search and predictive matching functionality. The new mapper also improves productivity by adding cut, copy, paste, move, and undo functions and stronger support for documenting maps and readability.
  • BizTalk Services are now accessible via SharePoint 2010 Business Connectivity Services.
  • BizTalk RFID includes new event handlers and event delivery capabilities. The new event handlers enable removal of duplicate events and filters based on Electronic Product Codes (EPC) and visibility requirements. The new delivery capabilities send events directly to .NET applications, BizTalk Server, and EPC Information Services (EPCIS).
  • Easier access to IBM applications and data sources. BizTalk 2010 Adapter for Host Systems offers efficient development tools and native runtime protocol engines to connect BizTalk 2010 to existing line-of-business programs and data on IBM mainframe systems.
  • Excellent support for industry-standard web services. The BizTalk 2010 SOAP adapter enables sending and receiving messages by using SOAP over HTTP, thus enabling it to interact with the universe of web services. BizTalk 2010 also includes seven adapters and easy-to-use wizards that enable communication to and from BizTalk Server and Web services-based applications via the Windows Communication Foundation (WCF).

Platform Support for Latest Microsoft Technologies

BizTalk Server 2010 supports the Microsoft platform technologies, including Windows Server 2008 R2, SQL Server 2008 R2, Visual Studio 2010, and the .NET Framework 4.

  • BizTalk Server 2010 support for Windows Server 2008 R2 offers modular, minimal installation, improved network performance and control. It also includes improved high availability features, enhanced administration, and management with Server Manager, Windows PowerShell command-line, and enhanced virtualization with Hyper-V. By utilizing Windows Server 2008 R2 clustering, BizTalk Server can be deployed in multisite cluster scenarios, where cluster nodes can reside on separate IP subnets and avoid complicated VLANs.
  • BizTalk Server 2010 takes advantage of the virtualization improvements included as part of Windows Server 2008 R2 Hyper-V. These improvements lead to reduced costs through lower hardware, energy, and management overhead; plus the creation of a more dynamic IT infrastructure.
  • BizTalk Server 2010 support for SQL Server 2008 and SQL Server 2008 R2 offers better manageability and scalability, an optimized virtual SQL Server implementation, and improved performance, especially in 64-bit environments.
  • Support for Visual Studio 2010 and the .NET Framework 4 introduces a number of improvements to the underlying Visual Studio-based BizTalk project system. Improvements include debugging support for artifacts and BizTalk maps (XSLT), enhanced BizTalk artifact property pages integrated into Project Designer tabs, the Visual Studio Project Update Wizard, and support for release and debug builds.

Simplify Solution Manageability for IT Pros

With BizTalk Server 2010 it’s now much easier to get things done faster.

 

  • A new settings dashboard consolidates all settings into a single place, providing granular settings at the group, host, or host instance level. The Settings Dashboard supports automated settings deployment with scriptable export/import operations.
  • The new System Center Operations Manager management pack includes a new model with separate application and deployment views, new alerts and diagnostics, and improved multimachine monitoring and cluster awareness.
  • Enhancements to setup and upgrade include a new Smart Setup, which scans for and installs BizTalk updates automatically, and support for upgrading from BizTalk Server 2006 R2 and BizTalk Server 2009.

Enhanced Enterprise Interoperability

BizTalk Server 2010 has better options for connecting with business partners and applications.

 

  • Enhanced Trading Partner Management provides easy on-boarding and lifecycle management of trading partners. A new partner management model reflects typical B2B relationships and makes it easier to manage large-scale TPM deployments. An improved user interface includes agreement templates and business profiles for rapid configuration, and a new TPM operator role provides access to users other than administrators.
  • New and enhanced adapters provide higher levels of security, better performance, and support for the latest releases of line-of-business applications.

Deprecated Features of BizTalk Server 2010

The following features and tools are available in previous BizTalk versions and are replaced with new features in BizTalk Server 2010.

  • BizTalk Web Services Publishing Wizard: This wizard is replaced with the new BizTalk WCF Publishing Wizard, which provides broader and more flexible support to publish web services from BizTalk.
  • BizTalk Adapter Pack 2010 (BAP 2010): The SAP, Oracle, SQL Server, and Siebel adapters in BAP 2010, which ships with BizTalk 2010, replace the corresponding adapters in BAP 1.0, which ships with BizTalk 2006/2006 R2.
  • Add Web Reference: The Consume WCF Service wizard is used to consume web services from BizTalk.
  • EDI Pipeline and Context Properties: The new Trading Partner Management feature in BizTalk Server 2010 provides the same capabilities as the EDI Pipeline and Context Properties and replaces these features of older BizTalk Server Versions.
  • Web Services Enhancements Adapter: This adapter is replaced with the WCF Adapter using a custom transport.

Common Reasons for Migrating to BizTalk Server 2010

  • Stay Current: Being able to use newer communication protocols makes it easy to stay current and take advantage of new software features and technologies, such as improved encryption.
  • Support: BizTalk Server 2006/R2, Visual Studio 2005, and SQL Server 2005 are no longer covered by Microsoft’s Mainstream Support Policy. To keep active with support for mission-critical IT assets, it is essential to upgrade to the latest versions.
  • Virtualization: BizTalk 2010 takes advantage of the latest virtualization improvements of Windows Server 2008 R2 Hyper-V. The improvements can lead to reduced costs through lower hardware, energy, and management overhead and in the creation of a more dynamic IT infrastructure.
  • Clustering: By using Windows Server 2008 R2 clustering, BizTalk Server can now be deployed in multisite cluster scenarios, where cluster nodes can reside on separate IP subnets and avoid complicated VLANs.

Microsoft IT (MSIT) BizTalk 2010 Migration Story

Problem Statement

The Microsoft Board of Directors mandates that Microsoft’s internal business critical applications, like Volume Licensing and Order Processing, must have Business Continuity/Disaster Recovery (BC/DR). Until recently, the platforms hosting these applications (that is, BizTalk 2006/2006 R2) were not DR enabled. Because BizTalk Server 2006 does not support the AS2 (Applicability Statement 2) protocol, AS2 transactions went to BizTalk Server 2006 R2 and were then routed back to BizTalk 2006 for further processing. This led to many cross-server dependencies. With the transactions being passed through multiple BizTalk servers, the architecture of these applications was complex. The integration team wanted to remove these cross-server dependencies and simplify the application architecture.

Objectives

The objectives of this migration were:

  • To enable Disaster Recovery for business critical applications.
  • To reduce the number of servers required to deploy the applications and to reduce the support costs for them.
  • To minimize the number of pipelines deployed (by leveraging the Pipeline Consolidation Framework of BizTalk 2010).
  • To enable MSIT to support a broader range of internal applications.
  • To stay current with the technologies: Windows Server 2008 R2, SQL Server 2008 R2, Visual Studio 2010, and the .NET Framework 4.

Migration Process

Development Strategy

The migration process started with estimating the number of maps for the applications, the frequency of transactions per LOB (Line of Business), the size of the messages, and so on, and then setting these values as the production benchmark. Based on this estimation, the criticality of the applications, and the business needs, the integration team divided the entire migration process into four phases.

In each phase, a set of applications was migrated from BizTalk 2006/2006 R2 to BizTalk 2010. Once the applications were determined for each phase, the old BizTalk solutions were rebuilt using Visual Studio 2010. BizTalk solutions that contained legacy adapters like SAP and SQL were updated to use the WCF-Custom adapters in BizTalk 2010. Due to the growing number of pipelines and pipeline components, an internal framework called the Pipeline Consolidation Framework (PCF) was developed. The framework offers flexibility in pipeline configuration, minimizes the number of pipelines and pipeline components, centralizes storage in the .NET global assembly cache (GAC), and makes the pipelines reusable with more shared functionality. Refer to this blog (http://blogs.msdn.com/b/srikiran/archive/2012/12/22/pipeline-consolidation-framework-pcf-biztalk-2010.aspx) for more information on PCF. Changes to the existing solutions were limited to configuring adapters, pipelines, and party settings; no changes were required in other artifacts such as maps, schemas, and orchestrations.

The final solutions were built and deployed in System Integration Test (SIT), User Acceptance Test (UAT), and Production environments. The entire migration took about 15 months for 32 applications. It started with less critical applications, such as HR and Supply Chain applications, and moved to highly critical applications, such as Finance and Order to Cash, in later phases.

Architecture – Before and After Migration

Previously, there were two versions of BizTalk (BizTalk 2006 and BizTalk 2006 R2) on which all applications were deployed, supported by 22 production servers (see Figure 1). This architecture led to many problems. For example, when one application was updated, it could affect the other applications on the same server, and any patches applied to BizTalk 2006/2006 R2 affected all the applications. To overcome these problems, the integration team designed a new architecture in which three BizTalk 2010 LOB servers are maintained: highly critical applications are deployed on the first server, medium-critical applications on the second, and low-critical applications on the third. When changes are made to low-critical applications, they do not impact the high- or medium-critical applications on the other BizTalk servers, and vice versa. The maintenance of the applications is greatly simplified. Alerts on the host instances are set according to the criticality of the applications, which helps the business owners handle any issues effectively. See Figure 2 for each LOB server configuration.

 

            Figure 1. Old and New Architecture

 

 


Figure 2. BizTalk 2010 LOB Server Configuration

New Features used in Migration

Several new BizTalk 2010 features were used to improve development productivity and to solve some of the challenges faced with BizTalk 2006 and BizTalk 2006 R2:

  • Using BizTalk 2006/2006 R2 Trading Partner Management (TPM), the application could not use multiple AS2 transaction setups (different AS2 IDs with a single MDN receive endpoint) for the same business partner. In BizTalk 2010, transaction setup is available not only at the Party level but also at the Agreement level, so there is now fine-grained control over transaction setup within an agreement.
  • The BizTalk Mapper improvements reduced the time to develop complex maps from 16 hours to 4 hours. The key mapper improvements were:
    • Organizing maps in multiple pages
    • Ability to search in Schemas
    • Predictive Matching
    • Cut/Copy/Paste       
  • BAM User Experience Improvements – Earlier versions of BAM tracking had no indexed properties. BizTalk Server 2010 adds index-based sorting, which resolves a performance issue when returning the most recent tracking records.
  • SQL and SAP adapters were migrated to WCF adapters for a uniform configuration experience and, more importantly, improved performance.

     

Testing Strategy

Three months of test data was captured from the archives of the production systems. This data was later cleaned up and categorized for testing. The business partners for testing were selected based on criteria including production volume, encoding, location, and so on. The testing scenarios covered new changes, regression of the as-is functionality, data comparison, and negative testing.

Testing was organized into the following main phases.

Isolated Environment Testing:

In isolated environment testing, the BizTalk 2010 solution was tested with five or six partners per transaction. Individual BizTalk artifacts were tested, including adapters, maps, schemas, receive locations, and send ports, and the output for each transaction in BizTalk 2010 was compared with the output from the old environment.

Volume testing was also performed to ensure that the outputs from the old and new translators were the same.
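As a simple illustration of the data-comparison step, the following sketch (written in Python for brevity; it is not the team's actual test harness) compares archived output files from the old and new environments. The folder paths and the assumption that matching outputs share file names are hypothetical.

# Minimal sketch: compare archived outputs from the old (BizTalk 2006/2006 R2)
# environment and the new (BizTalk 2010) environment file by file.
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    # Hash the file contents so that even a one-byte difference is detected.
    return hashlib.sha256(path.read_bytes()).hexdigest()

def compare_outputs(old_dir: str, new_dir: str) -> None:
    old_files = {p.name: p for p in Path(old_dir).glob("*") if p.is_file()}
    new_files = {p.name: p for p in Path(new_dir).glob("*") if p.is_file()}

    for name in sorted(old_files.keys() | new_files.keys()):
        if name not in new_files:
            print(f"MISSING IN NEW: {name}")
        elif name not in old_files:
            print(f"EXTRA IN NEW:   {name}")
        elif file_digest(old_files[name]) != file_digest(new_files[name]):
            print(f"DIFFERENT:      {name}")
        else:
            print(f"MATCH:          {name}")

if __name__ == "__main__":
    # Placeholder archive folders; in practice these would point at the message archives.
    compare_outputs(r"C:\Archive\BTS2006\Outbound", r"C:\Archive\BTS2010\Outbound")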

System Integration Testing:

For system integration testing, the BizTalk 2010 system was connected with internal tools and downstream systems. Partner testing was simulated here. For inbound transactions, HTTP posts to the gateway system were simulated. For outbound transactions, transactions were sent to local AS2 sites to verify the correctness of the output.

User Acceptance and Rollback Testing:

For user acceptance and rollback testing, there is collaboration with the partners at the start of the release so that their environments are available during UAT.

Test certificates are used and shared with partners during this testing. The UAT test scenarios include positive and negative scenarios agreed with the partners. Scenarios include missing mandatory fields in purchase orders sent by partners, or incorrect EDI qualifiers or delimiters in TPM Agreement settings. Another criterion for successful UAT testing and signoff is that a minimum of five files is tested with each partner.

Rollback testing happened in UAT. Scenarios were created depending on where the rollback happens; for example, different rollback procedures were used for gateway applications than for BizTalk application-level procedures. The system was rolled back to the old environment and tested with one file for each transaction to verify that files were flowing through the old environment as expected.

Once the rollback tests were successful, the automated deployment to switch back to the new environment was initiated. This was followed by smoke testing the new applications in BizTalk 2010. This level of rollback testing obviously takes time, but it is the best way to ensure that the project team is ready to make the switch without any disruption to the business.

Performance Testing:

For performance testing, benchmark statistics were collected from the BizTalk 2006 and BizTalk 2006 R2 production systems, covering a period of more than six months. The metrics included the number of messages per transaction, the size of the messages, peak volume, and the frequency of the messages.

Using LoadGen (a load-simulation tool that measures the impact of clients on the server), configuration sets were created for different adapters such as File, HTTP, WCF, and so on. The appropriate performance counters for a BizTalk clustered environment and for SQL Server were selected and started before running the LoadGen scenarios. For this project, the PAL (Performance Analysis of Logs) tool was used to analyze the performance counter logs against thresholds. PAL also documents system, server, and network issues and presents them in HTML format.
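To give a sense of the kind of threshold analysis that PAL automates, the following sketch (in Python, purely illustrative and not part of PAL or LoadGen) reads a performance counter log that has been exported to CSV, for example with relog.exe, and flags samples that cross a threshold. The counter names and threshold values shown are assumptions, not PAL's actual thresholds.

# Minimal sketch: flag performance counter samples that cross a threshold, using a
# counter log exported to CSV (for example: relog counters.blg -f csv -o counters.csv).
# Counter names and threshold values are illustrative only.
import csv

THRESHOLDS = {
    "\\Processor(_Total)\\% Processor Time": 80.0,   # flag sustained CPU above 80%
    "\\Memory\\Available MBytes": 500.0,             # flag when available memory drops BELOW this
}

def check_counters(csv_path: str) -> None:
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Column 0 is the timestamp; the remaining headers look like
        # "\\MACHINE\Processor(_Total)\% Processor Time", so strip the machine name.
        columns = {i: "\\" + "\\".join(h.split("\\")[3:]) for i, h in enumerate(header) if i > 0}
        for row in reader:
            timestamp = row[0]
            for i, counter in columns.items():
                if counter not in THRESHOLDS or i >= len(row) or not row[i].strip():
                    continue
                value = float(row[i])
                limit = THRESHOLDS[counter]
                lower_is_bad = "Available" in counter
                if (lower_is_bad and value < limit) or (not lower_is_bad and value > limit):
                    print(f"{timestamp}  {counter} = {value:.1f} (threshold {limit})")

if __name__ == "__main__":
    check_counters("counters.csv")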

The extreme scalability tests included increasing the message count and the message size many times over. In some cases, volumes up to 60 times the actual production volume and messages up to 40 times the actual production size were used. See Figure 3, which shows the different scenarios. The team was satisfied that the measured CPU rates demonstrated the ability to handle peak load far beyond the requirements. A breakpoint scenario was also tested to ensure the stability and availability of the servers even under high CPU utilization. Files continued to be processed even in the breakpoint scenario, where there was more than 95% CPU utilization across the servers.

Figure 3. The CPU Utilization of BizTalk 2010 LOB1 during extreme load testing.

Preproduction Validation Testing:

The production deployment scripts, assemblies, and artifacts were validated and smoke tested before the Go-Live event. Because the environments were new, there were network and firewall issues when connecting to external customer URLs and internal business systems such as SAP. The transactions were therefore smoke tested with local endpoints and customer UAT URLs to ensure that the firewall and connectivity issues were resolved. After installing the BizTalk solutions, and before the new receive ports and orchestrations were enabled, Build Validation Tests (BVTs) were performed in the production environment. The BVTs verify that all the necessary artifacts are present.
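As an illustration of the artifact-presence check that the BVTs perform, the following sketch (Python, hypothetical and not the team's actual BVT harness) verifies that every artifact named in a deployment manifest exists in a drop folder. The manifest format (one relative path per line) and the folder layout are assumptions.

# Minimal sketch: report deployment artifacts that are listed in a manifest but
# missing from the drop folder. Manifest format and folder layout are hypothetical.
import sys
from pathlib import Path

def validate_deployment(manifest_path: str, drop_folder: str) -> int:
    expected = [line.strip() for line in Path(manifest_path).read_text().splitlines() if line.strip()]
    missing = [name for name in expected if not (Path(drop_folder) / name).exists()]

    for name in missing:
        print(f"MISSING: {name}")
    print(f"{len(expected) - len(missing)} of {len(expected)} expected artifacts found.")
    return 1 if missing else 0

if __name__ == "__main__":
    # Example: python validate_deployment.py manifest.txt D:\Drops\BizTalk2010
    sys.exit(validate_deployment(sys.argv[1], sys.argv[2]))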

Post Production Support:

Using System Center Operations Manager (SCOM) 2010, we monitored the following events:

  • Receive Location and Send port availability
  • BizTalk to BizTalk server services
  • Suspended queue monitoring
  • DB availability
  • Any disconnection between Database and processing nodes

Monitoring was automated. Alerts were sent for failed messages, and SLAs were created based on the criticality of the applications. The 24/7 support team from Operations and Engineering ensures that any critical issue is resolved based on the SLA.

The post-production support for applications was extended for a month from the date of Go-Live. Any suspended messages in BizTalk or any issues raised by customers were addressed immediately under the 24/7 support model. There have been no major issues during the post-production support period.

Benefits of Migration

The migration of several applications from BizTalk 2006/2006R2 to BizTalk 2010 achieved many significant benefits for Microsoft.

  • The Integration Team successfully deployed 32 unique applications that have 150+ project artifacts (maps) and 300+ partners to production with zero severity-1 bugs or issues.
  • It resulted in the retirement of 51 servers, 70% of them physical servers, which represents a large cost savings.
  • There was no adverse impact on any business processes, and customers are using new applications without any issues.
  • It resulted in a reduction of the pipelines from 108 to 18, which significantly increased the performance of the applications.
  • After migration, the applications are able to take advantage of state-of-the-art virtualization technologies with Hyper-V, which leads to reduced costs through lower hardware, less management overhead and more flexible administration.
  • The migration moved the applications to a newer platform, including Windows Server 2008 R2, SQL Server 2008 R2, and the .NET Framework 4, which helped to leverage the features of all these new technologies.
  • BizTalk 2010 takes advantage of N-node Active-Active clustering of Windows Server 2008 which improves the availability and utilization of the resources for the applications. BizTalk 2006 R2 uses 2-Node Active-Passive clustering in Windows Server 2003.

Challenges faced in Migration

There were some challenges faced during the migration of applications to BizTalk 2010. The most important of them are listed here.

Connect Bug: AS2 encryption and signing issue details

 

Issue:

A signed and encrypted message from a partner failed to be processed in BizTalk 2010 Beta, although the same message was processed successfully in BizTalk 2006 R2. The partner was able to send the signed and encrypted message, but the AS2 Decoder failed to decrypt the file and returned the following error:

“An error occurred when decrypting an AS2 message. The sequence number of the suspended message is 1.”

The additional error generated in the event log whenever this error occurs is:

“The AS2 Decoder encountered an exception during processing. Details of the message and exception are as follows: AS2-From: “<<Some Value>>” AS2-To: “<<Some Value>>” MessageID: “<<Some Value>>” Message Type: “unknown” Exception: “An error occurred when decrypting an AS2 message””

The more detailed exception is available here (http://connect.microsoft.com/BizTalk/feedback/details/564504/error-while-decrypting-as2-files-as2-over-file-scenario).

 

Fix:

This issue was caused by a code defect in the BizTalk 2010 Beta version and was fixed in the BizTalk 2010 RTM version.

 

Host instance and hosts going down intermittently

Issue:

When performing transactions on BizTalk 2010, some of the following errors occurred:

  • Error code: 0xC0002A21, an error occurred while attempting to access the SSO database.
  • A transport-level error has occurred when receiving results from the server. (Provider: TCP Provider, error: 0 – The semaphore timeout period has expired.) SQL Error code: 0x00000079

Fix:

These errors are network-level SQL Server connectivity failures; the TCP settings and network-related hotfixes listed in the appendix (for example, KB 899599 and KB 970406) address connectivity issues of this type.

BizTalk 2010 is stopping when input file is more than 5 MB

Issue:

When you install BizTalk 2010 on Windows 7, Windows 7 Service Pack 1 (SP1), Windows Server 2008 R2, or Windows Server 2008 R2 Service Pack 1 (SP1) and use the AS2 decoder to decrypt an encrypted message larger than 5 MB, the decoder fails with an out-of-memory exception while decrypting files larger than 5 MB.

Fix:

To resolve this issue, apply the hotfix described in Microsoft Knowledge Base (KB) article 2480994 (http://support.microsoft.com/kb/2480994).

Best Practices

The following are some of the best practices used to help the migration process go smoothly:

  • Ensure consistent monitoring and tracking of messages is present across all applications.
  • Avoid overloading individual hosts. Instead, use the appropriate number of BizTalk 2010 Servers and decide how many applications to deploy on each of them in the design phase.
  • Validate your message volume per host and modify default settings as appropriate.
  • Ensure that the BizTalk SQL Server Agent jobs run successfully in SQL Server to release all the resources.
  • Ensure that you have a backup and replication strategy in place.
  • When possible, avoid large schemas with mostly optional fields.
  • Ensure that appropriate permissions exist for SSO, SQL, and BizTalk.
  • Modify default retries and timeouts as appropriate.

 

The optimizations performed by the Integration team to improve the performance of applications in the migration process include:

  • Utilized the SQL Server backup compression feature for the BizTalk databases, which reduced the size of the BizTalk backups and the time taken to back them up.
  • In the BizTalk Server 2010 Management Pack for System Center, set the discovery interval to lower levels, which fixed the performance issues related to resource utilization seen in previous BizTalk versions.
  • To achieve higher performance, the Max receive interval was tuned at the host level in BizTalk 2010. Performance was further improved by tuning the Max polling interval and the Orchestration polling interval. Refer to this blog (http://blogs.msdn.com/b/prethishkumar/archive/2012/02/04/high-performing-biztalk-server-2009-biztalk-server-2010-server-setup.aspx) for more information on improving performance.
  • Upgraded the hardware (for example, by increasing the RAM and disk space), which reduced disk I/O latency.

 

Appendix

Volume/ Throughput after migration

The following table and graph show the typical weekly number of messages that flow through the new BizTalk Servers.

Weekly Volumes on BizTalk Servers

 

BizTalk Server        Weekly Volume
BizTalk 2010 LOB1     326,534
BizTalk 2010 LOB2     36,467
BizTalk 2010 LOB3     4,062

 

 

 

Patches applied to fix issues in BizTalk 2010

Certain patches were applied to BizTalk 2010 to fix issues, based on customer feedback received by the Microsoft BizTalk team. The following list includes some of the patches that helped the integration team in migrating the applications from BizTalk 2006/2006 R2 to BizTalk 2010.

  • KB 2570450 – FIX: “Unable to sign outbound message” error after you upgrade to BizTalk Server 2006 R2 SP1 or to BizTalk Server 2010 (http://support.microsoft.com/kb/2570450)
  • KB 2587237 – FIX: Host instance of BizTalk Server 2010 does not resume operation after the connection for a BizTalk Management database resumes (http://support.microsoft.com/kb/2587237)
  • KB 2563979 – FIX: Performance decreases when a server that is running BizTalk Server processes a large batch of messages after you customize the settings of the “Large Message Threshold” option and of the “Large Message Fragment size” option (http://support.microsoft.com/kb/2563979)
  • KB 2555003 – FIX: AS2 messages remain active after they are processed when the “Track Message Bodies – Request message after port processing” option is enabled on a send port in a BizTalk Server environment (http://support.microsoft.com/kb/2555003)
  • KB 2580202 – XML exception when BizTalk Server 2010 processes EDI messages if the separator is set to the less than sign (<) or to 0x3C (http://support.microsoft.com/kb/2580202)
  • KB 2627804 – FIX: A hotfix that improves the performance of the tracking feature in BizTalk Server 2010 is available (http://support.microsoft.com/kb/2627804)
  • KB 2563231 – FIX: “The formatter threw an exception while trying to deserialize the message” error when the WCF-SAP adapter in BizTalk Adapter Pack executes an RFC or BAPI to an SAP system (http://support.microsoft.com/kb/2563231)
  • KB 2552332 – FIX: WCF-SAP adapter may incorrectly keep connections open when the adapter sends some RFC requests (http://support.microsoft.com/kb/2552332)
  • KB 2539412 – A hotfix is available for the WCF-based SAP adapter to let you disable the behavior that trims leading zeros of NUMC values in BizTalk Adapter Pack (http://support.microsoft.com/kb/2539412)
  • KB 2539769 – FIX: BizTalk Server stops receiving IDOCs from a WCF-based SAP adapter in BizTalk Adapter Pack (http://support.microsoft.com/kb/2539769)
  • KB 2556304 – FIX: Remote function calls do not work after the WCF-based SAP adapter receives the RFC_FAILURE error code in BizTalk Adapter Pack (http://support.microsoft.com/kb/2556304)
  • KB 2554806 – FIX: “System.InvalidOperationException: Category does not exist” error if some registry keys for the WCF-based SAP adapter are corrupted in BizTalk Adapter Pack (http://support.microsoft.com/kb/2554806)
  • KB 2598894 – FIX: Error occurs or NULL values incorrectly inserted by the WCF-based SQL adapter in BizTalk Adapter Pack 2010 if an input message contains empty elements (http://support.microsoft.com/kb/2598894)
  • KB 2478956 – WCF-SQL Adapter does not return useful errors (http://support.microsoft.com/kb/2478956)
  • KB 2481676 – WCF-SQL adapter sometimes locks sql spid/table after PDAS stopping all operations against database (http://support.microsoft.com/kb/2481676)
  • KB 2522459 – Poll Statement is executed even if PolledDataAvailableStatement returns 0 (http://support.microsoft.com/kb/2522459)
  • KB 2530219 – BAP SAP Keep Context for WCF-SAP (http://support.microsoft.com/kb/2530219)
  • KB 2505757 – [Port from BAP 2010 CU1 to BAP 2.0 CU2] New WCF based SAP Adapter does not sort the string correctly in the String scenario with multiple grouped IDocs (Collated String option fix) (http://support.microsoft.com/kb/2505757)
  • KB 899599 – A BizTalk Server Host instance fails, and a “General Network” error is written to the Application log when the BizTalk Server-based server processes a high volume of documents (http://support.microsoft.com/kb/899599)
  • KB 970406 – TCP settings that can impact BizTalk Server (http://support.microsoft.com/kb/970406)

 

Conclusion

BizTalk 2010 is implemented by many organizations worldwide today for their mission-critical integration. BizTalk 2010 provides the high volume, high availability, and high scalability that business process solutions require. Migrating to BizTalk 2010 from BizTalk 2006 or BizTalk 2006 R2 brings many benefits to the organization and to its customers.

Acknowledgements

Many thanks for the technical information and input provided by Gopinath P, Prethish Kumar Puthenparampil, Rachna Srivastava, Rajesh Ramamirtham, Sumant Kumar Pandey and Sneha Aswani. We would like to acknowledge the leadership and support provided by Puneet Gupta and Shridhar Nadgaundi. A special thanks to Scott Zimmerman and Mandi Ohlinger for the technical review and valuable insights on this white paper.

For more information:

http://www.microsoft.com/sqlserver/: SQL Server Web site

http://technet.microsoft.com/en-us/sqlserver/: SQL Server TechCenter

http://msdn.microsoft.com/en-us/sqlserver/: SQL Server DevCenter

 

Did this paper help you? Please give us your feedback. On a scale of 1 (poor) to 5 (excellent), how would you rate this paper, and why? For example:

  • Are you rating it high due to having good examples, excellent screen shots, clear writing, or another reason?
  • Are you rating it low due to poor examples, fuzzy screen shots, or unclear writing?

This feedback will help us improve the quality of white papers we release.

Send feedback.

Active Directory LDAP Compliance

 

 

 


 

Microsoft Corporation

Published: October 2003

 

 

 

 

Abstract

Directories are public or private stores containing essential identifying information typically used in daily enterprise activities. Many application providers capitalize on directories, offering integration with existing directories to extend their applications’ functionality. Network operating systems also house vital network information, such as users and computers, within directories.

Lightweight Directory Access Protocol (LDAP) is a directory standard founded on the legacy X.500 directory. LDAP’s initial implementations provided gateway services between X.500 directory servers and clients. While LDAP was initially created to meet this requirement, it became clear that a parting from the cumbersome X.500 directory standard was needed to simplify deployments. In 1994, LDAP was transformed into a directory specification with its own database and structuring conventions.

This paper discusses the origins of LDAP within Microsoft products and, specifically, the implementation of, and conformance to, the LDAPv3 Proposed Standard within Microsoft Windows 2000 Server and Microsoft Windows Server 2003. Included for reference are matrixes detailing supported RFCs.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

© 2003 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, Visual Basic, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

 

 

Contents

Introduction

Directory Foundation: X.500

X.500: The Need for a Lightweight Alternative

What Is LDAP?

LDAP: First Generation

Enhancements with Version 2

The Current State of LDAP

What Does It Mean to Be LDAP Compliant?

Achieving Compliance: IETF Applicability Statement

Achieving Compliance: Third-Party Test Suites

The Open Group LDAP Certifications

Setting an LDAP Compliance Baseline

Active Directory’s LDAP Compliance

Windows 2000 Server

Windows Server 2003

Compliance Misconceptions

inetOrgPerson

Native LDAP Calls

Directory Interoperability

LDAP API

Active Directory Services Interface

Development Environments

Active Directory Application Mode

Directory Services Markup Language

Microsoft Identity Integration Server 2003, Enterprise Edition

Additional Resources

Lightweight Directory Access Protocol Version 3

Open Group and the Directory Interoperability Forum

Developing with Active Directory Services Interface

Miscellaneous

Introduction

Directories—public or private resource lists containing names, locations, and other identifying information—are essential tools often taken for granted in our daily activities. Typically these directories provide information about people, places, or organizations as part of an overall solution. For example, a telephone is virtually useless without a directory to correspond names with telephone numbers. Historically, most directories were only available in printed form.

As the computer revolution forged ahead, printed directories gave way to an electronic counterpart. Many application providers capitalized on the directory concept offering proprietary versions that extended their application’s functionality. Network operating systems also provided directories, typically housing user and device information. Unfortunately, these first generation directories were often developed with little or no concern for interoperability. Isolated and specific in function, they performed admirably. However, it was obvious directories needed to interact within a larger network ecosystem. This idea grew into the definition of the X.500 standard.

Directory Foundation: X.500

In 1988, the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) introduced the X.500 standard. X.500 defines the protocols and the information model for an application- and network-platform-agnostic directory service. As a distributed directory based on hierarchically named information objects, the X.500 specifications characterized a directory that users and applications could browse or search.

The X.500 paradigm includes one or more Directory System Agents (DSAs)—directory servers—with each holding a portion of the Directory Information Base (DIB). The DIB contains named information objects assembled in a tree structure—defined by a Directory Information Tree (DIT)—with each entry having an associated set of attributes. Every attribute has a pre-defined type and one or more associated values. Object classes, containing mandatory and optional attributes, are defined within a directory schema. End users communicate with an X.500 DSA using the Directory Access Protocol (DAP) while the Directory System Protocol (DSP) controls interaction between two or more DSAs.

X.500: The Need for a Lightweight Alternative

Understanding the need for a streamlined directory standard, several implementers proposed a lightweight alternative for connecting to X.500 directories. Ultimately, the first iteration of LDAP gained traction as a simple alternative to the X.500 Directory User Agent (DUA). The new LDAP definition:

  • Simplified protocol encoding
  • Used text encoding for names and attributes
  • Mapped directly onto the TCP/IP stack
  • Supplied a simple Application Programming Interface (API)

What Is LDAP?

Organized development of LDAP occurred on several fronts. However, the most notable work, and the first freely available implementation, was completed by the University of Michigan in 1993. The University focused efforts on developing a simpler TCP/IP version of X.500’s DAP. DAP was considered cumbersome as it pushed much of its workload to the client.

Although LDAP is well rooted as a simplified component of the X.500 directory, it has become the de facto directory protocol on the Internet today.

LDAP: First Generation

LDAP’s initial implementations provided gateway services between X.500 directory servers and clients. The clients communicated with an LDAP gateway through LDAP-enabled software. In turn, the gateway handled transactions—on behalf of the client—with the X.500 DSA. This model promoted directory interoperability allowing application providers to easily develop client software capable of communicating with an LDAP gateway service, regardless of the backend platform. While LDAP was initially created to meet this requirement, it became clear that a parting from X.500 was needed to simplify deployments. In 1994, LDAP was transformed into a directory specification with its own database and structuring conventions.

Once transformed, the LDAP specifications reflected a true client-server model with clients making requests directly to servers for information or operations. One or more directory servers may either perform the operation or refer the client to another directory server that may be able to provide the requested information, or perform the requested operation. The LDAP client will see the same view of the directory no matter which server is contacted. If necessary, the LDAP server can authenticate the client to the operating system in use. Once received, the LDAP server will convert a request into an appropriate format for the accessed directory. For X.500 directories, the LDAP server would convert the LDAP request into a DAP request.

Enhancements with Version 2

As interest in LDAP increased, several new developments extended its core functionality while streamlining its footprint. In 1995, Request for Comments (RFC) 1777 was introduced for LDAP Version 2. RFC 1777 eliminated many of the impracticable components of X.500 that were central to the original LDAP specifications. Furthermore, network connectivity was changed from the X.500 Open Systems Interconnection (OSI) model to the TCP/IP model.

LDAPv2 is officially defined by the following RFCs:

  • RFC 1777 – Lightweight Directory Access Protocol (v2)
  • RFC 1778 – The String Representation of Standard Attribute Syntaxes
  • RFC 1779 – A String Representation of Distinguished Names

The Current State of LDAP

Developed by the Internet Engineering Task Force (IETF) in 1997, the current LDAPv3 implementation is a renovation of LDAPv2, which primarily tackles deployment limitations identified within the previous version. LDAPv3 also enriches compatibility with X.500 along with enhanced integration with non-X.500 directories. LDAPv3 encompasses LDAPv2 within a new set of RFCs.

LDAPv3 is a Proposed Standard currently defined by RFC 3377, which includes the RFCs below in addition to support for the LDAPv2 RFCs listed above:

  • RFC 2251 – Lightweight Directory Access Protocol (v3)
  • RFC 2252 – LDAP (v3): Attribute Syntax Definitions
  • RFC 2253 – LDAP (v3): UTF-8 String Representation of Distinguished Names
  • RFC 2254 – String Representation of LDAP Search Filters
  • RFC 2255 – The LDAP URL Format
  • RFC 2256 – A Summary of X.500(96) User Schema for use with LDAP (v3)
  • RFC 2829 – Authentication Methods for LDAP
  • RFC 2830 – LDAP (v3): Extensions for Transport Layer Security

What Does It Mean to Be LDAP Compliant?

Claiming LDAP compliance seems to be a common occurrence among today’s directory-capable vendors. In truth, applications may be either directory-aware—capable of reading an LDAP directory—or directory-enabled—capable of reading and performing other defined LDAP operations on a directory. Both implementations should be considered LDAP-compliant if proposed standards are followed in achieving an application’s desired level of LDAP functionality. Compliance with standards, however, does not ensure interoperability. Standards may lack sufficient clarity, even after formalization, leading to varying guideline interpretations.

The compliance task for directory server vendors weighs on their ability to conform to defined standards while ensuring interoperability with those standards. In addition, the process of standard formalization is an ongoing effort compounding the difficulty of achieving full compliance. For example, LDAPv2 has been formalized with a defined set of RFCs, however, LDAPv3, a Proposed Standard, is theoretically still a work in progress as it moves toward an Internet Standard. In summary, a vendor’s compliance statement should be viewed as their ability to implement a known set of RFCs, accurately interpret those RFCs to ensure interoperability, and provide a framework capable of incorporating new RFCs.

Achieving Compliance: IETF Applicability Statement

The IETF has established an LDAPv3 Working Group—ldapbis—with the charter of steering the previously mentioned LDAPv3 RFCs through the Internet Standards process. In September 2002, the working group submitted a revised LDAP specification for consideration as a draft standard. Currently, no IETF LDAPv3 conformance or compliance directive exists; however, one of the most pervasive documents to be delivered by the working group is an LDAP applicability statement.

Once complete, an LDAP applicability statement will guide independent vendors toward IETF-based LDAP compliance. In all probability, third-party test suites, based on the applicability statement, will be developed, allowing vendors and end users to accurately determine LDAP compliance. Until then, vendor declarations combined with existing third-party testing suites are the most suitable alternatives.

Achieving Compliance: Third-Party Test Suites

Several LDAP compliance testing and certification solutions exist in the marketplace today. One of the most prominent test suites is offered by the Open Group, a technology-neutral consortium. The Open Group also sponsors the Directory Interoperability Forum (DIF), which comprises architects, strategists, and product managers from customers and vendors of directories and directory-enabled applications. According to the Open Group, the DIF is chartered to “provide testing and certification for directory servers and applications that use Open Directory Standards.” Microsoft Corporation is an active member of the DIF.

The Open Group LDAP Certifications

Recently released in 2003 are two directory Open Group/DIF test certifications: LDAP Ready and LDAP Certified. With an LDAP Certified directory, the vendor guarantees its LDAP directory server conforms to the LDAPv3 specifications published by the IETF. On the other hand, an LDAP Ready application warrants that it will interoperate with an LDAP Certified directory.

To become LDAP Certified, directory vendors typically submit a warranty of LDAPv3 conformance using the Open Group’s VSLDAP Test Suite as a compliance measuring tool. The warranty is intended to ensure conformity to industry standard specifications, conformity through a product’s lifecycle, and the quick remedy of any nonconformity. At the moment, no vendor has received either of these new certifications.

Setting an LDAP Compliance Baseline

According to the DIF, a baseline of LDAP compliance can be measured by verifying support of the following RFC components. By no means exhaustive, an LDAP compliance baseline can be supplemented by support of additional LDAP RFCs, features, or extensions.

  • RFC 2251 – Abandon: The server accepts abandon requests and performs the requested abandon operations.
  • RFC 2251 – Add: The server accepts add requests and performs the requested add operations.
  • RFC 2251 – Anonymous Bind over SSL: The server accepts a simple bind over SSL where the password is null and treats the client as anonymously authenticated.
  • RFC 2251 – Authenticated Simple Bind: The server accepts a simple bind request with a password and authenticates the client by that password.
  • RFC 2251 – BER: The server correctly encodes and decodes protocol elements using ASN.1 BER as required by LDAP.
  • RFC 2251 – Client Server Communication: The client transmits an operation request to the server, which performs the operation on the directory and returns results/errors.
  • RFC 2251 – Compare: The server accepts compare requests and performs the requested compare operations.
  • RFC 2251 – Data Model: Each entry must have an objectClass attribute. The attribute specifies the object classes of an entry, which along with system and user schema determine permitted attributes of entries.
  • RFC 2251 – Delete: The server accepts delete requests and performs the requested delete operations.
  • RFC 2251 – Modify (Add, Delete, Replace): The server accepts modify requests and performs the operations, including additions, deletions, and replacements.
  • RFC 2251 – ModifyDN (Move Leaf Entry to New Parent): The server accepts modify DN requests to move leaf entries to new parents and performs the leaf move operations.
  • RFC 2251 – ModifyDN (Move Renamed Leaf Entry to New Parent): The server accepts modify DN requests to rename leaf entries and move them to new parents and performs the requested leaf rename and move operations.
  • RFC 2251 – ModifyDN (Rename a Leaf Entry): The server accepts modify DN requests to rename leaf entries and performs the requested leaf rename operations.
  • RFC 2251 – Search: The server accepts search requests and performs the requested search operations.
  • RFC 2251 – Simple Bind with Password over SSL: The server accepts a simple bind request over SSL with a password and authenticates the client by that password.
  • RFC 2251 – Simple Common Elements: The server correctly encodes, decodes, and processes the simple common elements of LDAPMessage envelope PDUs.
  • RFC 2251 – TCP as the Transporting Protocol: A server implements mapping of LDAP over TCP and provides a protocol listener on IP port 389.
  • RFCs 2251, 2252 – Definition of Attributes: The server’s entries have attributes in accordance with the X.500 model. The server supports entries with attributes.
  • RFCs 2251, 2252 – Definition of Object Classes: The server associates entries with object classes in accordance with the X.500 model.
  • RFCs 2251, 2253 – Distinguished Name: The server correctly encodes and decodes protocol representations of distinguished names.
  • RFCs 2251, 2829 – Anonymous Simple Bind: The server accepts a simple bind request where the password is null and treats the client as anonymously authenticated.
  • RFC 2252 – Definition of Matching Rules: The server supports matching rules in accordance with the X.500 model.
  • RFC 2253 – Parsing: The server correctly parses string representations of distinguished names.
  • RFC 2253 – Relationship with LDAP v2: The server accepts but does not generate certain protocol constructs that are legal in LDAP v2 but not in LDAP v3.
  • RFC 2253 – Relative Distinguished Name: The server correctly encodes and decodes protocol representations of relative distinguished names.
  • RFCs 2829, 2830 – SSL over TCP as the Transporting Protocol: A server implements mapping of LDAP over SSL over TCP and provides a protocol listener on IP port 636.
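To make the baseline operations above concrete, the following sketch exercises bind, search, add, modify, and delete from a client. It uses the open-source Python ldap3 package, which is not part of this paper or of Active Directory; the server name, credentials, and distinguished names are placeholders.

# Illustrative sketch of the baseline LDAP operations (bind, search, add, modify, delete)
# using the Python "ldap3" package. Server, credentials, and DNs are placeholders.
from ldap3 import Server, Connection, ALL, MODIFY_REPLACE

server = Server("ldap://dc01.example.com", port=389, get_info=ALL)

# Authenticated simple bind (RFC 2251); for production use, bind over SSL/TLS on port 636.
conn = Connection(server, user="CN=svc-ldap,OU=Service Accounts,DC=example,DC=com",
                  password="placeholder", auto_bind=True)

# Search (RFC 2251) with a string filter (RFC 2254)
conn.search("DC=example,DC=com", "(objectClass=person)", attributes=["cn", "mail"])
for entry in conn.entries:
    print(entry.entry_dn)

# Add, Modify (Replace), and Delete (RFC 2251)
dn = "CN=Test User,OU=Users,DC=example,DC=com"
conn.add(dn, object_class="inetOrgPerson", attributes={"cn": "Test User", "sn": "User"})
conn.modify(dn, {"mail": [(MODIFY_REPLACE, ["test.user@example.com"])]})
conn.delete(dn)

conn.unbind()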

 

 

Active Directory’s LDAP Compliance

Windows 2000 Server

With the release of Windows 2000 Server, Microsoft made its first introduction of the Active Directory® directory service. With a foundation stemming from proven Microsoft technologies, Active Directory’s origins lie within the original 1996 release of Microsoft Exchange Server 4.0. Active Directory, however, was a significant paradigm shift from the familiar Windows NT 4.0 identity store. By envisioning a directory capable of extending beyond a typical NOS store, Microsoft’s first implementation of Active Directory became a viable repository for many forms of enterprise data.

To allow Active Directory to gain popularity as an enterprise repository, Microsoft had to make an early commitment to LDAP. The Windows 2000 implementation of Active Directory is an LDAP-compliant directory supporting the core LDAP RFCs available at the time of its release. Currently, Windows 2000 Active Directory reaches LDAP compliance through support of the following RFCs.

Core LDAP RFCs (all Proposed Standards):

  • RFC 2251 – Lightweight Directory Access Protocol (v3)
  • RFC 2252 – Lightweight Directory Access Protocol (v3): Attribute Syntax Definitions
  • RFC 2253 – Lightweight Directory Access Protocol (v3): UTF-8 String Representation of Distinguished Names
  • RFC 2254 – The String Representation of LDAP Search Filters
  • RFC 2255 – The LDAP URL Format
  • RFC 2256 – A Summary of the X.500(96) User Schema for use with LDAPv3
  • RFC 2829 – Authentication Methods for LDAP
  • RFC 2830 – Lightweight Directory Access Protocol (v3): Extension for Transport Layer Security

Additional LDAP RFC support (Proposed Standards):

  • RFC 2696 – LDAP Control Extension for Simple Paged Results Manipulation
  • RFC 2247 – Using Domains in LDAP/X.500 Distinguished Names

 

Windows Server 2003

Building on the foundation established in Windows 2000 Server, the Active Directory service in Windows Server 2003 extends beyond the baseline of LDAP compliance into one of the most comprehensive directory servers offering a wide range of LDAP support. Accordingly, the Windows Server 2003 Active Directory service introduces a number of new LDAP capabilities targeted for IT professionals and application developers. Some of the latest LDAP features include:

  • Dynamic Entries – Active Directory can store dynamic entries allowing the directory to assign Time-To-Live (TTL) values to determine automatic entry deletion.
  • Transport Layer Security (TLS) – Connections to Active Directory over LDAP can now be protected using the TLS security protocol.
  • Digest Authentication Mechanism – Connections to Active Directory over LDAP can now be authenticated using the DIGEST-MD5 Simple Authentication and Security Layer (SASL) authentication mechanism. The Windows Digest Security Support Provider (SSP) provides an interface for using Digest Authentication as an SASL mechanism.
  • Virtual List Views (VLV) – When an LDAP query has a large result set, it is inefficient for a client application to pull down the entire set from the server. VLV allows a client application to “window” through a large result set without having to transfer the entire set from the server (a related paged-results sketch follows this list).
  • Concurrent LDAP Binds – Although not specifically defined by the IETF, Concurrent LDAP Bind provides the ability to perform multiple LDAP binds on one connection for the purpose of authenticating users.
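
As a concrete illustration of windowing through a large result set, the sketch below uses the RFC 2696 simple paged results control (listed in the compliance tables in this paper) through the open-source Python ldap3 library. It is not part of the original paper; the server name, credentials, and base DN are placeholders, and VLV itself uses a different pair of controls.

from ldap3 import Server, Connection, SUBTREE

server = Server('dc01.sample.com')
conn = Connection(server, user='reader@sample.com',
                  password='placeholder-password', auto_bind=True)

# RFC 2696 simple paged results: retrieve a large result set in pages of 500
# entries instead of transferring the whole set in a single response.
results = conn.extend.standard.paged_search(
    search_base='DC=sample,DC=com',
    search_filter='(objectClass=user)',
    search_scope=SUBTREE,
    attributes=['cn'],
    paged_size=500,
    generator=True)

for item in results:
    if item.get('type') == 'searchResEntry':
        print(item['dn'])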

Although the LDAP compliance guidelines proposed by the Directory Interoperability Forum signify conformance at a base level, organizations demand directory servers capable of providing robust network and application services. This is one of the driving forces behind Microsoft’s support of virtually all IETF-recognized LDAP components. Although LDAP compliance should always be considered a work in progress until full IETF standardization, Microsoft’s current LDAP compliance in Windows Server 2003 includes support of the following RFCs.

RFC | Core LDAP Requirements – RFC 3377 | RFC Status | Timeline
2251 | Lightweight Directory Access Protocol (v3) | Proposed | Windows 2000
2252 | Lightweight Directory Access Protocol (v3): Attribute Syntax Definitions | Proposed | Windows 2000
2253 | Lightweight Directory Access Protocol (v3): UTF-8 String Representation of Distinguished Names | Proposed | Windows 2000
2254 | The String Representation of LDAP Search Filters | Proposed | Windows 2000
2255 | The LDAP URL Format | Proposed | Windows 2000
2256 | A Summary of the X.500(96) User Schema for use with LDAPv3 | Proposed | Windows 2000
2829 | Authentication Methods for LDAP | Proposed | Windows 2000
2830 | Lightweight Directory Access Protocol (v3): Extension for Transport Layer Security | Proposed | Windows 2000

RFC | Additional LDAP RFC Support | RFC Status | Timeline
2696 | LDAP Control Extension for Simple Paged Results Manipulation | Proposed | Windows 2000
2247 | Using Domains in LDAP/X.500 Distinguished Names | Proposed | Windows 2000
2589 | LDAP Protocol (v3): Extensions for Dynamic Directory Services | Proposed | Windows Server 2003
2798 | Definition of the inetOrgPerson LDAP Object Class | Proposed | Windows Server 2003
2831 | Using Digest Authentication as an SASL Mechanism | Proposed | Windows Server 2003
2891 | LDAP Control Extension for Server Side Sorting of Search Results | Proposed | Windows Server 2003

 

Compliance Misconceptions

Since the release of Windows 2000 Server Active Directory, many assertions have been made about its LDAPv3 compliance. As noted previously, Active Directory and Microsoft Exchange Server have supported a baseline of LDAP compliance since inception. Furthermore, Active Directory extends this baseline by supporting a majority of the LDAP Informational and Proposed RFCs. Although Microsoft’s commitment to building one of the most comprehensive and interoperable directories in the marketplace is obvious, there are still common compliance misconceptions worth discussing.

inetOrgPerson

One of the most widely publicized Windows 2000 Server LDAP non-compliance claims pivots on the support of RFC 2798 and inetOrgPerson. RFC 2798 was submitted as an Informational RFC in April 2000 defining the inetOrgPerson LDAP object class. The object class definition is considered an extension to LDAPv3 and, as such, is not included in the RFC 3377 LDAPv3 specifications.

Nevertheless, the timing of the Windows 2000 Server release in February 2000 and the introduction of RFC 2798 in April of that same year barred the inclusion of the inetOrgPerson Object Class within the Windows 2000 Server Active Directory schema. However, any Windows 2000 Active Directory implementation can easily accommodate inetOrgPerson through Active Directory’s extensible schema.

In response to customer requests for an inetOrgPerson object class for Windows 2000, Microsoft released the Windows 2000 inetOrgPerson Kit. The kit, which is freely available, derives inetOrgPerson from the user class within Active Directory’s schema. In addition to defining the inetOrgPerson, the kit delivers basic administrative control over the class object. See the “Additional Resources” section for information on obtaining the Microsoft Windows 2000 inetOrgPerson Kit.

As noted previously, Windows Server 2003’s Active Directory schema includes the full definition and provides complete manageability of RFC 2798 and the inetOrgPerson LDAP Object Class.

Native LDAP Calls

Another common misconception is that Active Directory does not support native LDAP calls, thereby forcing developers to use Active Directory Services Interface (ADSI) for access to the directory. In fact, Active Directory fully supports native LDAP calls through support of the LDAP API (RFC 1823) — it is actually the primary access method for the majority of Windows directory operations. Microsoft merely provides ADSI as an abstraction layer to simplify directory development and promote directory interoperability.

It is important to note that Active Directory does have some functionality that is not exposed by the LDAP protocol since the functionality extends beyond the LDAP model. For instance, LDAP applications bind to specific directory servers via the server’s DNS name whereas Active Directory applications have the option of employing a distributed locator service to find a nearby replica of a given directory partition. This option does not force ISVs to rewrite their applications since they can bind to a specific LDAP server as usual. However, if an ISV wants to use the Active Directory distributed locator service to reduce the customer cost of managing an application, they can call the DsGetDcName API and then use LDAP. Conversely, the ISV may use ADSI, which abstracts away both DsGetDcName and LDAP. The choice is entirely up to the ISV.
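
As a rough, non-Windows illustration of the locator concept, the sketch below queries the DNS SRV records that Active Directory registers for its domain controllers; DsGetDcName performs a more sophisticated, site-aware version of this lookup. The sketch assumes the open-source dnspython package (2.x) and a placeholder domain, sample.com; it is not taken from the original paper.

import dns.resolver

# Active Directory registers SRV records for its domain controllers. A client
# can discover LDAP servers for a domain by querying those records and then
# binding to one of the returned hosts with plain LDAP.
answers = dns.resolver.resolve('_ldap._tcp.dc._msdcs.sample.com', 'SRV')

for record in sorted(answers, key=lambda r: (r.priority, -r.weight)):
    print(record.target.to_text(), record.port)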

Active Directory fully supports writing to the Service Principal Name via the Windows Server 2003 and Windows 2000 Server LDAP APIs.

 

Directory Interoperability

It is not uncommon to find a variety of directories—from resource to application-specific—deployed within the enterprise today. However, multiple directories can create complex challenges for the users, administrators, and developers of any organization. From multiple user logon requirements and management complexities to developers needing to support multiple versions of their applications, directory complexities may continue without a common management and development framework. Microsoft has introduced several interoperability tools to help customers reduce the complexities of using and managing heterogeneous directory servers.

LDAP API

An LDAP API is available through the Microsoft Developer Network (MSDN) Platform SDK. It provides native LDAP functionality that client applications can use to search for and retrieve information from an LDAP-compliant directory server. Additional functions, such as modifying directory entries and allowing clients to authenticate themselves, are also available within the LDAP API. The API supports all of the LDAP functions identified in the previous sections, which assists development efforts.

Active Directory Services Interface

ADSI is a set of Component Object Model (COM) programming interfaces that make it easy for customers and independent software vendors (ISVs) to build applications that register with, access, and manage multiple directory services with a single set of well-defined interfaces. ADSI is an open API.

One of the most familiar open programming APIs is Open Database Connectivity (ODBC). ODBC provides open interfaces for relational databases, thus allowing developers to write applications and tools that work with any database that supports ODBC. Because of the thriving ODBC development community, every major relational database now supports ODBC. ADSI is the equivalent of ODBC for directory services.

ADSI provides developers with access to multiple directory service providers through a set of interfaces. Applications written to ADSI work with any directory service that offers an ADSI provider, thus addressing some of the complexities outlined previously. By abstracting the capabilities of directory services from different network providers, ADSI presents a single set of directory service interfaces for managing network resources.

For example, ADSI defines a naming convention that can uniquely identify an ADSI object in an LDAP-compliant or similar directory. These names are called AdsPath strings. In the following code example, AdsPath contains the LDAP provider’s moniker and the provider’s specific path to identify an organizational unit in the Sample Domain:

LDAP://OU=Sales, DC=Sample, DC=COM

As an interoperability tool, ADSI is:

  • Open – Any directory vendor can implement an ADSI provider; Microsoft has included an LDAP ADSI provider.
  • Directory Service Independent – Administrative applications are not bound to a vendor’s directory service, as applications can be developed without an understanding of vendor-specific directory APIs.
  • Multiple Development Language Capable – COM applications can be written in languages such as Microsoft® Visual Basic®, Microsoft Visual Basic Scripting Edition (VBScript), and C/C++.
  • Simple Programming Model – ADSI consists of a small, easy-to-learn set of interfaces.
  • Scriptable – Any automation-compatible language can be used to develop directory service applications. Administrators and developers can use the tools they already know.
  • Functionally Rich – ISVs and sophisticated end users can develop serious applications using the same ADSI models used for simple scripted administrative applications.
  • Extensible – Directory providers, ISVs, and end users can extend ADSI with new objects and functions to add value or meet unique needs.
  • OLE DB Aware – ADSI provides an OLE DB interface so that programmers familiar with database programming through OLE DB can use it effectively.

Development Environments

ADSI client applications can be written in many languages. For most administrative tasks, ADSI defines interfaces and objects accessible from automation-compliant languages such as Microsoft® Visual Basic®, Microsoft Visual Basic Scripting Edition (VBScript), Active Server Pages (ASP), and Java, as well as from the more performance- and efficiency-conscious languages such as C and C++. A good foundation in COM programming is useful to the ADSI programmer.

Active Directory Application Mode

Active Directory Application Mode (ADAM) is a new capability of Microsoft Active Directory that addresses certain deployment scenarios related to directory-enabled applications. ADAM runs as a non-operating system service and, as such, does not require deployment on a domain controller. Running as a non-operating system service means that multiple instances of ADAM can run concurrently on a single server, with each instance being independently configurable.

While the vision of a single enterprise directory is still unfulfilled, ADAM represents a significant breakthrough in directory services technology that separates typical Network Operating System (NOS) functionality from a native LDAP directory.

Directory Services Markup Language

The Directory Services Markup Language (DSML) provides a means of representing directory structural information and directory operations as an XML document. The intent of DSML is to allow XML-based enterprise applications to leverage profile and resource information from a directory in their native environment. DSML allows XML and directories to work together and provides a common ground for all XML-based applications to make better use of directories.

DSML Services for Windows extends the power of the Active Directory service. Because DSML Services for Windows uses open standards such as HTTP, XML, and SOAP, a greater level of interoperability is possible. For example, in addition to the already standard Lightweight Directory Access Protocol (LDAP), many devices and other platforms have additional ways to communicate with Active Directory. This provides a number of key benefits for IT administrators and independent software vendors (ISVs), who now have even more open-standard choices for accessing Active Directory.

DSML Services for Windows supports DSML version 2 (DSMLv2). DSMLv2 is a standard approved by the Organization for the Advancement of Structured Information Standards (OASIS) and backed by many directory services vendors. Supporting the DSMLv2 specification allows better interoperability with other directory services vendors who also support this standard. For information about the DSMLv2 specification and schema, see the OASIS Web site.

Microsoft Identity Integration Server 2003, Enterprise Edition

For organizations that must support heterogeneous directories, Microsoft Identity Integration Server (MIIS) 2003, Enterprise Edition, enables the integration and management of identity information across multiple repositories, systems, and platforms. The administration of these multiple repositories often leads to time-consuming and redundant effort, and it also causes frustration for users who may need to remember multiple IDs and passwords for different applications or systems. The larger the organization, the greater the potential for multiple repositories and the effort required to keep them current.

MIIS augments Active Directory by providing broad interoperability capabilities, including integration with a wide range of identity repositories; provisioning and synchronizing identity information across multiple stores; and brokering changes to identity information by automatically detecting updates and sharing the changes across systems.

 

Additional Resources

See the following resources for further information:

Lightweight Directory Access Protocol Version 3

The IETF LDAPv3 Working Group:

The LDAPv3 Working Group archived newsgroup:

RFC 3377, the current definition of LDAPv3:

Open Group and the Directory Interoperability Forum

The Open Group’s VSLDAP compliance testing suite overview:

The Directory Interoperability Forum (DIF):

Developing with Active Directory Services Interface

Active Directory Services Interface developer’s reference:

Miscellaneous

The Microsoft Active Directory Web site:

Active Directory Application Mode (ADAM):

Directory Services Markup Language (DSML):

Microsoft Identity Integration Server 2003, Enterprise Edition:

The Microsoft Windows 2000 inetOrgPerson Kit:

LDAP reference for developers on MSDN:

LDAP API reference for developers on MSDN:

 

 

For the latest information about Windows Server 2003, see the Windows Server 2003 Web site at http://www.microsoft.com/windowsserver2003.


Identity and Access

Windows Server 2012

Table of contents

Identity and access enhancements in Windows Server 2012
  Protecting digital assets with previous versions of Windows Server
  Protecting digital assets with Windows Server 2012
Dynamic Access Control
  Classification
  Control access
  The structure of central access policies
  Central access policies and file servers
  Policy staging
  Access-denied remediation
  Security auditing
  Protection
Active Directory Domain Services
  Simplified deployment
  Deployment with cloning
  Safer virtualization of domain controllers
  Windows PowerShell script generation
  Active Directory for client activation
  Group-managed service accounts
DirectAccess and remote access
  Integrated remote access
  Cross-premises connectivity
  Improved management experience
  Easier deployment
  Improved deployment scenarios
  Scalability improvements
Summary
List of charts, tables, and figures


Copyright information

© 2012 Microsoft Corporation. All rights reserved. This document is provided “as-is.” Information and views expressed in this document, including URL and other Internet website references, may change without notice. You bear the risk of using it. This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes. You may modify this document for your internal, reference purposes.


Identity and access enhancements in Windows Server 2012

Today’s organizations need the flexibility to respond rapidly to new opportunities. They also need to give workers access to data and information—across varied networks, devices, and applications—while still keeping costs down. Innovations that meet these needs—such as virtualization, multitenancy, and cloud-based applications—help organizations maximize existing infrastructure investments, while exploring new services, improving management, and increasing availability.

While some factors—such as hybrid cloud implementations, a mobile workforce, and increased work with third-party business partners—add flexibility and reduce costs, they also lead to a more porous network perimeter. When organizations move more and more resources into the cloud, and grant network access to mobile workers and business partners outside the firewall, managing security, identity, and access control becomes a greater challenge. Adding to this challenge are increasingly stringent regulatory requirements, such as the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule and the Sarbanes-Oxley Act of 2002 (SOX), both of which increase the cost of compliance.

New and enhanced capabilities in Windows Server 2012 help organizations meet these challenges by making it easier and less costly to ensure secure access to valuable digital assets and comply with regulations. These identity and access improvements include:

Dynamic Access Control. A new feature that enables automated information governance on file servers for compliance with business and regulatory requirements.

Active Directory Domain Services. Storing directory data and managing communication between users and domains is improved by making it easier to deploy and virtualize domain services both locally and remotely, as well as simplifying management tasks.

DirectAccess. Always-available connectivity to the corporate network now includes simplified deployment, a streamlined management experience, and improved scalability and performance.

This paper provides an introduction to these Windows Server 2012 identity and access technologies for IT professionals.

Protecting digital assets with previous versions of Windows Server

A major obstacle faced by IT professionals today is the sheer volume of repetitive work that needs to be done, particularly when large deployments of physical or virtual desktops are involved. Manual work is not only time-consuming, but it makes systems less secure by introducing many opportunities for errors and misconfiguration. In previous versions of Windows Server, Microsoft helped reduce this type of work.

For example, Windows Server 2008 R2 included the following improvements:


File Classification Infrastructure. This new infrastructure for classifying files by tagging them enabled IT professionals to more easily manage unstructured data in files based on their organizational value.

Active Directory Domain Services. As part of the Information Protection Solution, Active Directory Domain Services was improved to make domain controllers easier to deploy, both on-premises and in the cloud.

System Management and Security. The Windows PowerShell 2.0 command-line interface enabled IT professionals to automate many common tasks involved in deploying and managing desktops.

Protecting digital assets with Windows Server 2012

With Windows Server 2012, Microsoft builds on these previous improvements by making it even easier to configure, manage, and monitor users, resources, and devices to improve security and automate auditing. In addition, Windows Server 2012 includes the following new and enhanced identity, access and data protection features:

Dynamic Access Control gives you the ability to automatically control and audit access to files in file shares across your organization based on the characteristics of both the files and the users requesting access to them. It uses claims to achieve this high degree of access control specificity. Claims, contained in security tokens, consist of assertions about a user or device, such as name or type, department, or security clearance, for example. You can employ user claims, device claims, and file classification tags to centrally control and audit access to files, as well as use Rights Management Services (RMS) to protect information in files across your organization.

Active Directory Domain Services is important in new hybrid cloud infrastructures because it supports the increased need for security, compliance, and access control. Furthermore, unlike many competing cloud services, which require users to have a separate set of credentials hosted with the provider, Active Directory Domain Services provides continuity between your on-premises and cloud resources so that users only need a single set of credentials no matter where the resources are located. A new deployment wizard and support for cloning virtual domain controllers make Active Directory Domain Services easier to virtualize and simpler to deploy, both locally and remotely. The introduction of Active Directory Domain Services and Windows PowerShell 3.0 integration, and the ability to capture and record command-line syntax as you perform tasks in the Active Directory Administrative Center, improves automation of manual tasks. Other new features include desktop activation support using Active Directory Domain Services and Group Managed Service Accounts.

DirectAccess allows nearly any user who has an Internet connection to more securely access corporate resources, such as email servers, shared folders, or internal websites, with the experience of being easily connected to the corporate network. Windows Server 2012 offers a new, unified management experience, allowing administrators to configure DirectAccess and legacy virtual private network (VPN) connections from one location. Other enhancements simplify deployment, and improve performance and scalability.

The remaining sections in this paper describe these new features in more detail.


Dynamic Access Control

Dynamic Access Control in Windows Server 2012 gives IT professionals new ways to control access to file data and monitor regulatory compliance. It provides next-generation authorization and auditing controls, along with classification capabilities that let you apply information governance to the unstructured data on file servers.

Until now, file security was handled at the file and folder level. IT professionals had little control over the way security was handled by users day to day. However, by using Dynamic Access Control, you can restrict access to sensitive files regardless of user actions by establishing and enforcing file security policy at the domain level, which is then enforced across all Windows Server 2012 file servers. For instance, if a development engineer accidentally posts confidential files to a publicly shared folder, those files can still be protected from access by unauthorized users.

In addition, security auditing is now more powerful than ever, and audit tools make it easier to prove compliance with regulatory standards, such as the requirement that access to Health and Biomedical Information (HBI) is appropriately guarded and regularly monitored.

Windows Server 2012 provides the following new and enhanced ways to control access to your files while providing authorized users the resources they need:

Classify. Automatic and manual file classification using an improved file classification infrastructure. There are several methods to manually or automatically apply classification tags to files on file servers across the organization.

Control. Central access control for information governance. You can control access to classified files by applying central access policies (CAPs). CAPs are a new feature of Active Directory that enables you to define and enforce very specific requirements for granting access to classified files. For example, you can define which users can have access to files that contain health information within the organization by using claims that might include employment status (such as full-time or contractor) or access method (such as managed computer or guest). You can even require two-factor authentication, such as that provided by smart cards. Central access control functionality includes the ability to provide automated, assisted access-denied remediation when users have problems gaining access to files and shares.

Audit. File access auditing for forensic analysis and compliance. You can audit access to files by using central audit policies, for example, to identify who gained (or tried to gain) access to highly sensitive information.

Protect. Classification-based encryption. You can apply protection by using automatic RMS encryption for sensitive documents. For example, you can configure Dynamic Access Control to automatically apply RMS protection to all documents that contain HIPAA information. This feature requires a previously provisioned RMS environment.

Dynamic Access Control can provide these classification, control, audit, and protection capabilities because it is built on top of the following technologies:

• A new Windows authorization and audit engine that can process conditional expressions and central policies

• Kerberos support for user claims and device claims within Active Directory Domain Services


• Improvements to the File Classification Infrastructure

• RMS extensibility support so that partners can provide solutions that encrypt third-party files

You can use the Dynamic Access Control application programming interface (API) to extend these technologies and create custom classification tools, audit software, and more.

Classification

The first step in establishing file access policies that are more secure is to identify the files, and then classify them by applying tags to group files based on the information they contain. In Windows Server 2012, files are tagged in one of four ways:

By location. When a file is stored on a file server, it inherits the tags from its parent folder. Folder tags are specified by the folder owner.

Manually. Users and administrators can manually apply tags through the Windows 8 operating system File Explorer interface, or use data entry applications to apply them.

Automatically. Automatic classification processes in Windows Server 2012 can automatically tag files, depending on the content of the file. This method is useful for applying tags to large numbers of files.

By application API. Applications can use APIs to tag files that they manage. For example, tags can be specified by line-of-business (LOB) applications that store information on file servers, or by data management applications.

Control access

Controlling access to files is enforced by central access policies—sets of authorization policies that you centrally manage in Active Directory Domain Services, and deploy to file servers using Group Policy. You can use CAPs to comply with both organizational and regulatory requirements. CAPs help you to create more complete access policies by pairing information about files in the form of file tags with user and device claims.

In earlier versions of Windows Server, claims were used only by Active Directory Federation Services to authorize users in one domain to use applications in different, federated domains based on attributes submitted to Active Directory Federation Services. In Dynamic Access Control, the functionality of claims is essentially the same. In both cases, a claim consists of one or more statements (for example, name, identity, key, group, privilege, or capability) made about a user or device. These statements are contained in a security token that is issued and signed by a trusted partner or entity (such as Active Directory Domain Services) and used for authorizing that user or device to access a resource. You can create claim properties, either manually by using the Active Directory management tools or by using an identity management tool. In the case of Dynamic Access Control, the token is issued by Active Directory Domain Services and the resource to be accessed is a file.

In Dynamic Access Control, claims can be combined into logical policies that enable fine-grained control over arbitrarily-defined subsets of files. The following are two examples of situations where you might want to apply such policies:

• To obtain access to high-business-impact (HBI) information, a user might be required to be a full-time employee. In this scenario, you would have to do the following:

o Identify and tag the files that contain HBI information.


o Identify the full-time employees in your organization.

o Create a central access policy that applies to all files that contain HBI on all file servers across the organization.

• To enforce an organization-wide requirement to restrict access to personally identifiable information (PII) in files so that only the file owner and members of the human resources (HR) department are allowed to view it, you might implement a policy that applies to all PII files independent of their location. In this scenario, you would have to do the following:

o Identify and classify (tag) the files that contain PII.

o Identify the group of HR members that are allowed to view PII information.

o Create a CAP that applies to all files that contain PII on all file servers across the organization.

The motivation to deploy and enforce an authorization policy can arise for different reasons and from multiple levels of the organization. The following are some examples:

Organization-wide authorization policy. Most commonly initiated from the information security office, this type of authorization policy arises from compliance or another high-level requirement that is relevant across the organization. For example, HBI files should be accessed by full-time employees only.

Departmental authorization policy. Various departments in an organization may have special data-handling requirements that they want to enforce. For example, the finance department might want to limit access to finance servers for the finance employees.

Specific data-management policy. This type of policy usually arises from compliance and organizational requirements for protecting information that is being managed, such as to prevent modification or deletion of files that are under retention or files that are under electronic discovery (eDiscovery).

Need-to-know policy. This type of policy is typically used in conjunction with the policy types mentioned earlier. The following are two examples:

o Vendors should be able to access and edit only those files that relate to a project that they are working on.

o In financial institutions, information barriers are important so that analysts do not access brokerage information and brokers do not access analysis information.

The structure of central access policies

CAPs stored in Active Directory Domain Services act as security umbrellas that an organization applies across its file servers. These policies supplement—but do not replace—the local access policy, or discretionary access control list (DACL), applied to files and folders. For example, if a local DACL allows access to a specific user, but a CAP that is applied to the file denies access to the same user, the user cannot access the file. The reverse also applies: if a CAP allows access to a user but the local DACL denies it, the user cannot access the file. File access is only possible when permitted by both local DACLs and CAPs.

A CAP can contain many rules—each of which are evaluated and deployed as part of an overall CAP. Each rule contained in the CAP has the following logical parts:

Applicability. This is a condition that defines which files the rule applies to. For example, the condition can define all files tagged as containing personal information.


Access conditions. This is a list of one or more access control entries (ACEs) that define who can access the data, such as allow read and write access if the user has a high clearance level and their device is a managed device.

Figure 1 shows the components of a CAP rule, and how they can be combined to create very explicit data access policies.

Figure 1: CAP components
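
To make the interplay between applicability, access conditions, and the local DACL concrete, the following is a conceptual Python sketch. It is not the Windows authorization engine or any Windows API; the claim and tag names are invented for illustration.

# Conceptual sketch of a single CAP rule: it applies to files tagged as
# containing personally identifiable information (PII) and allows access only
# to HR members who are using a managed device.

def rule_applies(file_tags):
    # Applicability: which files does the rule cover?
    return file_tags.get('classification') == 'PII'

def access_conditions(user_claims, device_claims):
    # Access conditions: who may access the covered files?
    return ('HR' in user_claims.get('department', [])
            and device_claims.get('managed', False))

def authorize(file_tags, user_claims, device_claims, dacl_allows):
    # A CAP supplements the local DACL: access requires both to allow it.
    if not dacl_allows:
        return False
    if rule_applies(file_tags):
        return access_conditions(user_claims, device_claims)
    return True  # the rule does not apply, so the DACL decision stands

print(authorize({'classification': 'PII'},
                {'department': ['HR']},
                {'managed': True},
                dacl_allows=True))  # True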

Central access policies and file servers

Figure 2 shows the interrelationships between Active Directory Domain Services, where CAPs, user claims, and property definitions are defined and stored; the file server, where these policies are applied; and the user, who is trying to gain access to a file on the file server.

Figure 2: CAP structure

Figure 3 shows how you can combine different rules (blue boxes) into a CAP (green boxes) that can then be applied to shares on file servers across the organization.


Figure 3: Combining multiple policies into policy lists and applying them to resources

Policy staging

When you want to change a policy, Windows Server 2012 lets you test a proposed policy that runs parallel to the current policy so that you can identify the consequences of the new policy without enforcing it. This feature, which is known as policy staging, lets you measure the effects of a new policy in the production environment.

When policy staging is enabled, Windows Server 2012 continues to use the current policy to authorize user access to files. However, if the access allowed by the proposed policy differs from that of the current policy, the system logs an event with the details. You can use the logged events to determine whether the policy has to be changed or is ready to be deployed.
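
Conceptually, staging evaluates both policies for every request, enforces only the current one, and records any disagreement. The short Python sketch below illustrates that idea; it is not the Windows implementation, and the logging call merely stands in for the event that Windows Server 2012 writes.

import logging

def authorize_with_staging(request, current_policy, proposed_policy):
    enforced = current_policy(request)   # this decision is applied
    staged = proposed_policy(request)    # this decision is only compared
    if staged != enforced:
        logging.info('Staged policy differs for %s: current=%s, proposed=%s',
                     request.get('file'), enforced, staged)
    return enforced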

Access-denied remediation

Of course, denying access is only part of an effective central access control strategy; sometimes access must be granted after it is initially denied. Today, when access is denied, the user does not receive additional information about how to get access, which creates significant work for users as well as for help desk and IT administrators. To mitigate this problem, assisted access-denied remediation in Windows Server 2012 enables you to provide the user with additional information and the opportunity to send an access request email message to the appropriate owner. Access-denied remediation reduces the need for manual intervention by providing the following three processes for granting users access to resources:

Self-remediation. If users can determine what the issue is and correct the problem so that they can get the requested access, the impact on the organization is low, and minimal special exceptions are needed in the organization policy. Windows Server 2012 helps you to author a general access-denied message to help users self-remediate when access is denied. This message can include URLs to direct the users to self-remediation websites provided by the organization.


Remediation by the file owner. Windows Server 2012 enables you to create a distribution list of file or folder owners so that users can directly connect with them to request access. This resembles the Microsoft SharePoint model, where the share owner receives a user’s request for access to a file. Remediation can range from adding user rights to the appropriate file or folder to editing share permissions. For example, if a local DACL on a file allows access to a specific user but a CAP restricts access to the same user, the user will be unable to gain access to the file. In Windows Server 2012 when a user requests access to a file or a folder, an email message with the request details is sent to the file owner. When additional help is required, the file owner can in turn forward this information to the appropriate IT administrator.

Remediation by help desk and file server administrators. When the user cannot self-remediate an access problem and the file owner cannot help, the issue can be corrected manually by the help desk or file server administrator. This is the most costly and time-consuming remediation. Windows Server 2012 provides a UI to view the effective permissions for users on a file or a folder so that it is easier to troubleshoot access issues.

Figure 4 shows the series of events involved in access-denied remediation.

Figure 4: Access-denied remediation

Access-denied remediation grants a user access to a file after access has initially been denied:


1. The user attempts to read a file.

2. The server returns an “access denied” error message because the user has not been assigned the appropriate claims.

3. On a computer running Windows 8, Windows retrieves the access information from the File Server Resource Manager on the file server, and presents a message with the access remediation options, which may include a link for requesting access.

4. When the user has satisfied the access requirements (for example, by signing a non-disclosure agreement, or providing other authentication) the user’s claims are updated.

5. The user can access the file.

 

Security auditing

Security auditing of file access is one of the most powerful tools to help maintain the security of an organization. A key goal of security auditing is regulatory compliance. For example, industry standards such as SOX, HIPAA, and Payment Card Industry (PCI) require organizations to follow a strict set of rules related to data security and privacy. Security audits help establish the presence or absence of such policies and thereby prove compliance or noncompliance with these standards. Additionally, security audits help detect anomalous behavior, identify and reduce gaps in security policy, and deter irresponsible behavior by creating a trail of user activity that can be used for forensic analysis. Audit policy requirements typically arise from three levels:

Information security. File access audit trails are frequently used for forensic analysis and intrusion detection. The ability to monitor specified events regarding access to high-value information lets organizations significantly improve their response time and investigation accuracy.

Organizational policy. For example, organizations regulated by PCI standards can have a central policy to monitor access to all files that are marked as containing credit card information and PII, or organizations may want to monitor all unauthorized attempts to view information about their projects.

Departmental policy. For example, the finance department may require that the ability to modify certain finance documents (for example, a quarterly earnings report) be restricted to the finance department, and thus want to monitor all other attempts to change these documents. Additionally, the compliance department may want to monitor all changes to central policies and policy constructs such as user, computer, and resource attributes.

One of the biggest considerations for security audits is the cost of collecting, storing, and analyzing audit events. If the audit policies are too broad, the volume of audit events that are collected increases, making it more time-consuming and expensive to identify the most important audit events. However, if the audit policies are too narrow, you risk missing important events.

With Windows Server 2012, you can author audit policies by combining claims and resource properties (file tags). This leads to audit policies that are richer, more selective, and easier to manage by reducing the number of potential audit events to those most relevant to your auditing requirements. It enables scenarios that until now were either impossible, or too difficult to implement. The following are examples of such audit policies:


• Audit everyone who does not have a high security clearance and yet tries to access an HBI document. In this example, “high security clearance” is a claim, and “HBI” is a file tag. The audit policy triggers an event when a user who does not have a high security clearance tries to gain access to an HBI document.

• Audit all vendors when they try to access documents related to projects that they are not working on. In this example, the user’s claims include an employment status of “vendor” as well as a list of projects the user is authorized to work on. In addition, each file in the organization to which a vendor could potentially have access was tagged to identify its associated project. The audit policy triggers an event if a vendor tries to access a project file, and none of the projects in the vendor’s claim list matches the file’s project tag.
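
Both policies can be read as simple predicates over user claims and file tags. The following conceptual Python sketch shows when each one would raise an audit event; it is not Windows audit policy syntax, and the claim and tag names are invented for illustration.

def should_audit(user_claims, file_tags):
    # Example 1: a user without a high security clearance touches an HBI document.
    if (file_tags.get('impact') == 'HBI'
            and user_claims.get('clearance') != 'High'):
        return True
    # Example 2: a vendor touches a file whose project tag is not on the list of
    # projects the vendor is authorized to work on.
    if (user_claims.get('employment') == 'Vendor'
            and file_tags.get('project') not in user_claims.get('projects', [])):
        return True
    return False

print(should_audit({'employment': 'Vendor', 'projects': ['Alpha']},
                   {'project': 'Beta'}))  # True: an audit event is logged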

To view and query audit events, you can use familiar tools such as Event Viewer on the local server or Microsoft System Center Operations Manager Audit Collection Service across multiple servers. The Dynamic Access Control API also provides support for integrating DAC audit events into third-party audit software consoles. These tools help you to answer such questions as, “Who is accessing my HBI data?” or “Was there an unauthorized attempt to access sensitive data?”

Figure 5 shows the file-access auditing workflow and the interrelationships between Active Directory, where claim types and resource properties are created; Group Policy, where the audit policies are defined and stored; the file server, where policies and resource properties are applied as file tags; and the user, who is trying to access information on the file server.

Figure 5: Central auditing workflow

Protection

Protecting sensitive information involves reducing risk for the organization. Various regulations, such as HIPAA or Payment Card Industry Data Security Standard (PCI-DSS), require encryption of information, and there are many business reasons to also encrypt sensitive information. However, encryption is expensive and can adversely affect productivity. Therefore, organizations usually have different approaches and priorities for it.


To support this scenario, Windows Server 2012 lets you automatically encrypt sensitive Microsoft Office files based on their classification. This is done through automatic file management tasks that run on the server and apply RMS protection to a sensitive Microsoft Office document a few seconds after the file is identified as sensitive on the file server (continuous file management tasks).

RMS encryption provides another layer of protection for files. If a person with access to a sensitive file inadvertently sends that file out through email, the file is still protected by the RMS encryption. Nearly any user who wants to gain access to the file must first authenticate to an RMS server to receive the decryption key. This process is illustrated in Figure 6.

Figure 6: Classification-based RMS protection

Dynamic Access Control allows sensitive information to be automatically protected using Active Directory RMS:

1. A rule is created to automatically apply RMS protection to nearly any file that contains the word “confidential.”

2. A user creates a file with the word “confidential” in the text and saves it.

3. The RMS Dynamic Access Control classification engine, following rules set in the CAP, discovers the document with the word “confidential,” and initiates RMS protection accordingly.

4. The RMS template and encryption are applied to the document on the file server, and it is classified and encrypted.

 

Note that support for third-party file formats is available through third-party vendors. Also be aware that if a file is RMS-protected, data management features such as search- or content-based classification are no longer available for that file.
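
The classification step in this workflow can be pictured as a recurring task that scans documents for the trigger word and hands the matches to RMS. The Python sketch below illustrates only that content-scanning idea; the real feature uses the File Classification Infrastructure and an RMS template rather than code like this, and the share path and file pattern are placeholders.

from pathlib import Path

def find_confidential(share_root):
    # Scan text documents under the share for the word "confidential" and yield
    # the matches as candidates for the RMS-protection file management task.
    for path in Path(share_root).rglob('*.txt'):
        text = path.read_text(errors='ignore')
        if 'confidential' in text.lower():
            yield path

for match in find_confidential(r'\\fileserver\finance'):
    print('Would apply the RMS template to:', match)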


Active Directory Domain Services

Active Directory has been at the center of IT infrastructure for more than 10 years, and its features, adoption, and business value have grown with each new release. Today, most Active Directory infrastructure remains on-premises, but the trend toward cloud computing is creating the need to deploy Active Directory in the cloud as well.

New hybrid infrastructures are emerging, and Active Directory Domain Services must support the needs of new and unique deployment models that include services hosted entirely in the cloud, services that consist of both cloud and on-premises components, and services that remain exclusively on-premises. These new hybrid models further increase the importance of security and compliance, and compound the already complex and time-consuming exercise of ensuring that access to information and services is appropriately audited, and accurately expresses the intent of the organization.

Active Directory Domain Services in Windows Server 2012 addresses these emerging needs with features that help you more quickly and easily deploy domain controllers both on-premises and in the cloud, audit and authorize access to files, and perform administrative tasks at scale—either locally or remotely—through consistent graphical and scripted management interfaces. Active Directory Domain Services in Windows Server 2012 improvements include the following:

• Simpler on-premises deployment, which replaces DCpromo with a new, streamlined domain controller configuration wizard that is integrated with Server Manager and built on Windows PowerShell 3.0.

• More rapid deployment of virtual domain controllers through cloning.

• Better support for public and private cloud implementations through safer virtualization of domain controllers.

• A consistent graphical and scripted management interface that enables you to perform tasks in the Active Directory Administrative Center and automatically generate the syntax required to fully automate the task in Windows PowerShell 3.0.

• Functionality that uses Active Directory to simplify client activations.

• Group Managed Service Accounts for groups of servers such as server clusters that share their identity and service principal name.

Using these features, you can effectively and efficiently deploy and manage Active Directory Domain Services over multiple servers, locally and around the globe.

Simplified deployment

The new Active Directory Domain Services configuration wizard in Windows Server 2012 integrates all the required steps to deploy new domain controllers into a single graphical interface. It requires only one organization-level credential, and can prepare the forest or domain by remotely targeting the appropriate operations master role holders. It conducts extensive prerequisite validation tests that minimize the opportunity for errors that might have otherwise blocked or slowed the installation. The wizard is built on Windows PowerShell 3.0, and is integrated with Server Manager. It can configure multiple servers and remotely deploy domain controllers, resulting in a deployment experience that is simpler, more consistent, and less time-consuming.

The Active Directory Domain Services configuration wizard includes the following features:

Adprep integration into the Active Directory Domain Services deployment process. This reduces the time that is required to deploy Active Directory Domain Services, and reduces the chances for errors that might block domain controller promotion.

Remote execution against multiple servers. This greatly reduces the probability of administrative errors and the overall time required for deployment, especially when you deploy multiple domain controllers across different countries/regions and domains.

Prerequisite validation. This identifies potential errors before the deployment begins. You can correct error conditions before they occur without the concerns that result from a partially complete upgrade.

Configuration pages grouped in a sequence that mirror the requirements of the most common promotion options, with related options grouped in fewer wizard pages. This provides better context for making installation choices, and reduces the number of steps and time that is required to complete domain controller installation.

Options that were specified in the wizard are exported into a Windows PowerShell script. This simplifies the process of automating later Active Directory Domain Services installations through automatically generated Windows PowerShell scripts.

Deployment with cloning

With earlier versions of Windows Server, administrators found that deploying virtualized replica domain controllers was as labor-intensive as deploying physical domain controllers. In theory, this should not be the case because virtualization brings the possibility of cloning domain controllers, instead of performing all deployment steps separately for each one. Domain controllers within the same domain/forest are nearly identical, except for name, IP address, and so on. Therefore virtualization should be fairly easy. With earlier versions of Windows Server, however, deployment still involved many (redundant) steps.

With Windows Server 2012, you can deploy replica virtual domain controllers by “cloning” existing virtual domain controllers. This significantly reduces the number of steps and time involved by eliminating repetitive deployment tasks and also lets you fully deploy additional domain controllers that are authorized and configured for cloning by the Active Directory domain administrator.

Safer virtualization of domain controllers

Active Directory Domain Services has been successfully virtualized for several years, but features present in most hypervisors can invalidate strong assumptions made by the Active Directory replication algorithms—primarily, the assumption that the logical clocks used by domain controllers to determine relative levels of convergence only go forward in time. Windows Server 2012 includes improvements that enable virtual domain controllers to detect when snapshots are applied to a virtual machine or a virtual machine is copied, causing the domain controller clock to go backward in time.

This new functionality is made possible by a virtual domain controller that uses a unique ID exposed by the hypervisor, called the virtual machine GenerationID. The virtual machine GenerationID changes when the virtual machine experiences an event that affects its position in time. The virtual machine GenerationID is exposed to the virtual machine’s address space within its Basic Input Output System (BIOS), and is made available to its operating system and applications through a Windows Server 2012 driver.

During startup, and before completing any transactions, a Windows Server 2012 virtual domain controller compares the current value of the virtual machine GenerationID against the value that it stored in the directory. A mismatch is interpreted as a “rollback” event, and the domain controller uses safeguards in Active Directory Domain Services that are new to Windows Server 2012. The safeguards enable the virtual domain controller to converge with other domain controllers and also prevent it from creating duplicate security principals.

For Windows Server 2012 virtual domain controllers to gain this extra level of protection, the virtual domain controller must be hosted on a virtual machine GenerationID–aware hypervisor such as Windows Server 2012 Hyper-V.
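
The startup comparison described above amounts to a few lines of logic. The following conceptual Python sketch is not the domain controller's actual code; it only shows the decision that the GenerationID check drives.

def startup_check(generation_id_from_hypervisor, generation_id_in_directory):
    # A mismatch means a snapshot was applied or the virtual machine was copied,
    # so the domain controller must re-converge with its replication partners
    # and avoid creating duplicate security principals before resuming.
    if generation_id_from_hypervisor != generation_id_in_directory:
        return 'rollback detected: apply virtualization safeguards'
    return 'normal startup'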

Windows PowerShell script generation

The Windows PowerShell cmdlets for Active Directory are a set of tools that help you to manipulate and query Active Directory Domain Services by using Windows PowerShell commands, and to create scripts that automate common administrative tasks. The Active Directory Administrative Center uses these cmdlets to query and modify Active Directory Domain Services according to the actions that are performed within the Active Directory Administrative Center UI.

In Windows Server 2012, the Windows PowerShell History viewer in the Active Directory Administrative Center, as shown in the following figure, lets an administrator view the Windows PowerShell commands as they execute in real time. For example, when you create a new fine-grained password policy, the Active Directory Administrative Center displays the equivalent Windows PowerShell commands in the Windows PowerShell History viewer task pane. You can then use those commands to automate the process by creating a Windows PowerShell script.

Figure 7: Windows Server 2012 Windows PowerShell History viewer

By combining scripts with scheduled tasks, you can automate everyday administrative duties that were previously completed manually. Because the cmdlets and required syntax are created for you, very little experience with Windows PowerShell is required. Because the Windows PowerShell commands are the same as the ones executed by the Active Directory Administrative Center, they should replicate their original function exactly.


Active Directory for client activation

Client licensing is an additional labor-intensive task that can be eased with the improved Active Directory functionality in Windows Server 2012. With earlier versions of Windows Server, volume licensing for Windows and Office required Key Management Service (KMS) servers. These entail several drawbacks:

• They require remote procedure call (RPC) network traffic, which some organizations want to disable.

• Additional training is necessary.

• The turnkey solution only covers approximately 90 percent of deployments.

• There is no graphical administration console, so the process is more complex than it needs to be.

• KMS does not support any kind of authentication, because the Microsoft Software License terms prohibit the customer from connecting the KMS server to any external network.

• Anyone with access to the service can activate.

This situation is improved in Windows Server 2012, which leverages your existing Active Directory infrastructure to help you activate clients. No additional computers are required, and no RPC is needed. The activation uses Lightweight Directory Access Protocol (LDAP) exclusively and includes support for read-only domain controllers (RODCs).

In this activation process, the only data written back to the directory is what’s required for the installation and service. Activating the initial customer-specific volume license key (CSVLK) requires the following:

• One-time contact with Microsoft Activation Services over the Internet (identical to retail activation).

• A key entered using the volume activation server role or the command line.

• Repetition of the activation process for additional forests (by default, up to six times).

Another benefit of Active Directory integration is that the activation object is maintained in the configuration partition. This represents proof-of-purchase, and means that the activated computers can be a member of any domain in the forest. And, perhaps most important, with Active Directory activation integration, all computers that are running Windows 8 will automatically activate. This represents a significant workflow improvement over earlier versions of Windows Server, and is achieved by using additional resources that are provided in Active Directory.

Group-managed service accounts

Managed service accounts (MSAs) were a new type of account introduced in Windows Server 2008 R2 and Windows 7 to enhance the service isolation and manageability of network applications such as Microsoft SQL Server and Exchange Server. They eliminate the need for an administrator to manually administer the service principal name (SPN) and credentials for domain-level service accounts.

Until now, however, this feature has not been available for server groups, such as clusters, that share their identity and service principal name. This creates a problem for IT administrators.

When a client connects to a service hosted on a server farm using network load balancing (NLB) or some other method where all the servers appear to be the same service to the client, authentication protocols supporting mutual authentication, such as Kerberos, cannot be used unless all the instances of the services use the same principal (which means that they use the same passwords/keys to prove their identity). Service administrators find managing this to be difficult.

When a client connects to a shared service, it cannot know in advance which instance it will connect to, so the authentication must succeed regardless of the host. This requires that each instance of the service use the same security principal. Today, services have four principals to choose from, each with its own issues: computer, virtual, managed service, or user.

Computer, MSA, or virtual accounts cannot be shared across multiple systems. This only leaves the option of using a user account for services on server farms. Because user accounts do not have password management, each organization then has to create a solution to update keys for the service in Active Directory, and distribute the keys to all instances of the services. This is expensive and problematic.

By creating a group MSA, services or service administrators do not have to manage password synchronization between service instances. The group MSA supports credential reset, hosts that are kept offline for some time, and seamless management of member hosts for all instances of a service.

• You can deploy single-identity server farms/clusters on Windows Server 2012 to which domain clients can authenticate without knowing which instance of the server farm/cluster they are connecting to.

• You can configure services by using Service Control Manager to use a shared domain identity that automatically manages passwords.

• As soon as the group MSA is created, a domain administrator can delegate management of the group MSA to a service administrator.

• You can deploy single-identity server farms/clusters on hosts running Windows Server 2012 or Windows 8 for identities in mixed-mode domains.

DirectAccess and remote access

Windows Server 2012 provides an integrated remote access solution that is easier to deploy and manage when compared to earlier versions that relied on multiple tools and consoles. Employees can gain access to corporate network resources while they work remotely, and IT administrators can manage corporate computers in Active Directory that are located outside the internal network. Windows Server 2012 accomplishes this by integrating two existing remote access technologies: DirectAccess for automatic, transparent connectivity, and traditional virtual private networks (VPNs) for compatibility.

DirectAccess was introduced in Windows 7 and Windows Server 2008 R2 to help remote users to more securely access shared resources, websites, and applications on an internal network without connecting to a VPN. DirectAccess establishes bidirectional connectivity with an organization’s corporate network every time a DirectAccess-enabled computer is connected to the Internet. Users never have to think about connecting to the corporate network, and IT administrators can manage remote computers outside the office, even when the computers are not connected to the VPN. Windows Server 2012 continues to offer this transparent connection to the corporate network, with improvements around deployment, management, performance, and scalability. Remote access improvements in Windows Server 2012 include:

Integrated remote access: DirectAccess and VPN can be configured together in the Remote Access Management console by using a single wizard. The new role allows easier migration of Windows 7 Routing and Remote Access service (RRAS) and DirectAccess deployments.

Cross-premises connectivity: Windows Server 2012 provides a highly cloud-optimized operating system. VPN site-to-site functionality in remote access provides cross-premises connectivity between enterprises and hosting service providers, including Azure.

Improved management experience: By using the new Remote Access Management console, you can configure, manage, and monitor multiple DirectAccess and VPN remote access servers in a single location. The console provides a dashboard that allows you to view information about server and client activity.

Simplified deployment: In simple deployments, you can configure DirectAccess without being required to set up a certificate infrastructure. DirectAccess clients can now authenticate themselves by using only Active Directory credentials; no computer certificate is required.

New deployment scenarios: Remote access in Windows Server 2012 includes integrated deployment for several scenarios that required manual configuration in Windows Server 2008 R2. These include force tunneling (which sends all traffic through the DirectAccess connection), Network Access Protection (NAP) compliance, support for locating the nearest remote access server from DirectAccess clients in different geographical locations, and deploying DirectAccess for only remote management.

Improved scalability: Remote access in Windows Server 2012 offers several scalability improvements that support more users while providing better performance and lower costs. These include support for network load balancing, better performance in virtualized environments, and underlying platform improvements.

Integrated remote access

With DirectAccess, users who have an Internet connection can more securely access corporate resources, such as email servers, shared folders, or internal websites, with the experience of being easily connected to the corporate network.

DirectAccess transparently connects client computers to the internal network whenever the computer connects to the Internet, even before the user logs on, as shown in the following figure. This transparent, automatic connectivity means that access is provided without additional steps or configuration required by the user.


Figure 8: DirectAccess connection architecture

DirectAccess also lets you easily monitor connections and remotely manage DirectAccess client computers on the Internet.

At the same time, RRAS provides traditional client VPN connectivity for unmanaged client computers, such as computers running client operating systems earlier than Windows 7. In addition, RRAS site-to-site VPN provides connectivity between VPN servers, as shown in the following figure.

Figure 9: VPN connection architecture

The remote access server role in Windows Server 2012 integrates DirectAccess and RRAS VPN. You can configure DirectAccess and VPNs together in the Remote Access Management console by using a single wizard. You also can configure other RRAS features by using the legacy RRAS management console. The new role allows easier migration of Windows 7 RRAS and DirectAccess deployments, and it provides new features and improvements.


Cross-premises connectivity

Windows Server 2012 provides an operating system that is highly optimized for the cloud. VPN site-to-site functionality in remote access provides cross-premises connectivity between enterprises and hosting service providers. Cross-premises connectivity enables organizations to connect to private subnetworks in a hosted cloud network. It also enables connectivity between geographically separate enterprise locations.

With cross-premises connectivity, you can use existing networking equipment to connect to hosting providers by using the industry standard Internet Key Exchange version 2 (IKEv2) and IP security (IPsec) protocol.

Figure 10 demonstrates how these two organizations implement cross-premises deployments by using Windows Server 2012.

Figure 10: Example of a cross-premises deployment


The following steps describe the procedures that Contoso and Woodgrove use for the cross-premises deployment shown in Figure 10:

1. Contoso.com and Woodgrove.com offload some of their enterprise infrastructure in a hosted cloud.

2. The hosting provider provides private clouds for each organization.

3. In the hosted cloud, virtual machines running Windows Server 2012 are configured as remote access servers running site-to-site VPN.

4. In each hosted private cloud, a cluster of two or more remote access servers is deployed to provide continuous availability and failover.

5. Contoso.com has two branch office locations. In each location, a Windows Server 2012 remote access server is deployed to provide a cross-premises connectivity solution to the hosted cloud and between the branch offices.

6. The Contoso.com branch office computers running the unified remote access server role in Windows Server 2012 are also configured as DirectAccess servers in a multisite deployment. DirectAccess clients can access any resource in the Contoso.com public cloud or Contoso.com branch offices from nearly any location on the Internet.

7. Woodgrove.com can use existing routers to connect to the hosted cloud because cross-premises functionality in Windows Server 2012 complies with IKEv2 and IPsec standards.

 

Improved management experience

By using the new Remote Access Management console, you can configure, manage, and monitor multiple DirectAccess and VPN remote access servers in a single location. The console provides a dashboard that allows you to view information about server and client activity. You can also generate reports for additional, more detailed information. Operations status provides comprehensive monitoring information about specific server components. Event logs and tracing help diagnose specific issues. By using client monitoring, you can see detailed views of connected users and computers, and you can even monitor which resources the clients are accessing. Accounting data can be logged to a local database or a Remote Authentication Dial-In User Service (RADIUS) server.

In addition to the Remote Access Management console, you can use Windows PowerShell command-line interface tools and automated scripts for remote access setup, configuration, management, monitoring, and troubleshooting.

On client computers, users can access the Network Connectivity Assistant application, integrated with Windows Network Connection Manager, to see a concise view of the DirectAccess connection status and links to corporate help resources, diagnostics tools, and troubleshooting information. Users can also enter one-time password (OTP) credentials if OTP authentication for DirectAccess is configured.

Easier deployment

The enhanced installation and configuration design in Windows Server 2012 allows you to set up a working deployment without changing your internal networking infrastructure. In simple deployments, you can configure DirectAccess without setting up a certificate infrastructure. DirectAccess clients can now authenticate themselves by using only Active Directory credentials; no computer certificate is required. In addition, you can choose to use a self-signed certificate created automatically by DirectAccess for IP-HTTPS and for authentication of the network location server.

To further simplify deployment, DirectAccess in Windows Server 2012 supports access to internal servers that are running IPv4 only. An IPv6 infrastructure is not required for DirectAccess deployment.

Improved deployment scenarios

The remote access server role in Windows Server 2012 includes additional enhancements, including integrated deployment for several scenarios.

• With Windows Server 2012, you can now configure a DirectAccess server with two network adapters at the network edge or behind an edge device, or with a single network adapter running behind a firewall or network address translation device. The ability to use a single adapter removes the requirement to have dedicated public IPv4 addresses for DirectAccess deployment. With this configuration, clients connect to the DirectAccess server by using IP-HTTPS.

• In Windows Server 2012, you can configure remote access servers in a multisite deployment that allows users in dispersed geographical locations to connect to the multisite entry point closest to them. You can distribute and balance traffic across the multisite deployment by using an external global load balancer. To support fault tolerance, redundancy, and scalability, DirectAccess servers can now be deployed in a cluster configuration that uses Windows NLB or an external hardware load balancer.

• DirectAccess in Windows Server 2012 adds support for two-factor authentication that uses a one-time password (OTP). For two-factor smart card authentication, Windows Server 2012 supports the use of Trusted Platform Module (TPM)-based virtual smart card capabilities that are available in Windows 8. The TPM of client computers can act as a virtual smart card for two-factor authentication, which reduces the overhead and costs incurred in smart card deployment.

• Windows Server 2012 also introduces the ability of computers to join an Active Directory domain and receive domain settings remotely via the Internet. By using this capability, you will find that deployment of new computers in remote offices and provisioning of client settings to DirectAccess clients is easier. You can configure client computers running Windows 8, Windows 7, and Windows Server 2008 R2 as DirectAccess clients. Clients running Windows 8 have access to all DirectAccess features, and they have an improved experience when connecting from behind a proxy server that requires authentication. Clients not running Windows 8 have the following limitations:

o They must download and install the DirectAccess Connectivity Assistant tool.

o They require a computer certificate for authentication.

o In a multisite deployment, they must be configured to always connect through the same entry point.

Scalability improvements

Remote access offers several scalability improvements, including support for more users with better performance and lower costs:


• You can cluster multiple remote access servers for load balancing, continuous availability, and failover. Cluster traffic can be load-balanced by using NLB or a third-party load balancer. Servers can be added to or removed from the cluster with few interruptions to the connections in progress.

• The remote access server role takes advantage of Single Root I/O Virtualization (SR-IOV) for improved I/O performance when running on a virtual machine. In addition, remote access improves the overall scalability of the server host with support for IPsec hardware offload capabilities, available on many server interface cards that perform packet encryption and decryption in hardware.

• IP-HTTPS is optimized to rely on the encryption that IPsec already provides. This optimization, combined with the removal of the redundant Secure Sockets Layer (SSL) encryption, increases scalability and performance.

With the new DirectAccess and remote access enhancements, you can easily provide more secure remote access connections for your users, as well as log reports for monitoring and troubleshooting those connections. The new features in Windows Server 2012 support deployments in dispersed geographical locations, improved scalability with continuous availability, and improved performance in virtualized environments.

Summary

Identity and access control are two areas that require critical attention from IT professionals, particularly when you move to virtualized and private or public cloud environments. Windows Server 2012 makes these tasks easier by offering simple but powerful new and enhanced features to provide intelligent, auditable security; easier deployment and management of Active Directory Domain Services; and secure, always-on connectivity to the corporate resources.

For more information, see the Windows Server 2012 website at: http://www.microsoft.com/windowsserver2012/.


List of charts, tables, and figures

Figure 1: CAP components  10
Figure 2: CAP structure  10
Figure 3: Combining multiple policies into policy lists and applying them to resources  11
Figure 4: Access-denied remediation  12
Figure 5: Central auditing workflow  14
Figure 6: Classification-based RMS protection  15
Figure 7: Windows Server 2012 Windows PowerShell History viewer  18
Figure 8: DirectAccess connection architecture  22
Figure 9: VPN connection architecture  22
Figure 10: Example of a cross-premises deployment  23

Microsoft Press eBook Introducing Microsoft SQL Server 2012

 

Contents

Introduction  xv

PART 1  DATABASE ADMINISTRATION

Chapter 1  SQL Server 2012 Editions and Engine Enhancements  3
  SQL Server 2012 Enhancements for Database Administrators  4
  Availability Enhancements  4
  Scalability and Performance Enhancements  6
  Manageability Enhancements  7
  Security Enhancements  10
  Programmability Enhancements  11
  SQL Server 2012 Editions  12
  Enterprise Edition  12
  Standard Edition  13
  Business Intelligence Edition  14
  Specialized Editions  15
  SQL Server 2012 Licensing Overview  15
  Hardware and Software Requirements  16
  Installation, Upgrade, and Migration Strategies  17
  The In-Place Upgrade  17
  Side-by-Side Migration  19

 


 

Chapter 2  High-Availability and Disaster-Recovery Enhancements  21
  SQL Server AlwaysOn: A Flexible and Integrated Solution  21
  AlwaysOn Availability Groups  23
  Understanding Concepts and Terminology  24
  Configuring Availability Groups  29
  Monitoring Availability Groups with the Dashboard  31
  Active Secondaries  32
  Read-Only Access to Secondary Replicas  33
  Backups on Secondary  33
  AlwaysOn Failover Cluster Instances  34
  Support for Deploying SQL Server 2012 on Windows Server Core  36
  SQL Server 2012 Prerequisites for Server Core  37
  SQL Server Features Supported on Server Core  38
  SQL Server on Server Core Installation Alternatives  38
  Additional High-Availability and Disaster-Recovery Enhancements  39
  Support for Server Message Block  39
  Database Recovery Advisor  39
  Online Operations  40
  Rolling Upgrade and Patch Management  40

Chapter 3  Performance and Scalability  41
  Columnstore Index Overview  41
  Columnstore Index Fundamentals and Architecture  42
  How Is Data Stored When Using a Columnstore Index?  42
  How Do Columnstore Indexes Significantly Improve the Speed of Queries?  44
  Columnstore Index Storage Organization  45
  Columnstore Index Support and SQL Server 2012  46
  Columnstore Index Restrictions  46
  Columnstore Index Design Considerations and Loading Data  47
  When to Build a Columnstore Index  47
  When Not to Build a Columnstore Index  48
  Loading New Data  48

  Beyond-Relational Example  76
  FILESTREAM Enhancements  76
  FileTable  77
  FileTable Prerequisites  78
  Creating a FileTable  80
  Managing FileTable  81
  Full-Text Search  81
  Statistical Semantic Search  82
  Configuring Semantic Search  83
  Semantic Search Examples  85
  Spatial Enhancements  86
  Spatial Data Scenarios  86
  Spatial Data Features Supported in SQL Server  86
  Spatial Type Improvements  87
  Additional Spatial Improvements  89
  Extended Events  90

PART 2  BUSINESS INTELLIGENCE DEVELOPMENT

Chapter 6  Integration Services  93
  Developer Experience  93
  Add New Project Dialog Box  93
  General Interface Changes  95
  Getting Started Window  96
  SSIS Toolbox  97
  Shared Connection Managers  98
  Scripting Engine  99
  Expression Indicators  100
  Undo and Redo  100
  Package Sort By Name  100
  Status Indicators  101
  Control Flow  101
  Expression Task  101
  Execute Package Task  102
  Data Flow  103
  Sources and Destinations  103
  Transformations  106
  Column References  108
  Collapsible Grouping  109
  Data Viewer  110
  Change Data Capture Support  111
  CDC Control Flow  112
  CDC Data Flow  113
  Flexible Package Design  114
  Variables  115
  Expressions  115
  Deployment Models  116
  Supported Deployment Models  116
  Project Deployment Model Features  118
  Project Deployment Workflow  119
  Parameters  122
  Project Parameters  123
  Package Parameters  124
  Parameter Usage  124
  Post-Deployment Parameter Values  125
  Integration Services Catalog  128
  Catalog Creation  128
  Catalog Properties  129
  Environment Objects  132
  Administration  135
  Validation  135
  Package Execution  135
  Logging and Troubleshooting Tools  137
  Security  139
  Package File Format  139

  Miscellaneous Changes  197
  SharePoint Integration  197
  Metadata  197
  Bulk Updates and Export  197
  Transactions  198
  Windows PowerShell  198

Chapter 9  Analysis Services and PowerPivot  199
  Analysis Services  199
  Server Modes  199
  Analysis Services Projects  201
  Tabular Modeling  203
  Multidimensional Model Storage  215
  Server Management  215
  Programmability  217
  PowerPivot for Excel  218
  Installation and Upgrade  218
  Usability  218
  Model Enhancements  221
  DAX  222
  PowerPivot for SharePoint  224
  Installation and Configuration  224
  Management  224

Chapter 10  Reporting Services  229
  New Renderers  229
  Excel 2010 Renderer  229
  Word 2010 Renderer  230
  SharePoint Shared Service Architecture  230
  Feature Support by SharePoint Edition  230
  Shared Service Architecture Benefits  231
  Service Application Configuration  231
  Power View  232
  Data Sources  233
  Power View Design Environment  234
  Data Visualization  237
  Sort Order  241
  Multiple Views  241
  Highlighted Values  242
  Filters  243
  Display Modes  245
  PowerPoint Export  246
  Data Alerts  246
  Data Alert Designer  246
  Alerting Service  248
  Data Alert Manager  249
  Alerting Configuration  250

Index  251

 


 

 

 

CHAPTER 1

SQL Server 2012 Editions and Engine Enhancements

 

SQL Server 2012 is Microsoft’s latest cloud-ready information platform. Organizations can use SQL Server 2012 to efficiently protect, unlock, and scale the power of their data across the desktop, mobile device, datacenter, and either a private or public cloud. Building on the success of the SQL Server 2008 R2 release, SQL Server 2012 has made a strong impact on organizations worldwide with its significant capabilities. It provides organizations with mission-critical performance and availability, as well as the potential to unlock breakthrough insights with pervasive data discovery across the organization. Finally, SQL Server 2012 delivers a variety of hybrid solutions you can choose from. For example, an organization can develop and deploy applications and database solutions on traditional nonvirtualized environments, on appliances, and in on-premises private clouds or off-premises public clouds. Moreover, these solutions can easily integrate with one another, offering a fully integrated hybrid solution. Figure 1-1 illustrates the Cloud Ready Information Platform ecosystem.

 

FIGURE 1-1 SQL Server 2012, cloud-ready information platform

 

To prepare readers for SQL Server 2012, this chapter examines the new SQL Server 2012 features, capabilities, and editions from a database administrator’s perspective. It also discusses SQL Server 2012 hardware and software requirements and installation strategies.

 

 

 


 

SQL Server 2012 Enhancements for Database Administrators

 

Now more than ever, organizations require a trusted, cost-effective, and scalable database platform that offers mission-critical confidence, breakthrough insights, and flexible cloud-based offerings. These organizations face ever-changing business conditions in the global economy and challenges such as IT budget constraints, the need to stay competitive by obtaining business insights, and the ability to use the right information at the right time. In addition, organizations must always be adjusting because new and important trends are regularly changing the way software is developed and deployed. Some of these new trends include data explosion (enormous increases in data usage), consumerization of IT, big data (large data sets), and private and public cloud deployments.

Microsoft has made major investments in the SQL Server 2012 product as a whole; however, the new features and breakthrough capabilities that should interest database administrators (DBAs) are divided in the chapter into the following categories: Availability, Manageability, Programmability, Scalability and Performance, and Security. The upcoming sections introduce some of the new features and capabilities; however, other chapters in this book conduct a deeper explanation of the major technology investments.

 

Availability Enhancements

 

A tremendous number of high-availability enhancements were added to SQL Server 2012, which are sure to increase both the confidence organizations have in their databases and the maximum uptime for those databases. SQL Server 2012 continues to deliver database mirroring, log shipping, and replication. However, it now also offers a new brand of technologies for achieving both high availability and disaster recovery known as AlwaysOn. Let’s quickly review the new high-availability enhancement AlwaysOn:

 

• AlwaysOn Availability Groups: For DBAs, AlwaysOn Availability Groups is probably the most highly anticipated feature related to the Database Engine. This new capability protects databases and allows for multiple databases to fail over as a single unit. Better data redundancy and protection is achieved because the solution supports up to four secondary replicas. Of these four secondary replicas, up to two secondaries can be configured as synchronous secondaries to ensure the copies are up to date. The secondary replicas can reside within a datacenter for achieving high availability within a site or across datacenters for disaster recovery. In addition, AlwaysOn Availability Groups provide a higher return on investment because hardware utilization is increased as the secondaries are active, readable, and can be leveraged to offload backups, reporting, and ad hoc queries from the primary replica. The solution is tightly integrated into SQL Server Management Studio, is straightforward to deploy, and supports either shared storage or local storage.

 

 

Figure 1-2 illustrates an organization with a global presence achieving both high availability and disaster recovery for mission-critical databases using AlwaysOn Availability Groups. In addition, the secondary replicas are being used to offload reporting and backups.

 

 

 


 

FIGURE 1-2 AlwaysOn Availability Groups for an organization with a global presence

 

• AlwaysOn Failover Cluster Instances (FCI): AlwaysOn Failover Cluster Instances provides superior instance-level protection using Windows Server Failover Clustering and shared storage. However, with SQL Server 2012 there are a tremendous number of enhancements to improve availability and reliability. First, FCI now provides support for multi-subnet failover clusters. These subnets, where the FCI nodes reside, can be located in the same datacenter or in geographically dispersed sites. Second, local storage can be leveraged for the TempDB database. Third, faster startup and recovery times are achieved after a failover transpires. Finally, improved cluster health-detection policies can be leveraged, offering a stronger and more flexible failover.

• Support for Windows Server Core: Installing SQL Server 2012 on Windows Server Core is now supported. Windows Server Core is a scaled-down edition of the Windows operating system and requires approximately 50 to 60 percent fewer reboots when patching servers. This translates to greater SQL Server uptime and increased security. Server Core deployment options using Windows Server 2008 R2 SP1 and higher are required. Chapter 2, "High-Availability and Disaster-Recovery Enhancements," discusses deploying SQL Server 2012 on Server Core, including the features supported.

• Recovery Advisor: A new visual timeline has been introduced in SQL Server Management Studio to simplify the database restore process. As illustrated in Figure 1-3, the scroll bar beneath the timeline can be used to specify backups to restore a database to a point in time.

 

 

 

 

FIGURE 1-3 Recovery Advisor visual timeline

 

Note For detailed information about the AlwaysOn technologies and other high-availability enhancements, be sure to read Chapter 2.

 

Scalability and Performance Enhancements

 

The SQL Server product group has made sizable investments in improving scalability and performance associated with the SQL Server Database Engine. Some of the main enhancements that allow organizations to improve their SQL Server workloads include the following (a brief Transact-SQL sketch follows the list):

 

• Columnstore Indexes: More and more organizations have a requirement to deliver breakthrough and predictable performance on large data sets to stay competitive. SQL Server 2012 introduces a new in-memory columnstore index built directly in the relational engine. Together with advanced query-processing enhancements, these technologies provide blazing-fast performance and improve queries associated with data warehouse workloads by 10 to 100 times. In some cases, customers have experienced a 400 percent improvement in performance. For more information on this new capability for data warehouse workloads, review Chapter 3, "Blazing-Fast Query Performance with Columnstore Indexes."

 

 

 

 

• Partition Support Increased: To dramatically boost scalability and performance associated with large tables and data warehouses, SQL Server 2012 now supports up to 15,000 partitions per table by default. This is a significant increase from the previous version of SQL Server, which was limited to 1,000 partitions by default. This new expanded support also helps enable large sliding-window scenarios for data warehouse maintenance.

• Online Index Create, Rebuild, and Drop: Many organizations running mission-critical workloads use online indexing to ensure their business environment does not experience downtime during routine index maintenance. With SQL Server 2012, indexes containing varchar(max), nvarchar(max), and varbinary(max) columns can now be created, rebuilt, and dropped as an online operation. This is vital for organizations that require maximum uptime and concurrent user activity during index operations.

• Achieve Maximum Scalability with Windows Server 2008 R2: Windows Server 2008 R2 is built to achieve unprecedented workload size, dynamic scalability, and across-the-board availability and reliability. As a result, SQL Server 2012 can achieve maximum scalability when running on Windows Server 2008 R2 because it supports up to 256 logical processors and 2 terabytes of memory in a single operating system instance.
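
The following Transact-SQL sketch illustrates two of the enhancements above: creating a nonclustered columnstore index on a fact table, and rebuilding an index that contains an nvarchar(max) column as an online operation. The table, column, and index names are hypothetical and are shown only for illustration.

-- Create a nonclustered columnstore index on a hypothetical fact table
-- to accelerate data warehouse queries.
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactSales_ColumnStore
ON dbo.FactSales (OrderDateKey, ProductKey, SalesAmount, OrderQuantity);
GO

-- Rebuild an index that includes an nvarchar(max) column as an online
-- operation, which previous versions of SQL Server did not allow.
ALTER INDEX IX_FactSales_Notes ON dbo.FactSales
REBUILD WITH (ONLINE = ON);
GO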

 

 

Manageability Enhancements

 

SQL Server deployments are growing more numerous and more common in organizations. This fact demands that all database administrators be prepared by having the appropriate tools to successfully manage their SQL Server infrastructure. Recall that the previous releases of SQL Server included many new features tailored toward manageability. For example, database administrators could easily leverage Policy Based Management, Resource Governor, Data Collector, Data-tier applications, and Utility Control Point. Note that the product group responsible for manageability never stopped investing in manageability. With SQL Server 2012, they unveiled additional investments in SQL Server tools and monitoring features. The following list articulates the manageability enhancements in SQL Server 2012:

 

• SQL Server Management Studio: With SQL Server 2012, IntelliSense and Transact-SQL debugging have been enhanced to bolster the development experience in SQL Server Management Studio.

• IntelliSense Enhancements: A completion list will now suggest string matches based on partial words, whereas in the past it typically made recommendations based on the first character.

• A new Insert Snippet menu: This new feature is illustrated in Figure 1-4. It offers developers a categorized list of snippets to choose from to streamline code. The snippet picker tooltip can be launched by pressing CTRL+K, CTRL+X, or by selecting it from the Edit menu.

• Transact-SQL Debugger: This feature introduces the potential to debug Transact-SQL scripts on instances of SQL Server 2005 Service Pack 2 (SP2) or later and enhances breakpoint functionality.

 

 

 

 

 

 

FIGURE 1-4 Leveraging the Transact-SQL code snippet template as a starting point when writing new Transact-SQL statements in the SQL Server Database Engine Query Editor

 

• Resource Governor Enhancements: Many organizations currently leverage Resource Governor to gain predictable performance and improve their management of SQL Server workloads and resources by implementing limits on resource consumption based on incoming requests. In the past few years, customers have also been requesting additional improvements to the Resource Governor feature. Customers wanted to increase the maximum number of resource pools and support large-scale, multitenant database solutions with a higher level of isolation between workloads. They also wanted predictable chargeback and vertical isolation of machine resources.

 

 

The SQL Server product group responsible for the Resource Governor feature introduced new capabilities to address the requests of its customers and the SQL Server community. To begin, support for larger-scale multitenancy can now be achieved on a single instance of SQL Server because the number of resource pools Resource Governor supports increased from 20 to 64. In addition, a maximum cap for CPU usage has been introduced to enable predictable chargeback and isolation on the CPU. Finally, resource pools can be affinitized to an individual scheduler or a group of schedulers for vertical isolation of machine resources.

A new Dynamic Management View (DMV) called sys.dm_resource_governor_resource_pool_affinity improves database administrators’ success in tracking resource pool affinity; a sample query follows the example below.

 

Let’s review an example of some of the new Resource Governor features in action. In the following example, resource pool Pool25 is altered to be affinitized to six schedulers (8, 12, 13, 14, 15, and 16), and it’s guaranteed a minimum 5 percent of the CPU capacity of those schedulers. It can receive no more than 80 percent of the capacity of those schedulers. When there is contention for CPU bandwidth, the maximum average CPU bandwidth that will be allocated is 40 percent.

 

 

 

ALTER RESOURCE POOL Pool25
WITH (
    MIN_CPU_PERCENT = 5,
    MAX_CPU_PERCENT = 40,
    CAP_CPU_PERCENT = 80,
    AFFINITY SCHEDULER = (8, 12 TO 16),
    MIN_MEMORY_PERCENT = 5,
    MAX_MEMORY_PERCENT = 15
);
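
To confirm how a pool has been affinitized, the new DMV mentioned earlier can be queried directly. The following query is a minimal sketch; the exact columns returned depend on the instance.

-- Inspect scheduler affinity for the resource pools on this instance.
SELECT * FROM sys.dm_resource_governor_resource_pool_affinity;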

 

• Contained Databases: Authentication associated with database portability was a challenge in the previous versions of SQL Server. This was the result of users in a database being associated with logins on the source instance of SQL Server. If the database ever moved to another instance of SQL Server, the risk was that the login might not exist. With the introduction of contained databases in SQL Server 2012, users are authenticated directly into a user database without the dependency of logins in the Database Engine. This feature facilitates better portability of user databases among servers because contained databases have no external dependencies. (A minimal Transact-SQL sketch appears after this list.)

• Tight Integration with SQL Azure: A new Deploy Database To SQL Azure wizard, pictured in Figure 1-5, is integrated in the SQL Server Database Engine to help organizations deploy an on-premises database to SQL Azure. Furthermore, new scenarios can be enabled with SQL Azure Data Sync, which is a cloud service that provides bidirectional data synchronization between databases across the datacenter and cloud.

 

 

 

 

FIGURE 1-5 Deploying a database to SQL Azure with the Deploy Database Wizard

 

 

 

• Startup Options Relocated: Within SQL Server Configuration Manager, a new Startup Parameters tab was introduced for better manageability of the parameters required for startup. A DBA can now easily specify startup parameters, which at times was a tedious task in previous versions of SQL Server. The Startup Parameters tab can be invoked by right-clicking a SQL Server instance name in SQL Server Configuration Manager and then selecting Properties.

• Data-Tier Application (DAC) Enhancements: SQL Server 2008 R2 introduced the concept of data-tier applications. A data-tier application is a single unit of deployment containing all of the database’s schema, dependent objects, and deployment requirements used by an application. SQL Server 2012 introduces a few enhancements to DAC. With the new SQL Server, DAC upgrades are performed in an in-place fashion, compared to the side-by-side upgrade process we’ve all grown accustomed to over the years. Moreover, DACs can be deployed, imported, and exported more easily across premises and public cloud environments, such as SQL Azure. Finally, data-tier applications now support many more objects compared to the previous SQL Server release.
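
As a minimal Transact-SQL sketch of the contained database capability described in this list, the following statements enable contained database authentication, create a partially contained database, and create a database user whose credentials live in the database itself rather than mapping to a server login. The database name, user name, and password are hypothetical.

-- Allow contained databases on this instance (requires appropriate permissions).
EXEC sp_configure 'contained database authentication', 1;
RECONFIGURE;
GO

-- Create a partially contained database (hypothetical name).
CREATE DATABASE SalesContained
CONTAINMENT = PARTIAL;
GO

-- Create a user that is authenticated by the database, not by the server.
USE SalesContained;
GO
CREATE USER SalesAppUser WITH PASSWORD = 'Str0ng!Passw0rd#2012';
GO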

 

 

Security Enhancements

 

It has been approximately 10 years since Microsoft initiated its Trustworthy Computing initiative. Since then, SQL Server has had the best track record, with the fewest vulnerabilities and exposures among the major database players in the industry. The graph shown in Figure 1-6 is from the National Institute of Standards and Technology (Source: ITIC 2011: SQL Server Delivers Industry-Leading Security). It shows common vulnerabilities and exposures reported from January 2002 to June 2010.

 

FIGURE 1-6 Common vulnerabilities and exposures reported to NIST from January 2002 to January 2010

 

With SQL Server 2012, the product continues to expand on this solid foundation to deliver enhanced security and compliance within the database platform. For detailed information on all the security enhancements associated with the Database Engine, review Chapter 4, "Security Enhancements." For now, here is a snapshot of some of the new enterprise-ready security capabilities and controls that enable organizations to meet strict compliance policies and regulations:

 

• User-defined server roles for easier separation of duties (see the brief Transact-SQL sketch after this list)

• Audit enhancements to improve compliance and resiliency

• Simplified security management, with a default schema for groups

• Contained Database Authentication, which provides database authentication that uses self-contained access information without the need for server logins

• SharePoint and Active Directory security models for higher data security in end-user reports
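
The following brief Transact-SQL sketch illustrates the user-defined server role capability from this list; the role and login names are hypothetical.

-- Create a user-defined server role and grant it a server-level permission.
CREATE SERVER ROLE MonitoringRole;
GRANT VIEW SERVER STATE TO MonitoringRole;
GO

-- Add an existing login to the new role (hypothetical login name).
ALTER SERVER ROLE MonitoringRole ADD MEMBER [CONTOSO\OpsTeamLogin];
GO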

 

 

Programmability Enhancements

 

There has also been a tremendous investment in SQL Server 2012 regarding programmability. Specifically, there is support for "beyond relational" elements such as XML, Spatial, Documents, Digital Media, Scientific Records, factoids, and other unstructured data types. Why such investments?

Organizations have demanded they be given a way to reduce the costs associated with managing both structured and unstructured data. They wanted to simplify the development of applications over all data, and they wanted the management and search capabilities for all data improved. Take a minute to review some of the SQL Server 2012 investments that positively impact programmability. For more information associated with programmability and beyond-relational elements, please review Chapter 5, "Programmability and Beyond-Relational Enhancements."

 

• FileTable: Applications typically store data within a relational database engine; however, a myriad of applications also maintain the data in unstructured formats, such as documents, media files, and XML. Unstructured data usually resides on a file server and not directly in a relational database such as SQL Server. As you can imagine, it becomes challenging for organizations to not only manage their structured and unstructured data across these disparate systems, but to also keep them in sync. FileTable, a new capability in SQL Server 2012, addresses these challenges. It builds on FILESTREAM technology that was first introduced with SQL Server 2008. FileTable offers organizations Windows file namespace support and application compatibility with the file data stored in SQL Server. As an added bonus, when applications are allowed to integrate storage and data management within SQL Server, full-text and semantic search is achievable over unstructured and structured data.

• Statistical Semantic Search: By introducing new semantic search functionality, SQL Server 2012 allows organizations to achieve deeper insight into unstructured data stored within the Database Engine. Three new Transact-SQL rowset functions were introduced to query not only the words in a document, but also the meaning of the document.

• Full-Text Search Enhancements: Full-text search in SQL Server 2012 offers better query performance and scale. It also introduces property-scoped searching functionality, which allows organizations the ability to search properties such as Author and Title without the need

 

 

 

 

Specialized Editions

 

Above and beyond the three main editions discussed earlier, SQL Server 2012 continues to deliver specialized editions for organizations that have a unique set of requirements. Some examples include the following:

• Developer: The Developer edition includes all of the features and functionality found in the Enterprise edition; however, it is meant strictly for the purpose of development, testing, and demonstration. Note that you can transition a SQL Server Developer installation directly into production by upgrading it to SQL Server 2012 Enterprise without reinstallation.

• Web: Available at a much more affordable price than the Enterprise and Standard editions, SQL Server 2012 Web is focused on service providers hosting Internet-facing web services environments. Unlike the Express edition, this edition doesn’t have database size restrictions, it supports four processors, and supports up to 64 GB of memory. SQL Server 2012 Web does not offer the same premium features found in Enterprise and Standard editions, but it still remains the ideal platform for hosting websites and web applications.

• Express: This free edition is the best entry-level alternative for independent software vendors, nonprofessional developers, and hobbyists building client applications. Individuals learning about databases or learning how to build client applications will find that this edition meets all their needs. This edition, in a nutshell, is limited to one processor and 1 GB of memory, and it can have a maximum database size of 10 GB. Also, Express is integrated with Microsoft Visual Studio.

 

 

Note Review "Features Supported by the Editions of SQL Server 2012" at http://msdn.microsoft.com/en-us/library/cc645993(v=sql.110).aspx and http://www.microsoft.com/sqlserver/en/us/future-editions/sql2012-editions.aspx for a complete comparison of the key capabilities of the different editions of SQL Server 2012.

 

SQL Server 2012 Licensing Overview

 

The licensing models affiliated with SQL Server 2012 have been both simplified to better align to customer solutions and optimized for virtualization and cloud deployments. Organizations should become familiar with the information that follows. With SQL Server 2012, the licensing for computing power is core-based, and the Business Intelligence and Standard editions are available under the Server + Client Access License (CAL) model. In addition, organizations can save on cloud-based computing costs by licensing individual database virtual machines. Because each customer environment is unique, we will not have the opportunity to provide an overview of how the license changes affect your environment. For more information on the licensing changes and how they impact your organization, please contact your Microsoft representative or partner.

 

 

 

Software Component: Requirements

• Windows PowerShell: Windows PowerShell 2.0

• SQL Server support tools and software: SQL Server 2012 – SQL Server Native Client; SQL Server 2012 – SQL Server Setup Support Files; Minimum: Windows Installer 4.5

• Internet Explorer: Minimum: Windows Internet Explorer 7 or later version

• Virtualization: Windows Server 2008 SP2 running Hyper-V role, or Windows Server 2008 R2 SP1 running Hyper-V role

Note The server hardware has supported both 32-bit and 64-bit processors for several years; however, Windows Server 2008 R2 is 64-bit only. Take this into serious consideration when planning SQL Server 2012 deployments.

 

Installation, Upgrade, and Migration Strategies

 

Like its predecessors, SQL Server 2012 is available in both 32-bit and 64-bit editions. Both can be installed either with the SQL Server Installation Wizard, through a command prompt, or with Sysprep for automated deployments with minimal administrator intervention. As mentioned earlier in the chapter, SQL Server 2012 can now be installed on Server Core, which is an installation option of Windows Server 2008 R2 SP1 or later. Finally, database administrators also have the option to upgrade an existing installation of SQL Server or conduct a side-by-side migration when installing SQL Server 2012. The following sections elaborate on the different strategies.

 

The In-Place Upgrade

 

An in-place upgrade is the upgrade of an existing SQL Server installation to SQL Server 2012. When an in-place upgrade is conducted, the SQL Server 2012 setup program replaces the previous SQL Server binaries with the new SQL Server 2012 binaries on the existing machine. SQL Server data is automatically converted from the previous version to SQL Server 2012. This means data does not have to be copied or migrated. In the example in Figure 1-8, a database administrator is conducting an in-place upgrade on a SQL Server 2008 instance running on Server 1. When the upgrade is complete, Server 1 still exists, but the SQL Server 2008 instance and all of its data is upgraded to SQL Server 2012.

 

Note SQL Server 2005 with SP4, SQL Server 2008 with SP2, and SQL Server 2008 R2 with SP1 are all supported for an in-place upgrade to SQL Server 2012. Unfortunately, earlier versions such as SQL Server 2000, SQL Server 7.0, and SQL Server 6.5 cannot be upgraded to SQL Server 2012.

 

 

 

Side-by-Side Migration Pros and Cons

 

The greatest benefit of a side-by-side migration over an in-place upgrade is the opportunity to build out a new database infrastructure on SQL Server 2012 and avoid potential migration issues that can occur with an in-place upgrade. The side-by-side migration also provides more granular control over the upgrade process because you can migrate databases and components independent of one another. In addition, the legacy instance remains online during the migration process. All of these advantages result in a more powerful server. Moreover, when two instances are running in parallel, additional testing and verification can be conducted. Performing a rollback is also easy if a problem arises during the migration.

However, there are disadvantages to the side-by-side strategy. Additional hardware might need to be purchased. Applications might also need to be directed to the new SQL Server 2012 instance, and it might not be a best practice for very large databases because of the duplicate amount of storage required during the migration process.

 

SQL Server 2012 High-Level, Side-by-Side Strategy

 

The high-level, side-by-side migration strategy for upgrading to SQL Server 2012 consists of the following steps:

1. Ensure the instance of SQL Server you plan to migrate meets the hardware and software requirements for SQL Server 2012.

2. Review the deprecated and discontinued features in SQL Server 2012 by referring to "Deprecated Database Engine Features in SQL Server 2012" at http://technet.microsoft.com/en-us/library/ms143729(v=sql.110).aspx.

3. Although a legacy instance will not be upgraded to SQL Server 2012, it is still beneficial to run the SQL Server 2012 Upgrade Advisor to ensure the data being migrated to the new SQL Server 2012 is supported and there is no possibility of a break occurring after migration.

4. Procure the hardware, and install your operating system of choice. Windows Server 2012 is recommended.

5. Install the SQL Server 2012 prerequisites and desired components.

6. Migrate objects from the legacy SQL Server to the new SQL Server 2012 database platform.

7. Point applications to the new SQL Server 2012 database platform.

8. Decommission legacy servers after the migration is complete.

 

 

 

 

CHAPTER 2

High-Availability and Disaster-Recovery Enhancements

 

Microsoft SQL Server 2012 delivers significant enhancements to well-known, critical capabilities such as high availability (HA) and disaster recovery. These enhancements promise to assist organizations in achieving their highest level of confidence to date in their server environments. Server Core support, breakthrough features such as AlwaysOn Availability Groups and active secondaries, and key improvements to features such as failover clustering are improvements that provide organizations a range of accommodating options to achieve maximum application availability and data protection for SQL Server instances and databases within a datacenter and across datacenters.

This chapter’s goal is to bring readers up to date with the high-availability and disaster-recovery capabilities that are fully integrated into SQL Server 2012 as a result of Microsoft’s heavy investment in AlwaysOn.

 

SQL Server AlwaysOn: A Flexible and Integrated Solution

 

Every organization’s success and service reputation is built on ensuring that its data is always accessible and protected. In the IT world, this means delivering a product that achieves the highest level of availability and disaster recovery while minimizing data loss and downtime. With the previous versions of SQL Server, organizations achieved high availability and disaster recovery by using technologies such as failover clustering, database mirroring, log shipping, and peer-to-peer replication. Although organizations achieved great success with these solutions, they were tasked with combining these native SQL Server technologies to achieve their business requirements related to their Recovery Point Objective (RPO) and Recovery Time Objective (RTO).

Figure 2-1 illustrates a common high-availability and disaster-recovery strategy used by organizations with the previous versions of SQL Server. This strategy includes failover clustering to protect SQL Server instances within each datacenter, combined with asynchronous database mirroring to provide disaster-recovery capabilities for mission-critical databases.

 

 

 


 

FIGURE 2-1 Achieving high availability and disaster recovery by combining failover clustering with database mirroring in SQL Server 2008 R2

 

Likewise, for organizations that either required more than one secondary datacenter or that did not have shared storage, their high-availability and disaster-recovery deployment incorporated synchronous database mirroring with a witness within the primary datacenter combined with log shipping for moving data to multiple locations. This deployment strategy is illustrated in Figure 2-2.

 

FIGURE 2-2 Achieving high availability and disaster recovery with database mirroring combined with log shipping in SQL Server 2008 R2

 

 

 


 

Figures 2-1 and 2-2 both reveal successful solutions for achieving high availability and disaster

recovery. However, these solutions had some limitations, which warranted making changes. In addition,

with organizations constantly evolving, it was only a matter of time until they voiced their own

concerns and sent out a request for more options and changes.

 

One concern for many organizations was regarding database mirroring. Database mirroring is

a great way to protect databases; however, the solution is a one-to-one mapping, making multiple

secondaries unattainable. When confronted with this situation, many organizations reverted to log

shipping as a replacement for database mirroring because it supports multiple secondaries. Unfortunately,

organizations encountered limitations with log shipping because it did not provide zero data

loss or automatic failover capability. Organizations working with failover clustering also had concerns, because a shared-storage device such as a storage area network (SAN) could be a single point of failure. Similarly, many organizations thought that from a cost

perspective their investments were not being used to their fullest potential. For example, the passive

servers in many of these solutions were idle. Finally, many organizations wanted to offload reporting

and maintenance tasks from the primary database servers, which was not an easy task to achieve.

 

SQL Server has evolved to answer many of these concerns with an integrated solution called AlwaysOn. AlwaysOn Availability Groups and AlwaysOn Failover Cluster Instances are new features, introduced in SQL Server 2012, that are rich with options and promise customers the highest level of availability and disaster recovery. At a high level, AlwaysOn Availability Groups is

used for database protection and offers multidatabase failover, multiple secondaries, active secondaries,

and integrated HA management. On the other hand, AlwaysOn Failover Cluster Instances is a

feature tailored to instance-level protection, multisite clustering, and consolidation, while consistently

providing flexible failover polices and improved diagnostics.

 

AlwaysOn Availability Groups

 

AlwaysOn Availability Groups provides an enterprise-level alternative to database mirroring, and it

gives organizations the ability to automatically or manually fail over a group of databases as a single

unit, with support for up to four secondaries. The solution provides zero-data-loss protection and is

flexible. It can be deployed on local storage or shared storage, and it supports both synchronous and

asynchronous data movement. The application failover is very fast, it supports automatic page repair,

and the secondary replicas can be leveraged to offload reporting and a number of maintenance

tasks, such as backups.

 

Take a look at Figure 2-3, which illustrates an AlwaysOn Availability Groups deployment strategy

that includes one primary replica and three secondary replicas.

 

 

 


 

FIGURE 2-3 Achieving high availability and disaster recovery with AlwaysOn Availability Groups

 

In this figure, synchronous data movement is used to provide high availability within the primary

datacenter and asynchronous data movement is used to provide disaster recovery. Moreover, secondary

replica 3 and replica 4 are employed to offload reports and backups from the primary replica.

 

It is now time to take a deeper dive into AlwaysOn Availability Groups through a review of the new

concepts and terminology associated with this breakthrough capability.

 

Understanding Concepts and Terminology

 

Availability groups are built on top of Windows Failover Clustering and support both shared and

nonshared

storage. Depending on an organization’s RPO and RTO requirements, availability groups

can use either an asynchronous-commit availability mode or a synchronous-commit availability

mode to move data between primary and secondary replicas. Availability groups include built-in

compression

and encryption as well as support for file-stream replication and auto page repair.

Failover between replicas is either automatic or manual.

 

When deploying AlwaysOn Availability Groups, your first step is to deploy a Windows failover cluster. This is completed by using the Failover Cluster Manager snap-in within Windows Server 2008 R2. Once the Windows failover cluster is formed, the remainder of the availability group configuration is completed in SQL Server Management Studio. When you use the Availability Groups wizards to configure your availability groups, SQL Server Management Studio automatically creates the appropriate services, applications, and resources in Failover Cluster Manager; hence, the deployment is much easier for database administrators who are not familiar with failover clustering.

 

Now that the fundamentals of the AlwaysOn Availability Groups have been laid down, the most

natural question that follows is how an organization’s operations are enhanced with this feature.

Unlike

database mirroring, which supports only one secondary, AlwaysOn Availability Groups supports

one primary replica and up to four secondary replicas. Availability groups can also contain more

than one availability database. Equally appealing is the fact you can host more than one availability

group within an implementation. As a result, it is possible to group databases with application

dependencies

together within an availability group and have all the availability databases seamlessly

fail over as a single cohesive unit as depicted in Figure 2-4.

 

 

 

FIGURE 2-4 Dedicated availability groups for Finance and HR availability databases

 

In addition, as shown in Figure 2-4, there is one primary replica and two secondary replicas with

two availability groups. One of these availability groups is called Finance, and it includes all the

Finance databases; the other availability group is called HR, and it includes all the Human Resources

databases. The Finance availability group can fail over independently of the HR availability group,

and unlike database mirroring, all availability databases within an availability group fail over as a

single unit. Moreover, organizations can improve their IT efficiency, increase performance, and reduce

 

 

 

total cost of ownership with better resource utilization of secondary/passive hardware because these

secondary replicas can be leveraged for backups and read-only operations such as reporting and

maintenance. This is covered in the “Active Secondaries” section later in this chapter.

 

Now that you have been introduced to some of the benefits AlwaysOn Availability Groups offers

for an organization, let’s take the time to get a stronger understanding of the AlwaysOn Availability

Groups concepts and how this new capability operates. The concepts covered include the following:

 

■ Availability replica roles
■ Data synchronization modes
■ Failover modes
■ Connection mode in secondaries
■ Availability group listeners

 

 

Availability Replica Roles

 

Each AlwaysOn availability group comprises a set of two or more failover partners that are referred to

as availability replicas. The availability replicas can consist of either a primary role or a secondary role.

Note that there can be a maximum of four secondaries, and of these four secondaries only a maximum

of two secondaries can be configured to use the synchronous-commit availability mode.

 

The roles affiliated with the availability replicas in AlwaysOn Availability Groups follow the same

principles as the legendary Sith rule of two doctrines in the Star Wars saga. In Star Wars, there can be

only two Siths at one time, a master and an apprentice. Similarly, a SQL Server instance in an availability

group can be only a primary replica or a secondary replica. At no time can it be both because the

role swapping is controlled by Windows Server Failover Cluster (WSFC).

 

Each of the SQL Server instances in the availability group is hosted on either a SQL Server failover

cluster instance (FCI) or a stand-alone instance of SQL Server 2012. Each of these instances resides on

different nodes of a WSFC. WSFC is typically used for providing high availability and disaster recovery

for well-known Microsoft products. As such, availability groups use WSFC as the underlying mechanism

to provide internode health detection, failover coordination, primary health detection, and

distributed change notifications for the solution.

 

Each availability replica hosts a copy of the availability databases in the availability group. Because

there are multiple copies of the databases being hosted on each availability replica, there isn’t a

prerequisite for using shared storage like there was in the past when deploying traditional SQL Server

failover clusters. On the flip side, when using nonshared storage, an organization must realize that

storage requirements increase as the number of replicas it plans on hosting increases.

 

 

 

Data Synchronization Modes

 

To move data from the primary replica to the secondary replicas, an availability group uses either synchronous-commit availability mode or asynchronous-commit availability mode. Give consideration to the following items when selecting either option:

 

■ When you use the synchronous-commit mode, a transaction is committed on both replicas to guarantee transactional consistency. This, however, means increased latency. As such, this option might not be appropriate for partners who don’t share a high-speed network or who reside in different geographical locations.
■ The asynchronous-commit mode commits transactions between partners without waiting for the partner to write the log to disk. This maximizes performance between the application and the primary replica and is well suited for disaster-recovery solutions.
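
The availability mode of a replica is not fixed at creation time. As a minimal Transact-SQL sketch, assuming a hypothetical availability group named AGFinance and a replica hosted on SQL03\Instance01, the mode of an existing replica can be adjusted as follows:

-- Switch a replica to asynchronous-commit mode (hypothetical names)
ALTER AVAILABILITY GROUP [AGFinance]
MODIFY REPLICA ON N'SQL03\Instance01'
WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);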

 

 

Availability Groups Failover Modes

 

When configuring AlwaysOn availability groups, database administrators can choose from two failover modes when swapping roles from the primary to the secondary replicas. Administrators who are familiar with database mirroring will find these failover modes very similar to the modes in database mirroring. These are the two AlwaysOn failover modes available when using the New Availability Group Wizard:

 

■ Automatic Failover: This replica uses synchronous-commit availability mode, and it supports both automatic failover and manual failover between the replica partners. A maximum of two failover replica partners are supported when choosing automatic failover.
■ Manual Failover: This replica uses synchronous-commit or asynchronous-commit availability mode and supports only manual failovers between the replica partners.
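
A manual failover is initiated on the target secondary replica. As a hedged Transact-SQL sketch, assuming the hypothetical AGFinance group, the two variants look like this:

-- Planned manual failover to this synchronous secondary (no data loss)
ALTER AVAILABILITY GROUP [AGFinance] FAILOVER;

-- Forced failover to an asynchronous secondary (possible data loss)
ALTER AVAILABILITY GROUP [AGFinance] FORCE_FAILOVER_ALLOW_DATA_LOSS;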

 

 

Connection Mode in Secondaries

 

As indicated earlier, each of the secondaries can be configured to support read-only access for

reporting

or other maintenance tasks, such as backups. During the final configuration stage of

AlwaysOn

availability groups, database administrators decide on the connection mode for the

secondary

replicas. There are three connection modes available:

 

■ Disallow connections: In the secondary role, this availability replica does not allow any connections.
■ Allow only read-intent connections: In the secondary role, this availability replica allows only read-intent connections.
■ Allow all connections: In the secondary role, this availability replica allows all connections for read access, including connections running with older clients.
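
The same choice can be made outside the wizard. A minimal Transact-SQL sketch, again assuming the hypothetical AGFinance group and a replica on SQL02\Instance01, sets the secondary-role connection access to read-intent only:

-- Allow only read-intent connections when this replica is in the secondary role
ALTER AVAILABILITY GROUP [AGFinance]
MODIFY REPLICA ON N'SQL02\Instance01'
WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));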

 

 

 

 

Availability Group Listeners

 

The availability group listener provides a way of connecting to databases within an availability group

via a virtual network name that is bound to the primary replica. Applications can specify the network

name affiliated with the availability group listener in connection strings. After the availability group

fails over from the primary replica to a secondary replica, the network name directs connections to

the new primary replica. The availability group listener concept is similar to a Virtual SQL Server Name

when using failover clustering; however, with an availability group listener, there is a virtual network

name for each availability group, whereas with SQL Server failover clustering, there is one virtual

network name for the instance.

 

You can specify your availability group listener preferences when using the Create A New Availability Group Wizard in SQL Server Management Studio, or you can manually create or modify an availability group listener after the availability group is created. Alternatively, you can use Transact-SQL to create or modify the listener. Notice in Figure 2-5 that each availability group listener requires a DNS name, an IP address, and a port such as 1433. Once the availability group listener is created, a server name and an IP address cluster resource are automatically created within Failover Cluster Manager. This is a testimony to the availability group’s flexibility and tight integration with SQL Server, because the majority of the configuration is done within SQL Server.

 

 

 

FIGURE 2-5 Specifying availability group listener properties
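
For reference, a hedged Transact-SQL sketch for adding a listener to an existing availability group might look like the following; the group name, DNS name, IP address, and subnet mask are illustrative assumptions:

-- Add a listener with a static IP address and the default SQL Server port
ALTER AVAILABILITY GROUP [AGFinance]
ADD LISTENER N'AGFinListener'
(WITH IP ((N'192.168.115.50', N'255.255.255.0')), PORT = 1433);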

 

Be aware that there is a one-to-one mapping between availability group listeners and availability

groups. This means you can create one availability group listener for each availability group. However,

if more than one availability group exists within a replica, you can have more than one availability

group listener. For example, there are two availability groups shown in Figure 2-6: one is for

 

 

 

the Finance

availability databases, and the other is for the Accounting availability databases. Each

availability

group has its own availability group listener that clients and applications connect to.

 

 

 

FIGURE 2-6 Illustrating two availability group listeners within a replica

 

Configuring Availability Groups

 

When creating a new availability group, a database administrator needs to specify an availability group name, such as AvailabilityGroupFinance, and then select one or more databases to be part of the availability group. The next step involves first specifying one or more instances of SQL Server to host secondary availability replicas, and then specifying your availability group listener preference. The final step is selecting the data-synchronization preference and connection mode for the secondary replicas. These configurations are conducted with the New Availability Group Wizard, with Transact-SQL, or with PowerShell scripts.

 

Prerequisites

 

To deploy AlwaysOn Availability Groups, the following prerequisites must be met:

 

■ All computers running SQL Server, including the servers that will reside in the disaster-recovery site, must reside in the same Windows-based domain.
■ All SQL Server computers must participate in a single Windows Server failover cluster, even if the servers reside in multiple sites.
■ AlwaysOn Availability Groups must be enabled on each server.
■ All the databases must be in full recovery mode (see the sketch after this list).
■ A full backup must be conducted on all databases before deployment.
■ The server cannot host the Active Directory Domain Services role.
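
For the two database-level prerequisites, full recovery mode and an initial full backup, a minimal Transact-SQL sketch (assuming a hypothetical FinanceDB database and backup share) is:

-- Put the database in the full recovery model and take an initial full backup
ALTER DATABASE [FinanceDB] SET RECOVERY FULL;
BACKUP DATABASE [FinanceDB]
TO DISK = N'\\BackupServer\SQLBackups\FinanceDB.bak';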

 

 

Deployment Examples

 

Figure 2-7 shows the Specify Replicas page you see when using the New Availability Group Wizard.

In this example, there are three SQL Server instances in the availability group called Finance:

SQL01\Instance01, SQL02\Instance01, and SQL03\Instance01. SQL01\Instance01 is configured as the

Primary replica, whereas SQL02\Instance01 and SQL03\Instance01 are configured as secondaries.

SQL01\Instance01 and SQL02\Instance01 support automatic failover with synchronous data movement,

whereas SQL03\Instance01 uses asynchronous-commit availability mode and supports only

a forced failover. Finally, SQL01\Instance01 does not allow read-only connections to the secondary,

whereas SQL02\Instance01 and SQL03\Instance01 allow read-intent connections to the secondary. In

addition, for this example, SQL01\Instance01 and SQL02\Instance01 reside in a primary datacenter for

high availability within a site, and SQL03\Instance01 resides in the disaster recovery datacenter and

will be brought online manually in the event the primary datacenter becomes unavailable.
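
Although Figure 2-7 shows the wizard, the same deployment can be expressed in Transact-SQL. The following is a hedged sketch that approximates the configuration just described; the database name, endpoint URLs, and mirroring-endpoint port are illustrative assumptions:

-- Run on SQL01\Instance01 (the intended primary replica)
CREATE AVAILABILITY GROUP [Finance]
FOR DATABASE [FinanceDB]
REPLICA ON
    N'SQL01\Instance01' WITH (
        ENDPOINT_URL = N'TCP://SQL01.contoso.com:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = AUTOMATIC,
        SECONDARY_ROLE (ALLOW_CONNECTIONS = NO)),
    N'SQL02\Instance01' WITH (
        ENDPOINT_URL = N'TCP://SQL02.contoso.com:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = AUTOMATIC,
        SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY)),
    N'SQL03\Instance01' WITH (
        ENDPOINT_URL = N'TCP://SQL03.contoso.com:5022',
        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
        FAILOVER_MODE = MANUAL,
        SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));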

 

 

 

FIGURE 2-7 Specifying the SQL Server instances in the availability group

 

One thing becomes vividly clear from Figure 2-7 and the preceding example: there are many

different

deployment configurations available to satisfy any organization’s high-availability and

disaster-recovery requirements. See Figure 2-8 for the following additional deployment alternatives:

 

■ Nonshared storage, local, regional, and geo target
■ Multisite cluster with another cluster as the disaster recovery (DR) target
■ Three-node cluster with similar DR target
■ Secondary targets for backup, reporting, and DR

 

 


 

FIGURE 2-8 Additional AlwaysOn deployment alternatives

 

Monitoring Availability Groups with the Dashboard

 

Administrators can leverage a new and remarkably intuitive manageability dashboard in SQL Server 2012 to monitor availability groups. The dashboard, shown in Figure 2-9, reports the health and status associated with each instance and availability database in the availability group. Moreover, the dashboard displays the specific replica role of each instance and provides synchronization status. If there is an issue or if more information on a specific event is required, a database administrator can click the availability group state, server instance name, or health status hyperlinks for additional information. The dashboard is launched by right-clicking the Availability Groups folder in Object Explorer in SQL Server Management Studio and selecting Show Dashboard.
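
The same health information can also be queried directly from the dynamic management views. A minimal sketch, using the HADR DMVs introduced in SQL Server 2012, is:

-- Report role and synchronization health for each replica in each availability group
SELECT ag.name AS availability_group,
       ars.role_desc,
       ars.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_groups AS ag
    ON ag.group_id = ars.group_id;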

 

 

 

 

 

 

FIGURE 2-9 Monitoring availability groups with the new Availability Group dashboard

 

Active Secondaries

 

As indicated earlier, many organizations communicated to the SQL Server team their need to improve

IT efficiency by optimizing their existing hardware investments. Specifically, organizations hoped their

production systems for passive workloads could be used in some other capacity instead of remaining

in an idle state. These same organizations also wanted reporting and maintenance tasks offloaded

from production servers because these tasks negatively affected production workloads. With SQL

Server 2012, organizations can leverage the AlwaysOn Availability Group capability to configure

secondary replicas, also referred to as active secondaries, to provide read-only access to databases

affiliated with an availability group.

 

All read-only operations on the secondary replicas are supported by row versioning and are

automatically

mapped to snapshot isolation transaction level, which eliminates reader/writer

contention.

In addition, the data in the secondary replicas is near real time. In many circumstances,

data latency between the primary and secondary databases should be within seconds. Note that the

latency of log synchronization affects data freshness.

 

For organizations, active secondaries translate into better performance on the primary replica and an overall increase in IT efficiency and hardware utilization.

 

 

 

Read-Only Access to Secondary Replicas

 

Recall that when configuring the connection mode for secondary replicas, you can choose Disallow Connections, Allow Only Read-Intent Connections, or Allow All Connections. The Allow Only Read-Intent Connections and Allow All Connections options both provide read-only access to secondary replicas. The Disallow Connections alternative, as its name implies, does not allow read-only access.

 

Now let’s look at the major differences between Allow Only Read-Intent Connections and Allow All Connections. The Allow Only Read-Intent Connections option allows connections to the databases in the secondary replica only when the ApplicationIntent connection property is set to ReadOnly in the SQL Server Native Client. When the Allow All Connections setting is used, all client connections are allowed regardless of the ApplicationIntent property. What is the ApplicationIntent property in the connection string? It declares the application workload type when connecting to a server, and its possible values are ReadOnly and ReadWrite. In either case, commands that try to create or modify data on the secondary replica will fail.

 

Backups on Secondary

 

Backups of availability databases participating in availability groups can be conducted on any of the replicas. Although backups are still supported on the primary replica, log backups can be conducted on any of the secondaries. Note that this is independent of the availability mode being used, whether synchronous-commit or asynchronous-commit. Log backups completed on all replicas form a single log chain, as shown in Figure 2-10.

 


 

FIGURE 2-10 Forming a single log chain by backing up the transaction logs on multiple secondary replicas

 

As a result, the transaction log backups do not all have to be performed on the same replica. That does not mean the location of your backups deserves no thought. It is recommended that you store all backups in a central location, because all transaction log backups are required to perform a restore in the event of a disaster; if the server that held the backups is no longer available, recovery will be negatively affected. In the event of a failure, use the new Database Recovery Advisor wizard, which provides many benefits when conducting restores. For example, if you are performing backups on different secondaries, the wizard generates a visual image of a chronological timeline by stitching together all of the log files based on the Log Sequence Number (LSN).
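
A hedged sketch of the logic a backup maintenance job might run on each replica follows; the database name and backup share are assumptions, and sys.fn_hadr_backup_is_preferred_replica is the SQL Server 2012 function that reports whether the current replica is the preferred backup replica:

-- Back up the log only if this replica is currently the preferred backup replica
IF sys.fn_hadr_backup_is_preferred_replica(N'FinanceDB') = 1
BEGIN
    BACKUP LOG [FinanceDB]
    TO DISK = N'\\BackupServer\SQLBackups\FinanceDB.trn';
END;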

 

AlwaysOn Failover Cluster Instances

 

You’ve seen the results of the development efforts in engineering the new AlwaysOn Availability

Groups capability for high availability and disaster recovery, and the creation of active secondaries.

Now you’ll explore the significant enhancements to traditional capabilities such as SQL Server failover

clustering that leverage shared storage. The following list itemizes some of the improvements that

will appeal to database administrators looking to gain high availability for their SQL Server instances.

Specifically, this section discusses the following features:

 

■ Multisubnet Clustering: This feature provides a disaster-recovery solution in addition to high availability with new support for multisubnet failover clustering.
■ Support for TempDB on Local Disk: Another storage-level enhancement with failover clustering is associated with TempDB. TempDB no longer has to reside on shared storage as it did in previous versions of SQL Server. It is now supported on local disks, which results in many practical benefits for organizations. For example, you can now offload TempDB I/O from shared-storage devices such as a SAN and leverage fast, local solid-state drive (SSD) storage within the server nodes to optimize TempDB workloads, which are typically random I/O.
■ Flexible Failover Policy: SQL Server 2012 introduces improved failure detection for the SQL Server failover cluster instance by adding failure condition-level properties that allow you to configure a more flexible failover policy.

 

 

Note AlwaysOn failover cluster instances can be combined with availability groups to offer

maximum SQL Server instance and database protection.

 

With the release of Windows Server 2008, new functionality enabled cluster nodes to be connected over different subnets without the need for a stretch virtual local area network (VLAN) across networks. The nodes could reside on different subnets within a datacenter or in another geographical location, such as a disaster recovery site. This concept is commonly referred to as multisite clustering, multisubnet clustering, or stretch clustering. Unfortunately, previous versions of SQL Server could not take advantage of this Windows failover clustering feature. Organizations that wanted to create either a multisite or multisubnet SQL Server failover cluster still had to create a stretch VLAN to expose a single IP address for failover across sites, which was a complex and challenging task for many organizations. This is no longer the case: SQL Server 2012 supports multisubnet and multisite clustering out of the box, so the need for implementing stretch VLAN technology no longer exists.

 

Figure 2-11 illustrates an example of a SQL Server multisubnet failover cluster between two

subnets

spanning two sites. Notice how each node affiliated with the multisubnet failover cluster

resides on a different subnet. Node 1 is located in Site 1 and resides on the 192.168.115.0/24 subnet,

whereas Node 2 is located in Site 2 and resides on the 192.168.116.0/24 subnet. The DNS IP address

associated with the virtual network name of the SQL Server cluster is automatically updated when a

failover from one subnet to another subnet occurs.

 


 

FIGURE 2-11 A multisubnet failover cluster instance example

 

For clients and applications to connect to the SQL Server failover cluster, two IP addresses are registered to the SQL Server failover cluster resource name in WSFC. For example, imagine your server name is SQLFCI01 and the IP addresses are 192.168.115.5 and 192.168.116.5. WSFC automatically controls the failover and brings the appropriate IP address online depending on the node that currently owns the SQL Server resource. If Node 1 is affiliated with the 192.168.115.0/24 subnet and owns the SQL Server failover cluster, the IP address resource 192.168.115.5 is brought online, as shown in Figure 2-12. Similarly, if a failover occurs and Node 2 owns the SQL Server resource, the IP address resource 192.168.115.5 is taken offline and the IP address resource 192.168.116.5 is brought online.

 

 

 

 

 

FIGURE 2-12 Multiple IP addresses affiliated with a multisubnet failover cluster instance

 

Because there are multiple IP addresses affiliated with the SQL Server failover cluster instance

virtual name, the online address changes automatically when there is a failover. In addition,

Windows

failover cluster issues a DNS update immediately after the network name resource name

comes online.

The IP address change in DNS might not take effect on clients because of cache

settings; therefore, it is recommended that you minimize the client downtime by configuring the

HostRecordTTL

in DNS to 60 seconds. Consult with your DNS administrator before making any DNS

changes, because additional load requests could occur when tuning the TTL time with a host record.

 

Support for Deploying SQL Server 2012 on Windows Server Core

 

Windows Server Core was originally introduced with Windows Server 2008 and saw significant enhancements with the release of Windows Server 2008 R2. For those who are unfamiliar with Server Core, it is an installation option for the Windows Server 2008 and Windows Server 2008 R2 operating systems. Because Server Core is a minimal deployment of Windows, its attack surface is greatly reduced, making it much more secure. Server Core does not include a traditional Windows graphical interface and, therefore, is managed via a command prompt or by remote administration tools.

 

Unfortunately, previous versions of SQL Server did not support the Server Core operating system,

but that has all changed. For the first time, Microsoft SQL Server 2012 supports Server Core installations

for organizations running Server Core based on Windows Server 2008 R2 with Service Pack 1 or

later.

 

Why is Server Core so important to SQL Server, and how does it positively affect availability? When you are running SQL Server 2012 on Server Core, operating-system patching is drastically reduced, by up to 60 percent. This translates to higher availability and a reduction in planned downtime for any organization’s mission-critical databases and workloads. In addition, the surface area exposed to attacks is greatly reduced and the overall security of the database platform is strengthened, which again translates to maximum availability and data protection.

 

When first introduced, Server Core required the use and knowledge of command-line syntax to

manage it. Most IT professionals at this time were accustomed to using a graphical user interface

(GUI) to manage and configure Windows, so they had a difficult time embracing Server Core. This

affected its popularity and, ultimately, its implementation. To ease these challenges, Microsoft created

SCONFIG, which is an out-of-the-box utility that was introduced with the release of Windows Server

2008 R2 to dramatically ease server configurations. To navigate through the SCONFIG options, you

only need to type one or more numbers to configure server properties, as displayed in Figure 2-13.

 

 

 

FIGURE 2-13 SCONFIG utility for configuring server properties in Server Core

 

The following sections articulate the SQL Server 2012 prerequisites for Server Core, SQL Server

features supported on Server Core, and the installation alternatives.

 

SQL Server 2012 Prerequisites for Server Core

 

Organizations installing SQL Server 2012 on Windows Server 2008 R2 Server Core must meet the

following

operating system, features, and components prerequisites.

 

The operating system requirements are as follows:

 

■ Windows Server 2008 R2 SP1 64-bit x64 Data Center Server Core
■ Windows Server 2008 R2 SP1 64-bit x64 Enterprise Server Core
■ Windows Server 2008 R2 SP1 64-bit x64 Standard Server Core
■ Windows Server 2008 R2 SP1 64-bit x64 Web Server Core

 

 

Here is the list of features and components:

 

■ .NET Framework 2.0 SP2
■ .NET Framework 3.5 SP1 Full Profile
■ .NET Framework 4 Server Core Profile
■ Windows Installer 4.5
■ Windows PowerShell 2.0

 

 

Once you have all the prerequisites in place, it is important to become familiar with the SQL Server components supported on Server Core.

 

SQL Server Features Supported on Server Core

 

There are numerous SQL Server features that are fully supported on Server Core, including Database Engine Services, SQL Server Replication, Full Text Search, Analysis Services, Client Tools Connectivity, and Integration Services. Conversely, Server Core does not support many other features, including Reporting Services, Business Intelligence Development Studio, Client Tools Backward Compatibility, Client Tools SDK, SQL Server Books Online, Distributed Replay Controller, SQL Client Connectivity SDK, Master Data Services, and Data Quality Services. Some features, such as Management Tools – Basic, Management Tools – Complete, Distributed Replay Client, and Microsoft Sync Framework, are supported only remotely. These features can be installed on editions of the Windows operating system that are not Server Core and then used to remotely connect to a SQL Server instance running on Server Core. For a full list of supported and unsupported features, review the information at this link: http://msdn.microsoft.com/en-us/library/hh231669(SQL.110).aspx.

 

Note To leverage Server Core, you need to plan your SQL Server installation ahead of

time. Give yourself the opportunity to fully understand which SQL Server features are

required

to support your mission-critical workloads.

 

SQL Server on Server Core Installation Alternatives

 

The typical SQL Server Installation Setup Wizard is not supported when installing SQL Server 2012 on Server Core. As a result, you need to automate the installation process by using a command-line installation, using a configuration file, or leveraging the DefaultSetup.ini methodology. Details and examples for each of these methods can be found in Books Online: http://technet.microsoft.com/en-us/library/ms144259(SQL.110).aspx.

 

Note When installing SQL Server 2012 on Server Core, ensure that you use Full Quiet

mode by using the /Q parameter or Quiet Simple mode by using the /QS parameter.

 

 

 

Additional High-Availability and Disaster-Recovery Enhancements

 

This section summarizes some of the additional high-availability and disaster recovery enhancements

found in SQL Server 2012.

 

Support for Server Message Block

 

A common trend for organizations in recent years has been the movement toward consolidating databases and applications onto fewer servers, specifically by hosting many instances of SQL Server on a failover cluster. When using failover clustering for consolidation, previous versions of SQL Server required a single drive letter for each SQL Server failover cluster instance. Because there are only 23 drive letters available, without taking into account reservations, the maximum number of SQL Server instances supported on a single failover cluster was 23. Twenty-three instances sounds like an ample amount; however, the drive-letter limitation negatively affects organizations running powerful servers that have the compute and memory resources to host more than 23 instances on a single server. Going forward, SQL Server 2012 failover clustering introduces support for Server Message Block (SMB).

 

Note You might be thinking you can use mount points to alleviate the drive-letter pain

point. When working with previous versions of SQL Server, even with mount points, you

need at least one drive letter for each SQL Server failover cluster instance.

 

Some of the SQL Server 2012 benefits brought about by SMB are, of course, database-storage

consolidation and the potential to support more than 23 clustering instances in a single WSFC. To

take advantage of these features, the file servers must be running Windows Server 2008 or later

versions

of the operating system.

 

Database Recovery Advisor

 

The Database Recovery Advisor is a new feature aimed at optimizing the restore experience for

database

administrators conducting database recovery tasks. This tool includes a new timeline feature

that provides a visualization of the backup history, as shown in Figure 2-14.

 

 

 

 

 

FIGURE 2-14 Database Recovery Advisor backup and restore visual timeline

 

Online Operations

 

SQL Server 2012 also includes a few enhancements for online operations that reduce downtime during planned maintenance. Online re-indexing of indexes that contain large object (LOB) columns and online adding of columns with defaults are now supported.

 

Rolling Upgrade and Patch Management

 

All of the new AlwaysOn capabilities reduce application downtime to only a single manual failover by supporting rolling upgrades and patching of SQL Server. This means a database administrator can apply a service pack or critical fix to the passive node or nodes when using a failover cluster, or to the secondary replicas when using availability groups. Once the installation is complete on all passive nodes or secondaries, the database administrator can conduct a manual failover and then apply the service pack or critical fix to the remaining node in an FCI or to the former primary replica. This rolling strategy also applies when upgrading the database platform.

 

 

 


 

CHAPTER 3

 

Performance and Scalability

 

Microsoft SQL Server 2012 introduces a new index type called columnstore. The columnstore index feature was originally referred to as project Apollo during the development phases of SQL Server 2012 and during the distribution of the Community Technology Preview (CTP) releases of the product. This new index, combined with advanced query-processing enhancements, offers blazing-fast performance optimizations for data-warehousing workloads and similar queries. In many cases, data-warehouse query performance has improved by tens to hundreds of times.

 

This chapter aims to teach, enlighten, and even dispel flawed beliefs about the columnstore index so that database administrators can greatly increase query performance for their data warehouse workloads. The questions this chapter focuses on are the following:

 

■ What is a columnstore index?
■ How does a columnstore index drastically increase the speed of data warehouse queries?
■ When should a database administrator build a columnstore index?
■ Are there any well-established best practices for a columnstore index deployment?

 

 

Let’s look under the hood to see how organizations will benefit from significant data-warehouse

performance gains with the new, in-memory, columnstore index technology, which also helps with

managing increasing data volumes.

 

Columnstore Index Overview

 

Because of a proliferation of data being captured across devices, applications, and services,

organizations

today are tasked with storing massive amounts of data to successfully operate their

businesses. Using traditional tools to capture, manage, and process data within an acceptable time is

becoming increasingly challenging as the data users want to capture continues to grow. For example,

the volume of data is overwhelming the ability of data warehouses to execute queries in a timely

manner, and considerable time is spent tuning queries and designing and maintaining indexes to try

to get acceptable query performance. In many cases, so much time might have elapsed between the

time the query is launched and the time the result sets are returned that organizations have difficulty

recalling what the original request was about. Equally unproductive are the cases where the delay

causes the business opportunity to be lost.

 

 

 


 

With these issues among organizations’ primary concerns, the Query Processing and Storage teams from the SQL Server product group set to work on new technologies that would allow very large data sets to be read quickly and accurately while transforming the data into useful information and knowledge for organizations in a timely manner. The Query Processing team reviewed academic research in columnstore data representations and analyzed improved query-execution capabilities for data warehousing. In addition, they collaborated with the SQL Server Analysis Services team to gain a stronger understanding of that team’s columnstore implementation, known as PowerPivot for SQL Server 2008 R2. This research and analysis led the Query Processing team to create the new columnstore index and query optimizations based on vector-based execution, which significantly improve data-warehouse query performance.

 

When developing the new columnstore index, the Query Processing team committed to a number

of goals. They aimed to ensure that end users who consume data had an interactive and positive

experience with all data sets, whether large or small, which meant the response time on data must be

swift. These strategies also apply to ad hoc and reporting queries. Moreover, database administrators

might even be able to reduce their needs for manually tuning queries, summary tables, indexed

views, and in some cases OLAP cubes. All these goals naturally impact total cost of ownership (TCO)

because hardware costs are lowered and fewer people are required to get a task accomplished.

 

Columnstore Index Fundamentals and Architecture

 

Before designing, implementing, or managing a columnstore index, it is beneficial to understand how it works, how data is stored in a columnstore index, and what types of queries can benefit from one.

 

How Is Data Stored When Using a Columnstore Index?

 

With traditional tables (heaps) and indexes (B-trees), SQL Server stores data in pages in a row-based

fashion. This storage model is typically referred to as a row store. Using column stores is like turning

the traditional storage model 90 degrees, where all the values from a single column are stored contiguously

in a compressed form. The columnstore index stores each column in a separate set of disk

pages rather than storing multiple rows per page, which has been the traditional storage format. The

following examples illustrate the differences.

 

Let’s use a common table populated with employee data, as illustrated in Table 3-1, and then

evaluate the different ways data can be stored. This employee table includes typical data such as an

employee ID number, employee name, and the city and state the employee is located in.

 

 

 


 

TABLE 3-1 Traditional Table Containing Employee Data

EmployeeID   Name     City            State
1            Ross     San Francisco   CA
2            Sherry   New York        NY
3            Gus      Seattle         WA
4            Stan     San Jose        CA
5            Lijon    Sacramento      CA

 

 

 

 

 

Depending on the type of index chosen—traditional or columnstore—database administrators can

organize their data by row (as shown in Table 3-2) or by column (as shown in Table 3-3).

 

TABLE 3-2 Employee Data Stored in a Traditional “Row Store” Format

 

Row Store

 

1 Ross San Francisco CA

 

2 Sherry New York NY

 

3 Gus Seattle WA

 

4 Stan San Jose CA

 

5 Lijon Sacramento CA

 

 

 

 

 

TABLE 3-3 Employee Data Stored in the New Columnstore Format

 

Columnstore

 

1 2 3 4 5

 

Ross Sherry Gus Stan Lijon

 

San Francisco New York Seattle San Jose Sacramento

 

CA NY WA CA CA

 

 

 

 

 

As you can see, the major difference between the columnstore format in Table 3-3 and the row

store method in Table 3-2 is that a columnstore index groups and stores data for each column and

then joins all the columns to complete the whole index, whereas a traditional index groups and

stores data for each row and then joins all the rows to complete the whole index.

 

Now that you understand how data is stored when using columnstore indexes compared to

traditional

B-tree indexes, let’s take a look at how this new storage model and advanced query

optimizations

significantly speed up the retrieval of data. The next section describes three ways that

SQL Server columnstore indexes significantly improve the speed of queries.

 

 

 

How Do Columnstore Indexes Significantly Improve the Speed of Queries?

 

The new columnstore storage model significantly improves data-warehouse query speeds for several reasons. First, data organized in a column shares many more similar characteristics than data organized across rows. As a result, a much higher level of compression can be achieved compared to data organized across rows. Moreover, the columnstore index within SQL Server uses the VertiPaq compression algorithm, which in SQL Server 2008 R2 was found only in Analysis Services for PowerPivot. VertiPaq compression is far superior to the traditional row and page compression used in the Database Engine, with compression rates of up to 15 to 1 having been achieved. When data is compressed, queries require less IO because the amount of data transferred from disk to memory is significantly reduced. Reducing IO when processing queries equates to faster response times. The advantages of columnstore do not end here: with less data transferred to memory, less space is required in memory to hold the working set affiliated with the query.

 

Second, when a user runs a query using the columnstore index, SQL Server fetches data only for

the columns that are required for the query, as illustrated in Figure 3-1. In this example, there are 15

columns in the table; however, because the data required for the query resides in Column 7, Column

8, and Column 9, only these columns are retrieved.

 

 


 

FIGURE 3-1 Improving performance and reducing IO by fetching only columns required for the query

 

Because data-warehouse queries typically touch only 10 to 15 percent of the columns in large fact

tables, fetching only selected columns translates into savings of approximately 85 to 90 percent of an

organization’s IO, which again increases performance speeds.

 

 

 

Batch-Mode Processing

 

Finally, an advanced technology for processing queries that use columnstore indexes speeds up queries in yet another way, so this is a good time to dig a little deeper into how such queries are processed.

 

First, the data in the columns is processed in batches using a new, highly efficient vector technology that works with columnstore indexes. Database administrators should take a moment to review the query plan and notice the groups of operators that execute in batch mode. Note that not all operators execute in batch mode; however, the most important ones for data warehousing do, such as Hash Join and Hash Aggregation. All of the algorithms have been significantly optimized to take advantage of modern hardware architecture, such as increased core counts and additional RAM, thereby improving parallelism. All of these improvements make batch-mode processing faster than traditional row-mode processing.

 

 

Columnstore Index Storage Organization

 

Let’s examine how the storage associated with a columnstore index is organized. First I’ll define the

new storage concepts affiliated with columnstore indexes, such as a segment and a row group, and

then I’ll elucidate how these concepts relate to one another.

 

As illustrated in Figure 3-2, data in the columnstore index is broken up into segments. A segment

contains data from one column for a set of up to about 1 million rows. Segments for the same set

of rows comprise a row group. Instead of storing the data page by page, SQL Server stores the row

group as a unit. Each segment is internally stored in a separate Large Object (LOB). Therefore, when

SQL Server reads the data, the unit reading from disk consists of a segment and the segment is a unit

of transfer between the disk and memory.

 


 

FIGURE 3-2 How a columnstore index stores data
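
For reference, segment-level metadata can be inspected through a catalog view. A minimal sketch, using the sys.column_store_segments catalog view that ships with SQL Server 2012, is:

-- List the segments of all columnstore indexes, with the row count held in each segment
SELECT column_id, segment_id, row_count
FROM sys.column_store_segments;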

 

 

 

Columnstore Index Support and SQL Server 2012

 

Columnstore indexes and batch-query execution mode are deeply integrated in SQL Server 2012, and

they work in conjunction with many of the Database Engine features found in SQL Server 2012. For

example, database administrators can implement a columnstore index on a table and still successfully

use AlwaysOn Availability Groups (AG), AlwaysOn failover cluster instances (FCI), database mirroring,

log shipping, and SQL Server Management Studio administration tools. Here are the common business

data types supported by columnstore indexes:

 

■ char and varchar
■ All integer types (int, bigint, smallint, and tinyint)
■ real and float
■ string
■ money and smallmoney
■ All date, time, and datetime types, with one exception (datetimeoffset with precision greater than 2)
■ decimal and numeric with precision less than or equal to 18 (that is, with at most 18 digits)

Note also that only one columnstore index can be created per table.

 

 

Columnstore Index Restrictions

 

Although columnstore indexes work with the majority of the data types, components, and features

found in SQL Server 2012, columnstore indexes have the following restrictions and cannot be

leveraged

in the following situations:

 

■ You can enable PAGE or ROW compression on the base table, but you cannot enable PAGE or ROW compression on the columnstore index.
■ Tables and columns cannot participate in a replication topology.
■ Tables and columns using Change Data Capture are unable to participate in a columnstore index.
■ Create Index: You cannot create a columnstore index on the following data types:
  • decimal greater than 18 digits
  • binary and varbinary
  • BLOB
  • CLR
  • (n)varchar(max)
  • uniqueidentifier
  • datetimeoffset with precision greater than 2

 

 

 

■ Table Maintenance: If a columnstore index exists, you can read the table but you cannot directly update it. This is because columnstore indexes are designed for data-warehouse workloads that are typically read based. Rest assured that there is no need to agonize: the upcoming “Columnstore Index Design Considerations and Loading Data” section articulates strategies for loading new data when using columnstore indexes.
■ Process Queries: You can process all read-only T-SQL queries using the columnstore index, but because batch processing works only with certain operators, you will see that some queries are accelerated more than others.
■ A column that contains FILESTREAM data cannot participate in a columnstore index.
■ INSERT, UPDATE, DELETE, and MERGE statements are not allowed on tables using columnstore indexes.
■ More than 1024 columns are not supported when creating a columnstore index.
■ Only nonclustered columnstore indexes are allowed; filtered columnstore indexes are not allowed.
■ Computed and sparse columns cannot be part of a columnstore index.
■ A columnstore index cannot be created on an indexed view.

 

 

Columnstore Index Design Considerations and Loading Data

 

When working with columnstore indexes, some queries are accelerated much more than others.

Therefore, to optimize query performance, it is important to understand when to build a columnstore

index and when not to build a columnstore index. The next sections cover when to build columnstore

indexes, design considerations, and how to load data when using a columnstore index.

 

When to Build a Columnstore Index

 

The following list describes when database administrators should use a columnstore index to optimize

query performance:

 

■ When workloads are mostly read based—specifically, data warehouse workloads.
■ When your workflow permits partitioning (or a drop-and-rebuild index strategy) to handle new data. Most commonly, this is associated with periodic maintenance windows when indexes can be rebuilt or when staging tables are switched into empty partitions of existing tables.
■ When most queries fit a star join pattern or entail scanning and aggregating large amounts of data.
■ When updates occur and most updates append new data, which can be loaded using staging tables and partition switching.

 

 

You should use columnstore indexes when building the following types of tables:

 

■ Large fact tables
■ Large (millions of rows) dimension tables

 

 

When Not to Build a Columnstore Index

 

Database administrators might encounter situations when the performance benefits achieved by

using traditional B-tree indexes on their tables are greater than the benefits of using a columnstore

index. The following list describes some of these situations:

 

■ Data in your table constantly requires updating.
■ Partition switching or rebuilding an index does not meet the workflow requirements of your business.
■ You encounter frequent small look-up queries. Note, however, that a columnstore index might still benefit you in this situation: you can implement one without repercussions because the query optimizer should be able to determine when to use the traditional B-tree index instead of the columnstore index. This strategy assumes you have up-to-date statistics.
■ You test columnstore indexes on your workload and do not see any benefit.

 

 

Loading New Data

 

As mentioned in earlier sections, tables with a columnstore index cannot be updated directly.

However,

there are three alternatives for loading data into a table with a columnstore index:

 

■ Disable the Columnstore Index: This procedure consists of the following three steps (a Transact-SQL sketch follows this list):
1. Disable the columnstore index.
2. Update the data.
3. Rebuild the index when the updates are complete.

 

 

 

 

 

Database administrators should ensure a maintenance window exists when leveraging this strategy. The time required for the maintenance window will vary with each customer’s workload, so test within a prototype environment to determine the time required.

 

■ Leverage Partitioning and Partition Switching: Partitioning data enables database administrators to manage and access subsets of their data quickly and efficiently while maintaining the integrity of the entire data collection. Partition switching also allows database administrators to quickly and efficiently transfer subsets of their data by assigning a table as a partition to an already existing partitioned table, switching a partition from one partitioned table to another, or reassigning a partition to form a single table. Partition switching is fully supported with a columnstore index and is a practical way of updating data.

 

 

To use partitioning to load data, follow these steps:

 

1. Ensure you have an empty partition to accept the new data.

2. Load data into an empty staging table.

3. Switch the staging table (containing the newly loaded data) into the empty partition.

 

 

To use partitioning to update existing data, use the following steps:

 

1. Determine which partition contains the data to be modified.

2. Switch the partition into an empty staging table.

3. Disable the columnstore index on the staging table.

4. Update the data.

5. Rebuild the columnstore index on the staging table.

6. Switch the staging table back into the original partition (which was left empty when the

partition was switched into the staging table).

 

 

■ Union All: Database administrators can load data by storing their main data in a fact table that has a columnstore index. Next, create a secondary table to add or update data. Finally, leverage a UNION ALL query so that it returns all of the data from both the large fact table with the columnstore index and the smaller updateable table. Periodically load the data from the secondary table into the main table by using partition switching or by disabling and rebuilding the columnstore index. Note that some queries using the UNION ALL strategy might not be as fast as when all the data is in a single table.
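
The following Transact-SQL sketch illustrates the first two alternatives; the table, index, and partition names are illustrative assumptions:

-- Alternative 1: disable the columnstore index, load the data, then rebuild it
ALTER INDEX [csi_FactSales] ON [dbo].[FactSales] DISABLE;
-- ... run the INSERT/UPDATE/MERGE statements here ...
ALTER INDEX [csi_FactSales] ON [dbo].[FactSales] REBUILD;

-- Alternative 2: load a staging table, then switch it into an empty partition
-- of the partitioned fact table (partition 12 is assumed to be empty)
ALTER TABLE [dbo].[FactSales_Staging]
SWITCH TO [dbo].[FactSales] PARTITION 12;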

 

 

Creating a Columnstore Index

 

Creating a columnstore index is very similar to creating any other traditional SQL Server indexes. You

can use the graphical user interface in SQL Server Management Studio or Transact-SQL. Many individuals

prefer to use the graphical user interface because they want to avoid typing all of the column

names, which is what they have to do when creating the index with Transact-SQL.

 

A few questions arise frequently when creating a columnstore index. Database administrators

often want to know the following:

 

■ Which columns should be included in the columnstore index?
■ Is it possible to create a clustered columnstore index?

 

 

When creating a columnstore index, a database administrator should typically include all of the supported columns associated with the table. Note, however, that you don’t need to include all of the columns. The answer to the second question is “No.” All columnstore indexes must be nonclustered; therefore, a clustered columnstore index is not allowed.

 

The next sections explain the steps for creating a columnstore index using either of the two

methods

mentioned: SQL Server Management Studio and Transact-SQL.

 

Creating a Columnstore Index by Using SQL Server Management Studio

 

Here are the steps for creating a columnstore index using SQL Server Management Studio (SSMS):

 

1. In SQL Server Management Studio, use Object Explorer to connect to an instance of the SQL

Server Database Engine.

2. In Object Explorer, expand the instance of SQL Server, expand Databases, expand a database,

and expand a table in which you would like to create a new columnstore index.

3. Expand the table, right-click the Indexes folder, choose New Index, and then click Non-Clustered Columnstore Index.

4. On the General tab, in the Index name box, type a name for the new index, and then click

Add.

5. In the Select Columns dialog box, select the columns to participate in the columnstore index

and then click OK.

6. If desired, configure the settings on the Options, Storage, and Extended Properties pages. If

you want to maintain the defaults, click OK to create the index, as illustrated in Figure 3-3.

 

 

 

 

FIGURE 3-3 Creating a nonclustered columnstore index with SSMS

 

 

 

Creating a Columnstore Index Using Transact-SQL

 

As mentioned earlier, you can use Transact-SQL to create a columnstore index instead of using the graphical user interface in SQL Server Management Studio. The following example illustrates the syntax for creating a columnstore index with Transact-SQL:

 

CREATE [ NONCLUSTERED ] COLUMNSTORE INDEX index_name
    ON <object> ( column [ ,...n ] )
    [ WITH ( <column_index_option> [ ,...n ] ) ]
    [ ON {
            { partition_scheme_name ( column_name ) }
            | filegroup_name
            | "default"
         }
    ]
[ ; ]

<object> ::=
{
    [ database_name. [ schema_name ] . | schema_name . ]
    table_name
}

<column_index_option> ::=
{
    DROP_EXISTING = { ON | OFF }
  | MAXDOP = max_degree_of_parallelism
}

 

The following bullets explain the arguments affiliated with the Transact-SQL syntax to create the

nonclustered columnstore index:

 

■ NONCLUSTERED This argument indicates that this index is a secondary representation of the data.

■ COLUMNSTORE This argument indicates that the index that will be created is a columnstore index.

■ index_name This is where you specify the name of the columnstore index to be created. Index names must be unique within a table or view but do not have to be unique within a database.

■ column This refers to the column or columns to be added to the index. As a reminder, a columnstore index is limited to 1,024 columns.

■ ON partition_scheme_name(column_name) Specifies the partition scheme that defines the filegroups on which the partitions of a partitioned index are mapped. The column_name specifies the column against which a partitioned index will be partitioned; this column must match the data type, length, and precision of the argument of the partition function that partition_scheme_name is using. If neither partition_scheme_name nor filegroup_name is specified and the table is partitioned, the index is placed in the same partition scheme and uses the same partitioning column as the underlying table.
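As a concrete illustration of the syntax, the following sketch creates a nonclustered columnstore index on the dbo.FactResellerSales table; the column list and the index name are assumptions chosen here to match the index hint examples shown later:

CREATE NONCLUSTERED COLUMNSTORE INDEX [Non-ClusteredColumnStoreIndexSalesTerritory]
ON dbo.FactResellerSales
(SalesTerritoryKey, OrderDateKey, ProductKey, SalesAmount);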

 

 

 

 

 

 

FIGURE 3-5 Reviewing the Columnstore Index Scan results

 

Using Hints with a Columnstore Index

 

Finally, if you believe the query can benefit from a columnstore index and the query execution plan is

not leveraging it, you can force the query to use a columnstore index. You do this by using the WITH

(INDEX(<indexname>)) hint where the <indexname> argument is the name of the columnstore index

you want to force.

 

The following example illustrates a query with an index hint forcing the use of a columnstore index:

 

SELECT DISTINCT (SalesTerritoryKey)
FROM dbo.FactResellerSales WITH (INDEX ([Non-ClusteredColumnStoreIndexSalesTerritory]))
GO

 

The next example illustrates a query with an index hint forcing the use of a different index, such as a traditional clustered B-tree index, over a columnstore index. For example, let's say there are two indexes on the SalesTerritoryKey column of this table: a clustered index called ClusteredIndexSalesTerritory and a nonclustered columnstore index called Non-ClusteredColumnStoreIndexSalesTerritory. Instead of using the columnstore index, the hint forces the query to use the clustered index known as ClusteredIndexSalesTerritory.
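A minimal sketch of such a hinted query, assuming the two index names above:

SELECT DISTINCT (SalesTerritoryKey)
FROM dbo.FactResellerSales WITH (INDEX ([ClusteredIndexSalesTerritory]))
GO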

 

 

 

 


 

CHAPTER 6

 

Integration Services

 

Since its initial release in Microsoft SQL Server 2005, Integration Services has had incremental

changes in each subsequent version of the product. However, those changes were trivial in comparison

to the number of enhancements, performance improvements, and new features introduced

in SQL Server 2012 Integration Services. This product overhaul affects every aspect of Integration

Services, from development to deployment to administration.

 

Developer Experience

 

The first change that you notice as you create a new Integration Services project is that Business

Intelligence Development Studio (BIDS) is now a Microsoft Visual Studio 2010 shell called SQL Server

Data Tools (SSDT). The Visual Studio environment alone introduces some slight user-interface changes

from the previous version of BIDS. However, several more significant interface changes of note are

specific to SQL Server Integration Services (SSIS). These enhancements to the interface help you to

learn about the package-development process if you are new to Integration Services, and they enable

you to develop packages more easily if you already have experience with Integration Services. If you

are already an Integration Services veteran, you will also notice the enhanced appearance of tasks and

data flow components with rounded edges and new icons.

 

Add New Project Dialog Box

 

To start working with Integration Services in SSDT, you create a new project by following the same

steps you use to perform the same task in earlier releases of Integration Services. From the File menu,

point to New, and then select Project. The Add New Project dialog box displays. In the Installed Templates list, you can select the type of Business Intelligence template you want to use and then view only the templates related to your selection, as shown in Figure 6-1. When you select a template, a description of the template displays on the right side of the dialog box.

 

 

 


 

 

 

FIGURE 6-1 New Project dialog box displaying installed templates

 

There are two templates available for Integration Services projects:

 

■ Integration Services Project You use this template to start development with a blank package to which you add tasks and arrange those tasks into workflows. This template type was available in previous versions of Integration Services.

■ Integration Services Import Project Wizard You use this wizard to import a project from the Integration Services catalog or from a project deployment file. (You learn more about project deployment files in the “Deployment Models” section of this chapter.) This option is useful when you want to use an existing project as a starting point for a new project, or when you need to make changes to an existing project.

 

 

Note The Integration Services Connections Project template from previous versions is no

longer available.

 

 

 


 

General Interface Changes

 

After creating a new package, several changes are visible in the package-designer interface, as you

can see in Figure 6-2:

 

■ SSIS Toolbox You now work with the SSIS Toolbox to add tasks and data flow components to a package, rather than with the Visual Studio toolbox that you used in earlier versions of Integration Services. You learn more about this new toolbox in the “SSIS Toolbox” section of this chapter.

■ Parameters The package designer includes a new tab to open the Parameters window for a package. Parameters allow you to specify run-time values for package, container, and task properties or for variables, as you learn in the “Parameters” section of this chapter.

■ Variables button This new button on the package designer toolbar provides quick access to the Variables window. You can also continue to open the window from the SSIS menu or by right-clicking the package designer and selecting the Variables command.

■ SSIS Toolbox button This button is also new in the package-designer interface and allows you to open the SSIS Toolbox when it is not visible. As an alternative, you can open the SSIS Toolbox from the SSIS menu or by right-clicking the package designer and selecting the SSIS Toolbox command.

■ Getting Started This new window displays below the Solution Explorer window and provides access to links to videos and samples you can use to learn how to work with Integration Services. This window includes the Always Show In New Project check box, which you can clear if you prefer not to view the window after creating a new project. You learn more about using this window in the next section, “Getting Started Window.”

■ Zoom control Both the control flow and data flow design surface now include a zoom control in the lower-right corner of the workspace. You can zoom in or out to a maximum size of 500 percent of the normal view or to a minimum size of 10 percent, respectively. As part of the zoom control, a button allows you to resize the view of the design surface to fit the window.

 

 

 

 


 

FIGURE 6-2 Package-designer interface changes

 

Getting Started Window

 

As explained in the previous section, the Getting Started window is new to the latest version of Integration Services. Its purpose is to provide resources to new developers. It will display automatically when you create a new project unless you clear the check box at the bottom of the window. You must use the Close button in the upper-right corner of the window to remove it from view. Should you want to access the window later, you can choose Getting Started on the SSIS menu or right-click the design surface and select Getting Started.

 

In the Getting Started window, you find several links to videos and Integration Services samples.

To use the links in this window, you must have Internet access. By default, the following topics are available:

 

 

■ Designing and Tuning for Performance Your SSIS Packages in the Enterprise This link provides access to a series of videos created by the SQL Server Customer Advisory Team (SQLCAT) that explain how to monitor package performance and techniques to apply during package development to improve performance.

■ Parameterizing the Execute SQL Task in SSIS This link opens a page from which you can access a brief video explaining how to work with parameterized SQL statements in Integration Services.

■ SQL Server Integration Services Product Samples You can use this link to access the product samples available on Codeplex, Microsoft’s open-source project-hosting site. By studying the package samples available for download, you can learn how to work with various control flow tasks or data flow components.

 

 

Note Although the videos and samples accessible through these links were developed for

previous versions of Integration Services, the principles remain applicable to the latest version.

When opening a sample project in SSDT, you will be prompted to convert the project.

 

You can customize the Getting Started window by adding your own links to the SampleSites.xml

file located in the Program Files (x86)\Microsoft SQL Server\110\DTS\Binn folder.

 

SSIS Toolbox

 

Another new window for the package designer is the SSIS Toolbox. Not only has the overall interface

been improved, but you will find there is also added functionality for arranging items in the toolbox.

 

Interface Improvement

 

The first thing you notice in the SSIS Toolbox is the updated icons for most items. Furthermore, the

SSIS Toolbox includes a description for the item that is currently selected, allowing you to see what it

does without needing to add it first to the design surface. You can continue to use drag-and-drop to

place items on the design surface, or you can double-click the item. However, the new behavior when

you double-click is to add the item to the container that is currently selected, which is a welcome

time-saver for the development process. If no container is selected, the item is added directly to the

design surface.

 

Item Arrangement

 

At the top of the SSIS Toolbox, you will see two new categories, Favorites and Common, as shown in Figure 6-3. All categories are populated with items by default, but you can move items into another category at any time. To do this, right-click the item and select Move To Favorites or Move To Common. If you are working with control flow items, you have Move To Other Tasks as another choice, but if you are working with data flow items, you can choose Move To Other Sources, Move To Other Transforms, or Move To Other Destinations. You will not see the option to move an item to the category in which it already exists, nor are you able to use drag-and-drop to move items manually. If you decide to start over and return the items to their original locations, select Restore Toolbox Defaults.

 

 

 

 

 

FIGURE 6-3 SSIS Toolbox for control flow and data flow

 

Shared Connection Managers

 

If you look carefully at the Solution Explorer window, you will notice that the Data Sources and Data Source Views folders are missing, and have been replaced by a new file and a new folder. The new file is Project.params, which is used for project parameters and is discussed in the “Project Parameters” section of this chapter. The Connection Managers folder is the new container for connection managers that you want to share among multiple packages.

 

Note If you create a Cache Connection Manager, Integration Services shares the in-memory cache with child packages using the same cache as the parent package. This feature is valuable for optimizing repeated lookups to the same source across multiple packages.

 

 

 

To create a shared connection manager, follow these steps:

 

1. Right-click the Connections Managers folder, and select New Connection Manager.

2. In the Add SSIS Connection Manager dialog box, select the desired connection-manager type

and then click the Add button.

3. Supply the required information in the editor for the selected connection-manager type, and

then click OK until all dialog boxes are closed.

 

 

A file with the CONMGR file extension displays in the Solution Explorer window within the Connection Managers folder. In addition, the file also appears in the Connection Managers tray in the package designer in each package contained in the same project. It displays with a (project) prefix to differentiate it from package connections. If you select the connection manager associated with one package and change its properties, the change affects the connection manager in all other packages.

 

 

If you change your mind about using a shared connection manager, you can convert it to a package connection. To do this, right-click the connection manager in the Connection Managers tray, and select Convert To Package Connection. The conversion removes the CONMGR file from the Connection Managers folder in Solution Explorer and from all other packages. Only the package in which you execute the conversion contains the connection. Similarly, you can convert a package connection to a shared connection manager by right-clicking the connection manager in Solution Explorer and selecting Convert To Project Connection.

 

Scripting Engine

 

The scripting engine in SSIS is an upgrade to Visual Studio Tools for Applications (VSTA) 3.0 and includes support for the Microsoft .NET Framework 4.0. When you edit a script task in the control flow or a script component in the data flow, the VSTA integrated development environment (IDE) continues to open in a separate window, but now it uses a Visual Studio 2010 shell. A significant improvement to the scripting engine is the ability to use the VSTA debug features with a Script component in the data flow.

 

Note As with debugging the Script task in the control flow, you must set the Run64BitRuntime project property to False when you are debugging on a 64-bit computer.

 

 

 

 

Expression Indicators

 

The use of expressions in Integration Services allows you, as a developer, to create a flexible package.

Behavior can change at run-time based on the current evaluation of the expression. For example, a

common reason to use expressions with a connection manager is to dynamically change connection

strings to accommodate the movement of a package from one environment to another, such as from

development to production. However, earlier versions of Integration Services did not provide an easy

way to determine whether a connection manager relies on an expression. In the latest version, an

extra icon appears beside the connection manager icon as a visual cue that the connection manager

uses expressions, as you can see in Figure 6-4.

 

 

 

FIGURE 6-4 A visual cue that the connection manager uses an expression

 

This type of expression indicator also appears with other package objects. If you add an expression

to a variable or a task, the expression indicator will appear on that object.
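For instance, the ConnectionString property of a connection manager might be driven by a property expression along the following lines; the project parameter name and database are assumptions for this sketch:

"Data Source=" + @[$Project::ServerName] + ";Initial Catalog=AdventureWorksDW2012;Integrated Security=SSPI;"

Because the expression is re-evaluated at run time, the same package can point to a development or production server depending on the parameter value supplied.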

 

Undo and Redo

 

A minor feature, but one you will likely appreciate greatly, is the newly added ability to use Undo and

Redo while developing packages in SSDT. You can now make edits in either the control flow or data

flow designer surface, and you can use Undo to reverse a change or Redo to restore a change you

had just reversed. This capability also works in the Variables window, and on the Event Handlers and

Parameters tabs. You can also use Undo and Redo when working with project parameters.

 

To use Undo and Redo, click the respective buttons in the standard toolbar. You can also use Ctrl+Z

and Ctrl+Y, respectively. Yet another option is to access these commands on the Edit menu.

 

Note The Undo and Redo actions will not work with changes you make to the SSIS

Toolbox, nor will they work with shared connection managers.

 

Package Sort By Name

 

As you add multiple packages to a project, you might find it useful to see the list of packages in Solution Explorer displayed in alphabetical order. In previous versions of Integration Services, the only way to re-sort the packages was to close the project and then reopen it. Now you can easily sort the list of packages without closing the project by right-clicking the SSIS Packages folder and selecting Sort By Name.

 

 

 

Status Indicators

 

After executing a package, the status of each item in the control flow and the data flow displays in the

package designer. In previous versions of Integration Services, the entire item was filled with green to

indicate success or red to indicate failure. However, for people who are color-blind, this use of color

was not helpful for assessing the outcome of package execution. Consequently, the user interface

now displays icons in the upper-right corner of each item to indicate success or failure, as shown in

Figure 6-5.

 

 

 

FIGURE 6-5 Item status indicators appear in the upper-right corner

 

Control Flow

 

Apart from the general enhancements to the package-designer interface, there are three notable

updates for the control flow. The Expression Task is a new item available to easily evaluate an expression

during the package workflow. In addition, the Execute Package Task has some changes to make it

easier to configure the relationship between a parent package and child package. Another new item

is the Change Data Capture Task, which we discuss in the “Change Data Capture Support” section of

this chapter.

 

Expression Task

 

Many of the developer experience enhancements in Integration Services affect both control flow and

data flow, but there is one new feature that is exclusive to control flow. The Expression Task is a new

item available in the SSIS Toolbox when the control flow tab is in focus. The purpose of this task is to

make it easier to assign a dynamic value to a variable.

 

Rather than use a Script Task to construct a variable value at runtime, you can now add an Expression Task to the workflow and use the SQL Server Integration Services Expression Language. When you edit the task, the Expression Builder opens. You start by referencing the variable and including the equals sign (=) as an assignment operator. Then provide a valid expression that resolves to a single value with the correct data type for the selected variable. Figure 6-6 illustrates an example of a variable assignment in an Expression Task.
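A representative assignment, using a hypothetical user variable, looks like this:

@[User::ExtractDate] = DATEADD("dd", -1, GETDATE())

When the task runs, the expression is evaluated and the resulting date is stored in the ExtractDate variable for use by downstream tasks.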

 

 

 

 

 

FIGURE 6-6 Variable assignment in an Expression Task

 

Note The Expression Builder is an interface commonly used with other tasks and data flow

components. Notice in Figure 6-6 that the list on the left side of the dialog box includes

both variables and parameters. In addition, system variables are now accessible from a

separate folder rather than listed together with user variables.

 

Execute Package Task

 

The Execute Package Task has been updated to include a new property, ReferenceType, which appears on the Package page of the Execute Package Task Editor. You use this property to specify the location of the package to execute. If you select External Reference, you configure the path to the child package just as you do in earlier versions of Integration Services. If you instead select Project Reference, you then choose the child package from the drop-down list.

 

In addition, the Execute Package Task Editor has a new page for parameter bindings, as shown

in Figure 6-7. You use this page to map a parameter from the child package to a parameter value or

variable value in the parent package.

 

 

 

 

 

FIGURE 6-7 Parameter bindings between a parent package and a child package

 

Data Flow

 

The data flow also has some significant updates. It has some new items, such as the Source and Destination assistants and the DQS Cleansing transformation, and there are some improved items such as the Merge and Merge Join transformation. In addition, there are several new data flow components resulting from a partnership between Microsoft and Attunity for use when accessing Open Database Connectivity (ODBC) connections and processing change data capture logs. We describe the change data capture components in the “Change Data Capture Support” section of this chapter. Some user interface changes have also been made to simplify the process and help you get your job done faster when designing the data flow.

 

Sources and Destinations

 

Let’s start exploring the changes in the data flow by looking at sources and destinations.

 

Source and Destination Assistants

 

The Source Assistant and Destination Assistant are two new items available by default in the Favorites

folder of the SSIS Toolbox when working with the data flow designer. These assistants help you easily

create a source or a destination and its corresponding connection manager.

 

 

 

To create a SQL Server source in a data flow task, perform the following steps:

 

1. Add the Source Assistant to the data flow design surface by using drag-and-drop or by

double-clicking the item in the SSIS Toolbox, which opens the Source Assistant – Add New

Source dialog box as shown here:

 

 

 

 

Note Clear the Show Only Installed Source Types check box to display the additional available source types that require installation of one of the following client providers: DB2, SAP BI, Sybase, or Teradata.

 

2. In the Select Connection Managers list, select an existing connection manager or select New

to create a new connection manager, and click OK.

3. If you selected the option to create a new connection manager, specify the server name,

authentication method, and database for your source data in the Connection Manager dialog

box, and click OK.

 

 

The new data source appears on the data flow design surface, and the connection manager

appears in the Connection Managers tray. You next need to edit the data source to configure

the data-access mode, columns, and error output.

 

ODBC Source and Destination

 

The ODBC Source and ODBC Destination components, shown in Figure 6-8, are new to Integration

Services in this release and are based on technology licensed by Attunity to Microsoft. Configuration

of these components is similar to that of OLE DB sources and destinations. The ODBC Source supports

Table Name and SQL Command as data-access modes, whereas data-access modes for the ODBC

Destination are Table Name – Batch and Table Name – Row By Row.

 

 

 

 

 

FIGURE 6-8 ODBC Source and ODBC Destination data flow components

 

Flat File Source

 

You use the Flat File source to extract data from a CSV or TXT file, but there were some data formats

that this source did not previously support without requiring additional steps in the extraction

process. For example, you could not easily use the Flat File source with a file containing a variable

number of columns. Another problem was the inability to use a character that was designated as a

qualifier as a literal value inside a string. The current version of Integration Services addresses both of

these problems.

 

■ Variable columns A file layout with a variable number of columns is also known as a ragged-right delimited file. Although Integration Services supports a ragged-right format, a problem arises when one or more of the rightmost columns do not have values and the column delimiters for the empty columns are omitted from the file. This situation commonly occurs when the flat file contains data of mixed granularity, such as header and detail transaction records. Although a row delimiter exists on each row, Integration Services ignored the row delimiter and included data from the next row until it processed data for each expected column. Now the Flat File source correctly recognizes the row delimiter and handles the missing columns as NULL values.

 

 

Note If you expect data in a ragged-right format to include a column delimiter

for each missing column, you can disable the new processing behavior by changing

the AlwaysCheckForRowDelimiters property of the Flat File connection

manager to False.

 

■ Embedded qualifiers Another challenge with the Flat File source in previous versions of Integration Services was the use of a qualifier character inside a string encapsulated within qualifiers. For example, consider a flat file that contains the names of businesses. If a single quote is used as a text qualifier but also appears within the string as a literal value, the common practice is to use another single quote as an escape character, as shown here.

ID,BusinessName
404,'Margie''s Travel'
406,'Kickstand Sellers'

 

 

 

In the first data row in this example, previous versions of Integration Services would fail to

interpret the second apostrophe in the BusinessName string as an escape character, and

instead would process it as the closing text qualifier for the column. As a result, processing of

the flat file returned an error because the next character in the row is not a column delimiter.

This problem is now resolved in the current version of Integration Services with no additional

configuration required for the Flat File source.

 

Transformations

 

Next we turn our attention to transformations.

 

Pivot Transformation

 

The user interface of the Pivot transformation in previous versions of Integration Services was a generic editor for transformations, and it was not intuitive for converting input rows into a set of columns for each row. The new custom interface, shown in Figure 6-9, provides distinctly named fields and includes descriptions explaining how each field is used as input or output for the pivot operation.

 

 

 

FIGURE 6-9 Pivot transformation editor for converting a set of rows into columns of a single row

 

Row Count Transformation

 

Another transformation having a generic editor in previous versions is the Row Count transformation.

The sole purpose of this transformation is to update a variable with the number of rows passing

through the transformation. The new editor makes it very easy to change the one property for this

transformation that requires configuration, as shown in Figure 6-10.

 

 

 

 

 

FIGURE 6-10 Row Count transformation editor for storing the current row count in a variable

 

Merge and Merge Join Transformations

 

Both the Merge transformation and the Merge Join transformation allow you to collect data from two

inputs and produce a single output of combined results. In earlier versions of Integration Services,

these transformations could result in excessive memory consumption by Integration Services when data arrives from each input at different rates of speed. The current version of Integration Services accommodates this situation by introducing a mechanism that lets these two transformations better manage memory pressure. This memory-management mechanism operates automatically, with no additional configuration of the transformation necessary.

 

Note If you develop custom data flow components for use in the data flow and if these components accept multiple inputs, you can use new methods in the Microsoft.SqlServer.Dts.Pipeline namespace to provide similar memory pressure management to your custom components. You can learn more about implementing these methods by reading "Developing Data Flow Components with Multiple Inputs," located at http://msdn.microsoft.com/en-us/library/ff877983(v=sql.110).aspx.

 

DQS Cleansing Transformation

 

The DQS Cleansing transformation is a new data flow component you use in conjunction with Data Quality Services (DQS). Its purpose is to help you improve the quality of data by using rules that are established for the applicable knowledge domain. You can create rules to test data for common misspellings in a text field or to ensure that the column length conforms to a standard specification.

To configure the transformation, you select a data-quality-field schema that contains the rules to apply and then select the input columns in the data flow to evaluate. In addition, you configure error handling. However, before you can use the DQS Cleansing transformation, you must first install and configure DQS on a server and create a knowledge base that stores information used to detect data anomalies and to correct invalid data, a topic that deserves a dedicated chapter. We explain not only how DQS works and how to get started with DQS, but also how to use the DQS Cleansing transformation in Chapter 7, "Data Quality Services."

 

 

 

Column References

 

The pipeline architecture of the data flow requires precise mapping between input columns and output columns of each data flow component that is part of a Data Flow Task. The typical workflow during data flow development is to begin with one or more sources, and then proceed with the addition of new components in succession until the pipeline is complete. As you plug each subsequent component into the pipeline, the package designer configures the new component's input columns to match the data type properties and other properties of the associated output columns from the preceding component. This collection of columns and related property data is also known as metadata.

 

If you later break the path between components to add another transformation to the pipeline, the metadata in some parts of the pipeline could change because the added component can add columns, remove columns, or change column properties (such as converting a data type). In previous versions of Integration Services, an error would display in the data flow designer whenever metadata became invalid. On opening a downstream component, the Restore Invalid Column References editor displayed to help you correct the column mapping, but the steps to perform in this editor were not always intuitive. In addition, because of each data flow component's dependency on access to metadata, it was often not possible to edit the component without first attaching it to an existing component in the pipeline.

 

Components Without Column References

 

Integration Services now makes it easier to work with disconnected components. If you attempt to

edit a transformation or destination that is not connected to a preceding component, a warning message

box displays: “This component has no available input columns. Do you want to continue editing

the available properties of this component?”

 

After you click Yes, the component’s editor displays and you can configure the component as

needed. However, the lack of input columns means that you will not be able to fully configure the

component using the basic editor. If the component has an advanced editor, you can manually add

input columns and then complete the component configuration. However, it is usually easier to use

the interface to establish the metadata than to create it manually.

 

Resolve References Editor

 

The current version of Integration Services also makes it easier to manage the pipeline metadata if

you need to add or remove components to an existing data flow. The data flow designer displays an

error indicator next to any path that contains unmapped columns. If you right-click the path between

components, you can select Resolve References to open a new editor that allows you to map the

output

columns to input columns by using a graphical interface, as shown in Figure 6-11.

 

 

 

 

 

FIGURE 6-11 Resolve References editor for mapping output to input columns

 

In the Resolve References editor, you can drag a column from the Unmapped Output Columns list

and add it to the Source list in the Mapped Columns area. Similarly, you can drag a column from the

Unmapped Input Columns area to the Destination list to link the output and input columns. Another

option is to simply type or paste in the names of the columns to map.

 

Tip When you have a long list of columns in any of the four groups in the editor, you can type a string in the filter box below the list to view only those columns matching the criteria you specify. For example, if your input columns are based on data extracted from the Sales.SalesOrderDetail table in the AdventureWorks2008R2 database, you can type unit in the filter box to view only the UnitPrice and UnitPriceDiscount columns.

 

You can also manually delete a mapping by clicking the Delete Row button to the right of each mapping. After you have completed the mapping process, you can quickly delete any remaining unmapped input columns by selecting the Delete Unmapped Input Columns check box at the bottom of the editor. By eliminating unmapped input columns, you reduce the component's memory requirements during package execution.

 

Collapsible Grouping

 

Sometimes the data flow contains too many components to see at one time in the package designer, depending on your screen size and resolution. Now you can consolidate data flow components into groups and expand or collapse the groups. A group in the data flow is similar in concept to a sequence container in the control flow, although you cannot use the group to configure a common property for all components that it contains, nor can you use it to set boundaries for a transaction or to set scope for a variable.

 

To create a group, follow these steps:

 

1. On the data flow design surface, use your mouse to draw a box around the components that

you want to combine as a group. If you prefer, you can click each component while pressing

the Ctrl key.

2. Right-click one of the selected components, and select Group. A group containing the components displays in the package designer, as shown here:

 

 

 

 

3. Click the arrow at the top right of the Group label to collapse the group.

 

 

Data Viewer

 

The only data viewer now available in Integration Services is the grid view. The histogram, scatter plot,

and chart views have been removed.

 

To use the data viewer, follow these steps:

 

1. Right-click the path, and select Enable Data Viewer. All columns in the pipeline are automatically

included.

2. If instead you want to display a subset of columns, right-click the new Data Viewer icon (a

magnifying glass) on the data flow design surface, and select Edit.

3. In the Data Flow Path Editor, select Data Viewer in the list on the left.

4. Move columns from the Displayed Columns list to the Unused Columns list as applicable

(shown next), and click OK.

 

 

 

 

 

 

Change Data Capture Support

 

Change data capture (CDC) is a feature introduced in the SQL Server 2008 database engine. When you configure a database for change data capture, the Database Engine stores information about insert, update, and delete operations on tables you are tracking in corresponding change data capture tables. One purpose for tracking changes in separate tables is to perform extract, transform, and load (ETL) operations without adversely impacting the source table.

 

In the previous two versions of Integration Services, there were multiple steps required to develop packages that retrieve data from change data capture tables and load the results into a destination. To expand Integration Services' data-integration capabilities in SQL Server 2012 by supporting change data capture for SQL Server, Microsoft partnered with Attunity, a provider of real-time data-integration software. As a result, new change data capture components are available for use in the control flow and data flow, simplifying the process of package development for change data capture.

 

Note Change data capture support in Integration Services is available only in Enterprise,

Developer, and Evaluation editions. To learn more about the change data capture feature

in the Database Engine, see “Basics of Change Data Capture” at http://msdn.microsoft.com

/en-us/library/cc645937(SQL.110).aspx.

 

 

 

CDC Control Flow

 

There are two types of packages you must develop to manage change data processing with Integration Services: an initial load package for one-time execution, and a trickle-feed package for ongoing execution on a scheduled basis. You use the same components in each of these packages, but you configure the control flow differently. In each package type, you include a package variable with a string data type for use by the CDC components to reflect the current state of processing.

 

As shown in Figure 6-12, you begin the control flow with a CDC Control Task to mark the start of

an initial load or to establish the Log Sequence Number (LSN) range to process during a trickle-feed

package execution. You then add a Data Flow Task that contains CDC components to perform the

processing of the initial load or changed data. (We describe the components to use in this Data Flow

Task later in this section.) Then you complete the control flow with another CDC Control Task to mark

the end of the initial load or the successful processing of the LSN range for a trickle-feed package.

 

 

 

FIGURE 6-12 Trickle-feed control flow for change data capture processing

 

Figure 6-13 shows the configuration of the CDC Control Task for the beginning of a trickle-feed

package. You use an ADO.NET Connection Manager to define the connection to a SQL Server database

for which change data capture is enabled. You also specify a CDC control operation and the

name of the CDC state variable. Optionally, you can use a table to persist the CDC state. If you do not

use a table for state persistency, you must include logic in the package to write the state to a persistent

store when change data processing completes and to read the state before beginning the next

execution of change data processing.

 

 

 

 

 

FIGURE 6-13 CDC Control Task Editor for retrieving the current LSN range for change data to process

 

CDC Data Flow

 

To process changed data, you begin a Data Flow Task with a CDC Source and a CDC Splitter, as shown in Figure 6-14. The CDC Source extracts the changed data according to the specifications defined by the CDC Control Task, and then the CDC Splitter evaluates each row to determine whether the changed data is a result of an insert, update, or delete operation. Then you add data flow components to each output of the CDC Splitter for downstream processing.

 

 

 

FIGURE 6-14 CDC Data Flow Task for processing changed data

 

 

 

CDC Source

 

In the CDC Source editor (shown in Figure 6-15), you specify an ADO.NET connection manager for the

database and select a table and a corresponding capture instance. Both the database and table must

be configured for change data capture in SQL Server. You also select a processing mode to control

whether to process all change data or net changes only. The CDC state variable must match the

variable you define in the CDC Control Task that executes prior to the Data Flow Task containing the

CDC Source. Last, you can optionally select the Include Reprocessing Indicator Column check box to

identify reprocessed rows for separate handling of error conditions.

 

 

 

FIGURE 6-15 CDC Source editor for extracting change data from a CDC-enabled table

 

CDC Splitter

 

The CDC Splitter uses the value of the __$operation column to determine the type of change associated with each incoming row and assigns the row to the applicable output: InsertOutput, UpdateOutput, or DeleteOutput. You do not configure this transformation. Instead, you add downstream data flow components to manage the processing of each output separately.

 

Flexible Package Design

 

During the initial development stages of a package, you might find it easiest to work with hard-coded values in properties and expressions to ensure that your logic is correct. However, for maximum flexibility, you should use variables. In this section, we review the enhancements for variables and expressions—the cornerstones of flexible package design.

 

 

 

Variables

 

A common problem for developers when adding a variable to a package has been the scope assignment. If you inadvertently select a task in the control flow designer and then add a new variable in the Variables window, the variable is created within the scope of that task and cannot be changed. In these cases, you were required to delete the variable, clear the task selection on the design surface, and then add the variable again within the scope of the package.

 

Integration Services now creates new variables with scope set to the package by default. To change

the variable scope, follow these steps:

 

1. In the Variables window, select the variable to change and then click the Move Variable button

in the Variables toolbar (the second button from the left), as shown here:

 

 

 

 

2. In the Select New Scope dialog box, select the executable to have scope—the package, an

event handler, container, or task—as shown here, and click OK:

 

 

 

 

Expressions

 

The expression enhancements in this release address a problem with expression size limitations and

introduce new functions in the SQL Server Integration Services Expression Language.

 

 

 

Expression Result Length

 

Prior to the current version of Integration Services, if an expression result had a data type of DT_WSTR or DT_STR, any characters above a 4000-character limit would be truncated. Furthermore, if an expression contained an intermediate step that evaluated a result exceeding this 4000-character limit, the intermediate result would similarly be truncated. This limitation is now removed.

 

New Functions

 

The SQL Server Integration Services Expression Language now has four new functions:

 

■ LEFT You can now more easily return the leftmost portion of a string rather than use the SUBSTRING function:

LEFT(character_expression, number)

■ REPLACENULL You can use this function to replace NULL values in the first argument with the expression specified in the second argument:

REPLACENULL(expression, expression)

■ TOKEN This function allows you to return a substring by using delimiters to separate a string into tokens and then specifying which occurrence to return:

TOKEN(character_expression, delimiter_string, occurrence)

■ TOKENCOUNT This function uses delimiters to separate a string into tokens and then returns the count of tokens found within the string:

TOKENCOUNT(character_expression, delimiter_string)
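A few illustrative calls, using arbitrary literal values and a hypothetical User::Region variable, show how these functions behave:

LEFT("2012-03-01", 4) evaluates to "2012"
REPLACENULL(@[User::Region], "Unknown") evaluates to "Unknown" when the variable is NULL
TOKEN("2012-03-01", "-", 2) evaluates to "03"
TOKENCOUNT("2012-03-01", "-") evaluates to 3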

 

Deployment Models

 

Up to now in this chapter, we have explored the changes to the package-development process in

SSDT, which have been substantial. Another major change to Integration Services is the concept of

deployment models.

 

Supported Deployment Models

 

The latest version of Integration Services supports two deployment models:

 

■ Package deployment model The package deployment model is the deployment model used in previous versions of Integration Services, in which the unit of deployment is an individual package stored as a DTSX file. A package can be deployed to the file system or to the MSDB database in a SQL Server database instance. Although packages can be deployed as a group and dependencies can exist between packages, there is no unifying object in Integration Services that identifies a set of related packages deployed using the package model. To modify properties of package tasks at runtime, which is important when running a package in different environments such as development or production, you use configurations saved as DTSCONFIG files on the file system. You use either the DTExec or the DTExecUI utilities to execute a package on the Integration Services server, providing arguments on the command line or in the graphical interface when you want to override package property values at run time manually or by using configurations.

■ Project deployment model With this deployment model, the unit of deployment is a project, stored as an ISPAC file, which in turn is a collection of packages and parameters. You deploy the project to the Integration Services catalog, which we describe in a separate section of this chapter. Instead of configurations, you use parameters (as described later in the “Parameters” section) to assign values to package properties at runtime. Before executing a package, you create an execution object in the catalog and, optionally, assign parameter values or environment references to the execution object. When ready, you start the execution object by using a graphical interface in SQL Server Management Studio, by executing a stored procedure, or by running managed code.

 

 

In addition to the characteristics just described, there are additional differences between the package deployment model and the project deployment model. Table 6-1 compares these differences.

 

 

TABLE 6-1 Deployment Model Comparison

Characteristic                        Package Deployment Model              Project Deployment Model
Unit of deployment                    Package                               Project
Deployment location                   File system or MSDB database          Integration Services catalog
Run-time property value assignment    Configurations                        Parameters
Environment-specific values for       Configurations                        Environment variables
use in property values
Package validation                    Just before execution using           Independent of execution using the
                                      DTExec or managed code                SQL Server Management Studio
                                                                            interface, a stored procedure, or
                                                                            managed code
Package execution                     DTExec or DTExecUI                    SQL Server Management Studio
                                                                            interface, stored procedure, or
                                                                            managed code
Logging                               Configure log provider or             No configuration required
                                      implement custom logging
Scheduling                            SQL Server Agent job                  SQL Server Agent job
CLR integration                       Not required                          Required

 

 

 

 

 

 

 

When you create a new project in SSDT, the project is by default established as a project deployment model. You can use the Convert To Package Deployment Model command on the Project menu (or choose it from the context menu when you right-click the project in Solution Explorer) to switch to the package deployment model. The conversion works only if your project is compatible with the package deployment model. For example, it cannot use features that are exclusive to the project deployment model, such as parameters. After conversion, Solution Explorer displays an additional label after the project name to indicate the project is now configured as a package deployment model, as shown in Figure 6-16. Notice that the Parameters folder is no longer available in the project, while the Data Sources folder is now available in the project.

 

 

 

FIGURE 6-16 Package deployment model

 

Tip You can reverse the process by using the Project menu, or the project’s context

menu in Solution Explorer, to convert a package deployment model project to a project deployment

model.

 

Project Deployment Model Features

 

In this section, we provide an overview of the project deployment model features to help you understand how you use these features in combination to manage deployed projects. Later in this chapter, we explain each of these features in more detail and provide links to additional information available online.

 

Although you can continue to work with the package deployment model if you prefer, the primary

advantage of using the new project deployment model is the improvement in package management

across multiple environments. For example, a package is commonly developed on one server,

tested on a separate server, and eventually implemented on a production server. With the package

deployment model, you can use a variety of techniques to provide connection strings for the correct

environment at runtime, each of which requires you to create at least one configuration file and,

optionally, maintain SQL Server tables or environment variables. Although this approach is flexible,

it can also be confusing and prone to error. The project deployment model continues to separate

run-time values from the packages, but it uses object collections in the Integration Services catalog to

store these values and to define relationships between packages and these object collections, known

as parameters, environments, and environment variables.

 

 

 

■ Catalog The catalog is a dedicated database that stores packages and related configuration information accessed at package runtime. You can manage package configuration and execution by using the catalog's stored procedures and views or by using the graphical interface in SQL Server Management Studio (see the Transact-SQL sketch after this list).

■ Parameters As Table 6-1 shows, the project deployment model relies on parameters to change task properties during package execution. Parameters can be created within a project scope or within a package scope. When you create parameters within a project scope, you apply a common set of parameter values across all the packages contained in the project. You can then use parameters in expressions or tasks, much the same way that you use variables.

■ Environments Each environment is a container of variables you associate with a package at runtime. You can create multiple environments to use with a single package, but the package can use variables from only one environment during execution. For example, you can create environments for development, test, and production, and then execute a package using one of the applicable environments.

■ Environment variables An environment variable contains a literal value that Integration Services assigns to a parameter during package execution. After deploying a project, you can associate a parameter with an environment variable. The value of the environment variable resolves during package execution.
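The following is a minimal Transact-SQL sketch of running a deployed package through the catalog's stored procedures; the folder, project, and package names are assumptions for illustration:

-- Create an execution object for a package deployed to the SSISDB catalog.
DECLARE @execution_id BIGINT;
EXEC SSISDB.catalog.create_execution
    @folder_name = N'ETL',
    @project_name = N'SalesETL',
    @package_name = N'LoadSales.dtsx',
    @execution_id = @execution_id OUTPUT;

-- Start the execution object.
EXEC SSISDB.catalog.start_execution @execution_id;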

 

 

Project Deployment Workflow

 

The project deployment workflow includes not only the process of converting design-time objects in

SSDT into database objects stored in the Integration Services catalog, but also the process of retrieving

database objects from the catalog to update a package design or to use an existing package as a

template for a new package. To add a project to the catalog or to retrieve a project from the catalog, you use a project-deployment file that has an ISPAC file extension. There are four stages of the project deployment workflow in which the ISPAC file plays a role: build, deploy, import, and convert.

In this section, we review each of these stages.

 

Build

 

When you use the project deployment model for packages, you use SSDT to develop one or more

packages as part of an Integration Services project. In preparation for deployment to the catalog,

which serves as a centralized repository for packages and related objects, you build the Integration

Services project in SSDT to produce an ISPAC file. The ISPAC file is the project deployment file that

contains project information, all packages in the Integration Services project, and parameters.

 

Before performing the build, there are two additional tasks that might be necessary:

 

■ Identify entry-point package If one of the packages in the project is the package that triggers the execution of the other packages in the project, directly or indirectly, you should flag that package as an entry-point package. You can do this by right-clicking the package in Solution Explorer and selecting Entry-Point Package. An administrator uses this flag to identify the package to start when a project contains multiple packages.

■ Create project and package parameters You use project-level or package-level parameters to provide values for use in tasks or expressions at runtime, which you learn more about how to do later in this chapter in the “Parameters” section. In SSDT, you assign parameter values to use as a default. You can also mark a parameter as required, which prevents a package from executing until you assign a value to the parameter.

 

 

During the development process in SSDT, you commonly execute a task or an entire package within SSDT to test results before deploying the project. SSDT creates an ISPAC file to hold the information required to execute the package and stores it in the bin folder for the Integration Services project. When you finish development and want to prepare the ISPAC file for deployment, use the Build menu or press F5.

 

Deploy

 

The deployment process uses the ISPAC file to create database objects in the catalog for the project, packages, and parameters, as shown in Figure 6-17. To do this, you use the Integration Services Deployment Wizard, which prompts you for the project to deploy and the project to create or update as part of the deployment. You can also provide literal values or specify environment variables as default parameter values for the current project version. These parameter values that you provide in the wizard are stored in the catalog as server defaults for the project, and they override the default parameter values stored in the package.

 


 

FIGURE 6-17 Deployment of the ISPAC file to the catalog

 

You can launch the wizard from within SSDT by right-clicking the project in Solution Explorer and

selecting Deploy. However, if you have an ISPAC file saved to the file system, you can double-click the

file to launch the wizard.

 

 

 

Import

 

When you want to update a package that has already been deployed or to use it as basis for a new

package, you can import a project into SSDT from the catalog or from an ISPAC file, as shown in

Figure 6-18. To import a project, you use the Integration Services Import Project Wizard, which is

available in the template list when you create a new project in SSDT.

 


 

FIGURE 6-18 Import a project from the catalog or an ISPAC file

 

Convert

 

If you have legacy packages and configuration files, you can convert them to the latest version of

Integration Services, as shown in Figure 6-19. The Integration Services Project Conversion Wizard is

available in both SSDT and in SQL Server Management Studio. Another option is to use the Integration

Services Package Upgrade Wizard available on the Tools page of the SQL Server Installation

Center.

 


 

FIGURE 6-19 Convert existing DTSX files and configurations to an ISPAC file

 

 

 

Note You can use the Conversion Wizard to migrate packages created using SQL Server

2005 Integration Services and later. If you use SQL Server Management Studio, the original

DTSX files are not modified, but used only as a source to produce the ISPAC file containing

the upgraded packages.

 

In SSDT, open a package project, right-click the project in Solution Explorer, and select Convert To

Project Deployment Model. The wizard upgrades the DTPROJ file for the project and the DTSX files

for the packages.

 

The behavior of the wizard is different in SQL Server Management Studio. There you right-click the

Projects node of the Integration Services catalog in Object Explorer and select Import Packages. The

wizard prompts you for a destination location and produces an ISPAC file for the new project and the

upgraded packages.

 

Regardless of which method you use to convert packages, there are some common steps that

occur

as packages are upgraded:

 

■ Update Execute Package tasks If a package in a package project contains an Execute Package task, the wizard changes the external reference to a DTSX file to a project reference to a package contained within the same project. The child package must be in the same package project you are converting and must be selected for conversion in the wizard.

■ Create parameters If a package in a package project uses a configuration, you can choose to convert the configuration to parameters. You can add configurations belonging to other projects to include them in the conversion process. Additionally, you can choose to remove configurations from the upgraded packages. The wizard uses the configurations to prompt you for properties to convert to parameters, and it also requires you to specify project scope or package scope for each parameter.

■ Configure parameters The Conversion Wizard allows you to specify a server value for each parameter and whether to require the parameter at runtime.

 

 

Parameters

 

As we explained in the previous section, parameters are the replacement for configurations in legacy

packages, but only when you use the project deployment model. The purpose of configurations was

to provide a way to change values in a package at runtime without requiring you to open the package

and make the change directly. You can establish project-level parameters to assign a value to one

or more properties across multiple packages, or you can have a package-level parameter when you

need to assign a value to properties within a single package.

 

 

 

Project Parameters

 

A project parameter shares its values with all packages within the same project. To create a project

parameter in SSDT, follow these steps:

 

1. In Solution Explorer, double-click Project.params.

2. Click the Add Parameter button on the toolbar in the Project.params window.

3. Type a name for the parameter in the Name text box, select a data type, and specify a value

for the parameter as shown here. The parameter value you supply here is known as the design

default value.

 

 

 

 

Note The parameter value is a design-time value that can be overwritten during

or

after deployment to the catalog. You can use the Add Parameters To Configuration

button on the toolbar (the third button from the left) to add selected

parameters to

Visual Studio project configurations, which is useful for testing package executions

under

a variety of conditions.

 

4. Save the file.

 

 

Optionally, you can configure the following properties for each parameter:

 

■ Sensitive By default, this property is set to False. If you change it to True, the parameter value is encrypted when you deploy the project to the catalog. If anyone attempts to view the parameter value in SQL Server Management Studio or by accessing Transact-SQL views, the parameter value displays as NULL. This setting is important when you use a parameter to set a connection string property and the value contains specific credentials.

■ Required By default, this property is also set to False. When the value is True, you must configure a parameter value during or after deployment before you can execute the package. The Integration Services engine ignores the design default value that you specify on this screen when the Required property is True and you deploy the project to the catalog.

■ Description This property is optional, but it allows you to provide documentation to an administrator responsible for managing packages deployed to the catalog.

 

 

 

 

Package Parameters

 

Package parameters apply only to the package in which they are created and cannot be shared with

other packages. The center tab in the package designer allows you to access the Parameters window

for your package. The interface for working with package parameters is identical to the project parameters

interface.

 

Parameter Usage

 

After creating project or package parameters, you are ready to implement the parameters in your

package much like you implement variables. That is, anywhere you can use variables in expressions

for tasks, data flow components, or connection managers, you can also use parameters.

 

As one example, you can reference a parameter in expressions, as shown in Figure 6-20. Notice the

parameter appears in the Variables And Parameters list in the top left pane of the Expression Builder.

You can drag the parameter to the Expression text box and use it alone or as part of a more complex

expression. When you click the Evaluate Expression button, you can see the expression result based

on the design default value for the parameter.

 

 

 

FIGURE 6-20 Parameter usage in an expression

 

 

 

Note This expression uses a project parameter, which has a prefix of $Project. To create an expression that uses a package parameter, use the $Package prefix instead.

 

You can also directly set a task property by right-clicking the task and selecting Parameterize on

the context menu. The Parameterize dialog box displays as shown in Figure 6-21. You select a property,

and then choose whether to create a new parameter or use an existing parameter. If you create

a new parameter, you specify values for each of the properties you access in the Parameters window.

Additionally, you must specify whether to create the parameter within package scope or project

scope.

 

 

 

FIGURE 6-21 Parameterize task dialog box

 

Post-Deployment Parameter Values

 

The design default values that you set for each parameter in SSDT are typically used only to supply a

value for testing within the SSDT environment. You can replace these values during deployment by

specifying server default values when you use the Deployment Wizard or by configuring execution

values when creating an execution object for deployed projects.

 

 

 

Figure 6-22 illustrates the stage at which you create each type of parameter value. If a parameter

has no execution value, the Integration Services engine uses the server default value when executing

the package. Similarly, if there is no server default value, package execution uses the design default

value. However, if a parameter is marked as required, you must provide either a server default value

or an execution value.

 


 

FIGURE 6-22 Parameter values by stage

 

Note A package will fail when the Integration Services engine cannot resolve a parameter

value. For this reason, it is recommended that you validate projects and packages as

described

in the “Validation” section of this chapter.

 

Server Default Values

 

Server default values can be literal values or environment variable references (explained later in this chapter), which in turn are literal values. To configure server defaults in SQL Server Management Studio, you right-click the project or package in the Integration Services node in Object Explorer, select Configure, and change the Value property of the parameter, as shown in Figure 6-23. This server default value persists even if you make changes to the design default value in SSDT and redeploy the project.

 

 

 

 

 

FIGURE 6-23 Server default value configuration
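
You can also set a server default value through the Transact-SQL API instead of the Configure dialog box. The following sketch calls the catalog.set_object_parameter_value stored procedure; the folder, project, package, and parameter names are hypothetical placeholders (the pkgOptions parameter and the value 3 simply echo the example in Figure 6-22).

-- Sketch: set a server default value for a package parameter (names are hypothetical).
EXEC SSISDB.catalog.set_object_parameter_value
    @object_type     = 30,              -- 20 = project parameter, 30 = package parameter
    @folder_name     = N'Finance',
    @project_name    = N'LoadDW',
    @object_name     = N'Master.dtsx',  -- package name; required when object_type is 30
    @parameter_name  = N'pkgOptions',
    @parameter_value = 3,               -- becomes the server default value
    @value_type      = 'V';             -- 'V' = literal value, 'R' = environment variable reference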

 

Execution Parameter Values

 

The execution parameter value applies only to a specific execution of a package and overrides all other values. You must explicitly set the execution parameter value by using the catalog.set_execution_parameter_value stored procedure. There is no interface available in SQL Server Management Studio to set an execution parameter value.

set_execution_parameter_value [ @execution_id = ] execution_id
    , [ @object_type = ] object_type
    , [ @parameter_name = ] parameter_name
    , [ @parameter_value = ] parameter_value

 

To use this stored procedure, you must supply the following arguments:

 

■ execution_id You must obtain the execution_id for the instance of the execution. You can use the catalog.executions view to locate the applicable execution_id.

■ object_type The object type specifies whether you are setting a project parameter or a package parameter. Use a value of 20 for a project parameter and a value of 30 for a package parameter.

■ parameter_name The name of the parameter must match the parameter stored in the catalog.

■ parameter_value Here you provide the value to use as the execution parameter value.
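
As a concrete illustration, the following Transact-SQL sketch creates an execution for a deployed package, overrides a parameter for that execution only, and then starts it. The folder, project, package, and parameter names are hypothetical examples (the value 5 mirrors the execution value in Figure 6-22); catalog.create_execution, catalog.set_execution_parameter_value, and catalog.start_execution are part of the catalog's Transact-SQL API.

-- Sketch: override a parameter for a single execution (names are hypothetical).
DECLARE @execution_id BIGINT;

-- 1. Create the execution object for a deployed package.
EXEC SSISDB.catalog.create_execution
    @folder_name  = N'Finance',
    @project_name = N'LoadDW',
    @package_name = N'Master.dtsx',
    @execution_id = @execution_id OUTPUT;

-- 2. Set the execution parameter value (20 = project parameter, 30 = package parameter).
EXEC SSISDB.catalog.set_execution_parameter_value
    @execution_id    = @execution_id,
    @object_type     = 30,
    @parameter_name  = N'pkgOptions',
    @parameter_value = 5;

-- 3. Start the execution.
EXEC SSISDB.catalog.start_execution @execution_id;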

 

 

 

 

Integration Services Catalog

 

The Integration Services catalog is a new feature to support the centralization of storage and the

administration of packages and related configuration information. Each SQL Server instance can

host only one catalog. When you deploy a project using the project deployment model, the project

and its components are added to the catalog and, optionally, placed in a folder that you specify in

the Deployment

Wizard. Each folder (or the root level if you choose not to use folders) organizes its

contents

into two groups: projects and environments, as shown in Figure 6-24.

 


 

FIGURE 6-24 Catalog database objects

 

Catalog Creation

 

Installation of Integration Services on a server does not automatically create the catalog. To do this,

follow these steps:

 

1. In SQL Server Management Studio, connect to the SQL Server instance, right-click the

Integration

Services Catalogs folder in Object Explorer, and select Create Catalog.

2. In the Create Catalog dialog box, you can optionally select the Enable Automatic Execution Of

Integration Services Stored Procedure At SQL Server Startup check box. This stored procedure

performs a cleanup operation when the service restarts and adjusts the status of packages

that were executing when the service stopped.

3. Notice that the catalog database name cannot be changed from SSISDB, as shown in the

following

figure, so the final step is to provide a strong password and then click OK. The

password

creates a database master key that Integration Services uses to encrypt sensitive

data stored in the catalog.

 

 

 

 

 

 

After you create the catalog, you will see it appear twice as the SSISDB database in Object Explorer.

It displays under both the Databases node and the Integration Services node. In the Databases

node, you can interact with it as you would any other database, using the interface to explore

database

objects. You use the Integration Services node to perform administrative tasks.

 

Note In most cases, multiple options are available for performing administrative tasks with the catalog. You can use the graphical interface by opening the applicable dialog box for a selected catalog object, or you can use Transact-SQL views and stored procedures to view and modify object properties. For more information about the Transact-SQL API, see http://msdn.microsoft.com/en-us/library/ff878003(v=SQL.110).aspx. You can also use Windows PowerShell to perform administrative tasks by using the SSIS Catalog Managed Object Model. Refer to http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.management.integrationservices(v=sql.110).aspx for details about the API.
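
For example, the following Transact-SQL sketch uses two of the catalog's documented views, catalog.catalog_properties and catalog.operations, to inspect catalog settings and recent activity. Nothing beyond the views themselves is assumed.

-- Sketch: inspect catalog settings and recent operations through the Transact-SQL API.
USE SSISDB;

-- Current catalog properties (encryption algorithm, retention window, and so on).
SELECT property_name, property_value
FROM catalog.catalog_properties;

-- Operations recorded by the catalog, most recent first.
SELECT operation_id, operation_type, status, start_time, end_time
FROM catalog.operations
ORDER BY start_time DESC;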

 

Catalog Properties

 

The catalog has several configurable properties. To access these properties, right-click SSISDB under

the Integration Services node and select Properties. The Catalog Properties dialog box, as shown in

Figure 6-25, displays several properties.

 

 

 

 

 

FIGURE 6-25 Catalog Properties dialog box

 

Encryption

 

Notice in Figure 6-25 that the default encryption algorithm is AES_256. If you put the SSISDB database

in single-user mode, you can choose one of the other encryption algorithms available:

 

■ DES
■ TRIPLE_DES
■ TRIPLE_DES_3KEY
■ DESX
■ AES_128
■ AES_192

 

 

Integration Services uses encryption to protect sensitive parameter values. When anyone uses the SQL Server Management Studio interface or the Transact-SQL API to query the catalog, a sensitive parameter value displays only as NULL.

 

Operations

 

Operations include activities such as package execution, project deployment, and project validation,

to name a few. Integration Services stores information about these operations in tables in the catalog.

You can use the Transact-SQL API to monitor operations, or you can right-click the SSISDB database

on the Integration Services node in Object Explorer and select Active Operations. The Active

Operations

dialog box displays the operation identifier, its type, name, the operation start time,

and the caller of the operation. You can select an operation and click the Stop button to end the operation.

 

 

 

 

Periodically, older data should be purged from these tables to keep the catalog from growing unnecessarily large. By configuring the catalog properties, you specify how many days of operations data to retain, and the SQL Server Agent job that runs periodically purges data older than that retention window. If you prefer, you can disable the job.
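
The retention window is also exposed through the catalog.configure_catalog stored procedure, so you can script it. In the sketch below, the property name comes from the catalog.catalog_properties view and the value of 90 days is only an example.

-- Sketch: reduce the operations-log retention window (example value of 90 days).
USE SSISDB;

EXEC catalog.configure_catalog
    @property_name  = N'RETENTION_WINDOW',   -- number of days of operations data to keep
    @property_value = 90;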

 

Project Versioning

 

Each time you redeploy a project with the same name to the same folder, the previous version remains in the catalog; by default, up to ten versions are retained. If necessary, you can restore a previous version by following these steps:

 

1. In Object Explorer, locate the project under the SSISDB node.

2. Right-click the project, and select Versions.

3. In the Project Versions dialog box, shown here, select the version to restore and click the

Restore

To Selected Version button:

 

 

 

 

4. Click Yes to confirm, and then click OK to close the information message box. Notice the

selected version is now flagged as the current version, and that the other version remains

available as an option for restoring.

 

 

You can modify the maximum number of versions to retain by updating the applicable

catalog

property. If you increase this number above the default value of ten, you should continually

monitor the size of the catalog database to ensure that it does not grow too large. To manage the

size of the catalog, you can also decide whether to remove older versions periodically with a SQL

Server agent job.
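
If you prefer to script these housekeeping tasks, the following Transact-SQL sketch raises the version limit and restores an older project version. The folder name, project name, and version LSN are hypothetical placeholders; catalog.configure_catalog, catalog.object_versions, and catalog.restore_project are part of the catalog's Transact-SQL API.

-- Sketch: manage project versions through the Transact-SQL API (names and values are examples).
USE SSISDB;

-- Raise the number of retained versions from the default of ten to fifteen.
EXEC catalog.configure_catalog
    @property_name  = N'MAX_PROJECT_VERSIONS',
    @property_value = 15;

-- List the stored versions of a project to find the version LSN you want back.
SELECT object_version_lsn, object_name, created_time
FROM catalog.object_versions
WHERE object_name = N'LoadDW';

-- Restore the chosen version (the LSN below is a placeholder).
EXEC catalog.restore_project
    @folder_name        = N'Finance',
    @project_name       = N'LoadDW',
    @object_version_lsn = 3;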

 

 

 

Environment Objects

 

After you deploy projects to the catalog, you can create environments to work in tandem with

parameters to change parameter values at execution time. An environment is a collection of environment

variables. Each environment variable contains a value to assign to a parameter. To connect an

environment to a project, you use an environment reference. Figure 6-26 illustrates the relationship

between parameters, environments, environment variables, and environment references.

 


 

FIGURE 6-26 Environment objects in the catalog

 

Environments

 

One convention you can use is to create one environment for each server you will use for package

execution. For example, you might have one environment for development, one for testing, and one

for production. To create a new environment using the SQL Server Management Studio interface,

follow

these steps:

 

1. In Object Explorer, expand the SSISDB node and locate the Environments folder that

corresponds

to the Projects folder containing your project.

2. Right-click the Environments folder, and select Create Environment.

3. In the Create Environment dialog box, type a name, optionally type a description, and click

OK.

 

 

Environment Variables

 

For each environment, you can create a collection of environment variables. The properties you

configure

for an environment variable are the same ones you configure for a parameter, which is

understandable when you consider that you use the environment variable to replace the parameter

value at runtime. To create an environment variable, follow these steps:

 

1. In Object Explorer, locate the environment under the SSISDB node.

 

 

 

 

2. Right-click the environment, and select Properties to open the Environment Properties dialog

box.

3. Click Variables to display the list of existing environment variables, if any, as shown here:

 

 

 

 

4. On an empty row, type a name for the environment variable in the Name text box, select a

data type in the Type column, type a description (optional), type a value in the Value column

for the environment variable, and select the Sensitive check box if you want the value to be

encrypted in the catalog. Continue adding environment variables on this page, and click OK

when you’re finished.

5. Repeat this process by adding the same set of environment variables to other environments

you intend to use with the same project.

 

 

Environment References

 

To connect environment variables to a parameter, you create an environment reference. There are two types of environment references: relative and absolute. When you create a relative environment reference, the environment must reside in the same folder as the project. If you later move the project to another folder without also moving the environment, package execution will fail. An alternative is to use an absolute reference, which maintains the relationship between the environment and the project without requiring them to share the same parent folder.

 

The environment reference is a property of the project. To create an environment reference, follow

these steps:

 

1. In Object Explorer, locate the project under the SSISDB node.

2. Right-click the project, and select Configure to open the Configure <Project> dialog box.

3. Click References to display the list of existing environment references, if any.

 

 

 

 

4. Click the Add button and select an environment in the Browse Environments dialog box. Use

the Local Folder node for a relative environment reference, or use the SSISDB node for an

absolute environment reference.

 

 

 

 

5. Click OK twice to create the reference. Repeat steps 4 and 5 to add references for all other applicable environments.

6. In the Configure <Project> dialog box, click Parameters to switch to the parameters page.

7. Click the ellipsis button to the right of the Value text box to display the Set Parameter Value

dialog box, select the Use Environment Variable option, and select the applicable variable in

the drop-down list, as shown here:

 

 

 

 

8. Click OK twice.

 

 

 

 

You can create multiple references for a project, but only one environment will be active during

package execution. At that time, Integration Services will evaluate the environment variable based on

the environment associated with the current execution instance as explained in the next section.
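
If you want to script the environment objects described in this section, the following Transact-SQL sketch creates an environment, adds a variable, references the environment from a project, and maps a parameter to the variable. The folder, project, environment, parameter, and variable names are hypothetical (p1 and ev1 simply echo Figure 6-26), and the @reference_type argument in step 3 is an assumption you should verify against the create_environment_reference documentation.

-- Sketch: create and wire up environment objects (all names are hypothetical examples).
USE SSISDB;

-- 1. Create the environment in the same folder as the project (suitable for a relative reference).
EXEC catalog.create_environment
    @folder_name      = N'Finance',
    @environment_name = N'Production';

-- 2. Add an environment variable that will supply the parameter value at runtime.
EXEC catalog.create_environment_variable
    @folder_name      = N'Finance',
    @environment_name = N'Production',
    @variable_name    = N'ev1',
    @data_type        = N'Int32',
    @sensitive        = 0,
    @value            = 3,
    @description      = N'Value for the p1 parameter';

-- 3. Reference the environment from the project ('R' = relative, 'A' = absolute).
--    (Parameter name assumed; confirm against catalog.create_environment_reference in your build.)
DECLARE @reference_id BIGINT;
EXEC catalog.create_environment_reference
    @folder_name      = N'Finance',
    @project_name     = N'LoadDW',
    @environment_name = N'Production',
    @reference_type   = 'R',
    @reference_id     = @reference_id OUTPUT;

-- 4. Map the project parameter to the environment variable ('R' = referenced value).
EXEC catalog.set_object_parameter_value
    @object_type     = 20,
    @folder_name     = N'Finance',
    @project_name    = N'LoadDW',
    @parameter_name  = N'p1',
    @parameter_value = N'ev1',
    @value_type      = 'R';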

 

Administration

 

After the development and deployment processes are complete, it’s time to become familiar with the administration tasks that keep operations on the server running smoothly.

 

Validation

 

Before executing packages, you can use validation to verify that projects and packages are likely

to run successfully, especially if you have configured parameters to use environment variables. The

validation process ensures that server default values exist for required parameters, that environment

references are valid, and that data types for parameters are consistent between project and package

configurations and their corresponding environment variables, to name a few of the validation checks.

 

To perform the validation, right-click the project or package in the catalog, click Validate, and select the environments to include in the validation: all, none, or a specific environment. Validation occurs asynchronously, so the Validation dialog box closes while the validation processes. You can open the Integration Services Dashboard report to check the results of validation. Your other options are to right-click the SSISDB node in Object Explorer and select Active Operations, or to use the Transact-SQL API to monitor the validation operation.
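
You can also request a validation through the Transact-SQL API. The sketch below calls the catalog.validate_project stored procedure against a hypothetical folder and project, validates against all environment references, and then checks the outcome in the catalog.operations view (a validation is recorded as an operation).

-- Sketch: validate a deployed project through the Transact-SQL API (names are examples).
USE SSISDB;

DECLARE @validation_id BIGINT;
EXEC catalog.validate_project
    @folder_name       = N'Finance',
    @project_name      = N'LoadDW',
    @validate_type     = 'F',            -- full validation
    @validation_id     = @validation_id OUTPUT,
    @use32bitruntime   = 0,
    @environment_scope = 'A',            -- include all environment references
    @reference_id      = NULL;

-- Validation runs asynchronously; check its status afterward.
SELECT operation_id, status, start_time, end_time
FROM catalog.operations
WHERE operation_id = @validation_id;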

 

Package Execution

 

After deploying a project to the catalog and optionally configuring parameters and environment

references, you are ready to prepare your packages for execution. This step requires you to create

a SQL Server object called an execution. An execution is a unique combination of a package and its

corresponding

parameter values, whether the values are server defaults or environment references.

To configure and start an execution instance, follow these steps:

 

1. In Object Explorer, locate the entry-point package under the SSISDB node.

 

 

 

 

2. Right-click the package, and select Execute to open the Execute Package dialog box, shown here:

 

 

 

 

3. Here you have two choices. You can either click the ellipsis button to the right of the value and

specify a literal execution value for the parameter, or you can select the Environment check

box at the bottom of the dialog box and select an environment in the corresponding dropdown

list.

 

 

You can continue configuring the execution instance by updating properties on the Connection

Managers tab and by overriding property values and configuring logging on the Advanced tab. For

more information about the options available in this dialog box, see http://msdn.microsoft.com/en-us

/library/hh231080(v=SQL.110).aspx.

 

When you click OK to close the Execute Package dialog box, the package execution begins. Because package execution occurs asynchronously, the dialog box does not need to stay open during execution. You can use the Integration Services Dashboard report to monitor the execution status, or right-click the SSISDB node and select Active Operations. Another option is to use the Transact-SQL API to monitor an executing package.

 

More often, you will schedule package execution by creating a Transact-SQL script that starts the execution and saving the script to a file that you can then schedule by using a SQL Server Agent job. You add a job step using the Operating System (CmdExec) step type, and then configure the step to use the sqlcmd.exe utility and pass the package execution script to the utility as an argument. You run the job using the SQL Server Agent service account or a proxy account. Whichever account you use, it must have permissions to create and start executions.
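
As a rough sketch of that setup, the following Transact-SQL creates an Agent job with a CmdExec step that passes an execution script, such as the one shown earlier in this chapter, to sqlcmd.exe. The job name, server name, and file path are hypothetical examples.

-- Sketch: schedule the execution script with a SQL Server Agent CmdExec job step (names are examples).
USE msdb;

EXEC dbo.sp_add_job @job_name = N'Run LoadDW package';

EXEC dbo.sp_add_jobstep
    @job_name  = N'Run LoadDW package',
    @step_name = N'Start catalog execution',
    @subsystem = N'CmdExec',
    @command   = N'sqlcmd.exe -S MYSERVER -i "C:\Jobs\StartLoadDW.sql"';

-- Attach the job to the local server so SQL Server Agent will run it.
EXEC dbo.sp_add_jobserver @job_name = N'Run LoadDW package';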

 

 

 

Logging and Troubleshooting Tools

 

Now that Integration Services centralizes package storage and executions on the server and has

access to information generated by operations, server-based logging is supported and operations

reports are available in SQL Server Management Studio to help you monitor activity on the server and

troubleshoot problems when they occur.

 

Package Execution Logs

 

In legacy Integration Services packages, there are two options you can use to obtain logs during

package execution. One option is to configure log providers within each package and associate log

providers with executables within the package. The other option is to use a combination of Execute

SQL statements or script components to implement a custom logging solution. Either way, the steps

necessary to enable logging are tedious in legacy packages.

 

With no configuration required, Integration Services stores package execution data in the [catalog].[executions] table. The most important columns in this table include the start and end times of package execution, as well as the status. However, the logging mechanism also captures information related to the Integration Services environment, such as physical memory, the page file size, and available CPUs. Other tables provide access to parameter values used during execution, the duration of each executable within a package, and messages generated during package execution. You can easily write ad hoc queries to explore package logs or build your own custom reports using Reporting Services for ongoing monitoring of the log files.
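
For example, a query like the following sketch pulls the essentials from the execution log. The catalog.executions and catalog.operation_messages views are documented; the 30-day filter and the execution_id value are just placeholders.

-- Sketch: review recent package executions and their messages (example filter of 30 days).
USE SSISDB;

SELECT e.execution_id,
       e.folder_name,
       e.project_name,
       e.package_name,
       e.status,          -- for example, 4 = failed, 7 = succeeded
       e.start_time,
       e.end_time
FROM catalog.executions AS e
WHERE e.start_time > DATEADD(DAY, -30, SYSDATETIMEOFFSET())
ORDER BY e.start_time DESC;

-- Messages generated during a specific execution (operation_id = execution_id).
SELECT message_time, message_type, message
FROM catalog.operation_messages
WHERE operation_id = 12345;   -- placeholder execution_id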

 

Note For a thorough walkthrough of the various tables in which package execution log

data is stored, see “SSIS Logging in Denali,” a blog post by Jamie Thomson at

http://sqlblog.com/blogs/jamie_thomson/archive/2011/07/16/ssis-logging-in-denali.aspx.

 

Data Taps

 

A data tap is similar in concept to a data viewer, except that it captures data at a specified point in the pipeline during package execution outside of SSDT. You can use the T-SQL stored procedure catalog.add_data_tap to tap into the data flow during execution if the package has been deployed to the catalog. The captured data from the data flow is stored in a CSV file you can review after package execution completes. No changes to your package are necessary to use this feature.
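
The sketch below adds a data tap to an execution that has been created but not yet started. The execution_id, task package path, data flow path identification string, and file name are hypothetical examples you would replace with values from your own package.

-- Sketch: capture rows flowing down a data flow path to a CSV file (all values are examples).
USE SSISDB;

DECLARE @execution_id BIGINT = 12345;   -- placeholder: an execution created but not yet started

-- Add the data tap before calling catalog.start_execution.
EXEC catalog.add_data_tap
    @execution_id            = @execution_id,
    @task_package_path       = N'\Package\Data Flow Task',
    @dataflow_path_id_string = N'Paths[OLE DB Source.OLE DB Source Output]',
    @data_filename           = N'TappedRows.csv',
    @max_rows                = 1000;

EXEC catalog.start_execution @execution_id;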

 

Reports

 

Before you build custom reports from the package execution log tables, review the built-in

reports

now available in SQL Server Management Studio for Integration Services. These reports provide

information

on package execution results for the past 24 hours (as shown in Figure 6-27),

performance,

and error messages from failed package executions. Hyperlinks in each report allow

you to drill through from summary to detailed information to help you diagnose package execution problems.

 

 

 

 

 

 

FIGURE 6-27 Integration Services operations dashboard.

 

To view the reports, you right-click the SSISDB node in Object Explorer, point to Reports, point to

Standard Reports, and then choose from the following list of reports:

 

■ All Executions
■ All Validations
■ All Operations
■ Connections

 

 

 

 

Security

 

Packages and related objects are stored securely in the catalog using encryption. Only members of

the new SQL Server database role ssis_admin or members of the existing sysadmin role have permissions

to all objects in the catalog. Members of these roles can perform operations such as creating

the catalog, creating folders in the catalog, and executing stored procedures, to name a few.

 

Members of the administrative roles can delegate administrative permissions to users who need to manage a specific folder. Delegation is useful when you do not want to give these users access to the higher-privileged roles. To give a user folder-level access, you grant the MANAGE_OBJECT_PERMISSIONS permission to the user.

 

For general permissions management, open the Properties dialog box for a folder (or any other

securable object) and go to the Permissions page. On that page, you can select a security principal

by name and then set explicit Grant or Deny permissions as appropriate. You can use this method to

secure folders, projects, environments, and operations.
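
A minimal sketch of scripting such a grant is shown below. It assumes the catalog.grant_permission stored procedure and its numeric codes (object_type 1 for a folder and permission_type 104 for MANAGE_OBJECT_PERMISSIONS); verify these values against the SSISDB documentation for your build, and substitute your own folder and user names.

-- Sketch: delegate folder management to a database user (codes assumed; verify in the documentation).
USE SSISDB;

DECLARE @folder_id BIGINT =
    (SELECT folder_id FROM catalog.folders WHERE name = N'Finance');   -- hypothetical folder

EXEC catalog.grant_permission
    @object_type     = 1,      -- 1 = folder (assumed mapping)
    @object_id       = @folder_id,
    @principal_id    = DATABASE_PRINCIPAL_ID(N'FinanceETLAdmin'),      -- hypothetical user
    @permission_type = 104;    -- 104 = MANAGE_OBJECT_PERMISSIONS (assumed mapping)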

 

Package File Format

 

Although legacy packages stored as DTSX files are formatted as XML, their structure is not compatible

with differencing tools and source-control systems you might use to compare packages. In the

current

version of Integration Services, the package file format is pretty-printed, with properties

formatted

as attributes rather than as elements. (Pretty-printing is the enhancement of code with

syntax conventions for easier viewing.) Moreover, attributes are listed alphabetically, and attributes configured with default values have been eliminated. Collectively, these changes not only help you locate information in the file more easily, but also make it easier to compare packages with automated tools and to merge packages that have no conflicting changes more reliably.

 

Another significant change to the package file format is the replacement of the meaningless

numeric lineage identifiers with a refid attribute with a text value that represents the path to the

referenced object. For example, a refid for the first input column of an Aggregate transformation in a

data flow task called Data Flow Task in a package called Package looks like this:

 

Package\Data Flow Task\Aggregate.Inputs[Aggregate Input 1].Columns[LineTotal]

 

Last, annotations are no longer stored as binary streams. Instead, they appear in the XML file as clear text. With better access to annotations in the file, it becomes easier to extract them programmatically from a package for documentation purposes.

 

 

 


 

CHAPTER 7

 

Data Quality Services

 

The quality of data is a critical success factor for many data projects, whether for general business

operations or business intelligence. Bad data creeps into business applications as a result of user

entry, corruption during transmission, business processes, or even conflicting data standards across

data sources. The Data Quality Services (DQS) feature of Microsoft SQL Server 2012 is a set of

technologies

you use to measure and manage data quality through a combination of computer-assisted

and manual processes. When your organization has access to high-quality data, your business

processes can operate more effectively, and managers can rely on this data for better decision-making.

By centralizing

data quality management, you also reduce the amount of time that people spend

reviewing

and correcting data.

 

Data Quality Services Architecture

 

In this section, we describe the two primary components of DQS: the Data Quality Server and Data

Quality Client. The DQS architecture also includes components that are built into other SQL Server

2012 features. For example, Integration Services has the DQS Cleansing transformation you use to

apply data-cleansing rules to a data flow pipeline. In addition, Master Data Services supports DQS

matching so that you can de-duplicate data before adding it as master data. We explain more about

these components in the “Integration” section of this chapter. All DQS components can coexist on the

same server, or they can be installed on separate servers.

 

Data Quality Server

 

The Data Quality Server is the core component of the architecture that manages the storage of

knowledge and executes knowledge-related processes. It consists of a DQS engine and multiple

databases

stored in a local SQL Server 2012 instance. These databases contain knowledge bases,

stored procedures for managing the Data Quality Server and its contents, and data about cleansing,

matching, and data-profiling activities.

 

Installation of the Data Quality Server is a multistep process. You start by using SQL Server Setup

and, at minimum, selecting the Database Engine and Data Quality Services on the Feature Selection

page. Then you continue installation by opening the Data Quality Services folder in the Microsoft SQL

Server 2012 program group on the Start menu, and launching Data Quality Server Installer. A command

window opens, and a prompt appears for the database master key password. You must supply

 

 

 


 

a strong password having at least eight characters and including at least one uppercase letter, one

lowercase letter, and one special character.

 

After you provide a valid password, installation of the Data Quality Server continues for several

minutes. In addition to creating and registering assemblies on the server, the installation process

creates

the following databases on a local instance of SQL Server 2012:

 

■ DQS_MAIN As its name implies, this is the primary database for the Data Quality Server. It contains the published knowledge bases as well as the stored procedures that support the DQS engine. Following installation, this database also contains a sample knowledge base called DQS Data, which you can use to cleanse country data or data related to geographical locations in the United States.

■ DQS_PROJECTS This database is for internal use by the Data Quality Server to store data related to managing knowledge bases and data quality projects.

■ DQS_STAGING_DATA You can use this database as intermediate storage for data that you want to use as source data for DQS operations. DQS can also use this database to store processed data that you can later export.

 

 

Before users can use client components with the Data Quality Server, a user with sysadmin

privileges

must create a SQL Server login for each user and map each user to the DQS_MAIN

database

using one of the following database roles created at installation of the Data Quality Server:

 

■ dqs_administrator A user assigned to this role has all privileges available to the other roles plus full administrative privileges, with the exception of adding new users. Specifically, a member of this role can stop any activity or stop a process within an activity and perform any configuration task using Data Quality Client.

■ dqs_kb_editor A member of this role can perform any DQS activity except administration. A user must be a member of this role to create or edit a knowledge base.

■ dqs_kb_operator This role is the most limited of the database roles, allowing its members only to edit and execute data quality projects and to view activity-monitoring data.

 

 

Important You must use SQL Server Configuration Manager to enable the TCP/IP protocol

for the SQL Server instance hosting the DQS databases before remote clients can connect

to the Data Quality Server.

 

Data Quality Client

 

Data Quality Client is the primary user interface for the Data Quality Server that you install as a stand-alone application. Business users can use this application to interactively work with data quality projects, such as cleansing or data profiling. Data stewards use Data Quality Client to create or maintain knowledge bases, and DQS administrators use it to configure and manage the Data Quality Server.

 

 

 


 

To install this application, use SQL Server Setup and select Data Quality Client on the Feature

Selection

page. It requires the Microsoft .NET Framework 4.0, which installs automatically if necessary.

 

Note If you plan to import data from Microsoft Excel, you must install Excel on the Data

Quality Client computer. DQS supports both 32-bit and 64-bit versions of Excel 2003, but it

supports only the 32-bit version of Excel 2007 or 2010 unless you save the workbook as an

XLS or CSV file.

 

If you have been assigned to one of the DQS database roles, or if you have sysadmin privileges

on the SQL Server instance hosting the Data Quality Server, you can open Data Quality Client, which

is found in the Data Quality Services folder of the Microsoft SQL Server 2012 program group on the

Start menu. You must then identify the Data Quality Server to establish the client-server connection. If

you click the Options button, you can select a check box to encrypt the connection.

 

After opening Data Quality Client, the home screen provides access to the following three types of

tasks:

 

■ Knowledge Base Management You use this area of Data Quality Client to create a new knowledge base, edit an existing knowledge base, use knowledge discovery to enhance a knowledge base with additional values, or create a matching policy for a knowledge base.

■ Data Quality Projects In this area, you create and run data quality projects to perform data-cleansing or data-matching tasks.

■ Administration This area allows you to view the status of knowledge base management activities, data quality projects, and DQS Cleansing transformations used in Integration Services packages. In addition, it provides access to configuration properties for the Data Quality Server, logging, and reference data services.

 

 

Knowledge Base Management

 

In DQS, you create a knowledge base to store information about your data, including valid and invalid

values and rules to apply for validating and correcting data. You can generate a knowledge base from

sample data, or you can manually create one. You can reuse a knowledge base with multiple data

quality projects and enhance it over time with the output of cleansing and matching projects. You

can give responsibility for maintaining the knowledge base to data stewards. There are three activities

that you or data stewards perform using the Knowledge Base Management area of Data Quality

Client: Domain Management, Knowledge Discovery, and Matching Policy.

 

Domain Management

 

After creating a knowledge base, you manage its contents and rules through the Domain

Management

activity. A knowledge base is a logical collection of domains, with each domain

corresponding

to a single field. You can create separate knowledge bases for customers and products,

 

 

 

or you can combine these subject areas into a single knowledge base. After you create a domain, you

define trusted values, invalid values, and examples of erroneous data. In addition to this set of values,

you also manage properties and rules for a domain.

 

To prevent potential conflicts resulting from multiple users working on the same knowledge base

at the same time, DQS locks the knowledge base when you begin a new activity. To unlock an activity,

you must publish or discard the results of your changes.

 

Note If you have multiple users who have responsibility for managing knowledge, keep

in mind that only one user at a time can perform the Domain Management activity for a

single knowledge base. Therefore, you might consider using separate knowledge bases for

each area of responsibility.

 

DQS Data Knowledge Base

 

Before creating your own knowledge base, you can explore the automatically installed knowledge

base, DQS Data, to gain familiarity with the Data Quality Client interface and basic knowledge-base

concepts. By exploring DQS Data, you can learn about the types of information that a knowledge

base can contain.

 

To get started, open Data Quality Client and click the Open Knowledge Base button on the home

page. DQS Data displays as the only knowledge base on the Open Knowledge Base page. When

you select DQS Data in the list of existing knowledge bases, you can see the collection of domains

associated

with this knowledge base, as shown in Figure 7-1.

 

 

 

FIGURE 7-1 Knowledge base details for DQS Data

 

 

 

Choose the Domain Management activity in the bottom right corner of the window, and click

Next to explore the domains in this knowledge base. In the Domain list on the Domain Management

page, you choose a domain, such as Country/Region, and access separate tabs to view or change the

knowledge associated with that domain. For example, you can click on the Domain Values tab to see

how values that represent a specific country or region will be corrected when you use DQS to perform

data cleansing, as shown in Figure 7-2.

 

 

 

FIGURE 7-2 Domain values for the Country/Region domain

 

The DQS Data knowledge base, by default, contains only domain values. You can add other

knowledge

to make this data more useful for your own data quality projects. For example, you could

add domain rules to define conditions that identify valid data or create term-based relations to

define how to make corrections to terms found in a string value. More information about these other

knowledge

categories is provided in the “New Knowledge Base” section of this chapter.

 

Tip When you finish reviewing the knowledge base, click the Cancel button and click Yes

to confirm that you do not want to save your work.

 

New Knowledge Base

 

On the home page of Data Quality Client, click the New Knowledge Base button to start the process

of creating a new knowledge base. At a minimum, you provide a name for the knowledge base, but

you can optionally add a description. Then you must specify whether you want to create an empty

knowledge base or create your knowledge base from an existing one, such as DQS Data. Another

option is to create a knowledge base by importing knowledge from a DQS file, which you create by

using the export option from any existing knowledge base.

 

 

 

After creating the knowledge base, you select the Domain Management activity, and click Next so

that you can add one or more domains in the knowledge base. To add a new domain, you can create

the domain manually or import a domain from an existing knowledge base using the applicable

button

in the toolbar on the Domain Management page. As another option, you can create a domain

from the output of a data quality project.

 

Domain When you create a domain manually, you start by defining the following properties for the

domain (as shown in Figure 7-3):

 

■ Domain Name The name you provide for the domain must be unique within the knowledge base and must be 256 characters or less.

■ Description You can optionally add a description to provide more information about the contents of the domain. The maximum number of characters for the description is 2048.

■ Data Type Your options here are String, Date, Integer, or Decimal.

■ Use Leading Values When you select this option, the output from a cleansing or matching data quality project will use the leading value in a group of synonyms. Otherwise, the output will be the input value or its corrected value.

■ Normalize String This option appears only when you select String as the Data Type. You use it to remove special characters from the domain values during the data-processing stage of knowledge discovery, data cleansing, and matching activities. Normalization might be helpful for improving the accuracy of matches when you want to de-duplicate data because punctuation might be used inconsistently in strings that otherwise are a match.

■ Format Output To When you output the results of a data quality project, you can apply formatting to the domain values if you change this setting from None to one of the available format options, which will depend on the domain’s data type. For example, you could choose Mon-yyyy for a Date data type, #,##0.00 for a Decimal data type, #,##0 for an Integer data type, or Capitalize for a String data type.

■ Language This option applies only to String data types. You use it to specify the language to apply when you enable the Speller.

■ Enable Speller You can use this option to allow DQS to check the spelling of values for a domain with a String data type. The Speller will flag suspected syntax, spelling, and sentence-structure errors with a red underscore when you are working on the Domain Values or Term-Based Relations tabs of the Domain Management activity, the Manage Domain Values step of the Knowledge Discovery activity, or the Manage And View Results step of the Cleansing activity.

■ Disable Syntax Error Algorithms By default, DQS checks string values for syntax errors before adding each value to the domain during data cleansing; select this option if you do not want that check to occur.

 

 

 

 

 

 

FIGURE 7-3 Domain properties in the Create Domain dialog box

 

Domain Values After you create the domain, you can add knowledge to the domain by setting

domain values. Domain

values represent the range of possible values that might exist in a data source

for the current domain, including both correct and incorrect values. The inclusion of incorrect domain

values in a knowledge base allows you to establish rules for correcting those values during cleansing

activities.

 

You use buttons on the Domain Values tab to add a new domain value manually, import new valid

values from Excel, or import new string values with type Correct or Error from a cleansing data quality

project.

 

Tip To import data from Excel, you can use any of the following file types: XLS, XLSX, or

CSV. DQS attempts to add every value found in the file, but only if the value does not already

exist in the domain. DQS imports values in the first column as domain values and

values in other columns as synonyms, setting the value in the first column as the leading

value. DQS will not import a value if it violates a domain rule, if the data type does not

match that of the domain, or if the value is null.

 

When you add or import values, you can adjust the Type to one of the following settings for each

domain value:

 

■ Correct You use this setting for domain values that you know are members of the domain and have no syntax errors.

■ Error You assign a Type of Error to a domain value that you know is a member of the domain but has an incorrect value, such as a misspelling or undesired abbreviation. For example, in a Product domain, you might add a domain value of Mtn 500 Silver 52 with the Error type and include a corrected value of Mountain-500 Silver, 52. By adding known error conditions to the domain, you can speed up the data cleansing process by having DQS automatically identify and fix known errors, which allows you to focus on new errors found during data processing. However, you can flag a domain value as an Error value without providing a corrected value.

■ Invalid You designate a domain value as Invalid when it is not a member of the domain and you have no correction to associate with it. For example, a value of United States would be an invalid value in a Product domain.

 

 

After you publish your domain changes to the knowledge base and later return to the Domain

Values tab, you can see the relationship between correct and incorrect values in the domain values

list, as shown in Figure 7-4 for the product Adjustable Race. Notice also the underscore in the user

interface to identify a potential misspelling of “Adj Race.”

 

 

 

FIGURE 7-4 Domain value with Error state and corrected value

 

If there are multiple correct domain values that correspond to a single entity, you can organize

them as a group of synonyms by selecting them and clicking the Set Selected Domain Values As

Synonyms button in the Domain Values toolbar. Furthermore, you can designate one of the synonyms

as a leading value by right-clicking the value and selecting Set As Leading in the context menu.

When DQS encounters any of the other synonyms in a cleansing activity, it replaces the nonleading

synonym

values with the leading value. Figure 7-5 shows an example of synonyms for the leading

value Road-750 Black, 52 after the domain changes have been published.

 

 

 

 

 

FIGURE 7-5 Synonym values with designation of a leading value

 

Data Quality Client keeps track of the changes you make to domain values during your current session. To review your work, you click the Show/Hide The Domain Values Changes History Panel button, which is the last button in the Domain Values toolbar.

 

Term-Based Relations Another option you have for adding knowledge to a domain is term-based

relations, which you use to make it easier to find and correct common occurrences in your domain

values. Rather than set up synonyms for variations of a domain value on the Domain Values tab, you

can define a list of string values and specify the corresponding Correct To value on the Term-Based

Relations tab. For example, in the product domain, you could have various products that contain the

abbreviation Mtn, such as Mtn-500 Silver, 40 and Mtn End Caps. To have DQS automatically correct

any occurrence of Mtn within a domain value string to Mountain, you can create a term-based

relation

by specifying a Value/Correct To pair, as shown in Figure 7-6.

 

 

 

FIGURE 7-6 Term-based relation with a value paired to a corrected value

 

 

 

Reference Data You can subscribe to a reference data service (RDS) that DQS uses to cleanse,

standardize,

and enhance your data. Before you can set the properties on the Reference Data tab

for your knowledge base, you must configure the RDS provider as described in the “Configuration”

section

of this chapter.

 

After you configure your DataMarket Account key in the Configuration area, you click the Browse

button on the Reference Data tab to select a reference data service provider for which you have a

current subscription. On the Online Reference Data Service Providers Catalog page, you map the

domain

to an RDS schema column. The letter M displays next to mandatory columns in the RDS

schema that you must include in the mapping.

 

Next, you configure the following settings for the RDS (as shown in Figure 7-7):

 

■ Auto Correction Threshold Specify the threshold for the confidence score. During data cleansing, DQS autocorrects records having a score higher than this threshold.

■ Suggested Candidates Specify the number of candidates for suggested values to retrieve from the reference data service.

■ Min Confidence Specify the threshold for the confidence score for suggestions. During data cleansing, DQS ignores suggestions with a score lower than this threshold.

 

 

 

 

FIGURE 7-7 Reference data service provider settings

 

Domain Rules You use domain rules to establish the conditions that determine whether a domain

value is valid. However, note that domain rules are not used to correct data.

 

After you click the Add A New Domain Rule button on the Domain Rules tab, you provide a name

for the rule and an optional description. Then you define the conditions for the rule in the Build A

Rule pane. For example, if there is a maximum length for a domain value, you can create a rule by

 

 

 

selecting Length Is Less Than Or Equal To in the rule drop-down list and then typing in the condition

value, as shown in Figure 7-8.

 

 

 

FIGURE 7-8 Domain rule to validate the length of domain values

 

You can create compound rules by adding multiple conditions to the rule and specifying whether

the conditions have AND or OR logic. As you build the rule, you can use the Run The Selected Domain

Rule On Test Data button. You must manually enter one or more values as test data for this procedure,

and then click the Test The Domain Rule On All The Terms button. You will see icons display to

indicate whether a value is correct, in error, or invalid. Then when you finish building the rule, you

click the Apply All Rules button to update the status of domain values according to the new rule.

 

Note You can temporarily disable a rule by clearing its corresponding Active check box.

 

Composite Domain Sometimes a complex string in your data source contains multiple terms, each

of which has different rules or different data types and thus requires separate domains. However, you

still need to validate the field as a whole. For example, with a product name like Mountain-500 Black,

40, you might want to validate the model of Mountain-500, the color Black, and the size 40 separately

to confirm the validity of the product name. To address this situation, you can create a composite

domain.

 

Before you can create a composite domain, you must have at least two domains in your knowledge

base. Begin the Domain Management activity, and click the Create A Composite Domain button on

the toolbar. Type a name for the composite domain, provide a description if you like, and then select

the domains to include in the composite domain.

 

 

 

Note After you add a domain to a composite domain, you cannot add it to a second composite

domain.

 

 

CD Properties You can always change, add, or remove domains in the composite domain on the CD

Properties tab. You can also select one of the following parsing methods in the Advanced section of

this tab:

 

■ Reference Data If you map the composite domain to an RDS, you can specify that DQS use the RDS for parsing each domain in the composite domain.

■ In Order You use this setting when you want DQS to parse the field values in the same order that the domains are listed in the composite domain.

■ Delimiters When your field contains delimiters, you can instruct DQS to parse values based on the delimiter you specify, such as a tab, comma, or space. When you use delimiters for parsing, you have the option to use knowledge-base parsing. With knowledge-base parsing, DQS identifies the domains for known values in the string and then uses domain knowledge to determine how to add unknown values to other domains. For example, let’s say that you have a field in the data source containing the string Mountain-500 Brown, 40. If DQS recognizes Mountain-500 as a value in the Model domain and 40 as a value in the Size domain, but it does not recognize Brown in the Color domain, it will add Brown to the Color domain.

 

 

Reference Data You use this tab to specify an RDS provider and map the individual domains of

a composite domain to separate fields of the provider’s RDS schema. For example, you might have

company information for which you want to validate address details. You combine the domains

Address,

City, State, and Zip to create a composite domain that you map to the RDS schema. You then

create a cleansing project to apply the RDS rules to your data.

 

CD Rules You use the CD Rules tab to define cross-domain rules for validating, correcting, and standardizing

domain values for a composite domain. A cross-domain rule uses similar conditions available

to domain rules, but this type of rule must hold true for all domains in the composite domain

rather than for a single domain. Each rule contains an If clause and a Then clause, and each clause

contains one or more conditions and is applicable to separate domains. For example, you can develop

a rule that invalidates a record if the Size value is not S, M, or L when the Model value begins with

Mountain-, as shown in Figure 7-9. As with domain rules, you can create rules with multiple conditions

and you can test a rule before finalizing its definition.

 

 

 

 

 

FIGURE 7-9 Cross-domain rule to validate corresponding size values in composite domain

 

Note If you create a Then clause that uses a definitive condition, DQS will not apply the

rule to both domain values and their synonyms. Definitive conditions are Value Is Equal To,

Value Is Not Equal To, Value Is In, or Value Is Not In. Furthermore, if you use the Value Is

Equal To condition for the Then clause, DQS not only validates data, but also corrects data

using this rule.

 

Value Relations After completing a knowledge-discovery activity, you can view the number of

occurrences

for each combination of values in a composite domain, as shown in Figure 7-10.

 

 

 

FIGURE 7-10 Value relations for a composite domain

 

Linked Domain You can use a linked domain to handle situations that a regular domain cannot support. For example, if you create a data quality project for a data source that has two fields that use the same domain values, you must set up two domains to complete the mapping of fields to domains. Rather than maintain two domains with the same values, you can create a linked domain that inherits the properties, values, and rules of another domain.

 

One way to create a linked domain is to open the knowledge base containing the source domain,

right-click the domain, and select Create A Linked Domain. Another way to create a linked domain is

during an activity that requires you to map fields. You start by mapping the first field to a domain and

then attempt to map the second field to the same domain. Data Quality Client prompts you to create

a linked domain, at which time you provide a domain name and description.

 

 

 

After you create the linked domain, you can use the Domain Management activity to complete tasks such as adding domain values or setting up domain rules by accessing either domain. The changes are made automatically in the other domain. However, you can change the domain properties in the original domain only.

 

End of Domain Management Activity

 

DQS locks the knowledge base when you begin the Domain Management activity to prevent others

from making conflicting changes. If you cannot complete all the changes you need to make in a single

session, you click the Close button to save your work and keep the knowledge base locked. Click the

Finish button when your work is complete. Data Quality Client displays a prompt for you to confirm

the action to take. Click Publish to make your changes permanent, unlock the database, and make the

knowledge base available to others. Otherwise, click No to save your work, keep the database locked,

and exit the Domain Management activity.

 

Knowledge Discovery

 

As an alternative to manually adding knowledge to your knowledge base, you can use the Knowledge

Discovery activity to partially automate that process. You can perform this activity multiple times, as

often as needed, to add domain values to the knowledge base from one or more data sources.

 

To start, you open the knowledge base in the Domain Management area of Data Quality Client and select the Knowledge Discovery activity. Then you complete a series of three steps: Map, Discover, and Manage Domain Values.

 

Map

 

In the Map step, you identify the source data you want DQS to analyze. This data must be available

on the Data Quality Server, either in a SQL Server table or view or in an Excel file. However, it does not

need to be from the same source that you intend to cleanse or de-duplicate with DQS.
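If the data you want to analyze is not already on that instance, one option is to copy it there and expose just the relevant columns through a view. The following is a minimal T-SQL sketch under that assumption; the table and column names (dbo.Customers, CustomerName, and so on) are hypothetical:

-- Limit discovery to the columns you plan to map to domains.
CREATE VIEW dbo.CustomerDiscoverySource
AS
SELECT CustomerName, City, StateProvince, PostalCode
FROM dbo.Customers;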

 

The purpose of this step is to map columns in the source data to a domain or composite domain in

the knowledge base, as shown in Figure 7-11. You can choose to create a new domain or composite

domain at this point when necessary. When you finish mapping all columns, click the Next button to

proceed to the Discover step.

 

 

 

 

 

FIGURE 7-11 Mapping the source column to the domain for the Knowledge Discovery activity

 

Discover

 

In the Discover step, you use the Start button to begin the knowledge discovery process. As the process executes, the Discover step displays the current status for the following three phases of processing (as shown in Figure 7-12):

 

■ Pre-processing Records During this phase, DQS loads and indexes records from the source in preparation for profiling the data. The status of this phase displays as the number of pre-processed records compared to the total number of records in the source. In addition, DQS updates all data-profiling statistics except the Valid In Domain column. These statistics are visible in the Profiler tab and include the total number of records in the source, the total number of values for the domain by field, and the number of unique values by field. DQS also compares the values from the source with the values from the domain to determine which values are found only in the source and identified as new.
■ Running Domain Rules DQS uses the domain rules for each domain to update the Valid In Domain column in the Profiler, and it displays the status as a percentage of completion.
■ Running Discovery DQS analyzes the data to add to the Manage Domain Values step and identifies syntax errors. As this phase executes, the current status displays as a percentage of completion.

 

 

 

 

 

 

FIGURE 7-12 Source statistics resulting from data discovery analysis

 

You use the source statistics on the Profiler tab to assess the completeness and uniqueness of the

source data. If the source yields few new values for a domain or has a high number of invalid values,

you might consider using a different source. The Profiler tab might also display notifications, as shown

in Figure 7-13, to alert you to such conditions.

 

 

 

FIGURE 7-13 Notifications and source statistics display on the Profiler tab

 

Manage Domain Values

 

In the third step of the Knowledge Discovery activity, you review the results of the data-discovery

analysis. The unique new domain values display in a list with a Type setting and suggested corrections,

where applicable. You can make any necessary changes to the values, type, and corrected

values, and you can add new domain values, delete values, and work with synonyms just like you can

when working on the Domain Values tab of the Domain Management activity.

 

 

 

Notice in Figure 7-14 that several product models beginning with Mountain- were marked as

Error and DQS proposed a corrected value of Mountain-500 for each of the new domain values. It did

this because Mountain-500 was the only pre-existing product model in the domain. DQS determined

that similar product models found in the source must be misspellings and proposed corrections to

the source values to match them to the pre-existing domain value. In this scenario, if the new product

models are all correct, you can change the Type setting to Correct. When you make this change, Data

Quality Client automatically removes the Correct To value.

 

 

 

FIGURE 7-14 The Knowledge Discovery activity produces suggested corrections

 

After reviewing each domain and making corrections, click the Finish button to end the Knowledge

Discovery activity. Then click the Publish button to complete the activity, update the knowledge base

with the new domain values, and leave the knowledge base in an unlocked state. If you click the No

button instead of the Publish button, the results of the activity are discarded and the knowledge base

is unlocked.

 

Matching Policy

 

Another aspect of adding knowledge to a knowledge base is defining a matching policy. This policy

is necessary for data quality projects that use matching to correct data problems such as misspelled

customer names or inconsistent address formats. A matching policy contains one or more matching

rules that DQS uses to determine the probability of a match between two records.

 

 

 

You begin by opening a knowledge base in the Domain Management area of Data Quality Client

and selecting the Matching Policy activity. The process to create a matching policy consists of three

steps: Map, Matching Policy, and Matching Results.

 

Tip You might consider creating a knowledge base that you use only for matching projects. In this matching-only knowledge base, include only domains whose values are discrete and uniquely identify a record, such as names and addresses.

 

Map

 

The first step of the Matching Policy activity is similar to the first step of the Knowledge Discovery

activity. You start the creation of a matching policy by mapping a field from an Excel or SQL Server

data source to a domain or composite domain in the selected knowledge base. If a corresponding

domain does not exist in the knowledge base, you have the option to create one. You must select a

source field in this step if you want to reference that field in a matching rule in the next step. Use the Next button to continue to the Matching Policy step.

 

Matching Policy

 

In the Matching Policy step, you set up one or more matching rules that DQS uses to assign a matching score for each pair of records it compares. DQS considers the records to be a match when this matching score is greater than the minimum matching score you establish for the matching policy.

 

To begin, click the Create A Matching Rule button. Next, assign a name, an optional description,

and a minimum matching score to the matching rule. The lowest minimum matching score you can

assign is 80 percent, unless you change the DQS configuration on the Administration page. In the

Rule Editor toolbar, click the Add A New Domain Element button, select a domain, and configure the

matching rule parameters for the selected domain, as shown in Figure 7-15.

 

 

 

FIGURE 7-15 Creation of a matching rule for a matching policy

 

 

 

You can choose from the following values when configuring the Similarity parameter:

 

■ Similar You select this value when you want DQS to calculate a matching score for a field in two records and set the similarity score to 0 (to indicate no similarity) when the matching score is less than 60. If the field has a numeric data type, you can set a threshold for similarity using a percentage or integer value. If the field has a date data type, you can set the threshold using a numeric value for day, month, or year.
■ Exact When you want DQS to identify two records to be a match only when the same field in each record is identical, you select this value. DQS assigns a matching score of 100 for the domain when the fields are identical. Otherwise, it assigns a matching score of 0.

 

 

Whether you add one or more domains to a matching rule, you must configure the Weight parameter for each domain that you do not set as a prerequisite. DQS uses the weight to determine how the individual domain’s matching score affects the overall matching score. The sum of the weight values must be equal to 100.

 

When you select the Prerequisite check box for a domain, DQS sets the Similarity parameter to

Exact and considers values in a field to be a match only when they are identical in the two compared

records. Regardless of the result, a prerequisite domain has no effect on the overall matching score

for a record. Using the prerequisite option is an optimization that speeds up the matching process.

 

You can test the rule by clicking the Start button on the Matching Policy page. If the results are

not what you expect, you can modify the rule and test the rule again. When you retest the matching

policy, you can choose to either execute the matching policy on the processed matches from a

previous execution or on data that DQS reloads from the source. The Matching Results tab, shown in

Figure 7-16, displays the results of the current test and the previous test so that you can determine

whether your changes improve the match results.

 

 

 

FIGURE 7-16 Comparison of results from consecutive executions of a matching rule

 

A review of the Profiler tab can help you decide how to modify a match rule. For example, if a field

has a high percentage of unique records, you might consider eliminating the field from a match rule

or lowering its weight value. On the other hand, having a low percentage of unique records is useful

only if the field has a high level of completeness. If both uniqueness and completeness are low, you

should exclude the field from the matching policy.

 

You can click the Restore Previous Rule button to revert the rule settings to their prior state if you

prefer. When you are satisfied with the results, click the Next button to continue to the next step.

 

 

 

Matching Results

 

In the Matching Results step, you choose whether to review results as overlapping clusters or nonoverlapping clusters. With overlapping clusters, you might see separate clusters that contain the same records, whereas with nonoverlapping clusters, clusters that share records are combined into a single cluster. Then you click the Start button to apply all matching rules to your data source.

 

When processing completes, you can view a table that displays a filtered list of matched records with a color code to indicate the applicable matching rule, as shown in Figure 7-17. Each cluster of records has a pivot record that DQS randomly selects from the cluster as the record to keep. Furthermore, each cluster includes one or more matched records, each with its matching score. You can change the filter to display unmatched records, or you can apply a separate filter to view matched records having scores greater than or equal to 80, 85, 90, 95, or 100 percent.

 

 

 

FIGURE 7-17 Matched records based on two matching rules

 

You can review the color codes on the Matching Rules tab, and you can evaluate statistics about

the matching results on the Matching Results tab. You can also double-click a matched record in the

list to see the Pivot and Matched Records fields side by side, the score for each field, and the overall

score for the match, as shown in Figure 7-18.

 

 

 

FIGURE 7-18 Matching-score details displaying field scores and an overall score

 

 

 

If necessary, you can return to the previous step to fine-tune a matching rule and then return to

this step. Before you click the Restart button, you can choose the Reload Data From Source option

to copy the source data into a staging table where DQS re-indexes it. Your other option is to choose

Execute On Previous Data to use the data in the staging table without re-indexing it, which could

process the matching policy more quickly.

 

Once you are satisfied with the results, click the Finish button. You then have the option to publish

the matching policy to the knowledge base. At this point, you can use the matching policy with a

matching data quality project.

 

Data Quality Projects

 

When you have a knowledge base in place, you can create a data quality project to use the knowledge it contains to cleanse source data or use its matching policy to find matching records in source data. After you run the data quality project, you can export its results to a SQL Server database or to a CSV file. As another option, you can import the results to a domain in the Domain Management activity.

 

Regardless of which type of data quality project you want to create, you start the project in the

same way, by clicking the New Data Quality Project button on the home page of Data Quality Client.

When you create a new project, you provide a name for the data quality project, provide an optional

description, and select a knowledge base. You can then select the Cleansing activity for any knowledge

base; you can select the Matching activity for a knowledge base only when that knowledge

base has a matching policy. To launch the activity’s wizard, click the Create button. At this point, the

project is locked and inaccessible to other users.

 

Cleansing Projects

 

A cleansing data quality project begins with an analysis of source data using knowledge contained in a knowledge base and a categorization of that data into groups of correct and incorrect data. After DQS completes the analysis and categorization process, you can approve, reject, or change the proposed corrections.

 

 

When you create a cleansing data quality project, the wizard leads you through four steps: Map,

Cleanse, Manage And View Results, and Export.

 

Map

 

The first step in the cleansing data quality project is to map the data source to domains in the selected knowledge base, following the same process you used for the Knowledge Discovery and Matching Policy activities. The data source can be either an Excel file or a table or view in a SQL Server database. When you finish mapping all columns, click the Next button to proceed to the Cleanse step.

 

 

 

Note If you map a field to a composite domain, only the rules associated with the composite domain will apply, rather than the rules for the individual domains assigned to the composite domain. Furthermore, if the composite domain is mapped to a reference data service, DQS sends the source data to the reference data service for parsing and cleansing. Otherwise, DQS performs the parsing using the method you specified for the composite domain.

 

Cleanse

 

To begin the cleansing process, click the Start button. When the analysis process completes, you can view the statistics in the Profiler tab, as shown in Figure 7-19. These statistics reflect the results of categorization that DQS performs: correct records, corrected records, suggested records, and invalid records. DQS uses advanced algorithms to cleanse data and calculates a confidence score to determine the category applicable to each record in the source data. You can configure confidence thresholds for auto-correction and auto-suggestion. If a record’s confidence score falls below either of these thresholds and is neither correct nor invalid, DQS categorizes the record as new and leaves it for you to manually correct if necessary in the next step.

 

 

 

FIGURE 7-19 Profiler statistics after the cleansing process completes

 

Manage and View Results

 

In the next step of the cleansing data quality project, you see separate tabs for each group of records

categorized by DQS: Suggested, New, Invalid, Corrected, and Correct. The tab labels show the number

of records allocated to each group. When you open a tab, you can see a table of domain values

for that group and the number of records containing each domain value, as shown in Figure 7-20.

 

When applicable, you can also see the proposed corrected value, confidence score, and reason

for the proposed correction for each value. If you select a row in the table, you can see the individual

records that contain the original value. You must use the horizontal scroll bar to see all the fields for

the individual records.

 

When you enable the Speller feature for a domain, the cleansing process identifies potential spelling errors by displaying a wavy red underscore below the domain value. You can right-click the value to see suggestions and then select one or add the potential error to the dictionary.

 

 

 

 

 

FIGURE 7-20 Categorized results after automated data cleansing

 

If DQS identifies a value in the source data as a synonym, it suggests a correction to the leading

value. This feature is useful for standardization of your data. You must first enable the domain for

leading values and define synonyms in the knowledge base before running a cleansing data quality

project.

 

After reviewing the proposed corrections, you can either approve or reject the change for each

value or for each record individually. Another option is to click the Approve All Terms button or the

Reject All Terms button in the toolbar. You can also replace a proposed correction by typing a new

value in the Correct To box, and then approve the manual correction. In most cases, approved values

move to the Corrected tab and rejected values move to the Invalid tab. However, if you approve a

value on the New tab, it moves to the Correct tab.

 

Export

 

At no time during the cleansing process does DQS change source data. Instead, you can export the

results of the cleansing data quality project to a SQL Server table or to a CSV file. A preview of the

output displays in the final step of the project, as shown in Figure 7-21.

 

 

 

 

 

FIGURE 7-21 Review of the cleansing results to export

 

Before you export the data, you must decide whether you want to export the data only or export

both the data and cleansing information. Cleansing information includes the original value, the

cleansed value, the reason for a correction, a confidence score, and the categorization of the record.

By default, DQS uses the output format for the domain as defined in the knowledge base unless

you clear the Standardize Output check box. When you click the Export button, Data Quality Client

exports the data to the specified destination. You can then click the Finish button to close and unlock

the data quality project.

 

Matching Projects

 

By using a matching policy defined for a knowledge base, a matching data quality project can

identify both exact and approximate matches in a data source. Ideally, you run the matching process

after running the cleansing process and exporting the results. You can then specify the export file or

destination table as the source for the matching project.

 

A matching data quality project consists of three steps: Map, Matching, and Export.

 

Map

 

The first step for a matching project begins in the same way as a cleansing project, by requiring you

to map fields from the source to a domain. However, in a matching project, you must map a field to

each domain specified in the knowledge base’s matching policy.

 

 

 

Matching

 

In the Matching step, you choose whether to generate overlapping clusters or nonoverlapping clusters, and then click the Start button to launch the automated matching process. When the process completes, you can review a table of matching results by cluster. The interface is similar in functionality to the one you use when reviewing the matching results during the creation of a matching policy. However, in the matching project, an additional column includes a check box that you can select to reject a record as a match.

 

Export

 

After reviewing the matching results, you can export the results as the final step of the matching project. As shown in Figure 7-22, you must choose the destination and the content to export. You have the following two options for exporting content:

 

■ Matching Results This content type includes both matched and unmatched records. The matched records include several columns related to the matching process, including the cluster identifier, the matching rule that identified the match, the matching score, the approval status, and a flag to indicate the pivot record.
■ Survivorship Results This content type includes only the survivorship record and unmatched records. You must select a survivorship rule when choosing this export option to specify which of the matched records in a cluster are preserved in the export. All other records in a cluster are discarded. If more than one record satisfies the survivorship criteria, DQS keeps the record with the lowest record identifier.

 

 

 

 

FIGURE 7-22 Selection of content to export

 

 

 

Important When you click the Finish button, the project is unlocked and available for

later use. However, DQS uses the knowledge-base contents at the time that you finished

the project and ignores any subsequent changes to the knowledge base. To access any

changes to the knowledge base, such as a modified matching policy, you must create a

new matching project.

 

Administration

 

In the Administration feature of Data Quality Client, you can perform activity-monitoring and configuration tasks. Activity monitoring is accessible by any user who can open Data Quality Client. On the other hand, only an administrator can access the configuration tasks.

 

Activity Monitoring

 

You can use the Activity Monitoring page to review the status of current and historic activities performed on the Data Quality Server, as shown in Figure 7-23. DQS administrators can terminate an activity or a step within an activity when necessary by right-clicking on the activity or step.

 

 

 

FIGURE 7-23 Status of activities and status of activity steps of the selected activity

 

 

 

Activities that appear on this page include knowledge discovery, domain management, matching

policy, cleansing projects, matching projects, and the cleansing transformation in an Integration

Services package. You can see who initiated each activity, the start and end time of the activity, and

the elapsed time. To facilitate locating specific activities, you can use a filter to find activities by date

range and by status, type, subtype, knowledge base, or user. When you select an activity on this page,

you can view the related activity details, such as the steps and profiler information.

 

You can click the Export The Selected Activity To Excel button to export the activity details,

process steps, and profiling information to Excel. The export file separates this information into four worksheets:

 

 

■ Activity This sheet includes the details about the activity, including the name, type, subtype, current status, elapsed time, and so on.
■ Processes This sheet includes information about each activity step, including current status, start and end time, and elapsed time.
■ Profiler – Source The contents of this sheet depend on the activity subtype. For the Cleansing subtype, you see the number of total records, correct records, corrected records, and invalid records. For the Knowledge Discovery, Domain Management, Matching Policy, and Matching subtypes, you see the number of records, total values, new values, unique values, and new unique values.
■ Profiler – Fields This sheet’s contents also depend on the activity subtype. For the Cleansing and SSIS Cleansing subtypes, the sheet contains the following information by field: domain, corrected values, suggested values, completeness, and accuracy. For the Knowledge Discovery, Domain Management, Matching Policy, and Matching subtypes, the sheet contains the following information by field: domain, new value count, unique value count, count of values that are valid in the domain, and completeness.

 

 

Configuration

 

The Configuration area of Data Quality Client allows you to set up reference data providers, set

properties for the Data Quality Server, and configure logging. You must be a DQS administrator to

perform configuration tasks. You access this area from the Data Quality Client home page by clicking

the Configuration button.

 

Reference Data

 

Rather than maintain domain values and rules in a knowledge base, you can subscribe to a reference

data service through Windows Azure Marketplace. Most reference data services are available as a

monthly paid subscription, but some providers offer a free trial. (Also, Digital Trowel provides a free

service to cleanse and standardize data for US public and private companies.) When you subscribe to

a service, you receive an account key that you must register in Data Quality Client before you can use

reference data in your data-quality activities.

 

 

 

The Reference Data tab is the first tab that displays in the Configuration area, as shown in Figure 7-24. Here you type or paste your account key in the DataMarket Account ID box, and then click the Validate DataMarket Account ID button to the right of the box. You might need to provide a proxy server and port number if your DQS server requires a proxy server to connect to the Internet.

 

Note If you do not have a reference data service subscription, you can use the Create A DataMarket Account ID link to open the Windows Azure Marketplace site in your browser. You must have a Windows Live ID to access the site. Click the Data link at the top of the page, and then click the Data Quality Services link in the Category list. You can view the current list of reference data service providers at https://datamarket.azure.com/browse/Data?Category=dqs.

 

 

 

FIGURE 7-24 Reference data service account configuration

 

As an alternative to using a DataMarket subscription for reference data, you can configure settings for a third-party reference data service by clicking the Add New Reference Data Service Provider button and supplying the requisite details: a name for the service, a comma-delimited list of fields as a schema, a secure URI for the reference data service, a maximum number of records per batch, and a subscriber account identifier.

 

 

 

General Settings

 

You use the General Settings tab, shown in Figure 7-25, to configure the following settings:

 

■ Interactive Cleansing Specify the minimum confidence score for suggestions and the minimum confidence score for auto-corrections. DQS uses these values as thresholds when determining how to categorize records for a cleansing data quality project.
■ Matching Specify the minimum matching score for DQS to use for a matching policy.
■ Profiler Use this check box to enable or disable profiling notifications. These notifications appear in the Profiler tab when you are performing a knowledge base activity or running a data quality project.

 

 

 

 

FIGURE 7-25 General settings to set score thresholds and enable notifications

 

Log Settings

 

Log files are useful for troubleshooting problems that might occur. By default, the DQS log files capture events with an Error severity level, but you can change the severity level to Fatal, Warn, Info, or Debug by activity, as shown in Figure 7-26.

 

 

 

 

 

FIGURE 7-26 Log settings for the Data Quality Server

 

DQS generates the following three types of log files:

 

■ Data Quality Server You can find server-related activity in the DQServerLog.DQS_MAIN.log file in the Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Log folder.
■ Data Quality Client You can view client-related activity in the DQClientLog.log file in the %APPDATA%\SSDQS\Log folder.
■ DQS Cleansing Transformation When you execute a package containing the DQS Cleansing transformation, DQS logs the cleansing activity in the DQSSSISLog.log file available in the %APPDATA%\SSDQS\Log folder.

 

 

In addition to configuring log severity settings by activity, you can configure them at the module level in the Advanced section of the Log Settings tab. By using a more granular approach to log settings, you can get better insight into a problem you are troubleshooting. The Microsoft.Ssdqs.Core.Startup module is configured with a default severity of Info to track events related to starting and stopping the DQS service. You can use the drop-down list in the Advanced section to select another module and specify the log severity level you want.

 

Integration

 

DQS cleansing and matching functionality is built into two other SQL Server 2012 features—Integration

Services and Master Data Services—so that you can more effectively manage data quality

across your organization. In Integration Services, you can use DQS components to routinely perform

data cleansing in a scheduled package. In Master Data Services, you can compare external data to

master data to find matching records based on the matching policy you define for a knowledge base.

 

 

 

Integration Services

 

In earlier versions of SQL Server, you could use an Integration Services package to automate the process

of cleansing data by using Derived Column or Script transformations, but the creation of a data

flow to perform complex cleansing could be tedious. Now you can take advantage of DQS to use the

rules or reference data in a knowledge base for data cleansing. The integration between Integration

Services and DQS allows you to perform the same tasks that a cleansing data quality project supports,

but on a scheduled basis. Another advantage of using the DQS Cleansing transformation in Integration

Services is the ability to cleanse data from a source other than Excel or a SQL Server database.

 

Because the DQS functionality in Integration Services is built into the product, you can begin using

it right away without additional installation or configuration. Of course, you must have both a Data

Quality Server and a knowledge base available. To get started, you add a DQS connection manager to

the package, add a Data Flow Task, and then add a DQS Cleansing transformation to the data flow.

 

DQS Connection Manager

 

You use the DQS connection manager to establish a connection from the Integration Services package to a Data Quality Server. When you add the connection manager to your package, you supply the server name. The connection manager interface includes a button to test the connection to the Data Quality Server. You can identify a DQS connection manager by its icon, as shown in Figure 7-27.

 

 

 

FIGURE 7-27 DQS connection manager

 

DQS Cleansing Transformation

 

As we explain in the “Cleansing Projects” section of this chapter, DQS uses advanced algorithms to cleanse data and calculates a confidence score to categorize records as Correct, Corrected, Suggested, or Invalid. The DQS Cleansing transformation in a data flow transfers data to the Data Quality Server, which in turn executes the cleansing process and sends the data back to the transformation with a corrected value, when applicable, and a status. You can then add a Conditional Split transformation to the data flow to route each record to a separate destination based on its status.

 

Note If the knowledge base maps to a reference data service, the Data Quality Server

might also forward the data to the service for cleansing and enhancement.

 

Configuration of the DQS Cleansing transformation begins with the selection of a connection manager and a knowledge base. After you select the knowledge base, the available domains and composite domains are displayed, as shown in Figure 7-28.

 

 

 

 

 

FIGURE 7-28 Connection manager configuration for the DQS Cleansing transformation

 

On the Mapping tab, you map each column in the data flow pipeline to its respective domain, as

shown in Figure 7-29. If you are mapping a column to a composite domain, the column must contain

the domain values as a comma-delimited string in the same order in which the individual domains

appear in the composite domain.

 

 

 

FIGURE 7-29 Mapping a pipeline column to a domain
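If the source stores the composite domain’s parts in separate columns, you can assemble the required comma-delimited string before it reaches the transformation, for example in the data flow’s source query. The following T-SQL sketch is illustrative only; the table and column names are hypothetical, and it assumes a composite domain defined as Address, City, State, and Zip in that order:

-- Concatenate the parts in the same order as the domains in the composite domain.
SELECT CustomerID,
       CONCAT(Address, N', ', City, N', ', State, N', ', Zip) AS CustomerAddress
FROM dbo.CustomerAddresses;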

 

 

 

For each input column, you also define the aliases for the following output columns: source, output, and status. The transformation editor supplies default values for you, but you can change these aliases if you like. During package execution, the column with the source alias contains the original value in the source, whereas the column with the output alias contains the same value for correct and invalid records or the corrected value for suggested or corrected records. The column with the status alias contains values to indicate the outcome of the cleansing process: Auto Suggest, Correct, Invalid, or New.

 

On the Advanced tab of the transformation editor, you can configure the following options:

 

■ Standardize Output This option, which is enabled by default, automatically standardizes the data in the Output Alias column according to the Format Output settings for each domain. In addition, this option changes synonyms to leading values if you enable Use Leading Values for the domain.
■ Enable Field-Level Columns You can optionally include the confidence score or the reason for a correction as additional columns in the transformation output.
■ Enable Record-Level Columns If a domain maps to a reference data service that returns additional data columns during the cleansing process, you can include this data as an appended column. In addition, you can include a column to contain the schema for the appended data.

 

 

Master Data Services

 

If you are using Master Data Services (MDS) for master data management, you can use the data-matching functionality in DQS to de-duplicate master data. You must first enable DQS integration on the Web Configuration page of the Master Data Services configuration manager, and you must create a matching policy for a DQS knowledge base. Then you add data to an Excel worksheet and use the MDS Add-in for Excel to combine that data with MDS-managed data in preparation for matching. The matching process adds columns to the worksheet similar to the columns you view during a matching data quality project, including the matching score. We provide more information about how to use the DQS matching with MDS in Chapter 8, “Master Data Services.”

 

 

 


 

Configuration

 

Whether you perform a new installation of MDS or upgrade from a previous version, you must use the Master Data Services Configuration Manager. You can open this tool from the Master Data Services folder in the Microsoft SQL Server 2012 program group on the Start menu. You use it to create or upgrade the MDS database, configure MDS system settings, and create a web application for MDS.

 

Important If you are upgrading from a previous version of MDS, you must log in using the

Administrator account that was used to create the original MDS database. You can identify

this account by finding the user with an ID value of 1 in the mdm.tblUser table in the MDS

database.

 

The first configuration step to perform following installation is to configure the MDS database. On

the Database Configuration page of Master Data Services Configuration Manager, perform one of the

following tasks:

 

■ New installation Click the Create Database button, and complete the Database Wizard. In the wizard, you specify the SQL Server instance and authentication type and provide credentials having permissions to create a database on the selected instance. You also provide a database name and specify collation. Last, you specify a Microsoft Windows account to establish as the MDS administrator account.

 

 

Important Before you begin a new installation, you must install Internet

Information Services (IIS).

 

■ Upgrade If you want to keep your database in a SQL Server 2008 R2 instance, click the Repair Database button if it is enabled. Then, whether you want to store your MDS database in SQL Server 2008 R2 or SQL Server 2012, click the Upgrade Database button. The Upgrade Database Wizard displays the SQL Server instance and MDS database name, as well as the progress of the update. The upgrade process re-creates tables using the new schema and stored procedures.

 

 

Note The upgrade process excludes business rules that you use to generate values

for the code attribute. We explain the new automatic code generation in the “Entity

Management” section later in this chapter. Furthermore, the upgrade process does

not include model deployment packages. You must create new packages in your

SQL Server 2012 installation.

 

When the new or upgraded database is created, you see the System Settings display on the Database Configuration page. You use these settings to control timeouts for the database or the web service, to name a few. If you upgraded your MDS installation, there are two new system settings that you can configure:

 

■ Show Add-in For Excel Text On Website Home Page This setting controls whether users see a link to install the MDS Add-in on the MDS home page, as shown in Figure 8-1.
■ Add-in For Excel Install Path On Website Home Page This setting defaults to the MDS Add-in download page on the Microsoft web site.

 

 

 

 

FIGURE 8-1 Add-in for Excel text on MDS website home page

 

Master Data Manager

 

Master Data Manager is the web application that data stewards use to manage master data and administrators use to manage model objects and configure security. For the most part, it works as it

did in the previous version of MDS, although the Explorer and Integration Management functional

areas now use Silverlight 5. As a result, you will find certain tasks are easier and faster to perform.

In keeping with this goal of enabling easier and faster processes, you will find the current version of

MDS also introduces a new staging process and slight changes to the security model.

 

Explorer

 

In the Explorer functional area, you add or delete members for an entity, update attribute values for

members, arrange those members within a hierarchy, and optionally organize groups of members as

collections. In SQL Server 2012, MDS improves the workflow for performing these tasks.

 

Entity Management

 

When you open an entity in the Explorer area, you see a new interface, as shown in Figure 8-2. The

set of buttons now display with new icons and with descriptions that clarify their purpose. When you

click the Add Member button, you type in all attribute values for the new member in the Details pane.

To delete a member, select the member and then click the Delete button. You can also more easily edit attribute values for a member by selecting it in the grid and then typing a new value for the attribute in the Details pane.

 

 

 

FIGURE 8-2 Member management for a selected entity

 

Rather than use a business rule to automatically create values for the Code attribute as you do in

SQL Server 2008 R2, you can now configure the entity to automatically generate the code value. This

automatic assignment of a value applies whether you are adding a member manually in Master Data

Manager or importing data through the staging process. To do this, you must have permission to

access the System Administration function area and open the entity. On the Entity Maintenance page,

select the Create Code Values Automatically check box, as shown in Figure 8-3. You can optionally

change the number in the Start With box. If you already have members added to the entity, MDS

increments the maximum value by one when you add a new member.

 

 

 

FIGURE 8-3 Check box to automatically generate code values

 

 

 

Note The automatic generation of a value occurs only when you leave the code value

blank. You always have the option to override it with a different value when you add a new

member.

 

Many-to-Many Mapping

 

Another improvement in the current version of MDS is the ability to use the Explorer functional area

in Master Data Manager to view entities for which you have defined many-to-many mappings. To see

how this works, consider a scenario in which you have products you want to decompose into separate

parts and you can associate any single part with multiple products. You manage the relationships

between products and parts in MDS by creating three entities: products, parts, and a mapping entity

to define the relationship between products and parts, as shown in Figure 8-4. Notice that you need

only a code value and attributes to store code values for the two related entities, but no name value.

 

 

 

FIGURE 8-4 Arrangement of members in a derived hierarchy

 

When you open the Product entity and select a member, you can open the Related Entities pane

on the right side of the screen where a link displays, such as ProductParts (Attribute: Product Code).

When you click this link, a new browser window opens to display the ProductParts entity window with

a filter applied to show records related only to the selected product. From that screen, you can click

the Go To The “<Name> Entity” To View Attribute Details button, as shown in Figure 8-5, to open yet

another browser window that displays the entity member and its attribute values.

 

 

 

FIGURE 8-5 Click through to a related entity available in the Details pane

 

 

 

Hierarchy Management

 

The new interface in the Explorer functional area, shown in Figure 8-6, makes it easier for you to

move members within a hierarchy when you want to change the parent for a member. A hierarchy

pane displays a tree view of the hierarchy. There you select the check box for each member to move.

Then you click the Copy button at the top of the hierarchy pane, select the check box of the member

to which you want to move the previously selected members, and then click the Paste button. In a

derived hierarchy, you must paste members to the same level only.

 

 

 

FIGURE 8-6 Arrangement of members in a derived hierarchy

 

Collection Management

 

You can organize a subset of entity members as a collection in MDS. A new feature is the ability to

assign a weight to each collection member within the user interface, as shown in Figure 8-7. You use

MDS only to store the weight for use by subscribing systems that use the weight to apportion values

across the collection members. Accordingly, the subscription view includes the weight column.

 

 

 

 

 

FIGURE 8-7 New collection-management interface

 

Integration Management

 

When you want to automate aspects of the master data management process, MDS now has a new,

high-performance staging process. One benefit of this new staging process is the ability to load

members and attribute values at one time, rather than separately in batches. To do this, you load data

into the following tables as applicable, where name is the name of the staging table for the entity (a brief example follows the list):

 

■ stg.name_Leaf Use this table to stage additions, updates, or deletions for leaf members and their attributes.
■ stg.name_Consolidated Use this table to stage additions, updates, or deletions for consolidated members and their attributes.
■ stg.name_Relationship Use this table to assign members in an explicit hierarchy.
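As a rough illustration, the following T-SQL stages two new leaf members for a hypothetical Product entity, so the staging table is stg.Product_Leaf; the codes, names, and batch tag are made up. ImportType 0 indicates that new members should be created, and ImportStatus_ID 0 marks the rows as ready to be staged:

-- Stage two leaf members for a hypothetical Product entity.
INSERT INTO stg.Product_Leaf (ImportType, ImportStatus_ID, BatchTag, Code, Name)
VALUES (0, 0, N'batch1', N'BK-M500-40', N'Mountain-500 Brown, 40'),
       (0, 0, N'batch1', N'BK-M500-44', N'Mountain-500 Black, 44');

After the rows are in the staging table, you start the batch from the Integration Management functional area or with the stored procedures described later in this section.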

 

 

There are two ways to start the staging process after you load data into the staging tables: by using

the Integration Management functional area or executing stored procedures. In the Integration

Management functional area, you select the model in the drop-down list and click the Start Batches

button, which you can see in Figure 8-8. The data processes in batches, and you can watch the status

change from Queued To Run to Running to Completed.

 

 

 

 

 

FIGURE 8-8 Staging a batch in the Integration Management functional area

 

If the staging process produces errors, you see the number of errors appear in the grid, but you cannot view them in the Integration Management functional area. Instead, you can click the Copy Query button to copy a SQL query that you can paste into a query window in SQL Server Management Studio. The query looks similar to this:

 

SELECT * FROM [stg].[viw_Product_MemberErrorDetails] WHERE Batch_ID = 1

 

This view includes an ErrorDescription column describing the reason for flagging the staged record

as an error. It also includes AttributeName and AttributeValue columns to indicate which attribute and

value caused the error.

 

If you use a stored procedure to execute the staging process, you use one of the following stored

procedures, where name corresponds to the staging table:

 

■ stg.udp_name_Leaf
■ stg.udp_name_Consolidated
■ stg.udp_name_Relationship

 

 

Each of these stored procedures takes the following parameters:

 

■ VersionName Provide the version name of the model, such as VERSION_1. The collation setting of the SQL Server database determines whether the value for this parameter is case sensitive.
■ LogFlag Use a value of 1 to log transactions during staging or a value of 0 if you do not want to log transactions.
■ BatchTag Provide a string of 50 characters or less to identify the batch in the staging table. This tag displays in the batch grid in the Integration Management functional area.

 

 

For example, to load leaf members and log transactions for the batch, execute the following code

in SQL Server Management Studio:

 

EXEC [stg].[udp_name_Leaf] @VersionName = N'VERSION_1', @LogFlag = 1, @BatchTag = N'batch1'
GO

 

 

 

Note Transaction logging during staging is optional. You can enable logging only if you use stored procedures for staging. Logging does not occur if you launch the staging process from the Integration Management functional area.

 

Important No validation occurs during the staging process. You must validate manually

in the Version Management functional area or use the mdm.udpValidateModel stored procedure.

Refer to http://msdn.microsoft.com/en-us/library/hh231023(SQL.110).aspx for more

information about this stored procedure.
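As a hedged sketch of the stored-procedure route, the call below passes placeholder ID values; in practice you look up the user, model, and version IDs in the MDS database, and you should confirm the parameter list against the documentation linked above:

DECLARE @User_ID int = 1;     -- placeholder: an MDS user ID (for example, from mdm.tblUser)
DECLARE @Model_ID int = 3;    -- placeholder: internal ID of the model to validate
DECLARE @Version_ID int = 7;  -- placeholder: internal ID of the model version
-- The final argument is the status value shown in the documented example.
EXECUTE mdm.udpValidateModel @User_ID, @Model_ID, @Version_ID, 1;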

 

You can continue to use the staging tables and stored procedure introduced for the SQL Server 2008 R2 MDS staging process if you like. One reason that you might choose to do this is to manage collections, because the new staging process in SQL Server 2012 does not support collections. Therefore, you must use the previous staging process to create or delete collections, add members to or remove them from collections, or reactivate members and collections.

 

User and Group Permissions

 

Just as in the previous version of MDS, you assign permissions by functional area and by model

object. When you assign Read-Only or Update permissions for a model to a user or group on the

Models tab of the Manage User page, the permission also applies to lower level objects. For example,

when you grant a user or group the Update permission to the Product model, as shown in Figure 8-9,

the users can also add, change, or delete members for any entity in the model and can change any

attribute value. You must explicitly change permissions on selected entities to Read-only when you

want to give users the ability to view but not change entity members, and to Deny when you do

not want them to see the entity members. You can further refine security by setting permissions on attribute objects (below the Leaf, Consolidate, or Collection nodes) to control which attribute values a user can see or change.

 

 

 

 

 

FIGURE 8-9 Model permissions object tree and summary

 

In SQL Server 2008 R2 MDS, the model permissions object tree also includes nodes for derived hierarchies, explicit hierarchies, and attribute groups, but these nodes are no longer available in SQL Server 2012 MDS. Instead, derived hierarchies inherit permissions from the model and explicit hierarchies inherit permissions from the associated entity. In both cases, you can override these default permissions on the Hierarchy Members tab of the Manage User page to manage which entity members users can see or change.

 

For attribute group permissions, you now use the Attribute Group Maintenance page in the System Administration functional area to assign permissions to users or groups, as shown in Figure 8-10. Users or groups appearing in the Assigned list have Update permissions only. You can no longer assign Read-Only permissions for attribute groups.

 

 

 

 

 

FIGURE 8-10 Users and Groups security for attribute groups

 

Model Deployment

 

A new high-performance, command-line tool is now available for deploying packages. If you use the

Model Deployment Wizard in the web application, it deploys only the model structure. As an alternative,

you can use the MDSModelDeploy tool to create and deploy a package with model objects only

or a package with both model objects and data. You find this tool in the Program Files\Microsoft SQL

Server\110\Master Data Services\Configuration folder.

 

Note You cannot reuse packages you created using SQL Server 2008 R2 MDS. You can deploy a SQL Server 2012 package only to a SQL Server 2012 MDS instance.

 

The executable for this tool uses the following syntax:

 

MDSModelDeploy <commands> [ <options> ]

 

 

 

You can use the following commands with this tool:

 

■ listservices View a list of all service instances.
■ listmodels View a list of all models.
■ listversions View a list of all versions for a specified model.
■ createpackage Create a package for a specified model.
■ deployclone Create a duplicate of a specified model, retaining names and identifiers. The model cannot exist in the target service instance.
■ deploynew Create a new model. MDS creates new identifiers for all model objects.
■ deployupdate Deploy a model, and update the model version. This option requires you to use a package with a model having the same names and identifiers as the target model.
■ help View usage, options, and examples of a command. For example, type the following command to learn how to use the deploynew command: MDSModelDeploy help deploynew.

 

 

To help you learn how to work with MDS, several sample packages containing models and data are available. To deploy the Product package to the MDS1 Web service instance, type the following command in the command prompt window:

 

MDSModelDeploy deploynew -package ..\Samples\Packages\product_en.pkg -model Product -service MDS1
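Going the other direction, you can capture an existing model (and optionally its data) into a package with the createpackage command. The following is a sketch with assumed model, version, service, and file-path values; check MDSModelDeploy help createpackage for the exact options on your installation:

MDSModelDeploy createpackage -model Product -version VERSION_1 -includedata -package C:\Packages\ProductBackup.pkg -service MDS1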

 

Note To execute commands by using this utility, you must have permissions to access the

System Administration functional area and you must open the command prompt window

as an administrator.

 

During deployment, MDS first creates the model objects, and then creates the business rules

and subscription views. MDS populates the model with master data as the final step. If any of these

steps fail during deployment of a new or cloned model, MDS deletes the model. If you are updating

a model, MDS retains the changes from the previous steps that completed successfully unless the

failure occurs in the final step. In that case, MDS updates master data members where possible rather

than failing the entire step and rolling back.

 

Note After you deploy a model, you must manually update user-defined metadata, file

attributes, and user and group permissions.

 

 

 

MDS Add-in for Excel

 

The most extensive addition to MDS in SQL Server 2012 is the new user-interface option for enabling

data stewards and administrators to manage master data inside Microsoft Excel. Data stewards can

retrieve data and make changes using the familiar environment of Excel after installing the MDS Add-in for Excel. Administrators can also use this add-in to create new model objects, such as entities, and load data into MDS.

 

Installation of the MDS Add-in

 

By default, the home page of Master Data Manager includes a link to the download page for the

MDS Add-in on the Microsoft web site. On the download page, you choose the language and version (32-bit or 64-bit) that matches your Excel installation. You open the MSI file that downloads to

start the setup wizard, and then follow the prompts to accept the license agreement and confirm the

installation. When the installation completes, you can open Excel to view the new Master Data tab in

the ribbon, as shown in Figure 8-11.

 

 

 

FIGURE 8-11 Master Data tab in the Excel ribbon

 

Note The add-in works with either Excel 2007 or Excel 2010.

 

Master Data Management

 

The MDS add-in supports the primary tasks you need to perform for master data management. After

connecting to MDS, you can load data from MDS into a worksheet to use for reference or to make

additions or changes in bulk. You can also apply business rules and correct validation issues, check for

duplicates using Data Quality Services integration, and then publish the modified data back to MDS.

 

Connections

 

Before you can load MDS data into a worksheet, you must create a connection to the MDS database.

If you open a worksheet into which you previously loaded data, the MDS add-in automatically connects to MDS when you refresh the data or publish the data. To create a connection in the MDS

Add-in for Excel, follow these steps:

 

1. On the Master Data tab of the ribbon, click the arrow under the Connect button, and click

Manage Connections.

 

 

 

 

2. In the Manage Connections dialog box, click the Create A New Connection link.

3. In the Add New Connection dialog box, type a description for the connection. This description

displays when you click the arrow under the Connect button.

4. In the MDS Server Address box, type the URL that you use to open the Master Data Manager web application, such as http://myserver/mds, and click the OK button. The connection displays in the Existing Connections section of the Manage Connections dialog box.

5. Click the Test button to test the connection, and then click the OK button to close the dialog

box that confirms the connection or displays an error.

6. Click the Connect button.

7. In the Master Data Explorer pane, select a model and version from the respective drop-down

lists, as shown here:

 

 

 

 

Data Retrieval

 

Before you load data from MDS into a worksheet, you can filter the data. Even if you do not filter the

data, there are some limitations to the volume of data you can load. Periodically, you can update the

data in the worksheet to retrieve the latest updates from MDS.

 

Filter data Rather than load all entity members from MDS into a spreadsheet, which can be a

time-consuming task, you can select attributes and apply filters to minimize the amount of data you

retrieve from MDS. You can choose to focus on selected attributes to reduce the number of columns

to retrieve. Another option is to use filter criteria to eliminate members.

 

 

 

To filter and retrieve leaf data from MDS, follow these steps:

 

1. In the Master Data Explorer pane, select the entity you want to load into the spreadsheet.

2. On the Master Data tab of the ribbon, click the Filter button.

3. In the Filter dialog box, select the columns to load by selecting an attribute type, an explicit

hierarchy (if you select the Consolidated attribute type), an attribute group, and individual

attributes.

 

 

Tip You can change the order of attributes by using the Up and Down arrows to

the right of the attribute list to move each selected attribute.

 

4. Next, select the rows to load by clicking the Add button and then selecting an attribute, a

filter operator, and filter criteria. You can repeat this step to continue adding filter criteria.

5. Click the Update Summary button to view the number of rows and columns resulting from

your filter selections, as shown here:

 

 

 

 

 

 

6. In the Filter dialog box, click the Load button. The data loads into the current spreadsheet, as

shown here:

 

 

 

 

Load data Filtering data before you load is optional. You can load all members for an entity by

clicking the Load Or Refresh button in the ribbon. A warning displays if there are more than 100,000

rows or more than 100 columns, but you can increase or decrease these values or disable the warning

by clicking the Settings button on the ribbon and changing the properties on the Data page of

the Settings dialog box. Regardless, if an entity is very large, the add-in automatically restricts the

retrieval of data to the first one million members. Also, if a column is a domain-based attribute, the

add-in retrieves only the first 1000 values.

 

Refresh data After you load MDS data into a worksheet, you can update the same worksheet

by adding columns of data from sources other than MDS or columns containing formulas. When you

want to refresh the MDS data without losing the data you added, click the Load Or Refresh button in

the ribbon.

 

The refresh process modifies the contents of the worksheet. Deleted members disappear, and new

members appear at the bottom of the table with green highlighting. Attribute values update to match

the value stored in MDS, but the cell does not change color to identify a new value.

 

Warning If you add new members or change attribute values, you must publish these changes before refreshing the data. Otherwise, you lose your work. Cell comments on MDS data are deleted, and non-MDS data in rows below MDS data might be replaced if the refresh process adds new members to the worksheet.

 

 

 

Review transactions and annotations You can review transactions for any member by right-clicking the member’s row and selecting View Transactions in the context menu. To view an annotation or to add an annotation for a transaction, select the transaction row in the View Transactions dialog box, which is shown in Figure 8-12.

 

 

 

FIGURE 8-12 Transactions and annotations for a member

 

Data Publication

 

If you make changes to the MDS data in the worksheet, such as altering attribute values, adding new members, or deleting members, you can publish your changes to MDS to make them available to other users. Each change you make saves to MDS as a transaction, which you have the option to annotate to document the reason for the change. An exception is a deletion, which you cannot annotate although the deletion does generate a transaction.

 

When you click the Publish button, the Publish And Annotate dialog box displays (unless you disable it in Settings). You can provide a single annotation for all changes or separate annotations for each change, as shown in Figure 8-13. An annotation must be 500 characters or less.

 

Warning Cell comments on MDS data are deleted during the publication process. Also, a change to the code value for a member does not save as a transaction and renders all previous transactions for that member inaccessible.

 

 

 

 

 

FIGURE 8-13 Addition of an annotation for a published data change

 

During the publication process, MDS validates your changes. First, MDS applies business rules to the data. Second, MDS confirms the validity of attribute values, including the length and data type. If a member passes validation, the MDS database updates with the change. Otherwise, the invalid data displays in the worksheet with red highlighting and the description of the error appears in the $InputStatus$ column. You can apply business rules prior to publishing your changes by clicking the Apply Rules button in the ribbon.

 

Model-Building Tasks

 

Use of the MDS Add-in for Excel is not limited to data stewards. If you are an administrator, you can

also use the add-in to create entities and add attributes. However, you must first create a model by

using Master Data Manager, and then you can continue adding entities to the model by using the

add-in.

 

Entities and Attributes

 

Before you add an entity to MDS, you create data in a worksheet. The data must include a header row and at least one row of data. Each row should include at least a Name column. If you include a Code column, the column values must be unique for each row. You can add other columns to create attributes for the entity, but you do not need to provide values for them. If you do, you can use text, numeric, or date values, but you cannot use formulas or time values.

 

To create a new entity in MDS, follow these steps:

 

1. Select all cells in the header and data rows to load into the new entity.

2. Click the Create Entity button in the ribbon.

3. In the Create Entity dialog box, ensure the range includes only the data you want to load and

do not clear the My Data Has Headers check box.

 

 

 

 

4. Select a model and version from the respective drop-down lists, and provide a name in the

New Entity Name box.

5. In the Code drop-down list, select the column that contains unique values for entity members

or select the Generate Code Automatically option.

6. In the Name drop-down list, select the column that contains member names, as shown next,

and then click OK. The add-in creates the new entity in the MDS database and validates the

data.

 

 

 

 

Note You might need to correct the data type or length of an attribute after creating the

entity. To do this, click any cell in the attribute’s column, and click the Attribute Properties

button in the ribbon. You can make changes as necessary in the Attribute Properties dialog

box. However, you cannot change the data type or length of the Name or Code column.

 

Domain-Based Attributes

 

If you want to restrict column values of an existing entity to a specific set of values, you can create a

domain-based attribute from values in a worksheet or an existing entity. To create a domain-based

attribute, follow these steps:

 

1. Load the entity into a worksheet, and click a cell in the column that you want to change to a

domain-based attribute.

2. Click the Attribute Properties button in the ribbon.

3. In the Attribute Properties dialog box, select Constrained List (Domain-Based) in the Attribute

Type drop-down list.

 

 

 

 

4. Select an option from the Populate The Attribute With Values From drop-down list. You can choose The Selected Column to create a new entity based on the values in the selected column, as shown next, or you can choose an entity to use values from that entity.

 

 

 

 

5. Click OK. The column now allows you to select from a list of values. You can change the available values in the list by loading the entity on which the attribute is based into a separate worksheet, making changes by adding new members or updating values, and then publishing the changes back to MDS.

 

 

Shortcut Query Files

 

You can easily load frequently accessed data by using a shortcut query file. This file contains information about the connection to the MDS database, the model and version containing the MDS data, the entity to load, filters to apply, and the column order. After loading MDS data into a worksheet, you create a shortcut query file by clicking the Save Query button in the ribbon and selecting Save As Query in the menu. When you want to use it later, you open an empty worksheet, click the Save Query button, and select the shortcut query file from the list that displays.

 

You can also use the shortcut query file as a way to share up-to-date MDS data with other users

without emailing the worksheet. Instead, you can email the shortcut query file as long as you have

Microsoft Outlook 2010 or later installed on your computer. First, load MDS data into a worksheet,

and then click the Send Query button in the ribbon to create an email message with the shortcut

query file as an attachment. As long as the recipient of the email message has the add-in installed, he

can double-click the file to open it.

 

Data Quality Matching

 

Before you add new members to an entity using the add-in, you can prepare data in a worksheet

and combine it with MDS data for comparison. Then you use Data Quality Services (DQS) to identify

duplicates. The matching process adds detail columns to show matching scores you can use to decide

which data to publish to MDS.

 

 

 

As we explained in Chapter 7, “Data Quality Services,” you must enable DQS integration in the

Master Data Services Configuration Manager and create a matching policy in a knowledge base. In

addition, both the MDS database and the DQS_MAIN database must exist in the same SQL Server

instance.

 

The first step in the data-quality matching process is to combine data from two worksheets into a

single worksheet. The first worksheet must contain data you load from MDS. The second worksheet

must contain data with a header row and one or more detail rows. To combine data, follow these

steps:

 

1. On the first worksheet, click the Combine Data button in the ribbon.

2. In the Combine Data dialog box, click the icon next to the Range To Combine With MDS Data text box.

3. Navigate to the second worksheet, highlight the header row and detail rows to combine with MDS data, and then click the icon next to the range in the collapsed Combine Data dialog box.

4. In the expanded Combine Data dialog box, in the Corresponding Column drop-down list, select a column from the second worksheet that corresponds to the entity column that displays to its left, as shown here:

 

 

 

 

 

 

5. Click the Combine button. The rows from the second worksheet display in the first worksheet below the existing rows, and the SOURCE column displays whether the row data comes from MDS or from an external source, as shown here:

 

 

 

 

6. Click the Match Data button in the ribbon.

7. In the Match Data dialog box, select a knowledge base from the DQS Knowledge Base drop-down list, and map worksheet columns to each domain listed in the dialog box.

 

 

 

 

Note Rather than using a custom knowledge base as shown in the preceding screen shot, you can use the default knowledge base, DQS Data. In that case, you add a row to the dialog box for each column you want to use for matching and assign a weight value. The sum of weight values for all rows must equal 100.

 

8. Click OK, and then click the Show Details button in the ribbon to view columns containing matching details. The SCORE column indicates the similarity between the pivot record (indicated by Pivot in the PIVOT_MARK column) and the matching record. You can use this information to eliminate records from the worksheet before publishing your changes and additions to MDS.

 

 

 

 

 

 

Miscellaneous Changes

 

Thus far in this chapter, our focus has been on the new features available in MDS. However, there are some more additions and changes to review. SQL Server 2012 offers some new features for SharePoint integration, and it retains some features from the SQL Server 2008 R2 version that are still available, but deprecated. There are also features that are discontinued. In this section, we review these feature changes and describe alternatives where applicable.

 

SharePoint Integration

 

There are two ways you can integrate MDS with SharePoint. First, when you add the Master Data Manager web site to a SharePoint page, you can add &hosted=true as a query parameter to reduce the amount of required display space. This query parameter removes the header, menu bar, and padding at the bottom of the page. Second, you can save shortcut query files to a SharePoint document library to provide lists of reference data to other users.

 

Metadata

 

The Metadata model continues to display in Master Data Manager, but it is deprecated. Microsoft

recommends that you do not use it because it will be removed in a future release of SQL Server.

You cannot create versions of the Metadata model, and users cannot view metadata in the Explorer functional

area.

 

 

Bulk Updates and Export

 

Making changes to master data one record at a time can be a tedious process. In the previous version

of MDS, you can update an attribute value for multiple members at the same time, but this capability

is no longer available in SQL Server 2012. Instead, you can use the staging process to load the

new values into the stg.name_Leaf table as we described earlier in the “Integration Management”

section of this chapter. As an alternative, you can use the MDS add-in to load the entity into an Excel

worksheet (described in the “Master Data Management” section of this chapter), update the attribute

values in bulk using copy and paste, and then publish the results to MDS.

 

The purpose of storing master data in MDS is to have access to this data for other purposes. You

use the Export To Excel button on the Member Information page when using the previous version of

MDS, but this button is not available in SQL Server 2012. When you require MDS data in Excel, you

use the MDS add-in to load entity members from MDS into a worksheet, as we described in the “Data

Retrieval” section of this chapter.

 

 

 

Transactions

 

MDS uses transactions to log every change users make to master data. In the previous version, users

review transactions in the Explorer functional area and optionally reverse their own transactions to

restore a prior value. Now only administrators can revert transactions in the Version Management

functional area.

 

MDS allows you to annotate transactions. In SQL Server 2008 R2, MDS stores annotations as transactions and allows you to delete them by reverting the transaction. However, annotations are now permanent in SQL Server 2012. Although you continue to associate an annotation with a transaction, MDS stores the annotation separately and does not allow you to delete it.

 

Windows PowerShell

 

In the previous version of MDS, you can use PowerShell cmdlets for administration of MDS. One

cmdlet allows you to create the database, another cmdlet allows you to configure settings for MDS,

and other cmdlets allow you to retrieve information about your MDS environment. No cmdlets are

available in the current version of MDS.

 

 

 


 

CHAPTER 9

 

Analysis Services and PowerPivot

 

In SQL Server 2005 and SQL Server 2008, there is only one mode of SQL Server Analysis Services

(SSAS) available. Then in SQL Server 2008 R2, VertiPaq mode debuts as the engine for PowerPivot

for SharePoint. These two server modes persist in SQL Server 2012 with some enhancements, and

now you also have the option to deploy an Analysis Services instance in tabular mode. In addition,

Microsoft SQL Server 2012 PowerPivot for Excel has several new features that extend the types of

analysis it can support.

 

Analysis Services

 

Before you deploy an Analysis Services instance, you must decide what type of functionality you want to support and install the appropriate server mode. In this section, we compare the three server modes, explain the various Analysis Services templates from which to choose when starting a new Analysis Services project, and introduce the components of the new tabular mode. We also review several new options this release provides for managing your server. Last, we discuss the programmability enhancements in the current release.

 

Server Modes

 

In SQL Server 2012, an Analysis Services instance can run in one of the following server modes: multidimensional, tabular, or PowerPivot for SharePoint. Each server mode supports a different type of database by using different storage structures, memory architectures, and engines. Multidimensional mode uses the Analysis Services engine you find in SQL Server 2005 and later versions. Both tabular mode and PowerPivot for SharePoint mode use the VertiPaq engine introduced in SQL Server 2008 R2, which compresses data for storage in memory at runtime. However, tabular mode does not have a dependency on SharePoint like PowerPivot for SharePoint does.

 

Each server mode supports a different set of data sources, tools, languages, and security features.

Table 9-1 provides a comparison of these features by server mode.

 

 

 


 

TABLE 9-1 Server-Mode Comparison of Various Sources, Tools, Languages, and Security Features

Data Sources
  Multidimensional: Relational database
  Tabular: Relational database, Analysis Services, Reporting Services report, Azure DataMarket dataset, Data feed, Excel file, Text file
  PowerPivot for SharePoint: Relational database, Analysis Services, Reporting Services report, Azure DataMarket dataset, Data feed, Excel file, Text file

Development Tool
  Multidimensional: SQL Server Data Tools
  Tabular: SQL Server Data Tools
  PowerPivot for SharePoint: PowerPivot for Excel

Management Tool
  Multidimensional: SQL Server Management Studio
  Tabular: SQL Server Management Studio
  PowerPivot for SharePoint: SharePoint Central Administration, PowerPivot Configuration Tool

Reporting and Analysis Tool
  Multidimensional: Report Builder, Report Designer, Excel PivotTable, PerformancePoint dashboard
  Tabular: Report Builder, Report Designer, Excel PivotTable, PerformancePoint dashboard, Power View
  PowerPivot for SharePoint: Report Builder, Report Designer, Excel PivotTable, PerformancePoint dashboard, Power View

Application Programming Interface
  Multidimensional: AMO, AMOMD.NET
  Tabular: AMO, AMOMD.NET
  PowerPivot for SharePoint: No support

Query and Expression Language
  Multidimensional: MDX for calculations and queries, DMX for data-mining queries
  Tabular: DAX for calculations and queries, MDX for queries
  PowerPivot for SharePoint: DAX for calculations and queries, MDX for queries

Security
  Multidimensional: Cell-level security, Role-based permissions in SSAS
  Tabular: Row-level security, Role-based permissions in SSAS
  PowerPivot for SharePoint: File-level security using SharePoint permissions

 

 

 

 

 

Another factor you must consider is the set of model design features that satisfy your users’ business requirements for reporting and analysis. Table 9-2 shows the model design features that each server mode supports.

 

TABLE 9-2 Server-Mode Comparison of Design Features

Model Design Feature: supported server modes
  Actions: Multidimensional
  Aggregations: Multidimensional
  Calculated Measures: Multidimensional, Tabular, PowerPivot for SharePoint
  Custom Assemblies: Multidimensional
  Custom Rollups: Multidimensional
  Distinct Count: Multidimensional, Tabular, PowerPivot for SharePoint
  Drillthrough: Multidimensional, Tabular
  Hierarchies: Multidimensional, Tabular, PowerPivot for SharePoint
  Key Performance Indicators: Multidimensional, Tabular, PowerPivot for SharePoint
  Linked Objects: Multidimensional, Tabular, PowerPivot for SharePoint (Linked tables only)
  Many-to-Many Relationships: Multidimensional
  Parent-Child Hierarchies: Multidimensional, Tabular, PowerPivot for SharePoint
  Partitions: Multidimensional, Tabular
  Perspectives: Multidimensional, Tabular, PowerPivot for SharePoint
  Semi-additive Measures: Multidimensional, Tabular, PowerPivot for SharePoint
  Translations: Multidimensional
  Writeback: Multidimensional

 

 

 

 

 

You assign the server mode during installation of Analysis Services. On the Setup Role page of SQL Server Setup, you select the SQL Server Feature Installation option for multidimensional or tabular mode, or you select the SQL Server PowerPivot For SharePoint option for the PowerPivot for SharePoint mode. If you select the SQL Server Feature Installation option, you will specify the server mode to install on the Analysis Services Configuration page. On that page, you must choose either the Multidimensional And Data Mining Mode option or the Tabular Mode option. After you complete the installation, you cannot change the server mode of an existing instance.

 

Note Multiple instances of Analysis Services can co-exist on the same server, each running

a different server mode.

 

Analysis Services Projects

 

SQL Server Data Tools (SSDT) is the model development tool for multidimensional models, data-mining models, and tabular models. Just as you do with any business intelligence project, you open the File menu in SSDT, point to New, and then select Project to display the New Project dialog box. In the Installed Templates list in that dialog box, you can choose from several Analysis Services templates, as shown in Figure 9-1.

 

 

 

 

 

FIGURE 9-1 New Project dialog box displaying Analysis Services templates

 

There are five templates available for Analysis Services projects:

 

■ Analysis Services Multidimensional and Data Mining Project You use this template to develop the traditional type of project for Analysis Services, which is now known as the multidimensional model and is the only model that includes support for the Analysis Services data-mining features.
■ Import from Server (Multidimensional and Data Mining) You use this template when a multidimensional model exists on a server and you want to create a new project using the same model design.
■ Analysis Services Tabular Project You use this template to create the new tabular model. You can deploy this model to an Analysis Services instance running in tabular mode only.
■ Import from PowerPivot You use this template to import a model from a workbook deployed to a PowerPivot for SharePoint instance of Analysis Services. You can extend the model using features supported in tabular modeling and then deploy this model to an Analysis Services instance running in tabular mode only.
■ Import from Server (Tabular) You use this template when a tabular model exists on a server and you want to create a new project using the same model design.

 

 

 

 

Tabular Modeling

 

A tabular model is a new type of database structure that Analysis Services supports in SQL Server 2012. When you create a tabular project, SSDT adds a Model.bim file to the project and creates a workspace database on the Analysis Services instance that you specify. It then uses this workspace database as temporary storage for data while you develop the model by importing data and designing objects that organize, enhance, and secure the data.

 

Tip You can use the tutorial at http://msdn.microsoft.com/en-us/library/hh231691(SQL.110).aspx to learn how to work with a tabular model project.

 

Workspace Database

 

As you work with a tabular model project in SSDT, a corresponding workspace database resides in

memory. This workspace database stores the data you add to the project using the Table Import

Wizard. Whenever you view data in the diagram view or the data view of the model designer, SSDT

retrieves the data from the workspace database.

 

When you select the Model.bim file in Solution Explorer, you can use the Properties window to access the following workspace database properties:

 

■ Data Backup The default setting is Do Not Backup To Disk. You can change this to Backup To Disk to create a backup of the workspace database as an ABF file each time you save the Model.bim file. However, you cannot use the Backup To Disk option if you are using a remote Analysis Services instance to host the workspace database.
■ Workspace Database This property displays the name that Analysis Services assigns to the workspace database. You cannot change this value.
■ Workspace Retention Analysis Services uses this value to determine whether to keep the workspace database in memory when you close the project in SSDT. The default option, Unload From Memory, keeps the database on disk, but removes it from memory. For faster loading when you next open the project, you can choose the Keep In Memory option. The third option, Delete Workspace, deletes the workspace database from both memory and disk, which takes the longest time to reload because Analysis Services requires additional time to import data into the new workspace database. You can change the default for this setting if you open the Tools menu, select Options, and open the Data Modeling page in the Analysis Server settings.
■ Workspace Server This property specifies the server you use to host the workspace database. For best performance, you should use a local instance of Analysis Services.

 

 

 

 

Note You must be an administrator for the Analysis Services instance hosting the workspace database.

 

Table Import Wizard

 

You use the Table Import Wizard to import data from one or more data sources. In addition to providing connection information, such as a server name and database name for a relational data source, you must also complete the Impersonation Information page of the Table Import Wizard. Analysis Services uses the credentials you specify on this page to import and process data. For credentials, you can provide a Windows login and password or you can designate the Analysis Services service account.

 

The next step in the Table Import Wizard is to specify how you want to retrieve the data. For

example, if you are using a relational data source, you can select from a list of tables and views or

provide a query. Regardless of the data source you use, you have the option to filter the data before

importing it into the model. One option is to eliminate an entire column by clearing the check box

in the column header. You can also eliminate rows by clicking the arrow to the right of the column

name, and clearing one or more check boxes for a text value, as shown in Figure 9-2.

 

 

 

FIGURE 9-2 Selection of rows to include during import

 

As an alternative, you can create more specific filters by using the Text Filters or Numeric Filters

options, as applicable to the column’s data type. For example, you can create a filter to import only

values equal to a specific value or values containing a designated string.

 

 

 

Note There is no limit to the number of rows you can import for a single table, although

any column in the table can have no more than 2 billion distinct values. However, the

query performance of the model is optimal when you reduce the number of rows as much

as possible.

 

Tabular Model Designer

 

After you import data into the model, the model designer displays the data in the workspace as

shown in Figure 9-3. If you decide that you need to rename columns, you can double-click on the

column name and type a new name. For example, you might add a space between words to make the

column name more user friendly. When you finish typing, press Enter to save the change.

 

 

 

FIGURE 9-3 Model with multiple tabs containing data

 

When you import data from a relational data source, the import process detects the existing relationships and adds them to the model. To view the relationships, switch to Diagram View (shown in Figure 9-4), either by clicking the Diagram button in the bottom right corner of the workspace or by opening the Model menu, pointing to Model View, and selecting Diagram View. When you point to a line connecting two tables, the model designer highlights the related columns in each table.

 

 

 

 

 

FIGURE 9-4 Model in diagram view highlighting columns in a relationship between two tables

 

Relationships

 

You can add new relationships by clicking a column in one table and dragging the cursor to the corresponding column in a second table. Because the model designer automatically detects the primary table and the related lookup table, you do not need to select the tables in a specific order. If you prefer, you can open the Table menu and click Manage Relationships to view all relationships in one dialog box, as shown in Figure 9-5. You can use this dialog box to add a new relationship or to edit or delete an existing relationship.

 

 

 

FIGURE 9-5 Manage Relationships dialog box displaying all relationships in the model

 

Note You can create only one-to-one or one-to-many relationships.

 

You can also create multiple relationships between two tables, but only one relationship at a time

is active. Calculations use the active relationship by default, unless you override this behavior by using

the USERELATIONSHIP() function as we explain in the “DAX” section later in this chapter.
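
To illustrate, here is a minimal DAX sketch of a measure that activates an inactive relationship for a single calculation. The table and column names (Sales[SalesAmount], Sales[ShipDateKey], and 'Date'[DateKey]) are hypothetical and assume a model with two relationships between the Sales and Date tables:

Sales by Ship Date :=
CALCULATE (
    SUM ( Sales[SalesAmount] ),                               // aggregate the base column
    USERELATIONSHIP ( Sales[ShipDateKey], 'Date'[DateKey] )   // use the inactive relationship for this measure only
)

Every other calculation in the model continues to follow the active relationship.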

 

 

 

Calculated Columns

 

A calculated column is a column of data you derive by using a Data Analysis Expressions (DAX) formula. For example, you can concatenate values from two columns into a single column, as shown in Figure 9-6. To create a calculated column, you must switch to Data View. You can either right-click an existing column and then select Insert Column, or you can click Add Column on the Column menu. In the formula bar, type a valid DAX formula, and press Enter. The model designer calculates and displays column values for each row in the table.

 

 

 

FIGURE 9-6 Calculated column values and the corresponding DAX formula
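
As a minimal sketch, the kind of concatenation shown in Figure 9-6 can be expressed with a calculated column formula such as the following; the FirstName and LastName column names are hypothetical:

= [FirstName] & " " & [LastName]    // combines two text columns into a single value for each row

Because a calculated column is evaluated row by row and stored in the model, the result behaves like any imported column in pivot tables and other DAX formulas.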

 

Measures

 

Whereas the tabular model evaluates a calculated column at the row level and stores the result in

the tabular model, it evaluates a measure as an aggregate value within the context of rows, columns,

filters, and slicers for a pivot table. To add a new measure, click any cell in the calculation area, which

then displays as a grid below the table data. Then type a DAX formula in the formula bar, and press

Enter to add a new measure, as shown in Figure 9-7. You can override the default measure name, such

as Measure1, by replacing the name with a new value in the formula bar.

 

 

 

FIGURE 9-7 Calculation area displaying three measures

 

 

 

To create a measure that aggregates only row values, you click the column header and then click the AutoSum button in the toolbar. For example, if you select Count for the ProductKey column, the measure grid displays the new measure with the following formula: Count of ProductKey:=COUNTA([ProductKey]).
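
An explicit measure that you type yourself follows the same name:=formula pattern. The sketch below assumes a hypothetical Sales table with a SalesAmount column:

Total Sales := SUM ( Sales[SalesAmount] )    // evaluated in the filter context of the pivot table, not per row

Unlike a calculated column, the result is not stored in the table; it is computed at query time for each combination of rows, columns, filters, and slicers.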

 

Key Performance Indicators

 

Key performance indicators (KPIs) are a special type of measure you can use to measure progress toward a goal. You start by creating a base measure in the calculation area of a table. Then you right-click the measure and select Create KPI to open the Key Performance Indicator (KPI) dialog box as shown in Figure 9-8. Next you define the measure or absolute value that represents the target value or goal of the KPI. The status thresholds are the boundaries for each level of progress toward the goal, and you can adjust them as needed. Analysis Services compares the base measure to the thresholds to determine which icon to use when displaying the KPI status.

 

 

 

FIGURE 9-8 Key performance indicator definition
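
As an example, a KPI might be built from a pair of measures like the following sketch, in which the Sales table, the SalesAmount column, the Date table, and the 10 percent growth target are all hypothetical:

Total Sales := SUM ( Sales[SalesAmount] )        // base measure for the KPI
Sales Target :=
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    SAMEPERIODLASTYEAR ( 'Date'[Date] )          // prior-year sales for the same period
) * 1.1                                          // goal: 10 percent growth over the prior year

In the Key Performance Indicator (KPI) dialog box, you would then select Sales Target (or an absolute value) as the target and adjust the status thresholds as needed.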

 

Hierarchies

 

A hierarchy is useful for analyzing data at different levels of detail using logical relationships that

allow a user to navigate from one level to the next. In Diagram View, right-click on the column you

want to set as the parent level and select Create Hierarchy, or click the Create Hierarchy button that

appears when you hover the cursor over the column header. Type a name for the hierarchy, and drag

columns to the new hierarchy, as shown in Figure 9-9. You can add only columns from the same table

to the hierarchy. If necessary, create a calculated column that uses the RELATED() function in a DAX

formula to reference a column from a related table in the hierarchy.

 

 

 

 

 

FIGURE 9-9 Hierarchy in the diagram view of a model
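
For instance, if category names live in a related Product Category table, a calculated column like the following sketch (with hypothetical table and column names) copies them into the Product table so that they can participate in a hierarchy there:

= RELATED ( 'Product Category'[CategoryName] )    // looks up the value on the "one" side of the relationship

RELATED() follows an existing many-to-one relationship from the current row, so the Product table must already be related to the Product Category table.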

 

Perspectives

 

When you have many objects in a model, you can create a perspective to display a subset of the

model objects so that users can more easily find the objects they need. Select Perspectives from

the Model menu to view existing perspectives or to add a new perspective, as shown in Figure 9-10.

When you define a perspective, you select tables, columns, measures, KPIs, and hierarchies to include.

 

 

 

FIGURE 9-10 Perspective definition

 

Partitions

 

At a minimum, each table in a tabular model has one partition, but you can divide a table into multiple partitions when you want to manage the reprocessing of each partition separately. For example, you might want to reprocess a partition containing current data frequently but have no need to reprocess a partition containing historical data. To open the Partition Manager, which you use to create and configure partitions, open the Table menu and select Partitions. Click the Query Editor button to view the SQL statement and append a WHERE clause, as shown in Figure 9-11.

 

 

 

 

 

FIGURE 9-11 Addition of a WHERE clause to a SQL statement for partition

 

For example, if you want to create a partition for a month and year, such as March 2004, the

WHERE clause looks like this:

 

WHERE
(([OrderDate] >= N'2004-03-01 00:00:00') AND
([OrderDate] < N'2004-04-01 00:00:00'))

 

After you create all the partitions, you open the table in Data View, open the Model menu, and

point to Process. You then have the option to select Process Partitions to refresh the data in each

partition selectively or select Process Table to refresh the data in all partitions. After you deploy the

model to Analysis Services, you can use scripts to manage the processing of individual partitions.

 

Roles

 

The tabular model is secure by default. You must create Analysis Services database roles and assign

Windows users or groups to a role to grant users access to the model. In addition, you add one of the

following permissions to the role to authorize the actions that the role members can perform:

 

■ None A member cannot use the model in any way.
■ Read A member can query the data only.
■ Read And Process A member can query the data and execute process operations. However, the member can neither view the model database in SQL Server Management Studio (SSMS) nor make changes to the database.
■ Process A member can process the data only, but has no permissions to query the data or view the model database in SSMS.
■ Administrator A member has full permissions to query the data, execute process operations, view the model database in SSMS, and make changes to the model.

 

 

To create a new role, click Roles on the Model menu to open the Role Manager dialog box. Type a name for the role, select the applicable permissions, and add members to the role. If a user belongs to roles having different permissions, Analysis Services combines the permissions and uses the least restrictive permissions wherever it finds a conflict. For example, if one role has None set as the permission and another role has Read permissions, members of the role will have Read permissions.

 

Note As an alternative, you can add roles in SSMS after deploying the model to Analysis

Services.

 

To further refine security for members of roles with Read or Read And Process permissions, you can create row-level filters. Each row filter is a DAX expression that evaluates as TRUE or FALSE and defines the rows in a table that a user can see. For example, you can create a filter using the expression =Category[EnglishProductCategoryName]="Bikes" to allow a role to view data related to Bikes only. If you want to prevent a role from accessing any rows in a table, use the expression =FALSE().
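
Row filters can also call DAX functions. A common pattern, sketched here with a hypothetical Employee table and LoginId column, limits each member of the role to the rows that match the Windows account of the connected user:

= Employee[LoginId] = USERNAME ()    // TRUE only for rows whose LoginId matches the DOMAIN\username of the current user

As with the Bikes example, the expression must evaluate to TRUE or FALSE for every row of the table it secures.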

 

Note Row filters do not work when you deploy a tabular model in DirectQuery mode. We

explain DirectQuery mode later in this chapter.

 

Analyze in Excel

 

Before you deploy the tabular model, you can test the user experience by using the Analyze In Excel

feature. When you use the Analyze In Excel feature, SSDT opens Excel (which must be installed on the

same computer), creates a data-source connection to the model workspace, and adds a pivot table to

the worksheet. When you open this item on the Model menu, the Analyze In Excel dialog box displays

and prompts you to select a user or role to provide the security context for the data-source connection.

You must also choose a perspective, either the default perspective (which includes all model

objects) or a custom perspective, as shown in Figure 9-12.

 

 

 

 

 

FIGURE 9-12 Selection of a security context and perspective to test model in Excel

 

Reporting Properties

 

If you plan to implement Power View (which we explain in Chapter 10, “Reporting Services”), you

can access a set of reporting properties in the Properties window for each table and column. At a

minimum, you can change the Hidden property for the currently selected table or column to control

whether the user sees the object in the report field list. In addition, you can change reporting properties

for a selected table or column to enhance the user experience during the development of Power

View reports.

 

If you select a table in the model designer, you can change the following report properties:

 

■ Default Field Set Select the list of columns and measures that Power View adds to the report canvas when a user selects the current table in the report field list.
■ Default Image Identify the column containing images for each row in the table.
■ Default Label Specify the column containing the display name for each row.
■ Keep Unique Rows Indicate whether duplicate values display as unique values or as a single value.
■ Row Identifier Designate the column that contains values uniquely identifying each row in the table.

 

 

If you select a column in the model designer, you can change the following report properties:

 

■ Default Label Indicate whether the column contains a display name for each row. You can set this property to True for one column only in the table.
■ Image URL Indicate whether the column contains a URL to an image on the Web or on a SharePoint site. Power View uses this indicator to retrieve the file as an image rather than return the URL as text.
■ Row Identifier Indicate whether the column contains unique identifiers for each row. You can set this property to True for one column only in the table.
■ Table Detail Position Set the sequence order of the current column relative to other columns in the default field set.

 

 

DirectQuery Mode

 

When the volume of data for your model is too large to fit into memory or when you want queries to return the most current data, you can enable DirectQuery mode for your tabular model. In DirectQuery mode, Analysis Services responds to client tool queries by retrieving data and aggregates directly from the source database rather than using data stored in the in-memory cache. Although using cache provides faster response times, the time required to refresh the cache continually with current data might be prohibitive when you have a large volume of data. To enable DirectQuery mode, select the Model.bim file in Solution Explorer and then open the Properties window to change the DirectQueryMode property from Off (the default) to On.

 

After you enable DirectQuery mode for your model, some design features are no longer available.

Table 9-3 compares the availability of features in in-memory mode and DirectQuery mode.

 

TABLE 9-3 In-Memory Mode vs. DirectQuery Mode

Data Sources
  In-Memory: Relational database, Analysis Services, Reporting Services report, Azure DataMarket dataset, Data feed, Excel file, Text file
  DirectQuery: SQL Server 2005 or later

Calculations
  In-Memory: Measures, KPIs, Calculated columns
  DirectQuery: Measures, KPIs

DAX
  In-Memory: Fully functional
  DirectQuery: Time intelligence functions invalid; some statistical functions evaluate differently

Security
  In-Memory: Analysis Services roles
  DirectQuery: SQL Server permissions

Client tool support
  In-Memory: SSMS, Power View, Excel
  DirectQuery: SSMS, Power View

 

 

 

 

 

Note For a more in-depth explanation of the impact of switching the tabular model to DirectQuery mode, refer to http://msdn.microsoft.com/en-us/library/hh230898(SQL.110).aspx.

 

 

 

Deployment

 

Before you deploy a tabular model, you configure the target Analysis Services instance, which must be running in tabular mode, and provide a name for the model in the project properties, as shown in Figure 9-13. You must also select one of the following query modes, which you can change later in SSMS if necessary:

 

■ DirectQuery Queries will use the relational data source only.
■ DirectQuery With In-Memory Queries will use the relational data source unless the client uses a connection string that specifies otherwise.
■ In-Memory Queries will use the in-memory cache only.
■ In-Memory With DirectQuery Queries will use the cache unless the client uses a connection string that specifies otherwise.

 

 

Note During development of a model in DirectQuery mode, you import a small amount

of data to use as a sample. The workspace database runs in a hybrid mode that caches

data during development. However, when you deploy the model, Analysis Services uses the

Query Mode value you specify in the deployment properties.

 

 

 

FIGURE 9-13 Project properties for tabular model deployment

 

After configuring the project properties, you deploy the model by using the Build menu or by right-clicking the project in Solution Explorer and selecting Deploy. Following deployment, you can use SSMS to manage partitions, configure security, and perform backup and restore operations for the tabular model database. Users with the appropriate permissions can access the deployed tabular model as a data source for PowerPivot workbooks or for Power View reports.

 

 

 

Multidimensional Model Storage

 

The development process for a multidimensional model follows the same steps you use in SQL Server

2005 and later versions, with one exception. The MOLAP engine now uses a new type of storage for

string data that is more scalable than it was in previous versions of Analysis Services. Specifically, the

restriction to a 4-gigabyte maximum file size no longer exists, but you must configure a dimension to

use the new storage mode.

 

To do this, open the dimension designer in SSDT and select the parent node of the dimension in the Attributes pane. In the Properties window, in the Advanced section, change the StringStoreCompatibilityLevel to 1100. You can also apply this setting to the measure group for a distinct count measure that uses a string as the basis for the distinct count. When you execute a Process Full command, Analysis Services loads data into the new string store. As you add more data to the database, the string storage file continues to grow as large as necessary. However, although the file size limitation is gone, the file can contain only 4 billion unique strings or 4 billion records, whichever occurs first.

 

Server Management

 

SQL Server 2012 includes several features that can help you manage your server. In this release, you can more easily gather information for performance monitoring and diagnostic purposes by capturing events or querying Dynamic Management Views (DMVs). Also, you can configure server properties to support a Non-Uniform Memory Access (NUMA) architecture or more than 64 processors.

 

 

Event Tracing

 

You can now use the SQL Server Extended Events framework, which we introduced in Chapter 5, “Programmability and Beyond-Relational Enhancements,” to capture any Analysis Services event as an alternative to creating traces by using SQL Server Profiler. For example, consider a scenario in which you are troubleshooting query performance on an Analysis Services instance running in multidimensional mode. You can execute an XML For Analysis (XMLA) create object script to enable tracing for specific events, such as Query Subcube, Get Data From Aggregation, Get Data From Cache, and Query End events. There are also some multidimensional events new to SQL Server 2012: Locks Acquired, Locks Released, Locks Waiting, Deadlock, and LockTimeOut. The event-tracing process stores the data it captures in a file and continues storing events until you disable event tracing by executing an XMLA delete object script.

 

There are also events available for you to monitor the other server modes: VertiPaq SE Query

Begin, VertiPaq SE Query End, Direct Query Begin, and Direct Query End events. Furthermore, the

Resource Usage event is also new and applicable to any server mode. You can use it to capture the

size of reads and writes in kilobytes and the amount of CPU usage.

 

 

 

Note More information about event tracing is available at http://msdn.microsoft.com/en-us/library/gg492139(SQL.110).aspx.

 

XML for Analysis Schema Rowsets

 

New schema rowsets are available not only to explore metadata of a tabular model, but also to

monitor the Analysis Services server. You can query the following schema rowsets by using Dynamic

Management Views in SSMS for VertiPaq engine and tabular models:

 

■ DISCOVER_CALC_DEPENDENCY Find dependencies between columns, measures, and formulas.
■ DISCOVER_CSDL_METADATA Retrieve the Conceptual Schema Definition Language (CSDL) for a tabular model. (CSDL is explained in the upcoming “Programmability” section.)
■ DISCOVER_XEVENT_TRACE_DEFINITION Monitor SQL Server Extended Events.
■ DISCOVER_TRACES Use the new column, Type, to filter traces by category.
■ MDSCHEMA_HIERARCHIES Use the new column, Structure_Type, to filter hierarchies by Natural, Unnatural, or Unknown.

 

 

Note You can learn more about the schema rowsets at http://msdn.microsoft.com/en-us/library/ms126221(v=sql.110).aspx and http://msdn.microsoft.com/en-us/library/ms126062(v=sql.110).aspx.

 

Architecture Improvements

 

You can deploy tabular and multidimensional Analysis Services instances on a server with a NUMA

architecture and more than 64 processors. To do this, you must configure the instance properties to

specify the group or groups of processors that the instance uses:

 

■ Thread pools You can assign each process, IO process, query, parsing, and VertiPaq thread pool to a separate processor group.
■ Affinity masks You use the processor group affinity mask to indicate whether the Analysis Services instance should include or exclude a processor in a processor group from Analysis Services operations.
■ Memory allocation You can specify memory ranges to assign to processor groups.

 

 

 

 

Programmability

 

The BI Semantic Model (BISM) schema in SQL Server 2012 is the successor to the Unified Dimensional

Model (UDM) schema introduced in SQL Server 2005. BISM supports both an entity approach using

tables and relationships and a multidimensional approach using hierarchies and aggregations. This

release extends Analysis Management Objects (AMOs) and XMLA to support the management of

BISM models.

 

Note For more information, refer to the “Tabular Models Developer Roadmap” at

http://technet.microsoft.com/en-us/library/gg492113(SQL.110).aspx.

 

As an alternative to the SSMS graphical interface or to custom applications or scripts that you build with AMO or XMLA, you can now use Windows PowerShell. Using cmdlets for Analysis Services, you can navigate objects in a model or query a model. You can also perform administrative functions like restarting the service, configuring members for security roles, performing backup or restore operations, and processing cube dimensions or partitions.

 

Note You can learn more about using PowerShell with Analysis Services at

http://msdn.microsoft.com/en-us/library/hh213141(SQL.110).aspx.

 

Another new programmability feature is the addition of Conceptual Schema Definition Language (CSDL) extensions to present the tabular model definition to a reporting client. Analysis Services sends the model’s entity definitions in XML format in response to a request from a client. In turn, the client uses this information to show the user the fields, aggregations, and measures that are available for reporting and the available options for grouping, sorting, and formatting the data. The extensions added to CSDL to support tabular models include new elements for models, new attributes and extensions for entities, and properties for visualization and navigation.

 

 

Note A reference to the CSDL extensions is available at http://msdn.microsoft.com/en-us/library/hh213142(SQL.110).aspx.

 

 

 

PowerPivot for Excel

 

PowerPivot for Excel is a client application that incorporates SQL Server technology into Excel 2010

as an add-in product. The updated version of PowerPivot for Excel that is available as part of the SQL

Server 2012 release includes several minor enhancements to improve usability that users will appreciate.

Moreover, it includes several major new features to make the PowerPivot model consistent with

the structure of the tabular model.

 

Installation and Upgrade

 

Installation of the PowerPivot for Excel add-in is straightforward, but it does have two prerequisites.

You must first install Excel 2010, and then Visual Studio 2010 Tools for Office Runtime. If you were

previously using SQL Server 2008 R2 PowerPivot for Excel, you must uninstall it because there is no

upgrade option for the add-in. After completing these steps, you can install the SQL Server 2012

PowerPivot for Excel add-in.

 

Note You can download Visual Studio 2010 Tools for Office Runtime at http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=20479 and download the PowerPivot for Excel add-in at http://www.microsoft.com/download/en/details.aspx?id=28150.

 

Usability

 

The usability enhancements in the new add-in make it easier to perform certain tasks during PowerPivot model development. The first noticeable change is in the addition of buttons to the Home tab of the ribbon in the PowerPivot window, shown in Figure 9-14. In the View group, at the far right of the ribbon, the Data View button and the Diagram View button allow you to toggle your view of the model, just as you can do when working with the tabular model in SSDT. There are also buttons to toggle the display of hidden columns and the calculation area.

 

 

 

FIGURE 9-14 Home tab on the ribbon in the PowerPivot window

 

 

 

Another new button on the Home tab is the Sort By Column button, which opens the dialog box

shown in Figure 9-15. You can now control the sorting of data in one column by the related values in

another column in the same table.

 

 

 

FIGURE 9-15 Sort By Column dialog box

 

The Design tab, shown in Figure 9-16, now includes the Freeze and Width buttons in the Columns group to help you manage the interface as you review the data in the PowerPivot model. These buttons are found on the Home tab in the previous release of the PowerPivot for Excel add-in.

 

 

 

FIGURE 9-16 Design tab on the ribbon in the PowerPivot window

 

Notice also the new Mark As Date Table button on the Design tab. Open the table containing dates, and click this button. A dialog box displays to prompt you for a column in the table that contains unique datetime values. Then you can create DAX expressions that use time intelligence functions and get correct results without performing all the steps necessary in the previous version of PowerPivot for Excel. You can select an advanced date filter as a row or column filter after you add a date column to a pivot table, as shown in Figure 9-17.

 

 

 

 

 

FIGURE 9-17 Advanced filters available for use with date fields
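
For example, after marking the date table, time intelligence functions can be used in measures such as the following sketch, where the Sales and Date table and column names are hypothetical:

Sales YTD := TOTALYTD ( SUM ( Sales[SalesAmount] ), 'Date'[Date] )    // running total from the start of the year

Sales Prior Year :=
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    SAMEPERIODLASTYEAR ( 'Date'[Date] )                               // same dates, shifted back one year
)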

 

The Advanced tab of the ribbon does not display by default. You must click the File button in the top left corner above the ribbon and select the Switch To Advanced Mode command to display the new tab, shown in Figure 9-18. You use this tab to add or delete perspectives, toggle the display of implicit measures, add measures that use aggregate functions, and set reporting properties. With the exception of Show Implicit Measures, all these tasks are similar to the corresponding tasks in tabular model development. The Show Implicit Measures button toggles the display of measures that PowerPivot for Excel creates automatically when you add a numeric column to a pivot table’s values in the Excel workbook.

 

 

 

FIGURE 9-18 Advanced tab on the ribbon in the PowerPivot window

 

 

 

Not only can you add numeric columns as a pivot table value for aggregation, you can now also

add numeric columns to rows or to columns as distinct values. For example, you can place the [Sales

Amount] column both as a row and as a value to see the sum of sales for each amount, as shown in

Figure 9-19.

 

 

 

FIGURE 9-19 Use of numeric value on rows

 

In addition, you will find the following helpful enhancements in this release:

 

■ In the PowerPivot window, you can configure the data type for a calculated column.
■ In the PowerPivot window, you can right-click a column and select Hide From Client Tools to prevent the user from accessing the field. The column remains in the Data View in the PowerPivot window with a gray background, but you can toggle its visibility in the Data View by clicking the Show Hidden button on the Home tab of the ribbon.
■ In the Excel window, the number format you specify for a measure persists.
■ You can right-click a numeric value in the Excel window, and then select the Show Details command on the context menu to open a separate worksheet listing the individual rows that comprise the selected value. This feature does not work for calculated values other than simple aggregates such as sum or count.
■ In the PowerPivot window, you can add a description to a table, a measure, a KPI label, a KPI value, a KPI status, or a KPI target. The description displays as a tooltip in the PowerPivot Field List in the Excel window.
■ The PowerPivot Field List now displays hierarchies at the top of the field list for each table, and then displays all other fields in the table in alphabetical order.

 

 

Model Enhancements

 

If you have experience with PowerPivot prior to the release of SQL Server 2012 and then create your

first tabular model in SSDT, you notice that many steps in the modeling process are similar. Now as

you review the updated PowerPivot for Excel features, you should notice that the features of the

tabular model that were different from the previous version of PowerPivot for Excel are no longer

distinguishing features. Specifically, the following design features are now available in the PowerPivot

model:

 

■ Table relationships
■ Hierarchies

 

 

 

 

Note You can learn more about these functions at http://technet.microsoft.com/en-us/library/ee634822(SQL.110).aspx.

 

PowerPivot for SharePoint

 

The key changes to PowerPivot for SharePoint in SQL Server 2012 provide you with a more straightforward installation and configuration process and a wider range of tools for managing the server environment. These changes should help you get your server up and running quickly and keep it running smoothly as usage increases over time.

 

Installation and Configuration

 

PowerPivot for SharePoint has many dependencies within the SharePoint Server 2010 farm that can be challenging to install and configure correctly. Before you install PowerPivot for SharePoint, you must install SharePoint Server 2010 and SharePoint Server 2010 Service Pack 1, although it is not necessary to run the SharePoint Configuration Wizard. You might find it easier to install the PowerPivot for SharePoint instance and then use the PowerPivot Configuration Tool to complete the configuration of both the SharePoint farm and PowerPivot for SharePoint as one process.

 

Note For more details about the PowerPivot for SharePoint installation process, refer to http://msdn.microsoft.com/en-us/library/ee210708(v=sql.110).aspx. Instructions for using the PowerPivot Configuration Tool (or for using SharePoint Central Administration or PowerShell cmdlets instead) are available at http://msdn.microsoft.com/en-us/library/ee210609(v=sql.110).aspx.

 

Note You can now perform all configuration tasks by using a PowerShell script containing SharePoint PowerShell cmdlets and PowerPivot cmdlets. To learn more, see http://msdn.microsoft.com/en-us/library/hh213341(SQL.110).aspx.
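For a quick check that the configuration succeeded, you can list the relevant service components from the SharePoint 2010 Management Shell. The following is a minimal sketch that uses only general SharePoint cmdlets; the wildcard filter on the type name is an assumption, and the output varies by farm.

    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    # List all service applications; the PowerPivot (Analysis Services) entry should appear here
    Get-SPServiceApplication | Select-Object DisplayName, TypeName

    # Confirm the Analysis Services (PowerPivot) service instance is online on the server
    Get-SPServiceInstance | Where-Object { $_.TypeName -like "*Analysis Services*" } | Select-Object TypeName, Status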

 

Management

 

To keep PowerPivot for SharePoint running optimally, you must frequently monitor its use of resources and ensure that the server can continue to support its workload. The SQL Server 2012 release extends the management tools that you have at your disposal to manage disk-space usage, to identify potential problems with server health before users are adversely impacted, and to address ongoing data-refresh failures.

 

 

 

Disk Space Usage

 

When you deploy a workbook to PowerPivot for SharePoint, PowerPivot for SharePoint stores the workbook in a SharePoint content database. Then when a user later requests that workbook, PowerPivot for SharePoint caches it as a PowerPivot database on the server’s disk at \Program Files\Microsoft SQL Server\MSAS11.PowerPivot\OLAP\Backup\Sandboxes\<serviceApplicationName> and then loads the workbook into memory. When no one accesses the workbook for the specified period of time, PowerPivot for SharePoint removes the workbook from memory, but it leaves the workbook on disk so that it reloads into memory faster if someone requests it.

 

PowerPivot for SharePoint continues caching workbooks until it consumes all available disk space

unless you specify limits. The PowerPivot System Service runs a job on a periodic basis to remove

workbooks from cache if they have not been used recently or if a new version of the workbook exists

in the content database. In SQL Server 2012, you can configure the amount of total space that

PowerPivot for SharePoint can use for caching and how much data to delete when the total space is

used. You configure these settings at the server level by opening SharePoint Central Administration,

navigating to Application Management, selecting Manage Services On Server, and selecting SQL

Server Analysis Services. Here you can set the following two properties:

 

■ Maximum Disk Space For Cached Files The default value of 0 instructs Analysis Services that it can use all available disk space, but you can provide a specific value in gigabytes to establish a maximum.
■ Set A Last Access Time (In Hours) To Aid In Cache Reduction When the workbooks in cache exceed the maximum disk space, Analysis Services uses this setting to determine which workbooks to delete from cache. The default value is 4, in which case Analysis Services removes all workbooks that have been inactive for 4 hours or more.
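Before you tighten these limits, it can help to know how much disk the cache actually occupies. The following PowerShell sketch simply totals the size of the Sandboxes folder; the path shown is the default location mentioned above and is an assumption, so adjust it for your instance and service application.

    # Default cache location noted earlier; adjust the instance and service application folder for your server
    $cachePath = "C:\Program Files\Microsoft SQL Server\MSAS11.PowerPivot\OLAP\Backup\Sandboxes"

    # Total size of the cached PowerPivot databases, in gigabytes
    $files = Get-ChildItem -Path $cachePath -Recurse | Where-Object { -not $_.PSIsContainer }
    $sizeGB = ($files | Measure-Object -Property Length -Sum).Sum / 1GB
    "Cached PowerPivot data: {0:N2} GB" -f $sizeGB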

 

 

As another option for managing the cache, you can go to Manage Service Applications (also in Application Management) and select Default PowerPivot Service Application. On the PowerPivot Management Dashboard page, click the Configure Service Application Settings link in the Actions section, and modify the following properties as necessary:

 

■ Keep Inactive Database In Memory (In Hours) By default, Analysis Services keeps a workbook in memory for 48 hours following the last query. If a workbook is frequently accessed by users, Analysis Services never releases it from memory. You can decrease this value if necessary.
■ Keep Inactive Database In Cache (In Hours) After Analysis Services releases a workbook from memory, the workbook persists in cache and consumes disk space for 120 hours, the default time span, unless you reduce this value.

 

 

 

 

Server Health Rules

 

The key to managing server health is identifying potential threats before problems occur. In this release, you can customize server health rules to alert you to issues with resource consumption or server availability. The current status of these rules is visible in Central Administration when you open Monitoring and select Review Problems And Solutions.

 

To configure rules at the server level, go to Application Management, select Manage Services On Server, and then select the SQL Server Analysis Services link. Review the default values for the following health rules, and change the settings as necessary:

 

■ Insufficient CPU Resource Allocation Triggers a warning when the CPU utilization of msmdsrv.exe remains at or over a specified percentage during the data-collection interval. The default is 80 percent.
■ Insufficient CPU Resources On The System Triggers a warning when the CPU usage of the server remains at or above a specified percentage during the data-collection interval. The default is 90 percent.
■ Insufficient Memory Threshold Triggers a memory warning when the available memory falls below the specified value as a percentage of memory allocated to Analysis Services. The default is 5 percent.
■ Maximum Number of Connections Triggers a warning when the number of connections exceeds the specified number. The default is 100, which is an arbitrary number unrelated to the capacity of your server and requires adjustment.
■ Insufficient Disk Space Triggers a warning when the percentage of available disk space on the drive on which the backup folder resides falls below the specified value. The default is 5 percent.
■ Data Collection Interval Defines the period of time during which calculations for server-level health rules apply. The default is 4 hours.

 

 

To configure rules at the service-application level, go to Application Management, select Manage

Service Applications, and select Default PowerPivot Service Application. Next, select Configure Service

Application Settings in the Actions list. You can then review and adjust the values for the following

health rules:

 

■ Load To Connection Ratio Triggers a warning when the number of load events relative to the number of connection events exceeds the specified value. The default is 20 percent. When this number is too high, the server might be unloading databases too quickly from memory or from the cache.

 

 

 

 

■ Data Collection Interval Defines the period of time during which calculations for service-application-level health rules apply. The default is 4 hours.
■ Check For Updates To PowerPivot Management Dashboard.xlsx Triggers a warning when the PowerPivot Management Dashboard.xlsx file fails to change during the specified number of days. The default is 5. Under normal conditions, the PowerPivot Management Dashboard.xlsx file refreshes daily.

 

 

Data Refresh Configuration

 

Because data-refresh operations consume server resources, you should allow data refresh to occur only for active workbooks and when the data refresh consistently completes successfully. You now have the option to configure the service application to deactivate the data-refresh schedule for a workbook if either of these conditions is no longer true. To configure the data-refresh options, go to Application Management, select Manage Service Applications, and select Default PowerPivot Service Application. Next, select Configure Service Application Settings in the Actions list and adjust the following settings as necessary:

 

■ Disable Data Refresh Due To Consecutive Failures If the data refresh fails a consecutive number of times, the PowerPivot service application deactivates the data-refresh schedule for a workbook. The default is 10, but you can set the value to 0 to prevent deactivation.
■ Disable Data Refresh For Inactive Workbooks If no one queries a workbook during the time required to execute the specified number of data-refresh cycles, the PowerPivot service application deactivates the workbook. The default is 10, but you can set the value to 0 if you prefer to keep the data-refresh operation active.

 

 

 

 


 

CHAPTER 10

 

Reporting Services

 

Each release of Reporting Services since its introduction in Microsoft SQL Server 2000 has expanded its feature base to improve your options for sharing reports, visualizing data, and empowering users with self-service options. SQL Server 2012 Reporting Services is no exception, although almost all the improvements affect only SharePoint integrated mode. The exceptions are the two new renderers available in both native mode and SharePoint integrated mode. Reporting Services in SharePoint integrated mode has a completely new architecture, which you now configure as a SharePoint shared service application. For expanded data visualization and self-service capabilities in SharePoint integrated mode, you can use the new ad hoc reporting tool, Power View. Another self-service feature available only in SharePoint integrated mode is data alerts, which allow you to receive an email when report data meets conditions you specify. If you have yet to try SharePoint integrated mode, these features will surely entice you to begin!

 

New Renderers

 

Although rendering a report as a Microsoft Excel workbook has always been available in Reporting

Services, the ability to render a report as a Microsoft Word document has been possible only since

SQL Server 2008. Regardless, these renderers produce XLS and DOC file formats respectively, which

allows compatibility with Excel 2003 and Word 2003. Users of Excel 2010 and Word 2010 can open

these older file formats, of course, but they can now enjoy some additional benefits by using the new

renderers.

 

Excel 2010 Renderer

 

By default, the Excel rendering option now produces an XLSX file in Open Office XML format, which you can open in either Excel 2007 or Excel 2010 if you have the client installed on your computer. The benefit of the new file type is the higher number of maximum rows and columns per worksheet that the later versions of Excel support: 1,048,576 rows and 16,384 columns. You can also export reports with a wider range of colors, because the XLSX format supports 16 million colors in the 24-bit color spectrum. Last, the new renderer uses compression to produce a smaller file size for the exported report.
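If you want to produce the new file format from a script rather than from the Export menu, URL access on a native-mode report server accepts a rendering-format parameter. This is a sketch under stated assumptions: the server URL, report path, and output file are placeholders, and it assumes the default rendering-extension names EXCELOPENXML (XLSX) and WORDOPENXML (DOCX).

    # Placeholders for a native-mode report server and a deployed report
    $serverUrl = "http://MyReportServer/ReportServer"
    $reportPath = "/Sales/SalesSummary"

    # Request the report rendered by the new Excel renderer (use WORDOPENXML for the new Word renderer)
    $exportUrl = "$serverUrl?$reportPath&rs:Format=EXCELOPENXML"
    $client = New-Object System.Net.WebClient
    $client.UseDefaultCredentials = $true
    $client.DownloadFile($exportUrl, "C:\Temp\SalesSummary.xlsx")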

 

 

 


 

Tip If you have only Excel 2003 installed, you can open this file type if you install the Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint, which you can download at http://office.microsoft.com/en-us/products/microsoft-office-compatibility-pack-for-word-excel-and-powerpoint-HA010168676.aspx. As an alternate solution, you can enable the Excel 2003 renderer in the RsReportServer.config and RsReportDesigner.config files by following the instructions at http://msdn.microsoft.com/en-us/library/dd255234(SQL.110).aspx#AvailabilityExcel.

 

Word 2010 Renderer

 

Although the ability to render a report as a DOCX file in Open Office XML format does not offer as many benefits as the new Excel renderer, the new Word renderer does use compression to generate a smaller file than the Word 2003 renderer. You can also create reports that use new features in Word 2007 or Word 2010.

 

Tip Just as with the Excel renderer, you can use the new Word renderer when you have Word 2003 on your computer if you install the Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint, available for download at http://office.microsoft.com/en-us/products/microsoft-office-compatibility-pack-for-word-excel-and-powerpoint-HA010168676.aspx. If you prefer, you can enable the Word 2003 renderer in the RsReportServer.config and RsReportDesigner.config files by following the instructions at http://msdn.microsoft.com/en-us/library/dd283105(SQL.110).aspx#AvailabilityWord.

 

SharePoint Shared Service Architecture

 

In previous versions of Reporting Services, installing and configuring both Reporting Services and SharePoint components required many steps and different tools to complete the task because SharePoint integrated mode depended on features from two separate services. For the current release, the development team has completely redesigned the architecture for better performance, scalability, and administration. With the improved architecture, you also experience an easier configuration process.

 

Feature Support by SharePoint Edition

 

General Reporting Services features are supported in all editions of SharePoint. That is, all features

that were available in previous versions of Reporting Services continue to be available in all editions.

However, the new features in this release are available only in SharePoint Enterprise Edition.

Table 10-1 shows the supported features by SharePoint edition.

 

 

 


 

TABLE 10-1 SharePoint Edition Feature Support

Reporting Services Feature                | SharePoint Foundation 2010 | SharePoint Server 2010 Standard Edition | SharePoint Server 2010 Enterprise Edition
General report viewing and subscriptions  | Yes                        | Yes                                     | Yes
Data Alerts                               | No                         | No                                      | Yes
Power View                                | No                         | No                                      | Yes

 

 

 

 

 

Shared Service Architecture Benefits

 

With Reporting Services available as a shared service application, you can experience the following

new benefits:

 

■ Scale Reporting Services across web applications and across your SharePoint Server 2010 farms with fewer resources than possible in previous versions.
■ Use claims-based authentication to control access to Reporting Services reports.
■ Rely on SharePoint backup and recovery processes for Reporting Services content.

 

 

Service Application Configuration

 

After installing the Reporting Services components, you must create and then configure the service

application. You no longer configure the Reporting Services settings by using the Reporting Services

Configuration Manager. Instead, you use the graphical interface in SharePoint Central Administration

or use SharePoint PowerShell cmdlets.
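As a rough illustration of the cmdlet route, the following sketch creates the service application and its proxy after the Reporting Services SharePoint components are installed. The application pool name, service account, database server, and application name are placeholders, and the exact parameters may differ in your environment.

    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    # Register the Reporting Services service and its proxy with the farm
    Install-SPRSService
    Install-SPRSServiceProxy

    # Create an application pool, the service application, and its proxy (names and account are placeholders)
    $appPool = New-SPServiceApplicationPool -Name "ReportingServicesAppPool" -Account "CONTOSO\svc-ssrs"
    $rsApp = New-SPRSServiceApplication -Name "Reporting Services" -ApplicationPool $appPool -DatabaseServer "SQL01" -DatabaseName "ReportingService"
    New-SPRSServiceApplicationProxy -Name "Reporting Services Proxy" -ServiceApplication $rsApp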

 

When you create the service application, you specify the application pool identity under which

Reporting Services runs. Because the SharePoint Shared Service Application pool now hosts Reporting

Services, you no longer see a Windows service for a SharePoint integrated-mode report server in the

service management console. You also create three report server databases—one for storing server

and catalog data; another for storing cached data sets, cached reports, and other temporary data;

and a third one for data-alert management. The database names include a unique identifier for the

service application, enabling you to create multiple service applications for Reporting Services in the

same SharePoint farm.

 

As you might expect, most configuration settings for the Reporting Services service application correspond to settings you find in the Reporting Services Configuration Manager or the server properties you set in

SQL Server Management Studio when working with a native-mode report server. However, before

you can use subscriptions or data alerts, you must configure SQL Server Agent permissions correctly.

One way to do this is to open SharePoint Central Administration, navigate to Application Management,

access Manage Service Applications, click the link for the Reporting Services service application,

and then open the Provision Subscriptions And Alerts page. On that page, you can provide the

credentials if your SharePoint administrator credentials have db_owner permissions on the Reporting

Services databases.

If you prefer, you can download a Transact-SQL script from the same page, or run a PowerShell cmdlet to build the same Transact-SQL script, which you can later execute in SQL Server Management Studio.

 

Whether you use the interface or the script, the provisioning process creates the RSExec role if

necessary, creates a login for the application pool identity, and assigns it to the RSExec role in each

of the three report server databases. If any scheduled jobs, such as subscriptions or alerts, exist in the

report server database, the script assigns the application pool identity as the owner of those jobs. In

addition, the script assigns the login to the SQLAgentUserRole and grants it the necessary permissions

for this login to interact with SQL Server Agent and to administer jobs.
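Before or after provisioning, two quick checks can save troubleshooting time: confirm that SQL Server Agent is running on the instance hosting the report server databases, and identify the account behind the service application pool, because that is the login the provisioning step creates and grants permissions. This sketch assumes a default SQL Server instance on the local server; for a named or remote instance, adjust the service name or add -ComputerName.

    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    # SQL Server Agent must be running for subscriptions and data alerts (default instance service name assumed)
    Get-Service -Name SQLSERVERAGENT

    # Show the process account behind each service application pool
    Get-SPServiceApplicationPool | Select-Object Name, ProcessAccountName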

 

Note For detailed information about installing and configuring Reporting Services in SharePoint integrated mode, refer to http://msdn.microsoft.com/en-us/library/cc281311(SQL.110).aspx.

 

Power View

 

Power View is the latest self-service feature available in Reporting Services. It is a browser-based

Silverlight application that requires Reporting Services to run in SharePoint integrated mode using

SharePoint Server 2010 Enterprise Edition. It also requires a specific type of data source—either a

tabular model that you deploy to an Analysis Services server or a PowerPivot workbook that you

deploy to a SharePoint document library.

 

Rather than working in design mode and then previewing the report, as you do when using Report

Designer or Report Builder, you work directly with the data in the presentation layout of Power View.

You start with a tabular view of the data that you can change into various data visualizations. As you

explore and examine the data, you can fine-tune and adjust the layout by modifying the sort order,

adding more views to the report, highlighting values, and applying filters. When you finish, you can

save the report in the new RDLX file format to a SharePoint document library or PowerPivot Gallery or

export the report to Microsoft PowerPoint to make it available to others.

 

Note To use Power View, you can install Reporting Services in SharePoint integrated

mode using any of the following editions: Evaluation, Developer, Business Intelligence,

or Enterprise. The browser you use depends on your computer’s operating system. You

can use Internet Explorer 8 and later when using a Windows operating system or Safari 5

when using a Mac operating system. Windows Vista and Windows Server 2008 both support

Internet Explorer 7. Windows Vista, Windows 7, and Windows Server 2008 support

Firefox 7.

 

 

 

Data Sources

 

Just as for any report you create using the Report Designer in SQL Server Data Tools or using Report

Builder, you must have a data source available for use with your Power View report. You can use any

of the following data source types:

 

■ PowerPivot workbook Select a PowerPivot workbook directly from the PowerPivot Gallery as a source for your Power View report.
■ Shared data source Create a Reporting Services Shared Data Source (RSDS) file with the Data Source Type property set to Microsoft BI Semantic Model For Power View. You then define a connection string that references a PowerPivot workbook (such as http://SharePointServer/PowerPivot Gallery/myWorkbook.xlsx) or an Analysis Services tabular model (such as Data Source=MyAnalysisServer; Initial Catalog=MyTabularModel). (We introduced tabular models in Chapter 9, “Analysis Services and PowerPivot.”) You use this type of shared data source only for the creation of Power View reports.

 

 

Note If your web application uses claims forms-based authentication, you must configure the shared data source to use stored credentials and specify a Windows login for the stored credentials. If the web application uses Windows classic or Windows claims authentication and your RSDS file references an Analysis Services instance outside the SharePoint farm, you must configure Kerberos authentication and use integrated security.

 

■ Business Intelligence Semantic Model (BISM) connection file Create a BISM file to connect to either a PowerPivot workbook or to an Analysis Services tabular model. You can use this file as a data source for both Power View reports and Excel workbooks.

 

 

Note If your web application uses Windows classic or Windows claims authentication and your BISM file references an Analysis Services instance outside the SharePoint farm, you must configure Kerberos authentication and use integrated security.

 

BI Semantic Model Connection Files

 

Whether you want to connect to an Analysis Services tabular database or to a PowerPivot

workbook deployed to a SharePoint server, you can use a BISM connection file as a data source

for your Power View reports and even for Excel workbooks. To do this, you must first add the BI

Semantic Model Connection content type to a document library in a PowerPivot site.

 

To create a BISM file for a PowerPivot workbook, you open the document library in which you want to store the file and for which you have Contribute permission. Click New Document on the Documents tab of the SharePoint ribbon, and select BI Semantic Model Connection on the menu. You then configure the properties of the connection according to the data source, as follows:

 

■ PowerPivot workbook Provide the name of the file, and set the Workbook URL Or Server Name property as the SharePoint URL for the workbook, such as http://SharePointServer/PowerPivot Gallery/myWorkbook.xlsx.
■ Analysis Services tabular database Provide the name of the file; set the Workbook URL Or Server Name property using the server name, the fully qualified domain name, the Internet Protocol (IP) address, or the server and instance name (such as MyAnalysisServer\MyInstance); and set the Database property using the name of the tabular model on the server.

 

 

You must grant users the Read permission on the file to enable them to open the file and use

it as a data source. If the data source is a PowerPivot workbook, users must also have Read permission

on the workbook. If the data source is a tabular model, the shared service requesting

the tabular model data must have administrator permission on the tabular instance and users

must be in a role with Read permission for the tabular model.

 

Power View Design Environment

 

You can open the Power View design environment for a new report in one of the following ways:

 

■ From a PowerPivot workbook Click the Create Power View Report icon in the upper-right corner of a PowerPivot workbook that displays in the PowerPivot Gallery, as shown in Figure 10-1.
■ From a data connection library Click the down arrow next to the BISM or RSDS file, and then click Create Power View Report.

 

 

 

 

FIGURE 10-1 Create Power View Report icon in PowerPivot Gallery

 

In the Power View design environment that displays when you first create a report, a blank view

workspace appears in the center, as shown in Figure 10-2. As you develop the report, you add tables

and data visualizations to the view workspace. The view size is a fixed height and width, just like the

slide size in PowerPoint. If you need more space to display data, you can add more views to the report

and then navigate between views by using the Views pane.

 

Above the view workspace is a ribbon that initially displays only the Home tab. As you add fields

to the report, the Design and Layout tabs appear. The contents of each tab on the ribbon change

dynamically to show only the buttons and menus applicable to the currently selected item.

 

 

 


 

FIGURE 10-2 Power View design environment

 

The field list from the tabular model appears in the upper-right corner. When you open the design

environment for a new report, you see only the table names of the model. You can expand a table

name to see the fields it contains, as shown in Figure 10-3. Individual fields, such as Category, display

in the field list without an icon. You also see row label fields with a gray-and-white icon, such as

Drawing. The row label fields identify a column configured with report properties in the model, as we

explain in Chapter 9. Calculated columns, such as Attendees, display with a sigma icon, and measures,

such as Quantity Served YTD, appear with a calculator icon.

 

You select the check box next to a field to include that field in your report, or you can drag the

field into the report. You can also double-click a table name in the field list to add the default field

set, as defined in the model, to the current view.

 

Tip The examples in this chapter use the sample files available for download at

http://www.microsoft.com/download/en/details.aspx?id=26718 and

http://www.microsoft.com/download/en/details.aspx?id=26719. You can use these files

with the tutorial at http://social.technet.microsoft.com/wiki/contents/articles/6175.aspx for

a hands-on experience with Power View.

 

 

 


 

FIGURE 10-3 Tabular model field list

 

You click a field to add it to the view as a single column table, as shown in Figure 10-4. You must

select the table before you select additional fields, or else selection of the new field starts a new table.

As you add fields, Power View formats the field according to data type.

 

 

 

FIGURE 10-4 Single-column table

 

Tip Power View retrieves only the data it can display in the current view and retrieves additional rows as the user scrolls to see more data. That way, Power View can optimize performance even when the source table contains millions of rows.

 

The field you select appears below the field list in the layout section. The contents of the layout

section change according to the current visualization. You can use the layout section to rearrange the

sequence of fields or to change the behavior of a field, such as changing which aggregation function

to use or whether to display rows with no data.

 

 

 

After you add fields to a table, you can move it and resize it as needed. To move an item in the

view, point to its border and, when the cursor changes to a hand, drag it to the desired location. To

resize it, point again to the border, and when the double-headed arrow appears, drag the border to

make the item smaller or larger.

 

To share a report when you complete the design, you save it by using the File menu. You can save

the file only if you have the Add Items permission on the destination folder or the Edit Items permission

to overwrite an existing file. When you save a file, if you keep the default option set to Save

Preview Images With Report, the views of your report display in the PowerPivot gallery. You should

disable this option if the information in your report is confidential. The report saves as an RDLX file,

which is not compatible with the other Reporting Services design environments.

 

Data Visualization

 

A table is only one way to explore data in Power View. After you add several fields to a table, you can

then convert it to a matrix, a chart, or a card. If you convert it to a scatter chart, you can add a play

axis to visualize changes in the data over multiple time periods. You can also create multiples of the

same chart to break down its data by different categorizations.

 

Charts

 

When you add a measure to a table, the Design tab includes icons for chart visualizations, such as column, bar, line, or scatter charts. After you select the icon for a chart type, the chart replaces the table. You can then resize the visualization to improve legibility, as shown in Figure 10-5. You can continue to add more fields to a visualization or remove fields by using the respective check boxes in the field list. On the Layout tab of the ribbon, you can use buttons to add a chart title, position the legend when your chart contains multiple series, or enable data labels.

 

 

 

FIGURE 10-5 Column chart

 

Tip You can use an existing visualization as the starting point for a new visualization by

selecting it and then using the Copy and Paste buttons on the Home tab of the ribbon. You

can paste the visualization to the same view or to a new view. However, you cannot use

shortcut keys to perform the copy and paste.

 

 

 

Arrangement

 

You can overlap and inset items, as shown in Figure 10-6. You use the Arrange buttons on the Home tab of the ribbon to bring an item forward or send it back. When you want to view an item in isolation, you can click the button in its upper-right corner to fill the entire view with the selected visualization.

 

 

 

FIGURE 10-6 Overlapping visualizations

 

Cards and Tiles

 

Another type of visualization you can use is cards, which is a scrollable list of grouped fields arranged

in a card format, as shown in Figure 10-7. Notice that the default label and default image fields are

more prominent than the other fields. The size of the card changes dynamically as you add or remove

fields until you resize using the handles, after which the size remains fixed. You can double-click the

sizing handle on the border of the card container to revert to auto-sizing.

 

 

 

FIGURE 10-7 Card visualization

 

 

 

You can change the sequence of fields by rearranging them in the layout section. You can also move a field to the Tile By area to add a container above the cards that displays one tile for each value in the selected field, as shown in Figure 10-8. When you select a tile, Power View filters the collection of cards to display those having the same value as the selected tile.

 

 

 

FIGURE 10-8 Tiles

 

The default layout of the tiles is tab-strip mode. You can toggle between tab-strip mode and

cover-flow mode by using the respective button in the Tile Visualizations group on the Design tab

of the ribbon. In cover-flow mode, the label or image appears below the container and the current

selection appears in the center of the strip and slightly larger than the other tile items.

 

Tip You can also convert a table or matrix directly to a tile container by using the Tiles button on the Design tab of the ribbon. Depending on the model design, Power View displays the value of the first field in the table, the first row group value, or the default field set. All other fields that were in the table or matrix display as a table inside the tile container.

 

 

Play Axis

 

When your table or chart contains two measures, you can convert it to a scatter chart to show one measure on the horizontal axis and the second measure on the vertical axis. Another option is to use a third measure to represent size in a bubble chart that you define in the layout section, as shown in Figure 10-9. With either chart type, you can also add a field to the Color section for grouping purposes. Yet one more option is to add a field from a date table to the Play Axis area of the layout section.

 

 

 

 

 

 

FIGURE 10-9 Layout section for a bubble chart

 

With a play axis in place, you can click the play button to display the visualization in sequence for

each time period that appears on the play axis or you can use the slider on the play axis to select a

specific time period to display. A watermark appears in the background of the visualization to indicate

the current time period. When you click a bubble or point in the chart, you filter the visualization to

focus on the selection and see the path that the selection follows over time, as shown in Figure 10-10.

You can also point to a bubble in the path to see its values display as a tooltip.

 

 

 

FIGURE 10-10 Bubble chart with a play axis

 

 

 

Multiples

 

Another way to view data is to break a chart into multiple copies of the same chart. You place the

field for which you want to create separate charts in the Vertical Multiples or Horizontal Multiples

area of the layout section. Then you use the Grid button on the Layout tab of the ribbon to select

the number of tiles across and down that you want to include, such as three tiles across and two tiles

down, as shown in Figure 10-11. If the visualization contains more tiles than the grid can show, a scroll

bar appears to allow you to access the other tiles. Power View aligns the horizontal and vertical axes

in the charts to facilitate comparisons between charts.

 

 

 

FIGURE 10-11 Vertical multiples of a line chart

 

Sort Order

 

Above some types of visualizations, the sort field and direction display. To change the sort field, click on it to view a list of available fields, as shown in Figure 10-12. You can also click the abbreviated direction label to reverse the sort. That is, click Asc to change an ascending sort to a descending sort, which then displays as Desc in the sort label.

 

 

 

FIGURE 10-12 Sort field selection

 

Multiple Views

 

You might find it easier to explore your data when you arrange visualizations as separate views. All views in your report must use the same tabular model as their source, but otherwise each view can contain separate visualizations. Furthermore, the filters you define for a view, as we explain later in this chapter, apply only to that view. Use the New View button on the Home tab of the toolbar to create a new blank view or to duplicate the currently selected view, as shown in Figure 10-13.

 

 

 

 

 

FIGURE 10-13 Addition of a view to a Power View report

 

Highlighted Values

 

To help you better see relationships, you can select a value in your chart, such as a column or a legend item. Power View then highlights related values in the chart. You can even select a value in one chart, such as Breads, to highlight values in other charts in the same view that are related to Breads, as shown in Figure 10-14. You clear the highlighting by clicking in the chart area without clicking another bar.

 

 

 

FIGURE 10-14 Highlighted values

 

 

 

Filters

 

Power View provides you with several ways to filter the data in your report. You can add a slicer or tile container to a view to incorporate the filter selection into the body of the report. As an alternative, you can add a view filter or a visualization filter to the Filters area of the design environment for greater flexibility in the value selections for your filters.

 

Slicer

 

When you create a single-column table in a view, you can click the Slicer button on the Design tab

of the ribbon to use the values in the table as a filter. When you select one or more labels in a slicer,

Power View not only filters all visualizations in the view, but also other slicers in the view, as shown in

Figure 10-15. You can restore the unfiltered view by clicking the Clear Filter icon in the upper-right

corner of the slicer.

 

 

 

FIGURE 10-15 Slicers in a view

 

Tile Container

 

If you add a visualization inside a tile container, as shown in Figure 10-16, the tile selection filters both the cards and the visualizations. However, the filtering behavior is limited only to the tile selection. Other visualizations or slicers in the same view remain unchanged.

 

 

 

FIGURE 10-16 Tile container to filter a visualization

 

 

 

When you add a visualization to a tile container, Power View does not synchronize the horizontal and vertical axes for the visualizations as it does with multiples. In other words, when you select one tile, the range of values for the vertical axis might be greater than the range of values on the same axis for a different tile. You can use the buttons in the Synchronize group of the Layout tab to synchronize axes, series, or bubbles across the tiles to more easily compare the values in the chart from one tile to another.

 

View Filter

 

A view filter allows you to define filter criteria for the current view without requiring you to include

it in the body of the report like a slicer. Each view in your report can have its own set of filters. If you

duplicate a view containing filters, the new view contains a copy of the first view’s filters. That is,

changing filter values for one view has no effect on another view, even if the filters are identical.

 

You define a view filter in the Filters area of the view workspace, which by default is not visible

when you create a new report. To toggle the visibility of this part of the workspace, click the Filters

Area button on the Home tab of the ribbon. After making the Filters area visible, you can collapse it

when you need to increase the size of your view by clicking the left arrow that appears at the top of

the Filters area.

 

To add a new filter in the default basic mode, drag a field to the Filters area and then select a

value. If the field contains string or date data types, you select one or more values by using a check

box. If the field contains a numeric data type, you use a slider to set the range of values. You can see

an example of filters for fields with numeric and string data types in Figure 10-17.

 

 

 

FIGURE 10-17 Filter selection of numeric and string data types in basic filter mode

 

To use more flexible filter criteria, you can switch to advanced filter mode by clicking the first icon

in the toolbar that appears to the right of the field name in the Filters area. The data type of the

field determines the conditions you can configure. For a filter with a string data type, you can create

conditions to filter on partial words using operators such as Contains, Starts With, and so on. With a

numeric data type, you can use an operator such as Less Than or Greater Than Or Equal To, among

others. If you create a filter based on a date data type, you can use a calendar control in combination

with operators such as Is Before or Is On Or After, and others, as shown in Figure 10-18. You can also

create compound conditions by using AND and OR operators.

 

 

 

 

 

FIGURE 10-18 Filter selection of date data types in advanced filter mode

 

Visualization Filter

 

You can also use the Filters area to configure the filter for a selected visualization. First, you must click

the Filter button in the top-right corner of the visualization, and then fields in the visualization display

in the Filters area, as shown in Figure 10-19. When you change values for fields in a visualization’s

filter, Power View updates only that visualization. All other visualizations in the same view are unaffected

by the filter. Just as you can with a view filter, you can configure values for a visualization filter

in basic filter mode or advanced filter mode.

 

 

 

FIGURE 10-19 Visualization filter

 

Display Modes

 

After you develop the views and filters for your report, you will likely spend more time navigating

between views and interacting with the visualizations than editing your report. To provide alternatives

for your viewing experience, Power View offers the following three display modes, which you enable

by using the respective button in the Display group of the Home tab of the ribbon:

 

■ Fit To Window The view shrinks or expands to fill the available space in the window. When you use this mode, you have access to the ribbon, field list, Filters area, and Views pane. You can switch easily between editing and viewing activities.
■ Reading Mode In this mode, Power View keeps your browser’s tabs and buttons visible, but it hides the ribbon and the field list, which prevents you from editing your report. You can navigate between views using the arrow keys on your keyboard, the multiview button in the lower-right corner of the screen that allows you to access thumbnails of the views, or the arrow buttons in the lower-right corner of the screen. Above the view, you can click the Edit Report button to switch to Fit To Window mode or click the other button to switch to Full Screen mode.

 

 

 

 

■ Full Screen Mode Using Full Screen mode is similar to using the slideshow view in PowerPoint. Power View uses your entire screen to display the current view. This mode is similar to Reading mode and provides the same navigation options, but instead it hides your browser and uses the entire screen.

 

 

You can print the current view when you use Fit To Window or Reading mode only. To print, open the File menu and click Print. The view always prints in landscape orientation and prints only the data you currently see on your screen. In addition, the Filters area prints only if you expand it first. If you have a play axis on a scatter or bubble chart, the printed view includes only the current frame. Likewise, if you select a tile in a tile container before printing, you see only the selected tile in the printed view.

 

PowerPoint Export

 

A very useful Power View feature is the ability to export your report to PowerPoint. On the File menu, click Export To PowerPoint and save the PPTX file. Each view becomes a separate slide in the PowerPoint file. When you edit each slide, you see only a static image of the view. However, as long as you have the correct permissions and an active connection to your report on the SharePoint server when you display a slide in Reading View or Slideshow modes, you can click the Click To Interact link in the lower-right corner of the slide to load the view from Power View and enable interactivity. You can change filter values in the Filters area, in slicers, and in tile containers, and you can highlight values. However, you cannot create new filters or new data visualizations from PowerPoint. Also, if you navigate to a different slide and then return to the Power View slide, you must use the Click To Interact link to reload the view for interactivity.

 

Data Alerts

 

Rather than create a subscription to email a report on a periodic basis, regardless of the data values that it contains, you can create a data alert to email a notification only when specific conditions in the data are true at a scheduled time. This new self-service feature is available only with Reporting Services running in SharePoint integrated mode and works as soon as you have provisioned subscriptions and alerts, as we described earlier in the “Service Application Configuration” section of this chapter. However, it works only with reports you create by using Report Designer or Report Builder. You cannot create data alerts for Power View reports.

 

Data Alert Designer

 

You can create one or more data alerts for any report you can access, as long as the report uses a data source with stored credentials or no credentials. The report must also include at least one data region. In addition, the report must successfully return data at the time you create the new alert. If these prerequisites are met, then after you open the report for viewing, you can select New Data Alert from the Actions menu in the report toolbar. The Data Alert Designer, as shown in Figure 10-20, displays.

 

 

 

 

 

FIGURE 10-20 Data Alert Designer

 

Tip You must have the SharePoint Create Alert permission to create an alert for any report you have permission to view.

 

You use the Data Alert Designer to define rules for one or more data regions in the report that

control whether Reporting Services sends an alert. You also specify the recurring schedule for the

process that evaluates the rules and configure the email settings for Reporting Services to use when

generating the email notifications. When you save the resulting alert definition, Reporting Services

saves it in the alerting database and schedules a corresponding SQL Server Agent job, as shown in

Figure 10-21.

 


 

FIGURE 10-21 Data alert creation process

 

In the Data Alert Designer, you select a data region in the report to view a preview of the first 100 rows of data for reference as you develop the rules for that data region, which is also known as a data feed. You can create a simple rule that sends a data alert if the data feed contains data when the SQL Server Agent job runs. More commonly, you create one or more rules that compare a field value to a value that you enter or to a value in another field. The Data Alert Designer combines multiple rules for the same data feed by using a logical AND operator only. You cannot change it to an OR operator. Instead, you must create a separate data alert with the additional condition.

 

In the Schedule Settings section of the Data Alert Designer, you can configure the daily, weekly, hourly, or minute intervals at which to run the SQL Server Agent job for the data alert. In the Advanced settings, shown in Figure 10-22, you can set the date and time at which you want to start the job, and optionally set an end date. You also have the option to send the alert only when the alert results change.

 

 

 

FIGURE 10-22 Data alert advanced schedule settings

 

Last, you must provide email settings for the data alert by specifying at least one email address as

a recipient for the data alert. If you want to send the data alert to multiple recipients, separate each

email address with a semicolon. You can also include a subject and a description for the data alert,

both of which are static strings.

 

Alerting Service

 

The Reporting Services alerting service manages the process of refreshing the data feed and applying the rules in the data alert definition, as shown in Figure 10-23. Regardless of the results, the alerting service adds an alerting instance to the alerting database to record the outcome of the evaluation. If any rows in the data feed satisfy the conditions of the rules during processing, the alerting service generates an email containing the alert results.

 


 

FIGURE 10-23 Alerting service processing of data alerts

 

 

 

The email for a successful data alert includes the user name of the person who created the alert,

the description of the alert from the alert definition, and the rows from the data feed that generated

the alert, as shown in Figure 10-24. It also includes a link to the report, a description of the alert rules,

and the report parameters used when reading the data feed. The message is sent from the account

you specify in the email settings of the Reporting Services shared services application.

 

 

 

FIGURE 10-24 Email message containing a successful data alert

 

Note If an error occurs during alert processing, the alerting service saves the alerting

instance to the alerting database and sends an alert message describing the error to the

recipients.

 

Data Alert Manager

 

You use the Data Alert Manager to list all data alerts you create for a report, as shown in Figure 10-25. To open the Data Alert Manager, open the document library containing your report, click the down arrow to the right of the report name, and select Manage Data Alerts. You can see the alerts listed for the current report, but you can use the drop-down list at the top of the page to change the view to a list of data alerts for all reports or a different report.

 

 

 

FIGURE 10-25 Data Alert Manager

 

 

 

The Data Alert Manager shows you the number of alerts sent by data alert, the last time it was run,

the last time it was modified, and the status of its last execution. If you right-click a data alert on this

page, you can edit the data alert, delete it, or run the alert on demand. No one else can view, edit, or

run the data alerts you create, although the site administrator can view and delete your data alerts.

 

If you are a site administrator, you can use the Manage Data Alerts link on the Site Settings page to

open the Data Alert Manager. Here you can select a user from a drop-down list to view all data alerts

created by the user for all reports. You can also filter the data alerts by report. As a site administrator,

you cannot edit these data alerts, but you can delete them if necessary.

 

Alerting Configuration

 

Data does not accumulate indefinitely in the alerting database. The report server configuration file contains several settings you can change to override the default intervals for cleaning up data in the alerting database or to disable alerting. There is no graphical interface available for making changes to these settings. Instead, you must manually edit the RsReportServer.config file to change the settings shown in Table 10-2.

 

TABLE 10-2 Alerting Configuration Settings in RsReportServer.config

Alerting Setting                    | Description                                                                                | Default Value
AlertingCleanupCyclingMinutes       | Interval for starting the cleanup cycle of alerting                                       | 20
AlertingExecutionLogCleanupMinutes  | Number of minutes to retain data in the alerting execution log                            | 10080
AlertingDataCleanupMinutes          | Number of minutes to retain temporary alerting data                                       | 360
AlertingMaxDataRetentionDays        | Number of days to retain alert execution metadata, alert instances, and execution results | 180
IsAlertingService                   | Enable or disable the alerting service with True or False values, respectively            | True
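Because there is no graphical interface for these settings, it helps to locate the configuration file and review the current values before you edit anything. The following sketch searches for the file rather than assuming its location, since the path depends on the instance and installation mode, and backs the file up before you make changes.

    # Locate RsReportServer.config; its location depends on the instance and installation mode
    $configPath = Get-ChildItem -Path "C:\Program Files" -Recurse -Filter "RsReportServer.config" -ErrorAction SilentlyContinue |
        Select-Object -First 1 -ExpandProperty FullName

    # Back up the file, then list the alerting-related settings it currently contains
    Copy-Item -Path $configPath -Destination "$configPath.bak"
    Select-String -Path $configPath -Pattern "Alerting"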

 

 

 

 

 

The SharePoint configuration database contains settings that control how many times the alerting

service retries execution of an alert and the delay between retries. The default for MaxRetries is 3 and

for SecondsBeforeRetry is 900. You can change these settings by using PowerShell cmdlets. These settings

apply to all alert retries, but you can configure different retry settings for each of the following

events:

 

■ FireAlert On-demand execution of an alert launched by a user
■ FireSchedule Scheduled execution of an alert launched by a SQL Server Agent job
■ CreateSchedule Process to save a schedule defined in a new or modified data alert
■ UpdateSchedule Modification of a schedule in a data alert
■ DeleteSchedule Deletion of a data alert
■ GenerateAlert Alerting-service processing of data feeds, rules, and alert instances
■ DeliverAlert Preparation and delivery of the email message for a data alert

 

 

 

 

Index

 

251

 

Symbols

 

32-bit editions of SQL Server 2012, 17

 

64-bit editions of SQL Server 2012, 17

 

64-bit processors, 17

 

debugging on, 99

 

[catalog].[executions] table, 137

 

A

 

absolute environment references, 133–135

 

Active Directory security modules, 71

 

Active Operations dialog box, 130

 

active secondaries, 32–34

 

Administrator account, 176

 

Administrator permissions, 211

 

administrators

 

data alert management, 250

 

Database Engine access, 72

 

entity management, 192

 

master data management in Excel, 187–196

 

tasks of, 135–139

 

ADO.NET connection manager, 112–113

 

ad reporting tool, 229

 

Advanced Encryption Standard (AES), 71

 

affinity masks, 216

 

aggregate functions, 220

 

alerting service, 248–249. See also data alerts

 

All Executions report, 138

 

allocation_failure event, 90

 

All Operations report, 138

 

Allow All Connections connection mode, 33

 

Allow Only Read-Intent Connections connection mode,

33

 

All Validations report, 138

 

ALTER ANY SERVER ROLE, permissions on, 72

 

ALTER ANY USER, 70

 

ALTER SERVER AUDIT, WHERE clause, 63

 

ALTER USER DEFAULT_SCHEMA option, 59

 

AlwaysOn, 4–6, 21–23

 

AlwaysOn Availability Groups, 4–5, 23–32

 

availability group listeners, 28–29

 

availability replica roles, 26

 

configuring, 29–31

 

connection modes, 27

 

data synchronization modes, 27

 

deployment examples, 30–31

 

deployment strategy, 23–24

 

failover modes, 27

 

monitoring with Dashboard, 31–32

 

multiple availability group support, 25

 

multiple secondaries support, 25

 

prerequisites, 29–30

 

shared and nonshared storage support, 24

 

AlwaysOn Failover Cluster Instances, 5, 34–36

 

Analysis Management Objects (AMOs), 217

 

Analysis Services, 199–217

 

cached files settings, 225

 

cache reduction settings, 225

 

credentials for data import, 204

 

database roles, 210–211

 

DirectQuery mode, 213

 

event tracing, 215

 

in-memory vs. DirectQuery mode, 213

 

model designer, 205–206

 

model design feature support, 200–202

 

multidimensional models, 215

 

multiple instances, 201

 

with NUMA architecture, 216

 

PowerShell cmdlets for, 217

 

Process Full command, 215

 

processors, number of, 216

 

programmability, 217

 

projects, 201–202

 

query modes, 214

 

schema rowsets, 216

 

 

 

252

 

Analysis Services (continued)

 

server management, 215–216

 

server modes, 199–201

 

tabular models, 203–214

 

templates, 201–202

 

Analysis Services Multidimensional and Data Mining

Project template, 202

 

Analysis Services tabular database, 233, 234

 

Analysis Services Tabular Project template, 202

 

annotations, 191, 198

 

reviewing, 191

 

applications

 

data-tier, 10

 

file system storage of files and documents, 77

 

asynchronous-commit availability mode, 27

 

asynchronous database mirroring, 21–22

 

asynchronous data movement, 24

 

attribute group permissions, 184–185

 

attribute object permissions, 183

 

attributes

 

domain-based, 193–194

 

modifying, 193

 

Attunity, 103, 104, 111

 

auditing, 62–67

 

audit, creating, 64–65

 

audit data loss, 62

 

audit failures, continuing server operation, 63

 

audit failures, failing server operation, 63, 65

 

Extended Events infrastructure, 66

 

file destination, 63, 64

 

record filtering, 66–67

 

resilience and, 62–65

 

support on all SKUs, 62

 

user-defined audit events, 63, 65–66

 

users of contained databases, 63

 

audit log

 

customized information, 63, 65–66

 

filtering events, 63, 66–67

 

Transact-SQL stack frame information, 63

 

authentication

 

claims-based, 231, 233

 

Contained Database Authentication, 68–69

 

middle-tier, 66

 

user, 9

 

automatic failover, 27

 

availability. See also high availability (HA)

 

new features, 4–6

 

systemwide outages and, 62

 

availability databases

 

backups on secondaries, 33–34

 

failover, 25

 

hosting of, 26

 

availability group listeners, 28–29

 

availability groups. See AlwaysOn Availability Groups

 

availability replicas, 26

 

B

 

backup history, visualization of, 39–40

 

backups on secondaries, 33–34

 

bad data, 141

 

batch-mode query processing, 45

 

beyond-relational paradigm, 11, 73–90

 

example, 76

 

goals, 74

 

pain points, 73

 

bidirectional data synchronization, 9

 

BIDS (Business Intelligence Development Studio), 93.

See also SSDT (SQL Server Data Tools)

 

binary certificate descriptions, 71

 

binary large objects (BLOBs), 74

 

storage, 75

 

BISM (Business Intelligence Semantic Model)

 

connection files, as Power View data source, 233

 

Connection type, adding to PowerPivot site

document library, 233–234

 

schema, 217

 

BLOBs (binary large objects), 74

 

storage, 75

 

B-tree indexes, 42–43

 

situations for, 48

 

bubble charts in Power View, 239

 

Business Intelligence Development Studio (BIDS), 93

 

Business Intelligence edition of SQL Server 2012, 14

 

Business Intelligence Semantic Model (BISM).

See BISM (Business Intelligence Semantic

Model)

 

C

 

cache

 

managing, 225

 

retrieving data from, 213

 

Cache Connection Manager, 98

 

calculated columns, 207

 

cards in Power View, 238

 

catalog, 119, 128–135

 

adding and retrieving projects, 119

 

administrative tasks, 129

 

creating, 128–129

 

Analysis Services

 

 

 

253

 

adding and retrieving projects (continued)

 

database name, 128–129

 

encrypted data in, 139

 

encryption algorithms, 130

 

environment objects, 132–135

 

importing projects from, 121

 

interacting with, 129

 

operations information, 130–131

 

permissions on, 139

 

previous project version information, 131

 

properties, 129–131

 

purging, 131

 

catalog.add_data_tap procedure, 137

 

Catalog Properties dialog box, 129–130

 

catalog.set_execution_parameter_value stored

procedure, 127

 

CDC (change data capture), 57, 111–114

 

CDC Control Task Editor, 113

 

CDC Source, 113–114

 

CDC Splitter, 113–114

 

certificates

 

binary descriptions, 71

 

key length, 71

 

change data capture (CDC), 57, 111–114

 

control flow, 112–113

 

data flow, 113–114

 

change data capture tables, 111

 

chargeback, 8

 

charts in Power View, 237, 241

 

Check For Updates To PowerPivot Management

Dashboard.xlsx rule, 227

 

child packages, 102

 

circular arc segments, 87–89

 

circular strings, 88

 

claims-based authentication, 231, 233

 

cleansing, 162

 

confidence score, 150

 

corrections, 163

 

exporting results, 163–164

 

scheduled, 171

 

syntax error checking, 146

 

cleansing data quality projects, 161–164

 

monitoring, 167

 

Cleansing transformation, 141

 

configuring, 171–172

 

monitoring, 167

 

client connections, allowing, 33

 

cloud computing licensing, 15

 

Cloud Ready Information Platform, 3

 

cloud services, 9

 

clustering. See also failover clustering

 

clusters

 

health-detection policies, 5

 

instance-level protection, 5

 

multisubnet, 34

 

code, pretty-printing, 139

 

collections management, 183

 

columns

 

calculated, 207

 

with defaults, 40

 

filtering view, 109

 

mappings, 108–109

 

reporting properties, 212–213

 

variable number, 105

 

columnstore indexes, 6, 41–56

 

batch-mode query processing, 45

 

best practices, 54

 

business data types supported, 46

 

columns to include, 49–50

 

compression and, 46

 

creating, 49–54

 

data storage, 42–43

 

design considerations, 47–48

 

disabling, 48

 

hints, 53–54

 

indicators and performance cost details, 52–53

 

loading data, 48–49

 

nonclustered, 50

 

query speed improvement, 44–45

 

restrictions, 46–47

 

storage organization, 45

 

updating, 48–49

 

Columnstore Index Scan Operator icon, 52

 

column store storage model, 42

 

composite domains, 151–153

 

cross-domain rules, 152

 

managing, 152

 

mapping fields to, 162, 172

 

parsing methods, 152

 

reference data, 152

 

value relations, 153

 

compound curves, 88–89

 

compression of column-based data, 44

 

Conceptual Schema Definition Language (CSDL)

extensions, 217

 

confidence score, 150, 162

 

specifying, 169

 

configuration files, legacy, 121

 

CONMGR file extension, 99

 

connection managers

 

creating, 104

 

expressions, 100

 

package connections, converting to, 99

 

sharing, 98–99

 

connection modes, 27

 

Connections Managers folder, 98

 

Connections report, 138

 

Contained Database Authentication, 68–69

 

enabling, 68–69

 

security threats, 70–71

 

users, creating, 69–70

 

contained databases, 9

 

auditing users, 63

 

database portability and, 69

 

domain login, creating users for, 70

 

initial catalog parameter, 69

 

users with password, creating, 70

 

control flow

 

Expression Task, 101–102

 

status indicators, 101

 

control flow items, 97–98

 

Convert To Package Deployment Model command, 118

 

correct domain values, 147

 

CPU usage, maximum cap, 8

 

Create A New Availability Group Wizard, 28

 

Create Audit Wizard, 64

 

CREATE CERTIFICATE FROM BINARY, 71

 

Create Domain dialog box, 147

 

CreateSchedule event, 250

 

CREATE SERVER AUDIT, WHERE clause, 63

 

CREATE SERVER ROLE, permissions on, 72

 

CREATE USER DEFAULT_SCHEMA, 59

 

cross-domain rules, 152

 

cryptography

 

enhancements, 71

 

hashing algorithms, 71

 

CSDL (Conceptual Schema Definition Language)

extensions, 217

 

CSV files, extracting data, 105

 

curved polygons, 89

 

Customizable Near operator, 82

 

Custom Proximity Operator, 82

 

D

 

DACs (data-tier applications), 10

 

dashboard

 

AlwaysOn Availability Groups, monitoring, 31–32

 

launching, 31

 

data

 

cleansing. See cleansing

 

data integration, 73–74

 

de-duplicating, 141, 146, 173

 

loading into tables, 48–49

 

log shipping, 22

 

managing across systems, 11

 

normalizing, 146

 

partitioning, 47, 48–49

 

protecting, 23

 

redundancy, 4

 

refreshing, 227

 

spatial, 86–90

 

structured and nonstructured, integrating, 73–74

 

volume of, 41

 

data-access modes, 104

 

Data Alert Designer, 246–248

 

Data Alert Manager, 249–250

 

data alerts, 229, 246–250

 

alerting service, 248–249

 

configuring, 250

 

creating, 247

 

Data Alert Designer, 246–248

 

Data Alert Manager, 249–250

 

email messages, 249

 

email settings, 248

 

errors, 249

 

managing, 249–250

 

retry settings, 250

 

scheduling, 248

 

Data Analysis Expression (DAX). See DAX (Data

Analysis Expression)

 

data and services ecosystem, 74–75

 

Database Audit Specification objects, 62

 

database authentication, 67–71

 

Database Engine

 

change data capture support, 111

 

local administrator access, 72

 

upgrading, 175

 

user login, 67

 

database instances. See also SQL Server instances

 

orphaned or unused logins, 68

 

database mirroring, 23

 

alternative to, 23

 

asynchronous, 21–22

 

synchronous, 22

 

database portability

 

contained databases and, 69–70

 

Database Engine authentication and, 67

 

database protection, 4–5

 

Database Recovery Advisor, 39–40

 

Database Recovery Advisor Wizard, 34

 

database restore process, visual timeline, 6

 

databases

 

Analysis Services tabular, 233

 

authentication to, 67

 

bidirectional data synchronization, 9

 

contained, 9, 68–69

 

failover to single unit, 4

 

portability, 9

 

report server, 231

 

SQL Azure, deploying to, 9

 

Database Wizard, 176

 

datacenters, secondary replicas in, 4

 

Data Collection Interval rule, 226, 227

 

data destinations, 103–104

 

ODBC Destinations, 104–105

 

data extraction, 105

 

data feeds, 247–248

 

refreshing, 248

 

data flow

 

capturing data from, 137

 

components, developing, 107

 

groups, 109–110

 

input and output column mapping, 108–109,

172–173

 

sources and destinations, 103–106

 

status indicators, 101

 

data flow components, 97–98

 

column references, 108–109

 

disconnected, 108

 

grouping, 109–110

 

data flow designer, error indicator, 108–109

 

data-flow tasks, 103–104

 

CDC components, 112

 

data latency, 32

 

DataMarket subscriptions, 168

 

Data Quality Client, 142–143

 

Activity Monitoring, 166–167

 

Administration feature, 166–170

 

client-server connection, 143

 

Configuration area, 167–170

 

Configuration button, 167

 

domain value management, 149

 

Excel support, 143

 

installation, 143

 

New Data Quality Project button, 161

 

New Knowledge Base button, 145

 

Open Knowledge Base button, 144

 

tasks, 143

 

Data Quality Client log, 170

 

data quality management, 141

 

activity monitoring, 166–167

 

configuring, 167

 

improving, 107

 

knowledge base management, 143–161

 

data quality projects, 143, 161–166

 

cleansing, 161–164

 

exporting results, 161

 

matching, 164–166

 

output, 146

 

Data Quality Server, 141–142

 

activity monitoring, 166–167

 

connecting to Integration Services package, 171

 

installation, 141–142

 

log settings, 170

 

remote client connections, 142

 

user login creation, 142

 

Data Quality Server log, 170

 

Data Quality Services (DQS). See DQS (Data Quality

Services)

 

data sources. See also source data

 

creating, 103–104

 

importing data from, 204

 

installing types, 104

 

ODBC Source, 104–105

 

data stewards, 143

 

master data management in Excel, 187–196

 

data storage

 

column-based, 42–43

 

in columnstore indexes, 42–43

 

row-based, 42–43

 

data synchronization modes, 27

 

data tables. See also tables

 

converting to other visualizations, 237–241

 

designing, 234–237

 

play axis area, 239–240

 

data taps, 137

 

data-tier applications (DACs), 10

 

data types, columnstore index-supported, 46

 

data viewer, 110–111

 

data visualization, 229

 

arrangement, 238

 

cards, 238

 

charts, 237

 

multiples, 241

 

play axis, 239–240

 

in Power View, 237–241

 

separate views, 241–242

 

slicing data, 243

 

sort order, 241

 

tile containers, 243–244

 

tiles, 239

 

view filters, 244–245

 

visualization filters, 245

 

data warehouses

 

queries, improving, 6

 

query speeds, 44–45

 

sliding-window scenarios, 7

 

date filters, 219–220, 244–245

 

DAX (Data Analysis Expression)

 

new functions, 222–224

 

row-level filters, 211

 

dbo schema, 58

 

debugger for Transact-SQL, 7

 

debugging on 64-bit processors, 99

 

Default Field Set property, 212

 

Default Image property, 212

 

Default Label property, 212

 

default schema

 

creation script, 59

 

for groups, 58–59

 

for SQL Server users, 58

 

delegation of permissions, 139

 

DeleteSchedule event, 250

 

delimiters, parsing, 152

 

DeliverAlert event, 250

 

dependencies

 

database, 9

 

finding, 216

 

login, 67

 

Deploy Database To SQL Azure wizard, 9

 

deployment models, 116–122

 

switching, 118

 

deployment, package, 120

 

Description parameter property, 123

 

Designing and Tuning for Performance Your SSIS

Packages in the Enterprise link, 96–97

 

destinations, 103–105

 

Developer edition of SQL Server 2012, 15

 

Digital Trowel, 167

 

directory name, enabling, 79

 

DirectQuery mode, 213–215

 

DirectQuery With In-Memory mode, 214

 

disaster recovery, 21–40

 

disconnected components, 108

 

discovery. See knowledge discovery

 

DMV (Dynamic Management View), 8

 

documents, finding, 85

 

DOCX files, 230

 

domain-based attributes, 193–194

 

domain management, 143–154

 

DQS Data knowledge base, 144–145

 

exiting, 154

 

knowledge base creation, 145–154

 

monitoring, 167

 

domain rules, 150–151

 

disabling, 151

 

running, 155

 

domains

 

composite, 151–152

 

creating, 146

 

data types, 146

 

importing, 146

 

linked, 153–154

 

mapping to source data, 158, 161–162, 164

 

properties, 146–147

 

term-based relations, 149

 

domain values

 

cross-domain rules, 152

 

domain rules, 150–151

 

formatting, 146

 

leading values, 148–149, 173

 

managing, 156–157

 

setting, 147

 

spelling checks, 146

 

synonyms, 148–149, 163, 173

 

type settings, 147–148

 

downtime, reducing, 40

 

dqs_administrator role, 142

 

DQS Cleansing transformation, 107

 

DQS Cleansing Transformation log, 170

 

DQS Data knowledge base, 144–145, 196

 

DQS (Data Quality Services), 107, 141–173

 

administration, 143, 166–170

 

architecture, 141–143

 

configuration, 167–170

 

connection manager, 171

 

Data Quality Client, 142–143. See also Data

Quality Client

 

data quality projects, 161–166

 

Data Quality Server, 141–142. See also Data

Quality Server

 

Domain Management area, 143–154

 

Excel, importing data from, 147

 

integration with SQL Server features, 170–173

 

knowledge base management, 143–161

 

Knowledge Discovery area, 154–157

 

log settings, 169–170

 

Matching Policy area, 157–161

 

profiling notifications, 169

 

dqs_kb_editor role, 142

 

dqs_kb_operator role, 142

 

DQS_MAIN database, 142

 

DQS_PROJECTS database, 142

 

DQS_STAGING_DATA database, 142

 

drop-rebuild index strategy, 47

 

DTSX files, 139

 

converting to ISPAC files, 121–122

 

Dynamic Management View (DMV), 8

 

E

 

EKM (Extensible Key Management), 57

 

encryption

 

enhancements, 71

 

for parameter values, 130

 

Enterprise edition of SQL Server 2012, 12–13

 

entities. See also members

 

code values, creating automatically, 178–179

 

configuring, 178

 

creating, 192–193

 

domain-based attributes, 193–194

 

managing, 177–179

 

many-to-many mappings, 179

 

permissions on, 183–185

 

entry-point packages, 119–120

 

environment references, creating, 133–135

 

environments

 

creating, 132

 

package, 119

 

projects, connecting, 132

 

environment variables, 119, 132

 

creating, 132–133

 

error domain values, 148

 

errors, alert-processing, 249

 

events, multidimensional, 215

 

event tracing, 215

 

Excel. See Microsoft Excel

 

Excel 2010 renderer, 229–230

 

Execute Package dialog box, 136

 

Execute Package task, 102–103

 

updating, 122

 

Execute Package Task Editor, 102–103

 

execution objects, creating, 135–136

 

execution parameter values, 127

 

execution, preparing packages for, 135–136

 

exporting content

 

cleansing results, 163–164

 

matching results, 165

 

exposures reported to NIST, 10

 

Express edition of SQL Server 2012, 15

 

Expression Builder, 101–102

 

expressions, 100

 

parameter usage in, 124

 

size limitations, 115–116

 

Expression Task, 101–102

 

Extended Events, 66

 

event tracing, 215

 

management, 12

 

monitoring, 216

 

new events, 90

 

Extensible Key Management (EKM), 57

 

extract, transform, and load (ETL) operations, 111

 

F

 

failover

 

automatic, 23, 27

 

of availability databases, 25

 

manual, 23, 27

 

modes, 27

 

to single unit, 4

 

startup and recovery times, 5

 

failover clustering, 21–22. See also clustering

 

for consolidation, 39

 

flexible failover policy, 34

 

limitations, 23

 

multi-subnet, 5

 

Server Message Block support, 39

 

TempDB on local disk, 34

 

Failover Cluster Instances (FCI), 5

 

Failover Cluster Manager Snap-in, 24

 

failover policy, configuring, 34

 

failures, audit, 62, 63

 

FCI (Failover Cluster Instances), 5

 

fetching, optimizing, 44

 

file attributes, updating, 186

 

file data, application compatibility with, 11

 

files

 

converting to ISPAC file, 121

 

copying to FileTable, 80

 

ragged-right delimited, 105

 

FILESTREAM, 74, 76–77

 

enabling, 78

 

file groups and database files, configuring, 79

 

FileTable, 77

 

scalability and performance, 76

 

file system, unstructured data storage in, 74–75

 

FileTable, 77–81

 

creating, 80

 

documents and files, copying to, 80

 

documents, viewing, 80–81

 

managing and securing, 81

 

prerequisites, 78–79

 

filtering, 204

 

before importing, 204

 

dates, 219–220

 

hierarchies, 216

 

row-level, 211

 

view, 244–245

 

visualization, 245

 

FireAlert event, 250

 

FireSchedule event, 250

 

Flat File source, 105–106

 

folder-level access, 139

 

Full-Text And Semantic Extractions For Search

feature, 83

 

full-text queries, performance, 81

 

full-text search, 11–12

 

scale of, 81

 

word breakers and stemmers, 82

 

G

 

GenerateAlert event, 250

 

geometry and geography data types, 86

 

new data types, 87–89

 

Getting Started window, 95–97

 

global presence, business, 5

 

grid view data viewer, 110–111

 

groups

 

data-flow, 109–110

 

default schema for, 58–59

 

guest accounts, database access with, 71

 

H

 

hardware

 

active secondaries, 32–34

 

utilization, increasing, 4

 

hardware requirements, 16

 

for migrations, 20

 

HASHBYTES function, 71

 

health rules, server, 226–227

 

hierarchies, 208–209

 

arranging members in, 177, 179, 181

 

filtering, 216

 

managing, 180

 

permissions inheritance, 184

 

in PowerPivot for Excel, 221

 

high availability (HA), 4–6, 21–40

 

active secondaries, 32–34

 

AlwaysOn, 21–23

 

AlwaysOn Availability Groups, 23–32

 

AlwaysOn Failover Cluster Instances, 34–36

 

technologies for, 21

 

HostRecordTTL, configuring, 36

 

hybrid solutions, 3

 

I

 

Image URL property, 212

 

implicit measures, 220

 

Import from PowerPivot template, 202

 

Import from Server template, 202

 

indexes

 

columnstore, 6, 41–56

 

varchar(max), nvarchar(max), and varbinary(max)

columns, 7

 

index hints, 53–54

 

initial catalog parameter, 69

 

initial load packages, change data processing, 112

 

in-memory cache, 213

 

In-Memory query mode, 214

 

In-Memory With DirectQuery query mode, 214

 

in-place upgrades, 17–19

 

Insert Snippet menu, 7

 

installation, 17

 

hardware and software requirements, 16–17

 

on Windows Server Core, 38

 

Insufficient CPU Resource Allocation rule, 226

 

Insufficient CPU Resources On The System rule, 226

 

Insufficient Disk Space rule, 226

 

Insufficient Memory Threshold rule, 226

 

Integration Services, 93–139

 

administration tasks, 135–139

 

catalog, 128–135

 

change data capture support, 111–114

 

Cleansing transformation, 171–172

 

connection managers, sharing, 98–99

 

Connections Project template, 94

 

control flow enhancements, 101–103

 

data flow enhancements, 103–111

 

data viewer, 110–111

 

deployment models, 116–122

 

DQS integration, 171–172

 

Execute Package Task, 102–103

 

expressions, 100

 

Expression Task, 101–102

 

Getting Started window, 95, 96–97

 

interface enhancements, 93–101

 

New Project dialog box, 93–94

 

operations dashboard, 138

 

package design, 114–116

 

package-designer interface, 95–96

 

package file format, 139

 

packages, sorting, 100

 

parameters, 122–127

 

Parameters window, 95

 

Project template, 94

 

scripting engine, 99

 

SSIS Toolbox, 95, 97–98

 

status indicators, 101

 

stored procedure, automatic execution, 128

 

templates, 93

 

transformations, 106–107, 141, 171–172

 

Undo and Redo features, 100

 

Variables button, 95

 

zoom control, 95

 

Integration Services Deployment Wizard, 120

 

Integration Services Import Project Wizard, 94, 121

 

Integration Services Package Upgrade Wizard, 121

 

Integration Services Project Conversion Wizard, 121

 

IntelliSense enhancements, 7

 

invalid domain values, 148

 

IO, reducing, 44

 

isolation, 8

 

ISPAC file extension, 119

 

ISPAC files

 

building, 119

 

converting files to, 121–122

 

deployment, 120

 

importing projects from, 121

 

K

 

Keep Unique Rows property, 212

 

Kerberos authentication, 57, 233

 

key performance indicators (KPIs), 208

 

key phrases, searching on, 82–85

 

knowledge base management, 143–161

 

domain management, 143–154

 

knowledge discovery, 154–157

 

matching, 157–161

 

source data, 154

 

knowledge bases

 

composite domains, 151–153

 

creating, 145–146

 

defined, 143

 

domain rules, 150–151

 

domains, 145–146. See also domains

 

domain values, 147

 

DQS Data, 144–145

 

knowledge discovery, 154–157

 

linked domains, 153–154

 

locking, 144, 154

 

matching policy, 157–161

 

publishing, 154, 157

 

reference data, 150

 

updating, 157

 

knowledge discovery, 154–157

 

domain rules, running, 155

 

domain value management, 156–157

 

monitoring, 167

 

pre-processing, 155

 

KPIs (key performance indicators), 208

 

L

 

Large Objects (LOBs), 45

 

LEFT function, 116

 

legacy SQL Server instances, deploying SQL Server

2012 with, 19–20

 

licensing, 15–16

 

line-of-business (LOB) re-indexing, 40

 

linked domains, 153–154

 

Load To Connection Ratio rule, 226

 

LOBs (Large Objects), 45

 

local storage, 5

 

logging

 

backups on secondaries, 33

 

package execution, 137

 

server-based, 137

 

logon dependencies, 67

 

log shipping, 22, 23

 

M

 

machine resources, vertical isolation, 8

 

manageability features, 7–10

 

managed service accounts, 72

 

MANAGE_OBJECT_PERMISSIONS permission, 139

 

manual failover, 27

 

mappings

 

data source fields to domains, 154–155, 158, 172

 

input to output columns, 108–109

 

many-to-many, 179

 

master data management, 177. See also MDS (Master

Data Services)

 

bulk updates and export, 197

 

entity management, 177–179

 

in Excel, 187–196

 

integration management, 181–183

 

sharing data, 194

 

Master Data Manager, 177–185

 

collection management, 180–181

 

entity management, 177–179

 

Explorer area, 177–181

 

hierarchy management, 180

 

Integration Management area, 181–183

 

mappings, viewing, 179

 

Metadata model, 197

 

permissions assignment, 183–185

 

Version Management area, 183

 

Master Data Services (MDS). See MDS (Master Data

Services)

 

matching, 165

 

exporting results, 165

 

MDS Add-in entities, 194–196

 

MDS support, 141

 

results, viewing, 160

 

rules, 158–159

 

score, 158, 169

 

matching data quality projects, 164–166

 

monitoring, 167

 

matching policy

 

defining, 157–161

 

mapping, 158

 

matching rule creation, 158–159

 

monitoring, 167

 

results, viewing, 160

 

Similarity parameter, 159

 

testing, 159

 

Weight parameter, 159

 

Maximum Files (or Max_Files) audit option, 63

 

Maximum Number of Connections rule, 226

 

mdm.udpValidateModel procedure, 183

 

MDS Add-in for Excel, 173, 175, 187–196

 

bulk updates and export, 197

 

data quality matching, 194–196

 

filtering data, 188–190

 

installation, 187

 

link to, 177

 

loading data into worksheet, 190

 

master data management tasks, 187–192

 

MDS database connection, 187–188

 

model-building tasks, 192–194

 

shortcut query files, 194

 

MDS databases

 

connecting to, 187–188

 

upgrading, 175

 

MDS (Master Data Services), 175–198

 

administration, 198

 

bulk updates and export, 197

 

configuration, 176–177

 

Configuration Manager, 175–176

 

DQS integration, 195

 

DQS matching with, 173

 

filtering data from, 188–189

 

Master Data Manager, 177–185

 

MDS Add-in for Excel, 187–196

 

Metadata model, 197

 

model deployment, 185–187

 

new installations, 176

 

permissions assignment, 183–185

 

publication process, 191–192

 

SharePoint integration, 197

 

staging process, 181–183

 

system settings, 176–177

 

transactions, 198

 

upgrades, 175–176

 

MDSModelDeploy tool, 185–186

 

measures, 207–208

 

members

 

adding and deleting, 177–178

 

attributes, editing, 178

 

collection management, 180–181

 

hierarchy management, 180

 

loading in Excel, 188–190

 

transactions, reviewing, 191

 

memory allocation, 216

 

memory management, for merge and merge join

transformations, 107

 

memory_node_oom_ring_buffer_recorded event, 90

 

Merge Join transformation, 107

 

Merge transformation, 107

 

metadata, 108

 

managing, 108–109

 

updating, 186

 

Metadata model, 197

 

Microsoft Word 2010 renderer, 230

 

Microsoft Excel, 143. See also worksheets, Excel

 

data quality information, exporting to, 167

 

Excel 2010 renderer, 229–230

 

importing data from, 147

 

MDS Add-in for Excel, 173, 187–196

 

PowerPivot for Excel, 218–222

 

tabular models, analyzing in, 211–212

 

Microsoft Exchange beyond-relational functionality,

76

 

Microsoft Office Compatibility Pack for Word, Excel,

and PowerPoint, 230

 

Microsoft Outlook beyond-relational functionality,

76

 

Microsoft PowerPoint, Power View exports, 246

 

Microsoft.SqlServer.Dts.Pipeline namespace, 107

 

Microsoft trustworthy computing initiative, 10

 

middle-tier authentication, 66

 

migration strategies, 19–20

 

high-level, side-by-side, 20

 

mirroring

 

asynchronous, 21–22

 

synchronous, 22

 

Model.bim file, 203

 

model deployment, 185–186

 

Model Deployment Wizard, 185

 

model permissions object tree, 184

 

MOLAP engine, 215

 

mount points, 39

 

multidimensional events, 215

 

multidimensional models, 215

 

multidimensional server mode, 199–201

 

multisite clustering, 34

 

multisubnet clustering, 34–35

 

multiple IP addresses, 35

 

multitenancy, 8

 

N

 

NEAR operator, 12

 

network names of availability groups listeners, 28

 

New Availability Group Wizard, 29

 

Specify Replicas page, 30

 

New Project dialog box (Integration Services), 93–94

 

“New Spatial Features in SQL Server 2012”

whitepaper, 89

 

nonoverlapping clusters of records, 160, 165

 

nonshared storage, 26

 

nontransactional access, 78

 

enabling at database level, 78–79

 

normalization, 146

 

numeric data types, filtering on, 244

 

O

 

ODBC Destination, 104–105

 

ODBC Source, 104–105

 

On Audit Log Failure: Continue feature, 63

 

On Audit Log Failure: Fail Operation feature, 63

 

On Audit Shut Down Server feature, 62

 

online indexing, 7

 

online operation, 40

 

operations management, 130–131

 

operations reports, 137

 

outages

 

planned, 18

 

systemwide, 62, 68

 

overlapping clusters of records, 160, 165

 

P

 

package connections, converting connection

managers to, 99

 

package deployment model, 116–117

 

package design

 

expressions, 115–116

 

variables, 114–115

 

package execution, 136

 

capturing data during, 137

 

logs, 137

 

monitoring, 136

 

preparing for, 135–136

 

reports on, 137

 

scheduling, 136

 

package objects, expressions, 100

 

package parameters, 124

 

prefix, 125

 

packages

 

change data processing, 112

 

connection managers, sharing, 98–99

 

deploying, 120, 185–186

 

entry-point, 119–120

 

environments, 119

 

file format, 139

 

legacy, 121–122

 

location, specifying, 102

 

preparing for execution, 135–136

 

security of, 139

 

sorting by name, 100

 

storing in catalog, 119

 

validating, 126, 135

 

page_allocated event, 90

 

page_freed event, 90

 

Parameterizing the Execute SQL Task in SSIS link, 97

 

parameters, 95, 122–127

 

bindings, mapping, 102–103

 

configuration, converting to, 122

 

configuring, 122

 

environment variables, associating, 119

 

execution parameter value, 127

 

in Expression Builder, 102

 

implementing, 124–125

 

package-level, 120, 122, 124

 

post-deployment values, 125–126

 

project-level, 120, 122–123

 

properties for, 123

 

scope, 119

 

server default values, 126–127

 

parameter value, 123

 

parsing composite domains, 152

 

partitioning new data, 47–49

 

Partition Manager, 209–210

 

partitions, 209–210

 

number supported, 7

 

switching, 48–49

 

patch management, 40

 

Windows Server Core installation and, 36

 

performance

 

data warehouse workload enhancements, 6

 

new features, 6–7

 

query, 41, 44–45

 

permissions

 

on ALTER ANY SERVER ROLE, 72

 

on CREATE SERVER ROLE, 72

 

delegation, 139

 

management, 139

 

on master data, 183–185

 

on SEARCH PROPERTY LIST, 72

 

SQL Server Agent, 231

 

updating, 186

 

per-service SIDs, 72

 

perspectives, 209, 220

 

in PowerPivot for Excel, 222

 

pivot tables, 218–222, 224–227

 

numeric columns, 221

 

Pivot transformation, 106

 

planned outages for upgrades, 18

 

Policy-Based Management, 57

 

portability, database, 9

 

PowerPivot, 42

 

PowerPivot for Excel, 218–222

 

Advanced tab, 220

 

Data View and Diagram View buttons, 218

 

DAX, 222–224

 

Design tab, 219

 

installation, 218

 

Mark As Date Table button, 219

 

model enhancements, 221–222

 

Sort By Column button, 219

 

usability enhancements, 218–221

 

PowerPivot for SharePoint, 224–227

 

data refresh configuration, 227

 

disk space usage, 225

 

installation, 224

 

management, 224–227

 

server health rules, 226–227

 

workbook caching, 225

 

PowerPivot for SharePoint server mode, 199–201

 

PowerPivot workbooks

 

BI Semantic Model Connection type, adding, 233

 

as Power View data source, 233

 

Power View design environment, opening, 234

 

Power View, 229, 232–246

 

arranging items, 238

 

browser, 232

 

cards, 238

 

charts, 237

 

current view display, 236

 

data layout, 232

 

data sources, 232, 233

 

data visualization, 237–241

 

design environment, 234–237

 

display modes, 245–246

 

filtering data, 243–245

 

Filters area, 244

 

highlighting values, 242

 

layout sections, 236

 

moving items, 237

 

multiple views, 241–242

 

PowerPoint export, 246

 

printing views, 246

 

saving files, 237

 

tiles, 239

 

views, creating, 241–242

 

pretty-printing, 139

 

primary database servers, offloading tasks from, 23,

32–34

 

Process Full command, 215

 

Process permissions, 211

 

profiling notifications, 169

 

programmability enhancements, 11–12, 73–90

 

project configurations, adding parameters, 123

 

project deployment model, 117

 

catalog, project placement in, 128

 

features, 118–119

 

workflow, 119–122

 

project parameters, 123

 

prefix, 125

 

Project.params file, 98

 

projects

 

environment references, 133–135

 

environments, connecting, 132

 

importing, 121

 

previous versions, restoring, 131

 

validating, 126, 135

 

property search, 81–82

 

provisioning, security and role separation, 72

 

publication process, 191–192

 

Q

 

qualifier characters in strings, 105

 

queries

 

execution plan, 52

 

forcing columnstore index use, 53–54

 

performance, 41, 44–45

 

small look-up, 48

 

star join pattern, 47

 

tuning, 41. See also columnstore indexes

 

query modes, 214

 

query processing

 

batch-mode, 45

 

in columnstore indexes, 47

 

enhancements in, 6

 

R

 

ragged-right delimited files, 105

 

raster data, 87

 

RBS (Remote BLOB Store), 75

 

RDLX files, 232, 237

 

RDS (reference data service), 150

 

subscribing to, 167–168

 

Read And Process permissions, 211

 

reader/writer contention, eliminating, 32

 

Read permissions, 210

 

records

 

categorization, 162

 

confidence score, 150, 162, 169

 

filtering, 66–67

 

matching results, 165

 

matching rules, 158–159

 

survivorship results, 165

 

Recovery Advisor visual timeline, 6

 

recovery times, 5

 

reference data, 150

 

for composite domains, 152

 

reference data service (RDS), 150, 167–168

 

ReferenceType property, 102

 

refid attribute, 139

 

regulatory compliance, 62–67

 

relational engine, columnstore indexes, 6

 

relationships, table

 

adding, 206

 

in PowerPivot for Excel, 221

 

viewing, 205

 

relative environment references, 133–135

 

Remote BLOB Store (RBS), 75

 

renderers, 229–230

 

REPLACENULL function, 116

 

Report Builder, 232

 

Report Designer, 232

 

Reporting Services, 229–250

 

data alerts, 246–250

 

Power View, 232–246

 

renderers, 229–230

 

scalability, 231

 

SharePoint integrated mode, 229, 232

 

SharePoint shared service architecture, 230–232

 

SharePoint support, 230–231

 

Reporting Services Configuration Manager, 231

 

Reporting Services Shared Data Source (RSDS) files,

233

 

reports. See also Reporting Services

 

authentication for, 231

 

data alerts, 246–250

 

design environment, 234–237

 

DOCX files, 230

 

fields, 235–236

 

filtering data, 243–245

 

RDLX files, 232, 237

 

reporting properties, setting, 220

 

sharing, 237

 

table and column properties, 212–213

 

tables, 235

 

views, 234, 241–242

 

XLSX files, 229–230

 

report server databases, 231

 

Required parameter property, 123

 

resilience, audit policies and, 62–65

 

Resolve References editor, 108–109

 

Resource Governor, 8

 

resource_monitor_ring_buffer_record event, 90

 

resource pool affinity, 8

 

resource pools

 

number, 8

 

schedules, 8

 

Resource Usage events, 215

 

Restore Invalid Column References editor, 108

 

restores, visual timeline, 6, 39–40

 

role-based security, 59

 

role separation during provisioning, 72

 

roles, user-defined, 60–61

 

role swapping, 26

 

rolling upgrades, 40

 

Row Count transformation, 106–107

 

row delimiters, 105

 

row groups, 45

 

Row Identifier property, 212, 213

 

row-level filters, 211

 

row store storage model, 42–43

 

row versioning, 32

 

RSDS (Reporting Services Shared Data Source) files,

233

 

RSExec role, 232

 

RsReportDesigner.config file, 230

 

RsReportServer.config file, 230

 

alerting configuration settings, 250

 

runtime, constructing variable values at, 101

 

S

 

scalability

 

new features, 6–7

 

Windows Server 2008 R2 support, 7

 

scatter charts in Power View, 239

 

schema rowsets, 216

 

schemas

 

default, 58

 

management, 58

 

SCONFIG, 37

 

scripting engine (SSIS), 99

 

search

 

across data formats, 73–74, 76

 

Customizable Near operator, 82

 

full-text, 81–82

 

of structured and unstructured data, 11

 

property, 81–82

 

semantic, 11

 

statistical semantic, 82–85

 

SEARCH PROPERTY LIST, permissions on, 72

 

secondary replicas

 

backups on, 33–34

 

connection modes, 27, 33

 

moving data to, 27

 

multiple, 23, 25, 25–26

 

read-only access, 33

 

SQL Server 2012 support for, 4

 

synchronous-commit availability mode, 26

 

securables, permissions on, 60

 

security

 

Active Directory integration, 71–72

 

auditing improvements, 62–67

 

of Contained Database Authentication, 70–71

 

cryptography, 71

 

database authentication improvements, 67–71

 

database roles, 210–211

 

manageability improvements, 58–61

 

new features, 10–11

 

during provisioning, 72

 

SharePoint integration, 71–72

 

of tabular models, 210

 

Windows Server Core, 36

 

security manageability, 58–61

 

default schema for groups, 58–59

 

user-defined server roles, 60–62

 

segments of data, 45

 

Select New Scope dialog box, 115

 

self-service options, 229

 

data alerts, 246–250

 

Power View, 232–246

 

semantickeyphrasetable function, 85

 

semantic language statistics database

 

installing, 84

 

semantic search, 11, 82–85

 

configuring, 83–84

 

key phrases, finding, 85

 

similar or related documents, finding, 85

 

semanticsimilaritydetailstable function, 85

 

semanticsimilaritytable function, 85

 

Sensitive parameter property, 123

 

separation of duties, 59–60

 

Server Audit Specification objects, 62

 

server-based logging, 137

 

Server + Client Access License, 15

 

Server Core. See Windows Server Core

 

Server Message Block (SMB), 39

 

server roles

 

creating, 60–61

 

fixed, 60

 

membership in other server roles, 61

 

sysadmin fixed, 72

 

user-defined, 60–62

 

servers

 

Analysis Services modes, 199–201

 

configuration, 37

 

defaults, configuring, 126

 

managing, 215–216

 

server health rules, 226–227

 

shutdown on audit, 62–63

 

SHA-2, 71

 

shared data sources, for Power View, 233

 

shared service applications, 231

 

configuration, 231–232

 

shared storage, 26

 

SharePoint

 

alerting configuration settings, 250

 

backup and recovery processes, 231

 

Create Alert permission, 247

 

MDS integration, 197

 

PowerPivot for SharePoint, 224–227

 

Reporting Services feature support, 230–231

 

security modules, 71

 

service application configuration, 231–232

 

Shared Service Application pool, 231

 

shared service applications, 229

 

shared service architecture, 230–232

 

SharePoint Central Administration, 231

 

shortcut query files, 194

 

side-by-side migrations, 19–20

 

SIDs, per-service, 72

 

slicers, 243

 

sliding-window scenarios, 7

 

snapshot isolation transaction level, 32

 

snippet picker tooltip, 7

 

snippets, inserting, 7

 

software requirements, 16–17

 

Solution Explorer

 

Connections Manager folder, 98

 

packages, sorting by name, 100

 

Project.params file, 98

 

Sort By Column dialog box, 219

 

source data

 

assessing, 156

 

cleansing, 161–164

 

mapping to domains, 154–155, 158, 161–162,

164

 

matching, 164–166

 

for Power View, 233

 

spatial data, 86–90

 

circular arc segments, 87–89

 

index improvements, 89

 

leveraging, 86

 

performance improvements, 90

 

types, 86–87

 

types improvements, 89

 

sp_audit_write procedure, 63

 

Speller, 162

 

enabling, 146

 

sp_establishId procedure, 66

 

spherical circular arcs, 87–89

 

sp_migrate_user_to_contained procedure, 70

 

SQLAgentUserRole, 232

 

SQL Azure and SQL Server 2012 integration, 9

 

SQL Azure Data Sync, 9

 

SQLCAT (SQL Server Customer Advisory Team), 96

 

SQL Server 2008 R2, 3, 4

 

PowerPivot, 42

 

security capabilities, 57

 

SQL Server 2012

 

32-bit and 64-bit editions, 17

 

Analysis Services, 199–217

 

beyond-relational enhancements, 73–90

 

BI Semantic Model (BISM) schema, 217

 

capabilities, 3

 

Cloud Ready Information Platform, 3

 

data and services ecosystem, 74–75

 

Data Quality Services, 141–173

 

editions, 12–15

 

hardware requirements, 16

 

hybrid solutions, 3

 

licensing, 15–16

 

Master Data Services, 175–198

 

migration, 19–20

 

new features and enhancements, 4–12

 

PowerPivot for Excel, 218–222

 

PowerPivot for SharePoint, 224–227

 

prerequisites for Server Core, 37–38

 

programmability enhancements, 73–90

 

Reporting Services, 229–250

 

security enhancements, 57–72

 

software requirements, 16–17

 

upgrades, 17–19

 

SQL Server Agent, configuring permissions, 231

 

SQL Server Analysis Services (SSAS), 199. See

also Analysis Services

 

SQL Server Audit, 57. See also auditing

 

SQL Server Configuration Manager

 

FILESTREAM, enabling, 78

 

Startup Parameters tab, 10

 

SQL Server Customer Advisory Team (SQLCAT), 96

 

SQL Server Database Engine

 

Deploy Database To SQL Azure wizard, 9

 

scalability and performance enhancements, 6–7

 

SQL Server Data Tools (SSDT). See SSDT (SQL Server

Data Tools)

 

SQL Server Extended Events, 66, 90, 215

 

SQL Server Installation Wizard, 17

 

SQL Server instances

 

hosting of, 26

 

maximum number, 39

 

replica role, displaying, 31

 

specifying, 30

 

TCP/IP, enabling, 142

 

SQL Server Integration Services Expression

Language, 101

 

new functions, 116

 

SQL Server Integration Services Product Samples

link, 97

 

SQL Server Integration Services (SSIS), 93.

 

SQL Server Management Studio (SSMS). See SSMS

(SQL Server Management Studio)

 

SQL Trace audit functionality, 62

 

SSDT (SQL Server Data Tools), 93, 201–202, 213

 

building packages, 119–120

 

deployment process, 120

 

import process, 121

 

project conversion, 121–122

 

ssis_admin role, catalog permissions, 139

 

“SSIS Logging in Denali” (Thomson), 137

 

SSIS Toolbox, 95, 97–98

 

Common category, 97–98

 

Destination Assistant, 103–104

 

Expression Task, 101–102

 

Favorites category, 97–98

 

Source Assistant, 103–104

 

SSMS (SQL Server Management Studio)

 

audit, creating, 64–65

 

availability group configuration, 24

 

built-in reports, 137

 

columnstore indexes, creating, 50

 

Contained Database Authentication, enabling,

68–69

 

converting packages, 122

 

Create A New Availability Group Wizard, 28

 

environments, creating, 132

 

FileTable documents, viewing, 80–81

 

manageability features, 7

 

nontransactional access, enabling, 78–79

 

operations reports, 137

 

Recovery Advisor, 6

 

server defaults, configuring, 126

 

server roles, creating, 60–61

 

staging process, 181–183

 

collections and, 183

 

errors, viewing, 182

 

transaction logging during, 183

 

Standard edition of SQL Server 2012, 13–14

 

startup

 

management, 10

 

times, 5

 

statistical semantic search, 82–85

 

stg.name_Consolidated table, 181

 

stg.name_Leaf table, 181, 197

 

stg.name_Relationship table, 181

 

stg.udp_name_Consolidated procedure, 182

 

stg.udp_name_Leaf procedure, 182

 

stg.udp_name_Relationship procedure, 182

 

storage

 

column store, 42. See also columnstore indexes

 

management, 73

 

row store, 42–43

 

shared vs. nonshared, 26

 

stored procedures, 182

 

stretch clustering, 34

 

string data types

 

filtering, 244

 

language, 146

 

spelling check, 146

 

storage, 215

 

structured data, 11, 73

 

management, 74

 

Structure_Type columns, 216

 

subnets, geographically dispersed, 5

 

survivorship results, 165

 

synchronization, 31

 

synchronous-commit availability mode, 26, 27

 

synchronous database mirroring, 22

 

synchronous data movement, 24

 

synchronous secondaries, 4

 

synonyms, 146–149, 156, 163, 173

 

syntax error checking, 146

 

sysadmin fixed server role, 72

 

Sysprep, 17

 

sys.principal table, 59

 

system variables, accessing, 102

 

systemwide outages, 62

 

database authentication issues, 68

 

T

 

Table Detail Position property, 213

 

Table Import Wizard, 203, 204

 

tables

 

calculated columns, 207

 

columnstore indexes for, 48

 

maintenance, 47

 

partitions, 209–210

 

pivot tables, 218–222, 224–227

 

relationships, 205–206. See also relationships,

table

 

reporting properties, 212–213

 

tabular models, 203–214

 

access to, 210–211

 

analyzing in Excel, 211–212

 

calculated columns, 207

 

Conceptual Schema Definition Language

extensions, 217

 

Conceptual Schema Definition Language,

retrieving, 216

 

database roles, 210–211

 

DAX, 222–224

 

deployment, 214, 222

 

DirectQuery mode, 213

 

hierarchies, 208–209

 

importing data, 204

 

key performance indicators, 208

 

measures, 207–208

 

model designer, 205–206

 

partitions, 209–210

 

perspectives, 209

 

query modes, 214

 

relationships, 205–206

 

reporting properties, 212–213

 

schema rowsets, 216

 

Table Import Wizard, 204

 

tutorial, 203

 

workspace database, 203–204

 

tabular server mode, 199–201

 

task properties, setting directly, 125

 

TCP/IP, enabling on SQL Server instance, 142

 

TDE (Transparent Data Encryption), 57

 

TempDB database, local storage, 5, 34

 

term-based relations, 149. See also domains; domain

values

 

testing tabular models, 211–212

 

Thomson, Jamie, 137

 

thread pools, 216

 

tile containers, 243–244

 

tiles in Power View, 239

 

TOKENCOUNT function, 116

 

TOKEN function, 116

 

transaction logging

 

on secondaries, 33–34

 

during staging process, 183

 

transactions, 198

 

annotating, 198

 

reviewing, 191

 

Transact-SQL

 

audit, creating, 65

 

audit log, stack frame information, 63

 

code snippet template, 8

 

columnstore indexes, creating, 51–52

 

Contained Database Authentication, enabling, 69

 

debugger, 7

 

FILESTREAM file groups and database files,

configuring, 79

 

nontransactional access, enabling, 79

 

server roles, creating, 61

 

user-defined audit event, 65–66

 

transformations, 106–107

 

Transparent Data Encryption (TDE), 57

 

trickle-feed packages, 112

 

trustworthy computing initiative, 10

 

TXT files, extracting data, 105

 

Type columns, 216

 

U

 

Unified Dimensional Model (UDM) schema, 217

 

UNION ALL query, 49

 

unstructured data, 11, 73

 

ecosystem, 74–75

 

management, 74

 

raster data, 87

 

semantic searches, 11

 

storage on file system, 74, 76–77

 

updates, appending new data, 48

 

UpdateSchedule event, 250

 

Upgrade Database Wizard, 176

 

upgrades, 17–19

 

control of process, 18, 20

 

high-level in-place strategy, 18–19

 

rolling, 40

 

version support, 17

 

uptime. See also availability

 

maximizing, 4–6

 

Windows Server Core deployment and, 6

 

user accounts, default schema for, 58–59

 

user databases

 

user authentication into, 9

 

user information in, 68

 

user-defined audit events, 63, 65–66

 

user-defined roles, 60–61

 

server roles, 60–62

 

users

 

authentication, 9

 

migrating, 70

 

V

 

validation, 135

 

varbinary max columns, 74

 

variables

 

adding to packages, 115

 

dynamic values, assigning, 101–102

 

in Expression Builder, 102

 

scope assignment, 115

 

vector data types, 86–87

 

vendor-independent APIs, 75

 

vertical isolation of machine resources, 8

 

VertiPaq compression algorithm, 44

 

VertiPaq mode, 199, 215

 

view filters, 244–245

 

virtual accounts, 72

 

virtual network names of availability groups, 28

 

visualization filters, 245

 

Visual Studio 2010 Tools for Office Runtime, 218

 

Visual Studio, Integration Services environment, 93

 

Visual Studio Tools for Applications (VSTA) 3.0, 99

 

vulnerabilities reported to NIST, 10

 

W

 

Web edition of SQL Server 2012, 15

 

Windows 32 Directory Hierarchy file system, 74

 

Windows Authentication, 71

 

Windows Azure Marketplace, 167–168

 

Windows file namespace support, 11

 

Windows groups, default schema for, 58–59

 

Windows PowerShell

 

MDS administration, 198

 

PowerPivot for SharePoint configuration cmdlets,

224

 

Windows Server 2008 R2, scalability, 7

 

Windows Server Core

 

prerequisites for, 37–38

 

SCONFIG, 37

 

SQL Server 2012 installation on, 5, 17, 36–38

 

SQL Server feature support, 38

 

Windows Server Failover Cluster (WSFC), 5, 26

 

deploying, 24

 

Word 2010 renderer, 230

 

word breakers and stemmers, 82

 

workbooks, Excel

 

caching, 225

 

data refreshes, 227

 

workflow, adding Expression Tasks, 101

 

workloads, read-based, 47

 

worksheets, Excel

 

creating data in, 192

 

data quality matching, 195–196

 

filtering data from MDS, 188–189

 

loading data from MDS, 188–190

 

publishing data to MDS, 191–192

 

refreshing data, 190

 

workspace databases, 203–204

 

backup settings, 203

 

hosts, 203

 

names, 203

 

properties, 203

 

query mode, 214

 

retention settings, 203

 

WSFC (Windows Server Failover Cluster), 5, 24, 26

 

X

 

XLSX files, 229–230

 

XML files

 

annotations, 139

 

attributes, 139

 

refid attribute, 139

 

XML For Analysis (XMLA) create object scripts, 215

 

Z

 

zero-data-loss protection, 23

 

About the Authors

 

ROSS MISTRY is a best-selling author, public speaker, technology evangelist,

community champion, principal enterprise architect at Microsoft, and former

SQL Server MVP.

 

Ross has been a trusted advisor and consultant for many C-level executives and has been responsible for successfully creating technology roadmaps, including the design and implementation of complex technology solutions for some of the largest companies in the world. He has taken on the lead architect role for many Fortune 500 organizations, including Network Appliance, McAfee, The Sharper Image, CIBC, Wells Fargo, and Intel. He specializes in data platform, business productivity, unified communications, core infrastructure, and private cloud.

 

Ross is an active participant in the technology community—specifically the Silicon Valley community. He co-manages the SQL Server Twitter account and frequently speaks at technology conferences around the world such as SQL Pass Community Summit, SQL Connections, and SQL Bits. He is a series author and has written many whitepapers and articles for Microsoft, SQL Server Magazine, and Techtarget.com. Ross’ latest books include Windows Server 2008 R2 Unleashed (Sams, 2010), and the forthcoming SQL Server 2012 Management and Administration (2nd Edition) (Sams, 2012).

 

You can follow him on Twitter at @RossMistry or contact him at http://www.rossmistry.com.

 

STACIA MISNER is a consultant, educator, mentor, and author specializing in Business Intelligence solutions since 1999. During that time, she has authored or co-authored multiple books about BI. Stacia provides consulting and custom education services through Data Inspirations and speaks frequently at conferences serving the SQL Server community. She writes about her experiences with BI at blog.datainspirations.com, and tweets as @StaciaMisner.

 

 

 

Introducing Microsoft SQL Server 2008 R2

 

Contents at a Glance

 

Introduction xvii

 

PART I DATABASE ADMINISTRATION

 

CHAPTER 1 SQL Server 2008 R2 Editions and Enhancements 3

 

CHAPTER 2 Multi-Server Administration 21

 

CHAPTER 3 Data-Tier Applications 41

 

CHAPTER 4 High Availability and Virtualization Enhancements 63

 

CHAPTER 5 Consolidation and Monitoring 85

 

PART II BUSINESS INTELLIGENCE DEVELOPMENT

 

CHAPTER 6 Scalable Data Warehousing 109

 

CHAPTER 7 Master Data Services 125

 

CHAPTER 8 Complex Event Processing with StreamInsight 145

 

CHAPTER 9 Reporting Services Enhancements 165

 

CHAPTER 10 Self-Service Analysis with PowerPivot 189

 

 

 

What do you think of this book? We want to hear from you!

 

Microsoft is interested in hearing your feedback so we can continually improve our

books and learning resources for you. To participate in a brief online survey, please visit:

 

microsoft.com/learning/booksurvey

 

Contents

 

Introduction xvii

 

PART I DATABASE ADMINISTRATION

 

CHAPTER 1 SQL Server 2008 R2 Editions and Enhancements 3

 

SQL Server 2008 R2 Enhancements for DBAs. 3

 

Application and Multi-Server Administration Enhancements 4

 

Additional SQL Server 2008 R2 Enhancements for DBAs 8

 

Advantages of Using Windows Server 2008 R2 . 10

 

SQL Server 2008 R2 Editions. 11

 

Premium Editions 12

 

Core Editions 12

 

Specialized Editions 13

 

Hardware and Software Requirements. 14

 

Installation, Upgrade, and Migration Strategies. 16

 

The In-Place Upgrade 16

 

Side-by-Side Migration 18

 

CHAPTER 2 Multi-Server Administration 21

 

The SQL Server Utility. 21

 

SQL Server Utility Key Concepts 23

 

UCP Prerequisites 25

 

UCP Sizing and Maximum Capacity Specifications 25

 

 

 

Creating a UCP . 26

 

Creating a UCP by Using SSMS 26

 

Creating a UCP by Using Windows PowerShell 28

 

UCP Post-Installation Steps 29

 

Enrolling SQL Server Instances. 29

 

Managed Instance Enrollment Prerequisites 30

 

Enrolling SQL Server Instances by Using SSMS 30

 

Enrolling SQL Server Instances by Using Windows PowerShell 32

 

The Managed Instances Dashboard 32

 

Managing Utility Administration Settings . 33

 

Connecting to a UCP 33

 

The Policy Tab 34

 

The Security Tab 37

 

The Data Warehouse Tab 39

 

CHAPTER 3 Data-Tier Applications 41

 

Introduction to Data-Tier Applications. 41

 

The Data-Tier Application Life Cycle 42

 

Common Uses for Data-Tier Applications 43

 

Supported SQL Server Objects 44

 

Visual Studio 2010 and Data-Tier Application Projects. 45

 

Launching a Data-Tier Application

Project Template in Visual Studio 2010 45

 

Importing an Existing Data-Tier

Application Project into Visual Studio 2010 47

 

Extracting a Data-Tier Application with

SQL Server Management Studio. 49

 

Installing a New DAC Instance with the

Deploy Data-Tier Application Wizard. 52

 

Registering a Data-Tier Application. 55

 

Deleting a Data-Tier Application. 56

 

Upgrading a Data-Tier Application. 59

 

 

 

CHAPTER 4 High Availability and Virtualization Enhancements 63

 

Enhancements to High Availability with Windows Server 2008 R2. 63

 

Failover Clustering with Windows Server 2008 R2. 64

 

Traditional Failover Clustering 65

 

Guest Failover Clustering 67

 

Enhancements to the Validate A Configuration Wizard 68

 

The Windows Server 2008 R2 Best Practices Analyzer 71

 

SQL Server 2008 R2 Virtualization and Hyper-V. 72

 

Live Migration Support Through CSV 72

 

Windows Server 2008 R2 Hyper-V System Requirements 73

 

Practical Uses for Hyper-V and SQL Server 2008 R2 74

 

Implementing Live Migration for SQL Server 2008 R2. 75

 

Enabling CSV 76

 

Creating a SQL Server VM with Hyper-V 76

 

Configuring a SQL Server VM for Live Migration 79

 

Initiating a Live Migration of a SQL Server VM 83

 

CHAPTER 5 Consolidation and Monitoring 85

 

SQL Server Consolidation Strategies. 85

 

Consolidating Databases and Instances 86

 

Consolidating SQL Server Through Virtualization 87

 

Using the SQL Server Utility for Consolidation and Monitoring . 89

 

Using the SQL Server Utility Dashboard. 90

 

Using the Managed Instances Viewpoint . 95

 

The Managed Instances List View Columns 96

 

The Managed Instances Detail Tabs 97

 

Using the Data-Tier Application Viewpoint . 100

 

The Data-Tier Application List View 102

 

The Data-Tier Application Tabs 102

 

 

 

PART II BUSINESS INTELLIGENCE DEVELOPMENT

 

CHAPTER 6 Scalable Data Warehousing 109

 

Parallel Data Warehouse Architecture . 109

 

Data Warehouse Appliances 109

 

Processing Architecture 110

 

The Multi-Rack System 110

 

Hub-and-Spoke Architecture 115

 

Data Management. 115

 

Shared Nothing Architecture 115

 

Data Types 120

 

Query Processing 121

 

Data Load Processing 121

 

Monitoring and Management. 122

 

Business Intelligence Integration. 123

 

Integration Services 123

 

Reporting Services 123

 

Analysis Services and PowerPivot 123

 

CHAPTER 7 Master Data Services 125

 

Master Data Management . 125

 

Master Data Challenges 125

 

Key Features of Master Data Services 126

 

Master Data Services Components. 127

 

Master Data Services Configuration Manager 128

 

The Master Data Services Database 128

 

Master Data Manager 128

 

Data Stewardship . 129

 

Model Objects 129

 

Master Data Maintenance 131

 

Business Rules 132

 

Transaction Logging 134

 

 

 

Integration. 135

 

Importing Master Data 135

 

Exporting Master Data 136

 

Administration. 137

 

Versions 137

 

Security 138

 

Model Deployment 142

 

Programmability. 142

 

The Class Library 142

 

Master Data Services Web Service 143

 

Matching Functions 143

 

CHAPTER 8 Complex Event Processing with StreamInsight 145

 

Complex Event Processing. 145

 

Complex Event Processing Applications 145

 

StreamInsight Highlights 146

 

StreamInsight Architecture. 146

 

Data Structures 147

 

The CEP Server 147

 

Deployment Models 149

 

Application Development. 150

 

Event Types 150

 

Adapters 151

 

Query Templates 154

 

Queries 155

 

Query Template Binding 162

 

The Query Object 163

 

The Management Interface. 163

 

Diagnostic Views 163

 

Windows PowerShell Diagnostics 164

 

 

 

CHAPTER 9 Reporting Services Enhancements 165

 

New Data Sources. 165

 

Expression Language Improvements. 165

 

Combining Data from More Than One Dataset 166

 

Aggregation 168

 

Conditional Rendering Expressions 169

 

Page Numbering 170

 

Read/Write Report Variable 170

 

Layout Control. 171

 

Pagination Properties 172

 

Data Synchronization 173

 

Text Box Orientation 174

 

Data Visualization. 175

 

Data Bars 175

 

Sparklines 176

 

Indicators 176

 

Maps 177

 

Reusability. 178

 

Shared Datasets 179

 

Cache Refresh 179

 

Report Parts 180

 

Atom Data Feed 182

 

Report Builder 3.0. 183

 

Edit Sessions 183

 

The Report Part Gallery 183

 

Report Access and Management. 184

 

Report Manager Improvements 184

 

Report Viewer Improvements 186

 

Improved Browser Support 186

 

RDL Sandboxing 186

 

SharePoint Integration. 187

 

Improved Installation and Configuration 187

 

RS Utility Scripting 187

 

SharePoint Lists as Data Sources 187

 

SharePoint Unified Logging Service 188

 

 

 

CHAPTER 10 Self-Service Analysis with PowerPivot 189

 

PowerPivot for Excel. 190

 

The PowerPivot Add-in for Excel 190

 

Data Sources 191

 

Data Preparation 193

 

PowerPivot Reports 196

 

Data Analysis Expressions 199

 

PowerPivot for SharePoint. 201

 

Architecture 201

 

Content Management 204

 

Data Refresh 205

 

Linked Documents 205

 

The PowerPivot Web Service 205

 

The PowerPivot Management Dashboard. 206

 

Index 207

About the Authors 215

 

Acknowledgments

 

I would like to first acknowledge Shirmattie Seenarine for assisting me on this

title. I couldn’t have written this book without your assistance in such a short

timeframe with everything else going on in my life. Your hard work, contributions,

edits, and perseverance are much appreciated.

 

Thank you to fellow SQL Server MVP Kevin Kline for introducing me to the

former SQL Server product group manager Matt Hollingsworth, who started the

chain of events that led up to this book. In addition, I would like to recognize Ken

Jones, former product planner at Microsoft Press, for taking on this project. I

would also like to thank my coauthor, Stacia Misner, for doing a wonderful job

in writing the second portion of this book, which focuses on business intelligence

(BI). I appreciate your support and talent in the creation of this title.

 

I would also like to recognize the folks at Microsoft Press for providing me with

this opportunity and for putting the book together in a timely manner. Special

thanks goes to Maria Gargiulo, project editor, and Karen Szall, developmental editor,

for driving the project and bringing me up to speed on the “Microsoft Press”

way. Maria, your attention to detail and organizational skills during the multiple

rounds of edits and reviews are much appreciated. Also, thanks to all the folks on

the production team at Online Training Solutions, Inc. (OTSI): Jean Trenary, project

manager; Kathy Krause, copy editor; Rozanne Whalen, technical reviewer; and

Kathleen Atkins, proofreader.

 

This book would not have been possible without the support and assistance

of numerous individuals working for the SQL Server, High Availability, Failover

Clustering, and Virtualization product groups at Microsoft. To my colleagues on

the product team, thanks for your assistance in responding to my questions and

providing chapter reviews:

 

¦ SQL Server Manageability Dan Jones, Principal Group Program

Manager; Omri Bahat, Senior Program Manager; Morgan Oslake, Senior

Program Manager; Alan Brewer, Senior Programming Writer; and Tai Yee,

Program Manager II

 

¦ Clustering, High Availability, Virtualization, and Consolidation

Symon Perriman, Program Manager II; Ahmed Bisht, Senior Program

Manager; Max Verun, Senior Program Manager; Tai Yee, Program Manager;

Justin Erickson, Program Manager II; Zhen-Yu Zhao, SDET II; Madhan

Arumugam, Program Manager Lead II; and Steven Ekren, Senior Program

Manager

 

¦ General Overview and Enhancements Sabrena McBride, Senior Product

Manager

 

 

 


 

Introduction

 

Our purpose in Introducing Microsoft SQL Server 2008 R2 is to point out both

the new and the improved in the latest version of SQL Server. Because this

version is Release 2 (R2) of SQL Server 2008, you might think the changes are

relatively minor—more than a service pack, but not enough to justify an entirely

new version. However, as you read this book, we think you will find that there are a

lot of exciting enhancements and new capabilities engineered into SQL Server 2008 R2

that will have a positive impact on your applications, ranging from improvements

in operation to those in management. It is definitely not a minor release!

 

Who Is This Book For?

 

This book is for anyone who has an interest in SQL Server 2008 R2 and wants to

understand its capabilities. In a book of this size, we cannot cover every feature

that distinguishes SQL Server from other databases, and consequently we assume

that you have some familiarity with SQL Server already. You might be a database

administrator (DBA), an application developer, a power user, or a technical

decision maker. Regardless of your role, we hope that you can use this book to

discover the features in SQL Server 2008 R2 that are most beneficial to you.

 

How Is This Book Organized?

 

SQL Server 2008 R2, like its predecessors, is more than a database engine. It is a

collection of components that you can implement either separately or as a group

to form a scalable data platform. In broad terms, this data platform consists of

two types of components—those that help you manage data and those that help

you deliver business intelligence (BI). Accordingly, we have divided this book into

two parts to focus on the new capabilities for each of these areas.

 

Part I, “Database Administration,” is written with the DBA in mind and introduces

readers to the numerous innovations in SQL Server 2008 R2. Chapter 1, “SQL

Server 2008 R2 Editions and Enhancements,” discusses the key enhancements,

what’s new in the different editions of SQL Server 2008 R2, and the benefits of

running SQL Server 2008 R2 on Windows Server 2008 R2. In Chapter 2, “Multi-

Server Administration,” readers learn how centralized management capabilities

 

 

 


 

are improved with the introduction of the SQL Server Utility Control Point. Step-by-step
instructions show DBAs how to quickly designate a SQL Server instance as

a Utility Control Point and enroll instances for centralized multi-server management.

Chapter 3, “Data-Tier Applications,” focuses on how to streamline deployment

and manage and upgrade database applications with the new data-tier application

feature. Chapter 4, “High Availability and Virtualization Enhancements,”

covers high availability enhancements and includes step-by-step implementations

for ensuring business continuity with SQL Server 2008 R2, Windows Server 2008

R2, and Hyper-V Live Migration. Finally, in Chapter 5, “Consolidation and Monitoring,”

a discussion on consolidation strategies teaches readers how to improve

resource optimization. This chapter also explains how to use the new dashboard

and viewpoints to gain insight into application and database utilization, and it also

covers how to use capacity policy violations to help identify consolidation opportunities,

maximize investments, and ultimately maintain healthier systems.

 

In Part II, “Business Intelligence Development,” readers discover components

new to the SQL Server data platform, as well as significant enhancements to the

reporting component. Chapter 6, “Scalable Data Warehousing,” introduces the

data warehouse appliance known as SQL Server 2008 R2 Parallel Data Warehouse

by explaining its architecture, reviewing data layout strategies for optimal query

performance, and describing the integration points with SQL Server BI components.

In Chapter 7, “Master Data Services,” readers learn about master data

management concepts and the new Master Data Services component. Chapter 8,

“Complex Event Processing with StreamInsight,” describes scenarios that benefit

from complex event analysis, and it illustrates how to develop applications that

use the SQL Server StreamInsight engine for complex event processing. Chapter

9, “Reporting Services Enhancements,” reviews all the new features available in

SQL Server 2008 R2 Reporting Services that support self-service reporting and

address common report design problems. Last, Chapter 10, “Self-Service Analysis

with PowerPivot,” continues the theme of self-service by explaining how users can

integrate disparate data for analysis by using SQL Server PowerPivot for Excel, and

how to centralize and share the results of this analysis by using SQL Server PowerPivot

for SharePoint.

 

Pre-Release Software

 

To help you get familiar with SQL Server 2008 R2 as early as possible after its

release, we wrote this book using examples that work with the Release Candidate

0 (RC0) version of the product. Consequently, the final version might include new

features, and features we discuss might change or disappear. Refer to the “What’s

 

 

 


 

New” topic in SQL Server Books Online at http://msdn.microsoft.com/en-us/library/bb500435(SQL.105).aspx
for the most up-to-date list of changes to the

product. Be aware that you might also notice some minor differences between the

RTM version of the product and the descriptions and screen shots that we provide.

 

Support for This Book

 

Every effort has been made to ensure the accuracy of this book. As corrections or

changes are collected, they will be added to a Microsoft Knowledge Base article

accessible via the Microsoft Help and Support site. Microsoft Press provides support

for books, including instructions for finding Knowledge Base articles, at the

following Web site:

 

http://www.microsoft.com/learning/support/books/

 

If you have questions regarding the book that are not answered by visiting this

site or viewing a Knowledge Base article, send them to Microsoft Press via e-mail

to mspinput@microsoft.com.

 

Please note that Microsoft software product support is not offered through

these addresses.

 

We Want to Hear from You

 

We welcome your feedback about this book. Please share your comments and

ideas via the following short survey:

 

http://www.microsoft.com/learning/booksurvey

 

Your participation will help Microsoft Press create books that better meet your

needs and your standards.

 

NOTE We hope that you will give us detailed feedback via our survey.

If you have questions about our publishing program, upcoming titles, or

Microsoft Press in general, we encourage you to interact with us via Twitter

at http://twitter.com/MicrosoftPress. For support issues, use only the e-mail

address shown above.

 

 

 


 

CHAPTER 1

 

SQL Server 2008 R2 Editions

and Enhancements

 

Microsoft SQL Server 2008 R2 is the most advanced, trusted, and scalable data

platform released to date. Building on the success of the original SQL Server 2008

release, SQL Server 2008 R2 has made an impact on organizations worldwide with its

groundbreaking capabilities, empowering end users through self-service business intelligence

(BI), bolstering efficiency and collaboration between database administrators (DBAs) and application

developers, and scaling to accommodate the most demanding data workloads.

 

This chapter introduces the new SQL Server 2008 R2 features, capabilities, and editions

from a DBA’s perspective. It also discusses why Windows Server 2008 R2 is recommended

as the underlying operating system for deploying SQL Server 2008 R2. Last, SQL

Server 2008 R2 hardware and software requirements and installation strategies are also

identified.

 

SQL Server 2008 R2 Enhancements for DBAs

 

Now more than ever, organizations require a trusted, cost-effective, and scalable database

platform that offers efficiency and managed self-service BI. These organizations

face ever-changing business conditions in the global economy, IT budget constraints,

and the need to stay competitive by obtaining and utilizing the right information at the

right time.

 

With SQL Server 2008 R2, they can meet the pressures head on to achieve these

demanding goals. This release delivers an award-winning enterprise-class database platform

with robust capabilities that improve efficiency through better resource utilization,

end-user empowerment, and scaling out at lower costs. Enhancements to scalability and

performance, high availability, enterprise security, enterprise manageability, data warehousing,

reporting, self-service BI, collaboration, and tight integration with Microsoft

Visual Studio 2010, Microsoft SharePoint 2010, and SQL Server PowerPivot for SharePoint

make it the best database platform available.

 

SQL Server 2008 R2 is considered to be a minor version upgrade of SQL Server 2008.

However, for a minor upgrade it offers a tremendous amount of new, breakthrough

capabilities that DBAs can take advantage of.

 

 

 


 

Microsoft has made major investments in the SQL Server product as a whole; however,

the new features and breakthrough capabilities that should interest DBAs the most are the

advancements in application and multi-server administration. This section introduces some of

the new features and capabilities.

 

Application and Multi-Server Administration Enhancements

 

The SQL Server product group has made sizeable investments in improving application and

multi-server management capabilities. Some of the main application and multi-server administration

enhancements that allow organizations to better manage their SQL Server environments

include

 

¦ The SQL Server Utility This is a new manageability feature used to centrally

monitor and manage database applications and SQL Server instances from a single

management interface known as a Utility Control Point (UCP). Instances of SQL Server,

data-tier applications, database files, and volumes are managed and viewed within the

SQL Server Utility.

 

¦ The Utility Control Point (UCP) As the central reasoning point for the SQL Server

Utility, the Utility Control Point collects configuration and performance information

from managed instances of SQL Server every 15 minutes. After data has been collected

from the managed instances, the SQL Server Utility dashboard and viewpoints in SQL

Server Management Studio (SSMS) provide DBAs with a health summary of SQL Server

resources through policy evaluation and historical analysis. For more information on

the SQL Server Utility, Utility Control Points, and managing instances of SQL Server, see

Chapter 2, “Multi-Server Administration.”

 

¦ Data-tier applications A data-tier application (DAC) is a single unit of deployment

containing all of the database’s schema, dependent objects, and deployment requirements

used by an application. A DAC can be deployed in one of two ways: it can be

authored by using the SQL Server data-tier application project in Visual Studio 2010,

or it can be created by extracting a DAC definition from an existing database with the

Extract Data-Tier Application Wizard in SSMS. Through the use of DACs, the deployment

of data applications and the collaboration between data-tier developers and

DBAs is significantly improved. For more information on authoring, deploying, and

managing data-tier applications, see Chapter 3, “Data-Tier Applications.”

 

¦ Utility Explorer dashboards The dashboards in the SQL Server Utility offer DBAs

tremendous insight into resource utilization and health state for managed instances of

SQL Server and deployed data-tier applications across the enterprise. Before the introduction

of the SQL Server Utility, DBAs did not have a powerful tool included with SQL

Server to assist them in monitoring resource utilization and health state. Most organizations

purchased third-party tools, which resulted in additional costs associated with

 

 

 


 

the total cost of ownership of their database environment. The new SQL Server Utility

dashboards also assist with consolidation efforts. Figure 1-1 illustrates SQL Server Utility

dashboard and viewpoints for providing superior insight into resource utilization and

policy violations.

 

 

 

FIGURE 1-1 Monitoring resource utilization with the SQL Server Utility dashboard and viewpoints

 

¦ Consolidation management Organizations can maximize their investments by

consolidating SQL Server resources onto fewer systems. DBAs, in turn, can bolster their

consolidation efforts through their use of SQL Server Utility dashboards and viewpoints,

which easily identify underutilized and overutilized SQL Server resources across

the SQL Server Utility. As illustrated in Figure 1-2, dashboards and viewpoints make it

simple for DBAs to realize consolidation opportunities, start the process toward eliminating

underutilization, and resolve overutilization issues to create healthier, pristine

environments.

 

 

 

 

 

FIGURE 1-2 Identifying consolidation opportunities with the SQL Server Utility dashboard and

viewpoints

 

¦ Customization of utilization thresholds and policies DBAs can customize the

utilization threshold and policies for managed instances of SQL Server and deployed

data-tier applications to suit the needs of their environments. For example, DBAs can

specify the CPU utilization policies, file space utilization policies, computer CPU utilization

policies, and storage volume utilization policies for all managed instances of SQL Server.

Furthermore, they can customize the global utilization policies for data-tier applications.

For example, a DBA can specify the CPU utilization policies and file space utilization policies

for all data-tier applications. The default policy setting for overutilization is 70 percent,

whereas underutilization is set to 0 percent. By customizing the utilization threshold

policies, DBAs can maintain higher service levels for their SQL Server environments.

 

Figure 1-3 illustrates the SQL Server Utility. In this figure, a Utility Control Point has been

deployed and is collecting health state and resource utilization data from managed instances of

SQL Server and deployed data-tier applications. A DBA is making use of the SQL Server Utility

dashboards and viewpoints included in SSMS to proactively and efficiently manage the database

 

 

 

environment. This can be done at scale, with information on resource utilization throughout the

managed database environment, as a result of centralized visibility. In addition, a data-tier developer

is building a data-tier application with Visual Studio 2010; the newly created DAC package

will be deployed to a managed instance of SQL Server through the Utility Control Point.

 

[Figure: a DBA uses the SSMS Utility dashboard to monitor health state on the Utility Control Point (UMDW and msdb); managed instances upload collection data sets to the UCP; and a developer uses Visual Studio 2010 to deliver a DAC package onto a managed instance.]

 

FIGURE 1-3 The SQL Server Utility, including a UCP, managed instances, and a DAC

 

 

 

In the example in Figure 1-4, a DBA has optimized hardware resources within the environment

by modifying the global utilization policies to meet the needs of the organization. For

example, the global CPU overutilization policies of a managed instance of SQL Server and

computer have been configured to be overutilized when the utilization is greater than 85

percent. In addition, the global file space and storage volume overutilization policies for all

managed instances of SQL Server have been changed to 65 percent.

 

 

 

FIGURE 1-4 Configuring overutilization and underutilization global policies for managed instances

 

For more information on consolidation, monitoring, using the SQL Server Utility dashboards,

and modifying policies, see Chapter 5, “Consolidation and Monitoring.”

 

Additional SQL Server 2008 R2 Enhancements for DBAs

 

This section focuses on the SQL Server 2008 R2 enhancements that go above and beyond

application and multi-server administration. DBAs should be aware of the following new

capabilities:

 

¦ Parallel Data Warehouse Parallel Data Warehouse is a highly scalable appliance

for enterprise data warehousing. It consists of both software and hardware designed to

meet the needs of the largest data warehouses. This solution has the ability to massively

scale to hundreds of terabytes with the use of new technology, referred to as massively

parallel processing (MPP), and through inexpensive hardware configured in a hub-and-

 

 

 

spoke (control node and compute nodes) architecture. Performance improvements can

be attained with Parallel Data Warehouse’s design approach because it partitions large

tables over several physical nodes, resulting in each node having its own CPU, memory,

storage, and SQL Server instance. This design directly eliminates issues with speed and

provides scale because a control node evenly distributes data to all compute nodes.

The control node is also responsible for gathering data from all compute nodes when

returning queries to applications. There isn’t much a DBA needs to do from an implementation

perspective—the deployment and maintenance is simplified because the

solution comes preassembled from certified hardware vendors.

 

¦ Integration with Microsoft SQL Azure The client tools included with SQL Server

2008 R2 allow DBAs to connect to SQL Azure, a cloud-based service. SQL Azure is

part of the Windows Azure platform and offers a flexible and fully relational database

solution in the cloud. The hosted database is built on SQL Server technologies and is

completely managed. Therefore, organizations do not have to install, configure, or deal

with the day-to-day operations of managing a SQL Server infrastructure to support

their database needs. Other key benefits offered by SQL Azure include simplification

of the provisioning process, support for Transact-SQL, and transparent failover. Yet another

enhancement affiliated with SQL Azure is the Generate And Publish Scripts Wizard,

which now includes SQL Azure as both a source and a destination for publishing

scripts. SQL Azure has something for businesses of all sizes. For example, startups and

medium-sized businesses can use this service to create scalable, custom applications,

and larger businesses can use SQL Azure to build corporate departmental applications.

 

¦ Installation of SQL Server with Sysprep Organizations have been using the

System Preparation tool (Sysprep) for many years now to automate the deployment

of operating systems. SQL Server 2008 R2 introduces this technology to SQL Server.

Installing SQL Server with Sysprep involves a two-step procedure that is typically conducted

by using wizards on the Advanced page of the Installation Center. In the first

step, a stand-alone instance of SQL Server is prepared. This step prepares the image;

however, it stops the installation process after the binaries of SQL Server are installed.

To initiate this step, select the Image Preparation Of A Stand-Alone Instance For SysPrep

Deployment option on the Advanced page of the Installation Center. The second

step completes the configuration of a prepared instance of SQL Server by providing

the machine, network, and account-specific information for the SQL Server instance.

This task can be carried out by selecting the Image Completion Of A Prepared Stand-

Alone Instance step on the Advanced page of the Installation Center. SQL Server 2008

R2 Sysprep is recommended for DBAs seeking to automate the deployment of SQL

Server while investing the least amount of their time.

 

¦ Analysis Services integration with SharePoint SQL Server 2008 R2 introduces

a new option to individually select which feature components to install. SQL Server

PowerPivot for SharePoint is a new role-based installation option in which PowerPivot

for SharePoint will be installed on a new or existing SharePoint 2010 server to support

 

 

 

PowerPivot data access in the farm. This new approach promises better integration

with SharePoint while also enhancing SharePoint’s support of PowerPivot workbooks

published to SharePoint. Chapter 10, “Self-Service Analysis with PowerPivot,” discusses

PowerPivot for SharePoint.

 

NOTE In order to use this new installation feature option, SharePoint 2010 must be

installed but not configured prior to installing SQL Server 2008 R2.

 

¦ Premium Editions SQL Server 2008 R2 introduces two new premium editions to

meet the needs of large-scale data centers and data warehouses. The new editions,

Datacenter and Parallel Data Warehouse, will be discussed in the “SQL Server 2008 R2

Editions” section later in this chapter.

 

¦ Unicode Compression SQL Server 2008 R2 supports compression for Unicode
data types. The data types that support compression are the fixed-length nchar(n) and
the variable-length nvarchar(n) data types. Unfortunately, values stored off
row or in nvarchar(max) columns are not compressed. Compression rates of up to 50
percent in storage space can be achieved (a brief example follows this list).

 

¦ Extended Protection SQL Server 2008 R2 introduces support for connecting to the

Database Engine by using Extended Protection for Authentication. Authentication is

achieved by using channel binding and service binding for operating systems that support

Extended Protection.

 

Advantages of Using Windows Server 2008 R2

 

The database platform is intimately related to the operating system. Because of this relationship,

Microsoft has designed Windows Server 2008 R2 to provide a solid IT foundation for

business-critical applications such as SQL Server 2008 R2. The combination of the two products

produces an impressive package. With these two products, an organization can achieve

maximum performance, scalability, reliability, and availability, while at the same time reducing

the total cost of ownership associated with its database platform.

 

It is a best practice to leverage Windows Server 2008 R2 as the underlying operating

system when deploying SQL Server 2008 R2 because the new and enhanced capabilities of

Windows Server 2008 R2 can enrich an organization’s experience with SQL Server 2008 R2.

The new capabilities that have direct impact on SQL Server 2008 R2 include

 

¦ Maximum scalability Windows Server 2008 R2 is capable of achieving unprecedented

workload size, dynamic scalability, and across-the-board availability and reliability.

For instance, Windows Server 2008 R2 supports up to 256 logical processors

and 2 terabytes of memory in a single operating system instance. When SQL Server

2008 R2 runs on Windows Server 2008 R2, the two products together can support

more intensive database and BI workloads than ever before.

 

 

 

¦ Hyper-V improvements Building on the approval and success of the original

Hyper-V release, Windows Server 2008 R2 delivers several new capabilities to the

Hyper-V platform to further improve the SQL Server virtualization experience. First,

availability can be stepped up with the introduction of Live Migration, which makes it

possible to move SQL Server virtual machines (VMs) between Hyper-V hosts without

service interruption. Second, Hyper-V can make use of up to 64 logical processors in

the host processor pool, which allows for consolidation of a greater number of SQL

Server VMs on a single Hyper-V host. Third, Dynamic Virtual Machine Storage, a new

feature, allows for the addition of virtual or physical disks to an existing VM without

requiring the VM to be restarted.

 

¦ Windows Server 2008 R2 Server Manager Server Manager has been optimized

in Windows Server 2008 R2. It is typically used to centrally manage and secure multiple
server roles across servers running Windows Server 2008 R2, including those that host SQL Server.
Remote management of other computers is also possible through Server Manager.

Server Manager also includes a new Best Practices Analyzer tool to report best practice

violations.

 

¦ Best Practices Analyzer (BPA) Although there are only a few roles on Windows

Server 2008 R2 that the BPA can collect data for, this tool is still a good investment

because it helps reduce best practice violations, which ultimately helps fix and prevent

deterioration in performance, scalability, and downtime.

 

¦ Windows PowerShell 2.0 Windows Server 2008 R2 ships with Windows PowerShell

2.0. In addition to allowing DBAs to run Windows PowerShell commands against

remote computers and run commands as asynchronous background jobs, Windows

PowerShell 2.0 features include new and improved Windows Management Instrumentation

(WMI) cmdlets, a script debugging feature, and a graphical environment

for creating scripts. DBAs can improve their productivity with Windows PowerShell by

simplifying, automating, and consolidating repetitive tasks and server management

processes across a distributed SQL Server environment.
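As a small illustration of the kind of repetitive task that Windows PowerShell 2.0 remoting simplifies, the sketch below checks the SQL Server services on several servers in one pass. The server names are hypothetical, and it assumes PowerShell remoting is enabled on the target machines.

# Query the state of the SQL Server and SQL Server Agent services on a set of
# remote servers by using Invoke-Command (PowerShell 2.0 remoting).
$servers = "SQLNODE01", "SQLNODE02", "SQLNODE03"
Invoke-Command -ComputerName $servers -ScriptBlock {
    Get-Service -Name MSSQLSERVER, SQLSERVERAGENT
} | Select-Object PSComputerName, Name, Status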

 

SQL Server 2008 R2 Editions

 

SQL Server 2008 R2 is available in nine different editions. The editions were designed to meet

the needs of almost any customer and are broken down into the following three categories:

 

¦ Premium editions

 

¦ Core editions

 

¦ Specialized editions

 

 

 

Premium Editions

 

The premium editions of SQL Server 2008 R2 are meant to meet the highest demands of

large-scale datacenters and data warehouse solutions. The two editions are

 

¦ Datacenter For the first time in the history of SQL Server, a datacenter edition is offered.

SQL Server 2008 R2 Datacenter provides the highest levels of security, reliability,

and scalability when compared to any other edition. SQL Server 2008 R2 Datacenter delivers

an enterprise-class data platform that provides maximum levels of scalability for

organizations looking to run very large database workloads. In addition, this edition offers

the best platform for the most demanding virtualization and consolidation efforts.

It offers the same features and functionality as the Enterprise edition; however, it differs

by supporting up to 256 logical processors, more than 25 managed instances of SQL

Server enrolled into a single Utility Control Point, unlimited virtualization, multi-instance

dashboard views and drilldowns, policy-based resource utilization evaluation, high-scale

complex event processing with Microsoft SQL Server StreamInsight, and the potential to

sustain up to the maximum amount of memory the operating system will support.

 

¦ Parallel Data Warehouse New to the family of SQL Server editions is SQL Server

2008 R2 Parallel Data Warehouse. It is a highly scalable appliance for enterprise data

warehousing. SQL Server 2008 R2 Parallel Data Warehouse uses massively parallel

processing (MPP) technology and hub-and-spoke architecture to support the largest

data warehouse and BI workloads, from tens or hundreds of terabytes to more than 1

petabyte, in a single solution. SQL Server 2008 R2 Parallel Data Warehouse appliances

are pre-built from leading hardware venders and include both the SQL Server software

and appropriate licenses.

 

Core Editions

 

The traditional Enterprise and Standard editions of SQL Server are considered to be core edition

offerings in SQL Server 2008 R2. The following section outlines the features associated

with both SQL Server 2008 R2 Enterprise and Standard:

 

¦ Enterprise SQL Server 2008 R2 Enterprise delivers a comprehensive, trusted data

platform for demanding, mission-critical applications, BI solutions, and reporting.

Some of the new features included in this edition include support for up to eight processors,

enrollment of up to 25 managed instances of SQL Server into a single Utility

Control Point, PowerPivot for SharePoint, data compression support for UCS-2 Unicode,

Master Data Services, support for up to four virtual machines, and the potential to

sustain up to 2 terabytes of RAM. It still provides high levels of availability, scalability, and

security, and includes classic SQL Server 2008 features such as data and backup compression,

Resource Governor, Transparent Data Encryption (TDE), advanced data mining

algorithms, mirrored backups, and Oracle publishing.

 

 

 

¦ Standard SQL Server 2008 R2 Standard is a complete data management and BI

platform that provides medium-class solutions for smaller organizations. It does not

include all the bells and whistles included in Datacenter and Enterprise; however, it

continues to offer best-in-class ease of use and manageability. Backup compression,

which was an enterprise feature with SQL Server 2008, is now a feature included with

the SQL Server 2008 R2 Standard. Compared to Datacenter and Enterprise, Standard

supports only up to four processors, up to 64 GB of RAM, one virtual machine, and two

failover clustering nodes.

 

Specialized Editions

 

SQL Server 2008 R2 continues to deliver specialized editions for organizations that have

unique sets of requirements.

 

¦ Developer Developer includes all of the features and functionality found in Datacenter;

however, it is strictly meant to be used for development, testing, and demonstration

purposes only. It is worth noting that it is possible to transition a SQL Server

Developer installation that is used for testing or development purposes directly into

production by upgrading it to SQL Server 2008 R2 Enterprise without reinstallation.

 

¦ Web At a much more affordable price compared to Datacenter, Enterprise, and Standard,

SQL Server 2008 R2 Web is focused on service providers hosting Internet-facing

Web serving environments. Unlike Workgroup and Express, this edition doesn’t have

a small database size restriction, and it supports four processors and up to 64 GB of

memory. SQL Server 2008 R2 Web does not offer the same premium features found in

Datacenter, Enterprise, and Standard; however, it is still the ideal platform for hosting Web

sites and Web applications.

 

¦ Workgroup Workgroup is the next SQL Server 2008 R2 edition and is one step below

the Web edition in price and functionality. It is a cost-effective, secure, and reliable

database and reporting platform meant for running smaller workloads than Standard.

For example, this edition is ideal for branch office solutions such as branch data

storage, branch reporting, and remote synchronization. Similar to Web, it supports a

maximum database size of 524 terabytes; however, it supports only two processors

and up to 4 GB of RAM. It is worth noting that it is possible to upgrade Workgroup to

Standard or Enterprise.

 

¦ Express This free edition is the best entry-level alternative for independent software

vendors, nonprofessional developers, and hobbyists building client applications. This

edition is integrated with Visual Studio and is great for individuals learning about databases

and how to build client applications. Express is limited to one processor, 1 GB of

memory, and a maximum database size of 10 GB.

 

 

 

¦ Compact SQL Server 2008 R2 Compact is typically used to develop mobile and small

desktop applications. It is free to use and is commonly redistributed with embedded

and mobile independent software vendor (ISV) applications.

 

NOTE Review “Features Supported by the Editions of SQL Server 2008 R2” at

http://msdn.microsoft.com/en-us/library/cc645993(SQL.105).aspx for a complete comparison

of the key capabilities of the different editions of SQL Server 2008 R2.

 

Hardware and Software Requirements

 

The recommended hardware and software requirements for SQL Server 2008 R2 vary

depending on the component you want to install, the load anticipated on the servers, and

the type of processor class that you will use. Tables 1-1 and 1-2 describe the hardware and

software requirements for SQL Server 2008 R2.

 

Because SQL Server 2008 R2 supports many processor types and operating systems, Table

1-1 strictly covers the hardware requirements for a typical SQL Server 2008 R2 installation.

Typical installations include SQL Server 2008 R2 Standard and Enterprise running on Windows

Server operating systems. If you need information for Itanium-based systems or compatible

desktop operating systems, see “Hardware and Software Requirements for Installing SQL

Server 2008 R2” at http://msdn.microsoft.com/en-us/library/ms143506(SQL.105).aspx.

 

TABLE 1-1 Hardware Requirements

 

Processor
¦ Processor type (64-bit x64): AMD Opteron, AMD Athlon 64, Intel Xeon with Intel EM64T support, or Intel Pentium IV with EM64T support; processor speed minimum 1.4 GHz, 2.0 GHz or faster recommended
¦ Processor type (32-bit): Intel Pentium III-compatible processor or faster; processor speed minimum 1.0 GHz, 2.0 GHz or faster recommended

Memory (RAM)
¦ Minimum: 1 GB
¦ Recommended: 4 GB or more
¦ Maximum: operating system maximum

Disk Space
¦ Database Engine: 280 MB
¦ Analysis Services: 90 MB
¦ Reporting Services: 120 MB
¦ Integration Services: 120 MB
¦ Client components: 850 MB
¦ SQL Server Books Online: 240 MB

 

 

 

 

 

TABLE 1-2 Software Requirements

 

Operating system
¦ Windows Server 2003 SP2 x64 Datacenter, Enterprise, or Standard edition; or
¦ the 64-bit editions of Windows Server 2008 SP2 Datacenter, Datacenter without Hyper-V, Enterprise, Enterprise without Hyper-V, Standard, Standard without Hyper-V, or Windows Web Server 2008; or
¦ Windows Server 2008 R2 Datacenter, Enterprise, Standard, or Windows Web Server

.NET Framework
¦ Minimum: Microsoft .NET Framework 3.5 SP1

SQL Server support tools and software
¦ SQL Server 2008 R2 – SQL Server Native Client
¦ SQL Server 2008 R2 – SQL Server Setup Support Files
¦ Minimum: Windows Installer 4.5

Internet Explorer
¦ Minimum: Windows Internet Explorer 6 SP1

Virtualization
¦ Windows Server 2008 R2, Windows Server 2008, Microsoft Hyper-V Server 2008, or Microsoft Hyper-V Server 2008 R2

 

 

 

 

 

NOTE Server hardware has offered both 32-bit and 64-bit processors for several years;

however, Windows Server 2008 R2 is 64-bit only. Please take this into consideration when

planning SQL Server 2008 R2 deployments on Windows Server 2008 R2.

 

 

 

Installation, Upgrade, and Migration Strategies

 

Like its predecessors, SQL Server 2008 R2 is available in both 32-bit and 64-bit editions, both

of which can be installed either with the SQL Server Installation Wizard or through a command

prompt. As was briefly mentioned earlier in this chapter, it is now also possible to use

Sysprep in conjunction with SQL Server for automated deployments with minimal administrator

intervention.

 

Last, DBAs also have the option to upgrade an existing installation of SQL Server or

conduct a side-by-side migration when installing SQL Server 2008 R2. The following sections

elaborate on the different strategies.

 

The In-Place Upgrade

 

An in-place upgrade is the upgrade of an existing SQL Server installation to SQL Server 2008

R2. When an in-place upgrade is conducted, the SQL Server 2008 R2 setup program replaces

the previous SQL Server binaries with the new SQL Server 2008 R2 binaries on the same

machine. SQL Server data is automatically converted from the previous version to SQL Server

2008 R2. This means that data does not have to be copied or migrated. In the example in

Figure 1-5, a DBA is conducting an in-place upgrade on a SQL Server 2005 instance running

on Server 1. When the upgrade is complete, Server 1 still exists, but the SQL Server 2005

instance, including all of its data, is now upgraded to SQL Server 2008 R2.

 


 

FIGURE 1-5 An in-place upgrade from SQL Server 2005 to SQL Server 2008 R2

 

NOTE SQL Server 2000, SQL Server 2005, and SQL Server 2008 are all supported for an

in-place upgrade to SQL Server 2008 R2. Unfortunately, earlier versions, such as SQL Server
7.0 and SQL Server 6.5, cannot be upgraded to SQL Server 2008 R2.

 

 

 

Side-by-Side Migration

 

The term side-by-side migration describes the deployment of a brand-new SQL Server 2008

R2 instance alongside a legacy SQL Server instance. When the SQL Server 2008 R2 installation

is complete, a DBA migrates data from the legacy SQL Server database platform to the new

SQL Server 2008 R2 database platform. Side-by-side migration is depicted in Figure 1-6.

 

NOTE It is possible to conduct a side-by-side migration to SQL Server 2008 R2 by using

the same server. You can also use the side-by-side method to upgrade to SQL Server 2008 R2
on a single server.

 

[Figure: data is migrated from SQL Server 2005 on Server 1 to SQL Server 2008 R2 on Server 2; after the migration, Server 1 still runs SQL Server 2005 alongside the new Server 2 running SQL Server 2008 R2.]

 

FIGURE 1-6 Side-by-side migration from SQL Server 2005 to SQL Server 2008 R2

 

Side-by-Side Migration Pros and Cons

 

The biggest benefit of a side-by-side migration over an in-place upgrade is the opportunity

to build out a new database infrastructure on SQL Server 2008 R2 and avoid potential migration

issues with an in-place upgrade. The side-by-side migration also provides more granular

control over the upgrade process because it is possible to migrate databases and components
independently of one another. The legacy instance remains online during the migration process.
All of these advantages add up to a lower-risk migration. Moreover, when two instances

are running in parallel, additional testing and verification can be conducted, and rollback is

easy if a problem arises during the migration.

 

 

 

However, there are disadvantages to the side-by-side strategy. Additional hardware might

need to be purchased. Applications might also need to be directed to the new SQL Server

2008 R2 instance, and it might not be a best practice for very large databases because of the

duplicate amount of storage that is required during the migration process.

 

SQL Server 2008 R2 High-Level Side-by-Side Strategy

 

The high-level side-by-side migration strategy for upgrading to SQL Server 2008 R2 consists

of the following steps:

 

1. Ensure that the instance of SQL Server you plan to migrate to meets the hardware and

software requirements for SQL Server 2008 R2.

 

2. Review the deprecated and discontinued features in SQL Server 2008 R2 by referring

to “SQL Server Backward Compatibility” at http://msdn.microsoft.com/en-us/library/cc707787(SQL.105).aspx.

 

3. Although you will not upgrade a legacy instance to SQL Server 2008 R2, it is still beneficial

to run the SQL Server 2008 R2 Upgrade Advisor to ensure that the data being

migrated to the new SQL Server 2008 R2 is supported and that there is nothing suggesting

that a break will occur after migration.

 

4. Procure hardware and install the operating system of your choice. Windows Server

2008 R2 is recommended.

 

5. Install the SQL Server 2008 R2 prerequisites and desired components.

 

6. Migrate objects from the legacy SQL Server to the new SQL Server 2008 R2 database
platform (see the backup and restore sketch after this list).

 

7. Point applications to the new SQL Server 2008 R2 database platform.

 

8. Decommission legacy servers after the migration is complete.
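For step 6, databases are commonly moved with backup and restore. The following is a minimal sketch, not a complete migration plan; the instance names, database name, logical file names, and paths are hypothetical, it assumes the Invoke-Sqlcmd cmdlet from the sqlps environment, and items such as logins, jobs, and linked servers must be migrated separately.

# Back up the database on the legacy instance to a share both servers can reach.
Invoke-Sqlcmd -ServerInstance "LEGACYSRV\SQL2005" -Query @"
BACKUP DATABASE [SalesDB]
TO DISK = N'\\fileshare\migration\SalesDB.bak'
WITH INIT, COPY_ONLY;
"@

# Restore it on the new SQL Server 2008 R2 instance; the database is upgraded
# automatically during recovery. Logical file names vary, so check them with
# RESTORE FILELISTONLY before choosing the MOVE targets.
Invoke-Sqlcmd -ServerInstance "NEWSRV\SQL2008R2" -Query @"
RESTORE DATABASE [SalesDB]
FROM DISK = N'\\fileshare\migration\SalesDB.bak'
WITH MOVE N'SalesDB' TO N'D:\SQLData\SalesDB.mdf',
     MOVE N'SalesDB_log' TO N'E:\SQLLogs\SalesDB_log.ldf';
"@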

 

 

 


 

CHAPTER 2

 

Multi-Server Administration

 

Over the years, an increasing number of organizations have turned to Microsoft SQL

Server because it embodies the Microsoft Data Platform vision to help organizations

manage any data, at any place, and at any time. The biggest challenge organizations
face with this growing number of SQL Server installations has been management.

 

With the release of Microsoft SQL Server 2008 came two new manageability features,

Policy-Based Management and the Data Collector, which drastically changed how database

administrators managed SQL Server instances. With Policy-Based Management,

database administrators can centrally create and enforce policies on targets such as SQL

Server instances, databases, and tables. The Data Collector helps integrate the collection,

analysis, troubleshooting, and persistence of SQL Server diagnostic information. When

introduced, both manageability features were a great enhancement to SQL Server 2008.

However, database administrators and organizations still lacked manageability tools to

help effectively manage a multi-server environment, understand resource utilization, and

enhance collaboration between development and IT departments.

 

SQL Server 2008 R2 addresses concerns about multi-server management with the

introduction of a new manageability feature, the SQL Server Utility. The SQL Server Utility

enhances the multi-server administration experience by helping database administrators

proactively manage database environments efficiently at scale, through centralized

visibility into resource utilization. The utility also provides improved capabilities to help

organizations maximize the value of consolidation efforts and ensure the streamlined

development and deployment of data-driven applications.

 

The SQL Server Utility

 

The SQL Server Utility is a breakthrough manageability feature included with SQL Server

2008 R2 that allows database administrators to centrally monitor and manage database

applications and SQL Server instances, all from a single management interface. This

interface, known as a Utility Control Point (UCP), is the central reasoning point in the

 

 

 


 

SQL Server Utility. It forms a collection of managed instances with a repository for performance

data and management policies. After data is collected from managed instances, Utility

Explorer and SQL Server Utility dashboard and viewpoints in SQL Server Management Studio

(SSMS) provide administrators with a view of SQL Server resource health through policy

evaluation and analysis of trending instances and applications throughout the enterprise.

The following entities can be viewed in the SQL Server Utility:

 

¦ Instances of SQL Server

 

¦ Data-tier applications

 

¦ Database files

 

¦ Volumes

 

Figure 2-1 shows one possible configuration using the SQL Server Utility, which includes a

UCP, many managed instances, and a workstation running SSMS for managing the utility and

viewing the dashboard and viewpoints. The UCP stores configuration and collection information

in both the UMDW and msdb databases.

 

[Figure: SQL Server Management Studio connects to the Utility Control Point (UMDW and msdb); managed instances upload collection data sets to the UCP.]

 

FIGURE 2-1 A SQL Server Utility Control Point (UCP) and managed instances

 

 

 


 

REAL WORLD

 

Many organizations that participate in the Microsoft SQL Server early adopter

program are currently either evaluating SQL Server 2008 R2 or already using

it in their production infrastructure. The consensus is that organizations should
factor a SQL Server Utility into the design of every SQL Server deployment.
The SQL Server Utility allows you to increase visibility and control,

optimize resources, and improve overall efficiencies within your SQL Server infrastructure.

 

 

SQL Server Utility Key Concepts

 

Although many database administrators may be eager to implement a UCP and start proactively

monitoring their SQL Server environment, it is beneficial to take a few minutes and become

familiar with the new terminology and components that make up the SQL Server Utility.

 

¦ The SQL Server Utility This represents an organization’s SQL Server-related entities

in a unified view. The SQL Server Utility supports actions such as specifying resource

utilization policies that track the utilization requirements of an organization. Leveraging

Utility Explorer and SQL Server Utility viewpoints in SSMS can give you a holistic

view of SQL Server resource health.

 

¦ The Utility Control Point (UCP) The UCP provides the central reasoning point

for the SQL Server Utility by using SSMS to organize and monitor SQL Server resource

health. The UCP collects configuration and performance information from managed

instances of SQL Server every 15 minutes. Information is stored in the Utility Management

Data Warehouse (UMDW) on the UCP. SQL Server performance data is then compared

to policies to help identify resource bottlenecks and consolidation opportunities.

 

¦ The Utility Management Data Warehouse (UMDW) The UMDW is a relational

database used to store data collected by managed instances of SQL Server. The UMDW

database is automatically created on a SQL Server instance when the UCP is created.

Its name is sysutility_mdw, and it utilizes the Simple Recovery model. By default, the

collection upload frequency is set to every 15 minutes, and the data retention period is

set to 1 year.

 

 

 

¦ The Utility Explorer user interface A component of SSMS, this interface provides

a hierarchical tree view for managing and controlling the SQL Server Utility. Its uses

include connecting to a utility, creating a UCP, enrolling instances, deploying data-tier

applications, and viewing utilization reports affiliated with managed instances and

data-tier applications. You launch Utility Explorer from SSMS by selecting View and

then choosing Utility Explorer.

 

¦ The Utility Explorer dashboard and list views These provide a summary and

detailed presentations of resource health and configuration details for managed

instances of SQL Server, deployed data-tier applications, and host resources such as

CPU utilization, file space utilization, and volume space utilization. This allows superior
insight into resource utilization and policy violations, and it helps DBAs identify consolidation
opportunities, maximize the value of hardware investments, and maintain healthy
systems. The utility dashboard is depicted in Figure 2-2.

 

 

 

FIGURE 2-2 The SQL Server Utility dashboard

 

 

 

UCP Prerequisites

 

As with other SQL Server components and features, the deployment of a SQL Server UCP

must meet the following specific prerequisites and requirements:

 

¦ The SQL Server version running the UCP must be SQL Server 2008 R2 or higher. (SQL

Server 2008 R2 is also referred to as version 10.5.)

 

¦ The SQL Server 2008 R2 edition must be Datacenter, Enterprise, Evaluation, or

Developer.

 

¦ The SQL Server system running the UCP must reside within a Windows Active Directory

domain.

 

¦ The underlying operating system must be Windows Server 2003, Windows Server

2008, or Windows Server 2008 R2. If Windows Server 2003 is used, the SQL Server

Agent service account must be a member of the Performance Monitor User group.

 

¦ It is recommended that the collation settings affiliated with the Database Engine instance

hosting the UCP be case-insensitive.

 

NOTE The Database Engine instance is the only component that can be managed

by a UCP. Other components, such as Analysis Services and Reporting Services, are not

supported.

 

After all these prerequisites are met, you can deploy the UCP. However, before installing

the UCP, it is beneficial to size the UMDW accordingly and understand the maximum capacity

specifications associated with a UCP.

 

UCP Sizing and Maximum Capacity Specifications

 

The wealth of information captured during capacity planning sessions can help an organization

better understand its environment and make informed decisions when designing the

UCP implementation. In the case of the SQL Server Utility, it is helpful to know that each

SQL Server UCP can manage and monitor up to 100 computers and up to 200 SQL Server

Database Engine instances. Both computers and instances can be either physical or virtual.

Additional UCPs should be provisioned if there is a need to monitor more computers and

instances.

 

Disk space consumption is another area you should look at in capacity planning. For

instance, the disk space consumed within the UMDW is approximately 2 GB of data per year

for each managed instance of SQL Server, whereas the disk space used by the msdb database

on the UCP instance is approximately 20 MB per managed instance of SQL Server. Last, a SQL

Server UCP can support up to a total of 1,000 user databases.
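A quick back-of-the-envelope calculation, using only the approximations above, can help size the UCP instance. The sketch below is an illustration only; the instance count is hypothetical.

# Rough disk estimate for a UCP, based on ~2 GB per managed instance per year
# in the UMDW and ~20 MB per managed instance in msdb.
$managedInstances = 50
$umdwGB = $managedInstances * 2
$msdbMB = $managedInstances * 20
"Estimated UMDW size after one year: $umdwGB GB; estimated msdb overhead: $msdbMB MB"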

 

 

 

Creating a UCP

 

The UCP is relatively easy to set up and configure. You can deploy it either by using the

Create Utility Control Point Wizard in SSMS or by leveraging Windows PowerShell scripts.

The high-level steps for creating a UCP include specifying the instance of SQL Server in which

the UCP will be created, choosing the account to run the utility collection set, ensuring that the

instance is validated and passes the conditions test, reviewing the selections made, and finalizing

the UCP deployment.

 

Although the setup is fairly straightforward, the following conditions must be met to successfully

deploy a UCP:

 

¦ You must have administrator privileges on the instance of SQL Server.

 

¦ The instance of SQL Server must be SQL Server 2008 R2 or higher.

 

¦ The SQL Server edition must support UCP creation.

 

¦ The instance of SQL Server cannot be enrolled with any other UCP.

 

¦ The instance of SQL Server cannot already be a UCP.

 

¦ There cannot be a database named sysutility_mdw on the specified instance of SQL

Server.

 

¦ The collection sets on the specified instance of SQL Server must be stopped.

 

¦ The SQL Server Agent service on the specified instance must be started and configured
to start automatically (see the quick check after this list).

 

¦ The SQL Server Agent proxy account cannot be a built-in account such as Network

Service.

 

¦ The SQL Server Agent proxy account must be a valid Windows domain account on the

specified instance.
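As a quick check of the SQL Server Agent condition in the list above, the service state and start mode can be inspected before running the wizard. This is a sketch only; the computer name is hypothetical, and for a named instance the service is named SQLAgent$<InstanceName> rather than SQLSERVERAGENT.

# Verify that SQL Server Agent is running and set to start automatically on the
# server that will host the UCP (WMI exposes both State and StartMode).
Get-WmiObject -Class Win32_Service -ComputerName "UCPSRV" -Filter "Name = 'SQLSERVERAGENT'" |
    Select-Object Name, State, StartMode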

 

Creating a UCP by Using SSMS

 

It is important to understand how to effectively use the Create Utility Control Point Wizard in

SSMS to create a SQL Server UCP. Follow these steps when using SSMS:

 

1. In SSMS, connect to the SQL Server 2008 R2 Database Engine instance in which the

UCP will be created.

 

2. Launch the Utility Explorer by selecting View and then selecting Utility Explorer.

 

3. On the Getting Started tab, click the Create A Utility Control Point (UCP) link or click

the Create Utility Control Point icon on the Utility Explorer toolbar.

 

4. The Create Utility Control Point Wizard is now invoked. Review the introduction message,

and then click Next to begin the UCP creation process. If you want, you can

select the Do Not Show This Page Again check box.

 

 

 

5. On the Specify The Instance Of SQL Server page, click the Connect button to specify

the instance of SQL Server in which the new UCP will be created, and then click Connect

in the Connect To Server dialog box.

 

6. Specify a name for the UCP, as illustrated in Figure 2-3, and then click Next to continue.

 

 

 

FIGURE 2-3 The Specify The Instance Of SQL Server page

 

NOTE Using a meaningful name is beneficial and easier to remember, especially when

you plan on implementing more than one UCP within your SQL Server infrastructure.

For example, to easily distinguish between multiple UCPs you might name the UCP that

manages the production servers “Production Utility” and the UCP for Test Servers “Test

Utility.” When connected to the UCP, users will be able to distinguish between the different

control points in Utility Explorer.

 

7. On the Utility Collection Set Account page, there are two options available for identifying

the account that will run the utility collection set. The first option is a Windows

domain account, and the second option is the SQL Server Agent service account. Note

that the SQL Server Agent service account can only be used if the SQL Server Agent

service account is leveraging a Windows domain account. For security purposes, it is

recommended that you use a Windows domain account with low privileges. Indicate

that the Windows domain account will be used as the SQL Server Agent proxy account

for the utility collection set, and then click Next to continue.

 

 

 

8. On the next page, the SQL Server instance is compared against a series of prerequisites

before the UCP is created. Failed conditions are displayed in a validation report. Correct

all issues, and then click the Rerun Validation button to verify the changes against

the validation rules. To save a copy of the validation report for future reference, click

Save Report, and then specify a location for the file. To continue, click Next.

 

NOTE As mentioned in the prerequisite steps before these instructions, SQL Server

Agent is, by default, not configured to start automatically during the installation of SQL

Server 2008 R2. Use the SQL Server Configuration Manager tool to configure the SQL

Server Agent service to start automatically on the specified instance.

 

9. Review the options and settings selected on the Summary Of UCP Creation page, and

click Next to begin the installation.

 

10. The Utility Control Point Creation page communicates the steps and report status affiliated

with the creation of a UCP. The steps involve preparing the SQL Server instance

for UCP creation, creating the UMDW, initializing the UMDW, and configuring the SQL

Server Utility collection set. Review each step for success and completeness. If you wish
to save a report of the UCP creation operation, click Save Report and choose
a location for the file. Click Finish to close the Create Utility Control Point Wizard.

 

Creating a UCP by Using Windows PowerShell

 

Windows PowerShell can be used instead of SSMS to create a UCP. The following syntax

(available in the article “How To: Enroll an Instance of SQL Server (SQL Server Utility),” online

at http://msdn.microsoft.com/en-us/library/ee210563(SQL.105).aspx) illustrates how to create

a UCP with Windows PowerShell. You will need to change the elements inside the quotes to

reflect your own desired arguments.

 

NOTE When working with Windows Server 2008 R2, you can launch Windows PowerShell

by clicking the Windows PowerShell icon on the Start Menu taskbar. For more information

on SQL Server and Windows PowerShell, see “SQL Server PowerShell Overview” at

http://msdn.microsoft.com/en-us/library/cc281954.aspx.

 

# Connect to the instance of SQL Server that will host the new UCP.
# "ComputerName\UCP-Name", "Utility", "ProxyAccount", and "ProxyAccountPassword"
# are placeholders; replace them with your own values.
$UtilityInstance = New-Object -Type Microsoft.SqlServer.Management.Smo.Server "ComputerName\UCP-Name";

# Wrap the SMO connection in a SqlStoreConnection for the Utility API.
$SqlStoreConnection = New-Object -Type Microsoft.SqlServer.Management.Sdk.Sfc.SqlStoreConnection $UtilityInstance.ConnectionContext.SqlConnectionObject;

# Create the Utility Control Point, supplying the utility name and the Windows
# domain account (and password) used as the SQL Server Agent proxy for the
# utility collection set.
$Utility = [Microsoft.SqlServer.Management.Utility.Utility]::CreateUtility("Utility", $SqlStoreConnection, "ProxyAccount", "ProxyAccountPassword");

 

 

 

UCP Post-Installation Steps

 

When the Create Utility Control Point Wizard is closed, the Utility Explorer is invoked, and

you are automatically connected to the newly created UCP. The UCP is automatically enrolled

as a managed instance. The data collection process also commences immediately. The

dashboards, status icons, and utilization graphs associated with the SQL Server Utility display

meaningful information after the data is successfully uploaded.

 

NOTE Do not become alarmed if no data is displayed in the dashboard and viewpoints in

the Utility Explorer Content pane; it can take up to 45 minutes for data to appear at first.

All subsequent uploads generally occur every 15 minutes.

 

A beneficial post-installation task is to confirm the successful creation of the UMDW. This

can be done by using Object Explorer to verify that the sysutility_mdw database exists on the

SQL Server instance. At this point, you can modify database settings, such as the initial size of the database, autogrowth settings, and file placement, based on the capacity planning

exercises discussed in the “UCP Sizing and Maximum Capacity Specifications” section earlier

in this chapter.
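
If you prefer to script this check, the following is a minimal Windows PowerShell sketch that uses the same SMO Server class shown earlier in this chapter; it assumes the SQL Server PowerShell environment (where the SMO assemblies are already loaded), and ComputerName\UCP-Name is a placeholder for your UCP instance.

# Verify that the UMDW database (sysutility_mdw) exists on the UCP instance.
$UcpInstance = New-Object -Type Microsoft.SqlServer.Management.Smo.Server "ComputerName\UCP-Name"
if ($UcpInstance.Databases["sysutility_mdw"] -ne $null) {
    Write-Output "UMDW (sysutility_mdw) was found."
} else {
    Write-Output "UMDW (sysutility_mdw) was not found."
}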

 

Enrolling SQL Server Instances

 

After you have established a UCP, the next task is to enroll an instance or instances of SQL

Server into a SQL Server Control Point. Similar to deploying a Utility Control Point, this task is

accomplished by using the Enroll Instance Wizard in SSMS or by leveraging Windows PowerShell.

The high-level steps affiliated with enrolling instances into the SQL Server UCP include

choosing the UCP to utilize, specifying the instance of SQL Server to enroll, selecting the account

to run the utility collection set, reviewing prerequisite validation results, and reviewing

your selections. The enrollment process then begins by preparing the instance for enrollment.

The cache directory is created for the collected data, and then the instance is enrolled into

the designated UCP.

 

IMPORTANT A UCP created on SQL Server 2008 R2 Enterprise can have a maximum of

25 managed instances of SQL Server. If more than 25 managed instances are required, then

you must utilize SQL Server 2008 R2 Datacenter.

 

 

 

Managed Instance Enrollment Prerequisites

 

As with many of the other tasks in this chapter, certain conditions must be satisfied to successfully

enroll an instance:

 

¦ You must have administrator privileges on the instance of SQL Server.

 

¦ The instance of SQL Server must be SQL Server 2008 R2 or higher.

 

¦ The SQL Server edition must support instance enrollment.

 

¦ The instance of SQL Server cannot be enrolled with any other UCP.

 

¦ The instance of SQL Server cannot already be a UCP.

 

¦ The instance of SQL Server must have the utility collection set installed.

 

¦ The collection sets on the specified instance of SQL Server must be stopped.

 

¦ The SQL Server Agent service on the specified instance must be started and configured

to start automatically.

 

¦ The SQL Server Agent proxy account cannot be a built-in account such as Network

Service.

 

¦ The SQL Server Agent proxy account must be a valid Windows domain account on the

specified instance.

 

Enrolling SQL Server Instances by Using SSMS

 

The following steps should be followed when enrolling a SQL Server instance via SSMS:

 

1. In Utility Explorer, connect to the desired SQL Server Utility (for example, Production

Utility), expand the UCP, and then select Managed Instances.

 

2. Right-click the Managed Instances node, and select Enroll Instance.

 

3. The Enroll Instance Wizard is launched. Review the introduction message, and then

click Next to begin the enrollment process. If you want, you can select the Do Not

Show This Page Again check box.

 

4. On the Specify The Instance Of SQL Server page, click the Connect button to specify

the instance of SQL Server to enroll in the UCP.

 

5. Supply the SQL Server instance name, and then click Connect in the Connect To Server

dialog box.

 

6. Click Next to proceed. The Utility Collection Set Account page is invoked.

 

7. There are two options available for specifying an account to run the utility collection

set. The first option is a Windows domain account, and the second option is the SQL

Server Agent service account. You can use the SQL Server Agent service account only if it is itself running under a Windows domain account. For security

purposes, it is recommended that you use a Windows domain account with low

privileges. Specify the Windows domain account to be used as the SQL Server Agent

proxy account for the utility collection set, and then click Next to continue.

 

 

 

8. As shown in Figure 2-4, a series of conditions will be evaluated against the SQL Server

instance to ensure that it passes all of the prerequisites before the instance is enrolled.

If there are any failures preventing the enrollment of the SQL Server instance, correct

them and then click Rerun Validation. To save the validation report, click Save Report

and specify a location for the file. Click Next to continue.

 

 

 

FIGURE 2-4 The SQL Server Instance Validation screen

 

9. Review the Summary Of Instance Enrollment page, and then click Next to enroll your

instance of SQL Server.

 

10. The following actions will be automatically completed on the Enrollment Of SQL Server

Instance page: the instance will be prepared for enrollment, the cache directory for the

collected data will be created, and the instance will be enrolled. Review the results, and

click Finish to finalize the enrollment process.

 

11. Repeat the steps to enroll additional instances.

 

 

 

Enrolling SQL Server Instances by Using

Windows PowerShell

 

Windows PowerShell can also be used to enroll instances. In fact, scripting may be the way to go if there is a need to enroll a large number of instances into a SQL Server UCP. Suppose you need to enroll 200 instances. Using the Enroll Instance Wizard in SSMS can be very time consuming, because the wizard is a manual process in which you can enroll only one instance at a time. In contrast, you can enroll 200 instances with a single script by using Windows PowerShell. The following syntax illustrates how to enroll an instance into a UCP by using Windows PowerShell. Change the elements in the quotes to match your environment.

 

$UtilityInstance = New-Object -Type Microsoft.SqlServer.Management.Smo.Server "ComputerName\UCP-Name";
$SqlStoreConnection = New-Object -Type Microsoft.SqlServer.Management.Sdk.Sfc.SqlStoreConnection $UtilityInstance.ConnectionContext.SqlConnectionObject;
$Utility = [Microsoft.SqlServer.Management.Utility.Utility]::Connect($SqlStoreConnection);
$Instance = New-Object -Type Microsoft.SqlServer.Management.Smo.Server "ComputerName\ManagedInstanceName";
$InstanceConnection = New-Object -Type Microsoft.SqlServer.Management.Sdk.Sfc.SqlStoreConnection $Instance.ConnectionContext.SqlConnectionObject;
$ManagedInstance = $Utility.EnrollInstance($InstanceConnection, "ProxyAccount", "ProxyPassword");
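
As a rough sketch of such a bulk-enrollment script, the calls shown above can simply be wrapped in a loop. This example assumes the SQL Server PowerShell environment; the file path, instance names, and proxy credentials are placeholders.

# Connect to the UCP once, using the same classes as in the syntax above.
$UtilityInstance = New-Object -Type Microsoft.SqlServer.Management.Smo.Server "ComputerName\UCP-Name"
$SqlStoreConnection = New-Object -Type Microsoft.SqlServer.Management.Sdk.Sfc.SqlStoreConnection $UtilityInstance.ConnectionContext.SqlConnectionObject
$Utility = [Microsoft.SqlServer.Management.Utility.Utility]::Connect($SqlStoreConnection)

# Enroll every instance listed in a text file (one ComputerName\InstanceName per line).
foreach ($InstanceName in Get-Content "C:\Scripts\InstancesToEnroll.txt") {
    $Instance = New-Object -Type Microsoft.SqlServer.Management.Smo.Server $InstanceName
    $InstanceConnection = New-Object -Type Microsoft.SqlServer.Management.Sdk.Sfc.SqlStoreConnection $Instance.ConnectionContext.SqlConnectionObject
    $Utility.EnrollInstance($InstanceConnection, "ProxyAccount", "ProxyPassword")
}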

 

The Managed Instances Dashboard

 

After you have enrolled all of your instances associated with a UCP, you can review the Managed

Instances dashboard, as illustrated in Figure 2-5, to gain quick insight into the health

and utilization of all of your managed instances. The Managed Instances dashboard is covered

in Chapter 5, “Consolidation and Monitoring.”

 

 

 

 

 

FIGURE 2-5 The Managed Instances dashboard

 

Managing Utility Administration Settings

 

After you are connected to a UCP, use the Utility Administration node in the Utility Explorer

navigation pane to view and configure global policy settings, security settings, and data

warehouse settings across the SQL Server Utility. The configuration tabs affiliated with the

Utility Administration node are the Policy, Security, and Data Warehouse tabs. The following

sections explore the Utility Administration settings available within each tab. You must first

connect to a SQL Server UCP before modifying settings.

 

Connecting to a UCP

 

Before managing or configuring UCP settings, a database administrator must connect to a

UCP by means of Utility Explorer in SSMS. Use the following procedure to connect to a UCP:

 

1. Launch SSMS and connect to an instance of SQL Server.

 

2. Select View and then Utility Explorer.

 

 

 

3. On the Utility Explorer toolbar, click the Connect To Utility icon.

 

4. In the Connect To Server dialog box, specify a UCP instance, and then click Connect.

 

5. After you are connected, you can deploy data-tier applications, manage instances, and

configure global settings.

 

NOTE It is not possible to connect to more than one UCP at the same time. Therefore,

before attempting to connect to an additional UCP, click the Disconnect From Utility icon

on the Utility Explorer toolbar to disconnect from the currently connected UCP.
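
The connection can also be made from a script. The following is a minimal Windows PowerShell sketch based on the same Utility class calls shown earlier in this chapter; ComputerName\UCP-Name is a placeholder for your UCP instance.

# Connect to the UCP through SMO and the SQL Server Utility API.
$UtilityInstance = New-Object -Type Microsoft.SqlServer.Management.Smo.Server "ComputerName\UCP-Name"
$SqlStoreConnection = New-Object -Type Microsoft.SqlServer.Management.Sdk.Sfc.SqlStoreConnection $UtilityInstance.ConnectionContext.SqlConnectionObject
$Utility = [Microsoft.SqlServer.Management.Utility.Utility]::Connect($SqlStoreConnection)
# $Utility now represents the connected UCP and can be used by subsequent scripted operations.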

 

The Policy Tab

 

You use the Policy tab to view or modify global monitoring settings. Changes on this tab are

effective across the SQL Server Utility. You can view the Policy tab by connecting to a UCP

through Utility Explorer and then selecting Utility Administration. Select the Policy tab in the

Utility Explorer Content pane. Policies are broken down into three sections: Global Policies For

Data-Tier Applications, Global Policies For Managed Instances, and Volatile Resource Policy

Evaluation. To expand the list of values for these options, click the arrow next to the policy

name or click the policy title.

 

Global Policies For Data-Tier Applications

 

Use the first section on the Policy tab, Global Policies For Data-Tier Applications, to view or

configure global utilization policies for data-tier applications. You can set underutilization or

overutilization policy thresholds for data-tier applications by specifying a percentage in the

controls on the right side of each policy description. For example, it is possible to configure

underutilized and overutilized settings for CPU utilization and file space utilization for data

files and logs. Click the Apply button to save changes, or click the Discard or Restore Default

buttons as needed. By default, the overutilized threshold is 70 percent, and the underutilized

threshold is 0 percent.

 

Global Policies For Managed Instances

 

Global Policies For Managed Instances is the next section on the Policy tab. Here you can set

global SQL Server managed instance application monitoring policies for the SQL Server Utility.

As illustrated in Figure 2-6, you can set underutilization and overutilization thresholds to

manage numerous issues, including processor capacity, file space, and storage volume space.

 

 

 

 

 

FIGURE 2-6 Modifying global policies for managed instances

 

Volatile Resource Policy Evaluation

 

The final section on the Policy tab is Volatile Resource Policy Evaluation. This section, displayed

in Figure 2-7, provides strategies to minimize unnecessary reporting noise and unwanted

violation reporting in the SQL Server Utility. You can choose how frequently the CPU

utilization policies can be in violation before reporting the CPU as overutilized. The default

evaluation period for processor overutilization is 1 hour; 6 hours, 12 hours, 1 day, and 1

week can also be selected. The default percentage of data points that must be in violation

before a CPU is reported as being overutilized is 20 percent. The options range from 0 percent

to 100 percent.

 

 

 

 

 

FIGURE 2-7 Volatile resource policy evaluation

 

The next set of configurable elements allows you to determine how frequently CPU utilization

policies should be in violation before the CPU is reported as being underutilized. The default

evaluation period for processor underutilization is 1 week. Options range from 1 day to 1

month. The default percentage of data points that must be in violation before a CPU is reported

as being underutilized is 90 percent. You can choose between 0 percent and 100 percent.

 

To change policies, use the slider controls to the right of the policy descriptions, and then

click Apply. You can also restore default values or discard changes by clicking the buttons at

the bottom of the display pane.

 

REAL WORLD

 

Let’s say you configure the CPU overutilization policies by setting the Evaluate SQL Server Utility Policies Over This Moving Time Window setting to 12 hours and the Percent Of SQL Server Utility Policies In Violation During The Time Window Before CPU Is Reported As Overutilized setting to 30 percent. Because utilization data is uploaded roughly every 15 minutes, a 12-hour window contains 48 policy evaluations; roughly 14 of these (30 percent of 48) must be in violation before the CPU is marked as overutilized.

 

 

 

The Security Tab

 

From a security and authorization perspective, there are two security roles associated with a

UCP. The first role is the Utility Administrator, and the second role is the Utility Reader. The

Utility Administrator is ultimately the “superuser” who has the ability to manage any setting

or view any dashboard or viewpoint associated with the UCP. For example, a Utility Administrator

can enroll instances, manage settings in the Utility Administration node, and much

more. The second security role is the Utility Reader, which has rights to connect to the SQL

Server Utility, observe all viewpoints in Utility Explorer, and view settings on the Utility Administration

node in Utility Explorer.

 

You can use the Security tab in the Utility Administration node of Utility Explorer to view

and provide Utility Reader privileges to a SQL Server login. By default, logins that have

sysadmin privileges on the instance running the UCP automatically have full administrative

privileges over the UCP. A database administrator must use a combination of both Object Explorer

and the Security Tab in Utility Administration to add or modify login settings affiliated

with the UCP.

 

For example, the following steps grant a new user the Utility Administrator role by creating

a new SQL Server login that uses Windows Authentication:

 

1. Open Object Explorer in SSMS, and expand the folder of the server instance that is

running the UCP in which you want to create the new login.

 

2. Right-click the Security folder, point to New, and then select Login.

 

3. On the General page of the Login dialog box, enter the name of a Windows user in the

Login Name box.

 

4. Select Windows Authentication.

 

5. On the Server Roles page, select the check box for the sysadmin role.

 

6. Click OK.

 

By default, this user is now a Utility Administrator, because he or she has been granted the

sysadmin role.
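
If you would rather script this operation, the following is a minimal Windows PowerShell/SMO sketch of the same steps; it assumes the SQL Server PowerShell environment, and the instance name and CONTOSO\UtilityAdmin account are placeholders.

# Create a Windows-authenticated login and add it to the sysadmin server role.
$Server = New-Object -Type Microsoft.SqlServer.Management.Smo.Server "ComputerName\UCP-Name"
$Login = New-Object -Type Microsoft.SqlServer.Management.Smo.Login -ArgumentList $Server, "CONTOSO\UtilityAdmin"
$Login.LoginType = [Microsoft.SqlServer.Management.Smo.LoginType]::WindowsUser
$Login.Create()
# Membership in sysadmin makes this login a Utility Administrator on the UCP.
$Server.Roles["sysadmin"].AddMember("CONTOSO\UtilityAdmin")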

 

The next example grants a standard SQL Server user the read-only Utility Reader privileges for the SQL Server Utility dashboard and viewpoints.

 

1. Open Object Explorer in SSMS, and expand the folder of the server instance that

is running the UCP in which you want to create the new login. For this example,

SQL2K8R2-01\test2 will be used.

 

MORE INFO Review the article “CREATE LOGIN (Transact-SQL)” at the following link

for a refresher on how to create a login in SQL Server: http://technet.microsoft.com

/en-us/library/ms189751.aspx.

 

2. Right-click the Security folder, point to New, and then select Login.

 

 

 

3. On the General page, enter the name of a Windows user in the Login Name box.

 

4. Select Windows Authentication.

 

5. Click OK.

 

NOTE Unlike in the previous example, do not assign this user the sysadmin role on the

Server Role page. If you do, the user will automatically become a Utility Administrator

and not a Utility Reader on the UCP.

 

6. In Utility Explorer, connect to the UCP instance in which you created the login

(SQL2K8R2-01\Test2).

 

7. Select the Utility Administration node, and then select the Security tab in the Utility

Explorer Content pane.

 

8. Next to the newly created user (SQL2K8R2-01\Test2), as shown in Figure 2-8, grant the

Utility Reader privilege, and then click Apply.

 

 

 

FIGURE 2-8 Configuring read-only privileges for the SQL Server Utility

 

 

 

REAL WORLD

 

Many organizations have large teams managing their SQL Server infrastructures

because they have hundreds of SQL Server instances within their environment.

Let’s say you want to grant 50 users the read-only privilege for the SQL Server Utility dashboard and viewpoints. Granting the privilege to each database administrator individually would be very impractical. Therefore, if you have many database administrators and you want to grant them the read-only role for the SQL Server Utility within your environment, you can take advantage of a Role-Based Access (RBA) model to streamline the process.

 

For example, you can create a security group within your Active Directory domain

called Utility Readers and then add all the desired database administrators and Windows

administrator accounts into this group. Then in SSMS, you create a new login

and select the Active Directory security group called Utility Readers. The final step

involves adding the Utility Reader role to the Utility Reader security group on the

Security tab in the Utility Administration node within Utility Explorer. By following

these steps, you provide access to all of your database administrators in a fraction

of the time. In addition, the RBA model makes it considerably easier to manage the ongoing security of the SQL Server Utility.

 

The Data Warehouse Tab

 

You view and modify the data retention period for utilization information collected for managed

instances of SQL Server on the Data Warehouse tab in the Utility Administration node in

Utility Explorer. In addition, the UMDW Database Name and Collection Set Upload Frequency

elements can be viewed; however, they cannot be modified in this version of SQL Server 2008

R2. There are plans to allow these settings to be modified in future versions of SQL Server.

The following steps illustrate how to modify the data retention period for the UMDW:

 

1. Launch SSMS and connect to a UCP through Utility Explorer.

 

2. Select the Utility Administration node in Utility Explorer.

 

3. Click the Data Warehouse tab in the Utility Explorer Content pane.

 

 

 

4. In the Utility Explorer Content pane, select the desired data retention period for

the UMDW, as displayed in Figure 2-9. The options are 1 month, 3 months, 6 months,

1 year, or 2 years.

 

 

 

FIGURE 2-9 Configuring the data retention period

 

5. Click the Apply button to save the changes. Alternatively, click the Discard Changes or

Restore Defaults buttons as needed.

 

 

 


 

CHAPTER 3

 

Data-Tier Applications

 

Ask application developers or database administrators what it was like to work with

data-driven applications in the past, and most probably do not use adjectives such

as “easy,” “enjoyable,” or “wonderful” when they describe their experience. Indeed, the

development, deployment, and even the management of data-driven applications in the

past were a struggle. This was partly because Microsoft SQL Server and Microsoft Visual

Studio were not really outfitted to handle the development of data-driven applications,

the ability to create deployment policies did not exist, and application developers

couldn’t effortlessly hand off a single package to database administrators for deployment.

After a data-driven application was deployed, developers and administrators

found making changes to be a tedious process. Much later in the life cycle of data-driven

applications, they came to the stark realization that there was no tool available to centrally

manage a deployed environment. Obviously, many challenges existed throughout

the life cycle of a data-driven application.

 

Introduction to Data-Tier Applications

 

With the release of Microsoft SQL Server 2008 R2, the SQL Server Manageability team

addressed these struggles by introducing support for data-tier applications to help

streamline the deployment, management, and upgrade of database applications. A data-tier application, also referred to as a DAC, is a single unit of deployment that contains all the elements used by an application, such as the database application schema, instance-level objects, associated database objects, files and scripts, and even a manifest defining

the organization’s deployment requirements.

 

The DAC improves collaboration between data-tier developers and database administrators

throughout the application life cycle and allows organizations to develop, deploy,

and manage data-tier applications in a much more efficient and effective manner than

ever before, mainly because the DAC file functions as a single unit. Database administrators

can now also centrally manage, monitor, deploy, and upgrade data-tier applications

with SQL Server Management Studio and view DAC resource utilization across the SQL

Server infrastructure in Utility Explorer at scale.

 

 

 


 

The Data-Tier Application Life Cycle

 

There are two common methods for generating a DAC. One is to author and build a DAC

using a SQL Server data-tier application project in Microsoft Visual Studio 2010. In the second

method, you can extract a DAC from an existing database by using the Extract Data-Tier Application

Wizard in SQL Server Management Studio. Alternatively, a DAC can be generated

with Windows PowerShell commands.

 

Figure 3-1 illustrates the data-tier application generation and deployment life cycle for

both a new data-tier application project in Visual Studio 2010 and an extracted DAC created

with the Extract Data-Tier Application Wizard in SQL Server Management Studio (SSMS). In

the illustration, the DAC package is deployed to the same instance of SQL Server 2008 R2 in

both methodologies.

 


 

FIGURE 3-1 The data-tier application life cycle

 

 

 


 

Data-tier developers using a data-tier application project template in Visual Studio 2010

first build a DAC and then deploy the DAC package to an instance of SQL Server 2008 R2.

In contrast, database administrators using the Extract Data-Tier Application Wizard in SQL

Server Management Studio generate a DAC from an existing database. The DAC package is

then deployed to a SQL Server 2008 R2 instance. In both methods, the deployment creates a

DAC definition that is stored in the msdb system database and a user database that stores the

objects identified in the DAC definition. Finally, the applications connect to the database associated

with the DAC. Database administrators use the Utility Control Point and Utility Explorer in

SQL Server Management Studio to centrally manage and monitor data-tier applications at scale.

 

Common Uses for Data-Tier Applications

 

Data-tier applications are used in a multitude of ways to serve many different needs. For

example, organizations may use data-tier applications when they need to

 

¦ Deploy a data-tier application for test, staging, and production instances of the Database

Engine.

 

¦ Create DAC packages to tighten integration handoffs between data-tier developers

and database administrators.

 

¦ Move changes from development to production.

 

¦ Upgrade an existing DAC instance to a newer version of the DAC by using the Upgrade

Data-Tier Wizard.

 

¦ Compare database schemas between two data-tier applications.

 

¦ Upgrade database schemas from older versions of SQL Server to SQL Server 2008 R2—

for example, to extract a data-tier application from SQL Server 2000 and then deploy

the package on SQL Server 2008 R2.

 

¦ Consider next generation development, which is achieved by importing an existing

version of a DAC into Visual Studio and then modifying the schema, objects, or deployment

strategies.

 

¦ Author database objects by using source code control systems such as Team

Foundation Server.

 

¦ Integrate data-tier applications with Microsoft SQL Azure. Currently, it is possible to deploy, register, and delete data-tier applications with SQL Azure. Upgrading data-tier applications in SQL Azure is likely to be supported in future releases.

 

 

 

Real World

 

Organizations looking to accelerate and standardize deployment of database

applications within their database environments should leverage data-tier

applications included in SQL Server 2008 R2. By utilizing data-tier applications, an

organization captures intent and produces a single deployment package, providing

a more reliable and consistent deployment experience than ever before. In addition,

data-tier applications facilitate streamlined collaboration between development

and database administrator teams, which improves efficiency.

 

Supported SQL Server Objects

 

Every DAC contains objects used by the application, including schemas, tables, and views.

However, some objects are not supported in data-tier applications. The following list can help

you become acquainted with some of the SQL Server objects that are supported.

 

¦ Database role

 

¦ Function: Inline Table-valued

 

¦ Function: Multistatement Table-valued

 

¦ Function: Scalar

 

¦ Index: Clustered

 

¦ Index: Non-clustered

 

¦ Index: Unique

 

¦ Login

 

¦ Schema

 

¦ Stored Procedure: Transact-SQL

 

¦ Table: Check Constraint

 

¦ Table: Collation

 

¦ Table: Column, including computed columns

 

¦ Table: Constraint, Default

 

¦ Table: Constraint, Foreign Key

 

¦ Table: Constraint, Index

 

¦ Table: Constraint, Primary Key

 

¦ Table: Constraint, Unique

 

¦ Trigger: DML

 

 

 

¦ Type: User-defined Data Type

 

¦ Type: User-defined Table Type

 

¦ User

 

¦ View

 

Database administrators do not have to worry about looking for unsupported objects. This

laborious task is accomplished with the Extract Data-Tier Application Wizard. Unsupported

objects such as DDL triggers, service broker objects, and full-text catalog objects are identified

and reported by the wizard. Unsupported objects are identified with a red icon that represents

an invalid entry. Database administrators must also pay close attention to objects with a yellow

icon, because this communicates a warning. A yellow icon usually warns database administrators

that although an object is supported, it is linked to and quite reliant on an unsupported

object. Database administrators need to review and address all objects with red and yellow

icons. The wizard does not create a DAC package until unsupported objects are removed. For

a list of some common supported objects, review the topic “SQL Server Objects Supported in

Data-tier Applications” at http://msdn.microsoft.com/en-us/library/ee210549(SQL.105).aspx.

 

Visual Studio 2010 and Data-Tier Application Projects

 

By leveraging the new project DAC template in Visual Studio 2010, data-tier developers

can create new data-tier applications from scratch or edit existing data-tier applications by

importing them directly into a project. Data-tier developers then add database objects such

as tables, views, and stored procedures to the data-tier application project. Data-tier developers

can also define specific deployment requirements for the data-tier application. When

the data-tier application project is complete, the data-tier developer creates a single unit of

deployment, known as a DAC file package, from within Visual Studio 2010. This package is

delivered to a database administrator, who deploys it to one or more SQL Server 2008 R2

instances. Alternatively, database administrators can use the DAC package to upgrade an

existing data-tier application that has already been deployed.

 

Launching a Data-Tier Application Project Template in

Visual Studio 2010

 

The following steps describe how to launch a data-tier application project template in Visual

Studio 2010:

 

1. Launch Visual Studio 2010.

 

2. In Visual Studio 2010, select File, and then select New Project.

 

3. In the Installed Templates list, expand the Database node, and then select SQL Server.

 

 

 

4. In the Project Template pane, select Data-Tier Application.

 

5. Specify the name, location, and solution name for the data-tier application, as shown

in Figure 3-2, and click OK.

 

 

 

FIGURE 3-2 Selecting the Data-Tier Application project template in Visual Studio 2010

 

6. Select Project, and then click Add New Item to add and create a database object based

on the Data-Tier Application project template. Some of the database objects included

in the template are scalar-valued function, schema, table, index, login, stored procedure,

user, user-defined table type, view, table-valued function, trigger, user-defined

data type, database role, data generation plan, and inline function.

 

Figure 3-3 illustrates the syntax for creating a sample Employees table schema for a

data-tier application in Visual Studio 2010. The Solution Explorer pane also includes the

other schema objects—specifically the tables associated with the data-tier application.

 

 

 

 

 

FIGURE 3-3 The Create Table schema and the Solution Explorer pane in a Visual Studio 2010

DAC project

 

Importing an Existing Data-Tier Application Project into

Visual Studio 2010

 

Instead of creating a DAC from the ground up in Visual Studio 2010, a data-tier developer can

choose to import an existing data-tier application into Visual Studio 2010 and then either edit

the DAC or completely reverse-engineer it. The following steps enable you to import objects

from a data-tier application package to a data-tier application project in Visual Studio 2010:

 

1. Create a new data-tier application project in Visual Studio.

 

2. In the Visual Studio Solution Explorer pane, navigate to the node for the desired data-tier application project.

 

3. Right-click the node for the desired data-tier application project, and then select the

Import Data-Tier Application Wizard.

 

 

 

4. Review the information on the Welcome page, and then click Next.

 

5. On the Specify Import Options page, select the option that allows you to import from

a data-tier application package.

 

6. Click the Browse button, and navigate to the folder in which you placed the .dacpac to

import. Select the file, and then click Open. Click Next to continue.

 

7. Review the report that shows the status of the import actions, as illustrated in Figure

3-4, and then click Finish.

 

 

 

FIGURE 3-4 Reviewing the results when importing an existing DAC into Visual Studio 2010

 

8. In Schema View, navigate to the dbo schema, navigate to the Tables, Views, and

Stored Procedures nodes, and verify that the objects created are now in the data-tier

application.

 

Data-tier developers and database administrators interested in working in Visual Studio

to initiate any of the actions mentioned in this section, such as importing or creating

a data-tier application, can refer to the article “Creating and Managing Databases and

Data-tier Applications in Visual Studio” at http://msdn.microsoft.com/en-us/library

/dd193245(VS.100).aspx.

 

 

 

Extracting a Data-Tier Application with SQL Server

Management Studio

 

The Extract Data-Tier Application Wizard is another tool that you can use for creating a new

data-tier application. The wizard is included in SQL Server 2008 R2 Management Studio. In this method, the wizard examines an existing SQL Server database, reads its contents and the logins associated with it, and verifies that a new data-tier application can be created. Finally, the wizard either creates a new DAC package or communicates all errors and issues that need to be addressed before one can be created. This approach has a big advantage: the extraction process can be applied to many versions of SQL Server, not just

SQL Server 2008 R2. For example, database administrators can use the wizard to generate a

DAC package from SQL Server 2000, SQL Server 2005, SQL Server 2008, or SQL Server 2008

R2 databases.

 

IMPORTANT DAC definitions remain unregistered when you use the Extract Data-Tier

Application Wizard. Database administrators must use the Register Data-Tier Application

Wizard in SQL Server 2008 R2 Management Studio to register a DAC definition. For

additional information on registering a DAC definition, see the “Registering a Data-Tier

Application” section later in this chapter.

 

Follow these steps to extract a data-tier application:

 

1. In Object Explorer, connect to a SQL Server instance containing the database that

houses the data-tier application to be extracted.

 

2. Expand the Database folder, and select a database to extract.

 

3. Invoke the Extract Data-Tier Application Wizard by right-clicking the desired database,

selecting Tasks, and then selecting Extract Data-Tier Application.

 

4. Review the information on the Introduction page, and then click Next to begin the extraction

process. Select the Do Not Show This Page Again check box if you do not want

the Introduction page displayed in the future when using the wizard.

 

5. On the Set Properties page, illustrated in Figure 3-5, complete the DAC properties by

typing in the application name, version, and description, as described here:

 

¦ Application name This refers to the name of the DAC. Although this name can

be different from the DAC package file, it is recommended that you make it similar

enough so that it still identifies the application.

 

¦ Version The DAC version identification helps developers manage changes when

working in Visual Studio. In addition, the version information helps identify the DAC

package version used during deployment. The DAC version information is stored

in the msdb database and can be viewed in SQL Server Management Studio in the

data-tier applications node.

 

 

 

¦ Description This property is optional. Use it to describe the DAC. If this section

is completed, the information is saved in the msdb database under the data-tier

applications node in Management Studio.

 

 

 

FIGURE 3-5 Specifying DAC properties when using the Extract Data-Tier Application Wizard

 

6. Next, indicate where the DAC package file is to be saved. Remember to use the appropriate

extension, .dacpac. Alternatively, click the Browse button and identify the name

and location for the DAC package file.

 

7. You also have the option to select the Overwrite Existing File check box to replace a

DAC package with the same name. If you choose a name that already exists for a DAC

package, the existing file is not automatically overwritten. Instead, an exclamation mark

appears next to the Browse button. The Next button on the page is also disabled until

you change the name you specified or select the Overwrite Existing File check box.

 

8. After you have entered all the DAC properties, click Next to continue.

 

9. On the Validation And Summary page, illustrated in Figure 3-6, review the information

presented in the DAC properties summary tree because these settings are used to

extract the DAC you specified. The wizard checks and validates object dependencies,

 

 

 

confirms that the information is supported by the DAC, and displays DAC object issues,

DAC object warnings, and DAC objects that are supported. If there are no issues, click Next

to continue. You also have the option to click Save Report to capture the entire report.

 

 

 

FIGURE 3-6 The Extract Data-Tier Application Wizard’s Validation And Summary page

 

NOTE The Next button is disabled on the Validation And Summary page if one or

more objects are not supported by the DAC. These items need to be addressed before

the wizard can proceed. You usually remedy these issues by removing the unsupported

objects from the database and rerunning the wizard.

 

10. The Build Package page is the final screen and is used to monitor the status of the

extraction and build process affiliated with the DAC package file. The wizard extracts

a DAC from the selected database, creates the package in memory, and saves the file

to the location specified in the previous steps. You can also click the links in the Result

column to review the outcome and any additional corresponding steps if required, and

then click Save to capture the entire report. Click Finish to complete the data-tier application

extraction process.

 

 

 

Installing a New DAC Instance with the Deploy

Data-Tier Application Wizard

 

After the DAC package has been created using the data-tier application project template

in Visual Studio 2010, the Extract Data-Tier Application Wizard in SQL Server Management

Studio, or Windows PowerShell commands, the next step is to deploy the DAC package to

a Database Engine instance running SQL Server 2008 R2. This can be achieved by using the

Deploy Data-Tier Application Wizard located in SQL Server Management Studio.

 

During the deployment process, the wizard registers a DAC instance by storing the DAC

definition in the msdb system database, creates the new database, and then populates the

database with all the database objects defined in the DAC. If a DAC is installed on a managed

instance of the Database Engine, the Data-Tier Application is monitored by the SQL Server Utility.

The DAC can be viewed in the Deployed Data-Tier Applications node of the Management

Studio Utility Explorer and reported in the Deployed Data-Tier Applications details page.

 

NOTE Data-tier applications can be deployed only on Database Engine instances of SQL

Server running SQL Server 2008 R2. Unfortunately, SQL Server 2008, SQL Server 2005, and

SQL Server 2000 are not supported when you are deploying data-tier applications. However,

upcoming SQL Server cumulative updates or service packs will include functionality

for down-level support.

 

Follow these steps to deploy a DAC package to an existing SQL Server 2008 R2 Database

Engine instance:

 

1. In Object Explorer, connect to the SQL Server instance in which you plan to deploy the

Data-Tier Application.

 

2. Expand the SQL Server instance, and then expand the Management folder.

 

3. Right-click the Data-Tier Applications node, and then select Deploy Data-Tier Application

to invoke the Deploy Data-Tier Application Wizard.

 

4. Review the information in the Introduction page, and then click Next to begin the

deployment process. Select the Do Not Show This Page Again check box if you do not

want the Introduction page displayed in the future when using the wizard.

 

5. On the Select Package page, specify the DAC package you want to deploy. Alternatively,

use the Browse button to specify the location for the DAC package.

 

6. When the DAC package is selected, verify the DAC details, such as the application

name, version number, and description in the read-only text boxes, as shown in Figure

3-7. Click Next to continue.

 

 

 

 

 

FIGURE 3-7 Specifying a DAC package to deploy with the Deploy Data-Tier Application Wizard

 

NOTE If a database with the same name already exists on the instance of SQL Server,

the wizard cannot proceed.

 

7. The wizard then analyzes the DAC package to ensure that it is valid. If the DAC package

is valid, the Update Configuration page is automatically invoked. Otherwise, an error is

displayed. You need to address the error(s) and start over again.

 

8. On the Update Configuration page, specify the database deployment properties. The

options include

 

¦ Name Specify the name of the deployed DAC and database.

 

¦ Data File Path Accept the default location or use the Browse button to specify

the location and path where the data file will reside.

 

¦ Log File Path Accept the default location or use the Browse button to specify the

location and path where the transaction log file will reside.

 

 

 

9. The next page includes a summary of the settings that are used to deploy the data-tier

application. Review the information displayed in the Summary page and DAC properties

tree to ensure that the actions taken are correct, and then click Next to continue.

 

10. The Deploy DAC page, shown in Figure 3-8, includes results such as success or failure

based on each action performed during the deployment process. These actions include

preparing system tables in msdb, preparing deployment scripts, creating the database,

creating schema objects affiliated with the database, renaming the database, and registering

the DAC in msdb. Review the results for every action to confirm success. You

can also click Save Report to capture the entire report. Then click Finish to complete

the deployment.

 

 

 

FIGURE 3-8 Viewing the deployment and results page associated with deploying the DAC

 

 

 

NOTE Throughout this chapter, you can also use Windows PowerShell scripts in conjunction

with data-tier applications to do many of the tasks discussed, such as

 

¦ Creating data-tier applications.

 

¦ Creating server objects.

 

¦ Loading DAC packages from a file.

 

¦ Upgrading data-tier applications.

 

¦ Deleting data-tier applications.

 

If you are interested in learning more about building Windows PowerShell scripts for

data-tier applications, you can find more information in the white paper “Data-tier

Applications in SQL Server 2008 R2” at http://go.microsoft.com/fwlink/?LinkID=183214.

 

Registering a Data-Tier Application

 

There may be situations in which a database administrator needs to create a data-tier application

based on an existing database and then register and store the newly created DAC definition

for the database in the msdb system database. This execution, often referred to as creating

a DAC in place, is achieved by using either the Register Data-Tier Application Wizard or

Windows PowerShell. Unlike the Extract Data-Tier Application Wizard, which creates a .dacpac

file from an existing database, the Register Data-Tier Application Wizard creates a DAC in place

by registering the DAC definition and metadata in the msdb system database. A DAC registration

can be performed only on a Database Engine instance running SQL Server 2008 R2.

 

Use the following steps to register a data-tier application from an existing database by using

the Register Data-Tier Application Wizard in Management Studio:

 

1. In Object Explorer, connect to a SQL Server instance containing the database you want

to register as a data-tier application.

 

2. Expand the SQL Server instance, and then expand the Databases folder.

 

3. Invoke the Register Data-Tier Application Wizard by right-clicking the desired database,

selecting Tasks, and then selecting Register As Data-Tier Application.

 

4. Review the information on the Introduction page, and then click Next to begin the

registration process. Select the Do Not Show This Page Again check box if you do not

want the Introduction page displayed in the future when using the wizard.

 

 

 

5. On the Set Properties page, complete the DAC properties by typing in the application

name, version, and description, as described here:

 

¦ Application name This refers to the name of the DAC. This value cannot be

altered and is always identical to the name of the database.

 

¦ Version The DAC version identification helps developers working in Visual Studio

identify the version in which they are currently working. In addition, creating a version

helps identify the version of the DAC package used during deployment. The

DAC version information is stored in the msdb database and can be viewed in SQL

Server Management Studio in the Data-Tier Applications node.

 

¦ Description This property is optional. Use it to describe the DAC. If this section

is completed, the information is saved in the msdb database under the Data-Tier

Applications node in Management Studio.

 

6. On the Validation And Summary page, review the information presented in the DAC

properties summary tree because these settings are used to register the specified DAC.

The wizard checks and validates SchemaName, ObjectName, and object dependencies,

and it confirms that the information is supported by the DAC. Review the summary. It

displays DAC object issues, DAC object warnings, and the DAC objects supported. If

there are no issues, click Next to continue. You can also click Save Report to capture

the entire report.

 

7. The Register DAC screen indicates whether or not the DAC was successfully registered

in the msdb system database. Review the success and failure of each action, and then

click Finish to conclude the registration process.

 

The data-tier application can now be viewed under the Data-Tier Applications node in SQL

Server Management Studio. Moreover, if a database resides on a utility-managed instance,

resource utilization associated with the data-tier application can be viewed in Utility Explorer

after you connect to a Utility Control Point.

 

Deleting a Data-Tier Application

 

Database administrators may encounter occasions when they need to delete a data-tier application

from an instance of SQL Server. This is accomplished by using the Delete Data-Tier Application

Wizard in SQL Server Management Studio. Database administrators should be aware that

they will be prompted by the wizard to choose one of three predefined options for handling

the database linked to the application before the DAC is deleted. The three options are

 

¦ Delete Registration This method keeps the associated database and login in

place while deleting the DAC metadata from the instance.

 

¦ Detach Database This method detaches the associated database and removes

the DAC metadata. Detaching the associated database means that although the

data files, log files, and logins remain in place, the database can no longer be referenced

by an instance of the Database Engine.

 

 

 

¦ Delete Database The DAC metadata and the associated database are dropped.

The data and log files are deleted. Logins are not removed.

 

To delete the DAC, follow these steps:

 

1. In Object Explorer, connect to a SQL Server instance containing the data-tier application

you plan to delete.

 

2. Expand the SQL Server instance, and then expand the Management folder.

 

3. Expand the Data-Tier Applications node, right-click the data-tier application you want

to delete, and then select Delete Data-Tier Application.

 

4. Review the information in the Introduction page, and then click Next to begin the deletion

process. Select the Do Not Show This Page Again check box if you do not want

the Introduction page displayed in the future when using the wizard.

 

5. On the Choose Method page, specify the method you want to use to delete the

data-tier application, as illustrated in Figure 3-9. The options are Delete Registration,

Detach Database, and Delete Database. Click Next to continue.

 

 

 

FIGURE 3-9 Choosing the method with which to delete the DAC with the Delete Data-Tier Application

Wizard

 

 

 

6. Review the information displayed in the Summary page, as shown in Figure 3-10.

 

 

 

FIGURE 3-10 Viewing the Summary page when deleting a DAC

 

Ensure that the application name, database name, and delete method are correct. If

the information is correct, click Next to continue.

 

7. On the Delete DAC page, take a moment to review the information. This page communicates

which actions failed or succeeded. Unsuccessful actions have a link next to

them in the Result column. Click the link for detailed information about the error. In

addition, you can click Save Report to save the results on the Delete DAC page to an

HTML file. Click Finish to complete the deletion process and close the wizard.

 

 

 

Upgrading a Data-Tier Application

 

Let us recall the past for a moment, when applying changes to existing database schemas and database applications was a notably challenging task. Database administrators usually

created scripts that included the new or updated database schema changes to be deployed.

The other option was to use third-party tools. Both processes could be expensive, time

consuming, and challenging to manage from a release or build perspective. Today, with SQL

Server 2008 R2, database administrators and developers can upgrade their existing deployed

data-tier applications to a new version of the DAC by simply building a new DAC package that

contains the new or updated schema and properties.

 

The upgrade can be accomplished by using Windows PowerShell commands or the Upgrade

Data-Tier Application Wizard in SQL Server Management Studio. The tools are intended

to upgrade a deployed DAC to a different version of the same application. For example,

an organization may want to upgrade the Accounting DAC from version 1.0 to version 2.0.

The upgrade wizard first preserves the database that will be upgraded by making a copy of

it. It then creates a new database that includes the schema and objects of the new version of

the DAC. The original database’s mode is then set to read-only, and the data is copied to the

new version. After the data transfer is complete, the new DAC assumes the original database

name. The renamed original database remains on the SQL Server instance.

 

There are a few actions that data-tier developers and database administrators should

always perform before a data-tier application upgrade. First, the schema associated with the

original DAC should be compared to the new DAC. Second, database administrators must

confirm that the amount of data held in the existing DAC does not exceed the size limit of the

new DAC database. To upgrade a data-tier application by using the Upgrade Data-Tier Application

Wizard, follow these steps:

 

1. In Object Explorer, connect to a SQL Server instance containing the DAC you want to

upgrade.

 

2. Expand the SQL Server instance, and then expand the Management folder.

 

3. Expand the data-tier applications tree and select the data-tier application that you

want to upgrade.

 

4. Right-click the data-tier application, and select Upgrade Data-Tier Application. This

starts the Upgrade Data-Tier Application Wizard.

 

5. Review the information on the Introduction page, and then click Next to begin the upgrade

process. Select the Do Not Show This Page Again check box if you do not want

the Introduction page displayed in the future when using the wizard.

 

 

 

6. On the Select Package page, specify the DAC package that contains the new DAC version

to upgrade to. Alternatively, you can use the Browse button to specify the location of the

DAC package. When the DAC package is selected, you can verify the DAC details, such as

the application name, version number, and description in the read-only text boxes.

 

IMPORTANT Ensure that the DAC package and the original DAC have the same name.

 

7. When invoked, the Detect Change page starts off by displaying a progress bar while the

wizard verifies differences between the current schema of the database and the objects

in the DAC definition. The change detection results indicate whether the database objects

have changed or remain the same. If the database has changed, you are warned that

there may be data loss if you proceed with the upgrade, as illustrated in Figure 3-11.

Select the Proceed Despite Possible Loss Of Changes check box, and click Next to continue.

 

 

 

FIGURE 3-11 The Detect Change page of the Upgrade Data-Tier Application Wizard

 

 

 

NOTE If the database has changed, it is a best practice to review the potential data

losses before you proceed and verify that this is the outcome you want for the upgraded

database. However, the original database is still preserved, renamed, and maintained

on the SQL Server instance. Any data changes can be migrated from the original database

to the new database after the upgrade is complete.

 

8. The next page includes a summary of the settings that will be used to upgrade the

data-tier application. Review the information displayed in the Summary page and the

DAC properties tree to ensure that the actions to be taken are correct, and then click

Next to continue.

 

9. The Upgrade DAC page, shown in Figure 3-12, includes results, such as the success

or failure of each action performed during the upgrade process. Some of the actions

tested include

 

¦ Validating the upgrade.

 

¦ Preparing system tables in msdb.

 

¦ Preparing the deployment script.

 

¦ Creating the new database.

 

¦ Creating schema objects in the database.

 

¦ Setting the source database as read-only.

 

¦ Disconnecting users from the existing source database.

 

¦ Preparing scripts to copy data from the database.

 

¦ Disabling constraints on the database.

 

¦ Setting the database to read/write.

 

¦ Renaming the database.

 

¦ Upgrading the DAC metadata in msdb to reflect the new DAC version.

 

Review the result for every action. You can also click Save Report to capture the entire

report. Then click Finish to complete the upgrade.

 

 

 

 

 

FIGURE 3-12 Reviewing the result information on the Upgrade DAC page

 

NOTE Data-tier applications are a large and intricate subject. See the following

sources for more information:

 

¦ “Designing and Implementing Data-tier Applications” at http://msdn.microsoft.com

/en-us/library/ee210546(SQL.105).aspx

 

¦ “Creating and Managing Data-tier Applications” at http://msdn.microsoft.com

/en-us/library/ee361996(VS.100).aspx

 

¦ Tutorials at http://msdn.microsoft.com/en-us/library/ee210554(SQL.105).aspx

 

¦ “Data-tier Applications in SQL Server 2008 R2” white paper at http://go.microsoft.com

/fwlink/?LinkID=183214

 

 

 


 

CHAPTER 4

 

High Availability and

Virtualization Enhancements

 

Microsoft SQL Server 2008 R2 delivers several enhancements in the areas of high

availability and virtualization. Many of the enhancements are affiliated with the

Windows Server 2008 R2 operating system and the Hyper-V platform. Windows Server

2008 R2 builds on the successes and foundation of Windows Server 2008 by expanding

on the existing high availability technologies, while adding new features that allow

for maximum availability and reliability for SQL Server 2008 R2 implementations. This

chapter discusses the enhancements to high availability that significantly contribute to

the capabilities of SQL Server 2008 R2 in both physical and virtual environments.

 

Enhancements to High Availability with

Windows Server 2008 R2

 

In the following list are a few of the improvements that will appeal to SQL Server and

Windows Server professionals looking to gain maximum high availability within their

database infrastructures.

 

¦ Hot add CPU and memory When using SQL Server 2008 R2 in conjunction

with Windows Server 2008 R2, database administrators can upgrade hardware

online by dynamically adding processors and memory to a system that supports

dynamic hardware partitioning. This is a very convenient feature for organizations

that cannot endure downtime for SQL Server systems running in mission-critical

environments.

 

¦ Failover clustering Greater high availability is achievable for SQL Server 2008 R2 with

failover clustering on Windows Server 2008 R2. Windows Server 2008 R2 enhances

the failover cluster installation experience by increasing the number of validation

tests within the Cluster Validation Wizard. Moreover, Windows Server 2008 R2

introduces a Best Practices Analyzer tool to help database administrators reduce best

practice violations. Similar to its predecessor, Windows Server 2008 R2 continues to support up to 16 nodes within a failover cluster. Organizations can also protect their applications from site failures with SQL Server multi-site failover cluster support by using stretched VLANs built on the Windows Server support for multi-site clusters.

 

 

 


 

¦ Windows Server 2008 R2 Hyper-V The Hyper-V virtualization technology improvements

in Windows Server 2008 R2 were the most sought-after and anticipated

enhancements for Windows Server 2008 R2. It is now possible to virtualize heavy SQL

Server workloads because Windows Server 2008 R2 scales far beyond its predecessors.

In addition, database administrators can achieve increased virtualization availability by

leveraging new technologies, such as Clustered Shared Volumes (CSV) and Live Migration,

both of which are included in Windows Server 2008 R2. Guest clustering with SQL

Server 2008 R2 in Windows Server 2008 R2 Hyper-V is also supported.

 

¦ Live Migration and Hyper-V By leveraging Live Migration and CSV—two new

technologies included with Hyper-V and failover clustering on Windows Server 2008

R2—it is possible to move virtual machines between Hyper-V hosts within a failover

cluster without downtime. It is worth noting that CSV and Live Migration are independent

technologies; CSV is not required for Live Migration.

 

¦ Cluster Shared Volumes (CSV) CSV enables multiple Windows servers running

Hyper-V to access Storage Area Network (SAN) storage using a single consistent

namespace for all volumes on all hosts. This provides the foundation for Live Migration

and allows for the movement of virtual machines between Hyper-V hosts.

 

¦ Dynamic virtual machine (VM) storage It is possible to add or remove virtual

hard disk (VHD) files and pass-through disks while a VM is running. Support for hot

plugging and hot removal of storage is based on Hyper-V. This is very handy when you

are working with dynamic SQL Server 2008 R2 storage workloads, which are continuously

evolving.

 

¦ Second Level Address Translation (SLAT) Enhanced processor support and

memory management can be achieved with SLAT, which is a new feature supported

with Hyper-V in Windows Server 2008 R2. SLAT leverages Intel Virtualization Technology

(VT) Extended Page Tables (EPT) and AMD-V Rapid Virtualization Indexing (RVI)

technology in an effort to reduce the overhead incurred during mapping of a guest

virtual address to a physical address for virtual machines. This significantly reduces

hypervisor CPU time and saves memory for each VM, allowing the physical computer

to do more work while utilizing fewer system resources.

 

Failover Clustering with Windows Server 2008 R2

 

If you’re unfamiliar with failover clustering, don’t stop reading to run out and purchase a book

on the topic—this section begins with an overview of failover clustering. It may surprise some

readers to know that SQL Server failover clustering has been available since Microsoft SQL

Server 7.0. Back in those days, failover clustering proved to be quite a challenge to set up. It

was necessary to install multiple Microsoft products to form the Microsoft cluster environment,

 

 

 


 

including Internet Information Services (IIS), Cluster Server, SQL Server 7.0 Enterprise Edition,

Microsoft Distributed Transaction Coordinator (MSDTC) 2.0, and sometimes the Windows NT

4.0 Option Pack. Moreover, the hardware support, driver support, and documentation were

not as forthcoming as they are today. Many IT organizations came to believe that failover

clustering was a difficult technology to install and maintain. That has all changed, thanks to the

efforts of the SQL Server and Failover Clustering product groups at Microsoft. Today, forming a

cluster with SQL Server 2008 R2 on Windows Server 2008 R2 is very easy. In addition, the two

technologies combined provide maximum availability compared to previous versions, especially

for database administrators who want to virtualize their SQL Server workloads.

 

Now that you know some of the history behind failover clustering, it’s time to take a

closer look into what failover clustering is all about and what it means for organizations and

database administrators. A SQL Server failover cluster is built on the foundation of a Windows

failover cluster, while providing high availability and protecting the whole instance of SQL

Server in the event of a server failure. Failover clustering allows organizations to meet their

high availability uptime requirements through redundancy in their SQL Server infrastructure

by eliminating single points of failure for the clustered application. The server that is used to

form a cluster can be either physical or virtual. The next section introduces the different types

of failover clusters that can be achieved with these two products (SQL Server 2008 R2 and

Windows Server 2008 R2), which work very well with one another.

 

Traditional Failover Clustering

 

The traditional SQL Server failover cluster has been around for years. With a traditional

failover cluster, there are two or more nodes (servers) connected to shared storage. A quorum

is formed between all nodes in the failover cluster, and this quorum determines the health

and number of failures the failover cluster can sustain. Communication between cluster nodes

is required for cluster operations and is achieved by using two or more independent networks

that connect the nodes of a cluster to avoid a single point of failure. SQL Server 2008 R2 is

installed on all nodes within a failover cluster. If a node in the cluster fails, the SQL Server

instance automatically fails over to a surviving node within the failover cluster. Note that the

failover is seamless from an end-user or application perspective. Like its predecessor, SQL

Server 2008 R2 delivers single-instance and multiple-instance failover cluster configurations.

In addition, SQL Server 2008 R2 on Windows Server 2008 R2 supports up to 16 nodes and a

maximum of 23 instances within a failover cluster due to the drive letter limitation.

 

IMPORTANT When you are configuring a cluster, make sure to connect the nodes by

more than one network; otherwise Microsoft Product Support Services does not support

the implementation. In addition, it is a best practice to always use more than one network.

 

 

 

Figure 4-1 illustrates a two-node single-instance failover cluster running SQL Server on

Windows Server 2008 R2.

 

FIGURE 4-1 A two-node single-instance failover cluster

 

Figure 4-2 illustrates a multiple-instance failover cluster running SQL Server on Windows

Server 2008 R2.

 

FIGURE 4-2 A two-node multiple-instance failover cluster

 

 

 

Guest Failover Clustering

 

In the past, physical servers were usually affiliated with the nodes in a failover cluster. Today,

virtualization technologies make it possible to form a cluster with each node being a guest

operating system on virtual servers. This is known as guest failover clustering. To achieve a

guest failover cluster, you must have a quorum, a public network, a private network, and

shared storage; however, instead of using physical servers for each node in the SQL Server

failover cluster, each node is virtualized through Hyper-V. Organizations taking advantage of

guest failover clustering with SQL Server 2008 R2 must have the physical host running Hyper-V

on Windows Server 2008 R2, and the configurations must be certified through the Server Virtualization

Validation Program (SVVP). Likewise, the guest operating system must be Windows

Server 2008 R2, and the virtualization environment must meet the requirements of Windows

Server 2008 R2 failover clustering, including passing the Validate a Configuration tests.

 

NOTE When implementing failover clusters, you can combine both physical and virtual

nodes in a single failover cluster solution.

 

Figure 4-3 illustrates a multiple-instance guest failover cluster running SQL Server 2008

R2 on Windows Server 2008 R2. SQLNode1 is a virtual machine running on the server called

Hyper-V01, which is a Hyper-V host, and SQLNode2 is a virtual machine running on the

Hyper-V02 Hyper-V host.

 

FIGURE 4-3 A two-node guest failover cluster

 

 

 

NOTE Guest clustering is also supported when Hyper-V is on Windows Server 2008.

However, Windows Server 2008 R2 provides Live Migration for moving virtual machines

between physical hosts. This is much more beneficial for a virtualized environment running

SQL Server 2008 R2.

 

Real World

 

When you use guest failover clustering, make sure that the virtualized guest

operating systems used for the nodes in the guest failover cluster are not

on the same physical Hyper-V host. If they are, that single Hyper-V host becomes a
single point of failure for the entire guest failover cluster. For

example, if a single physical host running all of the guest operating systems suddenly

failed, all the nodes associated with the guest failover cluster would no longer

be available, ultimately causing the whole SQL Server failover cluster instance to fail.

This could be catastrophic in a mission-critical production environment. This problem

can be avoided, however, if you use multiple Hyper-V hosts and Live Migration,

and ensure that each guest operating system is running on a separate Hyper-V host.
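To make this placement rule concrete, here is a short, purely illustrative Python sketch that flags guest cluster nodes sharing a physical Hyper-V host. The VM-to-host mapping and all names are hypothetical.

# Illustrative only: flag guest failover cluster nodes that share a Hyper-V host,
# since co-located nodes turn that host into a single point of failure.
from collections import defaultdict

def find_single_points_of_failure(vm_to_host):
    """Return hosts that run more than one node of the same guest cluster."""
    hosts = defaultdict(list)
    for vm, host in vm_to_host.items():
        hosts[host].append(vm)
    return {host: vms for host, vms in hosts.items() if len(vms) > 1}

# Hypothetical placement of the two guest cluster nodes from Figure 4-3.
placement = {"SQLNode1": "Hyper-V01", "SQLNode2": "Hyper-V01"}  # both on one host: risky
for host, vms in find_single_points_of_failure(placement).items():
    print(f"Warning: {', '.join(vms)} all run on {host}; move one node to another host.")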

 

Enhancements to the Validate A Configuration Wizard

 

As mentioned earlier in this chapter, organizations in the past found it difficult to implement a

SQL Server failover cluster. One thing that clearly stood out was the need for an intuitive tool

that could verify whether or not an organization’s configuration met the failover clustering

prerequisites. This issue was addressed with the introduction of Windows Server 2008, which

offered for the first time a tool called the Validate A Configuration Wizard.

 

Database administrators and Windows administrators used this tool to conduct validation

tests to determine whether servers, settings, networks, and storage affiliated with a failover

cluster were set up correctly. This tool was also used to verify whether or not prerequisite tasks

were met and to confirm that the hardware supported a successful cluster implementation.

 

The Validate A Configuration Wizard tool included with Windows Server 2008 R2 still

delivers inventory, network, storage, and system configuration tests. In addition, the Failover

Clustering product team made enhancements to the Validate A Configuration Wizard tool

that further improve the testing ability of this tool. Some of the enrichments include the following

options:

 

¦ Cluster Configuration

 

• List Cluster Core Groups

 

• List Cluster Network Information

 

• List Cluster Resources

 

 

 

• List Cluster Volumes

 

• List Cluster Services And Applications

 

• Validate Quorum Configuration

 

• Validate Resource Status

 

• Validate Service Principal Name

 

• Validate Volume Consistency

 

¦ Network

 

• List Network Binding Order

 

• Validate Multiple Subnet Properties

 

¦ System Configuration

 

• Validate Cluster Service And Driver Settings

 

• Validate Memory Dump Settings

 

• Validate System Drive Variable

 

NOTE The wizard tests configurations and also lists information. See “Failover Cluster
Step-by-Step Guide: Validating Hardware for a Failover Cluster,” a TechNet article that
describes each test in detail, at http://technet.microsoft.com/en-us/library/cc732035(WS.10).aspx.

 

Running the Validate A Configuration Wizard

 

Prior to installing a failover cluster for SQL Server 2008 R2 on Windows Server 2008 R2, administrators

should run the Validate A Configuration Wizard tool by following these steps:

 

1. Ensure that the failover clustering feature is installed on all the nodes associated with

the new cluster being validated.

 

2. On one of the nodes of the cluster, open the Failover Cluster Management snap-in.

 

3. Review the information on the Before You Begin page, and then click Next. You can

select the option to hide this page when using the wizard in the future.

 

4. On the Select Servers Or A Cluster page, in the Enter Name field, type either the host

name or the fully qualified domain name (FQDN) of a node in the cluster. Alternatively,

you can click the Browse button and select one or more nodes in the cluster. Click Next

to continue.

 

5. On the Testing Options page, select Run All Tests or Run Only Tests I Select, and then

click Next. It is recommended that you choose Run All Tests when using the wizard for

the first time. The tests are organized into Inventory, Network, Storage, and System

Configuration categories.

 

 

 

6. On the Confirmation page, review the details for each test, and then click Next to

begin the validation process. While the validation process is running, status information

is continually displayed on the Validating page until all tests are complete. After all

tests are complete, the Summary page is displayed, as shown in Figure 4-4. It includes

the results of the validation tests and numerous details about the information collected

during each test. Any errors or warnings listed in the validation results should

be investigated and rectified as soon as possible. It is also possible to proceed without

fixing errors; however, the failover cluster will not be supported by Microsoft.

 

 

 

FIGURE 4-4 The Failover Cluster Validation Report

 

7. Click View Report to observe the report in the default Web browser. The report is displayed

in Web archive (.mht) format. Click Finish to close the wizard.

 

NOTE The Validate A Configuration Wizard is quite useful for troubleshooting a failover

cluster. Administrators who run tests relating to the specific issues they are experiencing

are likely to obtain valuable information and answers on how to address their issues. For example,

if you are experiencing issues with Multipath I/O (MPIO), a specific driver, or shared

storage after a successful implementation of a failover cluster, the wizard would identify

the problem for quick resolution.
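Windows Server 2008 R2 also ships failover clustering cmdlets for Windows PowerShell, including Test-Cluster, which runs the same validation tests without the wizard. The following Python sketch simply shells out to PowerShell; it is a minimal illustration that assumes it runs on a cluster node with the Failover Clustering feature and its PowerShell module installed, and the node names are placeholders.

# Minimal sketch: run cluster validation non-interactively via the Test-Cluster cmdlet.
# Assumes Windows Server 2008 R2 with the FailoverClusters PowerShell module installed.
import subprocess

nodes = ["Node1", "Node2"]  # placeholder node names
command = (
    "Import-Module FailoverClusters; "
    f"Test-Cluster -Node {','.join(nodes)}"
)
result = subprocess.run(
    ["powershell.exe", "-NoProfile", "-Command", command],
    capture_output=True, text=True,
)
print(result.stdout)  # Test-Cluster reports the location of the .mht validation report
if result.returncode != 0:
    print("Validation run failed:", result.stderr)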

 

 

 

The Windows Server 2008 R2 Best Practices Analyzer

 

Another tool available in Windows Server 2008 R2 is a server management tool referred to

as the Best Practices Analyzer (BPA). The BPA determines how compliant a server role is by

comparing it against best practices in eight categories: security, performance, configuration,

policy, operation, pre-deployment, post-deployment, and BPA prerequisites. In each category,

the effectiveness, trustworthiness, and reliability of a role are taken into consideration. Each

role measured by the BPA will be assigned one of the following three severity levels: Noncompliant,

Compliant, or Warning. A server role not in agreement with best practice guidelines

is labeled as Noncompliant, and a role in agreement with best practice guidelines is

labeled as Compliant. Server roles inherit the Warning severity level when a BPA scan detects

compliance but also a risk that the server role will fall out of compliance.

 

Database administrators find this tool instrumental in achieving success with their failover

cluster setup. First, the Windows Server 2008 R2 BPA can help database administrators reduce

best-practice violations by scanning one or more roles installed on a server running Windows

Server 2008 R2. On completion, the BPA creates a report that itemizes every best-practice

violation, from the most severe to the least severe. It is also possible to customize a BPA

report. For example, database administrators can omit results they deem unnecessary or

unimportant. Last, administrators can also perform BPA tasks by using either the Server Manager

GUI or Windows PowerShell cmdlets.

 

Running the Best Practices Analyzer

 

The BPA is installed by default on all editions of Windows Server 2008 R2 except the Server

Core installation option. If BPA is installed on your edition, run it in Server Manager. Follow

these steps:

 

1. Click Start, click Administrative Tools, and then select Server Manager.

 

2. Open Roles from the navigation pane. Next, select the role to be scanned with BPA.

 

3. Open the Summary section in the details pane. Next, open the Best Practices Analyzer

area.

 

4. Click Scan This Role to initiate the scan.

 

5. When the scan is complete, review the results in the Best Practices Analyzer results

window.

 

 

 

SQL Server 2008 R2 Virtualization and Hyper-V

 

Virtualization is one of the hottest topics of discussion in almost every SQL Server architecture

design session or executive briefing session, mainly because organizations are beginning to

understand the immediate and long-term benefits virtualization can offer them. SQL Server

virtualization not only promises to be very positive and rewarding from an environmental

perspective—reducing power and thermal costs which translate to green IT—it also promises

to help organizations achieve strategic business objectives and consolidation goals, including

lower hardware costs, smaller data centers, and less management associated with SQL Server.

 

As a result, increasing numbers of organizations are showing interest in virtualizing their

SQL Server workloads, including their test, staging, and even production environments. This

trend toward virtualization has undoubtedly become stronger with the release of Windows

Server 2008 R2, which includes Live Migration and Cluster Shared Volumes (CSV). By leveraging

Live Migration and CSV, organizations can achieve high availability for SQL Server virtual

machines (VMs). In addition, it is possible to move virtualized SQL Server 2008 R2 guest operating

systems between physical Hyper-V hosts without any perceived downtime.

 

Live Migration Support Through CSV

 

Live Migration is a new Hyper-V feature in Windows Server 2008 R2 that is used to increase

high availability of SQL Server VMs. By leveraging the new Live Migration feature, organizations

can transparently move SQL Server 2008 R2 VMs from one Hyper-V physical host to

another Hyper-V physical host within the same cluster, without disrupting the services of the

guest operating system or SQL Server application running on the VM. This is achieved via an

intricate process. First, all VM memory pages are transferred from the source Hyper-V physical

host to the destination Hyper-V physical host. Second, any modifications the running VM makes to its

memory pages on the source Hyper-V physical host are tracked. These tracked and modified

pages are transferred to the physical Hyper-V target computer. Third, the storage handles for

the VMs’ VHD files are moved to the Hyper-V target computer. Finally, the destination VM is

brought online.
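The iterative memory-copy phase described above can be illustrated with a short, purely conceptual Python simulation. The page counts, dirty rate, and threshold below are invented for illustration and do not reflect the actual Hyper-V implementation.

# Conceptual sketch of the pre-copy phase of a live migration: copy all memory pages,
# then keep re-copying only the pages the running VM dirtied, until the remaining
# set is small enough to transfer during the brief final switch-over.
import random

def live_migrate(total_pages=10_000, dirty_rate=0.02, threshold=50, max_rounds=10):
    pending = set(range(total_pages))          # round 1: every page must be copied
    for round_no in range(1, max_rounds + 1):
        copied = len(pending)
        # While the copy was running, the VM dirtied a fraction of its pages.
        pending = {p for p in range(total_pages) if random.random() < dirty_rate}
        print(f"round {round_no}: copied {copied} pages, {len(pending)} dirtied meanwhile")
        if len(pending) <= threshold:
            break
    # Final step: pause the VM, copy the last dirty pages, move the VHD storage
    # handles, and bring the VM online on the destination host.
    print(f"switch-over: copying final {len(pending)} pages, moving VHD handles, starting VM")

live_migrate()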

 

The Live Migration feature is supported only when Hyper-V is run on Windows Server

2008 R2. Live Migration can take advantage of the new CSV feature within failover clustering

in Windows Server 2008 R2. CSV lets multiple nodes in the same failover cluster concurrently

access the same logical unit number (LUN). Equally important, because a Hyper-V

cluster must be formed as a prerequisite task, Live Migration requires the failover clustering

feature to be added and configured on all of the servers running Hyper-V. In addition, the

Hyper-V cluster hosts require shared storage for the cluster nodes. This can be achieved by

an iSCSI, Serial Attached SCSI (SAS), or Fibre Channel Storage Area Network (SAN).

 

Figure 4-5 illustrates a four-node Hyper-V failover cluster with two CSVs and eight SQL

Server guest operating systems. With Live Migration, running SQL Server VMs can be seamlessly

moved between Hyper-V hosts.

 

 

 

FIGURE 4-5 A Hyper-V cluster and Live Migration

 

Windows Server 2008 R2 Hyper-V System Requirements

 

Table 4-1 below outlines the minimum requirements, along with the recommended system

configuration, for using Hyper-V on Windows Server 2008 R2.

 

TABLE 4-1 Hyper-V System Requirements

¦ Processor  Minimum: x64-compatible processor with Intel VT or AMD-V technology enabled.

¦ CPU speed  Minimum: 1.4 GHz. Recommended: 2.0 GHz or faster; additional CPUs are required for each guest operating system.

¦ RAM  Minimum: 1 GB; additional RAM is required for each guest operating system. Recommended: 2 GB or higher; additional RAM is required for each guest operating system.

¦ Disk space  Minimum: 8 GB; additional disk space is needed for each guest operating system. Recommended: 20 GB or higher; additional disk space is needed for each guest operating system.

 

 

 

 

 

 

 

NOTE System requirements vary based on an organization’s virtualization requirements.

Organizations should size their workloads to ensure that the Hyper-V hosts can successfully

accommodate all of the virtual servers and associated workloads from a CPU, memory, and

disk perspective.
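As a rough illustration of the per-guest sizing guidance in Table 4-1 and the note above, the following Python sketch adds up per-guest memory and disk on top of a host reservation. All of the figures and VM names are hypothetical; real hosts should be sized from measured workload requirements.

# Rough, illustrative host-sizing arithmetic for a Hyper-V host running SQL Server guests.
# All figures are placeholders; size real hosts from measured workload requirements.
def size_host(guests, host_ram_gb=2, host_disk_gb=20):
    ram = host_ram_gb + sum(g["ram_gb"] for g in guests)
    disk = host_disk_gb + sum(g["disk_gb"] for g in guests)
    return ram, disk

guests = [
    {"name": "SQLServer2008R2-VM01", "ram_gb": 4, "disk_gb": 60},
    {"name": "SQLServer2008R2-VM02", "ram_gb": 8, "disk_gb": 120},
]
ram_gb, disk_gb = size_host(guests)
print(f"Plan for at least {ram_gb} GB of RAM and {disk_gb} GB of disk on the host")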

 

Practical Uses for Hyper-V and SQL Server 2008 R2

 

Hyper-V on Windows Server 2008 R2 can deliver much of what dedicated physical servers
deliver, including comparable peak-load handling and security. Knowing

this, you might wonder when Hyper-V on Windows Server 2008 R2 should be employed from

a SQL Server 2008 R2 perspective. Hyper-V on Windows Server 2008 R2 can be utilized for

 

¦ Consolidating SQL Server databases or instances on a single physical server.

 

¦ Virtualizing SQL Server infrastructure workloads with low utilization.

 

¦ Achieving high availability for SQL Server VMs by using Live Migration or guest clustering.

 

¦ Maintaining different versions of SQL Server and the operating system on the same

physical server.

 

¦ Virtualizing test and development environments to reduce total cost of ownership.

 

¦ Reducing licensing, power, and thermal costs.

 

¦ Extending physical space when the data center lacks it.

 

¦ Repurposing and extending the life of old SQL Server hardware by conducting a

physical-to-virtual (P2V) migration.

 

¦ Migrating legacy SQL Server editions off aging hardware with expired warranties.

 

¦ Generating self-contained SQL Server environments, also known as sandboxes.

 

¦ Taking advantage of the rapid deployment capabilities of SQL Server VMs by using

Microsoft System Center Virtual Machine Manager (VMM) 2008 R2.

 

¦ Storing and managing SQL Server VMs in VMM libraries.

 

By using virtual servers, organizations can take advantage of powerful features such as

multi-core technology, and they can achieve better handling of disk access and greater memory

support. In addition, Hyper-V improves scalability and performance for a SQL Server VM.

 

 

 

NOTE The Microsoft Assessment and Planning Toolkit can be used to identify whether or

not an organization’s SQL Server systems are good candidates for virtualization. The toolkit

also includes tools for SQL Server inventory, assessments, and intuitive reporting. A download

of the Microsoft Assessment and Planning Toolkit is available on the Microsoft Download

Center at http://www.microsoft.com/downloads/details.aspx?FamilyID=67240b76-3148-4e49-943d-4d9ea7f77730&displaylang=en.

 

Implementing Live Migration for SQL Server 2008 R2

 

Follow these steps to take advantage of Live Migration for SQL Server 2008 R2 VMs:

 

1. Ensure that the hardware, software, drivers, and components are supported by

Microsoft and Windows Server 2008 R2.

 

2. Set up the hardware, shared storage, and networks as recommended in the failover

cluster deployment guides.

 

NOTE “Hyper-V: Using Hyper-V and Failover Clustering,” the TechNet article at the

following link, includes step-by-step instructions on how to implement Hyper-V and

failover clustering: http://technet.microsoft.com/en-us/library/cc732181(WS.10).aspx.

 

In addition to step-by-step instructions on how to implement Hyper-V and failover

clustering, this page also gives information on the requirements for using Hyper-V and

failover clustering, which might be helpful because the steps in the following sections

assume that a Hyper-V cluster is already in place.

 

3. For all nodes that you are including in the failover cluster, install Windows Server 2008 R2

(full installation or Server Core installation).

 

4. Enable the Hyper-V role on each node of the failover cluster.

 

5. Install the Failover Clustering feature on each node of the failover cluster.

 

6. Validate the cluster configuration by using the Validate A Configuration Wizard tool

located in Failover Cluster Manager.

 

7. Configure CSV.

 

8. Create a SQL Server VM with Hyper-V.

 

9. Set up a SQL Server VM for Live Migration.

 

10. Configure cluster networks for Live Migration.

 

 

 

Enabling CSV

 

Assuming that the Hyper-V cluster has already been built, the next step is enabling CSV in

Failover Cluster Manager. Follow the steps in this section to enable CSV on a Hyper-V failover

cluster running on Windows Server 2008 R2.

 

1. On a server in the Hyper-V failover cluster, click Start, click Administrative Tools, and

then click Failover Cluster Manager.

 

2. In the Failover Cluster Manager snap-in, verify that CSV is present for the cluster that is

being enabled. If it is not in the console tree, right-click Failover Cluster Manager, click

Manage A Cluster, and then select or specify the cluster to be configured.

 

3. Right-click the failover cluster, and then choose Enable Cluster Shared Volumes.

 

4. The Enable Cluster Shared Volumes dialog box opens. Read and accept the terms and

restrictions associated with CSV. Then click OK.

 

5. In this step, you add storage to the CSV. You can do this either by right-clicking Cluster

Shared Volumes and selecting Add Storage or by selecting Add Storage under Actions.

 

6. In the Add Storage dialog box, select from the list of available disks, and then click OK.

 

7. After the disk or disks selected have been added, they appear in the Results pane for

Cluster Shared Volumes.

 

NOTE SystemDrive\ClusterStorage is the CSV storage location for each node associated

with the failover cluster. Folders for each volume added to the CSV are stored in this

location. Administrators needing to view the list of volumes can do so in Failover Cluster

Manager.
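The same step can also be scripted. The following minimal Python sketch shells out to the Add-ClusterSharedVolume cmdlet from the FailoverClusters PowerShell module; the cluster disk name is a placeholder, and the sketch assumes CSV has already been enabled on the cluster.

# Minimal sketch: add an available cluster disk to Cluster Shared Volumes via PowerShell.
# Assumes Windows Server 2008 R2 with CSV enabled and the FailoverClusters module installed.
import subprocess

disk_name = "Cluster Disk 1"  # placeholder; use the disk name shown in Failover Cluster Manager
command = (
    "Import-Module FailoverClusters; "
    f"Add-ClusterSharedVolume -Name '{disk_name}'"
)
subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)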

 

Creating a SQL Server VM with Hyper-V

 

Before leveraging Live Migration, organizations must follow the instructions in this section to

create a SQL Server VM with Hyper-V in Windows Server 2008 R2.

 

1. Ensure that the Hyper-V role is installed on the server that you use to create the SQL

Server 2008 R2 VM.

 

2. Click Start, click Administrative Tools, and then click Hyper-V Manager.

 

3. In the Action pane, click New, and then click Virtual Machine. The New Virtual Machine

Wizard starts.

 

4. Read the information on the Before You Begin page, and then click Next. You can

select the option to hide this page on all future uses of the wizard.

 

 

 

5. On the Specify Name And Location page, enter the name of the SQL Server VM and

specify where it will be stored. For example, the VM can be named SQLServer2008R2-VM01
and stored on Cluster Shared Volume 1, as displayed in Figure 4-6.

 

 

 

FIGURE 4-6 The Specify Name And Location Screen when a new virtual machine is being created

 

NOTE If a folder is not selected, the SQL Server VM is stored in the default folder configured

for the Hyper-V server.

 

6. On the Memory page, enter the amount of memory to be allocated to the SQL Server
VM's guest operating system. Click Next.

 

NOTE With SQL Server 2008 R2, it is recommended that you have 2.048 GB or more of

RAM, whereas with Windows Server 2008 R2 a minimum of 512 MB of RAM is recommended.

Remember to ensure that SQL Server workloads are sized accordingly, and

remember to take into consideration the amount of RAM required for each SQL Server

VM. Also, remember that it is possible to shut down the guest operating system and

add more RAM to the virtual machine if necessary.

 

 

 

7. On the Networking page, connect the network adapter to an existing virtual network

by selecting the appropriate network adapter from the menu. Click Next to continue.

 

8. On the Connect Virtual Hard Disk page, as shown in Figure 4-7, specify the name, location,

and size to create a virtual hard disk so that you can install an operating system.

Click Next to continue.

 

 

 

FIGURE 4-7 The Connect Virtual Hard Disk page when a new virtual machine is being created

 

9. On the Installation Options page, choose a method to install the operating system. The

options include

 

¦ Installing an operating system from a boot CD/DVD-ROM.

 

¦ Installing an operating system from a boot floppy disk.

 

¦ Installing an operating system from a network-based installation server.

 

¦ Installing an operating system at a later time.

 

After choosing the method, click Next to continue.

 

10. Review the selections in the Completing The New Virtual Machine Wizard, and then

click Finish.

 

The new VM is created; however, it is in an offline state.

 

 

 

11. From the Virtual Machines section of the results pane in Hyper-V Manager, right-click

the name of the SQL Server VM you just created, and click Connect. The Virtual

Machine Connection tool opens.

 

12. In the Action menu in the Virtual Machine Connection window, click Start.

 

13. Follow the prompts to install the Windows Server 2008 R2 operating system.

 

14. When the operating system installation is complete, install SQL Server 2008 R2.

 

Real World

 

After an operating system is set up, best practice guidelines recommend the

installation of the Hyper-V Integration Services tools for every VM that was

created. The Hyper-V Integration Services tool provides virtualization service client (VSC)

code, which ultimately increases Hyper-V performance of the VM from an I/O,

memory management, and network performance perspective. Hyper-V Integration

Services is installed by connecting to the VM and selecting Insert The Integration

Services Setup Disk from the Action Menu of the Virtual Machine Connection window.

Click Install in the AutoPlay dialog box to install the tools.

 

Configuring a SQL Server VM for Live Migration

 

Organizations interested in using Live Migration need to set up a VM for Live Migration. This

is accomplished by reconfiguring the automatic start action for the VM and then preparing

the VM for high availability by using Failover Cluster Manager. The following steps illustrate

this series of actions in more detail:

 

1. Create a SQL Server 2008 R2 VM based on the steps in the previous section. Verify that

the VM is using CSV.

 

2. In Hyper-V Manager, under Virtual Machines, highlight the VM created in the previous

steps (SQLServer2008R2-VM01 in the example in this chapter). In the Action pane,

under the VM name, click Settings.

 

3. In the left pane, click Automatic Start Action.

 

 

 

4. Under Automatic Start Action, for the What Do You Want This Virtual Machine To Do

When The Physical Computer Starts? question, select Nothing, as shown in Figure 4-8.

Then click Apply and OK.

 

 

 

FIGURE 4-8 Configuring the Automatic Start Action Setting screen

 

5. Launch Failover Cluster Manager from Administrative Tools on the Start menu.

 

6. In the Failover Cluster Manager snap-in, if the cluster that will be configured is not displayed

in the console tree, right-click Failover Cluster Manager. Click Manage A Cluster,

and then select or specify the cluster.

 

7. If the console tree is collapsed, expand the tree under the cluster you want.

 

8. Click Services And Applications.

 

9. In the Action pane, click Configure A Service Or Application.

 

10. If the Before You Begin page of the High Availability Wizard appears, click Next.

 

 

 

11. On the Select Service Or Application page, shown in Figure 4-9, click Virtual Machine,

and then click Next.

 

 

 

FIGURE 4-9 Selecting the service and application for high availability

 

12. On the Select Virtual Machine page, shown in Figure 4-10, confirm the name of the VM

you plan to make highly available. In this example, SQLServer2008R2-VM01 is used.

Click Next.

 

 

 

FIGURE 4-10 Configuring a VM for high availability

 

 

 

NOTE To make a VM highly available, you must ensure that it is not running. It must

be either turned off or shut down.

 

13. Confirm the selection, and then click Next.

 

14. The wizard configures the VM for high availability and provides a summary. To view

the details of the configuration, click View Report. To close the wizard, click Finish.

 

15. To verify that the virtual machine is now highly available, look in one of two places in

the console tree:

 

¦ Expand Services And Applications, shown in Figure 4-11. The VM should be listed

under Services And Applications.

 

¦ Expand Nodes. Select the node on which the VM was created. The VM should be

listed under Services And Applications in the Results pane.

 

 

 

FIGURE 4-11 Verifying that the VM is now highly available

 

16. To bring the VM online, right-click it under Services And Applications, and then click

Start Virtual Machine. This action brings the VM online and starts it.

 

 

 

Initiating a Live Migration of a SQL Server VM

 

After an administrator has enabled CSV, created a SQL Server 2008 R2 VM, configured the

automatic start option, and made the VM highly available, it is time to initiate a live migration.

Perform the following steps to initiate Live Migration:

 

1. In the Failover Cluster Manager snap-in, if the cluster to be configured is not displayed

in the console tree, right-click Failover Cluster Manager.

 

2. Click Manage A Cluster, and then select or specify the cluster. Expand Nodes.

 

3. In the console tree located on the left side, select the node to which Live Migration will

move the clustered VM.

 

4. Right-click the VM resource that is displayed in the center pane, and then click Live

Migrate Virtual Machine To Another Node.

 

5. Select the node that the VM will be moved to in the migration, as shown in Figure 4-12.

After the migration is complete, the VM should be running on the node selected.

 

 

 

FIGURE 4-12 Initiating Live Migration for a SQL Server VM.

 

6. Verify that the VM successfully migrated to the node selected. The VM should be listed

under the new node in Current Owner.
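A live migration can also be initiated from a script rather than from Failover Cluster Manager. The following Python sketch shells out to the Move-ClusterVirtualMachineRole cmdlet from the FailoverClusters PowerShell module; the clustered VM group name and destination node name are placeholders.

# Minimal sketch: initiate a live migration of a clustered VM to another node via PowerShell.
# The group and node names are placeholders; they must match what Failover Cluster Manager shows.
import subprocess

vm_group = "SQLServer2008R2-VM01"   # name of the clustered virtual machine group
target_node = "Hyper-V02"           # destination Hyper-V host in the same cluster
command = (
    "Import-Module FailoverClusters; "
    f"Move-ClusterVirtualMachineRole -Name '{vm_group}' -Node '{target_node}'"
)
subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)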

 

 

 


 

C H A P T E R 5

 

Consolidation and Monitoring

 

Today’s competitive economy dictates that organizations reduce cost and improve

agility in their database environments. This means the large percentage of organizations
running underutilized Microsoft SQL Server installations must take control of their
environments in order to realize significant cost savings and increased
agility. Thankfully, enhancements in hardware and software technologies have unlocked

new opportunities to reduce costs through consolidation. Consolidation reduces the

number of physical servers in an organization’s environment, directly impacting costs in

numerous areas including, but not limited to, hardware, administration, power consumption,

and licenses. Equally important, by leveraging the new SQL Server Utility feature

in Microsoft SQL Server 2008 R2, organizations can streamline consolidation efforts

because this feature provides database administrators (DBAs) with insight into resource

utilization through policy evaluation and historical analysis.

 

This chapter begins by describing the consolidation options available to DBAs. It then explains

how DBAs can take advantage of viewpoints and dashboards in the SQL Server Utility

to identify consolidation opportunities, which is done by monitoring resource utilization

and health state for SQL Server instances, databases, and deployed data-tier applications.

 

SQL Server Consolidation Strategies

 

The goal of SQL Server consolidation is to identify underutilized hardware and improve

utilization by choosing an appropriate consolidation strategy. With SQL Server, hardware

could be considered to be underutilized when workloads are using less than 30 percent

of server resources. However, underutilization thresholds vary based on the hardware

utilized for SQL Server and the organization. Some compelling reasons for organizations

to consolidate are to reduce costs, improve efficiency, address lack of physical space in

the data center, create more effective service levels, standardize, and centralize management.

Some common consolidation strategies organizations can apply are described in

the rest of this section.
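To make the 30 percent rule of thumb concrete, the short Python sketch below flags servers whose average utilization falls under a configurable threshold as consolidation candidates. The server names and utilization figures are invented for illustration.

# Illustrative consolidation screen: flag SQL Server hosts whose average resource
# utilization falls below a threshold (30 percent here, per the rule of thumb above).
def consolidation_candidates(servers, threshold=0.30):
    return [name for name, utilization in servers.items() if utilization < threshold]

# Hypothetical average CPU utilization per server, expressed as a fraction.
observed = {"SQL2K8R2-01": 0.12, "SQL2K8R2-02": 0.55, "SQL2K8R2-03": 0.08}
print("Consolidation candidates:", consolidation_candidates(observed))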

 

 

 


 

Consolidating Databases and Instances

 

A very common SQL Server consolidation strategy involves placing many databases on a single

instance of SQL Server. This approach offers organizations improved operations through

centralized management, standardization, and improved performance. For example,

multiple databases belonging to the same SQL Server instance facilitate shared memory

optimization, and database consolidation helps to reduce overhead due to fixed resource

costs per instance. There are some limitations with database-level consolidation, however. For

example, in this scenario, all databases share the same service account, maintain the same

global settings, and share a single tempdb database for processing temporary workloads.

Figure 5-1 shows many databases being consolidated onto a single physical host running one

instance of SQL Server.

 

FIGURE 5-1 Consolidating many databases onto a single physical host running one instance of SQL Server

 

Many times, it is not possible to consolidate all of your databases onto a single instance,

possibly because additional service isolation is required or a single instance cannot sustain

the workload of all of the databases. In addition, a single tempdb database could be a

performance bottleneck. Your organization might also find this scenario problematic if it has

requirements to maintain different service level agreements for each database, if there are

too many databases consolidated on the system, if databases need to be isolated for security

and regulatory compliance reasons, or if databases require different collation settings.

 

You can still consolidate databases if you have these types of requirements; however, you

may need more instances or physical hosts to support your consolidation needs. For example,

the diagram in Figure 5-2 illustrates the consolidation of many databases onto a single physical

host running three instances of SQL Server, whereas the diagram in Figure 5-3 represents

an alternative, in which many databases are consolidated onto many instances residing on

two separate physical hosts.

 

 

 


 

FIGURE 5-2 Consolidating many databases onto a single physical host running three instances of SQL Server

 

FIGURE 5-3 Consolidating many databases onto multiple physical hosts running multiple instances of SQL Server

 

Consolidating SQL Server Through Virtualization

 

Another SQL Server consolidation strategy attracting interest is virtualization. Virtualization’s

growing popularity is based on many factors, including its ability to significantly reduce

total cost of ownership (TCO) and the number of physical servers within an infrastructure.

Benefits include the need for fewer physical servers, as well as lower licensing costs. At the

heart of all the excitement over virtualization is Live Migration. This new, built-in feature is

a Windows Server 2008 R2 Hyper-V enhancement. Live Migration increases high availability

and improves service by reducing planned outages. It allows DBAs to move SQL Server

virtual machines (VMs) between physical Hyper-V hosts without any perceived interruption in

service. Hyper-V on Windows Server 2008 R2 also allows for maximum scalability because it

supports up to 64 logical processors. As a result, it is possible to virtualize and consolidate numerous

SQL Server instances, databases, and workloads onto a single host. Another benefit is

that Live Migration allows an organization to not only completely isolate its operating system

with virtualization but also to host multiple editions of SQL Server while running both 32-bit

 

 

 

and 64-bit versions within a single host. In addition, physical SQL Servers can easily be virtualized

by using the physical-to-virtual (P2V) migration tool included with System Center Virtual

Machine Manager 2008 R2. Figure 5-4 illustrates a consolidation strategy in which many databases,

instances, and physical SQL Server systems are virtualized on a single Hyper-V host.

 

FIGURE 5-4 Consolidating many databases, instances, and physical hosts with virtualization

 

No matter what consolidation strategy an organization adopts, the benefits are significant

without any sacrifice of scalability and overall performance. Now that the consolidation

strategies have been explained, it is time to explore how an organization can quickly recognize

whether its database environment is a candidate for consolidation and can ultimately

streamline its consolidation efforts by monitoring resource utilization.

 

 

 

Using the SQL Server Utility for Consolidation and

Monitoring

 

The SQL Server Utility is the center of operations for monitoring managed instances of SQL

Server, databases, and deployed data-tier applications. By using the dashboards and viewpoints

included in the SQL Server Utility, DBAs can proactively monitor and view resource

utilization, health state, and health policies for managed instances, databases, and deployed

data-tier applications at scale. The results obtained from monitoring allow DBAs to easily

identify consolidation candidates across an organization’s database environment. To experience

the dashboards and viewpoints yourself, launch the SQL Server Utility by following these steps:

 

IMPORTANT Before you can carry out these steps, you must have created a Utility
Control Point (UCP), and you must enroll at least one instance of SQL Server. For more
information on how to do this, see Chapter 2, “Multi-Server Administration.”

 

1. In SQL Server Management Studio, connect to the SQL Server 2008 R2 Database Engine

instance in which the UCP was created.

 

2. Launch Utility Explorer by clicking View and then selecting Utility Explorer.

 

3. In the Utility Explorer navigation pane, click the Connect To Utility icon.

 

4. In the Connect To Server dialog box, specify the SQL Server instance running the UCP,

select the type of authentication, and then click Connect.

 

5. Connection to a Utility Control Point is complete. Begin monitoring the health state

and resource utilization by viewing the dashboards and viewpoints.

 

Utility Explorer in SQL Server Management Studio provides a tree view that includes nodes

for monitoring and managing settings within the SQL Server Utility. The summary dashboard

is automatically displayed in the Utility Explorer Content pane when you connect to a UCP.

You can view additional dashboards and viewpoints by clicking the Managed Instances node

or the Deployed Data-Tier Applications node in the Utility Explorer navigation pane, as displayed

in Figure 5-5.

 

 

 

FIGURE 5-5 Utility Explorer and the navigation tree

 

 

 

The three main dashboards for monitoring and managing resource utilization and consolidation

efforts are discussed in the next sections. These dashboards and viewpoints are

 

¦ The SQL Server Utility dashboard.

 

¦ The Managed Instance viewpoint.

 

¦ The Data-Tier Applications viewpoint.

 

Using the SQL Server Utility Dashboard

 

The SQL Server Utility dashboard is the starting place for obtaining summary information

about managed instances of SQL Server and deployed data-tier applications in the SQL Server

Utility. The summary of the data, as illustrated in Figure 5-6, is sectioned into nine parts and

can be viewed in the Utility Explorer Content pane by clicking a Utility Control Point, which is

the top node in the Utility Explorer tree.

 

 

 

FIGURE 5-6 The SQL Server Utility dashboard

 

 

 

The SQL Server Utility dashboard includes the following information:

 

¦ Utility Summary Found in the center of the top row of the Utility Explorer Content

pane, this section is the first place to look. It displays the number of managed instances

of SQL Server and the number of deployed data-tier applications managed by the SQL

Server Utility. Use the Utility Summary section to gain quick insight into the number of

objects being managed by the SQL Server Utility. In Figure 5-6, there are 14 managed instances

and nine deployed data-tier applications displayed in the Utility Summary section.

 

NOTE After you have reviewed the summary information, it is recommended that

you analyze either the managed instances or deployed data-tier application section

in its entirety to gain a comprehensive understanding of its overall health status. For

example, the first set of the following bullets interprets the health of managed instances.
After managed instances are analyzed and explained, the health of data-tier
applications is reviewed from beginning to end.

 

¦ Managed Instance Health This section is located in the top-left corner of the Utility

Explorer Content pane and summarizes the health status of all managed instances

of SQL Server in the SQL Server Utility. Health status is illustrated in a pie chart and has

four possible designations:

 

. Well Utilized The number of managed instances of SQL Server that are not violating

resource utilization policies is displayed.

 

. Overutilized A SQL Server instance is marked as overutilized if any of the following

conditions are true:

 

¦ CPU resources for the instance of SQL Server are overutilized.

 

¦ CPU resources of the computer that hosts the SQL Server instance are

overutilized.

 

¦ The instance contains data or log files with overutilized storage space.

 

¦ The instance contains data or log files that reside on volumes with overutilized

storage space.

 

. Underutilized A SQL Server instance is marked as underutilized if it is not

marked as overutilized and any of the following conditions are true:

 

¦ CPU resources allocated to the instance of SQL Server are underutilized.

 

¦ CPU resources of the computer that hosts the SQL Server instance are

underutilized.

 

¦ The instance contains data or log files with underutilized storage space.

 

¦ The instance contains data or log files that reside on volumes with underutilized

storage space.

 

 

 

. No Data Available Either data has not been uploaded from a managed instance

or there is a problem with the collection and upload process.

 

By viewing the Managed Instance Health section, DBAs are able to quickly obtain an

overview of resource utilization across all managed instances within the utility. The

example in Figure 5-6 shows that five managed instances are well utilized, six are overutilized,

none are underutilized, and data is unavailable for three managed instances in

the Managed Instance Health section.

 

¦ Managed Instances With Overutilized Resources This section is found directly

under the Managed Instance Health section. It displays overutilization data for managed

instances of SQL Server based on the following categories:

 

. Overutilized Instance CPU This represents the number of managed instances

of SQL Server that are violating instance CPU overutilization policies.

 

. Overutilized Database Files This represents the number of managed instances

of SQL Server with database files that are violating file space overutilization policies.

 

. Overutilized Storage Volumes This represents the number of managed instances

of SQL Server with database files on storage volumes that are violating file

space overutilization policies.

 

. Overutilized Computer CPU This represents the number of managed instances

of SQL Server running on computers that are violating computer CPU overutilization

policies.

 

Detailed status for each health parameter is listed in a sliding indicator to the right of

each element in this section.

 

¦ Managed Instances With Underutilized Resources This section is located under

the Managed Instances With Overutilized Resources section and displays underutilization

data for managed instances of SQL Server based on the following categories:

 

. Underutilized Instance CPU This represents the number of managed instances

of SQL Server that are violating instance CPU underutilization policies.

 

. Underutilized Database Files This represents the number of managed instances

of SQL Server with database files that are violating file space underutilization

policies.

 

. Underutilized Storage Volumes This represents the number of managed

instances of SQL Server with database files on storage volumes that are violating file

space underutilization policies.

 

. Underutilized Computer CPU This represents the number of managed instances

of SQL Server running on computers that are violating computer CPU underutilization

policies.

 

Detailed status for each health parameter is listed in a sliding indicator to the right of

each element in this section.

 

 

 

¦ Data-Tier Application Health This section is located in the top-right corner of the

Utility Explorer Content pane. Health status is illustrated in a pie chart and has four

possible designations:

 

. Well Utilized The number of deployed data-tier applications that are not violating

resource utilization policies is displayed.

 

. Overutilized The number of deployed data-tier applications that are violating

resource overutilization policies is displayed. A deployed data-tier application is

marked as overutilized if any of the following conditions are true:

 

¦ CPU resources for the deployed data-tier application are overutilized.

 

¦ CPU resources of the computer that hosts the SQL Server instance are

overutilized.

 

¦ Storage volumes associated with the deployed data-tier application are

overutilized.

 

¦ The deployed data-tier application contains data or log files that reside on volumes

with overutilized storage space.

 

. Underutilized The number of deployed data-tier applications that are violating

resource underutilization policies is displayed. A deployed data-tier application is

marked as underutilized if any of the following conditions are true:

 

¦ CPU resources for the deployed data-tier application are underutilized.

 

¦ CPU resources of the computer that hosts the SQL Server instance are

underutilized.

 

¦ Storage volumes associated with the deployed data-tier application are

underutilized.

 

¦ The deployed data-tier application contains data or log files that reside on

volumes with underutilized storage space.

 

. No Data Available Either data affiliated with deployed data-tier applications has

not been uploaded to the Utility Control Point or there is a problem with the collection

and upload process.

 

By viewing the Data-Tier Application Health section, DBAs can quickly obtain a holistic

view of resource utilization for all deployed data-tier applications managed by the SQL

Server Utility. In Figure 5-6, there are seven well-utilized and two overutilized data-tier

applications.

 

¦ Data-Tier Applications With Overutilized Resources This section is found

directly under the Data-Tier Application Health section. It displays overutilization data

for deployed data-tier applications based on the following categories:

 

. Overutilized Data-Tier Application CPU This represents the number of

deployed data-tier applications that are violating data-tier application CPU

overutilization policies.

 

 

 

. Overutilized Database Files This represents the number of deployed data-tier

applications with database files that are violating file space overutilization policies.

 

. Overutilized Storage Volumes This represents the number of deployed data-tier

applications with database files on storage volumes that are violating file space

overutilization policies.

 

. Overutilized Computer CPU This represents the number of deployed data-tier

applications running on computers that are violating computer CPU overutilization

policies.

 

Detailed status for each health parameter is listed in a sliding indicator to the right of

each element in this section.

 

¦ Data-Tier Applications With Underutilized Resources This section is located

directly under the Data-Tier Applications With Overutilized Resources section. This

section displays underutilization data of individual instances based on the following

categories:

 

. Underutilized Data-Tier Application CPU This represents the number of

deployed data-tier applications that are violating data-tier application CPU underutilization

policies.

 

. Underutilized Database Files This represents the number of deployed data-tier

applications with database files that are violating file space underutilization policies.

 

. Underutilized Storage Volumes This represents the number of deployed data-tier

applications with database files on storage volumes that are violating file space

underutilization policies.

 

. Underutilized Computer CPU This represents the number of deployed data-tier

applications running on computers that are violating computer CPU underutilization

policies.

 

Detailed status for each health parameter is listed in a sliding indicator to the right of

each element in this section.

 

¦ Utility Storage Utilization History Located at the bottom-left corner of the Utility

Explorer Content pane, this section uses a time graph to display the storage utilization

history for the amount of storage the SQL Server Utility is consuming in gigabytes.

By using the buttons under the Interval heading, you can view data in the graph by

the following intervals:

 

. 1 Day Displays data in 15-minute intervals

 

. 1 Week Displays data in one-day intervals

 

. 1 Month Displays data in one-week intervals

 

. 1 Year Displays data in one-month intervals

 

¦ Utility Storage Utilization The bottom-right corner shows a pie chart that displays

the amount of space used and the amount of free space available on the volume hosting

the SQL Server Utility. It is worth noting that the data is refreshed every 15 minutes.

 

 

 

This section explained how to obtain summary information for all managed instances of

SQL Server. DBAs seeking more information might be interested in the Managed Instances

node in the tree view of Utility Explorer. This node helps database administrators gain deeper
knowledge of health status and resource utilization data for each managed instance of SQL
Server. The next section discusses this viewpoint.
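The overutilized and underutilized designations described in the Managed Instance Health section above reduce to simple precedence rules: overutilization takes priority, underutilization is reported only when no overutilization policy is violated, and instances without uploaded data are reported separately. The following Python sketch captures that logic in a simplified form; the flag names are invented for illustration and do not correspond to SQL Server Utility object names.

# Simplified sketch of the Managed Instance Health designations described above:
# overutilization wins over underutilization, and an instance with no uploaded
# data is reported separately. Flag names are illustrative only.
def classify_instance(flags):
    if flags.get("no_data"):
        return "No Data Available"
    over = ("instance_cpu_over", "computer_cpu_over", "file_space_over", "volume_space_over")
    under = ("instance_cpu_under", "computer_cpu_under", "file_space_under", "volume_space_under")
    if any(flags.get(f) for f in over):
        return "Overutilized"
    if any(flags.get(f) for f in under):
        return "Underutilized"
    return "Well Utilized"

print(classify_instance({"computer_cpu_under": True}))   # Underutilized
print(classify_instance({"computer_cpu_under": True,
                         "volume_space_over": True}))    # Overutilized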

 

TIP When working with the SQL Server Utility dashboard, you can click on a link to reveal

additional details about a specific policy.

 

Using the Managed Instances Viewpoint

 

DBAs can display the Managed Instances viewpoint in the Utility Explorer Content pane by

connecting to a UCP and then selecting the Managed Instances node in the Utility Explorer

tree. The Utility Explorer Content pane displays the viewpoint, as shown in Figure 5-7, which

communicates the health state and resource utilization information for numerous items including

the CPU, storage, and policies for each managed instance of SQL Server.

 

 

 

FIGURE 5-7 The Managed Instances viewpoint

 

 

 

Resource utilization for each managed instance of SQL Server is presented in the list view

located at the top of the Utility Explorer Content pane. Health state icons appear to the right

of each managed instance and provide summary status for each instance of SQL Server based

on the utilization category. Three icons are used to indicate the health of each managed

instance of SQL Server. A green check mark indicates that an instance is well utilized and does

not violate any policies. A red arrow indicates that an instance is overutilized, and a green arrow

indicates underutilization. The lower half of the dashboard contains tabs for CPU utilization,

storage utilization, policy details, and property details for each managed instance.

 

In Figure 5-7, the instance CPU, computer CPU, file space, and volume space columns for
SQL2K8R2-01\INSTANCE01 and SQL2K8R2-01\INSTANCE05 are all underutilized. In addition,
the following elements are underutilized: the instance CPU for SQL2K8R2-03\INSTANCE03;
the computer CPU for SQL2K8R2-01\INSTANCE02, SQL2K8R2-01\INSTANCE03,
SQL2K8R2-01\INSTANCE04, and SQL2K8R2-01\INSTANCE05; the file space for
SQL2K8R2-02\INSTANCE03; and the volume space for SQL2K8R2-01\INSTANCE02,
SQL2K8R2-01\INSTANCE03, and SQL2K8R2-01\INSTANCE04. Volume space is overutilized for
SQL2K8R2-02, SQL2K8R2-02\INSTANCE02, SQL2K8R2-03, SQL2K8R2-03\INSTANCE02,
SQL2K8R2-03\INSTANCE03, and SQL2K8R2-03\INSTANCE04, and the remaining managed
instances are well utilized.

 

The Managed Instances list view columns and utilization tabs are discussed in more detail

in the next sections.

 

The Managed Instances List View Columns

 

The health status of each managed instance of SQL Server in the Managed Instances list view

is analyzed against four types of utilization and the current policy in place for each:

 

¦ Instance CPU This column indicates processor utilization of the managed instance.

The health state is determined by the global CPU Utilization For All Managed Instances

Of SQL Server policy, which is predetermined for all managed instances of SQL Server.

However, by clicking on the Policy Tab in the bottom half of the view, DBAs can override

this global policy to configure overutilization and underutilization policies for

a single instance. The CPU Utilization tab shows the CPU utilization history for the

selected managed instance of SQL Server.

 

¦ Computer CPU This column communicates computer processor utilization where

the managed instance resides. Health is based on the settings of two policies: the CPU

utilization policy in place for the computer and the configuration setting for the Volatile

Resource Evaluation policy. The CPU Utilization tab shows the processor utilization

history for a managed instance of SQL Server.

 

¦ File Space The File Space column summarizes file space utilization for all of the

databases belonging to a selected instance of SQL Server. The health state for this

parameter is determined by global or local file space utilization policies. Because there

are many databases associated with a managed instance of SQL Server, the health state
is reported as overutilized even if only one database is overutilized. The Storage Utilization

tab shows health state information on all other database files.

 

 

 

¦ Volume Space Volume space utilization is summarized in this column for volumes

with databases belonging to each managed instance. The health of this parameter

is determined by the global or local storage volume utilization policies for managed

instances of SQL Server. As with file space reports, the health of a storage volume associated

with a managed instance of SQL Server that is overutilized is reported with a red

up arrow, and underutilization is reported with a green arrow. The Storage Utilization

tab shows additional health information and history for volumes.

 

¦ Policy Type The final column in the list view specifies the type of policy applied to

the managed instance of SQL Server. Policy type results are reported as either Global

or Override, with Global meaning that default policies are in use, and Override meaning

that custom policies are in use.

 

DBAs can appreciate the value of the information each list view column holds. But, in the

case of the Managed Instances view, DBAs can gain an even greater appreciation by also accessing

the Managed Instances viewpoint tabs to better understand their present infrastructure

and to better prepare for a successful consolidation.

 

The Managed Instances Detail Tabs

 

The Managed Instances viewpoint includes tabs for additional viewing. The tabs are located

at the bottom of the viewpoint and consist of

 

¦ CPU Utilization The CPU Utilization tab, illustrated earlier in Figure 5-7, displays

historical information of CPU utilization for a selected managed instance of SQL Server

according to the interval specified on the left side of the display area. DBAs can change

the display intervals for the graphs by selecting one of these options:

 

. 1 Day Displays data in 15-minute intervals

 

. 1 Week Displays data in one-day intervals

 

. 1 Month Displays data in one-week intervals

 

. 1 Year Displays data in one-month intervals

 

Two linear graphs are presented next to each other. The first graph shows CPU utilization

based on the managed instance of SQL Server, and the second graph displays

data based on the computer associated with the managed instance.

 

¦ Storage Utilization The next tab displays storage utilization for a selected managed

instance of SQL Server, as depicted in Figure 5-8. Data is grouped by either

database or volume. When the Database option button is selected, storage utilization

is displayed for each database, filegroup, or a specific database file, which is based on

the node selected in the tree view. If the Volume option button is selected, storage utilization

history is displayed according to file space used by all data files and all log files

located on the storage volume. The tree view also can be expanded to present storage

utilization information and history for each volume and database file associated with a

volume.

 

 

 

 

 

FIGURE 5-8 The Storage Utilization tab on the Managed Instances viewpoint

 

Independent of how the files are grouped, health status is communicated for every database,

filegroup, database file, or volume. For example, the green arrows in Figure 5-8

indicate that all databases, filegroups, and data files are underutilized. No health states

are shown as overutilized. Once again, the display intervals for the graphs are changed

by selecting one of the following options:

 

. 1 Day Displays data in 15-minute intervals

 

. 1 Week Displays data in one-day intervals

 

. 1 Month Displays data in one-week intervals

 

. 1 Year Displays data in one-month intervals

 

¦ Policy Details DBAs can use the Policy Details tab, shown in Figure 5-9, to view

the global policies applied to a selected managed instance of SQL Server. In addition,

the Policy Details tab can be used to create a custom policy that overrides the default

global policy applied to a selected managed instance of SQL Server. The display is

broken into the following four policies that can be viewed or modified:

 

. Managed Instance CPU Utilization Policies

 

. File Space Utilization Policy

 

. Computer CPU Utilization Policies

 

. Storage Volume Utilization Policies

 

 

 

 

 

FIGURE 5-9 The Policy Details tab on the Managed Instances viewpoint

 

NOTE To override the global policy for a specific managed instance, select the Override

The Global Policy option button. Next, specify the new overutilized and underutilized

numeric values in the control boxes to the right of the policy description, and

then click Apply. For example, in Figure 5-9, the default global policy for the CPU of a

managed instance is to consider the CPU overutilized when its usage is greater than 70

percent. The global policy was overridden, and the new setting is 50 percent. Similarly,

the CPU underutilization setting is changed from zero percent to 10 percent.

 

¦ Property Details This tab, shown in Figure 5-10, displays property details for the

selected managed instance of SQL Server. The Property detail information displays the

processor name, processor speed, processor count, physical memory, operating system

version, SQL Server version, SQL Server edition, backup directory, collation information,

case sensitivity, language, whether or not the instance of SQL Server is clustered,

and the last time data was successfully updated.

 

 

 

 

 

FIGURE 5-10 The Property Details tab on the Managed Instances viewpoint

 

Using the Data-Tier Application Viewpoint

 

Just as the Managed Instances viewpoint lets you monitor health status and resource
utilization for managed instances of SQL Server, the Data-Tier Applications viewpoint
enables you to monitor deployed data-tier applications managed by the SQL Server Utility
Control Point.

 

NOTE The viewpoints associated with this section may at first appear identical to the

information under the previous section, “The Managed Instances Detail Tabs.” However,

the policies and files in this section do differ from those described previously, sometimes

slightly and sometimes significantly.

 

Similar to the Managed Instance viewpoint, DBAs can access the Data-Tier Applications

view and viewpoints in the Utility Explorer Content pane by connecting to a UCP and then

selecting the Deployed Data-Tier Application node in the Utility Explorer tree. The Utility

Explorer Content pane displays the view, as illustrated in Figure 5-11, that communicates

the health and utilization status for the application CPU, the computer CPU, file space, and

volume space.

 

 

 

 

 

FIGURE 5-11 The data-tier application viewpoint

 

Resource utilization for each deployed data-tier application is presented in the list view located

at the top of the Utility Explorer Content pane. Health state icons appear at the right of

each deployed data-tier application and provide summary status for each deployed data-tier

application based on the utilization category. Three icons are used to indicate the health state

of each deployed data-tier application. A green check mark indicates that the deployed data-tier

application is well utilized and does not violate any policies. A red arrow indicates that

the deployed data-tier application is overutilized, and a green arrow indicates underutilization.

The lower half of the view contains tabs for CPU utilization, storage volume utilization,

access policy definitions, and property details for each data-tier application. For example, the

computer CPU and volume space for the AccountingDB and FinanceDB data-tier applications

shown in Figure 5-11 are underutilized. In addition, the application CPU and the file space

utilization for all deployed data-tier applications are well utilized, and the volume space for

AdventureWorks2005 and AdventureWorks2008R2 are overutilized.

 

The data-tier application list view columns and utilization tabs are discussed in the upcoming

sections.

 

 

 

The Data-Tier Application List View

 

The columns presenting the state of health for each deployed data-tier application in the

data-tier application list view include

 

¦ Application CPU This column displays the health state utilization of the processor

for the deployed data-tier application. The health state is determined by the CPU utilization

policy for deployed data-tier applications. The CPU Utilization tab shows CPU

utilization history for the selected deployed data-tier application.

 

¦ Computer CPU This column communicates computer processor utilization for

deployed data-tier applications. The CPU Utilization tab shows the processor utilization

history for the deployed data-tier application.

 

¦ File Space The File Space column summarizes file space utilization for each deployed

data-tier application. The health state for this parameter is determined by global or

local file space utilization policies. The Storage Utilization tab shows health state information

on all other database files.

 

¦ Volume Space Volume space utilization is summarized in this column for volumes with

databases belonging to each deployed data-tier application. The health of this parameter

is determined by the global or local Storage Volume utilization policies for deployed

data-tier application of SQL Server. Similar to File Space reports, the health of a storage

volume associated with a deployed data-tier application of SQL Server that is overutilized

is reported with a red arrow, and underutilization is reported with a green arrow. The

Storage Utilization tab shows additional health information and history for volumes.

 

¦ Policy Type This column in the list view specifies the type of policy applied to a

deployed data-tier application of SQL Server. Policy Type results are reported as either

Global or Override. Global indicates that default policies are in use, and Override indicates

that custom policies are in use.

 

¦ Instance Name The final column in the list view specifies the name of the SQL

Server instance to which the data-tier application has been deployed.

 

The Data-Tier Application Tabs

 

The Data-Tier Applications viewpoint includes tabs for additional viewing. The tabs are located

at the bottom of the viewpoint and consist of

 

¦ CPU Utilization The CPU Utilization tab, illustrated in Figure 5-11, displays historical

information on CPU utilization for a selected deployed data-tier application according

to the interval specified on the left side of the display area. DBAs can change the

display intervals for the graphs by selecting one of the following options:

 

. 1 Day Displays data in 15-minute intervals

 

. 1 Week Displays data in one-day intervals

 

. 1 Month Displays data in one-week intervals

 

. 1 Year Displays data in one-month intervals

 

 

 

Two linear graphs are presented next to each other. The first graph shows CPU utilization

based on the selected deployed data-tier application, and the second graph displays

data based on the computer associated with the deployed data-tier application.

 

¦ Storage Utilization The next tab displays storage utilization for a selected deployed

data-tier application, as depicted in Figure 5-12. Data is grouped by either

filegroup or volume. When the Filegroup option button is selected, storage utilization

is displayed for each data-tier application based on the node selected in the tree

view. If the Volume option button is selected, storage utilization history is displayed

by volume. The tree view also can be expanded to present storage utilization information

and history for each volume and filegroup associated with a deployed data-tier

application. In Figure 5-12, the volume space for the AdventureWorks2005 deployed

data-tier application is shown as overutilized because a red arrow is displayed in the

Volume Space column of the Storage Utilization tab.

 

Once again, the display intervals for the graphs are changed by selecting one of the

options available below:

 

. 1 Day Displays data in 15-minute intervals

 

. 1 Week Displays data in one-day intervals

 

. 1 Month Displays data in one-week intervals

 

. 1 Year Displays data in one-month intervals

 

 

 

FIGURE 5-12 The Storage Utilization tab on the Data-Tier Applications viewpoint

 

 

 

¦ Policy Details The Policy Details tab, shown in Figure 5-13, is where a DBA can view

the global policies applied to a selected deployed data-tier application. The Policy

Details tab can also be used to create a custom policy that overrides the default global

policy applied to a deployed data-tier application. For example, by expanding the

Data-Tier Application CPU Utilization Policies section, you can observe that the global

policy is applied. With this policy, a CPU of a data-tier application is considered to be

overutilized when its usage is greater than 70 percent and underutilized when it is less

than zero percent. If you wanted to override this global policy for a data-tier application,

you would select the Override The Global Policy option button and specify the

new overutilized and underutilized numeric values in the box. You would then click

Apply to enforce the new policy. In Figure 5-13, the global policy has been modified

from its original settings, and the CPU of a data-tier application is now considered to

be overutilized when its usage is greater than 30 percent. To override this setting, you

would choose the Override The Global Policy option button and set a desired value

in the box to the right of the policy description. For this example, the setting was

changed from 30 percent to 70 percent.

 

 

 

FIGURE 5-13 The Policy Details tab on the Data-Tier Applications viewpoint

 

 

 

The display is broken up into the following four policies, which can be viewed or

overridden:

 

. Data-Tier Application CPU Utilization Policies

 

. File Space Utilization Policies

 

. Computer CPU Utilization Policies

 

. Storage Volume Utilization Policies

 

¦ Property Details The Property Details tab, shown in Figure 5-14, displays generic

property details for the selected deployed data-tier application. Property detail information

consists of database name, deployed date, trustworthiness, collation, compatibility

level, encryption-enabled state, recovery model, and the last time data was

successfully updated.

 

 

 

FIGURE 5-14 The Property Details tab on the Data-Tier Applications viewpoint

 

 

 


 

CHAPTER 6

 

Scalable Data Warehousing

 

Microsoft SQL Server 2008 R2 Parallel Data Warehouse is an enterprise data warehouse

appliance based on technology originally created by DATAllegro and

acquired by Microsoft in 2008. In the months following the acquisition, Microsoft revamped

the product by changing it from a product that used the Linux operating system

and Ingres database technologies to a product based on SQL Server 2008 R2 and the

Windows Server 2008 operating system. SQL Server 2008 Enterprise has many features

supporting scalability and data warehouse performance that Parallel Data Warehouse

uses to its advantage. The combination of SQL Server scalability and performance with

a massively parallel processing (MPP) architecture in Parallel Data Warehouse creates a

powerful new option for hosting a very large data warehouse.

 

Parallel Data Warehouse Architecture

 

Parallel Data Warehouse does not install like other editions of SQL Server. Instead, it is

a data warehouse appliance that bundles multiple software and hardware technologies,

including SQL Server, into a platform well suited for a very large data warehouse. A key

characteristic of this platform is the MPP architecture, which enables fast data loads and

high-performance queries. This architecture consists of a multi-rack system, which parallelizes

queries across an array of dedicated servers connected by a high-speed network

to deliver results at speeds that are typically faster than possible with a traditional symmetric

multiprocessing (SMP) architecture.

 

Data Warehouse Appliances

 

You purchase a data warehouse appliance as preassembled and preconfigured integrated

components with all software preinstalled. When you place an order for an appliance

with an authorized vendor, you specify the number of appliance racks that you want to

purchase. The vendor works with you to add options, such as an optional backup node,

and to optimize the system to meet your requirements for faster query performance

and for storage of high data volumes. The vendor then assembles industry-standard

hardware components and loads the operating system, SQL Server, and Parallel Data

 

 

 


 

Warehouse software. When the assembly process is complete, the vendor ships the appliance

to you using shockproof pallets. When it arrives, you remove the appliance from the pallets,

plug it into a power source, and connect it to your network.

 

Parallel Data Warehouse is a data warehouse appliance that includes all server, networking,

and storage components required to host a data warehouse. In addition, your purchase of

Parallel Data Warehouse includes cables, power distribution units, and racks. Furthermore, the

components have redundancy to prevent downtime caused by a failure. The vendor installs all

software at the factory and configures Parallel Data Warehouse to balance CPU, memory, and

disk space. After you receive the Parallel Data Warehouse at your location, you use a configuration

tool that Parallel Data Warehouse includes to complete the network setup and configure

appliance settings for your environment. You can also install Microsoft or third-party

software to use when copying data between your corporate network and the appliance.

 

Processing Architecture

 

A traditional data warehouse deployment of SQL Server is an SMP architecture, in which identical

processors share memory on a single server. One physical instance of a database processes

all queries. You can improve performance by partitioning the data, thereby achieving

multi-threaded parallelization. You can add higher powered servers with more CPU, memory,

storage, and networking capacity to scale up, but the cost to scale up is high.

 

By contrast, Parallel Data Warehouse is an MPP architecture that uses multiple database

servers that operate together to process queries. Behind the scenes, each database server

runs one SQL Server instance with its own dedicated CPU, RAM, storage, and network

bandwidth. Each database managed by Parallel Data Warehouse is distributed across multiple

database servers that execute Parallel Data Warehouse queries in parallel. Parallel Data

Warehouse’s architecture includes a controlling server to coordinate these parallel queries

and all other database activity across the multiple database servers. This controlling server

also presents the distributed database as a single logical database to users. If you need to

scale out the MPP hardware, you can simply add inexpensive commodity servers and storage

rather than expensive high-end servers and storage.

 

The Multi-Rack System

 

Parallel Data Warehouse is configured as a multi-rack system in which there is a control rack

and one or more data racks, as shown in Figure 6-1. Each rack is a collection of nodes, each of

which has a dedicated role within the appliance. These nodes transfer data among themselves

using an InfiniBand network that ships with the appliance. Only the nodes in the control rack

communicate with the corporate Ethernet network. The nodes in the data rack can export

tables to a corporate SMP SQL Server database by using the InfiniBand network.

 

 

 


 

[Figure: the control rack houses the management node (active/passive), control node (active/passive), Landing Zone, and backup node, and receives user queries, data loading, and data backup requests; the data rack holds SQL Server compute nodes with active servers, dedicated storage, and a passive spare server, connected by dual Fibre Channel and InfiniBand networks.]

 

FIGURE 6-1 The multi-rack system

 

The Data Rack

 

All activity related to parallel query processing occurs in the data rack, which is a collection of

compute nodes. Each compute node consists of a server with dedicated storage, a SQL Server

instance, and additional Parallel Data Warehouse software that provides communication and

data transfer functions. Although the compute nodes run separate SQL Server instances in

parallel to manage each distributed appliance database, you query the database as if it were a

single database.

 

The number of compute nodes in a data rack varies among the vendors, although each

vendor follows a standard architecture specification. For example, each data rack includes a

spare server for high availability. If a compute node server fails or needs to be taken offline

for maintenance, the compute node server automatically fails over to the spare server.

The current connections to the appliance stay intact while the appliance reconfigures itself.

Just as with SQL Server failover, queries that were in progress before the failover need to be

restarted.

 

 

 

The Control Rack

 

The control rack is a separate rack that houses the servers, storage, and networking components

for the nodes that provide control, management, or interface functions. It contains

several types of nodes that Parallel Data Warehouse uses to process user queries, to load

and back up data, and to manage the appliance. Some of the nodes serve as intermediaries

between the corporate network and the private network that connects the nodes in both the

control rack and data rack. You never interact directly with the data rack; you submit a data

load or a query to the control rack, which then coordinates the processes between nodes to

complete your request.

 

Most Parallel Data Warehouse activity involves coordination with the control node. To support

high availability, the control node is a two-node active/passive cluster. If the active node

fails for any reason, the passive node takes over. The redundancy between the two nodes

ensures the appliance can recover quickly from a failure.

 

Parallel Data Warehouse uses multiple networking technologies. The control rack servers

connect to the corporate network by using the corporate Ethernet. The compute node servers

connect to their dedicated database storage by using a Fibre Channel network. A high-speed

InfiniBand network internally connects all the servers in the appliance to one another.

Because InfiniBand is much faster than a Gigabit Ethernet network, it is better suited for the

Parallel Data Warehouse nodes, which must transfer high volumes of data and be as fast as

possible. For high availability, the switching fabric of each network includes redundancy.

 

The Control Node

 

The control node is in the control rack and manages client authentication; accepts client connections

to Parallel Data Warehouse; manages the query execution process, which it distributes

across the compute nodes; and serves as the central point for all hardware monitoring.

To support high availability, the control node is a two-node active/passive cluster in which the

passive node instantly takes over if the active node fails for any reason. The control node also

contains a SQL Server instance.

 

To support the distributed architecture of Parallel Data Warehouse, the control node contains

the MPP Engine, the Data Movement Service (DMS), and Windows Internet Information

Services (IIS), as shown in Figure 6-2. The MPP Engine coordinates parallel query processing,

storage of appliance-wide metadata and configuration data, and authentication and authorization

for the appliance and databases. The DMS, which runs on most appliance nodes, is the

communication interface for copying data between appliance nodes. IIS hosts a Web application,

called the Admin Console, that you access by using Windows Internet Explorer and use

to manage and monitor the appliance status and query performance.

 

You can connect to the Parallel Data Warehouse control node by using a variety of client

access tools. Parallel Data Warehouse integrates with SQL Server 2008 R2 Business Intelligence

 

 

 

Development Studio, SQL Server Integration Services, SQL Server Analysis Services, and SQL

Server Reporting Services. The Nexus client is the query editor that you can use to submit

queries by using SQL statements to Parallel Data Warehouse. Parallel Data Warehouse also

includes DWSQL, a command-line tool for submitting SQL statements to the control node.

These client tools use Data Direct’s SequeLink client drivers that support the following data

access driver types:

 

¦ ODBC

 

¦ OLE DB

 

¦ ADO.NET

 

[Figure: client access tools (the SQL Server BI components for Analysis Services, Reporting Services, and Integration Services, the Nexus query editor, and DWSQL) connect over OLE DB, ODBC, and ADO.NET to the control node, which runs IIS hosting the Admin Console, the Data Movement Service (DMS), the MPP Engine, and a SQL Server control database. The compute nodes (SQL Server with user data), management node, backup node, and Landing Zone with its landing tool also run DMS, and the appliance can exchange data with an external SMP SQL Server database.]

 

FIGURE 6-2 Appliance software

 

 

 

The Landing Zone Node

 

The Landing Zone is a high-capacity data storage node in the control rack that contains terabytes

of disk space for temporary storage of user data before loading it into the appliance.

Using your ETL processes to move data to the Landing Zone, you can either copy data to the

Landing Zone and then load it into the appliance, or you can load data directly without first

storing it on the Landing Zone. With either approach, the Landing Zone uses the appliance’s

high-speed fabric to copy that data in parallel into the data rack. To perform parallel data

loading, you can use SQL Server Integration Services or a command-line tool.

 

The Backup Node

 

Another node in the control rack is the Backup node that, as the name implies, is dedicated

to the backup process, which it can perform at very high speed. The backup node uses SQL

Server’s native database-level backup and restore functionality and coordinates the backup

across nodes. You can create full backups or differential backups of user databases, or

backups of the system database that contains information about user accounts, passwords,

and permissions. The initial backup takes the longest time because it contains all data in a

database, but subsequent differential backups run much faster because they contain only

the changes in the data that were made since the last full backup. Furthermore, the backup

process runs in parallel across nodes to help performance.

 

TIP To restore the backup, the destination appliance must have at least as many compute

nodes as the appliance where the backup was created.

 

The Management Node

 

The final node in the control rack is the management node, which operates as the hub for

software deployment, servicing, and system health and performance monitoring. This node

also runs a Windows domain controller to manage authentication within the appliance. It

performs functions related to the management of hardware and software in the appliance

and is not visible to users. Like the control node, the management node is a two-node active/

passive cluster.

 

NOTE Parallel Data Warehouse does not use the domain controller on the management

node for user authentication.

 

The Compute Node

 

Each compute node is the host for a single SQL Server instance and runs the DMS to communicate

with and transfer data to other appliance nodes. Each compute node stores a subset of

each user database. Before parallel query processing begins, Parallel Data Warehouse copies

 

 

 

any necessary data to each compute node so that it can process the query in parallel with

other compute nodes without requiring data from other locations during processing. This

feature, called data colocation, ensures that each compute node can execute its portion of the

parallel query with no effect on the query performance of the other compute nodes.

 

Hub-and-Spoke Architecture

 

Rather than using Parallel Data Warehouse exclusively for a data warehouse, you can use a

hub-and-spoke architecture to support both a corporate data warehouse and special purpose

data marts. These data marts reside on servers outside of the appliance. The data warehouse

at the hub is the primary data source for the spokes. A spoke can be a data mart, a host for

Analysis Services, or even a development or test environment. You can enforce business rules

and data quality standards for all data at the hub, and then you can quickly copy data as

needed from the Parallel Data Warehouse to the spokes residing outside the appliance.

 

Data Management

 

Loading, processing, and backing up terabytes of data with balanced hardware resources is

vitally important in a very large data warehouse. Parallel Data Warehouse uses carefully balanced

hardware to maximize the efficiency of each hardware component and avoid the need

to over-purchase hardware. Parallel Data Warehouse accomplishes this goal of balancing

speed and hardware by using a shared nothing (SN) architecture.

 

In addition to the shared nothing architecture, there are other differences from other editions

of SQL Server to notice. For example, SQL commands to create a database and tables

are slightly different from their standard Transact-SQL counterparts. In addition, although

Parallel Data Warehouse supports most of the SQL Server 2008 data types, there are a few

exceptions. Last, the architecture requires a new approach to query processing and data

load processing.

 

Shared Nothing Architecture

 

An SN architecture is a type of architecture in which each node of a system uses its own CPU,

memory, and storage to avoid performance bottlenecks caused by resource contention with

other nodes. In Parallel Data Warehouse, each compute node contains its own data, CPU, and

storage to function as a self-sufficient and independent unit. Although the SN architecture

is gaining popularity as a data warehousing architecture, performance can still be slow when

a parallel query must first move data among the nodes before execution. When a SQL join

operation requires data that is not already on the requisite compute nodes, Parallel Data

Warehouse copies data to these nodes temporarily for use during query execution.

 

 

 

You design the data layout on the appliance to avoid or minimize data movement for parallel

queries by using either a replicated or a distributed strategy for storage. When planning

which strategy to implement, you consider the types of joins that the parallel queries require.

Some tables require a replicated strategy, whereas others require a distributed strategy.

 

Replicated Strategy

 

For best performance, you can add small tables—such as dimension tables in a star schema—

to Parallel Data Warehouse by using a replicated strategy. Parallel Data Warehouse makes

a copy of the table on each compute node, as shown in Figure 6-3. You then perform the

initial load of the table, followed by any subsequent inserts, updates, or deletes, as if you were

working with a single table, without the need to manage each copy of the table. Parallel Data

Warehouse handles all changes to the table for you. When a query performs a join on a replicated

dimension, Parallel Data Warehouse joins the dimension to the portion of the fact table

that exists on the same compute node. All compute nodes run the query in parallel and can

find data very quickly because the complete dimension table is on each compute node.

 


 

FIGURE 6-3 Replicated strategy

 

Distributed Strategy

 

One of the keys to performance in an MPP architecture is the distribution of large tables

across multiple nodes, as shown in Figure 6-4. To distribute a fact table, you simply select a

column from the table to use as the distribution column, and when data is loaded into the

table, Parallel Data Warehouse automatically spreads the rows across all of the compute

 

 

 

nodes in the appliance. There are performance considerations for the selection of a distribution

column, such as distinctness, data skew, and the types of queries executed on the system. For

a detailed discussion of the choice of distributed tables, refer to the product documentation.

 

To distribute the rows in the fact table, a hash function assigns each row to one of many storage

locations based on the distribution column. Each compute node has 8 storage locations,

called distributions, for the hashed rows. If a data rack has 8 compute nodes, the data rack has

64 distributions, which are queried in parallel.

 


 

FIGURE 6-4 Distributed strategy

 

It is not essential that equal numbers of table rows are assigned to each distribution. There

will almost always be some data skew among the distributions. If the amount of data skew

becomes too large, the parallel system continues to run, but query times might be affected.

You might have to experiment with several approaches before finding the best distributed

strategy. A distributed strategy does not affect other table options that you might want to

implement. For example, you can still define partitions and clustered indexes as needed.

 

DDL Extensions

 

To support the MPP architecture, Parallel Data Warehouse includes a SQL language that

works with appliance databases. This SQL language includes data definition language (DDL)

statements to create and alter databases, tables, views, and other entities on the appliance.

You use these statements to operate on these objects as if they were on a single database

instance. Behind the scenes, Parallel Data Warehouse allocates space for the objects and

instantiates them across nodes.

 

 

 

CREATE DATABASE

 

The CREATE DATABASE statement has a set of options for supporting distributed and replicated

tables. You determine how much space you need in total for the database for replicated

tables, distributed tables, and logs. Parallel Data Warehouse manages the database according

to your specifications.

 

Here is an example of the statement you use in Parallel Data Warehouse to create a new

database:

 

CREATE DATABASE DW

WITH (

AUTOGROW = ON,

REPLICATED_SIZE = 50,

DISTRIBUTED_SIZE = 10000,

LOG_SIZE = 25

);

 

This statement uses the following options:

 

¦ AUTOGROW This option specifies whether to enable or disable the automatic

growth feature. This feature allows Parallel Data Warehouse to manage the growth of

data and log files as needed over time.

 

¦ REPLICATED_SIZE This specifies the total space in gigabytes allocated to replicated

tables (and associated data) on each compute node. Parallel Data Warehouse stores

replicated tables in a SQL Server filegroup on each compute node.

 

¦ DISTRIBUTED_SIZE This specifies the total space in gigabytes allocated to distributed

tables on the appliance. Parallel Data Warehouse divides the space among all distributions

on the compute nodes and stores each distribution in a separate SQL Server

filegroup. In the SN architecture of Parallel Data Warehouse, each distribution has its

own set of disks for storage. This set of disks is configured as a logical unit number

(LUN).

 

¦ LOG_SIZE This option specifies the total space in gigabytes allocated to the transaction

log on the appliance. You should plan for the log file size to be large enough to

accommodate the largest data load that you expect. The automatic growth feature

adjusts the log size as needed if you underestimate the required log file size.

 

CREATE TABLE

 

The CREATE TABLE statement syntax varies slightly from its syntax in standard Transact-SQL.

For Parallel Data Warehouse, the statement includes options for specifying whether the table

uses a replicated or a distributed strategy and whether to store the table with a clustered index

or with a heap. You can also use this syntax to create partitions by specifying the partition

boundary values.

 

 

 

NOTE Parallel Data Warehouse does not use the Transact-SQL partition schema or partition

function. Also, you can create a clustered index only when you use CREATE TABLE. To

create a nonclustered index, you use CREATE INDEX.
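
For instance, a minimal sketch of adding a nonclustered index to an existing table might look like the following statement (the index name is illustrative, and FactSales is the distributed table created later in this section):

CREATE INDEX IX_FactSales_ProductId ON FactSales (ProductId);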

 

Here is an example of the syntax to create a replicated table:

 

CREATE TABLE DimProduct

(

ProductId BIGINT NOT NULL,

Description VARCHAR(50),

CategoryId INT NOT NULL,

ListPrice DECIMAL(12,2)

) WITH ( DISTRIBUTION = REPLICATE );

 

This syntax instructs Parallel Data Warehouse to create a table on all compute nodes. Subsequent

commands to insert or delete data affect data in each copy of the table.

 

Here is an example of the syntax to create a distributed table:

 

CREATE TABLE FactSales

( CustomerId BIGINT,

SalesId BIGINT,

ProductId BIGINT,

SaleDate DATE,

Quantity INT,

Amount DECIMAL(15,2)

) WITH (

DISTRIBUTION = HASH (CustomerId),

CLUSTERED INDEX (SaleDate),

PARTITION ( SaleDate

RANGE RIGHT FOR VALUES

( '2009-01-01','2009-02-01','2009-03-01','2009-04-01','2009-05-01','2009-06-01'
,'2009-07-01','2009-08-01','2009-09-01','2009-10-01','2009-11-01','2009-12-01')

));

 

The CREATE TABLE statement for Parallel Data Warehouse includes the following items:

 

¦ DISTRIBUTION Specifies the column to hash for distributing rows across all compute

nodes in Parallel Data Warehouse

 

¦ CLUSTERED INDEX Specifies the column for a clustered index—if you omit this

item from the statement, Parallel Data Warehouse stores the table as a heap

 

¦ PARTITION Specifies the boundary values of the partition and the column to use for

partitioning the rows

 

 

 

In addition, you can use a CREATE TABLE AS SELECT statement to create a table from the

results of a SELECT statement. You might use this technique when you are redistributing or

defragmenting a table.

 

Here is an example of the syntax for a CREATE TABLE AS SELECT statement:

 

-- CREATE TABLE AS SELECT must target a new table name; after the copy completes,
-- drop the original table and rename the new one (DimCustomer_New is illustrative).
CREATE TABLE DimCustomer_New
WITH
( CLUSTERED INDEX (CustomerID) )
AS
SELECT * FROM DimCustomer;

 

Another option for creating tables is the CREATE REMOTE TABLE statement, which you

can use to export a table to a non-appliance SQL Server database in an SMP architecture. To

use this statement, you must ensure that the target database is available on the appliance’s

InfiniBand network.

 

Data Types

 

Many SQL Server data types supported by SQL Server 2008 are also supported by Parallel

Data Warehouse. Character and binary strings are supported, but you must limit the string

length to 8,000 characters. Another point to note is that Parallel Data Warehouse uses only

Latin1_General_BIN2 collation.

 

The following data types are supported:

 

¦ Binary and varbinary

 

¦ Bit

 

¦ Char and varchar

 

¦ Date

 

¦ Datetime and datetime2

 

¦ Datetimeoffset

 

¦ Decimal

 

¦ Float and real

 

¦ Int, bigint, smallint, and tinyint

 

¦ Money and smallmoney

 

¦ Nchar and nvarchar

 

¦ Smalldatetime

 

¦ Time
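
As an illustration only, the following hypothetical table stays within these constraints; the table and column names are invented, character and binary columns are capped at 8,000 characters, and no COLLATE clause is needed because the appliance always uses Latin1_General_BIN2:

CREATE TABLE DimNote
(
NoteId BIGINT NOT NULL,
NoteDate DATETIME2,
NoteText VARCHAR(8000),         -- character strings are limited to 8,000 characters
NoteAttachment VARBINARY(8000)  -- binary strings have the same length limit
) WITH ( DISTRIBUTION = REPLICATE );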

 

 

 

Query Processing

 

Query processing in Parallel Data Warehouse is more complex than in an SMP data warehouse

because processing must manage high availability, parallelization, and data movement

between nodes. In general, Parallel Data Warehouse’s control node follows these steps to

process a query (shown in Figure 6-5):

 

1. Parse the SQL statement.

 

2. Validate and authorize the objects.

 

3. Build a distributed execution plan.

 

4. Run the execution plan.

 

5. Aggregate query results.

 

6. Send results to the client application.

 


 

FIGURE 6-5 Query processing steps

 

A query with a simple join on columns of replicated tables or distribution columns of distributed

tables does not require the transfer of data between compute nodes before executing

the query. By contrast, a more complex join that includes a nondistribution column of a

distributed table does require Parallel Data Warehouse to copy data among the distributions

before executing the query.
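
As a rough sketch using the FactSales and DimProduct tables defined earlier in this chapter, the following query joins a distributed table to a replicated table, so each compute node already holds all of the data it needs and no rows have to move before execution:

-- DimProduct is replicated, so every compute node can join its own
-- slice of FactSales to the complete dimension table locally.
SELECT p.CategoryId, SUM(f.Amount) AS TotalAmount
FROM FactSales AS f
JOIN DimProduct AS p ON f.ProductId = p.ProductId
GROUP BY p.CategoryId;

By contrast, joining two distributed tables on a column other than their distribution columns would force the appliance to redistribute rows among the compute nodes before running the join.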

 

Data Load Processing

 

The design of data load processing in Parallel Data Warehouse takes full advantage of the

parallel architecture to move data to the compute nodes. You have several options for loading

data into your data warehouse. You can use your ETL process to copy files to the Parallel

 

 

 

Data Warehouse’s Landing Zone. You then invoke a command-line tool, DWLoader, and specify

options to load the data into the appliance. Or you can use Integration Services to move

data to the Landing Zone and call the loading functionality directly. To load small amounts of

data, you can connect to the control node and use the SQL INSERT statement.

 

Queries can run concurrently with load processing, so your data warehouse is always available

during ETL processing. DWLoader loads table rows in bulk into an existing table in the

appliance. You have several options for loading rows into a table. You can add all rows to the

end of the table by using append mode. Another option is to append new rows and update

existing rows by using upsert mode. A third option is to delete all existing rows first and then

to insert all rows into an empty table by using reload mode.
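
For a small, ad hoc load through the control node, a minimal sketch might look like the following INSERT against the FactSales table defined earlier (the literal values are invented for illustration):

-- Connected to the control node, insert a single row into the distributed table.
INSERT INTO FactSales (CustomerId, SalesId, ProductId, SaleDate, Quantity, Amount)
VALUES (1001, 500001, 42, '2009-06-15', 3, 149.97);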

 

Monitoring and Management

 

Parallel Data Warehouse includes the Admin Console, a Web-based application with which

you can monitor the health of the appliance and query execution status, and view other information

useful for tuning user queries. This application runs on IIS on the control node and is

accessible by using Internet Explorer.

 

The Admin Console allows you to view these options:

 

¦ Appliance Dashboard Displays status details, such as utilization metrics for CPUs,

disks, and the network, and displays activity on the nodes

 

¦ Queries Activity Displays a list of running queries and queries recently completed,

with related errors, if any, and provides the ability to drill down to details to view the

query execution plan and node execution information

 

¦ Load Activity Displays load plans, the current state of loads, and related errors, if any

 

¦ Backup and Restore Displays a log of backup operations

 

¦ Active Locks Displays a list of locks across all nodes and their current status

 

¦ Active Sessions Displays active user sessions to aid monitoring of resource contention

 

¦ Application Errors Displays error event information

 

¦ Node Health Displays hardware and software alerts and allows an administrator to

view the health of specific nodes

 

To manage database objects, you might need to query the tables or view the objects. The

version of SQL Server Management Studio included with SQL Server 2008 R2 is not currently

compatible with Parallel Data Warehouse, but you can still use other tools. For example, you

can use a command-line utility, Dwsql, to query a table. Using Dwsql is similar to using Sqlcmd.

An alternative with a graphical user interface is the Nexus query tool from Coffing Data

Warehousing (Coffing DW), which is distributed with each appliance installation. This tool operates

much like SQL Server Management Studio (SSMS) by allowing you to navigate through

an object explorer to find tables and views and to run queries interactively.

 

 

 

Business Intelligence Integration

 

Parallel Data Warehouse integrates with the SQL Server business intelligence (BI) components—

Integration Services, Reporting Services, and SQL Server Analysis Services.

 

Integration Services

 

Integration Services is the ETL component of SQL Server. You use Integration Services packages

to extract and merge data from multiple data sources and to filter and cleanse your

data before loading it into the data warehouse. In SQL Server 2008 R2, Integration Services

includes the SQL Server Parallel Data Warehouse connection manager and the SQL Server

Parallel Data Warehouse Destination as new components that you use in Integration Services

packages to load data into Parallel Data Warehouse. This new data destination provides optimized

throughput and very fast performance because it loads data directly and quickly into

the target database. You also have the option to deploy packages to the Landing Zone.

 

Reporting Services

 

You can use Parallel Data Warehouse as a data source for reports that you develop for Reporting

Services using the Report Designer in Business Intelligence Development Studio or

SQL Server 2008 R2 Report Builder 3.0. The Parallel Data Warehouse data source extension

provides support for the graphical query designer, parameterized queries, and basic transactions,

but it does not support Windows integrated security or advanced transactions. To

use the Parallel Data Warehouse data source extension, you must install the ADO.NET data

provider for Parallel Data Warehouse on the report server and each computer on which you

create reports.

 

You can also use Parallel Data Warehouse as a source for report models. By using Report

Manager or the report server API, you can generate a model from a Parallel Data Warehouse

database. For more precise control of the model, you can use the Model Designer in Business

Intelligence Development Studio.

 

Analysis Services and PowerPivot

 

Parallel Data Warehouse is also a valid data source for Analysis Services databases and Excel

PowerPivot models. Using the OLE DB provider, you can configure an Analysis Services cube

to use either multidimensional online analytical processing (MOLAP) or relational online

analytical processing (ROLAP) storage. When using MOLAP storage, Analysis Services extracts

data from Parallel Data Warehouse and stores it in a separate structure for reporting and

analysis. By contrast, when using ROLAP storage, Analysis Services leaves the data in Parallel

Data Warehouse. At query time, Analysis Services translates the multidimensional expression

(MDX) query into a SQL query, which it sends to the Parallel Data Warehouse control node for

query processing.

 

 

 


 

CHAPTER 7

 

Master Data Services

 

Microsoft SQL Server 2008 R2 Master Data Services (MDS) is another new technology

in the SQL Server family and is based on software from Microsoft’s acquisition of

Stratature in 2007. Just as SQL Server Reporting Services (SSRS) is an extensible reporting

platform that ships with ready-to-use applications for end users and administrators, MDS

is both an extensible master data management platform and an application for developing,

managing, and deploying master data models. MDS is included with the Datacenter,

Enterprise, and Developer editions of SQL Server 2008 R2.

 

Master Data Management

 

In the simplest sense, master data refers to nontransactional reference data. Put another

way, master data represents the business entities—people, places, or things—that

participate in a transaction. In a data mart or data warehouse, master data becomes

dimensions. Master data management is the set of policies and procedures that you

use to create and maintain master data in an effort to overcome the many challenges

associated with managing master data. Because it’s unlikely that a single set of policies

and procedures would apply to all master data in your organization, MDS provides the

flexibility you need to accommodate a wide range of business requirements related to

master data management.

 

Master Data Challenges

 

As an organization grows, the number of line-of-business applications tends to increase.

Furthermore, data from these systems flows into reporting and analytical solutions.

Often, the net result of this proliferation of data is duplication of data related to key

business entities, even though each system might maintain only a subset of all possible

data for any particular entity type. For example, customer data might appear in a sales

application, a customer relationship management application, an accounting application,

and a corporate data warehouse. However, there might be fields maintained in one application

that are never used in the other applications, not to mention information about

customers that might be kept in spreadsheets independent of any application. None of

the systems individually provide a complete view of customers, and the multiple systems

quite possibly contain conflicting information about specific customers.

 

 

 


 

This scenario presents additional problems for operational master data in an organization

because there is no coordination across multiple systems. Business users cannot be

sure which of the many available systems has the correct information. Moreover, even when

a user identifies a data quality problem, the process for properly updating the data is not

always straightforward or timely, nor does fixing the data in one application necessarily ripple

through the other applications to keep all applications synchronized.

 

Compounding the problems further is data that has no official home in the organization’s

data management infrastructure. Older data might be archived and no longer available in

operational systems. Other data might reside only in e-mail or in a Microsoft Access database

on a computer sitting under someone’s desk.

 

Some organizations try their best not to add another system dedicated to master data

management to minimize the number of systems they must maintain. However, ultimately

they find that neither existing applications nor ETL processes can be sufficiently extended to

accommodate their requirements. Proper master data management requires a wide range of

functionality that is difficult, if not impossible, to replicate through minor adaptations to an

organization’s technical infrastructure.

 

Last, the challenges associated with analytic master data stem from the need to manage

dimensions more effectively. For example, analysts might require certain attributes in a

business intelligence (BI) solution, but these attributes might have no source in the line-of-business

applications on which the BI solution is built. In such a case, the ETL developer can

easily create a set of static attributes to load into the BI solution, but what happens when

the analyst wants to add more attributes? Moreover, how gracefully can that solution handle

changes to hierarchical structures?

 

Key Features of Master Data Services

 

The goal of MDS is to address the challenges of both operational and analytical master data

management by providing a master data hub to centrally organize, maintain, and manage

your master data. This master data hub supports these capabilities with a scalable and extensible

infrastructure built on SQL Server and the Windows Communication Foundation (WCF)

APIs. By centralizing the master data in an external system, you can more easily align all business

applications to this single authoritative source. You can adapt your business processes to

use the master data hub as a System of Entry that can then update downstream systems. Another

option is to use it as a System of Record to integrate data from multiple source systems

into a consolidated view, which you can then manage more efficiently from a central location.

Either way, this centralization of master data helps you improve and maintain data quality.

 

Because the master data hub is not specific to any domain, you can organize your master

data as you see fit, rather than force your data to conform to a predefined format. You can

easily add new subject areas as necessary or make changes to your existing master data to

meet unique requirements as they arise. The master data hub is completely metadata driven,

so you have the flexibility you need to organize your master data.

 

 

 


 

In addition to offering flexibility, MDS allows you to manage master data proactively.

Instead of discovering data problems in failed ETL processes or inaccurate reports, you can

engage business users as data stewards. As data stewards, they have access to Master Data

Manager, a Web application that gives them ownership of the processes that identify and

react to data quality issues. For example, a data steward can specify conditions that trigger

actions, such as creating a default value for missing data, sending an e-mail notification, or

launching a workflow. Data stewards can use Master Data Manager not only to manage data

quality issues, but also to edit master data by adding new members or changing values. They

can also enhance master data with additional attributes or hierarchical structures quickly and

easily without IT support. Using Master Data Manager, data stewards can also monitor changes

to master data through a transaction logging system that tracks who made a change, when

the change was made, which record was changed, and what the value was both before and

after the change. If necessary, the data steward can even reverse a change.

 

MDS uses Windows integrated security for authentication and a fine-grained, role-based

system for authorization that allows administrators to give the right people the direct access

they need to manage and update master data. As an administrator, you can grant broad access

to all objects in a model, or you can restrict users to specific rows and columns in a data set.

 

To capture the state of master data at specific points in time, MDS allows administrators

to create versions of the master data. As long as a version has an Open status, anyone with

access to the model can make changes to it. Then you can lock the version for validation and

correction, and commit the version when the model is ready for use. If requirements change

later, you copy a committed version and start the process anew.

 

Because MDS is a platform, not simply an application, you can use the API to integrate

your existing applications with MDS and automate the import or export processes. Anything

that you can do by using Master Data Manager can be built into your own custom application

because the MDS API supports all operations. This capability also enables Microsoft partners

to quickly build master data support into their applications with domain-specific user interfaces

and transparent application integration.

 

Master Data Services Components

 

Although MDS is included on the SQL Server installation media, you perform the MDS installation

separately from the SQL Server installation by using a wizard interface. The wizard

installs Master Data Services Configuration Manager, installs the files necessary to run the

Master Data Services Web service, and registers assemblies. After installation, you use the

Master Data Services Configuration Manager to create and configure a Master Data Services

database in a SQL Server instance that you specify, create the Master Data Services Web application,

and enable the Web service.

 

 

 

Master Data Services Configuration Manager

 

Before you can start using MDS to manage your master data, you use Master Data Services

Configuration Manager. This configuration tool includes pages to create the MDS database,

configure the system settings for all Web services and applications that you associate with

that database, and configure the Master Data Services Web application.

 

On the Databases page of Master Data Services Configuration Manager, you specify the

SQL Server instance to use for the new MDS database and launch the process to create the

database. After creating the database, you can modify the system settings that govern all

MDS Web applications that you establish on the same server. You configure system settings

to set thresholds, such as time-out values or the number of items to display in a list.

You can also use system settings to manage application behavior, such as whether users can

copy committed model versions or any model version and whether the staging process logs

transactions. For e-mail notifications, you can configure system settings to include a URL to

Master Data Manager in e-mails, to manage the frequency of notifications, and to specify whether to

send e-mails in HTML or text format, among other settings. Most settings are configurable by

using Master Data Services Configuration Manager. You can change values for other settings

directly in the System Settings table in the MDS database.

 

On the Web Configuration page of Master Data Services Configuration Manager, you associate

the Master Data Services Web application, Master Data Manager, with an existing Web

site or create a new Web site and application pool for it. You can also opt to enable the Web

service for Master Data Manager to support programmatic access to the application.

 

The Master Data Services Database

 

The MDS database is the central repository for all information necessary to support the Master

Data Manager application and the MDS Web service. This database stores application settings,

metadata tables, and all versions of the master data. In addition, it contains tables that

MDS uses to stage data from source systems and subscription views for downstream systems

that consume master data.

 

Master Data Manager

 

Master Data Manager is a Web application that serves as a stewardship portal for business

users and a management interface for administrators. Master Data Manager includes the following

five functional areas:

 

¦ Explorer Use this area to change attributes, manage hierarchies, apply business rules

to validate master data, review and correct data quality issues, annotate master data,

monitor changes, and reverse transactions.

 

¦ Version Management Use this area to create a new version of your master data

model and underlying data, uncover all validation issues in a model version, prevent

users from making changes, assign a flag to indicate the current version for subscribing

systems, review changes, and reverse transactions.

 

 

 

¦ Integration Management Use this area to create and process batches for importing

data from staging tables into the MDS database, view errors arising from

the import process, and create subscription views for consumption of master data by

operational and analytic applications.

 

¦ System Administration Use this area to create a new model and its entities and

attributes, define business rules, configure notifications for failed data validation, and

deploy a model to another system.

 

¦ User And Group Permissions Use this area to configure security for users and

groups to access functional areas in Master Data Manager, to perform specific functions,

and to restrict or deny access to specific model objects.

 

Data Stewardship

 

Master Data Manager is the data stewardship portal in which authorized business users can

perform all activities related to master data management. At minimum, a user can use this

Web application to review the data in a master data model. Users with higher permissions can

make changes to the master data and its structure, define business rules, review changes to

master data, and reverse changes.

 

Model Objects

 

Most activities in MDS revolve around models and the objects they contain. A model is a

container for all objects that define the structure of the master data. A model contains at least

one entity, which is analogous to a table in a relational database. An entity contains members,

which are like the rows in a table, as shown in Figure 7-1. Members (also known as leaf members)

are the master data that you are managing in MDS. Each leaf member of the entity has

multiple attributes, which correspond to table columns in the analogy.

 

FIGURE 7-1 The Product entity (members shown as rows, attributes as columns)

 

By default, an entity has Name and Code attributes, as shown in Figure 7-1. These two attributes

are required by MDS. The Code attribute values must be unique, in the same way that

a primary key column in a table requires unique values. You can add any number of additional

free-form attributes to accept any type of data that the user enters; the Name attribute

of the Product entity shown in Figure 7-1 is one such attribute.

 

 

 

An entity can also have any number of domain-based attributes whose values are members

of another related entity. In the example in Figure 7-1, the ProductSubCategory attribute

is a domain-based attribute. That is, the ProductSubCategory codes are attribute values in the

Product entity, and they are also members of the ProductSubCategory entity. A third type of

attribute is the file attribute, which you can use to store a file or image.

 

You have the option to organize attributes into attribute groups. Each attribute group contains

the name and code attributes of the entity. You can then assign the remaining attributes

to one or more attribute groups, or leave them unassigned. Attribute groups are securable objects.

 

You can organize members into hierarchies. Figure 7-2 shows partial data from two types

of hierarchies. On the left is an explicit hierarchy, which contains all members of a single entity.

On the right is a derived hierarchy, which contains members from multiple, related entities.

 

 

 

FIGURE 7-2 Product hierarchies

 

In the explicit hierarchy, you create consolidated members to group the leaf members. For

example, in the Geography hierarchy shown in Figure 7-2, North America, United States, and

Bikes are all consolidated members that create multiple levels for summarization of the leaf

members.

 

In a derived hierarchy, the domain-based attribute values of an entity define the levels. For

example, in the Category hierarchy in the example, Wholesale is in the ProductGroup entity,

which in turn is a domain-based attribute of the ProductCategory entity of which Components

is a member. Likewise, the ProductCategory entity is a domain-based attribute of the

ProductSubCategory entity, which contains Forks as a member. The base entity, Product,

includes ProductSubCategory as a domain-based attribute.

 

Regardless of hierarchy type, each hierarchy contains all members of the associated entities.

When you add, change, or delete a member, all hierarchies to which the member belongs

will also update to maintain consistency across hierarchies.

 

A collection is an alternative way to group members by selecting nodes from existing

explicit hierarchies, as shown in Figure 7-3. Although this example shows only leaf members,

a collection can also contain branches of consolidated members and leaf members. You can

combine nodes from multiple explicit hierarchies into a single collection, but all members

must belong to the same entity.

 

 

 

 

 

FIGURE 7-3 A collection

 

Master Data Maintenance

 

Master Data Manager is more than a place to define model objects. It also allows you to

create, edit, and update leaf members and consolidated members. When you add a leaf

member, you initially provide values for only the Name and Code attributes, as shown in

Figure 7-4. You can also use a search button to locate and select the parent consolidated

member in each hierarchy.

 

 

 

FIGURE 7-4 Adding a new leaf member

 

After you save your entry, you can edit the remaining attribute values immediately or at a

later time. Although a member can have hundreds of attributes and belong to multiple hierarchies,

you can add the new member without having all of this information at your fingertips;

you can update the attributes at your leisure. MDS always keeps track of the missing

information, displaying it as validation issue information at the bottom of the page on which

you edit the attribute values, as shown in Figure 7-5.

 

 

 

 

 

FIGURE 7-5 Attributes and validation issues

 

Business Rules

 

One of the goals of a master data management system is to set up data correctly once and to

propagate only valid changes to downstream systems. To achieve this goal, the system must

be able to recognize valid data and to alert you when it detects invalid data. In MDS, you

create business rules to describe the conditions that cause the data to be considered invalid.

For example, you can create a business rule that specifies the required attributes (also known

as fields) for an entity. A business entity is likely to have multiple business rules, which you can

sequence in order of priority, as shown in Figure 7-6.

 

 

 

FIGURE 7-6 The Product entity’s business rules

 

Figure 7-7 shows an example of a simple condition that identifies the required fields for the

Product entity. If you omit any of these fields when you edit a Product member, MDS notes

a validation issue for that member and prevents you from using the master data model until

you supply the missing values.

 

 

 

 

 

FIGURE 7-7 The Required Fields business rule

 

When creating a business rule, you can use any of the following types of actions:

 

¦ Default Value Sets the default value of an attribute to blank, a specific value that

you supply in the business rule, a generated value that increments from a specified

starting value, or a value derived by concatenating multiple attribute values

 

¦ Change Value Updates the attribute value to blank, another attribute value, or a

value derived by concatenating multiple attribute values

 

¦ Validation Creates a validation warning and, if you choose, sends a notification

e-mail to a specified user or group

 

¦ External Action Starts a workflow at a specified Microsoft SharePoint site or

initiates a custom action

 

Because users can add or edit data only while the master data model version is open,

invalid data can exist only while the model is still in development and unavailable to other

systems. You can easily identify the members that pass or fail the business rule validation

when you view a list of members in Explorer, as shown in Figure 7-8. In this example, the first

two records are in violation of one or more of the business rules. Remember that you can see

the specific violation issues for a member when you open it for editing.

 

 

 

FIGURE 7-8 Business rule validation

 

 

 

Transaction Logging

 

MDS uses a transaction log, as shown in Figure 7-9, to capture every change made to master

data, including the master data value before and after the change, the user who made the

change (not shown), the date and time of the change, and other identifying information

about the master data. You can access this log to view all transactions for a model by version

in the Version Management area of Master Data Manager. If you find that a change was made

erroneously, you can select the transaction in the log and click the Undo button above the

log to restore the prior value. The transaction log also includes the reversals you make when

using this technique.

 

 

 

FIGURE 7-9 The transaction log

 

MDS allows you to annotate any transaction so that you can preserve the reasons for a

change to the master data. When you select a transaction in the transactions log, a new section

appears at the bottom of the page for transaction annotations. Here you can view the

complete set of annotations for the selected transaction, if any, and you can enter text for a

new annotation, as shown in Figure 7-10.

 

 

 

FIGURE 7-10 A transaction annotation

 

 

 

Integration

 

Master Data Manager also provides support for data integration between MDS and other applications.

Master Data Manager includes an Integration Management area for importing and

exporting data. However, the import and export processes here are nothing like those of the

SQL Server Import And Export wizard. Instead, you use the Import page in Master Data Manager

to manage batch processing of staging tables that you use to load the MDS database,

and you use the Export page to configure subscription views that allow users and applications

to read data from the MDS database.

 

Importing Master Data

 

Rather than manually entering the data by using Master Data Manager, you can import your

master data from existing data sources by staging the data in the MDS database. You can

stage the data by using either the SQL Server Import And Export wizard or SQL Server Integration

Services. After staging the data, you use Master Data Manager to process the staged

data as a batch. MDS moves valid data from the staging tables into the master data tables in

the MDS database and flags any invalid records for you to correct at the source and restage.

 

You can use any method to load data into the staging tables. The most important part of

this task is to ensure that the data is correct in the source and that you set the proper values

for the columns that provide information to MDS about the master data. For example, each

record must identify the model into which you will load the master data. When staging data,

you use the following tables in the MDS database as appropriate to your situation:

 

¦ tblSTGMember Use this table to stage leaf members, consolidated members, or

collections. You provide only the member name and code in this table.

 

¦ tblSTGMemberAttribute Use this table to stage the attribute values for each

member using one row per attribute, and include the member code to map the attribute

to the applicable member.

 

¦ tblSTGRelationship Use this table to stage parent-child or sibling relationships

between members in a hierarchy or a collection.

 

NOTE For detailed information about the table columns and valid values for required

columns, refer to the “Master Data Services Database Reference” topic in SQL Server 2008

R2 Books Online at http://msdn.microsoft.com/en-us/library/ee633808(SQL.105).aspx.
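
To make the staging step concrete, the following is a minimal sketch that loads one leaf member into tblSTGMember by using ADO.NET. The connection string, the mdm schema prefix, the column list, and the MemberType_ID value for a leaf member are illustrative assumptions based on the description above; verify them against the Master Data Services Database Reference before using them.

using System.Data.SqlClient;

public static class StagingLoader
{
    // Sketch: stage a single leaf member for a Product model. The column names
    // (ModelName, EntityName, MemberType_ID, MemberName, MemberCode) and the
    // mdm schema are assumptions; confirm them in the Database Reference topic.
    public static void StageLeafMember(string connectionString)
    {
        const string insertSql =
            @"INSERT INTO mdm.tblSTGMember
                  (ModelName, EntityName, MemberType_ID, MemberName, MemberCode)
              VALUES (@model, @entity, 1, @name, @code);";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(insertSql, connection))
        {
            command.Parameters.AddWithValue("@model", "Product");
            command.Parameters.AddWithValue("@entity", "Product");
            command.Parameters.AddWithValue("@name", "Mountain-100 Silver, 38");
            command.Parameters.AddWithValue("@code", "BK-M82S-38");
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}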

 

The next step is to use Master Data Manager to create a batch. To do this, you identify

the model and the version that stores the master data for the batch. The version must have

a status of either Open or Locked to import data from a staging table. On your command to

process the batch, MDS attempts to locate records in the staging tables that match the specified

model and load them into the tables corresponding to the model and version that you

 

 

 

selected. When the batch processing is complete, you can review the status of the batch in

the staging batch log, which is available in Master Data Manager, as shown in Figure 7-11.

 

 

 

FIGURE 7-11 The staging batch log

 

If the log indicates any errors for the staging batch, you can select the batch in the log and

then view the Staging Batch Errors page to see a description of the error for each record that

did not successfully load into the MDS database. You can also check the Status_ID column of

the staging table to distinguish between successful and failed records, which have a column

value of 1 and 2, respectively. You should then return to the source system and update the
pertinent records to correct the errors, truncate the staging table to remove all records, and
load the corrected records. You can then create a new staging batch and repeat the process
until all records load successfully.
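
If you prefer to check the outcome programmatically rather than in the Staging Batch Errors page, a query against the staging table's Status_ID column is enough. The sketch below assumes the same mdm.tblSTGMember staging table used earlier and simply counts the failed rows (Status_ID = 2).

using System.Data.SqlClient;

public static class StagingChecker
{
    // Counts staging rows that failed to load (Status_ID = 2) so they can be
    // corrected at the source and restaged. Table and column names follow the
    // staging description above; treat them as assumptions to verify.
    public static int CountFailedStagingRows(string connectionString)
    {
        const string countSql =
            "SELECT COUNT(*) FROM mdm.tblSTGMember WHERE Status_ID = 2;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(countSql, connection))
        {
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}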

 

Exporting Master Data

 

Of course, MDS is not a destination system for your master data. It can be both a system

of entry and a system of record for applications important to the daily operations of your

organization, such as an enterprise resource planning (ERP) system, a customer relationship

management (CRM) system, or a data warehouse. After you commit a model version, your

master data is available to other applications through subscription views in the MDS database.

Any system that can consume data from SQL Server can use these views to access up-to-date

master data.

 

To create a subscription view in Master Data Manager, you start by assigning a name to

the view and selecting a model. You then associate the view with a specific version or a version

flag.

 

TIP You can simplify the administration of a subscription view by associating it with a

version flag rather than a specific version. As the version of a record changes over time,

you can simply reset the flag for the versions. If you don’t use version flags, a change in

version requires you to update every subscription view that you associate with the version,

which could be a considerable number.

 

Next, you select either an entity or a derived hierarchy as the basis for the view and the

format of the view. For example, if you select an entity, you can format the view to use leaf

members, consolidated members, or collection members and the associated attribute values.

When you save the view, it is immediately available in the MDS database to anyone (or

any application) with Read access to the database. For example, after creating the Product

 

 

 

subscription view in Master Data Manager as an entity-based leaf member view, you can

query the Product view and see the results in SQL Server Management Studio, as shown in

Figure 7-12.

 

 

 

FIGURE 7-12 Querying the Product subscription view
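
Because a subscription view is just a view in the MDS database, any data access technology that can talk to SQL Server can read it. The following sketch queries a subscription view from a client application; the view name (mdm.Product) and the column names are assumptions based on the example above, so substitute the names you assigned when you created the view.

using System;
using System.Data.SqlClient;

public static class SubscriptionViewReader
{
    // Reads master data from a subscription view. The view and column names
    // are illustrative; leaf member views typically expose Code and Name.
    public static void PrintProducts(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT Code, Name FROM mdm.Product;", connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine("{0}\t{1}", reader["Code"], reader["Name"]);
                }
            }
        }
    }
}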

 

Administration

 

Of course, Master Data Manager supports administrative functions, too. Administrators use

it to manage the versioning process of each master data model and to configure security for

individual users and groups of users. When you need to make a copy of a master data model

on another server, as you would when you want to recreate your development environment on

a production server, you can use the model deployment feature in Master Data Manager.

 

Versions

 

MDS uses a versioning management process to support multiple copies of master data. With

versioning, you can maintain an official working copy of master data that no one can change,

alongside historical copies of master data for reference and a work-in-progress copy for use

in preparing the master data for changing business requirements.

 

MDS creates the initial version when you create a model. Anyone with the appropriate permissions

can populate the model with master data and make changes to the model objects

in this initial version until you lock the version. After that, only users with Update permissions

on the entire model can continue to modify the data in the locked version to add missing

information, fix any business rule violation, or revert changes made to the model. If necessary,

you can temporarily unlock the version to allow other users to correct the data.

 

When all data validates successfully, you can commit the version. Committing a version

prevents any further changes to the model and allows you to make the version available to

downstream systems through subscriptions. You can use a flag, as shown in Figure 7-13, to

identify the current version to use so that subscribing systems do not need to track the current

version number themselves. If you require any subsequent changes to the model, you

 

 

 

create a new version by copying a previously committed version and allowing users to make

their changes to the new version.

 

 

 

FIGURE 7-13 Model versions

 

Security

 

MDS uses a role-based authorization system that allows you to configure security both by

functional area and by object. For example, you can restrict a user to the Explorer area of

Master Data Manager, as shown in Figure 7-14, while granting another user access to only the

Version Management and Integration Management areas. Then, within the functional area,

you must grant a user access to one or more models to control which data the user can see

and which data the user can edit. You must assign the user permission to access at least one

functional area and one model for that user to be able to open Master Data Manager.

 

 

 

FIGURE 7-14 Functional area permissions

 

You can grant a user either Read-only or Update permissions for a model. That permission

level applies to all objects in the model unless you specifically override the permissions for

a particular object; the new permission cascades downward to lower level objects. Similarly,

you can grant permissions on specific members of a hierarchy and allow the permissions to

cascade to members at lower levels of the hierarchy.

 

To understand how security works in MDS, let’s configure security for a sample user and

see how the security settings affect the user experience. As you saw earlier in Figure 7-14,

the user can access only the Explorer area in Master Data Manager. Accordingly, that is the

only functional area that is visible when the user accesses Master Data Manager, as shown in

 

 

 

Figure 7-15. An administrator with full access privileges would instead see the full list of functional

areas on the home page.

 

 

 

FIGURE 7-15 The Master Data Manager home page for a user with only Explorer permissions

 

Data security begins at the model level. When you deny access to a model, the user does

not even see it in Master Data Manager. With Read-only access, a user can view the model

structure and its data but cannot make changes. Update permissions allow a user to see the

data as well as make changes to it. To continue the security example, Figure 7-16 shows that

this user has Read-only permissions for the Product model (as indicated by the lock icon) and

Deny permissions on all other models (as indicated by the stop symbol) in the Model Permissions

tree view on the left. In the Model Permissions Summary table on the right, you can

see the assigned permissions at each level of the model hierarchy. Notice that the user has

Update permission on leaf members of the ProductCategory entity.

 

 

 

FIGURE 7-16 A user’s model permissions

 

With Read-only access to the model, except for the ProductCategory entity, the user

can view data for all other entities or hierarchies, such as Color, as shown in Figure 7-17, but

cannot edit the data in any way. Notice the lock icons in the Name and Code columns in the

 

 

 

Color table on the right side of the page. These icons indicate that the values in the table are

not editable. The first two buttons above the table allow a user with Update permissions to

add or delete a member, but those buttons are unavailable here because the user has Read-only
permission. The user can also navigate through the hierarchy in the tree view on the left

side of the page, but the labels are gray to indicate the Read-only status for every member of

the hierarchy.

 

 

 

FIGURE 7-17 Read-only permission on a hierarchy

 

At this point in the example, the user has Update permission on the ProductCategory entity,

which allows the user to edit any member of that entity. However, you can apply a more

granular level of security by changing permissions of individual members of the entity within

a hierarchy. As shown in Figure 7-18, you can override the Update permission at the entity

level by specifying Read-only permission on selected members. The tree view on the left side

of the page shows a lock icon for the members to which Read-only permissions apply and a

pencil icon for the members for which the user has Update permissions.

 

 

 

FIGURE 7-18 Member permissions within a hierarchy

 

 

 

More specifically, the security configuration allows this user to edit only the Bikes and Accessories

categories in the Retail group, but the user cannot edit categories in the Wholesale

group. Let’s look first at the effect of these permissions on the user’s experience on the ProductCategory

page (shown in Figure 7-19). The lock icon in the first column indicates that the

Components and Clothing categories are locked for editing. However, the user has Update

permission for both Bikes and Accessories, and can access the member menu for either of

these categories. The member menu, as shown in the figure, allows the user to edit or delete

the member, view its transactions, and add an annotation. Furthermore, the user can add new

members to the entity.

 

 

 

FIGURE 7-19 Mixed permissions for an entity

 

Last, Figure 7-20 shows the page for the Category derived hierarchy. Recall from Figure

7-19 that the user has Update permission for the Retail group. The user can therefore modify

the Retail member, but not the Wholesale member, as indicated by the lock icon to the left

of the Wholesale member in the ProductGroup table. You can also see the color-coding of

the labels in the tree view of the Category hierarchy, which indicates whether the member is

editable by the user. The user can edit members that are shown in black, but not the members

shown in gray. When the user selects a member in the tree view, the table on the right

displays the children of the selected member if the user has the necessary permission.

 

 

 

FIGURE 7-20 Mixed permissions for a derived hierarchy

 

 

 

Model Deployment

 

When you have finalized the master data model structure, you can use the model deployment

capabilities in Master Data Manager to serialize the model and its objects as a package

that you can later deploy on another server. In this way, you can move a master data model

from development to testing and to production without writing any code or moving data

at the table level. The deployment process does not copy security settings. Therefore, after

moving the master data model to the new server, you must grant the users access to functional

areas and configure permissions.

 

To begin the model deployment, you use the Create Package wizard in the System Administration

area of Master Data Manager. You specify the model and version that you want to

deploy and whether you want to include the master data in the deployment. When you click

Finish to close the wizard, Master Data Manager initiates a download of the package to your

computer, and the File Download message box displays. You can then save the package for

deployment at a later time.

 

When you are ready to deploy the package, you use the Deploy Package wizard in Master

Data Manager on the target server and provide the wizard with the path to the saved package.

The wizard checks to see whether the model and version already exist on the server. If so,

you have the option to update the existing model by adding new items and updating existing

items. Alternatively, you can create an entirely new model, but if you do so, the relationship

with the source model is then permanently broken, and any subsequent updates to the

source model cannot be brought forward to the copy of the model on the target server.

 

Programmability

 

Rather than use Master Data Manager exclusively to perform master data management

operations, you might prefer to automate some operations to incorporate them into a custom

application. Fortunately, MDS is not just an application ready to use after installation, but also

a development platform that you can use to integrate master data management directly into

your existing business processes.

 

TIP For a code sample that shows how to create a model and add entities to the model,

see the following blog entry by Brent McBride, a Senior Software Engineer on the MDS

team: “Creating Entities using the MDS WCF API,” at http://sqlblog.com/blogs/mds_team

/archive/2010/01/29/creating-entities-using-the-mds-wcf-api.aspx.

 

The Class Library

 

The MDS API allows you to fully customize any or all activities necessary to create, populate,

maintain, manage, and secure master data models and associated data. To build your own

data stewardship or management solution, you use the following namespaces:

 

 

 

¦ Microsoft.MasterDataServices.Services Contains a class to provide instances

of the MdsServiceHost class and a class to provide an API for operations related to

business rules

 

¦ Microsoft.MasterDataServices.Services.DataContracts Contains classes to

represent models and model objects

 

¦ Microsoft.MasterDataServices.Services.MessageContracts Contains classes to

represent requests and responses resulting from MDS operations

 

¦ Microsoft.MasterDataServices.Services.ServiceContracts Contains an interface

that defines the service contract for MDS operations based on WCF related to

business rules, master data, metadata, and security

 

NOTE For more information about the MDS class libraries, refer to the “Master Data Services

Class Library” topic in SQL Server 2008 R2 Books Online at http://msdn.microsoft.com

/en-us/library/ee638492(SQL.105).aspx.

 

Master Data Services Web Service

 

MDS includes a Web services API as an option for creating custom applications that integrate

MDS with an organization’s existing applications and processes. This API provides access to

the master data model definitions, as well as to the master data itself. For example, by using

this API, you can completely replace the Master Data Manager Web application.

 

TIP For a code sample that shows how to use the Web service in a client application,

see the following blog entry by Val Lovicz, Principal Program Manager on the MDS team:

“Getting Started with the Web Services API in SQL Server 2008 R2 Master Data Services,”

at http://sqlblog.com/blogs/mds_team/archive/2010/01/12/getting-started-with-the-webservices-

api-in-sql-server-2008-r2-master-data-services.aspx.
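
As a rough illustration of programmatic access, the sketch below opens a WCF channel to the Master Data Manager Web service. The endpoint address, the binding and security settings, and the IService contract name are assumptions about a default installation; the blog posts referenced in this chapter show the complete, supported approach.

using System.ServiceModel;
using Microsoft.MasterDataServices.Services.ServiceContracts;

class MdsClientSketch
{
    static void Main()
    {
        // Assumptions: Windows message security and the default endpoint path
        // under the Master Data Manager Web application. Adjust both to match
        // your configuration in Master Data Services Configuration Manager.
        var binding = new WSHttpBinding(SecurityMode.Message);
        binding.Security.Message.ClientCredentialType = MessageCredentialType.Windows;

        var endpoint = new EndpointAddress("http://mdsserver/MDS/service/service.svc");

        var factory = new ChannelFactory<IService>(binding, endpoint);
        IService serviceProxy = factory.CreateChannel();

        // Use serviceProxy to call metadata, master data, or security operations,
        // then close the factory when you are finished.
        factory.Close();
    }
}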

 

Matching Functions

 

MDS also provides you with several new Transact-SQL functions that you can use to match

and cleanse data from multiple systems prior to loading it into the staging tables:

 

¦ Mdq.NGrams Outputs a stream of tokens (known as a set of n-grams) of the length

specified by n for use in string comparisons to find approximate matches between strings

 

¦ Mdq.RegexExtract Finds matches by using a regular expression

 

¦ Mdq.RegexIsMatch Indicates whether a regular expression finds a match in an
input string

 

 

 

¦ Mdq.RegexIsValid Indicates whether the regular expression is valid

 

¦ Mdq.RegexMask Converts a set of regular expression option flags into a binary

value

 

¦ Mdq.RegexMatches Finds all matches of a regular expression in an input string

 

¦ Mdq.RegexReplace Replaces matches of a regular expression in an input string

with a different string

 

¦ Mdq.RegexSplit Splits an input string into an array of strings based on the positions

of a regular expression within the input string

 

¦ Mdq.Similarity Returns a similarity score between two strings using a specified

matching algorithm

 

¦ Mdq.SimilarityDate Returns a similarity score between two date values

 

¦ Mdq.Split Splits an input string into an array of strings using specified characters as

a delimiter

 

NOTE For more information about the MDS functions, refer to the “Master Data

Services Functions (Transact-SQL)” topic in SQL Server 2008 R2 Books Online at

http://msdn.microsoft.com/en-us/library/ee633712(SQL.105).aspx.
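
To give a rough idea of how these functions are used during data profiling, the sketch below executes one of them through ADO.NET before staging a candidate value. The mdq schema name is taken from the list above, but the exact parameter list for Mdq.RegexIsMatch (and the mask value of 0) is an assumption; confirm the signature in the Master Data Services Functions reference linked above.

using System.Data.SqlClient;

public static class MatchingFunctionSketch
{
    // Tests whether a source value looks like a valid product code before it
    // is staged. The regular expression is illustrative sample data only.
    public static bool LooksLikeProductCode(string connectionString, string candidate)
    {
        const string sql =
            "SELECT mdq.RegexIsMatch(@candidate, N'^[A-Z]{2}-[A-Z0-9]+-?[0-9]*$', 0);";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@candidate", candidate);
            connection.Open();
            return (bool)command.ExecuteScalar();
        }
    }
}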

 

 

 


 

CHAPTER 8

Complex Event Processing with StreamInsight

 

Microsoft SQL Server StreamInsight is a complex event processing (CEP) engine. This

technology is a new offering in the SQL Server family, making its first appearance

in SQL Server 2008 R2. It ships with the Standard, Enterprise, and Datacenter editions of

SQL Server 2008 R2. StreamInsight is both an engine built to process high-throughput

streams of data with low latency and a Microsoft .NET Framework platform for developers

of CEP applications. The goal of a CEP application is to rapidly aggregate high

volumes of raw data for analysis as it streams from point to point. You can apply analytical

techniques to trigger a response upon crossing a threshold or to find trends or exceptions

in the data without first storing it in a data warehouse.

 

Complex Event Processing

 

Complex event processing is the task of sifting through streaming data to find meaningful

information. It might involve performing calculations on the data to derive information,

or the information might be the revelation of significant trends. As a development platform,

StreamInsight can support most types of CEP applications that you might need.

 

Complex Event Processing Applications

 

There are certain industries that regularly produce high volumes of streaming data.

Manufacturing and utilities companies use sensors, meters, and other devices to monitor

processes and alert users when the system identifies events that could lead to a potential

failure. Financial trading firms must monitor market prices for stocks, commodities,

and other financial instruments and rapidly calculate profits or losses based on changing

conditions.

 

 

 


 

Similarly, there are certain types of applications that benefit from the ability to analyze

data as close as possible to the time that the applications capture the data. For example,

companies selling products online often use clickstream analysis to change the page layout

and site navigation and to display targeted advertising while a user remains connected to a

site. Credit card companies monitor transactions for exceptions to normal spending activities

that could indicate fraud.

 

The challenge with CEP arises when you need to process and analyze the data before

you have time to perform ETL activities to move the data into a more traditional analytical

environment, such as a data warehouse. In CEP applications, the value of the information

derived from low-latency processing, defined in milliseconds, can be extremely high. This

value begins to diminish as the data ages. Adding to the challenge is the rate at which source

applications generate data, often tens of thousands of records per second.

 

StreamInsight Highlights

 

StreamInsight’s CEP server includes a core engine that is built to process high-throughput

data. The engine achieves high performance by executing highly parallel queries and using

in-memory caches to avoid incurring the overhead of storing data for processing. The engine

can handle data that arrives at a steady rate or in intermittent bursts, and can even rearrange

data that arrives out of sequence. Queries can also incorporate nonstreaming data sources,

such as master reference data or historical data maintained in a data warehouse.

 

You write your CEP applications using a .NET language, such as Visual Basic or C#, for rapid

application development. In your applications, you embed declarative queries using Language

Integrated Query (LINQ) expressions to process the data for analysis.

 

StreamInsight also includes other tools for administration and development support. The

CEP server has a management interface and diagnostic views that you can use to develop

applications to monitor StreamInsight. For development support, StreamInsight includes an

event flow debugger that you can use to troubleshoot queries. An example of a situation that

might require troubleshooting is the arrival of a larger number of events than expected.

 

StreamInsight Architecture

 

As with any new technology, you will find it helpful to have an understanding of the StreamInsight

architecture before you begin development of your first CEP application. Your application

must restructure data streams to a format usable by the processing engine. You use

adapters to perform this restructuring before passing the data to queries that run on the CEP

server. The way you choose to develop your application also depends on the deployment

model you use to implement StreamInsight.

 

 

 


 

Data Structures

 

The high-throughput data that StreamInsight requires is known as a stream. More specifically,

a stream is a collection of data that changes over time. For example, a Web log contains

data about each server hit, including the date, time, page request, and Internet protocol (IP)

address of the visitor. If a visitor clicks on several pages in the Web site, the Web log contains

multiple lines, or hits, for the same visitor, and each line records a different time. The information

in the Web log shows how each user’s activity in a Web site changes over time, which

is why this type of information is considered a stream. You can query this stream to find the

average number of hits or the top five referring sites over time.

 

StreamInsight splits a stream into individual units called events. An event contains a header

and a payload. The event header includes the event kind and one or more timestamps for the

event. The event kind is an indicator of a new event or the completeness of events already in

the stream. The payload contains the event’s data as a .NET data structure.

 

There are three types of event models that StreamInsight uses. The interval event model

represents events with a fixed duration, such as a stock bid price that is valid only for a certain

period of time. The edge event model is another type of duration model, but it represents an

event with a duration that is unknown at the time the event starts, such as a Web user session.

The point model represents events that occur at a specific point in time, such as a Web user’s

click entry in a Web log.

 

The CEP Server

 

The CEP server is a run-time engine and a set of adapter instances that receive and send

events, as shown in Figure 8-1. You develop these adapters in a .NET language and register

the assemblies on the CEP server, which then instantiates the adapters at run time. Input

adapters receive data as a continuous stream from event stores, such as sensors on a factory

floor, Web servers, data feeds, or databases. The data passes from the input adapter to the

CEP engine, which processes and transforms the data by using standing queries, which are

query instances that the CEP engine manages. The engine then forwards the query results

to output adapters, which connect to event consumers, such as pagers, monitoring devices,

dashboards, and databases. The output adapters can also include logic to trigger a response

based on the query results.

 

 

 

FIGURE 8-1 StreamInsight architecture (event sources such as devices, sensors, Web servers, data feeds, and
databases feed input adapters; standing queries in the CEP engine transform the events; output adapters deliver
results to event targets such as pagers, monitoring devices, KPI dashboards, and event stores)

 

Input Adapters

 

The input adapters translate the incoming events into the event format that the CEP engine

requires. You can create a typed adapter if the source produces a single event type only, but

you must create an untyped adapter when the payload format differs across events or is unknown

in advance. In the case of the typed adapter, the payload format is defined in advance

with a static number of fields and data types when you implement the adapter. By contrast,

an untyped adapter receives the payload format only when the adapter binds to the query (as

part of a configuration specification). In the latter case, the number of fields and data types

can vary with each query instantiation.

 

 

 

Output Adapters

 

The output adapters reverse the operations of the input adapters by translating events into a

format that is usable by the target device and then sending the translated data to the device.

The development process for an output adapter is very similar to the process you use to

develop an input adapter.

 

Query Instances

 

Standing queries receive the stream of data from an input adapter, apply business logic to the

data (such as an aggregation), and send the results as an event stream to an output adapter.

You encapsulate the business logic used by a standing query instance in a query template

that you develop using a combination of LINQ and a .NET language. To create the standing

query instance in the CEP server, you bind a query template with specific input and output.

You can use the same query template with multiple standing queries. After you instantiate a

query, you can start, stop, or manage it.

 

Deployment Models

 

You have two options for deploying StreamInsight. You can integrate the CEP server into an

application as a hosted assembly, or you can deploy it as a standalone server.

 

Hosted Assembly

 

Embedding the CEP server into a host application is a simple deployment approach. You

have greater flexibility than you would have with a standalone server because there are no

dependencies between applications that you must consider before making changes. Each

application and the CEP server run as a single process, which may be easier to manage on

your server.

 

You can use any of the development approaches described later, in the “Application Development”

section of this chapter, when hosting the CEP server in your application. However, if

you decide later that you want your application to run on a standalone server, you will need

to rewrite your application using the explicit server development model.

 

Standalone Server

 

You should deploy the CEP server as a standalone server when applications need to share

event streams or metadata objects. For example, you can reuse event types, adapter types,

and query templates and thereby minimize the impact of changes to any of these metadata

objects across applications by maintaining a single copy. You can run the CEP server as an

executable, or you can configure it as a Windows service. If you want to run it as a service

application, you can use StreamInsightHost.exe as a host process or develop your own host

process.

 

 

 

If you choose to deploy CEP as a standalone server, there are some limitations that affect

the way you develop applications. First, you can use only the explicit server development

model (which is described in the next section of this chapter) when developing CEP applications

for a standalone server. Second, you must connect to the CEP server by using the Web

service Uniform Resource Identifier (URI) of the CEP server host process.
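
To make the two deployment options concrete, here is a minimal sketch that either creates an embedded CEP server in-process or connects to a standalone instance through its management Web service. The instance name ("Default") and the endpoint URI are assumptions about a default installation; use the values chosen when StreamInsight was installed.

using System.ServiceModel;
using Microsoft.ComplexEventProcessing;

class DeploymentSketch
{
    static void Main()
    {
        // Hosted assembly: the CEP server runs inside this process.
        using (Server embedded = Server.Create("Default"))
        {
            Application app = embedded.CreateApplication("cepSample");
            // Register adapters and query templates, then bind and start queries.
        }

        // Standalone server: connect through the host process's Web service URI.
        using (Server remote = Server.Connect(
            new EndpointAddress("http://localhost/StreamInsight/Default")))
        {
            // Reuse shared event types, adapter types, and query templates here.
        }
    }
}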

 

Application Development

 

You start the typical development cycle for a new CEP application by sampling the existing

data streams and developing functions to process the data. You then test the functions, review

the results, and determine the changes necessary to improve the functions. This process

continues in an iterative fashion until you complete development.

 

As part of the development of your CEP application, you create event types, adapters, and

query templates. The way you use these objects depends on the development model you

choose. When you develop using the explicit server development model, you explicitly create

and register all of these objects and can reuse these objects in multiple applications. In the

implicit server development model, you concentrate on the development of the query logic

and rely on the CEP server to act as an implicit host and to create and register the necessary

objects.

 

TIP You can locate and download sample applications by searching for StreamInsight at

CodePlex (http://www.codeplex.com).

 

Event Types

 

An event type defines events published by the event source or consumed by the event consumer.

You use event types with a typed adapter or as objects in LINQ expressions that you

use in query templates. You create an event type as a .NET Framework class or structure by

using only public fields and properties as the payload fields, like this:

 

public class sampleEvent
{
    public string eventId { get; set; }
    public double eventValue { get; set; }
}

 

An event type can have no more than 32 payload fields. Payload fields must be only scalar

or elementary CLR types. You can use nullable types, such as int? instead of int. The string and

byte[] types are always nullable.
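
For instance, a payload that allows a missing measurement can declare that field as a nullable type, as in this small sketch (the class and field names are illustrative):

// Illustrative event type: at most 32 payload fields, scalar CLR types only.
// double? marks an optional measurement; string fields are always nullable.
public class SensorReading
{
    public string sensorId { get; set; }
    public double? temperature { get; set; }
}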

 

You do not create an event type when your application uses untyped adapters for scenarios

that must support multiple event types. For example, an input adapter for tables in a SQL

 

 

 

Server database must adapt to the schema of the table that it queries. Instead, you provide

the table schema in a configuration specification when the adapter is bound to the query.

Conversely, an untyped output adapter receives the event type description, which contains

a list of fields, when the query starts. The untyped output adapter must then map the event

type to the schema of the destination data source, typically in a configuration specification.

 

Adapters

 

Input and output adapters provide transformation interfaces between event sources, event

consumers, and the CEP server. Event sources can push events to event consumers, or event

consumers can pull events from event sources. Either way, the CEP application operates

between these two points and intercepts the events for processing. The input adapter reads

events from the source, transforms them into a format recognizable by the CEP server, and

provides the transformed events to a standing query. As the CEP server processes the event

stream, the output adapter receives the resulting new events, transforms them for the event

consumers, and then delivers the transformed events.

 

Before you can begin developing an adapter, you must know whether you are building

an input or output adapter. You must also know the event type, which in this context means

you must understand the structure of the event payload and how the application timestamps

affect stream processing. The .NET class or structure of the event type provides you with

information about the event payload if you are building a typed adapter. The information

necessary for the management of stream processing, known as event metadata, comes from

an interface in the adapter API when it creates an event. In addition to knowing the event

payload and event metadata, you must also know whether the shape of the event is a point,

interval, or edge model. Having this information available allows you to choose the applicable

base class. The adapter base classes are listed in Table 8-1.

 

TABLE 8-1 Adapter base classes

 

ADAPTER TYPE AND EVENT MODEL    INPUT ADAPTER BASE CLASS       OUTPUT ADAPTER BASE CLASS
Typed point                     TypedPointInputAdapter         TypedPointOutputAdapter
Untyped point                   PointInputAdapter              PointOutputAdapter
Typed interval                  TypedIntervalInputAdapter      TypedIntervalOutputAdapter
Untyped interval                IntervalInputAdapter           IntervalOutputAdapter
Typed edge                      TypedEdgeInputAdapter          TypedEdgeOutputAdapter
Untyped edge                    EdgeInputAdapter               EdgeOutputAdapter

 

 

 

 

 

If you are developing an untyped input adapter, you must ensure that it can use the configuration

specification during query bind time to determine the event’s field types by inference

from the query’s SELECT statement. You must also add code to the adapter to populate

 

 

 

the fields one at a time and enqueue the event. The untyped output adapter works similarly,

but instead it must be able to use the configuration specification to retrieve query processing

results from a dequeued event.

 

The next step is to develop an AdapterFactory object as a container class for your input

and output adapters. You use an AdapterFactory object to share resources between adapter

implementations and to pass configuration parameters to adapter constructors. Recall that

an untyped adapter relies on the configuration specification to properly handle an event’s

payload structure. The adapter factory must implement the Create() and Dispose() methods

as shown in the following code example, which shows how to create adapters for events in a

text file:

 

public class TextFileInputFactory : IInputAdapterFactory<TextFileInputConfig>
{
    // Creates the adapter that matches the event shape requested at query bind time.
    public InputAdapterBase Create(TextFileInputConfig configInfo,
        EventShape eventShape, CepEventType cepEventType)
    {
        InputAdapterBase adapter = default(InputAdapterBase);
        if (eventShape == EventShape.Point)
        {
            adapter = new TextFilePointInput(configInfo, cepEventType);
        }
        else if (eventShape == EventShape.Interval)
        {
            adapter = new TextFileIntervalInput(configInfo, cepEventType);
        }
        else if (eventShape == EventShape.Edge)
        {
            adapter = new TextFileEdgeInput(configInfo, cepEventType);
        }
        else
        {
            throw new ArgumentException(
                string.Format(CultureInfo.InvariantCulture,
                    "TextFileInputFactory cannot instantiate adapter with event shape {0}",
                    eventShape.ToString()));
        }
        return adapter;
    }

    public void Dispose()
    {
    }
}

 

 

 

The final step is to create a .NET assembly for the adapter. At minimum, the adapter

includes a constructor, a Start() method, a Resume() method, and either a ProduceEvents() or

ConsumeEvents() method, depending on whether you are developing an input adapter or an

output adapter. You can see the general structure of the adapter class in the following code

example:

 

public class TextFilePointInput : PointInputAdapter
{
    public TextFilePointInput(TextFileInputConfig configInfo,
        CepEventType cepEventType)
    { … }

    public override void Start()
    { … }

    public override void Resume()
    { … }

    private void ProduceEvents()
    { … }
}

 

Using the constructor method for an untyped adapter, such as TextFilePointInput as in the

example, you can pass the configuration parameters from the adapter factory and the event

type object that passes from the query binding. The constructor also includes code to connect

to the event source and to map fields to the event payload. After the CEP server instantiates

the adapter, it invokes the Start() method, which generally calls the ProduceEvents()

or ConsumeEvents()method to begin receiving streams. The Resume() method invokes the

ProduceEvents() or ConsumeEvents() method again if the CEP server paused the streaming

and confirms that the adapter is ready.

 

The core transformation and queuing of events occurs in the ProduceEvents() method. This

method iterates through either reading the events it is receiving from the source or writing

events it is sending to the event consumer. It pushes or pulls events into or from the event
stream as necessary by calling Enqueue() or Dequeue(). Calls to Enqueue() and

Dequeue() return the state of the adapter. If Enqueue() returns FULL or Dequeue() returns

EMPTY, the adapter transitions to a suspended state and can no longer produce or consume

events. When the adapter is ready to resume, it calls Ready(), which then causes the server to

call Resume(), and the cycle of enqueuing and dequeuing begins again from the point in time

at which the adapter was suspended.
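
The following sketch shows what this enqueue loop can look like for a typed point input adapter that produces the sampleEvent payload defined earlier. The member names used here (CreateInsertEvent, Enqueue, ReleaseEvent, Ready, Stopped, and the AdapterState property) follow the StreamInsight adapter samples; treat them as assumptions to verify against the SDK rather than as production code.

using System;
using Microsoft.ComplexEventProcessing;
using Microsoft.ComplexEventProcessing.Adapters;

public class SampleEventPointInput : TypedPointInputAdapter<sampleEvent>
{
    public override void Start()
    {
        ProduceEvents();
    }

    public override void Resume()
    {
        ProduceEvents();
    }

    private void ProduceEvents()
    {
        while (AdapterState != AdapterState.Stopping)
        {
            PointEvent<sampleEvent> evt = CreateInsertEvent();
            if (evt == null)
            {
                // No event could be allocated; signal readiness and wait for Resume().
                Ready();
                return;
            }

            evt.StartTime = DateTimeOffset.Now;
            evt.Payload = new sampleEvent { eventId = "demo", eventValue = 42.0 };

            // A FULL result suspends the adapter; release the event, signal
            // readiness, and wait for the server to call Resume().
            if (Enqueue(ref evt) == EnqueueOperationResult.Full)
            {
                ReleaseEvent(ref evt);
                Ready();
                return;
            }
        }
        Stopped();
    }
}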

 

 

 

Another task the adapter must perform is classification of an event. That is, the adapter

must specify the event kind as either INSERT or Current Time Increment (CTI). The adapter

adds events with the INSERT event kind to the stream as it receives data from the source. It

uses the CTI event kind to ignore any additional INSERT events it receives afterward that have

a start time earlier than the timestamp of the CTI event.

 

Query Templates

 

Query templates encapsulate the business logic that the CEP server instantiates as a standing

query instance to process, filter, and aggregate event streams. To define a query template,

you first create an event stream object. In a standalone server environment, you can create

and register a query template as an object on the CEP server for reuse.

 

The Event Stream Object

 

You can create an event stream object from an unbound stream or a user-defined input

adapter factory.

 

You might want to develop a query template to register on the CEP server without binding

it to an adapter. In this case, you can use the Create() method of the EventStream class to

obtain an event stream that has a defined shape, but without binding information. To do this,

you can adapt the following code:

 

CepStream<PayloadType> inputStream = CepStream<PayloadType>.Create("inputStream");

 

If you are using the implicit server development model, you can create an event stream

object from an input adapter factory and an input configuration. With this approach, you

do not need to implement an adapter, but you must specify the event shape. The following

example illustrates the syntax to use:

 

CepStream<PayloadType> inputStream =
    CepStream<PayloadType>.Create(streamName, typeof(AdapterFactory), myConfig,
        EventShape.Point);

 

The QueryTemplate Object

 

When you use the explicit server development model for standalone server deployment,

you can create a QueryTemplate object that you can reuse in multiple bindings with different

input and output adapters. To create a QueryTemplate object, you use code similar to the

following example:

 

QueryTemplate myQueryTemplate = application.CreateQueryTemplate("myQueryTemplate",
    outputStream);

 

 

 

Queries

 

After you create an event stream object, you write a LINQ expression on top of the event

stream object. You use LINQ expressions to define the fields for output events, to filter events

before query processing, to group events into subsets, and to perform calculations, aggregations,

and ranking. You can even use LINQ expressions to combine events from multiple

streams through join or union operations. Think of LINQ expressions as the questions you ask

of the streaming data.
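
For example, grouping pairs naturally with the windows covered later in this chapter. The following sketch, which assumes the input payload has region and value fields, counts events per region in five-minute tumbling windows; it is illustrative rather than taken from the product samples.

var outputStream = from e in inputStream
                   group e by e.region into regionGroup
                   from eventWindow in regionGroup.TumblingWindow(TimeSpan.FromMinutes(5))
                   select new { region = regionGroup.Key, count = eventWindow.Count() };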

 

Projection

 

The projection operation, which occurs in the select clause of the LINQ expression, allows you

to add more fields to the payload or apply calculations to the input event fields. You then

project the results into a new event by using field assignments. You can create a new event

type implicitly in the expressions, or you can refer to an existing event type explicitly.

 

Consider an example in which you need to increment the fields x and y from every event in

the inputStream stream by one. The following code example shows how to use field assignments

to implicitly define a new event type by using projection:

 

var outputStream = from e in inputStream

select new {x = e.x + 1, y = e.y + 1};

 

To refer to an existing event type, you cannot use the type’s constructor; you must use

field assignments in an expression. For example, assume you have an existing event type

called myEventType. You can change the previous code example as shown here to reference

the event type explicitly:

 

var outputStream = from e in inputStream

select new myEventType {x = e.x + 1, y = e.y + 1};

 

Filtering

 

You use a filtering operation on a stream when you want to apply operations to a subset of

events and discard all other events. All events for which the expression in the where clause

evaluates as true pass to the output stream. In the following example, the query selects

events where the value in field x equals 5:

 

var outputStream = from e in inputStream

where e.x == 5

select e;

 

 

 

Event Windows

 

A window represents a subset of data from an event stream for a period of time. After you

create a stream of windows, you can perform aggregation, TopK (a LINQ operation described

later in this chapter), or user-defined operations on the events that the windows contain. For

example, you can count the number of events in each window.

 

You might be inclined to think of a window as a way to partition the event stream by time.

However, the analogy between a window and a partition is useful only up to a point. When

you partition records in a table, a record belongs to one and only one partition, but an event

can appear in multiple windows based on its start time and end time. That is, the window

that covers the time period that includes an event’s start time might not include the event’s

end time. In that case, the event appears in each subsequent window, with the final window

covering the period that includes the event’s end time. Therefore, you should instead think of

a window as a way to partition time that is useful for performing operations on events occurring

between the two points of time that define a window.

 

In Figure 8-2, each unlabeled box below the input stream represents a window and contains

multiple events for the period of time that the window covers. In this example, the input

stream contains three events, but the first three windows contain two events and the last

window contains only one event. Thus, a count aggregation on each window yields results

different from a count aggregation on an input stream.

 

FIGURE 8-2 Event windows in an input stream

 

 

 

As you might guess, the key to working with windows is to have a clear understanding of

the time span that each window covers. There are three types of window streams that StreamInsight

supports—hopping windows, snapshot windows, and count windows. In a hopping

windows stream, each window spans an equal time period. In a snapshot windows stream,

the size of a window depends on the events that it contains. By contrast, the size of a window in a count windows stream is not fixed in time but is determined by a specified number of consecutive event start times.

 

To create a hopping window, you specify both the time span that the window covers (also

known as window size) and the time span between the start of one window and the start of

the next window (also known as hop size). For example, assume that you need to create windows

that cover a period of one hour, and a new window starts every 15 minutes, as shown in

Figure 8-3. In this case, the window size is one hour and the hop size is 15 minutes. Here is the

code to create a hopping windows stream and count the events in each window:

 

var outputStream = from eventWindow in

inputStream.HoppingWindow(TimeSpan.FromHours(1), TimeSpan.FromMinutes(15))

select new { count = eventWindow.Count() };

 

FIGURE 8-3 Hopping windows

 

When there are no gaps and there is no overlap between the windows in the stream,

hopping windows are also called tumbling windows. Figure 8-2, shown earlier, provides an

example of tumbling windows. The window size and hop size are the same in a tumbling

 

 

 

windows stream. Although you can use the HoppingWindow method to create tumbling windows, StreamInsight also provides a dedicated TumblingWindow method. The following code illustrates how to count events in tumbling windows that occur every half hour:

 

var outputStream = from eventWindow in

inputStream.TumblingWindow(TimeSpan.FromMinutes(30))

select new { count = eventWindow.Count() };

 

Snapshot windows are similar to tumbling windows in that the windows do not overlap,

but whereas fixed points in time determine the boundaries of a tumbling window, events

define the boundaries of a snapshot window. Consider the example in Figure 8-4. At the start

of the first event, a new snapshot window starts. That window ends when the second event

starts, and a second snapshot window starts and includes both the first and second event.

When the first event ends, the second snapshot also ends, and a third snapshot window

starts. Thus, the start and stop of an event trigger the start and stop of a window. Because
events determine the size of the window, the Snapshot method takes no arguments, as shown in
the following code, which counts events in each window:

 

var outputStream = from eventWindow in inputStream.Snapshot()

select new { count = eventWindow.Count() };

 

FIGURE 8-4 Snapshot windows

 

 

 

Count windows are completely different from the other window types because the size of

the windows is variable. When you create windows, you provide a parameter n as a count of

events to fulfill within a window. For example, assume n is 2 as shown in Figure 8-5. The first

window starts when the first event starts and ends when the second event starts, because a

count of 2 events fulfills the specification. The second event also resets the counter to 1 and

starts a new window. The third event increments the counter to 2, which ends the second window.
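
The following sketch counts the events in each count window of two events. It assumes the CountByStartTimeWindow method and its CountWindowOutputPolicy argument from the StreamInsight API, so treat it as illustrative rather than definitive:

var outputStream = from eventWindow in
inputStream.CountByStartTimeWindow(2, CountWindowOutputPolicy.PointAlignToWindowEnd)
select new { count = eventWindow.Count() };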

 

FIGURE 8-5 Count windows (n = 2)

 

Aggregations

 

You cannot perform aggregation operations on event streams directly; instead you must first

create a window to group data into periods of time that you can then aggregate. You then

create an aggregation as a method of the window and, for all aggregations except Count, use

a lambda expression to assign the result to a field.

 

StreamInsight supports the following aggregation functions:

 

■ Avg
■ Sum
■ Min
■ Max
■ Count

 

 

 

Assume you want to apply the Sum and Avg aggregations to field x in an input stream. The

following example shows you how to use these aggregations as well as the Count aggregation

for each snapshot window:

 

var outputStream = from eventWindow in inputStream.Snapshot()

select new { sum = eventWindow.Sum(e => e.x),

avg = eventWindow.Avg(e => e.x),

count = eventWindow.Count() };

 

TopK

 

A special type of aggregation is the TopK operation, which you use to rank and filter events in

an ordered window stream. To order a window stream, you use the orderby clause. Then you

use the Take method to specify the number of events that you want to send to the output

stream, discarding all other events. The following code shows how to produce a stream of the

top three events:

 

var outputStream = (from eventWindow in inputStream.Snapshot()

from e in eventWindow

orderby e.x ascending, e.y descending

select e).Take(3);

 

When you need to include the rank in the output stream, you use projection to add the

rank to each event’s payload. This is accessible through the Payload property, as shown in the

following code:

 

var outputStream = (from eventWindow in inputStream.Snapshot()

from e in eventWindow

orderby e.x ascending, e.y descending

select e).Take(3, e=> new { x = e.Payload.x, y = e.Payload.y, rank = e.Rank });

 

Grouping

 

When you want to compute operations on event groups separately, you add a group by

clause. For example, you might want to produce an output stream that aggregates the input

stream by location and compute the average for field x for each location. In the following

example, the code illustrates how to create the grouping by location and how to aggregate

events over a specified column:

 

var outputStream = from e in inputStream
group e by e.locationID into eachLocation
from eventWindow in eachLocation.Snapshot()
select new { avgValue = eventWindow.Avg(e => e.x), locationId = eachLocation.Key };

 

 

 

Joins

 

You can use a join operation to match events from two streams. The CEP server first matches

events only if they have overlapping time intervals, and then applies the conditions that you

specify in the join predicate. The output of a join operation is a new event that combines payloads

from the two matched events. Here is the code to join events from two input streams,

where field x is the same value in each event. This code creates a new event containing fields

x and y from the first event and field y from the second event.

 

var outputStream = from e1 in inputStream1

join e2 in inputStream2

on e1.x equals e2.x

select new { e1.x, e1.y, e2.y };

 

Another option is to use a cross join, which combines all events in the first input stream

with all events in the second input stream. You specify a cross join by using a from clause for

each input stream and then creating a new event that includes fields from the events in each

stream. By adding a where clause, you can filter the events in each stream before the CEP

server performs the cross join. The following example selects events with a value for field x

greater than 5 from the first stream and selects events with a value for field y less than 20

from the second stream, performs the cross join, and then creates a stream of new events

containing field x from the first event and field y from the second event:

 

var outputStream = from e1 in inputStream1

from e2 in inputStream2

where e1.x > 5 && e2.y < 20

select new { e1.x, e2.y };

 

Unions

 

You can also combine events from multiple streams by performing a union operation. You

can work with only two streams at a time, but you can cascade a series of union operations if

you need to combine events from three or more streams, as shown in the following code:

 

var outputStreamTemp = inputStream1.Union(inputStream2);

var outputStream = outputStreamTemp.Union(inputStream3);

 

User-defined Functions

 

When you need to perform an operation that the CEP server does not natively support, you

can create user-defined functions (UDFs) by reusing existing .NET functions. You add a UDF to

the CEP server in the same way that you add an adapter. You can then call the UDF anywhere

in your query where an expression can be used, such as in a filter predicate, a join predicate,

or a projection.
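
As an illustrative sketch (the class, method, and payload field names here are hypothetical), a UDF is simply a public static .NET method that you call inside the query expression:

public static class TemperatureFunctions
{
    // Any public static method in an assembly that the CEP server can load
    // can be called from a query expression as a user-defined function.
    public static double ToFahrenheit(double celsius)
    {
        return (celsius * 9.0 / 5.0) + 32.0;
    }
}

// Call the UDF wherever an expression is allowed, such as a filter or a projection.
var outputStream = from e in inputStream
where TemperatureFunctions.ToFahrenheit(e.Celsius) > 100.0
select new { e.SensorId, Fahrenheit = TemperatureFunctions.ToFahrenheit(e.Celsius) };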

 

 

 

Query Template Binding

 

The method that the CEP server uses to instantiate the query template as a standing query

depends on the development model that you use. If you are using the explicit server development

model, you create a query binder object, but you create an event stream consumer

object if you are using the implicit server development model.

 

The Query Binder Object

 

In the explicit server development model, you first create explicit input and output adapter

objects. Next you create a query binder object as a wrapper for the query template object on

the CEP server, which in turn you bind to the input and output adapters, and then you call the

CreateQuery() method to create the standing query, as shown here:

 

QueryBinder myQuerybinder = new QueryBinder(myQueryTemplate);
myQuerybinder.BindProducer("querySource", myInputAdapter, inputConf,
EventShape.Point);
myQuerybinder.AddConsumer("queryResult", myOutputAdapter, outputConf,
EventShape.Point, StreamEventOrder.FullyOrdered);
Query myQuery = application.CreateQuery("query", myQuerybinder, "query description");

 

Rather than enqueuing CTIs in the input adapter code, you can define the CTI behavior by

using the AdvanceTimeSettings class as an optional parameter in the BindProducer method.

For example, to send a CTI after every 10 events, set the CTI’s timestamp as the most recent

event’s timestamp, and drop any event that appears later in the stream but has an end timestamp

earlier than the CTI, use the following code:

 

var ats = new AdvanceTimeSettings(10, TimeSpan.FromSeconds(0),
AdvanceTimePolicy.Drop);
queryBinder.BindProducer("querysource", myInputAdapter, inputConf,
EventShape.Interval, ats);

 

The Event Stream Consumer Object

 

After you define the query logic in an application that uses the implicit server development

model, you can use the output adapter factory to create an event stream consumer object.

You can pass this object directly to the CepStream.ToQuery() method without binding the

query template to the output adapter, as you can see in the following example:

 

Query myQuery = outputStream.ToQuery<ResultType>(typeof(MyOutputAdapterFactory),

outputConf, EventShape.Interval, StreamEventOrder.FullyOrdered);

 

 

 

The Query Object

 

In both the explicit and implicit development models, you create a query object. With that

object instantiated, you can use the Start() and Stop() methods. The Start() method instantiates

the adapters using the adapter factories, starts the event processing engine, and calls the

Start() methods for each adapter. The Stop() method sends a message to the adapters that

the query is stopping and then shuts down the query. Your application must include the following

code to start and stop the query object:

 

query.Start();

// wait for signal to complete the query

query.Stop();

 

The Management Interface

 

StreamInsight includes the ManagementService API, which you can use to create diagnostic

views for monitoring the CEP server’s resources and the queries running on the server. Another

option is to use Windows PowerShell to access diagnostic information.

 

Diagnostic Views

 

Your diagnostic application can retrieve static information, such as object property values,

and statistical information, such as a cumulative event count after a particular point in time or

an aggregate count of events from child objects. Objects include the server, input and output

adapters, query operators, schedulers, and event streams. You can retrieve the desired information

by using the GetDiagnosticView() method and passing the object’s URI as a method

argument.
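
For example, a minimal sketch in C# (assuming an existing Server instance named server, and that a DiagnosticView exposes its statistics as name/value pairs) looks like this:

// Retrieve the diagnostic view for the Event Manager and list its statistics.
var eventManagerView = server.GetDiagnosticView(new Uri("cep:/Server/EventManager"));

foreach (KeyValuePair<string, object> metric in eventManagerView)
{
    Console.WriteLine("{0} = {1}", metric.Key, metric.Value);
}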

 

If you are monitoring queries, you should understand the transition points at which the

server records metrics about events in a stream. The name of a query metric identifies the

transition point to which the metric applies. For example, Total Outgoing Event Count provides

the total number of events that the output adapter has dequeued from the engine. The

following four transition points relate to query metrics:

 

■ Incoming  The event arrival at the input adapter
■ Consumed  The point at which the input adapter enqueues the event into the engine
■ Produced  The point at which the event leaves the last query operator in the engine
■ Outgoing  The event departure from the output adapter

 

 

 

Windows PowerShell Diagnostics

 

For quick analysis, you can use Windows PowerShell scripts to view diagnostic information

rather than writing a complete diagnostic application. Before you can use a Windows PowerShell

script, the StreamInsight server must be running a query. If the server is running as a

hosted assembly, you must expose the Web service.

 

You start the diagnostic process by loading the Microsoft.ComplexEventProcessing assembly

from the Global Assembly Cache (GAC) into Windows PowerShell by using the following

code:

 

PS C:\>
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.ComplexEventProcessing")

 

Then you need to create a connection to the StreamInsight host process by using the code

in this example:

 

PS C:\> $server =
[Microsoft.ComplexEventProcessing.Server]::Connect("http://localhost/StreamInsight")

 

Then you can use the GetDiagnosticView() method to retrieve statistics for an object, such

as the Event Manager, as shown in the following code:

 

PS C:\> $dv = $server.GetDiagnosticView("cep:/Server/EventManager")

PS C:\> $dv

 

To retrieve information about a query, you must provide the full name, following the

StreamInsight hierarchical naming schema. For example, for an application named myApplication

with a query named myQuery, you use the following code:

 

PS C:\> $dv =
$server.GetDiagnosticView("cep:/Server/Application/myApplication/Query/myQuery")

PS C:\> $dv

 

NOTE For a complete list of metrics and statistics that you can query by using diagnostic

views, refer to the SQL Server Books Online topic “Monitoring the CEP Server and Queries”

at http://msdn.microsoft.com/en-us/library/ee391166(SQL.105).aspx.

 

 

 

CHAPTER 9

Reporting Services Enhancements

 

If you thought Microsoft SQL Server 2008 Reporting Services introduced a lot of great

new features to the reporting platform, just wait until you discover what’s new in

Reporting Services in SQL Server 2008 R2. The Reporting Services development team

at Microsoft has been working hard to incorporate a variety of improvements into the

product that should make your life as a report developer or administrator much simpler.

 

New Data Sources

 

This release supports a few new data sources to expand your options for report development.

When you use the Data Source Properties dialog box to create a new data source,

you see Microsoft SharePoint List, Microsoft SQL Azure, and Microsoft SQL Server Parallel

Data Warehouse (covered in Chapter 6, “Scalable Data Warehousing”) as new options in

the Type drop-down list. To build a dataset with any of these sources, you can use a graphical

query designer or type a query string applicable to the data source provider type.

 

You can also use SQL Server PowerPivot for SharePoint as a data source, although this

option is not included in the list of data source providers. Instead, you use the SQL Server

Analysis Services provider and then provide the URL for the workbook that you want to

use as a data source. You can learn more about using a PowerPivot workbook as a data

source in Chapter 10, “Self-Service Analysis with PowerPivot.”

 

Expression Language Improvements

 

There are several new functions added to the expression language, as well as new capabilities

for existing functions. These improvements allow you to combine data from two

different datasets in the same data region, create aggregated values from aggregated

values, define report layout behavior that depends on the rendering format, and modify

report variables during report execution.

 


 

 

 

Combining Data from More Than One Dataset

 

To display data from more than one source in a table (or in any data region, for that matter),

you must create a dataset that somehow combines the data because a data region binds to

one and only one dataset. You could create a query for the dataset that joins the data if both

sources are relational and accessible with the same authentication. But what if the data comes

from different relational platforms? Or what if some of the data comes from SQL Server and

other data comes from a SharePoint list? And even if the sources are relational, what if you

can access only stored procedures and are unable to create a query to join the sources? These

are just a few examples of situations in which the new Lookup functions in the Reporting

Services expression language can help.

 

In general, the three new functions, Lookup, MultiLookup, and LookupSet, work similarly

by using a value from the dataset bound to the data region (the source) and matching it to

a value in a second dataset (the destination). The difference between the functions reflects

whether the input or output is a single value or multiple values.

 

You use the Lookup function when there is a one-to-one relationship between the source

and destination. The Lookup function matches one source value to one destination value at a

time, as shown in Figure 9-1.

 

FIGURE 9-1 Lookup function results

 

In the example, the resulting report displays a table for the sales data returned for

Dataset2, but rather than displaying the StateProvinceCode field from the same dataset, the

Lookup function in the first column of the table instructs Reporting Services to match each

value in that field from Dataset2 with the StProv field in Dataset1 and then to display the

corresponding StProvName. The expression in the first column of the table is shown here:

 


 

 

 

=Lookup(Fields!StateProvinceCode.Value, Fields!StProv.Value,
Fields!StProvName.Value, "Dataset1")

 

The MultiLookup function also requires a one-to-one relationship between the source and

destination, but it accepts a set of source values as input. Reporting Services matches each

source value to a destination value one by one, and then returns the matching values as an

array. You can then use an expression to transform the array into a comma-separated list, as

shown in Figure 9-2.

 

FIGURE 9-2 MultiLookup function results

 

The MultiLookup function in the second column of the table requires an array of values

from the dataset bound to the table, which in this case is the StateProvinceCode field in

Dataset2. You must first use the Split function to convert the comma-separated list of values

in the StateProvinceCode field into an array. Reporting Services operates on each element

of the array, matching it to the StProv field in Dataset1, and then combining the results into

an array that you can then transform into a comma-separated list by using the Join function.

Here is the expression in the Territory column:

 

=Join(MultiLookup(Split(Fields!StateProvinceCode.Value, ","), Fields!StProv.Value,
Fields!StProvName.Value, "Dataset1"), ", ")

 


 

 

 

When there is a one-to-many relationship between the source and destination values, you

use the LookupSet function. This function accepts a single value from the source dataset as

input and returns an array of matching values from the destination dataset. You could then

use the Join function to convert the result into a delimited string, as in the example for the

MultiLookup function, or you could use other functions that operate on arrays, such as the

Count function, as shown in Figure 9-3.

 

FIGURE 9-3 LookupSet function results

 

The Customer Count column uses this expression:

 

=LookupSet(Fields!SalespersonCode.Value, Fields!SalespersonCode.Value,
Fields!CustomerName.Value, "Dataset2").Length

 

Aggregation

 

The aggregate functions available in Reporting Services since its first release with the SQL

Server 2000 platform provided all the functionality most people needed most of the time.

However, if you needed to use the result of an aggregate function as input for another

aggregate function and weren’t willing or able to put the data into a SQL Server Analysis

Services cube first, you had no choice but to preprocess the results in the dataset query.

In other words, you were required to do the first level of aggregation in the dataset query,

and then you could perform the second level of aggregation by using an expression in the

report. Now, with SQL Server 2008 R2 Reporting Services, you can nest an aggregate function

inside another aggregate function. Put another way, you can aggregate an aggregation.

The example table in Figure 9-4 shows the calculation of average monthly sales for a selected

year. The dataset contains one row for each product, which the report groups by year and by

month while hiding the detail rows.

 

 

 

 

 

FIGURE 9-4 Aggregation of an aggregation

 

Here is the expression for the value displayed in the Monthly Average row:

 

=Avg(Sum(Fields!SalesAmount.Value, "EnglishMonthName"))

 

Conditional Rendering Expressions

 

The expression language in SQL Server 2008 R2 Reporting Services includes a new global

variable that allows you to set the values for “look-and-feel” properties based on the rendering

format used to produce the report. That is, any property that controls appearance (such

as Color) or behavior (such as Hidden) can use members of the RenderFormat global variable

in conditional expressions to change the property values dynamically, depending on the

rendering format.

 

Let’s say that you want to simplify the report layout when a user exports a report to

Microsoft Excel. Sometimes other report items in the report can cause a text box in a data

region to render as a set of merged cells when you are unable to get everything to align

perfectly. The usual reason that users export a report to Excel is to filter and sort the data,

and they are not very interested in the information contained in the other report items.

Rather than fussing with the report layout to get each report item positioned and aligned just

right, you can use an expression in the Hidden property to keep those report items visible in

every export format except Excel. Simply reference the name of the extension as found in the

RSReportServer.config file in an expression like this:

 

=iif(Globals!RenderFormat.Name="EXCEL", True, False)

 

 

 

Another option is to use the RenderFormat global variable with the IsInteractive member

to set the conditions of a property. For example, let’s say you have a report that displays summarized

sales but also allows the user to toggle a report item to display the associated details.

Rather than export all of the details when the export format is not interactive, you can easily

omit those details from the rendered output by using the following expression in the Hidden

property of the row group containing the details:

 

=iif(Globals!RenderFormat.IsInteractive, False, True)

 

Page Numbering

 

Speaking of global variables, you can use the new Globals!OverallPageNumber and

Globals!OverallTotalPages variables to display the current page number relative to the entire

report and the total page count, respectively. You can use these global variables, which are

also known as built-in fields, in page headers and page footers only. As explained later in

this chapter in the “Pagination Properties” section, you can specify conditions under which

to reset the page number to 1 rather than incrementing its value by one. The variables

Globals!PageNumber and Globals!TotalPages are still available from earlier versions. You can

use them to display the page information for the current section of a report. Figure 9-5 shows

an example of a page footer when the four global variables are used together.

 

 

 

FIGURE 9-5 Global variables for page counts

 

The expression to produce this footer looks like this:

 

="Section Page " + CStr(Globals!PageNumber) + " of " + CStr(Globals!TotalPages) +
" (Overall " + CStr(Globals!OverallPageNumber) + " of " +
CStr(Globals!OverallTotalPages) + ")"

 

Read/Write Report Variable

 

Another enhancement to the expression language is the new support for setting the value

of a report variable. Just as in previous versions of Reporting Services, you can use a report

variable when you have a value with a dependency on the execution time. Reporting Services

stores the value at the time of report execution and persists that value as the report continues

to process. That way, as a user pages through the report, the variable remains constant even if

the actual page rendering time varies from page to page.

 

By default, a report variable is Read Only, which was the only option for this feature in the

previous version of Reporting Services. In SQL Server 2008 R2, you can now clear the
Read-Only setting, as shown in Figure 9-6, when you want to be able to change the value of the

report variable during report execution.

 

 

 

 

 

FIGURE 9-6 Changing report variables

 

To write to your report variable, you use the SetValue method of the variable. For example,

assume that you have set up the report to insert a page break between group instances, and

you want to update the execution time when the group changes. Add a report variable to the

report, and then add a hidden text box to the data region with the group used to generate

a page break. Next, place the following expression in the text box to force evaluation of the

expression for each group instance:

 

=Variables!MyVariable.SetValue(Now())

 

In the previous version of Reporting Services, the report variable type was a value just

like any text box on the report. In SQL Server 2008 R2, the report variable can also be a .NET

serializable type. You must initialize and populate the report variable when the report session

begins, then you can independently add or change the values of the report variable on each

page of the report during your current session.

 

Layout Control

 

SQL Server 2008 R2 Reporting Services also includes several new report item properties that

you can use to control layout. By using these properties, you can manage report pagination,

fill in data gaps to align data groupings, and rotate the orientation of text.

 

 

 

Pagination Properties

 

There are three new properties available to manage pagination: Disabled, ResetPageNumber,

and PageName. These properties appear in the Properties window when you select a tablix,

rectangle, or chart in the report body or a group item in the Row Groups or Column Groups

pane. The most common reason you set values for these properties is to define different paging

behaviors based on the rendering format, now that the global variable RenderFormat is

available.

 

For example, assume that you create a tablix that summarizes sales data by year, and

group the data with the CalendarYear field as the outermost row group. When you click the

CalendarYear group item in the Row Groups pane, you can access several properties in the

Properties window, as shown in Figure 9-7. Those properties, however, are not available in the

item’s Group Properties dialog box.

 

 

 

FIGURE 9-7 Pagination properties

 

Assume also that you want to insert page breaks between each instance of CalendarYear

only when you export the report to Excel. After setting the BreakLocation property to Between,

you set the Disabled property to False when the report renders as Excel by using the

following expression:

 

=iif(Globals!RenderFormat.Name="EXCEL",False,True)

 

Reporting Services keeps as many groups visible on one page as possible and adds a soft

page break to the report where needed to keep the height of the page within the dimensions

specified by the InteractiveSize property when the report renders as HTML. However,

when the report renders in any other format, each year appears on a separate page, or on a

separate sheet if the report renders in Excel.

 

Whether or not you decide to disable the page break, you can choose the conditions to

apply to reset the page number when the page break occurs by assigning an expression to

the ResetPageNumber property. To continue with the current example, you can use a similar

conditional expression for the ResetPageNumber property to prevent the page number from

resetting when the report renders as HTML and only allow the reset to occur in all other

formats. Therefore, in HTML format, the page number of the report increments by one as you

page through it, but in other formats (excluding Excel), you see the page number reset each

time a new page is generated for a new year.

 

 

 

Last, consider how you can use the PageName property. As one example, instead of using

page numbers in an Excel workbook, you can assign a unique name to each sheet in the

workbook. You might, for example, use the group expression that defines the page break as

the PageName property. When the report renders as an Excel workbook, Reporting Services

uses the page break definition to separate the CalendarYear groups into different sheets of

the same workbook and uses the PageName expression to assign the group instance’s value

to the applicable sheet.

 

As another example, you can assign an expression to the PageName property of a rectangle,

data region, group, or map. You can then reference the current value of this property

in the page header or footer by using Globals!PageName in the expression. The value

of Globals!PageName is first set to the value of the InitialPageName report property when

report processing begins and then resets as each report item processes if you have assigned

an expression to the report item’s PageName property.
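
For example, continuing the CalendarYear scenario described earlier (an illustrative sketch, not the only possible approach), you might assign the following expression to the PageName property of the CalendarYear group:

=Fields!CalendarYear.Value

A text box in the page footer can then echo the current value:

="Sales for " + CStr(Globals!PageName)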

 

Data Synchronization

 

One of the great features of Reporting Services is its ability to create groups of groups by

nesting one type of report item inside another type of report item. In Figure 9-8, a list that

groups by category and year contains a matrix that groups by month. Notice that the months

in each list group do not line up properly because data does not exist for the first six months

of the year for the Accessories 2005 group. Each monthly group displays independently of

other monthly groups in the report.

 

 

 

FIGURE 9-8 Unsynchronized groups

 

A new property, DomainScope, is available in SQL Server 2008 R2 Reporting Services to fix

this problem. This property applies to a group and can be used within the tablix data region,

as shown in Figure 9-9, or in charts and other data visualizations whenever you need to fill gaps

in data across multiple instances of the same grouping. You simply set the property value to

the name of the data region that contains the group. In this example, the MonthName group’s

DomainScope property is set to Tablix1, which is the name assigned to the list. Each instance of

the list’s group—category and year—renders an identical set of values for MonthName.

 

 

 

 

 

FIGURE 9-9 Synchronized groups

 

Text Box Orientation

 

Each text box has a WritingMode property that by default displays text horizontally. There is

also an option to display text vertically to accommodate languages that display in that format.

Although you could use the vertical layout for other languages, you probably would not

be satisfied with the result because it renders each character from top to bottom. An English

word, for example, would have the bottom of each letter facing left and the top of each letter

facing right. Instead, you can set this property to a new value, Rotate270, which also renders

the text in a vertical layout, but from bottom to top, as shown in Figure 9-10. This feature is

useful for tablix row headers when you need to minimize the width of the tablix.

 

 

 

FIGURE 9-10 Text box orientation

 

 

 

Data Visualization

 

Prior to SQL Server 2008 R2 Reporting Services, your only option for enhancing a report with

data visualization was to add a chart or gauge. Now your options have been expanded to

include data bars, sparklines, indicators, and maps.

 

Data Bars

 

A data bar is a special type of chart that you add to your report from the Toolbox window.

A data bar shows a single data point as a horizontal bar or as a vertical column. Usually you

embed a data bar inside of a tablix to provide a small data visualization for each group or

detail group that the tablix contains. After adding the data bar to the tablix, you configure the

value you want to display, and you can fine-tune other properties as needed if you want to

achieve a certain look. By placing data bars in a tablix, you can compare each group’s value to

the minimum and maximum values within the range of values across all groups, as shown in

Figure 9-11. In this example, Accessories 2005 is the minimum sales amount, and Bikes 2007

is the maximum sales amount. The length of each bar allows you to visually assess whether a

group is closer to the minimum or the maximum or some ratio in between, such as the Bikes

2008 group, which is about half of the maximum sales.

 

 

 

FIGURE 9-11 Data bars

 

 

 

Sparklines

 

Like data bars, sparklines can be used to include a data visualization alongside the detailed

data. Whereas a data bar usually shows a single point, a sparkline shows multiple data points

over time, making it easier to spot trends.

 

You can choose from a variety of sparkline types such as columns, area charts, pie charts,

or range charts, but most often sparklines are represented by line charts. As you can see in

Figure 9-12, sparklines are pretty bare compared to a chart. You do not see axis labels, tick

marks, or a legend to help you interpret what you see. Instead, a sparkline is intended to

provide a sense of direction by showing upward or downward trends and varying degrees of

fluctuation over the represented time period.

 

 

 

FIGURE 9-12 Sparklines

 

Indicators

 

Another way to display data in a report is to use indicators. In previous versions of Reporting

Services, you could produce a scorecard of key performance indicators by uploading

your own images and then using expressions to determine which image to display. Now

you can choose indicators from built-in sets, as shown in Figure 9-13, or you can customize

these sets to change properties such as the color or size of an indicator icon, or even use your own icons.

 

 

 

 

 

FIGURE 9-13 Indicator types

 

After selecting a set of indicators, you associate the set with a value in your dataset or with

an expression, such as a comparison of a dataset value to a goal. You then define the rules

that determine which indicator properly represents the status. For example, you might create

an expression that compares SalesAmount to a goal. You could then assign a green check

mark if SalesAmount is within 90 percent of the goal, a yellow exclamation point if it is within

50 percent of the goal, and a red X for everything else.

 

Maps

 

A map element is a special type of data visualization that combines geospatial data with other

types of data to be analyzed. You can use the built-in Map Gallery as a background for your

data, or you can use an ESRI shapefile. For more advanced customization, you can use SQL

Server spatial data types and functions to create your own polygons to represent geographical

areas, points on a map, or a connected set of points representing a route. Each map can

have one or more map layers, each of which contains spatial data for drawing the map, analytical

data that will be projected onto the map as color-coded regions or markers, and rules

for assigning colors, marker size, and other visualization properties to the analytical data. In

addition, you can add Bing Maps tile layers as a background for other layers in your map.

 

 

 

Although you can manually configure the properties for the map and each map layer, the

easiest way to get started is to drag a map from the Toolbox window to the report body (if

you are using Business Intelligence Development Studio) or click the map in the ribbon (if

you are using Report Builder 3.0). This starts the Map Wizard, which walks you through the

configuration process by prompting you for the source of the spatial data defining the map

itself and the source of the analytical data to display on the map. You then decide how the

report should display this analytical data—by color-coding elements on the map or by using

a bubble to represent data values on the map at specified points. Next, you define the relationship

between the map’s spatial data and the analytical data by matching fields from each

dataset. For example, the datasets for the map shown in Figure 9-14 have matching fields for

the two-letter state codes. In the next step, you specify the field in your analytical data to

display on the map, and you configure the visualization rules to apply, such as color ranges.

In the figure, for example, the rule is to use darker colors to indicate a higher population.

 

 

 

FIGURE 9-14 A map using colors to show population distribution

 

Reusability

 

SQL Server 2008 R2 Reporting Services has several new features to support reusability of

components. Report developers with advanced skills can build shared datasets and report

parts that can be used by others. Then, for example, a business user can quickly and easily

pull together these preconstructed components into a personalized report without knowing

how to build a query or design a matrix. To help the shared datasets run faster, you can

configure a cache refresh schedule to keep a copy of the shared dataset in cache. Last, the

ability to share report data as an Atom data feed extends the usefulness of data beyond a

single source report.

 

 

 

Shared Datasets

 

A shared dataset allows you to define a query once for reuse in many reports, much as you

can create a shared data source to define a reusable connection string. Having shared datasets

available on the server also helps SQL Server 2008 R2 Report Builder 3.0 users develop

reports more easily, because the dataset queries are already available for users who lack the

skills to develop queries without help. The main requirement when creating a shared dataset

is to use a shared data source. In all other respects, the configuration of the shared dataset

is just like the traditional embedded dataset used in earlier versions of Reporting Services.

You define the query and then specify options, query parameter values, calculated fields, and

filters as needed. The resulting file for the shared dataset has an .rsd extension and uploads to

the report server when you deploy the project. The project properties now include a field for

specifying the target folder for shared datasets on the report server.

 

NOTE You can continue to create embedded datasets for your reports as needed, and

you can convert an embedded dataset to a shared dataset at any time.

 

In Report Manager, you can check to see which reports use the shared dataset when you

need to evaluate the impact of a change to the shared dataset definition. Simply navigate to

the folder containing the shared dataset, click the arrow to the right of the shared dataset

name, and select View Dependent Items, as shown in Figure 9-15.

 

 

 

FIGURE 9-15 The shared dataset menu

 

Cache Refresh

 

The ability to configure caching for reports has been available in every release of Reporting

Services. This feature is helpful in situations in which reports take a long time to execute and

the source data is not in a constant state of change. By storing the report in cache, Reporting

 

 

 

Services can respond to a report request faster, and users are generally happier with the

reporting system. However, cache storage is not unlimited. Periodically, the cache expires

and the next person that requests the report has to wait for the report execution process to

complete. A workaround for this scenario is to create a subscription that uses the NULL delivery

provider to populate the cache in advance of the first user’s request.

 

In SQL Server 2008 R2 Reporting Services, a better solution is available. A new feature

called Cache Refresh allows you to establish a schedule to load reports into cache. In addition,

you can configure Cache Refresh to load shared datasets into cache to extend the performance

benefit to multiple reports. Caching shared datasets is not only helpful for reports,

but also for any dataset that you use to populate the list of values for a parameter. To set up

a schedule for the Cache Refresh, you must configure stored credentials for the data source.

Then you configure the caching expiration options for the shared dataset and create a new

Cache Refresh Plan, as shown in Figure 9-16.

 

 

 

FIGURE 9-16 The Cache Refresh Plan window

 

Report Parts

 

After developing a report, you can choose which report items to publish to the report server

as individual components that can be used again later by other report authors who have

permissions to access the published report parts. Having readily accessible report parts in a

central location enables report authors to build new reports more quickly. You can publish

any of the following report items as report parts: tables, matrices, rectangles, lists, images,

charts, gauges, maps, and parameters.

 

 

 

You can publish report parts both from Report Builder 3.0 and Report Designer in Business

Intelligence Development Studio. In Report Designer, the Report menu contains the Publish

Report Parts command. In the Publish Report Parts dialog box, shown in Figure 9-17, you

select the report items that you want to publish. You can replace the report item name and

provide a description before publishing.

 

 

 

FIGURE 9-17 The Publish Report Parts dialog box

 

When you first publish the report part, Reporting Services assigns it a unique identifier

that persists across all reports to which it will be added. Note the option in the Publish Report

Parts dialog box in Report Designer (shown in Figure 9-17) to overwrite the report part on the

report server every time you deploy the report. In Report Builder, you have a different option

that allows you to choose whether to publish the report item as a new copy of the report.

If you later modify the report part and publish the revised version, Reporting Services can

use the report part’s unique identifier to recognize it in another report when another report

developer opens that report for editing. At that time, the report author receives a notification

of the revision and can decide whether to accept the change.

 

 

 

Although you can publish report parts in Report Designer and Report Builder 3.0, you

can only use Report Builder 3.0 to find and use those report parts. More information about

Report Builder 3.0 can be found later in this chapter in the “Report Builder 3.0” section.

 

Atom Data Feed

 

SQL Server 2008 R2 Reporting Services includes a new rendering extension to support

exporting report data to an Atom service document. An Atom service document can be used

by any application that consumes data feeds, such as SQL Server PowerPivot for Excel. You

can use this feature for situations in which the client tools that users have available cannot

access data directly or when the query structures are too complex for users to build on their

own. Although you could use other techniques for delivering data feeds to users, Reporting

Services provides the flexibility to use a common security mechanism for reports and data

feeds, to schedule delivery of data feeds, and to store report snapshots on a periodic basis.

 

The Atom service document contains at least one data feed per data region in the report if

a report author has not disabled this feature. Depending on the structure of the data, a matrix

that contains adjacent groups, a list, or a chart might produce multiple data feeds. Each data

feed has a URL that you use to retrieve the content.

 

To export a report to the Atom data feed, you click the last button on the toolbar in the

Report Viewer, as shown in Figure 9-18.

 

 

 

FIGURE 9-18 Atom Data Feed

 

The Atom service document is an XML document containing a connection to each data

feed that is defined as a URL, as shown in the following XML code:

 

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<service xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:app="http://www.w3.org/2007/app" xmlns="http://www.w3.org/2007/app">
<workspace>
<atom:title>Reseller Sales</atom:title>
<collection
href="http://yourserver/ReportServer?%2fExploring+Features%2fReseller+Sales
&rs%3aCommand=Render&rs%3aFormat=ATOM&rc%3aDataFeed=xAx0x0">
<atom:title>Tablix1</atom:title>
</collection>
</workspace>
</service>

 

 

 

Report Builder 3.0

 

Report Builder 1.0 was the first release of a report development tool targeted for business

users. That version restricted the users to queries based on a report model and supported

limited report layout capabilities. Report Builder 2.0 was released with SQL Server 2008 and

gave the user expanded capabilities for importing queries from other report definition files or

for writing a query on any data source supported by Reporting Services. In addition, Report

Builder 2.0 included support for all layout options of Report Definition Language (RDL).

Report Builder 3.0 is the third iteration of this tool. It supports the new capabilities of SQL

Server 2008 R2 RDL including maps, sparklines, and data bars. In addition, Report Builder 3.0

supports two improvements intended to speed up the report development process—edit

sessions and the Report Part Gallery.

 

Edit Sessions

 

Report Builder 3.0 operates as an edit session on the report server if you perform your development

work while connected to the server. The main benefit of the edit session is to speed

up the preview process and render reports faster. The report server saves cached datasets

for the edit session. These datasets are reused when you preview the report and have made

report changes that affect the layout only. If you know that the data has changed in the

meantime, you can use the Refresh button to retrieve current data for the report. The cache

remains available on the server for two hours and resets whenever you preview the report.

After the two hours have passed, the report server deletes the cache. An administrator can

change this default period to retain the cache for longer periods if necessary.

 

The edit session also makes it easier to work with server objects during report development.

One benefit is the ability to use relative references in expressions. Relative references

allow you to specify the path to subreports, images, and other reports that you might configure

as targets for the Jump To action relative to the current report’s location on the report

server. Another benefit is the ability to test connections and confirm that authentication

credentials work before publishing the report to the report server.

 

The Report Part Gallery

 

Report Builder 3.0 includes a new window, the Report Part Gallery, that you can enable from

the View tab on the ribbon. At the top of this window is a search box in which you can type

a string value, as shown in Figure 9-19, and search for report parts published to the report

server where the name or the description of the report part contains the search string. You

can also search by additional criteria, such as the name of the creator or the date created.

To use the report part, simply drag the item from the list onto the report body. The ability

to find and use report parts is available only within Report Builder 3.0. You can use Report

Designer to create and publish report parts, but not to reuse them in other reports.

 

 

 

 

 

FIGURE 9-19 The Report Part Gallery

 

Report Access and Management

 

In this latest release of Reporting Services, you can benefit from a few enhancements that

improve access to reports and to management operations in Report Manager, along with a new feature that supports sandboxing of the report server environment.

 

Report Manager Improvements

 

When you open Report Manager for the first time, you will immediately notice the improved

look and feel. The color scheme and layout of this Web application had not changed since the

product’s first release, until now. When you open a report for viewing, you notice that more

screen space is allocated to the Report Viewer, as shown in Figure 9-20. All of the space at the

top of the screen has been eliminated.

 

 

 

 

 

FIGURE 9-20 Report Viewer

 

Notice also that the Report Viewer does not include a link to open the report properties.

Rather than requiring you to open a report first and then navigate to the properties pages,

Report Manager gives you direct access to the report properties from a menu on the report

listing page, as shown in Figure 9-21. Another direct access improvement to Report Manager

is the ability to test the connection for a data source on its properties page.

 

 

 

FIGURE 9-21 The report menu

 

 

 

Report Viewer Improvements

 

The display of reports is also improved in the Report Viewer available in this release of SQL

Server, which now supports AJAX (Asynchronous JavaScript and XML). If you are familiar with

earlier versions of Reporting Services, you can see the improvement that AJAX provides by

changing parameters or by using drilldown. The Report Viewer no longer requires a refresh

of the entire screen, nor does it reposition the current view to the top of the report, which

results in a much smoother viewing experience.

 

Improved Browser Support

 

Reporting Services no longer supports just one Web browser, as it did when it was first

released. In SQL Server 2008 R2, you can continue to use Windows Internet Explorer 6, 7, or

8, which is recommended for access to all Report Viewer features. You can also use Firefox,

Netscape, or Safari. However, these browsers do not support the document map, text search

within a report, zoom, or fixed table headers. Furthermore, Safari 3.0 does not support the

Calendar control for date parameters or the client-side print control and does not correctly

display image files that the report server retrieves from a remote computer.

 

If you choose to use a Web browser other than Internet Explorer, you should understand

the authentication support that the alternative browsers provide. Internet Explorer is the only

browser that supports all authentication methods that you can use with Reporting Services—

Negotiated, Kerberos, NTLM, and Basic. Firefox supports Negotiated, NTLM, and Basic, but

not Kerberos authentication. Safari supports only Basic authentication.

 

NOTE Basic authentication is not enabled by default in Reporting Services. You must

modify the RSReportServer.config file by following the instructions in SQL Server Books

Online in the topic “How to: Configure Basic Authentication in Reporting Services” at

http://msdn.microsoft.com/en-us/library/cc281309.aspx.

 

RDL Sandboxing

 

When you grant external users access to a report server, the security risks multiply enormously,

and additional steps must be taken to mitigate those risks. Reporting Services now supports the RDL Sandboxing feature, which you enable through configuration changes on the report server to isolate access to server resources as an important part of a threat mitigation strategy.

Resource isolation is a common requirement for hosted services that have multiple tenants

on the same server. Essentially, the configuration changes allow you to restrict the external

resources that can be accessed by the server, such as images, XSLT files, maps, and data

sources. You can also restrict the types and functions used in expressions by namespace and

by member, and check reports as they are deployed to ensure that the restricted types are

not in use. You can also restrict the text length and the size of an expression’s return value

when a report executes. With sandboxing, reports cannot include custom code in their code

blocks, nor can reports include SQL Server 2005 custom report items or references to named

parameters in expressions. The trace log will capture any activity related to sandboxing and

should be monitored frequently for evidence of potential threats.

 

 

 

SharePoint Integration

 

SQL Server 2008 R2 Reporting Services continues to improve integration with SharePoint. In

this release, you find better options for configuring SharePoint 2010 for use with Reporting

Services, working with scripts to automate administrative tasks, using SharePoint lists as data

sources, and integrating Reporting Services log events with the SharePoint Unified Logging

Service.

 

Improved Installation and Configuration

 

The first improvement affects the initial installation of Reporting Services in SharePoint integrated

mode. Earlier versions of Reporting Services and SharePoint require you to obtain the

Microsoft SQL Server Reporting Services Add-in for SharePoint as a separate download for

installation. Although the add-in remains available as a separate download, the prerequisite

installation options for SharePoint 2010 include the ability to download the add-in and install

it automatically with the other prerequisites.

 

After you have all components installed and configured on both the report server and the

SharePoint server, you need to use SharePoint 2010 Central Administration to configure the

General Application settings for Reporting Services. As part of this process, you can choose

to apply settings to all site collections or to specific sites, which is a much more streamlined

approach to enabling Reporting Services integration than was possible in earlier versions.

 

Another important improvement is the addition of support for alternate access mappings

with Reporting Services. Alternate access mappings allow users from multiple zones, such as

the Internet and an intranet, to access the same report items by using different URLs. You can

configure up to five different URLs to access a single Web application that provides access

to Reporting Services content, with each URL using a different authentication provider. This

functionality is important when you want to use Windows authentication for intranet users

and Forms authentication for Internet users.

 

RS Utility Scripting

 

Report server administrators frequently use the rs.exe utility to perform repetitive administrative

tasks, such as bulk deployment of reports to the server and bulk configuration of report

properties. Lack of support for this utility in integrated mode had been a significant problem

for many administrators, so having this capability added to integrated mode is great news.

 

SharePoint Lists as Data Sources

 

Increasing numbers of companies use SharePoint lists to store information that needs to be

shared with a broader audience or in a standard report format. Although there were some

creative ways to get that data into Reporting Services, custom code was

always part of the solution. SQL Server 2008 R2 Reporting Services has a new data extension

provider that allows you to access SharePoint 2007 or SharePoint 2010 lists. After you

 

 

 

create the data source using the Microsoft SharePoint List connection type and provide

credentials for authentication, you must supply a connection string to the site or subsite in

the form of a URL that references the site or subsite. That is, use a connection string such as

http://MySharePointWeb/MySharePointSite or http://MySharePointWeb/MySharePointSite

/Subsite. A query designer is available with this connection provider, as shown in Figure 9-22,

allowing you to select fields from the list to include in your report.

 

 

 

FIGURE 9-22 SharePoint list Query Designer

 

SharePoint Unified Logging Service

 

In SharePoint integrated mode, you now have the option to view log information by using the

SharePoint Unified Logging Service. After you enable diagnostic logging, the log files capture

information about activities related to Reporting Services in Central Administration, calls from

client applications to the report server, calls made by the processing and rendering engines

in local mode, calls to Reporting Services Web pages or the Report Viewer Web Part, and

all other calls related to Reporting Services within SharePoint. Having all SharePoint-related

activity, including the report server, in one location should help the troubleshooting process.

 

 

 


 

CHAPTER 10

 

Self-Service Analysis with

PowerPivot

 

Many business intelligence (BI) solutions require access to centralized, cleansed data

in a data warehouse, and there are many good reasons for an organization to

continue to maintain a data warehouse for these solutions. There are even self-service

tools available that allow users to build ad hoc reports from this data. But for a variety of

reasons, business users cannot limit their analyses to data that comes from the corporate

data warehouse. In fact, their analyses often require data that will never be part of the

data warehouse, such as miscellaneous spreadsheets or text files prepared for specific

needs or data obtained from third parties that might be used only once.

 

Users can spend a great deal of time gathering data from disparate sources and then

manually consolidating and integrating the data in the form of one or more Microsoft

Excel workbooks. PivotTables and PivotCharts are popular tools for performing analyses,

but Excel requires all the data for these objects to be consolidated first into a single table

or to be available in the form of a cube in a SQL Server Analysis Services database. What

does the user do when the insight is so useful that the spreadsheet needs to be shared

with others on a frequent basis with fresh data?

 

Sometimes users are also constrained by the volume of data that they want to analyze.

Excel 2007 supports about one million rows of data per worksheet, but what if the user has

more than a million rows? These users need a tool that enables them to analyze huge

sets of data without dependence on IT support.

 

Microsoft SQL Server 2008 R2 comes to the rescue for these users with two new

features to meet these needs—SQL Server PowerPivot for Excel 2010 and SQL Server

PowerPivot for SharePoint 2010. PowerPivot for Excel gives analysts a way to integrate

large volumes of data outside of a corporate data warehouse, whether they are creating

reports to support decision making or prototyping solutions that will eventually be

part of a larger BI implementation. To provide multiple users with centralized access to

reports developed with PowerPivot for Excel, information technology staff can implement

PowerPivot for SharePoint. This server-side PowerPivot product provides the necessary

infrastructure to manage, secure, refresh, and monitor these PowerPivot reports efficiently.

 

 

 


 

PowerPivot for Excel

 

PowerPivot for Excel is an add-in that extends the functionality of Excel 2010 to support

analysis of large, related datasets on your computer. After installing the add-in, you can

import data from external data sources and integrate it with local files, and then develop the

presentation objects, all within the Excel environment. You save all your work in a single file

that is easy to manage and share.

 

The PowerPivot Add-in for Excel

 

To create your own PowerPivot workbooks or to edit workbooks that others have created,

you must first install the PowerPivot add-in for Excel 2010.

 

Modifications to Excel

 

When you install the add-in, several changes are made to Excel. First, the installation adds

the PowerPivot menu to the Excel ribbon. Second, it adds the PowerPivot window, a design

environment for working with PowerPivot data within Excel. You can use this design environment

to import millions of rows of data, which you can later view as summarized results in

Excel worksheets.

 

When you are ready to create a PowerPivot workbook, you click the PowerPivot tab on the

Excel ribbon and click the PowerPivot Window button in the Launch group (shown in Figure

10-1) to open the PowerPivot window. The PowerPivot window opens separately from the

Excel window, which allows you to switch back and forth as necessary between working with

your PowerPivot data and working with the presentation of that data in Excel worksheets.

 

 

 

FIGURE 10-1 The PowerPivot Window button in the Excel window

 

The Local Analysis Services Engine

 

The add-in also installs a local Analysis Services engine on your computer. Installation also

adds the client providers necessary for connecting to Analysis Services. PowerPivot uses the

Analysis Services engine to compress and process large volumes of data, which Analysis Services

loads into workbook objects.

 

The Analysis Services engine runs exclusively in-process in Excel, which means that there

is no need to manage a separate Windows service running on your computer. This version

of Analysis Services uses the new VertiPaq storage mode, which works efficiently with large

volumes of columnar data in memory. For example, VertiPaq mode allows you to very quickly

sort and filter millions of rows of data. Furthermore, you can store workbooks on your local

drive because VertiPaq compresses the data by a factor of ten, on average.

 

 

 


 

The Atom Data Feed Provider

 

Last, the add-in installs an Atom data feed provider to allow you to import data from Atom

data feeds into a PowerPivot workbook. A data feed provides data to a client application

on request. The structure remains the same each time you request data, but the data can

change between requests. Usually, you identify the online data source as a URL-addressable

HTTP endpoint. The online data source, or data service, responds to requests at this endpoint

by returning an atomsvc document that describes how to retrieve the data feed. When you

open an atomsvc document, the PowerPivot Atom data feed provider detects the file type

and prompts you to load data into PowerPivot. When you confirm the load operation, the

provider connects to the data service, which in turn encapsulates the data in XML by using

the Atom 1.0 format and sends the data to the provider.

 

Data Sources

 

Your first step in the process of developing a PowerPivot workbook is to create data sources

and import data into the workbook. You can import data from a variety of external data

sources, including relational or multidimensional databases, text files, and Web services. You

can also import data by linking to tables in Excel, or simply by copying and pasting data. Each

data source that you add to the workbook becomes a separate table.

 

External Data

 

When your data comes from an external data source, you use the applicable button in the

Get External Data group of the ribbon in the PowerPivot window, as shown in Figure 10-2.

The button you choose launches the Table Import Wizard for the type of data that you are

importing.

 

 

 

FIGURE 10-2 The Get External Data group in the PowerPivot window

 

You can choose from a wide variety of data sources:

 

¦ Databases

 

• SQL Server 2005, SQL Server 2008, SQL Server 2008 R2, and SQL Azure

 

• Microsoft Office Access 2003, Access 2007, and Access 2010

 

• SQL Server 2005 Analysis Services, SQL Server 2008 Analysis Services, and SQL

Server 2008 R2 Analysis Services

 

• Oracle 9i, Oracle 10g, and Oracle 11g

 

• Teradata V2R6 and Teradata V12

 

• Informix

 

 

 

• IBM DB2 8.1

 

• Sybase

 

• Any database that can be accessed by using an OLE DB provider or an ODBC driver

 

¦ Files

 

• Delimited text files (.txt, .tab, and .csv)

 

• Files from Excel 97 through Excel 2010

 

• PowerPivot workbooks published to a PowerPivot-enabled Microsoft SharePoint

Server 2010 farm

 

¦ Data feeds

 

• SQL Server 2008 R2 Reporting Services Atom data feeds

 

• SharePoint lists

 

• ADO.NET Data Services

 

• Commercial datasets, such as Microsoft Codename “Dallas”

(http://pinpoint.com/en-US/Dallas)

 

TIP A new feature in SQL Server 2008 R2 Reporting Services is the ability to export an

Atom data feed for any report, whether you export from a native mode or from an integrated

mode report server. If the PowerPivot client is installed on your computer when you

perform the export, PowerPivot detects the document type and opens a wizard for you

to use to import the data directly into a table. You might find it beneficial to get some of

your data integrated in a report first and take advantage of Reporting Services’ support for

calculations, aggregations, data sources, and refresh schedules before you bring the data

into PowerPivot.

 

The wizard walks you through the process of specifying connection information for the

source and selecting data to import. If your source is a database, you can choose to select

either tables or views or to provide a query for the data selection. Regardless of the data

source type, the wizard gives you two options for filtering the data before you import it. First,

you can select specific columns rather than importing every column from the source table.

Second, you can apply a filter to a column to select the row values to include in the import. By

applying these filtering options, you can eliminate unnecessary overhead in your workbook,

reducing both the file size of the workbook and the amount of time necessary to refresh and

recalculate the workbook.

 

TIP When you are working with large datasets, you should use the filtering options to

import only the columns you need for analysis. By limiting the workbook to the essential

columns, you can import more rows of data.

 

 

 

Linked Tables

 

If your data is in an Excel table already, or if you convert a range of data into an Excel table,

you can add the table to your workbook in the Excel window and then use the Create Linked

Table button to import the data into the PowerPivot window. You can find this button on the

PowerPivot ribbon in the Excel window, as shown in Figure 10-3. After the data is available in

the PowerPivot window, you can then enhance it by defining relationships with other tables

or by adding calculations.

 

 

 

FIGURE 10-3 The Create Linked Table button

 

One of the benefits of using an Excel table as a source for a PowerPivot table is the ability

to change the data in the Excel table to immediately update the PowerPivot table. Because

you cannot make changes to data in the PowerPivot window, a linked table is the quickest

and easiest way to edit the data in a PowerPivot table. It is also a great way to try out different

values in “what-if” scenarios or to use variable values in a calculation.

 

Another reason you might consider using a linked table is to support Time Intelligence functions

in PowerPivot’s formula language. Examples of Time Intelligence functions include TotalMTD,

StartOfYear, and PreviousQuarter. Often, source data includes dates and times but does not have

the corresponding attributes to describe these dates and times, such as month, quarter, or year.

You can create your own table in Excel with the necessary attributes, link it to PowerPivot, and

then use Time Intelligence functions to support analysis involving comparative time periods.
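For example, with a linked date table named Dates that is related to a ResellerSales table (both names are illustrative, not part of any sample shown earlier), a month-to-date calculation might look like this:

=TOTALMTD(SUM(ResellerSales[SalesAmount]), Dates[Date])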

 

Copying and Pasting

 

If you do not need to change data after importing into PowerPivot, you can copy the data

from another Excel workbook and then in the PowerPivot window, click the Paste button

in the Clipboard group of the PowerPivot ribbon. The Paste Preview dialog box opens to

show the data to be pasted into PowerPivot. Although you cannot directly edit the data after

adding it to PowerPivot, you can replace it by pasting in fresh data or add to it by appending

additional data. To do this, you use the Paste Replace or Paste Append button, respectively.

 

Data Preparation

 

After importing data into tables, your next step is to prepare the data for analysis by defining

relationships between tables. You can also choose to enhance the data by applying filters and

modifying column properties.

 

 

 

Relationships

 

By building relationships between the data, you can analyze the data as if it all came from a

common source. Relationships enable you to use related data in the same PivotTable even

though the underlying data actually comes from different sources. Defining a relationship

between columns in two PowerPivot tables is similar to defining a foreign key relationship

between two columns in a relational database. Excel power users can understand defining

relationships as analogous to using the VLOOKUP function to reference data elsewhere.

 

In addition to consolidating data for PivotTables, there are other benefits of building

relationships. You can filter data in a table based on data found in related columns, or you

can use the formula language to perform a lookup of values in a related column. These

techniques provide alternative ways to eliminate data redundancy, which keeps the workbook

smaller.

 

When you import related tables at the same time, the Table Import Wizard automatically

detects that they are related and creates the detected relationships. You can also manually

create relationships by using the Create Relationship button on the Design tab of the PowerPivot

ribbon, as shown in Figure 10-4.

 

NOTE A column cannot participate in more than one relationship, and you cannot create

circular relationships.

 

 

 

FIGURE 10-4 The Create Relationship button

 

Filters

 

After you import data into PowerPivot, you cannot delete rows from the resulting PowerPivot

table. To keep your workbook as small as possible, you should apply filters during the import

process to exclude unneeded rows right away. After completing the import, you can modify

the table properties to add a filter, and then update the table to keep only rows that meet the

filter criteria.

 

You can also apply filters to the imported data if you want the data to be available for other

purposes later, while hiding specific rows from the presentation layer in the current report.

You can filter by name in the same way that you normally filter in Excel, by selecting from a

list of values in a column to identify the rows that you want to keep. As an alternative, you

can filter a numeric column by value, as shown in Figure 10-5. For example, you can use the

Between operator to apply a filter that will select rows with a value in a range that you specify.

 

 

 

 

 

FIGURE 10-5 Filtering a numeric column by value

 

IMPORTANT Use of a filter is not a security measure. Although a filter effectively hides

data from a presentation, anyone who can open the Excel workbook can also clear the

filters and view the data if he or she has installed the PowerPivot add-in.

 

Columns

 

As part of the data preparation process, you might need to make changes to column properties.

On the Home tab of the PowerPivot ribbon, you can access tools to make some of these

changes, as shown in Figure 10-6. For example, you can select a column in the table and then

use the ribbon buttons to change the formatting of the column. You can also change the

width of the column for better viewing of its contents, or you can freeze a column to make it

easier to explore the data as you scroll horizontally.

 

 

 

FIGURE 10-6 The Home tab of the PowerPivot ribbon

 

Although the Table Import Wizard detects and sets column data types, you can use the

Data Type drop-down list on the ribbon to change a data type if necessary. You might need

to adjust data types to create a relationship between two tables, for example. PowerPivot

supports only the following data types:

 

¦ Currency

 

¦ Decimal Number

 

¦ Text

 

¦ TRUE/FALSE

 

¦ Whole Number

 

 

 

You can use the Hide and Unhide button on the Design tab (shown in Figure 10-4) to control

the appearance of a column in the PowerPivot window and also in the PivotTable Field List. For

example, you might choose to display a column in the PowerPivot window, but hide that column

in the PivotTable window because you want to use it in a formula for a calculated column.

 

PowerPivot Reports

 

A PowerPivot report is an Excel worksheet that presents your PowerPivot data in a summarized

form by using at least one PivotTable or PivotChart. You can convert a PivotTable to a

collection of cube function formulas if you prefer a free-form layout of your PowerPivot data.

Regardless of which layout you choose for the report, you can add slicers to support interactive

filtering.

 

PivotTables

 

You create a report by selecting a layout template from the PivotTable menu (available from

the PivotTable button on the PowerPivot ribbon, as shown in Figure 10-7) and specifying a

target worksheet in the Excel workbook. You can create a layout independently of the available

templates by selecting Single PivotTable or Single PivotChart as many times as you need

and targeting a different location on the same worksheet for each object.

 

 

 

FIGURE 10-7 Report layout templates

 

NOTE The standard Excel ribbon also includes buttons for building a PivotTable or

PivotChart, but you must use the buttons on the PowerPivot ribbon when you want to use

PowerPivot data.

 

Assume that you select the Chart And Table (Horizontal) template. Placeholders for the

chart and table appear on the worksheet, and a new worksheet appears in the workbook to

 

 

 

store the data that you selected for the chart. Just as you do with a standard PivotTable or

PivotChart, you select the placeholder and then use the associated field list to select and

arrange fields for the selected object, as shown in Figure 10-8.

 

 

 

FIGURE 10-8 A PivotChart and PivotTable report

 

Cube Functions

 

As an alternative to the symmetrical layout of a PivotTable, you can use cube functions in cell

formulas to arrange PowerPivot data in a free-form arrangement of cells. Cube functions,

introduced in Excel 2007, allow you to query an Analysis Services database and return metadata

or values from a cube. Because PowerPivot creates an in-memory version of an Analysis

Services database, you can also use cube functions with your PowerPivot data.

 

Although you can create a formula that uses a cube function in any cell in your PowerPivot

workbook, the simplest way to get started with these functions is to convert an existing PivotTable.

To do this, click the OLAP Tools button on the Options tab under PivotTable Tools, and

click Convert To Formulas. The conversion replaces the row and column labels with a formula

using the CUBEMEMBER function and replaces values with the CUBEVALUE function, as shown

in Figure 10-9. The first argument of either of these functions references the data connection,

which by default is Sandbox for embedded PowerPivot data. All other arguments are pointers

to dimension member names that define the coordinates of the value to retrieve from the

in-memory cube.

 

 

 

 

 

FIGURE 10-9 The CUBEVALUE function
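As a rough sketch of the converted formulas, assuming the default Sandbox connection and illustrative table, column, and measure names (they depend entirely on your own PowerPivot data), a row label cell and a value cell might contain formulas such as the following, where the CUBEMEMBER result in cell B2 supplies one coordinate to CUBEVALUE:

=CUBEMEMBER("Sandbox","[ResellerSales].[ProductCategory].[All].[Bikes]")
=CUBEVALUE("Sandbox","[Measures].[Sum of SalesAmount]",$B$2)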

 

Slicers

 

The task pane for PowerPivot is similar to the one you use for an Excel PivotTable, but it

includes two additional drop zones for slicers. Slicers are a new feature in Excel 2010 that can

be associated with PowerPivot. Slicers work much like report filters but link to multiple objects,

such as a PivotTable and a PivotChart, so that the slicer selection can filter an entire report. If

two slicers are related, a selection of items in one slicer automatically highlights and filters the

related items in the second slicer. For example, if you select a year in one slicer, the quarters

related to that year in a second slicer will also be selected, as shown in Figure 10-10.

 

 

 

FIGURE 10-10 Selecting Year slicer values also selects QuarterCode slicer values.

 

 

 

Data Analysis Expressions

 

The ability to combine data from multiple sources into a single PivotTable is amazingly powerful,

but you can create even more powerful reports by enriching the PowerPivot data with

Data Analysis Expressions (DAX) to add custom aggregations, calculations, and filters to your

report. DAX is a new expression language for use with PowerPivot for Excel. DAX formulas

are similar to Excel formulas. However, rather than working with cells, ranges, or arrays as in

Excel, DAX works only with tables and columns. You can use DAX either to create calculated

columns or to create new measures.

 

Calculated Columns

 

A calculated column is the set of values resulting from an expression that you apply to a table

column or another calculated column. For example, you can concatenate values from two

separate columns to produce a single string value that displays in a third column. You can

also perform mathematical operations, manipulate strings, look up values in related tables, or

compare values to produce results in a calculated column. To add a calculated column, click

an empty cell under the Add Column column heading and type an expression in the formula

bar. In your report, you can use the new calculated column just like any other column from

your PowerPivot data. An expression that calculates gross profit looks like this:

 

=[Sales Amount]-[Total Product Cost]
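A calculated column can also look up values in a related table. As a sketch, assuming a relationship to a hypothetical Product table that contains a ListPrice column, the following expression computes an extended list price for each row:

=RELATED(Product[ListPrice]) * [OrderQuantity]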

 

Measures

 

A measure is a dynamic calculation that is displayed in the value area of the PivotTable. Its

value depends on the current selection of items in rows and columns and in the report filter.

A measure differs from a calculated column in that the calculated column values persist in

the PowerPivot data whereas the measure values calculate at query time and do not persist in

the data store. The calculated column values are scalar, and the measure values are aggregates.

Last, a calculated column may contain string values or numeric values, but a measure is

always a numeric value.

 

As an example, consider a calculated column that shows gross profit. The PowerPivot

table would include a gross profit value for each sales transaction, which a PivotTable can

later aggregate. However, if you create a calculated column to store a gross profit margin

percentage value, the aggregate in the PivotTable will not be correct because percentage

values are not additive.
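A measure avoids this problem because it computes the ratio over whatever rows are in the current context instead of summing precalculated percentages. A minimal sketch, assuming a ResellerSales table with SalesAmount and TotalProductCost columns, looks like this:

=SUMX(ResellerSales, ResellerSales[SalesAmount]-ResellerSales[TotalProductCost]) / SUM(ResellerSales[SalesAmount])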

 

To create a measure, you must first create a PivotTable or PivotChart. In the Excel window,

select the PivotTable or PivotChart, and then click the New Measure button on the PowerPivot

tab of the ribbon. You then provide a name for the measure for all PivotTables in the

 

 

 

report, provide a name for the current PivotTable if you want, and then specify the formula

for the measure, as shown in Figure 10-11.

 

 

 

FIGURE 10-11 Measure settings

 

DAX Functions

 

The examples shown for a calculated column and a measure are very basic, although representative

of the common ways that you would use DAX. Table 10-1 lists the types of functions

that DAX provides:

 

TABLE 10-1 DAX Function Types

Date and time

Example: =WEEKDAY([OrderDate],1)

Description: Returns the number of the weekday, where Sunday = 1 and Saturday = 7

Filter and value

Example: =FILTER(ProductSubcategory, [EnglishProductSubcategoryName] = "Road Bikes")

Description: Returns a subset of a table based on the filter expression

Information

Example: =ISNUMBER([OrderQuantity])

Description: Returns TRUE if the value is numeric and FALSE if it is not

Logical

Example: =IF([OrderQuantity]<10,"low", IF([OrderQuantity]<100,"medium","high"))

Description: Returns the second argument's value if the first argument's condition is TRUE and otherwise returns the third argument's value

Math and trig

Example: =ROUND([SalesAmount] * [DiscountAmount],2)

Description: Returns the value of the first argument rounded to the number of digits specified in the second argument

Statistical

Example: =AVERAGEX(ResellerSales, [SalesAmount]-[TotalProductCost])

Description: Evaluates the expression in the second argument for each row of the table in the first argument, and then calculates the arithmetic mean

Text

Example: =CONCATENATE([FirstName],[LastName])

Description: Returns a string that joins two text items

Time Intelligence

Example: =DATEADD([OrderDate],10,day)

Description: Returns a table of dates obtained by adding the number of days specified in the second argument (or another period as specified by the third argument) to the column specified in the first argument

 

 

 

 

 

PowerPivot for SharePoint

 

PowerPivot for SharePoint provides server-side support for PowerPivot workbooks by

extending the capabilities of SharePoint and Excel Services in SharePoint. SharePoint provides

centralized management of the PowerPivot workbooks, and Excel Services manages data

queries and the rendering of the query results in the browser. Installation of PowerPivot for

SharePoint adds services to the SharePoint farm and includes a document library template,

content types, dashboards, and Web parts that provide access to PowerPivot reports and

support monitoring their usage.

 

Architecture

 

PowerPivot for SharePoint requires SharePoint Enterprise Edition and Excel Services. You

must install Analysis Services with SharePoint Integration on a SharePoint Web front end. In

SharePoint Central Administration, you configure the PowerPivot System Service and activate

the PowerPivot feature on the target site collection. PowerPivot for SharePoint uses a scalable

architecture (shown in Figure 10-12) that allows you to add or remove instances as needed

when you require more or less processing capacity. When you add an instance, the SharePoint

autodiscovery feature ensures that the new instance can be found, and the PowerPivot

System Service has a load balancing feature that will use the new instance when possible.

 

 

 

Figure 10-12 depicts a SharePoint farm containing a Web front end and an application server, along with the PowerPivot database, Analysis Services in VertiPaq mode, Excel Calculation Services, the PowerPivot System Service, Excel Web Access, the Excel Web Service, and the PowerPivot Web Service; Excel 2010 with PowerPivot is used to view or create reports, and a browser is used to view reports.

 

FIGURE 10-12 PowerPivot for SharePoint Architecture

 

Analysis Services in VertiPaq Mode

 

To support users without the PowerPivot for Excel client, Excel Services connects to a server

instance of Analysis Services in VertiPaq mode to process PowerPivot workbooks and respond

to user queries. This type of Analysis Services server instance enables in-memory data storage

on a large scale for multiple users and provides rapid processing of large PowerPivot data

sets. Just like the in-memory version of VertiPaq mode on the client, the server version uses

data compression and columnar storage. Unlike a standard Analysis Services instance that

you manage using SQL Server Management Studio, you manage Analysis Services in VertiPaq

mode exclusively in SharePoint Central Administration.

 

In response to requests for PowerPivot data, Analysis Services loads the cube into memory

where it stays until no longer required or until SharePoint monitoring detects that contention

for resources has reached a threshold requiring action. You can monitor system performance

through usage data, as explained later in this chapter. Analysis Services loads the PowerPivot

data from the workbook as raw, unaggregated data into the cube, compresses the data, and

dynamically restructures the data based on the user’s actions.

 

The PowerPivot System Service

 

The PowerPivot service runs as a service application on SharePoint called PowerPivot System

Service. A service application is configurable independently of other service applications and

isolates service application data. You can install one physical instance of a server but then

create multiple service applications to isolate data at the application level. Another benefit of

the service application model is the ability to delegate administration.

 

The PowerPivot System Service listens for requests for PowerPivot data, connects to Analysis

Services to manage the loading and unloading of PowerPivot data, collects usage data,

and monitors system health and availability of Analysis Services servers. It also provides load

 

 

 

balancing across servers for query processing if multiple servers are available. Furthermore,

the PowerPivot System Service manages the connections for active, reusable, and cached

connections to PowerPivot workbooks, as well as administrative connections to other PowerPivot

System Services on the SharePoint farm.

 

To speed up access to data, the PowerPivot System Service caches a local copy of a workbook

and stores it in Program Files\Microsoft SQL Server\MSAS10_50.POWERPIVOT\OLAP\Backup.

The service unloads this copy of the workbook from memory if no one has accessed

the workbook after 48 hours and deletes it from the folder after an additional 72 hours of

inactivity. If a user updates the workbook in SharePoint and a copy of the workbook already

exists in the cache, the PowerPivot System Service also removes the older cache copy.

 

The PowerPivot Database

 

Each service application has its own relational database, called the PowerPivot database. In

particular, this PowerPivot database stores the load or cache status of workbooks, server usage

information, and schedule information for data refresh operations. More specifically, the

application database stores an instance map that identifies whether a workbook is currently

loaded on the server or in the cache. Usage information in the application database applies to

connections, query response times, load and unload events, and other information pertinent to

server health statistics. The data refresh schedule information includes details about data sources,

users, and the workbooks associated with a schedule. None of the workbook content is in the

PowerPivot database. Instead, workbooks are stored in the SharePoint content database.

 

The PowerPivot Web Service

 

The PowerPivot Web Service is a thin middle-tier connection manager implemented as a Windows

Communication Foundation (WCF) Web service that runs on a SharePoint Web front end.

The Web service listens on the port assigned to a Web application enabled for PowerPivot, and

responds to requests by coordinating the request-response exchange between client applications

and PowerPivot for SharePoint instances in the farm. This Web service requires no

separate configuration or management.

 

The PowerPivot Managed Extension

 

The PowerPivot Managed Extension is an assembly in the Analysis Services OLE DB provider

client. This provider client is installed on a client computer when you install the PowerPivot

for Excel add-in, and on the SharePoint server when you install PowerPivot for SharePoint. For

managed connections, the Web service and the managed extension operate the same way.

The query processing request determines which one is used.

 

 

 

Content Management

 

Content management for PowerPivot is quite simple because the data and the presentation

layout are kept in the same document. If they weren’t, you would have to maintain separate

files in different formats and then manually integrate them each time one of the files required

replacement with fresh data. By storing the PowerPivot workbooks in SharePoint, you can

reap the benefits applicable to any content type, such as workflows, retention policies, and

versioning. For example, you can copy data to a new location by copying the document. Or if

you need to formally approve data before allowing others to access it, you can easily set up a

document approval workflow.

 

The PowerPivot Gallery

 

The PowerPivot Gallery is a special type of document library that provides document management

capabilities for PowerPivot workbooks. You can use it to preview and open PowerPivot

workbooks from a central location. In the PowerPivot Gallery, shown in Figure 10-13,

you can see all available sheets in the workbook as thumbnails with current data, without

opening the workbook. A snapshot service creates the thumbnail images by periodically

reading the workbook files.

 

 

 

FIGURE 10-13 The PowerPivot Gallery

 

In addition to the default Gallery view, the PowerPivot Gallery also includes the Theater

and Carousel views, which are most useful when you want to highlight a small number of

workbooks. In Theater view, you can see a central preview area, and thumbnails of the other

reports in the workbook display at the bottom of the page. In Carousel view, the thumbnails

appear to the left and right of the preview area. In either of these views, you can click the left

 

 

 

or right arrow to bring a different thumbnail into the preview area. You can also switch to All

Documents view, which allows you to see all the workbooks in a standard document library

view. You can then download a document, check documents in or out, or perform any other

activity that is permissible within a document library.

 

The Data Feed Library

 

A special type of document library is available for the storage of atomsvc documents, also

known as data service documents. You can share these documents for the use of other

PowerPivot authors who want to import data feeds into PowerPivot tables. You can create

a data service document in the document library by specifying the URL request to the data

service or Web application that serves data on request. The URL request should include a

parameter that requests data in the Atom 1.0 format.

 

Data Refresh

 

In addition to the content management support, another good reason to share a PowerPivot

workbook in SharePoint is to manage the data refresh process. Usually, data that appears in a

PowerPivot table changes from time to time. To keep the workbook up to date and relevant,

you must periodically update the data. You can automate this process by assigning a refresh

schedule to each data source in the workbook.

 

The data refresh feature is not enabled by default. When you enable data refresh, a timer

job runs every minute on the PowerPivot server. This job triggers the PowerPivot System

Service, which in turn reads the predefined schedules found in the PowerPivot database. When

a schedule is due to run, the PowerPivot System Service gets the list of data sources and the

credentials to use, and initiates the data refresh. If the workbook is not checked out or in edit

mode, the data refresh job saves the new data to the workbook.

 

Linked Documents

 

Your PowerPivot workbook can be used as a data source for other report types. When

viewing the workbooks in the PowerPivot Gallery, you can use the Create Linked Document

button to create either a Reporting Services report or a PowerPivot report in Excel. You must

have the appropriate client application for the report type that you choose. That is, to build

a Reporting Services report, you must first install SQL Server 2008 R2 Report Builder 3.0, and

to build a PowerPivot report, you must install the PowerPivot for Excel add-in. The query

designer in Report Builder and the Field List in Excel display only the fields presented in the

source workbook rather than all fields available in that workbook’s embedded data.

 

The PowerPivot Web Service

 

Another way to use a PowerPivot workbook as a data source is by using the PowerPivot Web

Service to connect to the embedded data. That way, you can reuse the data in multiple places

without having to duplicate all the effort required to create the initial workbook. Any client

 

 

 

application that can connect to Analysis Services directly can use the PowerPivot Web Service. You

simply use the SharePoint URL for the workbook instead of an Analysis Services server name in the

connection string of the provider. For example, if you have a workbook named Bike Sales.xlsx in

the PowerPivot Gallery located at http://<servername>/PowerPivot Gallery, the SharePoint URL to

use as an Analysis Services data source is http://<servername>/PowerPivot Gallery/Bike Sales.xlsx.
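For example, a client that accepts an Analysis Services connection string might reference the workbook through the Analysis Services OLE DB provider. This is only a sketch; the server and gallery names are placeholders:

Provider=MSOLAP;Data Source=http://<servername>/PowerPivot Gallery/Bike Sales.xlsx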

 

The PowerPivot Management Dashboard

 

PowerPivot for SharePoint includes several tools for configuring the service application and

for monitoring usage in a management dashboard. All management tools are accessible to

farm and service administrators in Central Administration. The easiest way to access settings

related to PowerPivot for SharePoint is to use the PowerPivot Management Dashboard.

 

The PowerPivot Management Dashboard displays data for one service application at a time.

In this dashboard, you can see a collection of Web parts and PowerPivot reports that display data

that is collected daily from multiple sources. One of the Web parts displays a chart showing CPU

and memory usage over time to help you determine whether the server is running at maximum

capacity or whether it is underutilized. Another Web part shows trending of query response times,

which you can use to determine whether queries are responding within configurable thresholds.

The dashboard page includes links to the PowerPivot reports that provide the source data for

these Web parts. These reports consist of data from an internal reporting database that in turn

collects data from the PowerPivot database, SharePoint usage log data, and other sources. You

can build new reports using this internal reporting database as a source, but you cannot change it.

 

In addition to giving you information about the state of the server, the dashboard also

provides insight into the usage of published workbooks. An interactive chart allows you to

monitor which workbooks users access most frequently and which workbooks have recent

activity. You can view this information at the daily or weekly level.

 

One section of the dashboard provides information about data refresh activity, providing a

single location from which you can verify whether data refreshes are occurring as scheduled.

One Web part in this section lists recent activity for data refresh jobs by workbook and also

includes the job duration. Another Web part lists the workbooks for which the data refresh

job fails, and displays the data refresh error message as a tooltip.

 

The dashboard is also extensible. It includes a link to add new items, which you can use to

add more workbooks to access from the dashboard page. For example, you can create a new

PowerPivot workbook by using the Usage workbook as a data source, and then upload your

workbook to the same document library.

 

Last, the dashboard page includes links to pages in Central Administration that you can

use to check or reconfigure the settings for PowerPivot. One link takes you to the service

settings page, where you can configure database timeouts, data refresh hours, and query response

time thresholds. You can use another link to review timer job settings for data refresh,

dashboard processing, PowerPivot configuration, and the health statistics collector. A third

link takes you to the settings page for usage log collection.

 

 

 

Index

 


 

A

 

adapter base classes, 151

 

AdapterFactory objects, 152

 

adapters, for CEP applications, 151-154

 

Admin Console, 122

 

aggregate functions, 168

 

AJAX, 186

 

Analysis Services engine, 123, 190

 

annotating transactions in MDS, 134

 

application errors, monitoring, 122

 

applications, data-tier. See DACs (data-tier applications)

 

arrays, converting comma-separated lists of values

into, 167

 

Atom data feed

 

exporting reports to, 182

 

importing data into PowerPivot workbook, 191

 

authentication

 

Extended Protection for, 10

 

in MDS (Master Data Services), 127

 

authorization (MDS), 138

 

Azure, 9

 

B

 

Backup node, 114

 

Best Practices Analyzer (BPA)

 

overview of, 11

 

running, 71

 

binding query templates, 162

 

Bing Maps tile layers as backgrounds, 177

 

browser support for Reporting Services, 186

 

built-in fields, 170

 

business intelligence (BI) integration, 123

 

business rules (MDS), 132-133

 

C

 

Cache Refresh (reports), 179-180

 

calculated columns, 199

 

capacity planning, 25

 

Carousel view (PowerPivot Gallery), 204-205

 

class library (MDS), 142-143

 

Cluster Shared Volumes (CSV). See CSV (Cluster

Shared Volumes)

 

collections (MDS), 130

 

comma-separated lists of values, converting into

arrays, 167

 

Compact edition, 14

 

complex event processing (CEP). See also StreamInsight

 

adapters, 151-154

 

application development cycle, 150

 

application language, 146

 

applications for, 145-146

 

defined, 145

 

diagnostic views, 163

 

filtering operation, 155

 

input adapters, 148

 

output adapters, 149

 

overview of, 145

 

projection operation, 155

 

query instances, 149

 

query templates, 154

 

server, 147-149

 

compression, Unicode, 10

 

compute node, 114-115, 118

 

connecting to UCPs, 33-34, 89

 

consolidation

 

of databases, 86

 

goals of, 85

 

management strategies, 5

 

with virtualization, 87-88

 

control node, 112-113

 

 

 


 

control racks, 112

 

count windows, 159

 

CPU

 

overutilized, 92

 

upgrading online, 63

 

CREATE DATABASE statement, 118

 

Create Package wizard, 142

 

CREATE REMOTETABLE statement, 120

 

CREATE TABLE statement, 118-120

 

Create Utility Control Point Wizard, 26-28

 

CSV (Cluster Shared Volumes). See also failover clustering

 

adding storage to, 76

 

enabling, 76

 

overview of, 64

 

storage location, 76

 

cube functions, 197

 

D

 

DAC file packages, 45

 

DACs (data-tier applications)

 

benefits of, 44

 

configuration options, 53

 

defined, 41

 

definition registration, 49

 

definition storage, 43

 

deleting, 56-58

 

deploying, 4, 45, 52-55

 

detaching database, 56

 

extracting, 49-51

 

generating, 43

 

importing, 47-48

 

life cycle of, 42-43

 

monitoring, 93-94, 100-105

 

project templates, 45-46

 

properties, setting, 49-50

 

red or yellow icons, 45

 

registering, 55-56

 

SQL Server objects supported in, 44-45

 

upgrading, 59-61

 

uses for, 43

 

Visual Studio deployment of, 45-46

 

dashboard, PowerPivot, 206

 

Data Analysis Expressions (DAX), 199-201

 

data bars, 175

 

data colocation, 114-115

 

data feed, Atom

 

exporting reports to, 182

 

importing data into PowerPivot workbook, 191

 

data feed libraries, 205

 

Data Movement Service (DMS), 112

 

data racks, 111

 

data sources

 

joining, 166-168

 

for PowerPivot for Excel, 191-193

 

data stewards, 127

 

data types

 

supported in Parallel Data Warehouse, 120

 

supported in PowerPivot, 195

 

data warehouse appliances, 109-110. See also Parallel

Data Warehouse

 

database consolidation, 86

 

database creation, 118

 

database objects, managing, 122

 

Datacenter edition, 12

 

datasets. See also PowerPivot for Excel; PowerPivot for

SharePoint

 

combining data from multiple, 166-168

 

importing large, 192

 

shared, 179

 

data-tier applications (DACs). See DACs (data-tier

applications)

 

Data-Tier Applications viewpoint (Utility Explorer),

100-105

 

DAX (Data Analysis Expressions), 199-201

 

DDL extensions, 117

 

default policies, restoring, 36

 

definitions, DAC

 

registering, 49

 

storing, 43

 

Delete Data-Tier Application Wizard, 56-58

 

Deploy Data-Tier Application Wizard, 52-55

 

deploying DACs, 45

 

configuration options, 53

 

with Deploy Data-Tier Application Wizard, 52-55

 

overview of, 4

 

report for, 54

 

in Visual Studio 2010, 45-46

 

deploying StreamInsight, 149-150

 

deploying UCPs

 

with Create Utility Control Point Wizard, 26-28

 

prerequisites for, 25, 26

 

report, saving, 28

 

with Windows PowerShell, 28

 

detaching DAC database, 56

 

Developer edition, 13

 

diagnostic views (StreamInsight), 163

 

disconnecting from UCPs, 34

 

disk space consumption, 25

 


 

 

 


 

disk space requirements, 15

 

DMS (Data Movement Service), 112

 

document libraries, 204-205

 

DomainScope property (reports), 173

 

DWLoader, 122

 

Dwsql, 122

 

dynamic report formatting, 169-170, 172-173

 

dynamic virtual machine storage, 64

 

E

 

edge event model, 147

 

edit sessions, 183

 

enrolling instances, 29-32

 

enterprise data warehousing. See Parallel Data

Warehouse

 

Enterprise edition, 12, 29

 

entities (MDS), 129-130

 

event classification, 154

 

event stream objects, 154

 

aggregation operations, 159-160

 

consumer objects, 162

 

event types, 150-151

 

event windows, 156-159

 

Excel add-ins. See PowerPivot for Excel

 

Excel tables, linking to PowerPivot tables, 193

 

Excel workbooks, 173, 193

 

exporting master data, 136-137

 

exporting tables, 120

 

Express edition, 13

 

expression language, 165-171. See also Data Analysis

Expressions (DAX)

 

expressions, 183

 

Extended Protection for Authentication, 10

 

Extract Data-Tier Application Wizard, 45, 49-51

 

extracting DACs (data-tier applications), 49-51

 

F

 

Failover Cluster Manager, 80

 

failover clustering. See also CSV (Cluster Shared Volumes)

 

benefits of, 65

 

best practices compliance, testing, 11, 71

 

connecting by multiple networks, 65

 

enhancements in Windows Server 2008, 63

 

guest model, 67-68

 

history of, 64-65

 

traditional model, 65

 

troubleshooting, 70

 

validating prerequisites for, 68-70

 

feedback on book, xix

 

file space utilization monitoring, 96

 

filtering operation, 155

 

filtering PowerPivot data, 192, 194

 

formatting reports dynamically, 169-170, 172-173

 

free edition. See Express edition

 

functions

 

aggregate, 168

 

cube, 197

 

Lookup, 166

 

LookupSet, 168

 

MultiLookup, 167

 

Split, 167

 

Time Intelligence, 193

 

Transact-SQL, 143-144

 

G

 

Generate And Publish Scripts Wizard, 9

 

global monitoring settings, 34

 

guest failover clustering, 67-68

 

H

 

hardware, upgrading online, 63

 

hardware requirements, 14-15

 

headers and footers, 170

 

hierarchies (MDS), 130

 

high availability enhancements, 63-64

 

hopping windows stream, 157

 

hot adding hardware, 63

 

hub-and-spoke architecture, 115

 

Hyper-V. See also Live Migration; virtualization

 

benefits of, 74

 

on guest failover clustering, 67-68

 

improvements in, 11

 

overview of, 64

 

system requirements for, 73-74

 

uses for, 74

 

virtual machines, creating with, 76-79

 

Hyper-V Integration Services tool, 79

 


 

 

 

I

 

importing master data, 135

 

indicators in reports, 176-177

 

InfiniBand network, 110, 112

 

in-place upgrades, 16-17

 

input adapters, base classes, 151

 

installing MDS (Master Data Services), 127

 

instances. See also managed instances; SQL Server

instances

 

enrolling, 29-32

 

utilization, monitoring, 91

 

validating, 31

 

viewing, 91

 

Integration Services, 123

 

InteractiveSize property (reports), 172

 

Internet Explorer requirement, 15

 

interval event model, 147

 

J

 

join operations, 161

 

L

 

Landing Zone node, 114

 

linked documents, 205

 

linked tables, 193

 

LINQ expressions, 155

 

Live Migration. See also Hyper-V

 

benefits of, 87-88

 

configuring virtual machines for, 79-82

 

implementing, 75

 

initiating, 83

 

overview of, 64, 72

 

load activity, monitoring, 122

 

look-and-feel properties, 169-170

 

Lookup function, 166

 

LookupSet function, 168

 

M

 

managed instances. See also instances; SQL Server

instances

 

global policies for, 34

 

health status of, 91

 

maximum number of, 29

 

overutilized resources, 92

 

processor utilization, 96

 

underutilized resources, 92

 

viewing, 91

 

Managed Instances viewpoint (Utility Explorer), 95-100

 

management node, 114

 

management utilities. See Best Practices Analyzer (BPA);

SQL Server Utility

 

ManagementService API, 163

 

maps, 177-178

 

massively parallel processing (MPP), 9

 

master data, 125-126

 

Master Data Manager

 

areas in, 128-129

 

batch creation, 135-136

 

data maintenance with, 131

 

data stewards, 127

 

model deployment, 142

 

subscription view, creating, 136

 

Master Data Services Configuration Manager, 128

 

MDS (Master Data Services)

 

API, 127, 142-143

 

authentication, 127

 

authorization, 138

 

business rules, 132-133

 

class library, 142-143

 

configuring, 128

 

data stewards, 127

 

database, 128

 

as development platform, 142

 

exporting master data, 136-137

 

flexibility of, 126

 

importing master data, 135

 

installing, 127

 

locking data in, 139-140

 

master data hub, 126

 

overview of, 125

 

permissions, 138

 

tables in, 135

 

transaction logs, 131-134

 

Transact-SQL functions, 143-144

 

versioning, 127-128, 137-138

 

Web services API, 143

 

measures (PivotTables), 199-200

 

members (MDS), 129-130

 

memory, upgrading online, 63

 

memory requirements, 14

 

Microsoft Assessment and Planning Toolkit, 75

 

Microsoft Press support Web site, xix

 

Microsoft SQL Azure, 9

 


 

 

 

migrating SQL Server installations, 18-19

 

migrating virtual machines. See Live Migration

 

models (MDS)

 

defined, 129

 

deploying, 142

 

security settings, 139-140

 

monitoring, with SQL Server Utility, 89

 

monitoring DACs, 100-105

 

monitoring settings, 34

 

MPP architecture, 110

 

msdb database

 

creation of, 52

 

disk space consumption, 25

 

MultiLookup function, 167

 

multi-rack system

 

Backup node, 114

 

compute node, 114-115

 

control node, 112-113

 

control rack, 112

 

data racks, 111

 

Landing Zone node, 114

 

management node, 114

 

overview of, 110

 

N

 

naming UCPs, 27

 

nesting aggregate functions, 168

 

.NET Framework requirement, 15

 

New Virtual Machine Wizard, 76-77

 

Nexus query tool, 122

 

non-transactional reference data, 125-126

 

O

 

objects, SQL Server, 45

 

operating system requirements, 15

 

output adapters, 151

 

overutilized instances, 91

 

overutilized threshold default, 34

 

P

 

page headers and footers, 170

 

page numbering in reports, 170

 

PageName property (reports), 173

 

pagination, report, 172-173

 

Parallel Data Warehouse

 

Admin Console, 122

 

architecture of, 109-115

 

automatic growth feature, toggling, 118

 

configuring, 110

 

control node, 112-113

 

creating tables, 118-120

 

data load processing, 121-122

 

data types supported in, 120

 

DDL extensions, 117

 

distributed strategy, 116-117

 

networking technologies, 112

 

overview of, 9, 109

 

query processing, 121

 

replicated strategy, 116

 

shared nothing (SN) architecture, 115-120

 

Parallel Data Warehouse edition, 12

 

pasting data into PowerPivot, 193

 

PivotCharts, creating, 196

 

PivotTables. See also tables

 

converting to formulas, 197

 

creating, 196

 

measures, 199-200

 

point model, 147

 

policies

 

changing, 36

 

defaults, restoring, 36

 

managing, 34

 

violation reporting settings, 35

 

PowerPivot database, 203

 

PowerPivot for Excel

 

Analysis Services engine, 190

 

Atom data feed, importing, 191

 

columns, formatting, 195

 

copying and pasting data into, 193

 

creating databases, 118

 

cube functions, 197

 

data sources, creating, 191-193

 

data types, changing, 195

 

filtering data, 194

 

hiding columns, 196

 

installing, 190

 

modifications made by, 190

 

overview of, 189

 

relationships in, 194

 

slicers, 198

 

Time Intelligence functions, 193

 

VertiPaq storage mode, 202

 

workbook, creating, 190

 


 

 

 

PowerPivot for SharePoint

 

application database, 203

 

architecture of, 201

 

caching, 203

 

content management, 204

 

data feed libraries, 205

 

data refreshing in, 205

 

linked documents, creating, 205

 

overview of, 10, 189, 201

 

Parallel Data Warehouse and, 123

 

prerequisites for, 201

 

System Service, 202-203

 

PowerPivot Gallery, 204-205

 

PowerPivot Managed Extension, 203

 

PowerPivot Management Dashboard, 206

 

PowerPivot reports, 196-198

 

PowerPivot Web Service, 203, 205-206

 

PowerShell. See Windows PowerShell

 

premium editions. See Datacenter edition; Parallel Data

Warehouse edition

 

processor requirements, 14

 

projection operation, 155

 

publishing reports, 180-182

 

Q

 

query objects, 163

 

query processing (Parallel Data Warehouse), 121

 

query templates, 154, 162

 

QueryTemplate object, 154

 

R

 

RAM requirements, 14

 

registering DAC definitions, 49, 55-56

 

relationships, in PowerPivot, 194

 

relative references in expressions, 183

 

RenderFormat global variable, 169-170

 

replicated tables, creating, 118-120

 

Report Builder 3.0, 183

 

Report Designer, 123

 

Report Manager, 184-186

 

Report Part Gallery, 183

 

report variables, 170-171

 

Report Viewer, 186

 

Reporting Services, 186-188. See also reports

 

reports

 

alternate access mappings, 187

 

cache configuring, 179-180

 

on DAC deployment, 54

 

data synchronization, 173

 

data visualization enhancements, 175-178

 

edit sessions, 183

 

exporting to Atom data feed, 182

 

layout, dynamic, 169-170, 172-173

 

naming pages in, 173

 

nesting items in, 173

 

page numbering, 170

 

pagination, managing, 172-173

 

parts, searching for, 183

 

PowerPivot, 196-198

 

publishing in parts, 180-182

 

reusability of components in, 178-182

 

sandboxing, 186

 

text box orientation, 174

 

on UCP creation, 28

 

on validation, 28, 31

 

ResetPageNumber property (reports), 172

 

resource isolation, 186

 

resource utilization monitoring, 95-99

 

reusability of report components, 178-182

 

Role-Based Access security model, 39

 

rs.exe, 187

 

S

 

sandboxing reports, 186

 

scalability, 10

 

Second Level Address Translation (SLAT), 64

 

security

 

in MDS (Master Data Services), 138-141

 

Role-Based Access model, 39

 

SequeLink client drivers, 112-113

 

Server Manager, 11

 

shared datasets, 179

 

shared nothing (SN) architecture, 115-120

 

SharePoint, Reporting Services integration, 187-188

 

SharePoint Unified Logging Service, 188

 

shell access. See Windows PowerShell

 

side-by-side migration, 18-19

 

SLAT (Second Level Address Translation), 64

 

slicers, 198

 

SMP architecture, 110

 


snapshot windows, 158

 

software requirements, 15

 

sparklines, 176

 

Split function, 167

 

SQL Azure, 9

 

SQL Server editions, 11. See also specific editions

 

SQL Server instances. See instances; managed instances

 

SQL Server Management Studio, 49-51

 

SQL Server objects, 44-45

 

SQL Server PowerShell. See Windows PowerShell

 

SQL Server Utility. See also Utility Control Points (UCPs); Utility Explorer

 

dashboard, 89-95

 

monitoring with, 89

 

overview of, 4, 21-23

 

Standard edition, 13

 

StreamInsight

 

aggregation functions, 159-160

 

core engine, 146

 

deploying, 149-150

 

development support, 146

 

event models, 147

 

event windows, 156-159

 

events, 147

 

as hosted assembly, 149

 

join operations, 161

 

ManagementService API, 163

 

overview of, 145

 

query objects, 163

 

as standalone server, 149-150

 

streams, 147

 

TopK operation, 160

 

union operations, 161

 

subscription views (MDS), 136

 

support for book, xix

 

Sysprep, 9

 

sysutility_mdw. See Utility Management Data Warehouse (UMDW)

 

T

 

tables. See also PivotTables

 

calculated columns in, 199

 

exporting, 120

 

loading rows into, 122

 

measures in, 199-200

 

text box orientation in reports, 174

 

Theater view (PowerPivot Gallery), 204-205

 

Time Intelligence functions (PowerPivot), 193

 

TopK operation, 160

 

transaction logs

 

for MDS, 131-134

 

space allocation, 118

 

Transact-SQL functions, 143-144

 

troubleshooting failover clustering, 70

 

tumbling windows, 157-158

 

U

 

UCPs (Utility Control Points). See Utility Control Points (UCPs)

 

UMDW (Utility Management Data Warehouse). See Utility Management Data Warehouse (UMDW)

 

underutilization thresholds

 

default, 34

 

variance in, 85

 

underutilized instances, 91

 

Unicode compression, 10

 

union operations, 161

 

upgrading data-tier applications (DACs), 59-61

 

upgrading hardware online, 63

 

upgrading SQL Server

 

in-place upgrades, 16-17

 

side-by-side migration, 18-19

 

user-defined functions (UDFs), 161

 

Utility Administrator, 37

 

utility collection set account, specifying, 27, 30

 

Utility Control Points (UCPs), 21-22

 

capacity specifications, 25

 

connecting to, 33-34, 89

 

creating, 26-29

 

disconnecting from, 34

 

enrollment of, 29

 

frequency of data collection, 23

 

managed instances, maximum number of, 29

 

msdb database, 25, 52

 

naming, 27

 

overview of, 4, 23

 

prerequisites for deployment, 25-26

 

report on creation of, 28

 

SQL Server edition required for, 26

 

validating, 28

 


Utility Explorer. See also SQL Server Utility

 

dashboard and list views, 5, 24

 

data refreshing in, 29

 

Data-Tier Applications viewpoint, 100-105

 

launching, 24

 

Managed Instances viewpoint, 95-99

 

user interface, 24

 

Utility Administration node, 33-36

 

Utility Management Data Warehouse (UMDW), 23

 

collection upload frequency, 23

 

data retention period, modifying, 39-40

 

disk space consumption, 25

 

verifying, 29

 

Utility Reader, 37-38

 

utility storage utilization history, 94-95

 

utilization policies, 6, 34-35

 

V

 

Validate A Configuration Wizard, 68-70

 

validating failover clustering setup, 68-70

 

validating instances, 31

 

validating UCPs, 28

 

validation reports, 28, 31

 

VertiPaq storage mode, 190, 202

 

violation reporting settings, 35

 

virtual machines

 

automatic start action, configuring, 79-80

 

configuring for Live Migration, 79-82

 

creating with Hyper-V, 76-79

 

high availability, configuring, 81-82

 

live migration of. See Live Migration

 

virtualization. See also Hyper-V

 

consolidation with, 87-88

 

technology for, 72

 

Visual Studio 2010

 

deploying DACs from, 45-46

 

importing DACs into, 47-48

 

volume space utilization monitoring, 97

 

W

 

Web browser support for Reporting Services, 186

 

Web edition, 13

 

windows, event, 156-159

 

Windows Communication Foundation (WCF) Web services, 203

 

Windows domain accounts, 27, 30

 

Windows PowerShell

 

deploying UCPs with, 28

 

diagnostics, 164

 

enrolling instances with, 32

 

improvements in, 11

 

launching, 28

 

Windows Server 2008 integration, 10-11

 

workbooks, 173, 192

 

Workgroup edition, 13

 

WritingMode property (reports), 174

 


 

About the Authors

 

Ross Mistry is a technical architect at the Microsoft Technology Center

(MTC) in Silicon Valley. Ross provides executive briefings, architectural

design sessions, and proof of concept workshops to organizations

located in the Silicon Valley. His core specialty is Microsoft SQL Server,

although he also focuses on Windows Active Directory, Microsoft Exchange,

and Windows Server Hyper-V.

 

Ross’s latest books include Windows Server 2008 R2 Unleashed and

Microsoft SQL Server 2008 Management and Administration. He was a

contributing writer on Microsoft Exchange Server 2010 Unleashed, Microsoft

SharePoint 2007 Unleashed, and Windows Server 2008 Hyper-V Unleashed. He frequently

writes for TechTarget and is currently working on a series of SQL Server virtualization white

papers, which will be published shortly. Ross is a former SQL Server MVP, is well known in the

worldwide SQL Server community, and frequently speaks at technology conferences and user

groups around the world. He has recently spoken at the North American PASS Community Summit,

SQL Connections, European PASS, SQL BITS, and Microsoft.

 

Prior to joining Microsoft, Ross was a managing partner and principal consultant at Convergent

Computing (CCO), where he was responsible for designing and implementing technology

solutions for organizations with a global presence. Some of his customers included eBay,

McAfee, Yahoo!, Gilead Sciences, Ross Stores, The Sharper Image, McDonald’s, CIBC, Radio Shack,

Wells Fargo, and TD Waterhouse.

 

You can follow and contact Ross on Twitter @RossMistry.

 

Stacia Misner is the founder of Data Inspirations (www.datainspirations.com), which delivers global business intelligence (BI) consulting and

education services. She is a consultant, educator, mentor, and author

specializing in business intelligence and performance management

solutions that use Microsoft technologies. Stacia has more than 25 years

of experience in information technology and has focused exclusively on

Microsoft BI technologies since 2000. She is the author of Microsoft SQL

Server 2000 Reporting Services Step by Step, Microsoft SQL Server 2005

Reporting Services Step by Step, Microsoft SQL Server 2005 Express Edition:

Start Now!, and Microsoft SQL Server 2008 Reporting Services Step

by Step and the coauthor of Business Intelligence: Making Better Decisions Faster, Microsoft SQL

Server 2005 Analysis Services Step by Step, and Microsoft SQL Server 2005 Administrator’s Companion.

She is also a Microsoft Certified IT Professional-BI and a Microsoft Certified Technology

Specialist-BI. Stacia lives in Las Vegas, Nevada, with her husband, Gerry. You can contact Stacia

via e-mail at smisner@datainspirations.com.

 

 

 

First Look Office 2010

 

 

Table of Contents

 

Acknowledgments ix

 

Introduction . xi

 

Part I Envision the Possibilities

 

1 Welcome to Office 2010 . 3

 

Features that Fit Your Work Style . 3

 

Changes in Office 2010 4

 

Let Your Ideas Soar . 5

 

Collaborate Easily and Naturally 5

 

Work Anywhere—and Everywhere . 6

 

Exploring the Ribbon 7

 

A Quick Look at the Ribbon 8

 

Contextual Tabs . 9

 

New Backstage View 9

 

Managing Files in Backstage View . 10

 

Streamlined Printing . 11

 

Languages and Accessibility 11

 

Coming Next 12

 

2 Express Yourself Effectively and Efficiently 13

 

Understanding Your Audience 14

 

How Visuals Help . 15

 

Adding Text Effects . 16

 

Adding Artistry to Your Images 17

 

Correcting and Recoloring Pictures . 18

 

Working Font Magic in Word 2010 and Publisher 2010 . 21

 

Creating Data Visualizations in Excel 2010 . 23

 

Editing Video in PowerPoint 2010 24

 

Communicating Visually in Access 2010 . 25

 

Enhancing and Streamlining Communications in Outlook 2010 26

 

Coming Next 28

 

 

 


 

3 Collaborate in the Office and Around the World 29

 

It’s All About the Teamwork 29

 

What Teams Look Like Today 30

 

Team Tasks and Methods 30

 

Benefits of Office 2010 Collaboration 32

 

Stay in Touch with Your Team . 32

 

Share Files in the Workspace 33

 

Share Files and Folders 34

 

Co-Author Files Across Applications 34

 

Connect via Presence 36

 

Using Office Web Apps 37

 

Sharing on the Road with Office Mobile . 38

 

Coming Next 38

 

Part II Hit the Ground Running

 

4 Create and Share Compelling Documents with Word 2010 41

 

Start Out with Word 2010 41

 

Get Familiar with the Word Ribbon . 42

 

Find What You Need Easily with the Navigation Pane . 43

 

Print and Preview in a Single View . 45

 

Format Your Text . 45

 

Apply Text-Formatting Effects 47

 

Preserve Your Format Using Paste with Live Preview . 48

 

Illustrate Your Ideas 49

 

Apply Artistic Effects 50

 

Insert Screen Shots 51

 

Improve Your Text . 52

 

Catch More Than Typos with a Contextual Spell Check . 52

 

Use Language Tools, and Translate on the Fly . 53

 

Co-Author and Share Documents 55

 

Working with Shared Documents 57

 

Access Your Documents Anywhere 58

 

Use Word Web 2010 . 59

 

Check Your Document with Word Mobile 2010 60

 

 

 


 

5 Create Smart Data Insights with Excel 2010 61

 

Start Out with Excel 2010 . 61

 

Summarize Your Data Easily 63

 

Illustrate Information Effectively 65

 

Call Attention to Your Data with Icon Sets . 66

 

Data Bar Improvements . 68

 

New SmartArt Enhancements 70

 

Use Slicers to Show Data Your Way 70

 

Work Anywhere with Excel 2010 72

 

Excel 2010 Web App . 72

 

6 Manage Rich Communications with Outlook 2010 75

 

Starting Out with Outlook 2010 . 76

 

Using the Outlook 2010 Ribbon . 77

 

Setting Preferences with Backstage View . 77

 

Managing Your Conversations 78

 

Cleaning Up Your Messages 80

 

Streamlining E-mail Tasks 81

 

Working with Presence and Social Media . 83

 

Coordinating Calendars . 84

 

Viewing Group Schedules 84

 

Create a Calendar Group . 85

 

Improving the Look of Your Messages 86

 

Keeping in Touch with Outlook Mobile 88

 

7 Produce Dynamic Presentations with PowerPoint 2010 89

 

Starting Out with PowerPoint 2010 89

 

Editing and Formatting Video 91

 

Creating and Working with Animations 94

 

Enhancing Your Presentation with Transitions and Themes 95

 

Adding Sections to Your Presentation . 97

 

Managing and Sharing Your Presentation 98

 

Merging Presentations . 98

 

Broadcasting Your Presentation . 99

 

Printing Presentation Notes 101

 

 

 

Save Your Presentation as a Video . 102

 

Work with the PowerPoint 2010 Web App . 103

 

Using PowerPoint Mobile 2010 103

 

8 Organize, Store, and Share Ideas with OneNote 2010 . 105

 

Starting Out with OneNote 2010 . 106

 

Capturing Notes Easily . 107

 

Using OneNote as You Work 107

 

Create Notes Anywhere . 108

 

Working with Linked Notes and Task Notes . 110

 

Finding Just the Notes You Need . 112

 

Sharing Ideas Effectively 113

 

Creating a Shared Notebook 113

 

Finding Entries by Author 114

 

Working with Page Versions . 114

 

Accessing Your Notes Anywhere 115

 

9 Collaborate Effectively with SharePoint Workspace 2010 . 117

 

What Can You Do with SharePoint Workspace 2010? 118

 

Starting Out with SharePoint Workspace 2010 119

 

What About Groove? 120

 

Setting Workspace Preferences 122

 

Accessing Your Files Seamlessly 123

 

Simplified Searching . 124

 

Checking Files In and Out 125

 

Connecting with Your Team Instantly 126

 

SharePoint with InfoPath and SharePoint Business Connectivity Services 128

 

Using SharePoint Workspace on the Go . 128

 

10 Create Effective Marketing Materials with Publisher 2010 129

 

Starting Out with Publisher 2010 129

 

Collapse and Expand Page Navigation Panel 130

 

Use the Mini Toolbar 131

 

Creating and Using Templates and Building Blocks 131

 

 

 

Creating Precise Layouts 135

 

Enhancing Typography with OpenType Features 135

 

Working with the Improved Color Palette 137

 

Previewing and Printing Publications 138

 

Preparing for Commercial Printing 139

 

Sharing Publisher Files 140

 

11 Make Sense of Your Data with Access 2010 141

 

Starting Out with Access 2010 141

 

Using Application Parts 143

 

Applying Office Themes 144

 

Adding New Fields 146

 

Adding Quick Start Fields 146

 

Inserting Calculated Fields 148

 

Showing Data Bars and Conditional Formatting 149

 

Creating Navigation Forms 150

 

Designing Access 2010 Macros . 150

 

Working with Access 2010 and the Web . 151

 

Adding Web Controls . 152

 

Using Access 2010 with SharePoint 153

 

Part III Next Steps with Office 2010

 

12 Putting It All Together 157

 

Using Excel 2010 Data with Word 2010 . 157

 

Sharing SmartArt Among Office 2010 Applications . 159

 

Dragging Word 2010 Content to PowerPoint 2010 160

 

Mail Merging Word 2010 Documents in Outlook 2010 . 161

 

Sharing Access 2010 Data with Other Applications . 162

 

Scheduling a Meeting from a Shared Document 163

 

13 Security in Office 2010 165

 

Understanding Security in Office 2010 . 165

 

Opening Files Safely 166

 

Working with Protected View . 168

 

 

 

Password Protecting a File . 169

 

Limiting File Changes 170

 

Setting Role-Based Permissions 171

 

Recovering Unsaved Versions . 172

 

Working with the Trust Center 173

 

14 Training Made Easy 177

 

Getting Help in Office 2010 177

 

Finding What You Need on Office Online 180

 

Take Your Learning to the Next Level with Microsoft eLearning 182

 

Continue Learning with Microsoft Press Books . 183

 

 

 


 

Acknowledgments

 

Writing a book is a fun and typically fast-paced process that involves the talents of many

individuals, and some projects involve more team members than others. First Look Microsoft

Office 2010 was a particularly exciting and challenging project because it involved working

with Office 2010 in its various stages of development—which is like writing about a moving

target—and coordinating ongoing feedback from the people on the front lines: Office 2010

product managers, reviewers, and content providers.

 

Thanks very much to everyone who has helped out along the way. Specifically, a big thanks

to Lynn Finnel, a great project manager and friend; Rosemary Caperton, an excellent project

editor with a green heart; Juliana Aldous, who provided help with hurdles and roadblocks;

Joanna Yuan and her crew (Stephanie Krieger and Beth Melton) who gave feedback and program

assistance; and Steve Sagman of Waypoint Press and editor Roger LeBlanc for the great

copy editing and fine layout of the book you are now viewing. And thanks, always, to my

agent, Claudette Moore, for doing everything she does so naturally and well in making these

projects possible.

 

 

 


 

Introduction

 

In this chapter:

 

n The Road to Office 2010

 

n Who Uses Office 2010?

 

n What’s in Office 2010?

 

n Office 2010 System Requirements

 

n What You’ll Find in First Look: Microsoft Office 2010

 

You’ve probably noticed that part of living and working in the world today requires that you

do many things at once. For many of us, managing multiple tasks is our normal work mode.

You prepare a new presentation for a client while you’re working collaboratively with your

team, corresponding with people through e-mail, and inserting Microsoft Office Excel data—

which might be changing moment to moment—into the slides you create.

 

And if you’re like many people, you’re multitasking when out of the office, too. You check

e-mail while you wait in line for your morning coffee, or you make a quick edit to finalize a

report when you’re waiting for your luggage at the airport. Or perhaps you set up a group

meeting with teammates on three continents and trade documents just moments before it

starts so that you’re all looking at the same plan.

 

Welcome to Office 2010. Whether you work primarily in the office or on the go, you’ll find

smart tools in this release that enable you to get your work done easier, faster, and more

professionally than ever. All the freedom to multitask built into Office 2010 has an upside you

might not expect: being able to work anywhere, anytime means more flexibility, which translates

to higher efficiency and effectiveness. And when your work is done quickly and well,

you have more time left over for the people, places, and possibilities that intrigue you.

 

The Road to Office 2010

 

Did you know that Microsoft Office celebrated its twenty-fifth birthday in 2009? Throughout

the last two and a half decades, Office has grown and improved dramatically—partially

thanks to developments in technology, but primarily thanks to you.

 

It’s no secret that Microsoft places great importance on customer feedback. Users all over the

world continually provide comments and suggestions through various channels. Microsoft

gathers information through extensive beta programs, market research, the help systems,

and discussion forums. Focus groups galore provide veritable mountains of data for researchers,

program developers, and communications people to sift through. All this feedback

 

 

 


 

comes together to provide current, relevant pictures that show which features you want and

need most in the programs you use every day. No matter where you fit on the scale ranging

from new user to power user, the new features in Office 2010 give you the option of becoming

more productive, more collaborative, and more mobile as you work.

 

We live in interesting times. There’s a major shift occurring in the way technology weaves

through all aspects of our lives. Limitations that seemed all but insurmountable a few years

ago—such as having your team divided among three different continents or needing to access

your data immediately when you’re away from your desk—are now gone for good.

 

Office 2010 makes it possible to work virtually anywhere—on the desktop, on the train, in

the carpool line, in the coffee shop—with almost anyone, on any continent. And no longer

are you tied to your desktop PC—now you can work on the go using Web-based and mobile

versions of your favorite Office applications.

 

The tools you use in Office 2010 on a daily basis are no longer just designed for creating

documents or spreadsheets—although the programs do help you accomplish those common

tasks, and with style. The Office 2010 applications also deliver features you’ve been asking

for—professional formats that are a breeze to apply, easy exchange of data among applications,

and streamlined techniques that enable you to get more out of the time you invest in

the documents and presentations you create.

 

Worldwide, Office users demonstrate that they want reliable, easy-to-use applications that

enable them to produce professional results, work collaboratively in both local and global

teams, and work anywhere from flexible locations limited only by Web access or phone

reach. These three ideas—express yourself, collaborate, and work anywhere—are the key visions

behind the changes in Office 2010.

 

Fast, professional, collaborative, flexible. You’re going to love this new release!

 

Who Uses Office 2010?

 

One fascinating result that emerges from Microsoft research is the picture of Office 2010 users.

Think of one of those amazing mosaic portraits, which—when you look closely—you

see is actually made up of thousands of tiny, individual photos. Office 2010 users represent

an amazing, diverse, multitalented global group that uses Office to accomplish just about

every possible productivity task you can imagine. Their needs and interests vary greatly,

and their use of the different Office applications runs the gamut from the very simple to the incredibly complex.

 

 

 


 

Note The dramatic redesign of the Office interface, introduced in Office 2007, was due in

part to a desire to help Office users discover a wider range of tools in their favorite programs.

Customer research had shown that most users worked with specific tools in the applications they

were familiar with, but a larger percentage of users weren’t getting the full benefit from the

programs they might have if they had been aware of the wider range of features and possibilities.

Data is showing that the redesign of Office really did reach this goal—Word 2007 and Excel

2007 users are now using four times as many features as they used in previous versions, and for

PowerPoint, the increase in feature use is a factor of five.

 

Today’s Office 2010 users often move back and forth among applications, depending on the

tasks they’re engaged in at any given time. Here are some typical scenarios:

 

n Meredith is a customer service representative in a large company. Her job includes

fast-paced communications: she receives and sends e-mail messages to dozens of customers,

prepares and sends proposals, updates Web information, and tracks campaign

results in the customer services database. Occasionally, Meredith gets to lead brainstorming

sessions for new campaigns (she loves that) and compiles the notes for the

team. Printouts of colorful SmartArt diagrams she created in Word and PowerPoint—as

well as her favorite “The Far Side” cartoon—are hanging on her cubicle walls. Her daily

tasks require a whole palette of applications: Outlook, Word, PowerPoint, Publisher,

OneNote, and occasionally, Access.

 

n Ian is a mid-level manager in the communications department of the same company.

As team leader, he is in charge of planning, budgeting, and managing all reports and

support materials that are developed to support the company’s product line. He uses

Outlook for scheduling and task management, and works with Word, Access, and Excel

for reviewing and working with important data. Ian’s team also prepares company reports

and public relations materials using Word and Publisher.

 

n Dominik is a marketing coordinator—she is responsible for messaging campaigns, running

budgets, hiring contractors, working with the board, maintaining a database,

conducting webinars, providing online training, and more. She uses Word, Excel,

PowerPoint, Outlook, Publisher, and Access, and she needs to be available for decisions

and updates continually. Because she manages a department of five, she uses Microsoft

SharePoint 2010 to keep the team organized and working efficiently.

 

n Kamil is an Office power user who has a long commute to and from his Washington

office each day. He has reduced the impact of his travel time by telecommuting two

days a week, but he also wants to be able to get a start on work—or wrap things up for

the day—when he’s on the train. Whether he’s working from home or he’s in the office,

he uses Outlook, Excel, Word, and SharePoint to run his department, keep the team on

track, host meetings, review and sign off on documents, and make the calls that impact

the bottom line in his department.

 

 

 

n Todd is the IT manager for the business. He is in charge of upgrading, deploying, and

training staff on Office 2010. He also secures and backs up all files, writes custom utilities

for the Web portal, and works in customer and staff support. He is a programmer

and power user of all Office applications, but he has a small staff, so he needs to be

able to offer training and support in a cost-effective and productive way.

 

Office 2010 includes a range of features that will support the daily activities of each of these

users. The consistent look and feel of the Ribbon helps ensure that users are comfortable and

confident working with any of the Office applications. Changes in each of the applications

make it easier to produce and share professional results in a variety of ways. And not only do

the Office applications work together smoothly as an integrated system, they provide easy

collaboration, anywhere access, and all the productivity tools users need as their work tasks

change and grow.

 

What’s in Microsoft Office 2010?

 

Similar to earlier releases, Microsoft Office 2010 is available in several versions, each designed

with a specific group of users in mind, and each accessible via PC, browser, or phone. Here’s

what you’ll find in each version of Microsoft Office 2010:

 

n Office Professional Plus 2010 is for the high-end user who collaborates with others,

manages data, and needs flexibility, mobility, and coauthoring capabilities. This

edition includes Word 2010, Excel 2010, PowerPoint 2010, OneNote 2010, Outlook

2010, Publisher 2010, Access 2010, SharePoint Workspace 2010, InfoPath 2010, and

Communicator 2010.

 

n Office Professional 2010 is designed for the business user who needs all the power

of the traditional applications as well as access to data management tools. This version

includes Word 2010, Excel 2010, PowerPoint 2010, OneNote 2010, Outlook 2010,

Publisher 2010, and Access 2010.

 

n Office Standard 2010 removes Access 2010 from the mix. It offers users who work

with documents, worksheets, marketing materials, presentations, notebooks, and—of

course—e-mail and schedules just what they need: Word 2010, Excel 2010, PowerPoint

2010, OneNote 2010, Outlook 2010, and Publisher 2010.

 

n Office Home and Business 2010 streamlines the suite to the basic applications used

by small business and home users: Word 2010, Excel 2010, PowerPoint 2010, OneNote

2010, and Outlook 2010.

 

 

 

n Office Home and Student 2010 is geared toward student and home users, offering

the traditional applications for creating documents, worksheets, presentations, and

workbooks: Word 2010, Excel 2010, PowerPoint 2010, and OneNote 2010.

 

n Office Professional Academic 2010 is designed for faculty members who

need access to all the core applications—Word 2010, Excel 2010, Outlook 2010,

PowerPoint 2010—as well as OneNote 2010, Access 2010, and Publisher 2010.

 

n Office Starter 2010 is for the beginning user who wants to work with only

Word 2010 and Excel 2010.

 

Office 2010 System Requirements

 

In keeping with green efforts to maximize efficiency on systems users already have, Office

2010 was designed for any system capable of running Office 2007. Here are the suggested

system requirements for Office 2010:

 

n Computer and processor: 500-MHz processor or higher.

 

n Memory: 256 MB (megabytes) of RAM or more.

 

n Hard disk space: 2 GB (gigabytes)*.

 

n Drive: CD-ROM or DVD drive.

 

n Display: 1024 by 768 or higher resolution monitor.

 

n Operating system: Windows XP SP3 (32-bit), Windows Vista SP1 (32-bit or 64-bit),

Windows 7 (32-bit or 64-bit), Windows Server 2003 R2 with SP2 (32-bit or 64-bit),

or Windows Server 2008 with SP1 (32-bit or 64-bit). Terminal Server and Windows on

Windows (WOW) are also supported. **

 

* Part of the used hard disk space can be released after installation is complete.

 

** WOW allows users to install 32-bit Office 2010 on 64-bit systems.

 

What You’ll Find in First Look: Microsoft Office 2010

 

I hope First Look: Microsoft Office 2010 inspires you and gives you a good sense of the exciting

features coming in the release of Office 2010. This book was written while the

software was in development, so you may find some variance in screen illustrations and

procedures, but the overall story is the same: The key to the new features is freedom and

flexibility—you’ll be able to see how to get more from your applications no matter how—or

where—you choose to use them. Office 2010 is designed to help you express your ideas

 

 

 

clearly and creatively, work seamlessly with a group to get things done efficiently and on

time, and access and work with your files virtually anywhere with a similar look and feel

whether you’re using your PC, browser, or phone. To showcase these key points, First Look:

Microsoft Office 2010 follows this organization:

 

n Part I, “Envision the Possibilities,” introduces you to the changes in Office 2010 and

shows you how you can make the most of the new features to fit the way you work

today. Chapter 1, “Welcome to Office 2010,” gives you a play-by-play introduction to

new features; Chapter 2, “Express Yourself Effectively and Efficiently,” details the great

feature enhancements and visual effects throughout the applications; and Chapter 3,

“Collaborate in the Office and Around the World,” explores the flexibility factor by presenting a set

of scenarios that enable users to complete their work no matter where their path takes

them.

 

n Part II, “Hit the Ground Running,” focuses on each of the Office 2010 applications in

turn, spotlighting the key new features and showing how they relate to the whole.

These chapters provide a how-to guide for many of the top features you’re likely to

use right off the bat, and they offer inspiring ideas on how to get the most from your favorite

applications.

 

n Part III, “Next Steps with Office 2010,” zooms up to the big picture and provides examples

to help you think through interoperability. How often do you use the various

Office applications together? Customer research shows that people often don’t

realize how well the applications work together as a complete system—which means

they might be laboring over items they could easily incorporate from somewhere else.

This part of the book provides examples for integrating the applications and explores

Office 2010 security and training opportunities, as well.

 

So if you’re ready, let’s take a closer look at the ways Office 2010 can help you express

your ideas, whether you work on your own or as a part of a team, and share your work

with the world.

 

 

 


 

Part I

 

Envision the Possibilities

 

Office 2010 ushers in a new era in productivity software by making the reliable tools

you’ve come to expect from Microsoft easier to use and more powerful than ever. In this

part of the book, you’ll get the big picture view of how Office 2010 improves the way you

work every day.

 

This part of First Look: Microsoft Office 2010 includes the following chapters:

 

n Chapter 1: Welcome to Office 2010

 

n Chapter 2: Express Yourself Effectively and Efficiently

 

n Chapter 3: Collaborate in the Office and Around the World

 

 

 


 

Chapter 1

 

Welcome to Office 2010

 

In this chapter:

 

n Features that Fit Your Work Style

 

n Changes in Office 2010

 

n Exploring the Ribbon

 

n New Backstage View

 

n Languages and Accessibility

 

n Coming Next

 

This is an exciting time to be working with technology. Changes are occurring with what feels

like ever-increasing speed. The world is growing continually smaller, and far-away places are

more and more within our reach. Today our coworkers are almost as likely to be working on

a different continent as they are to be down the hall. Opportunities are possible now that we

couldn’t envision a few years back—more of us are telecommuting, training by webinar, and

planning projects virtually, all of which is accomplished through Web and phone access to

the tools that make it all possible.

 

Office 2010 was designed with evolving workplace trends in mind. With Office 2010, you

can use familiar, reliable Office applications to work more efficiently, produce better-than-ever

results, collaborate in real time with peers in your office or around the world, and continue

your work from any point on the globe with Web or phone access. And even though

these are big changes, they fit easily into what you’re already doing. The tools you need to implement

these changes in your work efforts don’t have a steep learning curve. By adding

to the functionality of your favorite features (Print, Paste, and Picture Effects, to name a

few examples), Office 2010 helps you get more done with less effort. And the collaboration

and anywhere access features make working with anyone, anytime, a natural and intuitive

process.

 

Features that Fit Your Work Style

 

For many of us, our long workdays of focusing on single projects have evolved into days with

smaller blocks of time dedicated to one of many things we have going on. We are getting

more done than ever—and Office 2010 can help you better enjoy the process.

 

 

 


 

What’s exciting about Office 2010 is that it’s more than a set of powerful tools that help you

meet and manage the demands of your fast-paced workday. For example, if you do most of

your work at your desk, crunching numbers, answering e-mail, and preparing reports, Office

2010 helps you work faster, manage huge worksheets, design effective documents easily, and

present your work in new, visual, and flexible ways that help your diverse audience understand

your ideas.

 

If you work primarily in a team, you’ll find that Office 2010 makes collaboration easy with

features that enable you to share files, co-author documents, and even contact teammates in

real time.

 

If you work predominantly on the road—and frequently need to get updates on projects,

add items to the calendar, or approve new documents and strategies—Office 2010 gives you

the flexibility to use the Office applications you know and love regardless of whether you’re

logging in from your PC, your browser, or your phone.

 

In Part II, “Hit the Ground Running,” you get a closer look at the new features in each of your

favorite Office 2010 applications.

 

Office 2010 at a Glance

 

With Office 2010, you can

 

n Increase your productivity with more effective, reliable tools

 

n Express your ideas creatively and effectively, for multiple audiences

 

n Produce and share professional results easier and faster

 

n Communicate—and manage communications—easily whether you work independently,

collaboratively, or remotely

 

n Gain more freedom and flexibility to work anywhere, with anyone

 

n Enjoy the consistent and high-quality Office experience from your PC, browser, or phone

 

Changes in Office 2010

 

This section presents a look at the key ideas behind the development of the features you will

find in Office 2010. Working independently or in a group, at your PC or on the road, you’ll

find new tools and techniques that help you create great-looking documents, worksheets,

presentations, and more, and enable you to share your work easily with others.

 

 

 


 

Tip One significant change that has a large impact on processing speed and power is that

Office 2010 is now available in a 64-bit version. This expanded capacity really shines in Excel,

where enormous spreadsheets require that kind of processing power.

 

Let Your Ideas Soar

 

Office 2010 shows that powerful programs don’t have to be difficult to use. Program designers

know that users today need a great variety of powerful, flexible tools, and that it’s important

that those tools and features be easy to find and use. For this reason, you’ll find quick

access to style galleries, themes, and more that help you select professional designs, choose

from color schemes that work, and create a professional look whether you’re creating documents,

worksheets, presentations, notebooks, or database tables.

 

To help you take your ideas to the next level, Office 2010 offers artistic effects and picture

editing, video editing in PowerPoint, new data visualizations (including sparklines and slicers)

in Excel, and the ability to manipulate fonts professionally in Word. And this is just the beginning—

there’s much more, as you’ll see in the chapters in Part II, “Hit the Ground Running.”

 

And not only will your output be better, but the whole document creation process is easier,

thanks to enhanced search features, simplified navigation, the contextual spell checker, translation

tools, and more.

 

Collaborate Easily and Naturally

 

Unless you’ve been living off the grid for the last couple of years, you’ve probably noticed

that the world has gotten substantially smaller, thanks to the continuing expansion of Web

technologies. Blogs, social media, and new online publishing alternatives have steeped most

of us in a culture that is always on, always connected, and always talking.

 

An increasing number of people are now working in teams, and those teams might be spread

throughout the office or located around the world. A writer in Omaha could be working with

a software developer in India who might have been hired by an administrator in Scotland.

This geographical diversity within a project team is no longer an unusual occurrence—an

increasing number of Office 2010 users need to collaborate with peers and clients all over the globe.

 

Office 2010 includes powerful tools to facilitate easy and successful team collaboration

and management. Co-authoring features in Word 2010, Excel 2010, PowerPoint 2010,

and OneNote 2010 enable you to work with a variety of teammates on a single project in

real time. And, when you use these features, your changes are automatically tracked and coordinated.

 

 

 

Microsoft SharePoint Workspace 2010, included with Microsoft Office Professional Plus, enables

users to move files online and offline easily. Team leaders and members use

SharePoint Workspace to create and update the team calendar, conduct project management,

assign tasks, create document libraries, and more. Team members can collaborate in

real time, and their documents show who is working on what so that duplication of effort

or trading outdated versions of files is no longer a problem when several users work on the

same document.

 

Presence information is available with Office Communicator throughout Office 2010, enabling

you to see which of your teammates are online and communicate instantly—via

instant messaging, e-mail, or phone—to clarify questions on the project. You don’t have to

leave the application you are working in to ask questions you need answered right away.

 

Work Anywhere—and Everywhere

 

Laptop, notebook, desktop, kiosk—any place that gives you an on-ramp to the Internet is

a potential workplace in Office 2010. Office 2010 Web Apps let you work with the familiar

Office 2010 interface and work with your Word 2010, Excel 2010, PowerPoint 2010, and

OneNote 2010 files. You can share files with other users by using Windows Live or SharePoint

Workspace 2010 and then open and work with the files on your PC when you get back to the

office.

 

If you are a gadget lover, you might already have a smartphone with all the bells and whistles

you can get. Office Mobile 2010 gives you another way to work on the go, using your

Windows Mobile smartphone. You can write up an idea before breakfast, create a new document,

and share it with the team—all before you get into work in the morning. Later, on the

way to meet a vendor, you can add a few more details, insert a picture, and send the file for

review—all from your phone.

 

Office Mobile works with Word, Excel, Outlook, PowerPoint, and SharePoint Workspace. The

application windows have been customized to fit the small phone screen and browser so that

you can find what you need easily and enjoy the familiarity of the Office 2010 interface.

 

This flexibility in Office 2010 gives you the freedom to follow through on your creative ideas

in real time—whenever and wherever they occur.

 

 

 

Exploring the Ribbon

 

At the top of the interface in all Office 2010 applications, the Ribbon brings you all the tools

you need—and only the tools you need—to complete specific tasks in the various Office

2010 applications. The Ribbon includes tabs that reflect the various tasks you perform within

each of the applications, and each tab contains tool groups offering the tools you need as

you work with the files you create. Every application has the same look and feel, which enables

you to learn the Ribbon once in your favorite or most often used Office program and

then easily find your way around any other Office program. The Ribbon was introduced in

Office 2007 and has been improved in Office 2010 to include some new tools and provide

more flexibility. You can use the Minimize The Ribbon button to hide the Ribbon so that you

have more room to work on-screen, and you can customize the Ribbon to create your own

tabs and tool groups specific to your needs.

 

The simple design of the Ribbon enables you to find the tools you need in the tab that reflects

the task you want to perform. When you want to add a picture to your annual report,

for example, you look in the Insert tab and find Picture in the Illustrations group. Figure 1-1

introduces you to the Ribbon in Word 2010 and Excel 2010.

 

(Figure 1-1 callouts: Quick Access Toolbar, tabs, groups, Microsoft Office Button, dialog box launcher)

 

FIGURE 1-1 Exploring the Office 2010 Ribbon.

 

 

 

A Quick Look at the Ribbon

 

The Ribbon simplifies the way you find and work with tools and options in Office. With a

simple, easy-to-understand layout for your commands, the Ribbon helps you find the tools

you need:

 

n Ribbon Tabs Each tab provides a set of tools related to an overall task you are likely

to be performing in a specific application. In Figure 1-1, the Word 2010 tabs are File,

Home, Insert, Page Layout, References, Mailings, Review, and View; the Excel tabs are

File, Insert, Page Layout, Formulas, Data, Review, and View. The File tab takes you to

Microsoft Office Backstage view, which gives you a central place to work with the files

you create in Office 2010 applications.

 

n Ribbon Groups Within each tab are groups that help organize common commands to

help you quickly find what you need for a specific task. For example, on the Insert tab

in Word 2010, you’ll find Picture, Clip Art, Shapes, SmartArt, Chart, and Screenshot in

the Illustrations group.

 

n Galleries A down-arrow appears to the right of some options in groups. Clicking the

down-arrow displays a gallery of options you can select or a list of additional choices.

(See Figure 1-2.)

 

 

 

FIGURE 1-2 Galleries display visual examples of options.

 

 

 

Contextual Tabs

 

In addition to the tabs, groups, and tools shown in the Ribbon during normal use, contextual

tabs appear when you perform specific actions in a file. The fact that they appear only when

you need them is part of the beauty of the Office 2010 interface—this keeps the number

of commands on-screen at any one time at a minimum and easy to navigate through. For

example, when you click a photo in a Word document, the Picture Tools contextual tab appears,

providing options related to picture editing. (See Figure 1-3.)

 

 

 

FIGURE 1-3 The contextual tab provides options related to the task you are performing.

 

New Backstage View

 

One of the major improvements in Office 2010 is Microsoft Office Backstage view, a kind of

one-stop shop for all tasks related to managing the files you create in Office 2010 applications.

The round and colorful Microsoft Office Button in Office 2007 has been replaced by

the File tab. When you click it, you are taken to a screen outside the document where you

can manage file information and save, share, print, protect, and work with version information

for the document. (See Figure 1-4.)

 

 

 

 

 

FIGURE 1-4 Backstage view helps you prepare, manage, and share the files you create.

 

Backstage view is organized in three panels. The left panel includes the commands you’ll use

to work with the files you create. The center panel offers related options, and the third panel

displays a preview image of the selection or additional options. For example, when you click

Print, the center panel shows print options, and the right panel displays a preview of your

document as it will appear in print. This streamlines the print process so that you can preview

and print your document in one step.

 

Managing Files in Backstage View

 

In Backstage view, you’ll find the commands you traditionally found on the File menu: New,

Open, Recent, Close, Save, Save As, Print, and Exit. In addition to each of these basic file-management

commands, you’ll find Share, which enables you to share the file in a variety of

ways. You can share your desktop directly from Backstage view by using Communicator integration,

sending the file by e-mail or fax, or saving it to a SharePoint Workspace or to a blog.

 

Tip Another great option Backstage view offers is that you can customize it to include your own

workflows and procedures.

 

 

 

Backstage view is designed to give you access to important tools users sometimes forget to

use. For example, you can run the Document Inspector by clicking Check For Issues in the

Info panel of the Backstage view and clicking Inspect Document.
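
Note The Document Inspector is a GUI feature, but Word also exposes a related method, RemoveDocumentInformation, through its COM object model. The short Python sketch below is not part of this book's walkthrough; it assumes the pywin32 package is installed, uses a hypothetical file path, and removes metadata directly rather than reporting it first, so try it on a copy of a document.

# Hypothetical sketch: stripping document metadata through Word's COM
# automation interface (requires the pywin32 package).
from win32com.client import gencache, constants

word = gencache.EnsureDispatch("Word.Application")  # start Word (early-bound)
word.Visible = False                                # keep the window hidden

doc = word.Documents.Open(r"C:\Reports\AnnualReport.docx")  # placeholder path
# Roughly the scripted counterpart of Backstage view > Info > Check For
# Issues > Inspect Document, followed by removing everything the inspector
# would flag: comments, document properties, personal information, and so on.
doc.RemoveDocumentInformation(constants.wdRDIAll)

doc.Save()
doc.Close(False)
word.Quit()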

 

Streamlined Printing

 

Another great time-saving feature in Backstage view is the new Print process. Microsoft research

shows that more than 60 percent of Office users print more than 60 times per month.

That’s a lot of time clicking Print options! Now in Office 2010, Print Preview has been combined

with Print so that instead of working through multiple dialog boxes, you get a one-page

view of how the file will look in print. (See Figure 1-5.) You can choose your options

right on the screen and click Print—and you’re done.

 

 

 

FIGURE 1-5 Now you can preview and print in one smooth process.
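
Note For readers who also drive Office from scripts, the same one-step print action has a scripted counterpart. The following Python sketch is not from this book; it assumes the pywin32 package, and the file path and printer name are placeholders.

# Hypothetical sketch: printing a Word document through COM automation
# (requires the pywin32 package). Path and printer name are placeholders.
from win32com.client import gencache

word = gencache.EnsureDispatch("Word.Application")
word.Visible = False

doc = word.Documents.Open(r"C:\Reports\StatusReport.docx")  # placeholder path
# word.ActivePrinter = "Office Laser"   # optional: pick a specific printer
doc.PrintOut(Background=False)          # wait until the job is spooled
doc.Close(False)                        # close without saving changes
word.Quit()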

 

Languages and Accessibility

 

In keeping up with the reach of the global workforce, Microsoft Office 2010 has more robust

language tools than ever, including a choice of translation tools. According to Microsoft research

data, more than 1.6 million words have been translated using the Microsoft Language

tools—plus more than 6.2 million words online—into more than 100 languages. Office

 

 

 

Online, which provides all kinds of content for Office users—including tips and tricks, how-to

articles, and video tutorials—provides content in more than 90 languages.

 

You’ll find the Translate tool, which enables you to translate words or phrases, in the Review

tab of Word 2010, Excel 2010, PowerPoint 2010, and OneNote 2010. Or you can use the Mini-

Translator tool for on-the-spot translations. (See Figure 1-6.) The Editing Language feature

enables you to choose the language used by the dictionary for the proofing tools you select.

 

 

 

FIGURE 1-6 Translate words or phrases on the fly, and choose the language dictionary you want to use.

 

In terms of extending the reach of applications for users who are differently abled, Microsoft

works with more than 175 partners to create assistive software that enables Office to provide

screen readers, high-visibility color schemes, and special keyboards.

 

Coming Next

 

This chapter painted the big picture of Office 2010 changes, introducing you to the overall

design goals in this release. It also summarized changes to the Ribbon, introduced you to

Backstage view, and discussed enhancements to print and language functions. The next

chapter takes a closer look at the Office 2010 features that will help you express your ideas in

a lively and effective way—through words, images, numbers, and more.

 

 

 


 

Chapter 2

 

Express Yourself Effectively

and Efficiently

 

In this chapter:

 

n Understanding Your Audience

 

n How Visuals Help

 

n Adding Text Effects

 

n Adding Artistry to Your Images

 

n Working Font Magic in Word 2010 and Publisher 2010

 

n Creating Data Visualizations in Excel 2010

 

n Editing Video in PowerPoint 2010

 

n Communicating Visually in Access 2010

 

n Enhancing and Streamlining Communications in Outlook 2010

 

n Coming Next

 

People use Microsoft Office to do many, many things. Depending on the nature of your

work, you might create documents, design worksheets, prepare reports, develop and manage

databases, create and give presentations, e-mail clients and coworkers, gather information,

analyze information, and share information. And that’s just Monday!

 

Chances are that many tasks you do in Office 2010 involve communicating ideas to others.

Those others might be peers, clients, board members, prospective customers, students,

and more. You need to be able to create, prepare, and share worksheets, charts, reports, databases,

Web pages, e-mail messages, and brochures that other people can view and

understand. And of course, once they understand what you’re sharing, you want them to

give you the response you’re hoping for—whether that’s a new contract, an important sale,

startup funding, or accolades for a job well done.

 

Office 2010 includes a variety of new features and tools that can help you communicate your

ideas clearly, visually, and in ways your readers will understand. Framing both what you want

to show and tell is important and can help you make sure your points hit their mark. This

chapter gives you a tour through the various new features that will help you showcase your

 

 

 


 

thoughts more creatively than ever, bringing more visual energy to the items you produce.

Specifically, this chapter introduces you to features that help you do the following:

 

n Improve the formatting of your Word 2010 text by adding special artistic effects such

as shadows, reflections, glows, and more.

 

n Take advantage of the professional typography capabilities available in many OpenType

fonts, such as ligatures and stylistic sets.

n Edit pictures within your document, worksheet,

brochure, or presentation by using the image-editing features in the various Office

2010 applications.

 

n Edit videos within PowerPoint 2010, customizing the length, formatting, and effects to

meet your needs.

 

n Create data visualizations that help your customers grasp the trends in your Excel 2010

data.

 

Understanding Your Audience

 

Beginning with the end in mind is a good approach for just about any document, worksheet,

presentation, notebook, and database you create in Office 2010. When you first begin a new

file, consider your answer to the following questions:

 

n Who will be reading or using this file?

 

n What will they expect to see?

 

n Do you have photos that support the points you’re making in the document or

worksheet?

 

n Will you use special text effects to call attention to key points, headlines, or labels in

Word 2010?

 

n Will charts, diagrams, or screen shots help your audience understand what you want

them to do?

 

n How can you help your audience understand the ideas you’re putting in front of

them—and what do you hope they’ll do as a result? Do you want them to fill out a

form, subscribe to your magazine, purchase your product, or understand how much

your department has accomplished this year? Knowing what kind of response you want

from your communication—before you even begin—will help create the framework of

the whole process for you. And the visual elements you add along the way will be more

likely to take you closer to that goal.

 

 

 


 

How Visuals Help

 

Not too long ago, most business reports weren’t very exciting. They might have had a cover

page, a column or two, and maybe a page border. The title might have been in a larger font

(Times Roman, most likely), and the body text was probably your basic 10-point or 12-point

standard font. But it was just business communication, right? Better to focus on the facts and

leave the fancy stuff to the marketing brochures.

 

The affordability of color laser printers and the ability to design attention-getting materials

on the desktop have changed all that. Today we recognize that no matter what we produce,

our materials are competing for readers’ attention. And research shows us that documents,

presentations, notebooks, and worksheets that are clear, easy to read, and include visual cues

that help lead our eyes to the most important points capture our attention and reinforce key

concepts in the document or presentation.

 

By adding special artistic effects to text; formatting headlines, captions, and tables in an inviting

way; and thinking through the way you use pictures, charts, diagrams, and more, you

can dramatically increase the power of your message and make sure your readers get the

point.

 

Benefits of Pictures in Communications

 

The images you add to your documents, worksheets, presentations, notebooks, and

e-mail messages serve several purposes. They not only add visual interest and give your

readers’ eyes a rest, they are also known to provide the following real communication

benefits:

 

n Pictures linked to written text increase attention and help recall.

 

n Pictures help improve your readers’ comprehension.

 

n Readers’ emotional response to pictures can help or hinder communication.

 

n Readers with lower literacy skills show improved comprehension when pictures

are used in documents.

 

n Captions help readers make the connection between pictures and text.

 

n Pictures showing outcomes, actions, or processes can help readers know what

to do after reading a document.

 

 

 

Adding Text Effects

 

What are your favorite formatting features in the different Office 2010 applications? Most

people use boldface text to make sure headings, row and column labels, and table headings

stand out. You might also use the styles in Word or Excel to apply the look you want to the

different elements in your file, make changes to font size or color, and occasionally use more

specialized text controls such as small caps and strikethrough.

 

Word 2010 includes a number of easy-to-apply text effects that help you add special artistic

touches to the text in your documents. Now in addition to using 3-D effects, you can add

glows, bevels, shadows, reflections, and outlines. (See Figure 2-1.) These text effects apply

directly

to your text and can be included in styles you create. And they act like traditional

text when you check spelling or edit your document.

 

Tip Did you know Paste is one of the most frequently used tools in all of Office 2010? You

now have more control when you copy and paste items in your files; the Paste Options gallery

enables you to preview the changes before you paste. Chapter 4, “Create and Share Compelling

Documents with Word 2010,” includes details on using Paste with Live Preview.

 

 

 

FIGURE 2-1 New text effects enable you to add artistic touches to your text.

 

 

 

Adding Artistry to Your Images

 

Great photos can be more than just nice-looking images when you apply special artistic effects

to the pictures you use in your files. Now Office 2010—specifically, Word 2010, Excel

2010, Outlook 2010, and PowerPoint 2010—includes a palette of artistic filters you can apply

to images in your documents. (See Figure 2-2.)

 

 

 

FIGURE 2-2 Choosing artistic effects.

 

You can choose from a variety of effects that apply different filters to the selected image,

including chalk, watercolor, sponge, rain, and more. You can also use the new Remove

Background feature to remove an image from the foreground and place it on a different

background. This feature is great for product catalogs, or for any image in which you need

to spotlight a particular element without showing a background that might detract from the

central element you hope will catch the reader’s eye.

 

Figure 2-3 shows an image with various artistic effects applied. As you can see, each image

conveys a different feeling, which means it’s communicating a different idea with each effect.

 

 

 

 

 

FIGURE 2-3 With artistic effects, one photo can be used to communicate several different ideas.

 

When will you use artistic effects in the files you create? Here are just a few ideas:

 

• Use an artistic photo or treatment of your company logo to show that this year’s annual report demonstrates innovation and creativity.

• Create an effect viewers will remember by choosing not to show a product or place in a realistic way.

• Capture your readers’ attention and communicate something new by modifying an existing photo they will recognize.

 

Correcting and Recoloring Pictures

 

Of course, not all images you take on your digital camera or phone are ready to use just as

you’ve captured them. The lighting might be wrong in that product photo; the person might

be just slightly out of focus; the range of contrast in the picture of the new building might be

too great to show up well in print.

 

Although the Picture tools in Office 2007 went a long way toward giving you control over

the images you add to your files, they were limited in the range of changes they allowed.

You could adjust the picture by changing the contrast and brightness, recoloring the image,

applying styles and effects, or arranging the picture on the page. The Corrections tools in

Office 2010 give you a customizable palette of choices for brightness and contrast, and they

allow you to set your own standards for sharpening or softening images. This means you

can now insert and edit photos as you work in Word 2010, Excel 2010, PowerPoint 2010, and

Outlook 2010 without ever leaving the application.

 

Cropping also has received a makeover in Office 2010. Now when you choose the Crop tool,

the entire image is displayed in shadow behind your crop marks; you can then use the cropping

tool to zoom in on the part of the image you want by resizing the image, panning to

the area you want to capture, and cropping out the rest. The display makes it easy for you to

select only the part of the photo that you want people to see.

 

 

 

Leaving the Background Behind

 

One of the great new artistic tools in Office 2010 is Remove Background, which lets you

pull the object of a photo from its background. This is a great technique when you’re

preparing product information, introducing a new employee, or creating materials to

spotlight a key element you don’t want your readers to miss.

 

To grab an image in the foreground and remove it from the background of your photo,

use the new Background Removal tool on the Picture Tools tab of Word 2010,

Excel 2010, PowerPoint 2010, and Outlook 2010. Here’s how:

 

1. Select the photo, click the Picture Tools Format tab, and click Background

Removal.

 

2. Drag the bounding box to include the areas of the image you want to display.

The magenta areas are those that will be removed.

 

3. Use the Remove Background tool in the upper left area of the Ribbon to mark

areas of the image you want to keep or remove.

 

4. Click Close Background Removal to complete the task.

 

 

 

Pretty neat, eh? Experiment with this feature to discover ways you can point your readers’

attention to just what you want them to see in your images.

 

The new Color tools also have had a major overhaul: now you can choose from a wide

range of color management tools and make choices for saturation, tone, and an expanded

selection of color wash effects. (See Figure 2-4.)

 

 

 

 

 

FIGURE 2-4 The Color tools in Office 2010 allow more choices for saturation, tone, and recoloring.

 

Color Effects Defined

 

If you’re new to the whole landscape of picture editing, you might be wondering what

the new features in the Color tools of Office 2010 enable you to do. Here’s a quick introduction

to the phrases and what they mean:

 

• Color saturation controls the amount of color used in the picture—in other words, how saturated the image is with the colors represented. A picture with a Saturation of 0 percent is a black-and-white image; a picture with a Saturation of 400 percent is flooded with color. (Use this option only for special effects.)

 

• Color tone enables you to choose the overall temperature for the image. The underlying tone for the image ranges on a scale from “cool” blues (4700 K) to “hot” oranges (11,200 K). Experiment with an image to test the range of tones, and choose the look that best fits the overall design of your file.

 

 

 

• Recoloring applies a color filter to the image, making it monochrome (in blue, red, green, or purple, for example) and creating a special artistic look that can fit the color scheme in the file you’re preparing.

 

Working Font Magic in Word 2010 and Publisher 2010

 

Another new feature in Office 2010 that adds a touch of visual sophistication to the files you

create is support for OpenType typography. OpenType fonts are a type of scalable font developed

by Microsoft and Adobe to provide an expressive font format that enables software

users to create files reflecting an increasingly diverse range of languages. Word 2010 and

Publisher 2010 now support the typography features found in some OpenType fonts, such as

working with ligatures and stylistic sets. Word 2010 and Publisher 2010 also include support

for Number Forms and Spacing. (The feature is called Number Styles in Publisher.)

 

A ligature is a character in typography that consists of two or more connecting letters; for

example, the letters fi are often set as a ligature. Ligatures were originally invented (back in

the dark ages when typesetters cast type in lead before inking them and printing pages) to

save space and reduce typesetting effort.

 

A stylistic set is a font displayed with a specific set of characteristics, enabling you to get

a subtly different look and feel for selected text even though you’re using the same font

throughout a document. Gabriola, a new font in Office 2010, offers a variety of stylistic

sets you can try in your documents. Different stylistic sets might give you a whole range of choices for that particular font, including whether you want to display serifs or not, how characters with extenders are displayed, and much more.

 

To see your typography choices in Word 2010, click the dialog launcher in the Font group on

the Home tab, and then click the Advanced tab. The Ligatures setting enables you to choose

how you want the ligatures to be applied when they are available, and the Stylistic Sets

choice offers a list of available sets you can select for the current font. Figure 2-5 shows several

different stylistic sets selected for a headline in the Gabriola font. Take a close look at the

length and shape of the extenders on the letters h, k, and p as well as the spacing between

the characters to see the difference.

 

 

 

 

 

FIGURE 2-5 Fine-tuning fonts in Word 2010.

 

Ligatures and stylistic sets work similarly in Publisher 2010. Here you can choose the typographical

controls from the Typography group in the Text Box Tools Format contextual

tab. (See Figure 2-6.)

 

 

 

FIGURE 2-6 Choosing a stylistic set in Publisher 2010.

 

 

 

Tip Publisher 2010 also includes a number of specialized font options, including stylistic alternatives,

and specialized number styles. You’ll learn more about the steps involved in working with

fonts in Publisher 2010 in Chapter 10, “Create Effective Marketing Materials with Publisher 2010.”

 

Creating Data Visualizations in Excel 2010

 

If you work with numbers all day long, you’re probably comfortable with a language many

other people struggle to understand. Equations and trend lines make perfect sense to you;

business intelligence is part of your language; numbers tell you what you need to know

about product status, market saturation, and potential return on investment. You build your

documents and presentations around these numbers.

 

But wait a minute! Could you please translate that for the rest of us?

 

Excel 2010 now includes simple but effective tools that will enable even the most advanced

numbers people to show the rest of us what the numbers mean in a language we can understand.

Take sparklines, for example. Newly added sparklines are small graphical representations

of data on your worksheets—small charts that can depict a trend and visually convey to

your audience what the values actually mean. Sparklines can show, for example, an increase

in enrollment for new webinars, a spike in sales related to a recent event, or a fall-off in hard

goods purchasing.

 

When you use sparklines to illustrate the data in your worksheet, you help those for whom

numbers might be a foreign language stop struggling with what it all means and enable

them to clearly understand your point. Figure 2-7 shows simple sparklines added to a column

on a worksheet to illustrate the data trend reflected in the displayed row.

 

The conditional formatting features in Excel 2010 have been improved and expanded with

new icon sets, data bars that are capable of showing negative values, and proportional displays

in data bar sets. You can also control the formatting of data bars to get just the right

effect in the worksheets and documents you prepare.

 

Tip For more detail on using sparklines and making conditional formatting improvements to

your Excel worksheet, see Chapter 5, “Create Smart Data Insights with Excel 2010.”

 

 

 

 

 

FIGURE 2-7 Sparklines give you data snapshots in Excel 2010.
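If you build workbooks programmatically, the same sparkline and data bar features described above are exposed through Excel’s COM object model. The following is only a minimal sketch, not part of the walkthrough in this chapter; it assumes a Windows machine with Excel 2010 or later plus the pywin32 package, and the sample figures and ranges are invented for illustration.

    # Minimal sketch: add a line sparkline and data bars via COM automation.
    # Assumes Windows, Excel 2010+, and pywin32 (pip install pywin32).
    import win32com.client
    from win32com.client import constants

    excel = win32com.client.gencache.EnsureDispatch("Excel.Application")
    excel.Visible = True

    wb = excel.Workbooks.Add()
    ws = wb.Worksheets(1)

    # Sample data: three months of figures per row (made up for the example).
    ws.Range("A1:E1").Value = (("Region", "Jan", "Feb", "Mar", "Trend"),)
    ws.Range("A2:D2").Value = (("North", 120, 135, 150),)
    ws.Range("A3:D3").Value = (("South", 90, 80, 110),)

    # A sparkline in the Trend column summarizes each row at a glance.
    ws.Range("E2").SparklineGroups.Add(constants.xlSparkLine, "B2:D2")
    ws.Range("E3").SparklineGroups.Add(constants.xlSparkLine, "B3:D3")

    # Data bars (conditional formatting) make the March values easy to compare.
    ws.Range("D2:D3").FormatConditions.AddDatabar()

The workbook is left open so that you can see the result; saving the file or adjusting the sparkline and data bar settings works through the same object model.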

 

Editing Video in PowerPoint 2010

 

It’s no secret that seeing how something is done in video format is a simple way to learn a

new technique, whether you’re changing the oil in your car, learning how to plant a rosebush,

or designing a new brochure in Publisher 2010. A video clip enables you to share with others

the “how to” as well as the “why” because you can explain the reasons for the action while

you’re demonstrating the technique for those viewing the clip.

 

Are you ready to let your creativity out of the box? Take a look at the new video capabilities

in PowerPoint 2010. Now video you insert is embedded in the presentation by default, which means that

you no longer have to carry all your media files along whenever you copy, move, or share a

presentation. Because the video is embedded, you can edit the video directly in PowerPoint

without using any other video-editing software.

 

The video-editing features in PowerPoint 2010 enable you to shorten long video segments,

apply fade-in and fade-out settings, add bookmarks to help you quickly access important

points in the video or even trigger animation from key points in your video. Figure 2-8 shows

you several of the available video-editing capabilities in PowerPoint 2010.

 

Tip If you have online video you’d like to use in your PowerPoint 2010 presentation, you can easily

embed the code from the online video site right in your PowerPoint slide. To find out more about

how to do this, see Chapter 7, “Produce Dynamic Presentations with PowerPoint 2010.”

 

 

 

 

 

FIGURE 2-8 PowerPoint 2010 includes powerful video-editing and formatting tools that enable you to modify

video without leaving the program.

 

Communicating Visually in Access 2010

 

Maybe it’s all about the data for you. You design elegant databases; you create new data tables

and forms. You know how to put together such a sophisticated query that it leaves

others in awe. When you need to communicate your ideas to others who aren’t as comfortable

with data as you are, chances are that you know what happens when other people’s

eyes glaze over. They are no longer listening, which means they aren’t following what you’re

saying. How can you make sure they understand what your data is saying in a way that

makes sense? Access 2010 data visualizations can help.

 

Conditional formatting in Access 2010 now supports data bars, which enable you to depict

data visually so that your audience can understand your ideas. See Figure 2-9. In addition

to the traditional data bars, now you can set negative values for data bar display as well.

Improved tools in Access 2010 enable you to apply a greater range of conditional formatting

to the tables and reports you create. Here are other ways you can add visual elements to

Access 2010:

 

• Include data bars, icons for minimum and maximum values, and more.

• Display an image on the background of your reports by using the Background Image feature in the Report Design Tools Format tab.

 

 

 

FIGURE 2-9 Access 2010 lets you add data bars that help you compare data values at a glance.

 

Enhancing and Streamlining Communications in

Outlook 2010

 

One of the big changes you’ll notice right away in Outlook 2010 is that the Ribbon replaces

the menu and toolbars at the top of the main Outlook window. (See Figure 2-10.) Now all the

tools you need—including the new Quick Steps—are within easy reach and you can find the

right commands when you need them.

 

All the formatting capabilities—SmartArt graphics, styles, and Office themes—are available

for the e-mail messages you create. This means that even though the messages you create

compete for attention with the hundreds of messages your recipients receive, you can take

steps to make sure your messages are as compelling and inviting as possible.

 

Depending on the nature of the work you do, the feature that enables you to include screen

shots in the messages you send can be a big help. (See Figure 2-11.) If you help customers

find products online, support your staff through technical training, or assist users as they try to find specific items on your Web site, including a screen shot can show readers what you’re

talking about and help them understand an important process.

 

Tip Setting up Outlook 2010 to send items to your mobile phone is as simple as clicking the File

tab to display Backstage view, choosing Options, and clicking Mobile. If you have an SMS Service

Provider, you can send calendar items, reminders, and messages to your mobile phone by simply

choosing the options you want to set and clicking OK. If you don’t yet have an SMS Service

Provider, you can find one on Office.com.

 

 

 

FIGURE 2-10 The main window of Outlook 2010 now sports the Ribbon, offering the tools you need within

easy reach.

 

Not only can you make your e-mail messages look more attractive and inviting for those

you contact, but Outlook 2010 gives you a number of ways to keep up with the mountain of

messages you receive every day. Here are a few of the key features that help you manage the

volume of mail you receive:

 

• Work faster with Quick Steps. The addition of Quick Steps enables you to carry out routine tasks with a quick click of the mouse. Now you can move a message to a specific folder, reply to a meeting, or send a message to your team with one little click. Simple.

 

 

 

• Conversation View enables you to view the most relevant threads of a conversation and suppress redundant threads. Having the ability to remove redundant messages in the conversation saves Inbox space and helps you manage the volume of e-mail you receive.

 

• Easily manage and clean up threads and move on to more pressing tasks without getting bogged down in messages that don’t relate to the task at hand.

 

• Easily manage your multiple e-mail accounts, whether you want to combine home and work or any one of a number of Web-based e-mail accounts. Now you can bring them all together in one place with Outlook 2010.

 

 

 

FIGURE 2-11 With Outlook 2010, you can easily add and enhance pictures, screen shots, and more.

 

Coming Next

 

In this chapter, you learned about the range of features in Office 2010 that enable you to add

visual effects and enhancements to the ideas you share with others. You explored new and

improved features in Word 2010, Excel 2010, PowerPoint 2010, Outlook 2010, Access 2010,

and Publisher 2010. The next chapter gives you this same kind of big-picture view of the collaboration

features you’ll use as you work with teams in your office or around the world.

 

 

 


 

Chapter 3

 

Collaborate in the Office and

Around the World

 

In this chapter:

 

• It’s All About the Teamwork

• Benefits of Office 2010 Collaboration

• Stay in Touch with Your Team

• Co-Author Files Across Applications

• Use the Presence Icon to See Author Availability

• Using the Office Web Apps

• Sharing on the Road with Office Mobile

 

Depending on the type of work you do, you might work on your own most of the time,

or you might work as part of a larger group—or perhaps, several groups. Your groups might

come together to complete a specific project—for example, to produce a new marketing

plan—or they might work together over the long term, completing multiple projects.

Whether your need for collaboration is short term or ongoing, you’ll find features in

Office 2010 that enable you to

 

• Work with others simultaneously on the same file.

• Connect instantly with others working on your document or presentation.

• Know when others are actively working in the document you are viewing.

• Collaborate in real time with others working from their PCs, browsers, or phones.

 

It’s All About the Teamwork

 

As the world grows smaller and more accessible, our teams expand and grow. Today it isn’t

unusual to have someone from another continent on your team—whether the home office

for your company is in the U.S., the U.K., India, or another Web-enabled place on the globe.

In a team context, we plan projects, assign tasks, share documents, work collaboratively on

files, resolve problems, and ultimately produce materials that further the missions of our

companies, reach customers, and accomplish the tasks we’re asked to complete.

 

 

 


 

What Teams Look Like Today

 

So how many of us are working in groups today? The exhaustive research data Microsoft

compiled shows that 52 percent of those surveyed currently work on two or more project-based teams. Others responded that most of their work is done independently, but they collaborate with others occasionally.

 

How do teams work collaboratively?

 

• Surprisingly, many workers collaborate by e-mail, sending versions of documents back and forth. This isn’t the most accurate or most secure method of working collaboratively because file versions can be misplaced and a sensitive file might be vulnerable to interception unless it’s encrypted or sent only on the company intranet.

• Others collaborate via instant messaging (IM), discussing projects, making decisions, and planning next steps.

• Still others use social networking tools to discuss items in a group, set up an event, and more.

• Some businesses encourage the use of an employee portal or provide collaboration software to facilitate teamwork.

• Other ways to collaborate include face-to-face meetings (of course) as well as shared blogs, wikis, SharePoint sites, and more.

 

Office 2010 makes collaborating a natural and intuitive process by bringing collaboration

tools into all the major applications. You can share your team document on SharePoint, share

OneNote notebooks, create Word and PowerPoint projects, and ask fellow authors questions

while you work.

 

Team Tasks and Methods

 

In most workplace environments, teams are made up of people who are brought together by

role, not by choice. This means you might be part of a group that is an interesting mix of personalities

and talents. No matter how similar or how different team members might be, your

team will need to accomplish two primary goals to be productive and deliver what you came

together to create:

 

• You need to be able to communicate effectively.

• You need to find a way to solve problems.

 

 

 


 

The content of each team’s tasks might vary widely. One team might be organized to assess

the need for a human resources program; another team might come together to produce an

annual report. Still another team might be charged with planning, hosting, and evaluating a

national event.

 

Whatever the focus of your work or the objective you’re working toward, you

need to be able to share information, connect with others, and coordinate your efforts.

Along the way, each and every team needs to discover how to work together most effectively

to get the job done.

 

A Quick Look at Group Process

 

American psychologist Bruce Tuckman studied group dynamics, and in 1965, he proposed that every group goes through four stages:

 

 

 

• Forming is the stage in which the group comes together. During this stage, the members get to know one another and begin the process of communicating. Some members might be polite and others might be anxious, but basically everyone is on his or her best behavior.

• Storming is the stage when team members begin evaluating their place in the group and struggling to determine the overall work goals of the group. Personalities might clash at this point as the group begins to carve out helpful ways to interact.

• Norming occurs when the group members begin to settle into individual roles that feel like a good fit for the group. The anxiety of testing roles and boundaries calms down, personality differences are less of an obstacle, and the leadership of the group seems suitably established.

• Performing is the stage when the group finally comes together as a team. The group has a shared vision of the final goal of the work and is able to move productively toward that end.

 

In the 1970s, Tuckman added a fifth stage that he called adjourning, which involves disbanding the group. This stage might bring anxiety for group members, but it also offers closure if the group was convened to accomplish a specific goal or complete a project.

 

 

 

Benefits of Office 2010 Collaboration

 

No matter where your team might be in the group-formation process, the tools in

Office 2010 can help you communicate effectively, share questions and concerns, complete

project-related tasks, and speak each others’ languages. With the tools in Office 2010, you

can streamline the following team-related activities:

 

• Staying in touch no matter where you’re working

• Creating shared folders that others can use to access project assets such as images, logos, charts, documents, and more

• Working with more than one teammate on the same file at the same time

• Checking the online status of other authors of your shared document using the presence icon

• Translating words and phrases easily as you work

 

Stay in Touch with Your Team

 

Office 2010 Professional Plus includes SharePoint Workspace 2010, a data-sharing workspace

that enables you to have access to all the files and data you need as you work with your

team. With SharePoint Workspace 2010 (shown in Figure 3-1), you can create a team workspace

that gives you access—both online and offline—to tasks, links, announcements, team

discussions, and document and picture libraries.

 

 

 

FIGURE 3-1 SharePoint Workspace 2010 enables you to take your work offline and share your files and other

project resources with team members.

 

 

 

Share Files in the Workspace

 

You can create new workspaces related to specific projects your team will be working on and

set up the elements you want to share. Because SharePoint Workspace 2010 is a true Office

2010 application, you can simply drag and drop the files you want to work on later offline or

make available to the team into the Documents library. (See Figure 3-2.)

 

 

 

FIGURE 3-2 You can easily drag and drop files to a workspace to share items with the team.

 

The files are available whether you’re offline or connected to the server; when a server connection

is established, features such as Check In and Check Out are enabled, and you can

navigate easily through the original site to sync, download, or review files.

 

Not only can teams share files, post announcements, and chat in real time in SharePoint

Workspace 2010, but team members can also post to a discussion board to carry on conversations

related to the project at hand. (See Figure 3-3.) A chat feature also is available

to allow team members to ask and answer quick questions as they work in the space.

 

 

 

 

 

FIGURE 3-3 With SharePoint Workspace 2010, you can have conversations with teammates on topics related

to your shared project.

 

Share Files and Folders

 

SharePoint Workspace 2010 also enables you to create shared folders in your own system so

that you can exchange files easily with your server or share them with other people. By creating

and sharing a folder with files related to your specific project, teammates can access the

resources they need in the shared folder you created. This enables you to share files easily

without uploading them to your company’s server space and also gives you an easy way to

organize and work with files you eventually check back in to the server.

 

Co-Author Files Across Applications

 

The new co-authoring feature, available in Word 2010 and PowerPoint 2010, is one of the

key collaboration features in Office 2010. Co-authoring is also available in the Excel Web

App. Now you can work collaboratively on a document at the same time others are working

in the file. You can edit files in real time, coordinating your changes, talking about revisions,

and reviewing the work of each person on your team.

 

 

 

Tip You also can simultaneously edit a shared notebook that is stored on SharePoint or

Windows Live with others who are using OneNote 2010, OneNote Web App, and OneNote

Mobile 2010.

 

No matter how many people on your team are working together online at the same time,

each person can work on her respective section, and then changes are synchronized automatically

when the file is saved or when the user logs on the next time. The names of all

authors are displayed next to the areas of the document they are editing. Additionally, a

pop-up list of available authors is available on the application status bar when the shared

document is active. (See Figure 3-4.)

 

 

 

FIGURE 3-4 Co-authoring enables you to see who else is working on the current document.

 

Note Co-authoring requires Office Communicator 2007 R2 and Office Communications Server

2007 R2. For Office 2010 users in a business environment, SharePoint Workspace 2010 is also

needed for co-author capability. Home users can use Windows Live as the co-authoring platform

for file collaboration.

 

 

 

Connect via Presence

 

The Presence icon is a new feature in Office 2010 that shows you the availability of team

members when you’re working on shared documents. When you point to the icon, a contact

card expands, listing the ways you can contact that person. (See Figure 3-5.)

 

If you’re using Windows Live for document sharing, you’ll be able to send an instant message

to other authors working on the document. You’ll also be able to schedule a meeting, add

the author to your Microsoft Outlook contacts, and work with Outlook properties. If you’re

using Office Communicator 2007 R2, you’ll have these choices and also be able to start a video

call, tag the contact for presence alerts, and add the contact to your Quick Contacts list.

 

 

 

FIGURE 3-5 When you click the Presence icon of a contact, the contact card for that person is displayed.

 

So how does Office 2010 know all this information about the various members of your team?

The Presence icon reflects online status information available in either Windows Live or

Office Communicator.

 

Microsoft Office Communicator 2007 R2

 

Microsoft Office Communicator 2007 R2 is an enterprise communications tool that enables

you to unify the various ways you communicate with others—by e-mail, instant

message, voice, or video.

 

Microsoft Office Communicator 2007 R2 requires Microsoft Office Communications

Server 2007 R2, which runs only on 64-bit systems. The Communicator client does not require a 64-bit computer.

 

 

 

Using Office Web Apps

 

Today more and more of us are escaping the confines of the cubicle and venturing out to

work in unusual places—the corner coffee shop, classroom, convention floor, park bench,

or client office. Flexibility is good, but we also need an easy, secure, and reliable way to access

mission-critical documents and files. We need to review records, update reports, quickly

check the budget, get an estimate on reservations, and approve the finals on the four-color

report before it goes to the printer.

 

The new Office Web Apps enable you to access and work with your files anywhere in the

world you have Web access. (See Figure 3-6.) And if you don’t have Internet Explorer, no

problem—Office Web Apps support Windows Internet Explorer 7 or later for Windows,

Safari 4 or later for Mac, and Firefox 3.5 or later for Windows, Mac, and Linux. That means

that whether you’re logging in on a Mac, PC, or kiosk at a hotel or airport, you’ll find the

same reliable Microsoft Office interface and be able to review, edit, and save the files that are

important for your work.

 

 

 

FIGURE 3-6 Office Web Apps offer a consistent look and feel in an easy-to-use, light editing interface.

 

 

 

Sharing on the Road with Office Mobile

 

Office Mobile 2010 (shown in Figure 3-7) lets you take Office with you wherever you go—

whether you’re carrying your laptop with you or not. Using Office Mobile 2010, you can easily

check e-mail in Outlook Mobile 2010, organize your inbox, update schedules and tasks,

and even work on your favorite Office documents, all from your favorite smart phone.

 

The rich interface for small devices enables you to view and edit files from your mobile

phone. You can choose from a variety of display options, perform editing and formatting operations,

and send files to SharePoint Server 2010 or to your Windows Live account.

 

 

 

FIGURE 3-7 The clear, easy-to-read Office Mobile interface makes it simple for you to read e-mail, work with

Office Mobile applications, and send and receive files.

 

Coming Next

 

The collaboration features in Office 2010 enable you to continue working with your favorite

applications in real time, no matter where you are or who you’re working with. With

SharePoint Workspace 2010, you can keep your team headed in the right direction; using the

coauthoring features and the presence icon, you can work collaboratively on documents and

contact other authors in your shared documents. And Office Web Apps and Office Mobile

both help you access your files remotely so that you can keep things moving whether or

not you’re able to be at your desk to do it. The next chapter kicks off Part II of this book by showcasing

the new features available to you in Office Word 2010.

 

 

 


 

Part II

 

Hit the Ground Running

 

Each of the applications in Office 2010 offers new ways to communicate your ideas more visually and effectively, features that enable you to collaborate more easily than ever before, and ways to access your work from anywhere, anytime. Whether you spend your time in Office 2010 generating content, managing projects, or analyzing financial data, you will discover new capabilities that streamline your tasks, add professional impact, and add flexibility and creativity along the way.

 

This part of the book introduces you to the new features in each application and gives you

some hands-on experience with different elements so that you can get up to speed quickly

with this new release:

 

• Chapter 4: Create and Share Compelling Documents with Word 2010

• Chapter 5: Create Smart Data Insights with Excel 2010

• Chapter 6: Manage Rich Communications with Outlook 2010

• Chapter 7: Produce Dynamic Presentations with PowerPoint 2010

• Chapter 8: Organize, Store, and Share Ideas with OneNote 2010

• Chapter 9: Collaborate Effectively with SharePoint Workspace 2010

• Chapter 10: Create Effective Marketing Materials with Publisher 2010

• Chapter 11: Make Sense of Your Data with Access 2010

 

 

 


 

Chapter 4

 

Create and Share Compelling

Documents with Word 2010

 

In this chapter:

 

• Start Out with Word 2010

• Format Your Text

• Illustrate Your Ideas

• Improve Your Text

• Co-Author Documents

• Access Your Documents Anywhere

 

What’s the big story in Microsoft Word 2010? Think flexibility and freedom of expression.

Imagine working on documents in the quiet of your neighborhood coffee shop, collaborating

with a coauthor who lives in Taipei, or doing a quick review on your smartphone before

you forward the document to a major client.

 

Word 2010 is designed to be simple to use and yet give you all the tools you need to create

sophisticated, professional documents that express your ideas clearly and well. The new features

enable you to work efficiently and collaboratively, in a consistent, familiar interface,

whether you’re working on your computer, in your browser, or on your smartphone.

 

Additionally, you’ll discover simplified ways for getting around in the document and an expanded

slate of tools that enable you to include great quality fonts and picture effects so

that your finished product looks as good as possible. This chapter introduces you to the key

new features in Word 2010 and encourages you to try a few techniques along the way.

 

Start Out with Word 2010

 

The Word 2010 window is designed to help you focus on the task at hand, whether you are

writing, formatting, editing, illustrating, securing, or sharing documents. The Ribbon (shown

in Figure 4-1) stretches across the top of the window, providing all the tools you need, just

when you need them. The status bar at the bottom of the window enables you to get current

statistics on the document as you work (for example, checking the number of words in the

file) and change views so that you can display the document in different ways.

 

 

 


 


 

FIGURE 4-1 The Word 2010 window maximizes your workspace while providing tools across the top and

document information and views along the bottom.

 

Tip To display the Customize Status Bar list and add other information elements to the status

bar, right-click the status bar at the bottom of the Word window.

 

Get Familiar with the Word Ribbon

 

The Ribbon in Word 2010 makes it easy for you to find just the tools you need when you

need them. (See Figure 4-2.) Tabs contain tools related to specific tasks you want to complete.

The Insert tab, for example, includes the tools you need to add illustrations, links,

tables, and much more.

 

 

 


 

 

 

FIGURE 4-2 The Ribbon offers just the tools you need, depending on what you’re working on in Word.

 

Contextual tabs appear when you select a specific element in the document—for example,

when you click a picture, the Picture Tools contextual tab appears, as you see in Figure 4-3.

 

 

 

FIGURE 4-3 Contextual tabs offer tools that relate to the selected object in the document.

 

One great feature common to all Office 2010 applications is the ability to customize the

Ribbon so that you can add your own tabs, putting together the tools you use most often in

the configuration that fits you best. You can create your own custom tabs or tab groups, and

move tools on the existing Ribbon tabs to create groups just the way you want them.

 

Tip You can hide the Ribbon easily and maximize your work area by pressing Ctrl+F1 or clicking

the Minimize The Ribbon button on the right side of the window above the Ribbon.

 

Find What You Need Easily with the Navigation Pane

 

The new Navigation Pane is a great addition to Word 2010. Combining the best of the Find

utility with Outline view and thumbnail displays, the Navigation Pane gives you multiple ways

to find what you’re looking for in your document. Now you can move right to a section in

your document by clicking the heading you want (as shown in Figure 4-4), scroll through the

list of page thumbnail images, or enter a search phrase and choose from the list of results (as

shown in Figure 4-5).

 

 

 

 

 

FIGURE 4-4 With the Navigation Pane, you can move easily through the document by clicking the heading of

the section you want to see.

 

The improved search features offered in the Navigation Pane enable you to find the content you need, whether it appears in the body text, headings, tables, graphics, footnotes, sidebars, or comments of your document.

 

 

 

FIGURE 4-5 The powerful search capability in the Navigation Pane displays a clickable results list showing all

the places in the document your search word or phrase appears.

 

 

 

Print and Preview in a Single View

 

If you are like most Word users, you print your documents on a regular basis. And using Print

Preview is part of the process, enabling you to make sure the overall page looks the way you

want it to look, the pictures are in the right places, and the headings are in the appropriate

spots. Word 2010 smooths out the printing process by combining the print and preview tasks

into a single step.

 

Now you can preview and print your document in Backstage view with literally a single click.

You can still change your print options, setting print quality, choosing the paper source and

size, and specifying the number of copies before you print. You can also page through the

document easily and shrink or enlarge the page view so that you can check all details easily

before sending the document to the printer.

 

Format Your Text

 

Word users typically spend quite a bit of time formatting documents. New features in

Word 2010 enable you to apply stylized effects to text, use high-quality fonts, and choose

just the right paste options for the task at hand so that your documents look professional

with just a little help from you.

 

Did You Know?

 

• Eighty percent of all documents use fewer than 20 styles.

• Word users include an average of 16 styles in each document.

• The most common formatting changes are font size, font face, and font color.

 

Step by Step: Printing and Previewing

 

Because the print and preview features are streamlined into one process in Word 2010,

you can easily review and print the file. Here’s how to print and preview a document in

Backstage view:

 

1. Open a document or create a new document that you’d like to print. If you create

the file, save it before you prepare to print.

 

2. Click the File tab on the Ribbon. This displays the Word 2010 Backstage view.

 

 

 

3. Click Print. The document appears in the Print Preview window to the right of the

Print options.

 

 

 

4. Preview the document by clicking the Previous Page or Next Page controls in

the bottom left corner of the preview window. You can also change the size of

the page display by adjusting the Zoom control in the lower right corner of the preview window.

 

5. Use the options in the center column to choose the print settings you want to apply to the printed document. For example, you might need to select your printer, specify the pages you want to print, or change from one-sided to double-sided printing.

 

6. When the document in the preview looks the way you want your printout to appear, click the Print button at the top of the center column.

 

Tip Before you print a long document or multiple copies of the same document, test print one

copy to ensure the format of the document and the page margins appear the way you want

them to.
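If you ever need to script that same print step, for example to print a stack of reports overnight, the Word object model exposes it directly. The following is only a minimal sketch rather than part of the walkthrough above; it assumes Windows, Word 2010 or later, and the pywin32 package, and the document path is hypothetical.

    # Minimal sketch: print part of a Word document via COM automation.
    # Assumes Windows, Word 2010+, and pywin32; the file path is hypothetical.
    import win32com.client
    from win32com.client import constants

    word = win32com.client.gencache.EnsureDispatch("Word.Application")
    word.Visible = False

    doc = word.Documents.Open(r"C:\Temp\client_report.docx")

    # The code equivalent of choosing settings in Backstage view and clicking
    # Print: two copies of pages 1-3, waiting until the job has been spooled.
    doc.PrintOut(Background=False,
                 Range=constants.wdPrintRangeOfPages,
                 Pages="1-3",
                 Copies=2)

    doc.Close(SaveChanges=False)
    word.Quit()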

 

 

 

Apply Text-Formatting Effects

 

You’ll find a number of features in Word 2010 that help you create a pleasing, professional

look for documents of all types. Document themes let you choose a consistent color scheme,

font style, and object format; Quick Styles give you a gallery of text styles to apply to your

text; and the Font and Paragraph groups in the Home tab enable you to make changes to

individual words, phrases, lists, and more in the documents you create.

 

Now Word 2010 gives you the ability to add special touches to the format of your text. You

might use a special text effect to create a compelling headline, make a product name stand

out, or create an attention-getting banner for a flyer or brochure. You’ll find the Text Effects

tool in the Font group of the Home tab. Clicking it reveals a list of ready-to-apply text effects,

as well as a collection of artistic effects (Outline, Shadow, Reflection, and Glow) that

enable you to fine-tune the look even more. (See Figure 4-6.)

 

 

 

FIGURE 4-6 Text effects give you the tools to format headlines and text elements to make them stand out.

 

 

 

Tip Even after you’ve applied a special effect to your text, you can continue to modify the look

by making additional choices in the Text Effects list. You might change the bevel style of the letters,

for example, or add a glow to specially formatted text.

 

Preserve Your Format Using Paste with Live Preview

 

How often do you copy and paste something in the documents you create? Microsoft research

shows that Copy and Paste are two of the most often-used features in Word 2010;

users might copy and paste as many as 300 times per month. Both procedures are simple,

requiring only that you select the item you want to copy, click Copy on the Home tab, put

the cursor where you want the item, and click Paste. Simple, right?

 

The challenge was that pasting the text or object sometimes had unexpected results, depending on where users pasted the item and what type of item was being inserted in the document. In fact, users undo paste operations more than any other action in Office 2010. To answer this challenge and consistently provide what users expect, Copy and Paste have been improved in Word 2010. Now more than 400 clipboard formats are supported to make copying and pasting as easy and as reliable as possible while you’re working on your documents.

 

Step by Step: Pasting Content Your Way

 

Use Paste with Live Preview to get just the paste results you want.

 

The variety of paste formats and the flexibility Word 2010 offers you when you paste

text and objects in your document results in more reliable formats and less tweaking

after the fact. And that means better efficiency and reliable results. Nice!

 

This example shows you how to use Paste with Live Preview so that you can get the results

you want when you paste content in your Word 2010 document:

 

1. With your document open on the screen, highlight the text or object you want to copy. Click Copy in the Clipboard group of the Home tab or press

Ctrl+C.

 

2. Click at the point in your document where you want to paste the copied item.

 

 

 

3. Click the Paste arrow in the Clipboard group of the Home tab. A Paste Options

gallery appears as you see here:

 

 

 

4. Point to each Paste icon to see a live preview of the way the item will appear in

your document.

 

5. Click your choice. The item is pasted in the document as you selected.

 

You can also display the Paste Options gallery by right-clicking in the document at the

point you want to paste the copied information. After you paste the content, Paste

Options are displayed near the paste location in case you want to make a change:

 

 

 

Tip Paste with Live Preview is available in all Office 2010 applications, and the selections displayed under Paste Options in the gallery vary depending on the type of content you have copied to the Clipboard. To display the Office 2010 Clipboard, click the dialog launcher in the lower right corner of the Clipboard group in the Home tab.
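The choices in the Paste Options gallery map onto the PasteAndFormat method in the Word object model, so a script can make the same decision you make with the gallery. The sketch below is illustrative only; it assumes Windows, Word 2010 or later, and the pywin32 package, and the sample text is invented.

    # Minimal sketch: copy a sentence and paste it with an explicit format via COM.
    # Assumes Windows, Word 2010+, and pywin32.
    import win32com.client
    from win32com.client import constants

    word = win32com.client.gencache.EnsureDispatch("Word.Application")
    word.Visible = True

    doc = word.Documents.Add()
    word.Selection.TypeText("Quarterly results look strong. ")

    # Copy the first sentence, move to the end of the document, and paste it
    # as text only, roughly the scripted counterpart of the Keep Text Only option.
    doc.Sentences(1).Copy()
    word.Selection.EndKey(Unit=constants.wdStory)
    word.Selection.TypeParagraph()
    word.Selection.PasteAndFormat(constants.wdFormatPlainText)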

 

Illustrate Your Ideas

 

With Word 2010, you don’t need to switch back and forth between software programs to

include professional-quality images in your documents. Using Word’s illustration features,

you can easily apply special artistic filters to your photos or capture screen shots to include in

your documents.

 

 

 

Apply Artistic Effects

 

The Artistic Effects in Word 2010 give you a variety of filters you can apply to your images

to produce a wide range of special effects. For example, you might apply the Pencil Sketch

effect to convert an image to an artistic black-and-white rendering, or use the Paint Brush effect to create a dramatic image of a new product.

 

To apply an artistic effect, simply select the image you want to modify in the document.

The Picture Tools contextual tab appears. Click Artistic Effects in the Adjust group.

(See Figure 4-7.) Preview the different effects by pointing to the effect you want to see; the

image is displayed with that particular effect. Click the one you want, and it is applied to

the image.

 

Tip Office 2010 offers enhancements to SmartArt in Word 2010, Excel 2010, and PowerPoint 2010.

Learn about SmartArt’s new features in Chapter 5, “Create Smart Data Insights with Excel 2010.”

 

 

 

FIGURE 4-7 Artistic effects enable you to apply a variety of special filters to the figures in a document.

 

 

 

Insert Screen Shots

 

Adding pictures of your screen can come in handy when you are preparing team documents,

sharing procedures with others, or writing a process to let others know how to work with a

specific document. No matter what you might want to capture on the screen, Word 2010

makes it easy for you to grab the parts you need and include them in your document.

 

When you click the new Screenshot tool, available in the Illustrations group of the Insert

tab, a gallery of screen-shot options appears. (See Figure 4-8.) The images in the gallery are

thumbnails of the various applications you currently have active on your system. To choose

one of the screen shots, click it; the image is added at the cursor position in your document.

 

 

 

FIGURE 4-8 You can insert a screen shot of any window currently open on your system.

 

Step by Step: Adding a Screen Clipping

 

Grab just the portion of the screen you want to include in your Word 2010 document.

 

Suppose that you want to quickly clip a segment of a worksheet to include in the report

you’re writing. You can grab a screen shot of the worksheet and add it to your document

by following these steps:

 

1. Open the document in which you want to add the screen shot, and click where

you want to insert the screen shot.

 

2. Open the worksheet you want to use for the screen shot.

 

3. Display the document again, and click the Insert tab. In the Illustrations group,

click the Screenshot arrow and choose Screen Clipping.

 

4. The worksheet automatically displays. Click in the upper left corner of the area

you want to clip, and drag across the area to be included.

 

 

 

5. When you release the mouse button, the area is clipped and inserted in the document at the cursor position.

 

 

 

Improve Your Text

 

The Word 2010 Spell Check is smarter than ever; now it takes context into account as it

checks your document. And the Word 2010 language tools make it easy to translate on the

fly when you work with colleagues around the world.

 

Catch More Than Typos with a Contextual Spell Check

 

Have you ever received a document that was spelled right but included words that were used

incorrectly? Words such as their and there or seen and scene can easily be misused in a document,

causing a disconnect for your readers and clouding your message. The Word 2010

enhanced spelling checker now evaluates the words you use for the context in which they appear, which helps you ensure that your documents are as correct as possible.

 

 

 

Tip You can have Word 2010 check for grammar and style in your document, which includes

searching for punctuation, usage, clichés, gender-specific words, and more. To change the grammar

settings, click the File tab and then Word Options, click Proofing, and then click the Settings

button in the Writing Style area.

 

Start Spell Check by clicking the Review tab and clicking Spelling & Grammar in the Proofing

group. The Spelling & Grammar dialog box shows you one by one any issues that the checker

discovers so that you can enter changes as you go. The Dictionary Language setting enables

you to look up the word in another language to see whether the usage or spelling is correct.

(See Figure 4-9.)

 

 

 

FIGURE 4-9 Now you can check the spelling and usage of words and phrases in dictionaries from

other languages.

 

Tip If your spelling checker doesn’t seem to be working properly in Word 2010, make sure

you’ve set your default language and that Spell Check isn’t disabled. Here’s how: On the Review

tab, click Language in the Language group and choose Set Proofing Language. In the Language

dialog box, click your primary language and choose Set As Default. When prompted, click Yes.

Now make sure the Do Not Check Spelling Or Grammar check box is clear. If a check mark appears, click the option to clear the box; then click OK.
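If you need to sweep a whole set of documents for proofing problems, the same checker is reachable through the object model. The following is a minimal sketch under the usual assumptions (Windows, Word 2010 or later, pywin32); the path is made up for the example.

    # Minimal sketch: count proofing issues in a document via COM automation.
    # Assumes Windows, Word 2010+, and pywin32; the path is hypothetical.
    import win32com.client

    word = win32com.client.gencache.EnsureDispatch("Word.Application")
    word.Visible = False

    doc = word.Documents.Open(r"C:\Temp\draft.docx")

    # SpellingErrors and GrammaticalErrors correspond to what the checker
    # flags in the document; the counts give a quick health check per file.
    print("Possible spelling issues:", doc.SpellingErrors.Count)
    print("Possible grammar issues:", doc.GrammaticalErrors.Count)

    doc.Close(SaveChanges=False)
    word.Quit()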

 

Use Language Tools, and Translate on the Fly

 

Many people now work in teams that span not only cities but continents. When you are

working with peers in Europe, Asia, or other continents, language differences can present

challenges. Word 2010 now includes enhanced language features that help you stay in sync

with the global workplace. Now you can translate words and phrases on the fly, and develop

documents that offer ScreenTips and Help in a variety of languages.

 

 

 

Translating in Real Time

 

Another language feature in Word 2010 enables you to translate text easily as you work in a

document. Using the Translation Language tools, you can choose to show side-by-side translations,

display the full document in a Web-based translated view, or use the Mini Translator

toolbar to translate words as you go.

 

Setting up the translation tools is simple. In the Language group of the Review tab, click

Translate. A list offers you four choices:

 

• Translate Document

• Translate Selected Text

• Mini Translator

• Choose Your Translation Language

 

Start with the last item first if this is the first time you’re using the Translation tools. When

you click Choose Your Translation Language, the Translation Language Options dialog box

appears so that you can choose the language you’re translating from as well as the language

you’re translating to. (See Figure 4-10.) Click your choice for each option on the Translate

Document tab; then choose the language you’re translating to on the Mini Translator tab.

Finally, click OK to save your settings.

 

 

 

FIGURE 4-10 Set your translation languages using the Translation Language Options dialog box.

 

When you choose Translate Document, a dialog box appears alerting you that the document

will be translated by the Web site WorldLingo and displayed in your browser online. To continue

the operation, click Send and your document is translated and displayed in your

Web browser window.

 

To translate a specific phrase or paragraph, begin by highlighting the text you want to translate;

then click Translate and choose Translate Selected Text. The Research pane appears on

the right side of your document window, and the text you selected is translated, according

 

 

 

to the language you selected. You can modify the settings by choosing new options in the

Research pane.

 

Finally, to translate words and phrases on the fly, you can use the Mini Translator tool. This

convenient little toolbar enables you to translate text as you go by simply hovering the

mouse over the word you want to translate. (See Figure 4-11.) This can help you review text

quickly and double-check your translations.

 

 

 

FIGURE 4-11 The Mini Translator enables you to translate words, phrases, or blocks of text as you work.

 

Tip The Mini Translator tool also includes an audio feature that will read back the text you select

in your document. Simply highlight the text you want to hear, choose Mini Translator in the

Translate list, and click the Play button on the Translator tool.

 

Co-Author and Share Documents

 

Word 2010 makes it easy for you to work collaboratively with others whether they work

down the hall or on the other side of the world. Co-authoring features in Word 2010 make

it possible for multiple authors to work on a file at the same time and contact each other

in the process. And using SharePoint Workspace 2010 or Windows Live, you can save to an

online workspace, communicate with coauthors, and keep track of changes in the file without

e-mailing multiple documents back and forth or running the risk of overwriting changes another

person on your team has made.

 

The co-authoring capabilities require SharePoint Foundation 2010 (for business clients)

or Windows Live (for personal use). When you post a document to your SharePoint

Workspace or Windows Live SkyDrive space and invite another author to share it with you,

you will be able to see indicators in the document when the other author makes changes.

(See Figure 4-12.) At each point the author makes changes, you see the author’s name and

a presence indicator that shows you the author’s online availability.

 

 

 

 

 

FIGURE 4-12 When you co-author a Word 2010 document, you can see who else is working on your shared

document at the same time.

 

Tip The instant messaging, presence, and voice call features require Office Communicator 2007

R2 and Office Communications Server 2007 R2, Windows Live Messenger, or another instant-messaging

program that works with IMessenger.

 

 

 

Overview: Introducing Presence

 

Presence technology is all about being able to reach team members whenever they are

present online. In Office 2010 Presence is available for systems running Microsoft Office

Communicator 2007 R2, enabling you to see at a glance which of your coauthors are

available online to answer questions, chat about your project, or talk about next steps.

 

 

 

Office 2010 applications that support presence display a small green icon beside a person’s

name in a status bar list when they are available for contact. When you hover

the mouse over the person’s name, a pop-up list of options for contact appears. You

can send an instant message, open a chat window, compose an e-mail message, or initiate

a phone call by clicking one of the communication options. You can also click the

menu to display the contact’s full contact information and discover additional ways of

making contact.

 

Working with Shared Documents

 

You can easily add authors to documents you’re working on and share files using SharePoint

Workspace 2010 or Windows Live SkyDrive. You first post the files you want to share and

invite authors to the SharePoint workspace or SkyDrive folder. Once you begin working with

the shared document, Backstage view gives you information about the shared file, as you see

in Figure 4-13.

 

 

 

In Figure 4-13, callouts identify the shared workspace where the file is stored, the save status of the file, the option to send a message to a coauthor, the authors currently working in the file, notes added to the file, and a link to see other documents in the shared workspace.

 

FIGURE 4-13 You can add the names of coauthors in Backstage view and check documents stored in the

shared workspace for the current file.

 

Access Your Documents Anywhere

 

Picture this: You are rushing out the door to get to a meeting uptown. You had hoped to put

the finishing touches on the client report you need to share with the team, but because your

afternoon meeting ran long you were unable to finish it. Now you’re on the train and have a

few minutes to spare. Luckily, you can access Word 2010 on the Web and wrap up those last

few details.

 

 

 

Use the Word Web App

 

By logging into Windows Live, accessing your SkyDrive folders, and opening a Word 2010

document you’ve posted there, you can access the familiar Word 2010 tools and features so

that you can complete your document on time. Log in to your account, choose SkyDrive (in

the More menu), and open your My Documents folder; then click the document you want

to work with. Click View to open the file. In the Word Web App window, you can simply review

the file (if you don’t want to make any changes). Figure 4-14 shows the Word Web App

window.

 

 

 

FIGURE 4-14 You can review the document, do light editing to it, and even print it from the Web.

 

If you click Open in Word, the Word Web App will ask for your Windows Live login information.

After you provide it, the document opens in Protected view. (Protected view is always

used when a file is downloaded from an Internet location.) Click Enable Editing to activate

editing mode, and the document opens in Word 2010. When you save the document,

the changes you made are synchronized with the online document with no further action

from you.

 

 

 

Check Your Document with Word Mobile 2010

 

Need to take a quick look at a customer document before that important meeting? Or perhaps you want to sign off on a report so that others on your team can share it right away.

With Word Mobile 2010, you can use the familiar Word 2010 experience to display and

search your document, make simple changes, and save and finalize the file.

 

With Word Mobile, you can easily open, view, edit, and copy and paste information in your

Word documents—using your smartphone. Spell Check and AutoCorrect are available to

help you make sure your edits are accurate.

 

The document formatting—even in tables, charts, and graphics—will be preserved on your

phone display, thanks to Office Mobile’s Text Reflow technology. You can use all the familiar

formatting basics—bullets, numbering, fonts, paragraph formats, and more.

 

You can also send your document from your smartphone by e-mail to a friend or colleague

or post it to your SharePoint workspace.

 

Word Mobile 2010 is not part of Office 2010, but the software will be available at the release

of Office 2010 for phones using Windows Mobile 6.5 or later.

 

 

 


 

Chapter 5

 

Create Smart Data Insights with

Excel 2010

 

In this chapter:

 

n Start Out with Excel 2010

 

n Summarize Your Data Easily

 

n Illustrate Information Effectively

 

n Show PivotTable Data Your Way

 

n Work Anywhere with Excel 2010

 

Helping others understand the information you present—whether you work with words,

numbers, pictures, or media—is a key part of success in any business environment. The big

story in Microsoft Excel 2010 includes new features that help you convey your findings in

ways others can easily understand. Sparklines are small, cell-sized charts you can add to your

worksheet to provide a visual summary of the data in selected ranges; new icon sets and improvements

to data visualization options give you greater variety in the way you present information;

SmartArt and charting enhancements offer additional flexibility; and slicers enable

you to graphically slice-and-dice your PivotTable to display just the information you want to

show at any given time.

 

And Excel 2010 also includes new offerings for the high-end spreadsheet user: new formulas, support for spreadsheets with millions (yes, millions) of rows, and the integration of SharePoint 2010 and Excel Services, which enables you to publish worksheets and dashboards to your intranet or to the Web. This chapter touches on the top new features in Excel 2010 and encourages you to give a few of them a try.

 

Start Out with Excel 2010

 

The Excel 2010 window offers you an open, visually inviting workspace that presents everything you need to work with multiple worksheets, enter formulas and cell values, and change the way you view information on the screen. Figure 5-1 introduces you to the various elements in the Excel 2010 window.

 

 

 

[Figure 5-1 callouts: Quick Access Toolbar, tabs, Ribbon, Minimize The Ribbon button, worksheet navigation, worksheet area, View controls]

 

FIGURE 5-1 The Excel 2010 workspace gives you plenty of room onscreen while providing the tools you need.

 

Tip You can hide the display of the Ribbon to maximize your space onscreen by clicking the

Minimize The Ribbon button in the upper right area of the Excel 2010 window.

 

The Office 2010 Ribbon includes eight tabs, each with groups of tools related to a specific

focus: File, Home, Insert, Page Layout, Formulas, Data, Review, and View. For example, to add

a SmartArt diagram to the current worksheet, you click the Insert tab and choose SmartArt in

the Illustrations group. A group of SmartArt tools appears in the Ribbon in a contextual tab

to provide the tools you need for creating and customizing SmartArt. (See Figure 5-2.)

 

You’ll find many of the new tools and enhancements in Excel 2010 on the Insert tab.

Specifically, the Screenshot tool in the Illustrations group, the Sparklines group, and the Slicer

tool in the Filter group are all new. The following sections introduce you to these features in

more detail.

 

 

 


 

 

 

FIGURE 5-2 Contextual tabs display tools you need only when you are working with a specific item on your

worksheet.

 

Summarize Your Data Easily

 

Your worksheets enable you to organize, track, and calculate financial information over time.

An important part of making sense of the data you gather—and sharing what you find—involves

communicating the results in a way others can easily understand. Sparklines are small,

cell-sized charts that appear within your worksheet, giving readers a quick picture of what

the numbers on the worksheet mean. Because sparklines stay with your data (unlike a chart,

which might appear in a section of the worksheet some distance from the data it reflects),

they show clearly the relationship among the data values used to create them.

 

You can create three kinds of sparklines in Excel 2010. The program offers you the choice of

line, column, or win/loss sparklines:

 

n Line sparklines show trends and changes in values over time.

 

n Column sparklines enable you to compare values.

 

n Win/loss sparklines enable you to analyze values in relation to a norm.

 

 

 

Step by Step: Add Sparklines to Your Worksheet

 

Here’s how to summarize your data easily with sparklines.

 

You can add sparklines at any point in your worksheet where you want to show data

trends, comparisons, or summaries. Here are the steps to add sparklines and customize

them to meet your needs:

 

1. Open a worksheet or create a new worksheet in Excel 2010. If you are creating a

new worksheet, enter the data you want to use as the basis for the sparklines.

 

2. Click and drag to select the cells that include the data you want to show in the

sparkline.

 

3. Click the Insert tab, and click the type of sparkline you’d like to create: Line,

Column, or Win/Loss.

 

4. The Create Sparklines dialog box shows the range of cells you selected in the top

data field.

 

 

 

5. Click in the Location Range field of the Create Sparklines dialog box, and then

click the cell on the worksheet where you want the sparkline to appear.

 

6. Click OK. The sparkline is added to the document.

 

 

 

7. Click the elements in the Show/Hide group of the Sparkline Tools Design tab to

customize the appearance of the sparkline you added. As you click your choices,

new points are added to the examples in the Style gallery, as shown here:

 

 

 

8. Click the Sparkline Color arrow to display the palette and set the color of the

sparkline.

 

9. Click the Marker Color arrow to choose the color of the markers displayed on the

sparkline.

 

10. After you set the sparkline formatting options as you want them, you can copy and paste the sparklines to other cells in the worksheet. Excel will update the references to show the correct sparkline representation in the cell.

 

Note The formatting options for sparklines are group-based, which means that making changes

to one sparkline changes the format of all sparklines in the series. To change an individual sparkline

(for example, to make the color of one sparkline stand out), remove it from the group by

clicking it and choosing Ungroup in the Group area of the Sparkline Tools tab.

 

Tip To delete an unwanted sparkline from the worksheet, click the sparkline, click Clear, and

then click Clear Selected Sparklines in the Group area of the Sparkline Tools tab.
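If you build workbooks in code rather than in the Excel user interface, the same idea can be scripted. The following is a minimal sketch that uses the XlsxWriter Python library (an assumption; the chapter itself covers only the Excel 2010 interface, and the file and range names are placeholders) to write a few rows of data and add a line sparkline beside each row:

import xlsxwriter

workbook = xlsxwriter.Workbook("sparklines.xlsx")
worksheet = workbook.add_worksheet()

# Sample data: one row of monthly values per product.
data = [
    ["Widgets", 120, 135, 90, 160, 175],
    ["Gadgets", 80, 95, 110, 70, 140],
]
for row, values in enumerate(data):
    worksheet.write_row(row, 0, values)

# Add a line sparkline in column G that summarizes columns B through F of
# the same row; the markers option corresponds to the Show/Hide choices in step 7.
for row in range(len(data)):
    worksheet.add_sparkline(row, 6, {
        "range": "B{0}:F{0}".format(row + 1),
        "type": "line",
        "markers": True,
    })

workbook.close()

Opening the resulting workbook in Excel 2010 shows the sparklines in column G, where you can still restyle them from the Sparkline Tools Design tab described above.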

 

Illustrate Information Effectively

 

Especially when they are reviewing large worksheets, people who are unfamiliar with the data you’re presenting might not be drawn instantly to the key points you want them to understand. To help you spotlight important data elements on your worksheets, Excel 2010 includes a number of conditional formatting features. Here are a few of the elements that draw readers’ attention to important data and help communicate what the data represents:

 

n Icon sets display small icons in data cells that spotlight high, mid, and low values, for

example.

 

n Data bars enable you to show how values in a range of cells compare with one another.

 

 

 

Call Attention to Your Data with Icon Sets

 

When you want to call attention to a specific range of cells in your worksheet, consider using

icon sets to do it. Icon sets are small pictures that appear with the data in a cell to help the

reader evaluate what the value means. For example, a cell showing a low sales value might

display a red flag, while a cell showing a top sales value could display a green flag.

 

Excel 2010 includes a number of new features that enhance the capabilities of icon sets. New

icon sets include new ratings sets, such as stars and boxes, and you can customize the formatting

and display choices for icon sets so that you can create exactly the display you want.

 

Step by Step: Use Improved Icon Sets to Highlight Data

 

You can spotlight key data values easily with icon sets.

 

Excel 2010 includes 20 icon sets in four categories: Directional, Shapes, Indicators, and

Ratings. Now you can easily format and customize the icon sets so that they show those

viewing your worksheet what’s most important and why. Data bars now show gradients

relative to the values they display, and you can show both positive and negative values

in data bars. In short, you have more control over the visualizations you choose for your

worksheet data. Follow these steps to use an icon set to display worksheet values:

 

1. Open the Excel worksheet you want to use.

 

2. Select the individual cell or range of cells where you want to add icon sets.

 

3. Click Conditional Formatting in the Styles group of the Home tab.

 

4. Click Icon Sets, and a list of icon sets appears, as you can see here:

 

 

 

 

 

5. Click the icon set you want to apply to the selected cells.

 

6. To customize the icon set, select the cells, click Conditional Formatting in the

Styles group of the Home tab, and click Manage Rules.

 

7. In the Conditional Formatting Rules Manager, click Edit Rule. The Edit Formatting

Rule dialog box appears, as shown here:

 

 

 

8. You can change the display of an individual icon by clicking the down arrow to

the right of the icon you want to change and choosing a new icon.

 

9. Change the range of values represented by the icon by changing the Value and

Type settings for each icon.

 

10. Click OK to save your changes. The worksheet display updates automatically.

 

Tip You can do much more with icon sets in Excel 2010. To learn more about the different ways

you can spotlight the data on your worksheets, see Microsoft Office 2010 Plain and Simple by

Jerry Joyce and Marianne Moon (Microsoft Press, 2010).
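If your worksheets are generated by a script rather than by hand, icon sets can be applied in code as well. Here is a minimal sketch using the openpyxl Python library (an assumption; this is not part of the book’s steps, and the file name is a placeholder) that attaches a three-flag icon set to a small range of values:

from openpyxl import Workbook
from openpyxl.formatting.rule import IconSetRule

wb = Workbook()
ws = wb.active

# Sample sales values in A1:A8.
for row, value in enumerate([500, 120, 840, 300, 90, 660, 410, 980], start=1):
    ws.cell(row=row, column=1, value=value)

# Three flags: bottom, middle, and top thirds of the range (percent thresholds).
rule = IconSetRule(icon_style="3Flags", type="percent", values=[0, 33, 67])
ws.conditional_formatting.add("A1:A8", rule)

wb.save("icon_sets.xlsx")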

 

 

 

Data Bar Improvements

 

Data bars offer another type of conditional formatting element in Excel 2010 that helps you

analyze data values in a range of cells. Enhancements to data bars make it possible for you

to include negative values and apply formats that make the data bars easier to understand.

(See Figure 5-3.)

 

 

 

FIGURE 5-3 Now in Excel 2010 you can add data bars that reflect negative as well as positive values.

 

Step by Step: Compare and Analyze Values with Data Bars

 

Data bars help you show at a glance how key data values compare to one another. You

can use data bars to display a range of values that can include both positive and negative

numbers.

 

1. Select the range of cells where you want to add data bars.

 

2. Click Conditional Formatting in the Styles group of the Home tab.

 

3. Click Data Bars to display the list of data bar styles you can apply to the selected cells, as shown here. As you point to each data bar style, the selected range previews the choice. Click the style you want to apply.

 

 

 

 

 

4. Add negative value capability by clicking Conditional Formatting again and

choosing Manage Rules.

 

5. In the Conditional Formatting Rules Manager, click Edit Rule. In the Edit

Formatting Rule dialog box, click the Negative Value And Axis button.

 

6. In the Negative Value And Axis Settings dialog box, select the fill color and axis

settings you want to apply to the negative value display. You can also set the axis

color.

 

 

 

7. Click OK to apply your changes. The data bars appear in the selected cells, showing any negative values as you selected.

 

 

 

Tip You can change the way the data bars appear in your cells by choosing either gradient or solid fills. Additionally, you can customize the way the fills and borders look to create the best look for the data displayed on your worksheet.
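The same kind of rule can be written from code. The sketch below uses openpyxl again (also an assumption), and it applies only a basic data bar; the negative-value fill and axis settings described in the steps above are Excel 2010 options that this simple rule does not configure:

from openpyxl import Workbook
from openpyxl.formatting.rule import DataBarRule

wb = Workbook()
ws = wb.active

# A mix of positive and negative values in A1:A7.
for row, value in enumerate([35, -12, 48, -30, 22, 15, -5], start=1):
    ws.cell(row=row, column=1, value=value)

# Scale each bar between the minimum and maximum values in the range.
bar = DataBarRule(start_type="min", end_type="max", color="638EC6", showValue=True)
ws.conditional_formatting.add("A1:A7", bar)

wb.save("data_bars.xlsx")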

 

New SmartArt Enhancements

 

SmartArt diagrams give you a simple way to add professional diagrams to your worksheets,

documents, and presentations. Whether you need to create an image that links product

descriptions to the values on your worksheet, spotlights key initiatives your data reflects,

or helps new hires learn your sales tracking process, you can use SmartArt to organize and depict your thoughts easily and effectively.

 

Excel 2010 adds a new set of SmartArt graphics to the mix, enabling you to choose from additional layout styles and incorporate pictures easily in the diagrams you add to your

worksheet. Now expanded support for text gives you more flexibility for the descriptions you

provide so that you can let the image further explain the connections between worksheet

data and the images the diagram presents.

 

Use Slicers to Show Data Your Way

 

Using Excel 2010 you can track, analyze, and report on the information you gather about

your organization. Displaying the data you need and making decisions based on the insights

you gather is an important part of working effectively in Excel. Excel 2010 includes a new

feature called slicers, which enable you to slice your data easily and include only the elements

you want in the PivotTables and PivotCharts you create. Using slicers, you can easily add

and remove elements from the table display, which helps you compare and evaluate data

from different perspectives. What’s more, you can use the slicers you create with multiple

PivotTables and PivotCharts to showcase your data consistently in a variety of scenarios.

 

Step by Step: Use Slicers to Segment Data Display

 

Use these visual controls to move data elements in and out of your PivotTable display.

 

1. First create the PivotTable you want to use with the slicer you create.

 

2. Click the PivotTable to select it.

 

3. Click the Insert tab, and click Slicer in the Filter group.

 

4. In the Insert Slicers list box, click the field you want to use to slice the PivotTable

data. In the example shown here, Cost is used to select a specific data value to

display.

 

 

 

 

 

5. Click OK. The slicer appears on the worksheet.

 

 

 

Tip You can attach a slicer you create to another PivotTable connected to the current worksheet

by clicking the slicer and choosing PivotTable Connections in the Slicer group of the Slicer Tools Options tab. Select the PivotTable to which you want to add the slicer and click OK.

 

 

 

Note You can easily change the look of the slicers you add to your worksheet by clicking the

Slicer Tools Options tab and changing the slicer caption, style colors and effects, button size and

format, and overall size of the slicer window.
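Slicers can also be created by automating Excel 2010 itself. The following is a hedged sketch that drives the Excel COM object model through the pywin32 package; it assumes a workbook that already contains a PivotTable named PivotTable1 with a Cost field, and the file, sheet, and field names are hypothetical placeholders:

import win32com.client

excel = win32com.client.Dispatch("Excel.Application")
excel.Visible = True
wb = excel.Workbooks.Open(r"C:\Reports\sales.xlsx")
ws = wb.Worksheets("Sheet1")
pivot = ws.PivotTables("PivotTable1")

# Create a slicer cache on the Cost field, then place a slicer on the sheet.
cache = wb.SlicerCaches.Add(pivot, "Cost")
slicer = cache.Slicers.Add(ws)
slicer.Caption = "Cost"
slicer.Top = 20
slicer.Left = 400

The slicer this creates behaves exactly like one added from the Insert tab, so you can still connect it to other PivotTables or restyle it in the Excel window afterward.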

 

Tip PowerPivot for Excel is a new add-in (previously called “Project Gemini”) available with Excel

2010 that supports extremely large worksheets of up to 2 gigabytes (GBs). PowerPivot enables

you to model and analyze data on worksheets that include literally millions of rows and to sort,

filter, and use table lookup functions on multiple tables in Excel 2010.

 

Work Anywhere with Excel 2010

 

Chances are that the worksheets you create are changed, shared, and improved over time.

Perhaps you start a draft and share it with your team, and each person takes responsibility

for updating a specific section for the final version. When all the pieces are finished, you can

review the finished version and save and distribute the file you’ve created. Along the way,

you check on worksheet changes, make suggestions, answer questions, and add data visualizations,

charts, and PivotTables that help showcase the results you want readers to understand.

Using both Excel 2010 Web App and Excel Mobile 2010, you have the ability to view

your worksheet from any point you have Web or phone access and make sure the worksheet

is developing according to plan.

 

Excel 2010 Web App

 

Using the Excel 2010 Web App enables you to view your worksheet using SharePoint

Workspace 2010 or Windows Live and work with your favorite tools and features to do light

editing, review content, and work collaboratively. The Web window, shown in Figure 5-4, offers the consistent look and feel of the Excel 2010 interface and enables you to view, edit,

open, format, recalculate, search, and refresh the data connections in your workbook.

 

Tip In Excel 2010, Excel Services is integrated with SharePoint 2010, enabling you to share your

analyses with others in your organization. If you have SharePoint Server 2010, you can share your

worksheet in Excel Services by clicking the File tab, and, in Backstage view, clicking Share. Finally,

choose Publish To Excel Services.

 

 

 

Excel Mobile 2010

Excel Mobile 2010 has been designed specifically to give you a simple, intuitive interface

even on your smartphone’s small display. The fonts, bullet styles, and worksheet display make

it simple for you to navigate your worksheets and find what you need easily. What’s more,

when you make changes to your worksheet using Excel 2010 Mobile, worksheet values are

recalculated instantly—no syncing required.

 

With Excel Mobile, you can easily create, view, and recalculate your workbooks, and you can

add charts as needed. The worksheet on your phone will support 140 different functions, so

you won’t trade processing power for flexibility when you’re crunching numbers on the road.

 

 

 

FIGURE 5-4 Excel Web App enables you to view, edit, format, and work collaboratively in your worksheet.

 

 

 


 

Chapter 6

 

Manage Rich Communications with

Outlook 2010

 

In this chapter:

 

n Starting Out with Outlook 2010

 

n Managing Your Conversations

 

n Cleaning Up Your Messages

 

n Streamlining E-mail Tasks

 

n Coordinating Calendars

 

n Improving the Look of Your Messages

 

n Keeping in Touch with Outlook Mobile

 

Communication is at the heart of everything you do. Whether you are finishing a report for

others to review, posting a new document for human resources, wrapping up a presentation

for the sales staff, or setting and scheduling appointments with clients, being able to stay in

touch with key people is a vitally important part of your daily activities.

 

Today’s computer user receives close to 100 e-mail messages a day, and that volume is

steadily increasing. To manage so much e-mail effectively, you need to be able to separate

the necessary messages from the unnecessary ones. Microsoft Outlook 2010 includes a number of new features that enable you to easily manage the messages you receive, track

important conversations, and automate your common messaging tasks. What’s more, you

can stay up to date with all your friends and colleagues via social networks and communicate

in real time using instant messaging—all within Outlook 2010. This chapter introduces you to

the new features in Outlook 2010 that help you get control of your Inbox, communicate easily

with your team, create and use group schedules, and access Outlook from your browser.

 

 

 


 

Starting Out with Outlook 2010

 

The Outlook 2010 window gives you all the tools you need for managing e-mail and working

with calendars, contacts, and tasks. The work area is divided into five separate panes (as described in the following list and shown in Figure 6-1), each providing you with a different way to work with the information you see:

 

n The Navigation Pane enables you to choose what you want to do. The top of the pane

displays favorite folders, the center shows all active folders in Outlook 2010, and the

bottom area enables you to choose the view you want to see.

 

n The Inbox (in Mail view) lists the e-mail messages you receive, arranged according to

your selection.

 

n The Reading Pane enables you to read the selected e-mail message without opening it.

 

n The To-Do Bar lists the calendar of the current month, appointments for the current

week, and your upcoming tasks.

 

n The People Pane shows you any social media information available for the person

sending the current message, and it lists files, appointments, and notes related to that

person.

 

[Figure 6-1 callouts: Navigation Pane, Inbox, Ribbon, Reading Pane, To-Do Bar, People Pane]

 

FIGURE 6-1 The Outlook 2010 window provides you with different ways to work with messages, appointments, and tasks.

 

 

 


 

Using the Outlook 2010 Ribbon

 

Outlook 2010 includes the familiar Ribbon, designed to give you just the tools you need for

the type of operation you’re performing. The Ribbon offers seven tabs (File, Home, Send/

Receive, Folder, View, Add-Ins, and Conferencing), giving you specific tools to manage the

volume of messages, tasks, and appointments you create and receive. The Ribbon display

changes based on the view you display. For example, Figure 6-2 shows the Home tab when

Calendar view is active.

 

 

 

FIGURE 6-2 The Calendar view Home tab includes groups and tools for creating, arranging, managing, and sharing calendars.

 

Tip In Outlook 2010, you can easily customize the Ribbon to add new groups or organize tools

so that they are available just the way you want them. Click File, choose Options, and click the

Customize Ribbon category to change the way the Ribbon is displayed.

 

Setting Preferences with Backstage View

 

You’ll also discover that Outlook 2010 shares Backstage view with the rest of Office 2010, enabling you to add new e-mail accounts; modify your account settings; set up automatic replies, rules, and alerts; and specify mailbox cleanup options. (See Figure 6-3.) Backstage

view makes it easy for you to view and change settings for one or more e-mail accounts that

you use with Outlook 2010.

 

Tip Want to use Outlook 2010 for your various e-mail accounts? Display Backstage view,

click Account Settings, and click Add Account to add your other communications connections (including text messaging services) to Outlook.

 

 

 

 

 

FIGURE 6-3 You can add e-mail accounts, set out-of-office replies, and more from Backstage view.

 

Managing Your Conversations

 

Conversation view is one of the big improvements in Outlook 2010, enabling you to see at

a glance the important messages in a conversation thread. With Conversation view, you can

stay on top of changing information, make decisions in a flash, and opt out of conversations

that no longer require your input. In addition to gathering related messages in one convenient

thread, Conversation view makes it easy for you to categorize, remove, or clean up the

messages you don’t need, which cuts down on the clutter in your Inbox.

 

When you open Outlook 2010, your messages are displayed in Conversation view by default, with the most recent messages first. The Reading Pane shows the first message in the selected conversation. (See Figure 6-4.)

 

In addition to viewing and responding to messages in Conversation view, you also can organize your conversations to streamline your message-management tasks. For example,

you can choose the Move tool in the Actions group of the Home tab to tell Outlook to place

the current conversation in a specific folder. When you choose the feature Always Move

Messages In This Conversation, Outlook 2010 enables you to specify the destination folder so

that all messages related to the current conversation are stored in that folder automatically.

 

 

 

 

 

FIGURE 6-4 Use Conversation view to track conversations and bow out when they no longer require your

attention.

 

Step by Step: Following E-mail Conversations

 

Follow these steps to use Conversation view in Outlook 2010:

 

1. When you first open Outlook 2010, Conversation view is displayed by default. If

you have changed to another view, return to Conversation view by clicking the

View tab and choosing Conversation in the Arrangement group.

 

2. In the Inbox area, click a conversation group you want to view. The group expands

to show the messages most relevant to the conversation thread, as shown here:

 

 

 

 

 

3. To display all messages in the thread, double-click the first message in the conversation. The list expands to show all the messages that have been sent in this thread.

 

After you’ve read the conversation, you might want to organize it in any of the following ways:

 

n Ignore further messages in this conversation by choosing Ignore in the Delete

group of the Home tab

 

n Clean up the conversation and remove redundant messages by choosing

Clean Up Conversation in the Clean Up selection of the Delete group (also in

the Home tab)

 

n Move the conversation to a specific folder by clicking Move and choosing the

selection you want in the Actions group on the Home tab.
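If you ever need the same grouping outside the Outlook window, for example in a report, the Outlook object model exposes conversation data too. The following is a hedged sketch using the pywin32 package (not part of the book’s steps) that groups Inbox messages by their conversation topic, roughly what Conversation view does on screen:

from collections import defaultdict
import win32com.client

outlook = win32com.client.Dispatch("Outlook.Application")
inbox = outlook.GetNamespace("MAPI").GetDefaultFolder(6)  # 6 = olFolderInbox

conversations = defaultdict(list)
for item in inbox.Items:
    # Mail items expose ConversationTopic, the normalized subject of the thread.
    topic = getattr(item, "ConversationTopic", None)
    if topic:
        conversations[topic].append(item.Subject)

for topic, subjects in conversations.items():
    print("{0}: {1} message(s)".format(topic, len(subjects)))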

 

Cleaning Up Your Messages

 

Many of the e-mail messages we receive on a daily basis are unnecessary, and yet because

of the volume of messages we receive we might not organize or remove them as soon as we

read them. Outlook 2010 provides the Clean Up command to help you remove redundant

messages from your Inbox.

 

You will find the Clean Up tool in the Delete group of the Home tab. When you click the

tool, you are given the option of cleaning up the current conversation, cleaning up a specific

folder, or cleaning up the current folder and any subfolders it contains. Simply click the item

you want and Outlook 2010 displays a message box telling you that any redundant messages

removed will be placed in your Deleted Items folder. (See Figure 6-5.)

 

 

 

FIGURE 6-5 Outlook 2010 makes it easy for you to clean up conversations, folders, and subfolders and weed

out unnecessary messages.

 

 

 

Streamlining E-mail Tasks

 

Most likely, you perform many of the basic Outlook tasks regularly: you check e-mail, respond to e-mail, set appointments, schedule tasks, and track contact information. For this

reason, the designers of Outlook 2010 set out to save you time and streamline your tasks by

giving you the ability to do those tasks with a single click. Using Quick Steps, you can perform

common tasks with a single click of the mouse button. For example, you might forward

a message to your manager, reply to a meeting, or send a team e-mail message with a single

click.

 

By default, Outlook 2010 includes a set of 10 Quick Steps available in the Quick Steps group

of the Home tab. (See Figure 6-6.) You can easily set up these Quick Steps to do what you

need them to do or create your own Quick Steps based on other common e-mail tasks.

 

 

 

FIGURE 6-6 Quick Steps enable you to complete routine tasks with a single click of the mouse button.

 

Table 6-1 introduces you to each of the Quick Steps that come ready to use in Outlook 2010.

 

TABLE 6-1 Introducing Quick Steps

Quick Step       Description

Move to:?        Prompts you to choose a folder, and gives you the option of Move To Folder/Mark As Read

Forward: FYI     Forwards the selected message, and adds “FYI:” to the beginning of the subject line

Team E-Mail      Displays the Customize Quick Step window so that you can enter the e-mail addresses of teammates and set up the feature

Reply & Delete   Displays the Reply window, and deletes the existing message after you click Send

Create New       Enables you to create a new Quick Step

Meeting Reply    Sends a meeting response to the sender

To Manager       Automatically sends a message to your manager

Done             Moves the current message to a folder you specify, and marks the message as complete

Team Meeting     Creates a new meeting request for your team

 

 

 

 

 

 

 

Step by Step: Create a Custom Quick Step

 

Quick Steps were designed to simplify your common e-mail tasks. The following steps

show you how to create a new Quick Step to fit your needs:

 

1. In the Quick Steps group of the Home tab, click Create New. In the dialog box that appears, enter a name for the new Quick Step.

 

2. In the Actions area, click the down arrow to the right of Choose An Action, and

select the item you want to add from the displayed list, as shown here:

 

 

 

3. Click Add Action.

 

4. Click the Shortcut Key arrow, and choose a shortcut key from the list if you want

to assign one.

 

5. Enter text to describe the action in the Tooltip Text text box.

 

6. Click Create.

 

Tip The Quick Contacts feature enables you to locate contacts quickly in Outlook 2010. In the

Find group on the Home tab, simply click in the Find A Contact box and type the first few characters

of the person’s name. If you are using Office Communicator, a list of contacts that match

the characters you typed appears toward the bottom of the To Do Bar. If you are not using Office

Communicator, when you press Enter, the Choose Contacts message box appears, listing all contacts

that match the characters. Choose the contact you want by clicking the name and clicking

OK. The person’s contact record is displayed so that you can review or modify it and save it.

 

 

 

Working with Presence and Social Media

 

When you are working collaboratively, being able to tell when teammates are online and

available to answer a quick question is a definite plus. Outlook 2010 provides presence and

status information for your contacts so that you can easily communicate with others in real

time. When the green presence indicator shows you that a coworker is online, you can make

a quick call, start a video conference, or take a few minutes to meet virtually—communication

that enables you to keep your project moving forward.

 

If you are working with Outlook Web App or using Office Communicator, you will be able to

see the presence information in the Reading Pane and in the Contacts list. Hover the mouse

over a name in the message header to display a complete contact card with pictures and

contact information. (See Figure 6-7.)

 

 

 

FIGURE 6-7 Outlook 2010 includes presence information for your contacts in Outlook Web App and Office

Communicator.

 

Tip If you are using Outlook with Microsoft Exchange 2010, the MailTips feature enables you to

double-check the e-mail messages you send. MailTips alert you when you are sending messages

to contacts outside your office, replying to a large distribution list, or sending a message with

confidential information.

 

 

 

The People Pane at the bottom of the Reading Pane shows you the sender’s profile picture

and gives you access to additional information about that person—her recent status update,

any files she sent you, upcoming relevant appointments, and more. You can sign up with

third-party social media sites and receive status updates and profile changes without ever

leaving Outlook 2010. (See Figure 6-8.)

 

 

 

FIGURE 6-8 Outlook 2010 incorporates social media features that enable you to stay in touch with friends

and colleagues through social media.

 

Coordinating Calendars

 

One of the challenges to keeping a team moving in the right direction involves finding a

time to meet when everyone can attend. The new scheduling features in Outlook 2010 make

viewing, updating, and sharing calendars easier than ever. And you can easily create calendar groups so that all scheduling information for your team is kept together in one easy-to-understand view.

 

Viewing Group Schedules

 

The new Schedule view enables you to easily combine a number of calendars on the screen

at one time. This makes setting appointments simple and enables you to coordinate the free

and busy time your group needs to get things done.

 

 

 

Step by Step: View Calendars in Schedule View

 

You can easily view the calendars of everyone on your team using Schedule view:

 

1. First, make sure that others on your team have shared their calendars with you.

 

2. Click Calendar.

 

3. In the Home tab, click Schedule View. The display shows the calendars that you

have permissions to view, along with busy and available times, as shown here:

 

 

 

Create a Calendar Group

 

If you are part of a team that works together regularly, you might want to create a calendar

group so that you can easily view all schedules together. You can create a calendar group

by clicking Calendar Groups in the Manage Calendars group on the Home tab. If you want

to save the group of calendars in the current view as a calendar group, click Save As New

Calendar Group. If you want to select a new set of calendars and create a group, choose

Create New Calendar Group. With either selection, the Create New Calendar Group dialog

box appears. (See Figure 6-9.) Type the name you want to use for the group, and click OK.

 

 

 

 

 

FIGURE 6-9 Create a calendar group to display all schedules for your team.

 

After you create the calendar group, the group name appears in the navigation pane on the

left side of the calendar window. (See Figure 6-10.) You can hide the display of the group

when it’s not needed by clearing the check box; to redisplay the calendar group, select the

check box again.

 

 

 

FIGURE 6-10 After you create a group calendar, Outlook 2010 adds it to your Calendars list.

 

Improving the Look of Your Messages

 

Now you can apply the professional designs available in Office 2010 themes to the e-mail

messages you create and send. Outlook 2010 includes dozens of themes you can apply to the

messages you send to clients, peers, friends, and family. When you apply an Office theme to

your message, the colors and fonts that are part of that theme are reflected in the styles you

choose to format the text in your message. Later if you choose to change the selected theme,

the styles you use will also change automatically to reflect the new styles.

 

 

 

The Office themes also are consistent throughout all your Office 2010 files, which means that

you can coordinate your files so that the message you send has the same look and feel as the files you attach to it.

 

Step by Step: Applying Office Themes

 

Adding an Office theme to your e-mail messages is as simple as point and click. Here’s

how to do it:

 

1. Open the message to which you want to apply the theme.

 

2. Click the Options tab.

 

3. Click Themes. The Themes gallery appears, as shown here:

 

 

 

4. Hover the mouse over themes you want to preview.

 

5. Click the theme you want to apply to the message.

 

Note You can choose different fonts and colors from those offered by a specific Office theme,

but if you later change the Office theme applied to the message, the fonts and colors you

changed will not be updated automatically.

 

 

 

Keeping in Touch with Outlook Mobile

 

In today’s mobile world, many people are accustomed to checking e-mail, receiving text

messages, and updating their calendars online. Now, using your Windows Mobile smartphone,

you can use the high-quality phone display to view your messages in Conversation view, select and move messages using touch technology, and get online access to your calendar, contacts, tasks, and more.

 

Tip Let Outlook 2010 know which types of alerts you want to receive on your smartphone by

clicking File, choosing Options, and clicking the Mobile category in the Outlook Options dialog

box. Select the appropriate check boxes if you want Outlook to send calendar summaries and

reminders to your smartphone. Additionally, you can arrange to forward Outlook items to your

phone or find a text-messaging service provider and set up your phone for Short Messaging

Service (SMS) messaging.

 

Outlook Mobile is also available to Microsoft Exchange customers as part of the standard

licensing agreement.

 

 

 


 

Chapter 7

 

Produce Dynamic Presentations with

PowerPoint 2010

 

In this chapter:

 

n Starting Out with PowerPoint 2010

 

n Editing and Formatting Video

 

n Creating and Working with Animations

 

n Enhancing Your Presentation with Transitions and Themes

 

n Adding Sections to Your Presentation

 

n Managing and Sharing Your Presentation

 

Imagine this: You’re in a crowded board room waiting for your chance to present new ideas

to a prospective client. You and your team have been working on this presentation for weeks.

The content is just right—the approach hits the mark, and you’ve even added custom video

clips and animations to spotlight key ideas you want the client to remember.

 

PowerPoint 2010 gives you the ability to add and edit video in your presentation, edit pictures on your slides, enhance animations, choose from among improved transitions,

add great narration, compare and merge presentations, and much more. The collaborative

features in PowerPoint 2010 enable you to easily work on your presentations with a

team, communicate in real time with coworkers, and access your files anywhere—using your

browser window or your smartphone.

 

Starting Out with PowerPoint 2010

 

The PowerPoint 2010 window gives you a simple, intuitive interface that provides all the tools

you need for building effective, professional presentations. The PowerPoint Ribbon offers

tabs that include tools specific to each of nine different tasks: File, Home, Insert, Design,

Transitions, Animations, Slide Show, Review, and View. The PowerPoint work window displays

Slide view by default, which shows your current slide in the largest area of the window,

along with a segment for notes and a panel that will show all the slides you create in the presentation. (See Figure 7-1.)

 

 

 


 

[Figure 7-1 callouts: tabs, Ribbon, current slide, slide thumbnails, notes, View controls]

 

FIGURE 7-1 The PowerPoint 2010 window provides the tools for creating the current slide, adding notes, and

working with all slides in your presentation.

 

You display Backstage view in PowerPoint by clicking the File tab. The Backstage view gives

you the controls you need to work with the presentation file you’re creating. (See Figure 7-2.)

Commands in Backstage view enable you to optimize the media you include in your presentation,

set permissions for your co-authors and teammates, control version information, and

prepare the file for distribution.

 

Additionally, you use Backstage view to create new presentation files, print slides and handouts, set PowerPoint options, and choose how you want to share the presentation.

 

 

 


 

 

 

FIGURE 7-2 Backstage view enables you to set preferences for your PowerPoint files and compress and optimize the media files you use.

 

Tip If you are working with Office Communicator, you will be able to add and view authors in

the right panel of the Info display in Backstage view. If you are using Windows Live to share your

presentation, you can add authors in Backstage view but presence information will not display.

 

Editing and Formatting Video

 

Video is the big story in PowerPoint 2010. It’s no secret that video is everywhere—whether

you are interested in filming your own video clips to demonstrate a new product or service

or want to include a video from the Web for a little visual interest, you can easily incorporate

video into your PowerPoint presentation.

 

Now when you add a video to your presentation by choosing Video from File in the Media

group of the Insert tab, the file is embedded into the presentation itself, which makes packaging and presenting a snap.

 

 

 

Tip If you choose Video From Online Video Site, PowerPoint displays a dialog box so that you

can paste the embedded code from the site onto the slide. Using this process to include video

does not embed the file, however; it simply creates a link to the video online. This is an important

distinction to remember if you are presenting in a place that does not have an Internet connection.

 

The video editing features available in Video Tools Playback enable you to trim the video

without ever leaving PowerPoint. (See Figure 7-3.) When you’ve captured more video than

you need for the current presentation, having the ability to cut the clip down to size without

leaving PowerPoint is a great timesaver. You can also change the fade in and fade out values,

adjust the volume, and set playback options using the editing tools in the Video Tools

Playback tab.

 

 

 

FIGURE 7-3 You can easily trim your video in PowerPoint 2010 to select the best segment for your

presentation.

 

PowerPoint also includes video styles you can apply to the video clip in your presentation.

When you apply a style, the format remains in effect while the video plays. For example, if

you choose the Reflected Perspective Right Video Style, which angles the video slightly to the

right, the entire video clip will play at that angle.

 

 

 

Tip PowerPoint also shares the picture editing tools available throughout Office 2010. Simply

click a picture in your presentation and the Picture Tools Format tab appears, offering you tools

for making picture corrections, changing the color, applying artistic effects, applying picture

styles, and much more.

 

Step by Step: Adding and Editing Video

 

Follow these steps to add and edit video in PowerPoint 2010:

 

1. Display the PowerPoint slide where you want to add the video.

 

2. Click the Insert tab, and click Video in the Media group.

 

3. Choose Video from File.

 

4. In the Insert Video dialog box, select the video file you want to add and click

Insert. The video is placed on the slide.

 

5. Click the Video Tools Playback tab, and click Trim Video. Drag the beginning and end

markers to the place you want the video clip to begin and end, as shown here:

 

 

 

6. Test the video by clicking the Play button. Adjust the markers as needed.

 

7. Click OK to save your settings.

 

8. Change the Fade In and Fade Out settings to change the way the video begins

and ends.

 

You can continue to make changes and play the video by clicking the Play button

in the video player on the slide.

 

 

 

Tip Now you can add precision to your slides by using the dynamic alignment guides. These

guides appear automatically whenever you drag an object on the slide, helping you align the

object with other elements on your slide.
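Video can also be embedded when a deck is generated from code. Here is a hedged sketch using the python-pptx library (an assumption; the chapter describes only the PowerPoint 2010 interface, and the file names are placeholders) that adds a movie to a blank slide, the scripted counterpart of choosing Video from File on the Insert tab:

from pptx import Presentation
from pptx.util import Inches

prs = Presentation()
slide = prs.slides.add_slide(prs.slide_layouts[6])  # layout 6 is the blank layout

# demo.mp4 and poster.png are hypothetical files on disk.
slide.shapes.add_movie(
    "demo.mp4",
    left=Inches(1), top=Inches(1), width=Inches(6), height=Inches(4),
    poster_frame_image="poster.png",
    mime_type="video/mp4",
)

prs.save("video_demo.pptx")

Because the movie is embedded in the saved file, trimming, fades, and video styles can then be applied in PowerPoint 2010 exactly as described in the steps above.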

 

Creating and Working with Animations

 

PowerPoint’s improved animation features enable you to animate objects on your slide, spotlighting key points and adding movement to your presentation. The animations in PowerPoint 2010 are now more realistic, creating the effect of smooth movement for the animations you create. (See Figure 7-4.)

 

 

 

FIGURE 7-4 PowerPoint 2010 now includes improved animations you can add to virtually any object on

your slide.

 

Tip PowerPoint 2010 includes a number of new transitions that enable you to control the way

your slides advance. The transitions you choose are important because they convey the overall

tone of your presentation—slow or fast, blocks or fades, sweeps or subtle dissolves—each transition contributes to the overall effect you’re creating.

 

 

 

When you add an animation to your slide, the Trigger tool becomes available in the Custom Animation group of the Animations tab. Using the Trigger feature, you can add a bookmark to

trigger the animation you want to play on the slide.

 

PowerPoint also includes the Animation Painter, which enables you to apply animation settings to other objects on your slide. This feature is similar to the Format Painter, which enables you to choose format settings to apply to other text in your document.

 

Step by Step: Adding an Animation

 

Here’s how to add an animation to your slide in PowerPoint 2010:

 

1. Display the PowerPoint slide where you want to add an animation.

 

2. Click the object you want to animate.

 

3. In the Animations tab, click Add Animation in the Custom Animation group. A

gallery of animation options appears.

 

4. Point to different animations to see how the animation looks on the slide. Click

the style you want.

 

5. You can customize the animation by choosing the options at the bottom of the

Add Animation gallery.

 

After you add the animation, you can fine-tune the animation’s behavior by using

the tools in the Animation, Custom Animation, and Timing groups. After you

make changes, use Preview to view your changes.

 

Tip PowerPoint 2010 also includes improved narration features, which means that you now can

record voiceovers that you play back during your slide show. To try the narration feature, connect

your microphone and make sure it’s working; then click Record Slide Show in the Set Up group of

the Slide Show tab.

 

Enhancing Your Presentation with Transitions and Themes

 

If you’ve been using PowerPoint for any length of time, you’ve no doubt already discovered

slide transitions. Transitions give your presentation a professional touch by fading, dissolving,

wiping, or panning one slide into another. PowerPoint 2010 introduces great new transitions

to help you capture and keep the attention of your audience. (See Figure 7-5.) The new transitions

are easy to find and apply; and you can preview the different effects until you find

just the transition style you want. You can apply transitions to a single slide or to all slides,

and you can customize the transition by specifying the duration of the transition, deciding

whether to apply a transition sound, or choosing transition effects.

 

 

 

 

 

FIGURE 7-5 PowerPoint 2010 includes new transitions you can apply to your slides.

 

Step by Step: Applying Transitions

 

Here’s how to apply slide transitions in PowerPoint 2010:

 

1. Open the presentation to which you want to apply the transitions.

 

2. In the Transition To This Slide gallery, click the transition you want to try.

 

3. Experiment with other transitions until you find the one you want to use.

 

4. Click Effect Options to display the additional choices for the transition you selected,

and choose the effect you want.

 

5. If you want to add sound to the transition, click the Sound arrow and choose the

sound you’d like to add.

 

6. In the Duration field, set the number of seconds you want to allow for the transition

to be completed.

 

7. If you want to apply the transition settings to all slides in the presentation, click

Apply To All.

 

 

 

Adding Sections to Your Presentation

 

Now in PowerPoint 2010 you can organize your presentation into sections to enable you to

navigate through your slides in the way that fits your content best. Suppose, for example,

that your presentation introduces your audience to a new program being offered through

your human resources department. As you begin the presentation, you discover that the

group has previously seen an introductory presentation that covers all the material you introduce

in section one. Because you have organized your presentation in sections, you can easily

jump to section two of your presentation and continue with the information your audience

hasn’t seen, without missing a beat. (See Figure 7-6.)

 

 

 

FIGURE 7-6 Create sections and use them to navigate easily during the presentation.

 

Working with sections also enables you to organize long presentations so that you can work

with them easily while you’re creating and editing your work. When you add a section—by

clicking the first slide in the section you want to create, clicking Section in the Home tab, and

choosing Add Section (as shown in Figure 7-7)—the Slides view changes to highlight the new

section. Additional controls enable you to rename, collapse, expand, and remove sections

easily.

 

 

 

FIGURE 7-7 Add sections to your presentation to better organize and navigate longer slide shows.

 

 

 

Managing and Sharing Your Presentation

 

Because it’s likely that you will be collaborating with others on the presentations you create,

PowerPoint 2010 makes it easy for you to manage, compare, merge, and share presentations.

This reduces the likelihood that you’ll have multiple versions of the same file in circulation

and enables you to keep the project moving by communicating with other authors easily

while you work.

 

Merging Presentations

 

If you are combining changes that several reviewers have made in a single presentation, the

Compare tool in PowerPoint 2010 can simplify the process of putting it all together. The

Compare tool helps you see which changes have been made in the presentation and create

one merged file that reflects the latest version of the shared file.

 

The compared view shows the changes made among the versions and provides the

Revisions panel to help you see where the changes were made and what they entailed.

(See Figure 7-8.) You can click the small note icons on the slides to display information about

the specific changes made to the elements on the slide.

 

 

 

FIGURE 7-8 PowerPoint 2010 compares and displays changes among versions of the file.

 

 

 

Step by Step: Comparing Presentations

 

Follow these steps to use the compare and merge feature in PowerPoint 2010:

 

1. Open the presentation that you want to compare to another version of the file.

 

2. Click the Review tab, and click Compare.

 

3. In the Choose File To Merge With Current Presentation dialog box, navigate to

the folder with the file you want to use.

 

4. Select the file and click Open.

 

5. PowerPoint displays the merged file and shows the Revisions pane to list the

changes between the versions. Click a change in the Details pane to display a

pop-up list of changes to that element.

 

6. Select the check boxes of changes you want to keep. Repeat this step for all

changes in the presentation.

 

7. To save the merged file, press Ctrl+S. To discard the changes, close the file without

saving.

 

Tip PowerPoint includes co-authoring capability for users who work with SharePoint Workspace

2010 or Windows Live. Co-authoring gives you the ability to work collaboratively with authors

on the file. You can share the file by clicking the File tab to display Backstage view and then

clicking Share. You can enter author information in the Related People area of the Info panel

in Backstage view. If you are using Office Communicator, you can send an instant message to

another author while you work on a slide, which enables you to ask and answer questions in real

time while you work.

 

Broadcasting Your Presentation

 

PowerPoint 2010 enables you to broadcast a presentation to others at remote locations

whether or not they have PowerPoint installed on their computers. The Broadcast Slide Show

feature works with SharePoint Server 2010 or Windows Live. You simply choose Broadcast

Slide Show in the Start Slide Show group of the Slide Show tab and click Start Broadcast. (See

Figure 7-9.) The broadcast service sends your remote audience members the link to your presentation

so that they can log in using their Windows Live account and participate directly.

 

Note Broadcasting your presentation in this way displays the visual portion of your presentation only; no audio is transmitted. If audio is necessary for your presentation, you might

want to set up a conference call so that participants can hear your narration, ask questions, and

participate in the presentation.

 

 

 

 

 

FIGURE 7-9 Broadcasting a slide show.

 

PowerPoint 2010 gives you the choice of sending an e-mail message or an instant message

to the people who want to view the presentation. When others receive the link, they can simply click it to join the presentation in real-time.

 

 

 

FIGURE 7-10 Others can view your broadcast presentation in real time whether they have PowerPoint or not.

 

 

 

Printing Presentation Notes

 

If you have written a script or presentation notes that you want to have available in your presentation,

you can print them easily using Backstage view. Simply click the File tab to display

Backstage view, and click Print. In the Print display, select the Full Page Slides setting in the

Other Settings area. A pop-up gallery of print layouts appears. (See Figure 7-11.) Click Notes

Pages, and then click the right arrow below the preview document to page through your

notes.

 

Tip PowerPoint also includes the Create Handouts In Microsoft Word feature to enable you to

save your notes, along with the representative slides, to a Word document. To save the notes, click File to display Backstage view, click Share, and click Create Handouts In Microsoft Word. Finally, click Create Handouts.

 

 

 

FIGURE 7-11 You can easily print your presentation notes by choosing Notes Pages in the Print settings.

 

 

 

Save Your Presentation as a Video

 

Another new feature in PowerPoint 2010 makes it easy for you to save your presentation as

a video and share it with others. Click File to display Backstage view, and click the Share tab.

Then click Create A Video to display the available video options. Table 7-1 shows your choices

for the level of video quality.

 

TABLE 7-1 Video quality for presentations

Used For                  Resolution    Quality   Frames Per Second

Computer & HD Displays    960 by 720    High      30

Internet & DVD            640 by 480    Medium    24

Portable Devices          320 by 240    Low       14

 

 

 

 

 

After you choose the video quality, click the Use Recorded Timings And Narrations option. A

range of options enables you to disallow timings and narrations, set new timings and narrations,

preview the current settings, or use the existing settings. Click the Create Video button

to start the process, and PowerPoint saves a video as a .wmv file. (See Figure 7-12.)

 

 

 

FIGURE 7-12 Create a video of your PowerPoint presentation.
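The same command can be automated. The following is a hedged sketch that calls the PowerPoint 2010 COM object model through the pywin32 package (the file paths are hypothetical); the 720-pixel vertical resolution and 30 frames per second match the Computer & HD Displays row in Table 7-1:

import win32com.client

ppt = win32com.client.Dispatch("PowerPoint.Application")
ppt.Visible = True
pres = ppt.Presentations.Open(r"C:\Decks\quarterly.pptx")

# CreateVideo(FileName, UseTimingsAndNarrations, DefaultSlideDuration,
#             VertResolution, FramesPerSecond, Quality)
pres.CreateVideo(r"C:\Decks\quarterly.wmv", True, 5, 720, 30, 85)

# The export runs in the background; poll pres.CreateVideoStatus (or simply
# leave PowerPoint open) until the .wmv file has finished writing.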

 

 

 

Work with the PowerPoint 2010 Web App

 

As part of the access-anywhere approach available in Office 2010, you can use the

PowerPoint Web App to view and edit presentations from your Web browser. The PowerPoint

Web App offers a limited set of editing tools in the familiar Ribbon interface (shown in

Figure 7-13); you also can work collaboratively with other authors using the PowerPoint Web

App and display your presentation in full-screen view.

 

 

 

FIGURE 7-13 PowerPoint Web App displays your presentation in the familiar Office interface.

 

Using PowerPoint Mobile 2010

 

PowerPoint Mobile 2010 enables you to view your presentation slides in the familiar Office

interface designed for the simplicity of a small mobile screen. You can use PowerPoint Mobile 2010 to review your presentation notes, easily work with slides using Slide Manager, and

page through the slides you’re presenting. And in the live presentation, you can also use

PowerPoint on your smartphone to advance your slides remotely.

 

Tip If you use a Tablet PC or another device with tablet features, you can take advantage of the

better inking features in PowerPoint 2010. A greater variety of inking tools—including a large

collection of pens and expanded inking options—are available for you to use. You’ll find the

inking

features in the Review tab of your tablet-enabled device.

 

 

 


 

Chapter 8

 

Organize, Store, and Share Ideas

with OneNote 2010

 

In this chapter:

 

n Starting Out with OneNote 2010

 

n Capturing Notes Easily

 

n Working with Linked Notes and Task Notes

 

n Finding Just the Notes You Need

 

n Sharing Ideas Effectively

 

n Accessing Your Notes Anywhere

 

Are you a big note-taker? Do you scribble ideas on backs of envelopes, sticky notes, scraps of

paper, and pieces of napkins? Taking notes is a good practice—and it just might turn out that

one of those notes will have the perfect answer to a problem down the road. But if you can’t

find what you’ve scribbled later, chances are that your inspiration will go to waste.

 

Microsoft OneNote 2010 is the electronic equivalent to a handy notebook, with a twist:

the program is designed to capture text, links, Web content, video and audio clips, articles,

drawings, and more—in whatever way you collect them. Whether you prefer to doodle your

ideas, speak into a microphone, scribble cryptic notes, or clip pieces of Web pages, OneNote

makes it easy to pull all that content together in one searchable place where you can find it

easily later. And as you continue to amass ideas that fascinate you, you can easily find and

incorporate those sparks of inspiration into your current projects.

 

The new features in OneNote 2010 make it easy to collect notes from Microsoft Office 2010

applications such as Word, PowerPoint, and Outlook and print the notes to your OneNote

notebook. You can also share your notebooks and track changes by author, version, recent

changes, and more. So even though you work collaboratively, you can still separate the great

ideas from the good ideas. The search feature in OneNote is now better than ever and includes

an expanded search pane, which enables you to see at a glance all the occurrences of

the search phrase that appear in your notebook.

 

The OneNote Web App enables you to access your notebooks from any place you have

access

to the Web, and OneNote Mobile makes it possible for you to review and do simple

editing in your notebook using your smartphone. As you begin to use OneNote regularly to

collect your thoughts and plans, you’ll realize that more and more of your great ideas are

actually making it into the work you do in Office 2010.

 

 

 


 

Starting Out with OneNote 2010

 

The OneNote 2010 window gives you a great range of flexibility in the way you work with

the notes you create. Each tab in the Ribbon offers tools related to a specific note task:

File, Home, Insert, Share, Draw, Review, and View. (See Figure 8-1.) The various areas of the

OneNote window enable you to work with the notes you create in different ways:

 

n The Navigation Bar enables you to move easily among your notebooks. You can click

the expand button at the top of the Navigation Bar to display a list of all notebooks

and the pages they contain.

 

n Section tabs give you a simple way to store different types of information in your

notebook.

You can create pages and subpages within each section and rename the

section tabs to fit your needs.

 

n The page tabs display the titles of existing pages in the current section. You can click

page tabs to move among pages in a section easily.

 

n The OneNote page collects your notes in any way you want to enter them. Simply click

anywhere and type, write, or draw, or add a voice, video, or file object.

 


 

FIGURE 8-1 The OneNote 2010 window gives you access to your notebooks, unfiled notes, and much more.
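The notebook list the Navigation Bar shows can also be read programmatically. The sketch below is a minimal example, assuming OneNote 2010 and pywin32 are installed; it calls GetHierarchy on the OneNote.Application COM interface, which returns the notebook hierarchy as XML, and the attribute names follow the OneNote notebook schema.

import xml.etree.ElementTree as ET
import win32com.client
from win32com.client import constants

onenote = win32com.client.gencache.EnsureDispatch("OneNote.Application")

# Ask for the notebook level of the hierarchy; the result is an XML string.
hierarchy_xml = onenote.GetHierarchy("", constants.hsNotebooks)

# Each child element describes one notebook; "name" is the display name
# shown in the Navigation Bar and "path" is where the notebook is stored.
for notebook in ET.fromstring(hierarchy_xml):
    print(notebook.get("name"), "->", notebook.get("path"))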

 

 

 


 

Capturing Notes Easily

 

The whole idea behind OneNote is that you need one central location where you can gather

all your notes—in whatever form—so that you can organize, share, and use them easily. You

might insert pieces of interviews, maps to locations, clips of video, photos of products, meeting

notes, articles, and anything else that provides you with the information you need for the

projects you’re creating.

 

Today everyone suffers from information overload in one way or another, and we all find

ways to do our best to manage the data that comes our way. Developing a smart, fast method

of organizing your notes helps you be more efficient and bring an even greater range of

resources to your work. OneNote 2010 includes several features that make it simple to create

and work with the information you capture in your notebooks.

 

Using OneNote as You Work

 

Chances are that as you work on a specific document, worksheet, or presentation—or even

as you’re composing an e-mail message or surfing the Web—ideas are occurring to you that

relate to the task at hand. If you’re editing a report, for example, you might think of a site

you visited last week that had some interesting statistics about the topic you’re reviewing.

Using OneNote, you can make a note to yourself to look up that site when you’re finished

editing. And you can make the note right alongside the work you’re doing in Word.

 

The Dock To Desktop feature in OneNote lets you easily position OneNote in a reduced-size

window alongside other open applications. This means you can take notes—or refer to notes

you've already taken—while you're working on a document, worksheet, or presentation. To

dock OneNote to your desktop, click the Dock To Desktop tool in the Quick Access Toolbar in

the top left corner of the screen. The OneNote window shrinks to the current note page only,

as you see in Figure 8-2.

 

When you’re ready to expand the OneNote window to full screen view, click the Full Screen

icon in the Quick Access Toolbar. To return the Navigation Bar, section tabs, and page tabs to

the display, click the View tab and click Normal View.

 

Tip Although there’s no hard-and-fast rule for the way your notes are supposed to look in your

own personal notebook, using specific styles on your note headings can help you find the information

you need easily when you’re scanning through the pages. You can now apply Quick

Styles to your note headings by selecting the text (or clicking where you want the style to begin)

and clicking the style you want in the Styles group on the Home tab.

 

 

 


 

FIGURE 8-2 You can use Dock To Desktop mode to keep a OneNote page open alongside your other

application

or browser window.

 

Tip One new feature in OneNote 2010 enables you to recover notes you’ve previously deleted.

To restore deleted notes, click the Share tab, and in the History group, click Notebook Recycle

Bin to open the bin so that you can open the notes you want to preserve.

 

Create Notes Anywhere

 

For OneNote 2010 to be a hub for all the notes you collect, the program needs to work

seamlessly with your other Office applications. Perhaps you finished a presentation and realize

after the fact that you can use it as the basis for another project you’ll be working on next

month. You can save the presentation in your OneNote notebook for that project with just a

few simple clicks.

 

 

 

You can use the Print command in either Word 2010 or PowerPoint 2010 to send an entire

file—in living color—to the OneNote notebook you select. You can then clip sections, graphics,

notes, or slides; leave the entire file as is; or use the content in other documents or notebooks

you create. This technique is simple, reusable, and smart—putting your content within

easy reach for you to use in other projects down the road.

 

Step by Step: Printing a File to OneNote 2010

 

Here’s how to incorporate an entire file in your OneNote 2010 notebook:

 

1. Open the PowerPoint presentation or Word document you want to add to your

OneNote notebook. (This example uses a PowerPoint presentation, but the process

is the same for either application.)

 

2. Click the File tab to display Backstage view.

 

3. Click Print. In the Printer area, click Send To OneNote 2010, as shown here:

 

 

 

 

 

4. Click Print. The Select Location In OneNote dialog box appears, as you can see

here:

 

 

 

5. Click the expansion button to the left of the notebook to which you want to add

the file.

 

6. Click the section or page where you want the article to be placed, and click OK.

The entire file is inserted on the page you selected.
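The same print-to-OneNote step can be automated through Word's COM object model. This is a minimal sketch, assuming Word, OneNote 2010, and pywin32 are installed; the document path is illustrative, and OneNote still displays the Select Location In OneNote dialog box (steps 4 through 6) so that you can choose the target section.

import win32com.client

word = win32com.client.gencache.EnsureDispatch("Word.Application")
word.Visible = True
document = word.Documents.Open(r"C:\Demo\report.docx")

original_printer = word.ActivePrinter          # remember the default printer
word.ActivePrinter = "Send To OneNote 2010"    # the virtual OneNote printer
document.PrintOut(Background=False)            # wait for the job to spool
word.ActivePrinter = original_printer          # restore the default printer

document.Close(SaveChanges=False)
word.Quit()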

 

Working with Linked Notes and Task Notes

 

OneNote 2010 offers the Linked Notes feature in Word 2010, PowerPoint 2010, and Internet

Explorer 8. Suppose that you are writing a report in Word 2010 and you want to check something

in the notes you took at the last team meeting. While you’re working in the document,

click the View tab and then click Linked Notes. The notebook you used to gather ideas for

this document opens automatically in a panel along the right side of the Word window.

OneNote automatically saves a link to the file or Web page you are reviewing so that you

can easily return to the source of your notes later. When the OneNote window is open, you

can add, edit, move, search, or organize your notes; copy them into the Word document; or

simply

check what you wanted to review and close the file.

 

 

 

Note The Linked Notes feature does not appear automatically in the Internet Explorer toolbar.

To open a OneNote page while working in Internet Explorer 8, click Tools and choose Linked

Note from the Tools menu.

 

The first time you use the Linked Notes feature with a new file, OneNote 2010 will display the

Select Location In OneNote dialog box so that you can choose the notebook, section, and

page you want to use with the current application. Expand the notebook to display the sections

and pages; then click the OK button and the OneNote page appears in a small window

along the right side of your work area.

 

Tip Linked Notes works only with OneNote 2010, so if you have saved your notebook to be

compatible with OneNote 2007 users, you will need to convert the notebook to OneNote 2010

in order to use Linked Notes. A dialog box will prompt you to change the notebook properties if

Linked Notes is not currently available for your file.

 

OneNote 2010 is also available in Outlook 2010, enabling you to easily capture notes on

project-related tasks you create for yourself or for members of your team. To send the Task

Notes you create to a OneNote 2010 notebook, simply create and save the task; then click

OneNote in the Actions group of the Task tab. (See Figure 8-3.)

 

 

 

FIGURE 8-3 You can easily send task information to a OneNote notebook.

 

 

 

Finding Just the Notes You Need

 

If you’re like most people, a percentage of the notes you take never make it into any kind of

practical use. Why? Chances are that a number of your good ideas simply get lost—stuck in

a coat pocket somewhere, left in your other briefcase, or forgotten on the table at a restaurant.

With OneNote 2010, you capture all your information in one easy-to-navigate place,

which means you can always retrieve what you need easily using OneNote’s powerful search

tools.

 

In OneNote 2010, the search features have been expanded so that they now go through all

types of content—including video clips, embedded objects, and more. The search navigation

pane appears as you begin to type a search word or phrase, showing you all the places in

your notes where the search characters appear. (See Figure 8-4.)

 

 

 

FIGURE 8-4 The expanded search pane instantly shows results as you begin to type your search phrase.

 

You can customize where OneNote searches for information in your notebooks by clicking the

Finished: All Notebooks (Change) link at the top of the search pane, and if you want to keep

the Search Results Pane open alongside the work area, click the Open Search Results Pane

link at the bottom of the search pane.

 

 

 

Sharing Ideas Effectively

 

One of the great things about OneNote is that it serves as a creative catch-all for any type

of information you gather related to a specific project, idea, or event. No matter what your

team is working on, team members can save text notes, diagrams, sketches, audio clips, video

clips, Web links, and even doodles to the pages in your shared notebook.

 

OneNote 2010 includes a number of new sharing features that help you recognize easily who

has added what—and when. You will be able to search through the notebook by author,

highlight the changes you haven’t read, check note versions, and much more. You’ll find all

the tools you need for working with shared notebooks on the Share tab. (See Figure 8-5.)

 

 

 

FIGURE 8-5 The new Shared Notebook features in the Share tab enable you to track the changes in the

notebook

that are made by your team.

 

Creating a Shared Notebook

 

To create a new shared notebook in OneNote 2010, click the Share tab and click New Shared

Notebook in the Shared Notebook group. Backstage view appears, with Network Notebook

selected. Enter a name for the new notebook, and click Browse to select a location for the

notebook. Finally, click Create Notebook to add the new shared notebook to your files.

 

Tip After you’ve created a notebook, you can invite others to share the notebook with you by

displaying the Info page of Backstage view. All notebooks are listed, showing the title of the

notebook, the Settings button for each notebook, as well as the location of the notebook file and

the sharing status of the notebook. To display the properties for the notebook, click the Settings

button and click Properties.

 

 

 

Finding Entries by Author

 

Suppose that you’ve been discussing a new idea for a video clip with one of your teammates,

and she mentions that she posted the link to a video she liked on a page in your

shared OneNote notebook. How can you find the note she added to the file? The Find By

Author tool in the Shared Notebook group on the Share tab enables you to sort entries first

by person and then by date, showing the most recent posts first. In this way you can find the

author’s name, read the description of the addition as well as the date, and then click the one

you want to display it in the OneNote page view. (See Figure 8-6.)

 

 

 

FIGURE 8-6 Find By Author enables you to display all entries made by author, date, and description so that

you can move easily to the note you need.

 

Working with Page Versions

 

One of the challenges in working with multiple authors on a notebook is that it can be

difficult

to tell which version of the page you’re viewing. Are you adding your notes on the

most recent version of the page? OneNote 2010 includes the Page Versions feature to help

 

 

 

you determine that you’re working on the most recent page available so that you can be sure

all changes are incorporated in the page you’re viewing. Simply click the Share tab and click

Page Versions in the History group. (See Figure 8-7.)

 

 

 

FIGURE 8-7 Displaying page versions enables you to review the content added to your pages and make sure

you’re working with the most current page.

 

Tip You can also use page versions to undo changes that were made inadvertently. You can

return

the page to an earlier version, and changes will be merged and synched automatically.

 

Accessing Your Notes Anywhere

 

Following the work-anywhere trend set by the other applications in Office 2010,

OneNote 2010 enables you to add, update, and share notebooks on the Web or access and

make changes to your notes using your Windows Mobile smartphone. The OneNote Web

App enables you to view, share, and edit your notes pages; you can easily add notes, edit

existing notes, and capture links, screen shots, files, and more. Additionally, OneNote Mobile

fits the OneNote 2010 interface to the small smartphone display, enabling you to review,

update, and work with your notes while you’re on the go. When the product is available,

you will be able to set up OneNote Mobile to work with your Windows Mobile smartphone

by clicking File and choosing Options in Backstage view. Click the Advanced tab and scroll

down to the OneNote Mobile settings. Click the Install OneNote Mobile button to install the

mobile

version

of OneNote 2010 and set up the software for your smartphone.

 

 

 


 

Chapter 9

 

Collaborate Effectively with

SharePoint Workspace 2010

 

In this chapter:

 

n What Can You Do with SharePoint Workspace 2010?

 

n Starting Out with SharePoint Workspace 2010

 

n Setting Workspace Preferences

 

n Accessing Your Files Seamlessly

 

n Simplified Searching

 

n Checking Files In and Out

 

n Connecting with Your Team Instantly

 

n SharePoint with InfoPath and SharePoint Business Connectivity Services

 

n Using SharePoint Workspace on the Go

 

When you need to send a document to a teammate for review, what do you usually do?

Most people attach a document to an e-mail message and click Send. This works—somewhat

inefficiently—for a file two people share, but what happens when you have six or eight

people on your team? You could conceivably have multiple versions of the same file moving

back and forth through e-mail; how will you know when you are working on the most recent

version

of the file?

 

SharePoint Workspace 2010 takes all the guesswork out of collaboration by providing

you with an easy-to-use workspace in which you can post files you want to share, hold

discussions, chat in real time, and check files in and out. You can synchronize files with

SharePoint Server 2010, ensuring that your document libraries, InfoPath forms, and other

files are current and up to date. SharePoint enables you to work with your files—and each

other—as you collaborate in real time to complete your shared projects.

 

 

 


 

And for those times when you want to work with files stored on the server but won’t be able

to maintain your server connection while you work, SharePoint Workspace 2010 works with

SharePoint Server 2010 to enable you to take content offline so that you can do the work

you need to do and upload the files when you reconnect. Taking your content offline is a big

benefit when you’re on the road, because you can ensure you’re working with the most recent

version of the file and integrate your changes automatically when you log back into the

server. In the meantime, you can lock the file so that other users won’t inadvertently make

changes that can be overwritten later.

 

What Can You Do with SharePoint Workspace 2010?

 

SharePoint, as the name implies, is all about sharing—sharing files, sharing discussions, sharing

folders and workspaces and contacts. SharePoint Workspace 2010 is the next incarnation

of Microsoft Office Groove, which is available both as part of Office 2010 and as a standalone

product. Because sharing files and information is a flexible process that can take many forms,

you can work with SharePoint Workspace in a number of ways:

 

n Create a SharePoint workspace to create a personal copy of the server workspace on

your PC and store files you need to work on. If you use SharePoint Server 2010, this

enables

you to take your content offline and synchronize it again later.

 

n Create a Groove workspace to collaborate with team members (who may or may not

have access to SharePoint Server 2010) so that you can share files, folders, discussions,

and more.

 

n Create a shared folder in your workspace so that you can give other users or computers

access to the files stored there.

 

SharePoint Workspace 2010 gives you the space to collaborate with your team easily and

naturally. In your team workspace, you can share documents, have discussions, chat in real

time, set appointments, leave notes for each other, and much more.

 

Note SharePoint Workspace 2010 is available only in the Microsoft Office 2010 Professional Plus

version. SharePoint Server 2010, which is required for the check-in and check-out features, will be

available in the first half of 2010.

 

 

 


 

Starting Out with SharePoint Workspace 2010

 

When you first start SharePoint Workspace 2010 from the Windows Start menu, the

Launchbar opens on the left side of the window, along with a pop-up list displaying any

unread

files that have been added to the workspace since you last logged in. (See Figure 9-1.)

The Launchbar lists any workspaces you currently belong to and also enables you to view and

work with your Contacts list.

 

 

 

FIGURE 9-1 SharePoint Workspace also lets you know if your workspaces have any unread files.

 

To display a workspace, double-click the name of the workspace in the Launchbar. The

workspace

window uses the familiar Ribbon you see in other Office 2010 applications.

(See Figure 9-2.) The default tabs include the File, Home, Workspace, and View tabs.

 

 

 

 

 

FIGURE 9-2 The SharePoint Workspace 2010 window gives you what you need to work collaboratively with

your colleagues.

 

The work area is divided into three areas. On the left side of the work area is the Content

panel, where you can choose the type of content you want to view in the space. The Content

area might show Files, Meetings, Calendar, Discussion, Notepad, Pictures, or other categories,

depending on the features that are used in the workspace.

 

Tip You can add tools to the Content area by clicking the Workspace tab and clicking Add in the

Tools group.

 

What About Groove?

 

If you used Microsoft Office Groove in Office 2007, you are already familiar with the

workspace

concept and know how it can be used to collaborate with team members far and

near. In Office 2010, Groove has been renamed as SharePoint Workspace 2010, so you’ll find

similar functionality with some major improvements.

 

You can still create Groove workspaces in SharePoint Workspace 2010 so that you can

collaborate

with team members who don’t have access to SharePoint Server.

 

 

 

And now, with SharePoint Server 2010, which will be available in the first half of 2010, you will

be able to download the files or folders you need so that you can work on files while you’re

offline and then synchronize the file versions when you connect to the server once again.

 

Step by Step: Adding a Workspace

 

You can create a SharePoint or Groove workspace easily from the Launchbar.

Here’s how:

 

1. In the SharePoint Workspace 2010 Launchbar, click New in the Workspace group

of the Home tab.

 

2. Choose one of the following options:

 

n SharePoint Workspace, which creates a workspace on your computer

that enables you to download files from the SharePoint Server and work

on them offline

 

n Groove Workspace, which creates a classic Groove workspace so that

you can save files and work collaboratively with your team

 

n Shared Folder, which enables you to create a new folder you can share

with others

 

3. If you selected SharePoint Workspace, enter the server location of the space you

want to copy, as shown here:

 

 

 

 

 

4. If you chose Groove Workspace, click Options and make sure 2010 is selected in

the Workspace Version field (as shown here). Click Create.

 

 

 

5. If you chose Shared Folder, enter a name for the new folder and click OK. In the

Select A Folder For Sales Report dialog box, choose whether to create the folder

on your desktop, in another location you specify, or to use an existing folder.

After you make your choice, click OK.

 

Tip SharePoint Workspace synchronizes only your changes, which means the entire file doesn't
need to be copied each time you sync your files. This saves time and preserves bandwidth.

 

Setting Workspace Preferences

 

Backstage view in SharePoint Workspace 2010 provides you with various ways to set the

preferences

for different program features. (See Figure 9-3.) The Info tab enables you to

choose your preferences for the following elements:

 

n Change Online Connection Settings lets you choose the way the various workspaces

and communication options in your workspaces are synchronized. You can also choose

to pause the workspace or work offline.

 

n Alert Me To Workspace Changes enables you to customize the way in which your

workspace

alerts you when changes are made to the site. You can set the Alert level,

review roles and permissions, and set download preferences.

 

n Manage Account Settings makes it easy for you to set account preferences, including

whether you want SharePoint Workspace to launch when Windows starts up, how

you want your online presence to be displayed, and how you want folders to be

synchronized.

 

n Manage Messages And Contacts displays your message history, including all messages

you’ve sent and received, and it enables you to work with all contacts who have access

to the workspace.

 

 

 

 

 

FIGURE 9-3 Use Backstage view to set your workspace preferences.

 

Accessing Your Files Seamlessly

 

One of the great new features SharePoint Workspace 2010 offers is transparency. Now when

you use SharePoint Workspace 2010, you can open the files in your SharePoint and Groove

workspaces as easily as you open a file on your desktop.

 

To move directly to your workspace folders without launching SharePoint Workspace 2010,

click the Windows Start button and click your user name. Locate the Workspaces folder,

as shown in Figure 9-4. Open the Workspaces folder by double-clicking it and then move

directly

to your workspace to find the files you need.

 

 

 

 

 

FIGURE 9-4 You can access your workspace files from the folders on your computer.

 

Simplified Searching

 

Another natural integration between your SharePoint and Groove workspaces occurs in the

indexed Windows search function on your computer. Now when you search for a word or

phrase by clicking the Windows Start button and typing the phrase in the Search field, the

list of results displays any results found in the files in any of your workspaces. (See Figure 9-5.)

 

Tip You can synchronize any shared folder on your computer simply by right-clicking the folder

and clicking Shared Folder Synchronization. Choose the Start Synchronizing option from the

menu to begin the process.

 

 

 


 

FIGURE 9-5 The indexed search feature includes results from your SharePoint and Groove workspaces.
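Windows indexed search handles this for you automatically, but the idea is easy to see in a small script. The sketch below simply scans the local Workspaces folder (the folder described earlier under "Accessing Your Files Seamlessly") for a phrase; the folder path, phrase, and list of text-based file types are assumptions made only for the example.

import os

workspaces = os.path.expandvars(r"%USERPROFILE%\Workspaces")
phrase = "quarterly report"
text_extensions = {".txt", ".csv", ".htm", ".html", ".xml"}

for folder, _subfolders, files in os.walk(workspaces):
    for name in files:
        if os.path.splitext(name)[1].lower() not in text_extensions:
            continue
        path = os.path.join(folder, name)
        try:
            with open(path, encoding="utf-8", errors="ignore") as handle:
                if phrase.lower() in handle.read().lower():
                    print("Found in:", path)
        except OSError:
            pass  # skip files that are locked or otherwise unreadable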

 

Checking Files In and Out

 

Keeping your files straight is one of the challenges of working with other authors on shared

documents. SharePoint Workspace 2010 enables you to check files in and out of SharePoint

Server 2010 so that you can ensure that your work isn’t being duplicated—or overwritten—

by others working on the same file.

 

Note To use the check-in and check-out features, you must have access to SharePoint Server

2010, which will be available in 2010.

 

 

 

You can work with files easily in SharePoint Workspace 2010 by clicking Files in the Content

pane and then selecting the file you want to check out or in. In the Home tab, click Check

Out or Check In. The file is marked so that other users on SharePoint Server will see the

availability

of the file and, if it is checked out, be blocked from accessing it.

 

When you check the file back in, your changes are synchronized and the file is made

available

to your teammates once again.

 

Tip You can move back to the original SharePoint Server site by clicking the breadcrumb

trail

in SharePoint Workspace 2010. This gives you instant access to folders, files, libraries, and other

assets

on the server site.

 

Synchronize Your Data from SharePoint Server to Your

Workspace

 

If you want to save files or folders to your PC-based workspace so that you can work

on them whenever and wherever you choose, you can download the content from

SharePoint Server 2010. Simply open your SharePoint Server 2010 site, click Site Actions,

and choose the Sync To Computer option. Click OK in the SharePoint Workspace 2010

dialog box to begin the download.

 

Connecting with Your Team Instantly

 

The Groove workspace you create in SharePoint Workspace 2010 enables you to share files,

discuss projects, plan meetings, and much more. With a Groove workspace, you can work

collaboratively without having access to SharePoint Server 2010.

 

You can display and work with contact information in two different places in your SharePoint

and Groove workspaces. In the Launchbar, you can click Contacts at the bottom of the window
to display all Active, Online, and Offline contacts who have access to the current workspace.
In a workspace, scroll through the Members area in the bottom left corner of the window.

 

You can tell which contacts are currently online by the small presence icon that appears to

the left of the contact’s name. A green icon indicates that the member is online; yellow indicates

that the member is busy; red means the contact is unavailable. If a presence icon is

gray, the contact is not currently online.

 

When you find a member you want to contact, click the person’s name to display a contact

card. By default, a small contact card appears, as you see in Figure 9-6. You can expand the

 

 

 

contact card to show more information—including department, phone numbers, and other

relevant information such as organization and workspace roles—by clicking the Expand

Contact Card button in the lower right corner.

 


 

FIGURE 9-6 The contact card initially shows a select group of contact options, but you can expand the card

to show more information.

 

The different option tools in the contact card enable you to send the contact an e-mail message,

begin an instant message session, make a phone call, or set up a meeting. Additionally,

with Office Communicator 2007 R2, you can share video, share your desktop, and send files.

 

You can also open an instant messaging window by double-clicking the contact’s name in

the Members list. You can send the message to one or many team members and attach files

to the message if you like. (See Figure 9-7.)

 

 

 

FIGURE 9-7 Send an instant message to one of your contacts online in the current workspace.

 

 

 

Note To see the online presence and social contact information of others on your team, you

need to be using Office Communicator 2007 R2.

 

SharePoint with InfoPath and SharePoint Business

Connectivity Services

 

SharePoint Server 2010 makes it simple for you to work with the variety of business applications

and services you need on a daily basis—and to do that naturally, online or offline.

Now SharePoint Server 2010 supports InfoPath forms, which means you can add, delete, or

edit data and data records on your forms and trust that they will sync automatically with the

forms and data on the server.

 

Additionally, users who use SharePoint Business Connectivity Services can count on using

line-of-business data in SharePoint and know that it will synchronize seamlessly with both

their line-of-business system and SharePoint Server.

 

For both InfoPath and SharePoint Business Connectivity Services, after the data is synchronized,

the information will also be available in SharePoint Workspace 2010.

 

Using SharePoint Workspace on the Go

 

SharePoint Workspace 2010 will also be available in a mobile counterpart (sold separately

from Office 2010) after the final release of Office 2010. With SharePoint Workspace Mobile

2010, you will be able to open your workspaces, look through your files and folders, and

review

and modify your files in a special Office screen designed for your smartphone.

 

After you view, edit, and save your documents, you can easily sync them back to the server

with a single touch on your phone. This process makes it easy for you to work with your

workspace files anywhere and anytime you need them.

 

 

 


 

Chapter 10

 

Create Effective Marketing Materials

with Publisher 2010

 

In this chapter:

 

n Starting Out with Publisher 2010

 

n Creating and Using Templates and Building Blocks

 

n Creating Precise Layouts

 

n Enhancing Typography with OpenType Features

 

n Working with the Improved Color Palette

 

n Previewing and Printing Publications

 

n Preparing for Commercial Printing

 

n Sharing Publisher Files

 

How do you create your marketing materials today? If you are spending a big portion of your

budget outsourcing four-color postcards, brochures, newsletters, and more, you can do the

job closer to home with Publisher 2010 and save money, time, and effort. What’s more, you

can create and save reusable content—called building blocks—that you can insert in future

materials, which helps you ensure that your messaging stays consistent no matter what kinds

of materials you create.

 

Improvements in Publisher 2010 make it easier than ever to create new files using both

built-in and community-submitted templates. New layout tools help you align objects, place

captions, and position elements on the page in accurate and aesthetically pleasing ways. You

can also spruce up your photos with artistic effects, improved editing tools, and support for

OpenType features such as ligatures and stylistic sets.

 

Starting Out with Publisher 2010

 

The Microsoft Publisher 2010 window gives you plenty of room to work on screen, while

keeping the tools you need within reach. (See Figure 10-1.) The Ribbon includes seven tabs—

File, Home, Insert, Page Design, Mailings, Review, and View—and each tab contains groups

of tools related to the tab topic. On the Insert tab, for example, you’ll find Picture in the

Illustrations group, enabling you to add pictures to the current page with just a few clicks of

the mouse.

 

 

 


 


 

FIGURE 10-1 The Publisher 2010 window includes the Ribbon and the Page Navigation pane.

 

The Publisher window also includes a scratch area surrounding the entire publication. The

scratch area enables you to place objects partially off the page so that you can create bleeds

(photos, backgrounds, or other graphical elements that print all the way to the edge of the

page). In Publisher 2010, you can choose to hide or display the scratch area so that you can

see the overall design, including bleeds, as well as the page as it will appear when printed.

 

Collapse and Expand Page Navigation Pane

 

The Page Navigation pane along the left side of the window displays thumbnails of the pages

in the current document, which enables you to get a sense of the document flow and overall

layout at a glance. You can use the Page Navigation pane to scroll through the different

pages in your document, checking text flow, placement of illustrations, format of headings,

and other parts of the design.

 

 

 


 

If you want to increase the amount of space available for the current page, you can collapse

the Page Navigation pane, which reduces the size of the displayed thumbnails. To expand the

pane, simply click the Expand button at its upper-right corner.

 

Use the Mini Toolbar

 

Now Publisher 2010 also includes the Mini Toolbar, a set of text-formatting tools that appears

when you select text in your document. When the Mini Toolbar first appears, it is transparent;

when you point to the toolbar it becomes solid, as Figure 10-2 shows. If you move the mouse

pointer away from the toolbar, it disappears altogether. In this way, the tools are within reach

if you need them, but they fade away if you don’t.

 

 

 

FIGURE 10-2 The Mini Toolbar displays formatting tools when you select text.

 

Tip If the Mini Toolbar doesn’t appear automatically when you select text in your document,

you can turn the feature on by clicking File and choosing Options. On the General tab, click the

Enable The Mini Toolbar option.

 

Creating and Using Templates and Building Blocks

 

Publisher 2010 offers dozens of built-in templates you can use to create letters, newsletters,

brochures, business cards, calendars, labels, and much more. When you choose to create a

new publication based on a template, you can use one of the templates installed with the

software or access templates available online in the Publisher community. (See Figure 10-3.)

 

In addition to using templates to start your publication, you can add predesigned elements

to your pages by choosing from a gallery of page parts, known as building blocks. Building

blocks are available in the Page Parts tool in the Building Blocks group of the Insert tab.

(See Figure 10-4.)

 

 

 

 

 

FIGURE 10-3 You can choose to begin a new publication based on an installed or online template or create

your own design on a blank page.

 

Tip A building block is a part of a page that you might want to use regularly in your publications.

 

 

By default, Publisher includes headings, pull quotes, sidebars, and stories in the Page Parts

gallery of the Building Blocks group on the Insert tab. You can insert the building blocks as

they are and then customize them to fit your publication, or you can create your own Page

Part and then save it as a building block. Either way, the building blocks feature can save

you time and effort and help you provide a consistent look and feel among the various

publications

you create.

 

 

 

 

 

FIGURE 10-4 Page Parts can save you time and add a professional touch to your page design.

 

Step by Step: Creating a Building Block

 

Here’s how to create a new building block in Publisher 2010:

 

1. Open the Publisher document you want to use.

 

2. Create and select the element you want to save as a building block. Hint: You

might want to save a page heading, report title, pull quote, table, or other

often-used element.

 

3. Right-click the element you selected and choose Save As Building Block.

 

 

 

 

 

4. In the Create New Building Block dialog box, shown here, enter a title and

description.

 

 

 

5. Click the Category arrow, and choose the category that best applies to the type

of element you’ve created.

 

6. Click OK to save the building block.

 

 

 

Creating Precise Layouts

 

Publisher 2010 also includes dynamic layout guides to help you position elements precisely

on the page. Guides appear automatically as you drag an object—text, picture, or shape—on

the page. Turn on the display of guides by clicking Guides in the Show group of the View tab.

 

Vertical and horizontal guides help you ensure you’re positioning elements so that they

align with other objects on the page. Guides appear and disappear as you drag the object

so that they take up space on-screen only in the areas where you need to use them.

(See Figure 10-5.)

 


 

FIGURE 10-5 Publisher 2010 provides dynamic guides that help you place objects on the page.

 

Enhancing Typography with OpenType Features

 

Both Publisher 2010 and Word 2010 are now able to make use of OpenType features such

as ligatures and stylistic sets in the fonts that offer them. Ligatures are a combination of two

letters

shown typographically as a single character in some fonts. For example, the letters

 

 

 

fi in some typefaces are placed close together and shown as a single character. This type of

text control is used most often in high-quality typography work.

 

Similarly, stylistic sets offer a variety of appearances in the selected font. The Typography

tools are found on the Text Box Tools Format tab, which appears when a text box is selected

in your Publisher document. Figure 10-6 shows some of the stylistic sets available for the

Gabriola font.

 

 

 

FIGURE 10-6 Publisher 2010 enables you to take advantage of professional typography features available

with some OpenType fonts.

 

In addition to ligatures and stylistic sets, Publisher 2010 also supports number styles,

stylistic

alternates, and swash features. Figure 10-7 shows the primary Typography tools, and

Table 10-1 provides a description of each one.

 

 

 

FIGURE 10-7 Publisher 2010 offers different OpenType features you can apply to fonts that support them.

 

 

 

TABLE 10-1 Typography features in Publisher 2010

 

Tool                   Description
Ligatures              Enables you to choose whether to use ligatures in the document (and, if so, what kind)
Number Style           Sets the appearance of numerals in the selected font in the current document
Stylistic Alternates   Offers alternate characters you can use in the text in your document
Stylistic Sets         Displays a gallery so that you can choose the format style of the selected font
Swash                  Works as a toggle, and turns on or off decorative text elements

 

 

 

 

 

Tip Publisher 2010 also includes Paste with Live Preview, which enables you to preview the way

an object will look before you paste it in your Publisher document.

 

Working with the Improved Color Palette

 

Publisher 2010 updated its color palette to include elements that help you keep a consistent

look and feel throughout the materials you create. Now you can stay true to the color

scheme you selected and apply a variety of tints, shades, and gradients to the text and

shapes on your pages. You’ll find the new palette on all border and fill tools—for example,

Figure 10-8 shows the color palette that appears when you click the Shape Fill tool in the

Shape Styles group of the Drawing Tools Format tab.

 

 

 

FIGURE 10-8 The improved color palette now displays color scheme selections and expanded color choices.

 

 

 

Previewing and Printing Publications

 

The Print feature in Publisher 2010 now enables you to preview, adjust, and print all in the

same screen in Backstage view. (See Figure 10-9.) When you click the File tab and click Print,

you see the current page of your open publication, complete with the page margins, headers

and footers, and more. You can easily choose the print options you need—for example, select

the printer you want to use, choose the print layout and paper style, and select whether

you want to print as an RGB color publication or a composite black and white.

 

 

 

FIGURE 10-9 The Preview And Print interface in Publisher 2010 enables you to make last-minute changes and

print, all from the same screen.

 

Have you ever printed a double-sided report only to find out that the image on the back of

the page made the text on the front hard to read? Publisher 2010 includes a backlight feature

that enables you to see through the page on double-sided publications so that you can

avoid that kind of situation in the future. When you choose two-sided printing in the print

options, the Decrease Transparent View and Increase Transparent View tools appear at the

upper-right corner of the preview window. (See Figure 10-10.) To change the transparency

and display the back of the page while you’re looking at the front, drag the Transparency

slider to the right. You can also turn the page and view the transparency from another

perspective

using the Front and Back tools at the bottom of the preview area.

 

 

 

 

 

FIGURE 10-10 The Transparency tools become available when you choose two-sided printing.

 

Tip Before you finalize your design, be sure to run the Design Checker to identify and correct

design problems in the publication. You’ll find the Design Checker on the Info tab of Backstage

view.

 

Preparing for Commercial Printing

 

Publisher 2010 includes expanded support for the four-color process and spot color printing,

including CMYK composite postscript and Pantone colors (both PMS and the new Pantone

GOE color system). You’ll find the tools you need to prepare a file for commercial printing by

clicking the File tab to display Backstage view. Click Info, and click Commercial Print Settings.

 

Commercial Print Settings enables you to choose the color model you want to use, work

with the embedded fonts in your publication, and manage the registration of the document.

(See Figure 10-11.) When you’re ready to finalize the file, click File and in Backstage view, click

Share and choose Save For A Commercial Printer.

 

 

 

 

 

FIGURE 10-11 Commercial Print Settings enable you to prepare your publication for professional printing.

 

Sharing Publisher Files

 

You can share the files you create in Publisher 2010 in various ways. You can e-mail pages

from within Publisher, create a PDF/XPS document, publish the document as HTML, save the

piece for a commercial printer, or save the publication for another computer. You’ll find these

options in the Share tab of Backstage view.

 

Tip Before you send your Publisher document by e-mail, you can preview it by clicking File

to display Backstage view and choosing Share. Click E-mail Preview to display a version of the

document

as it will appear to the recipient.

 

 

 


 

Chapter 11

 

Make Sense of Your Data with

Access 2010

 

In this chapter:

 

n Starting Out with Access 2010

 

n Using Application Parts

 

n Applying Office Themes

 

n Adding New Fields

 

n Showing Data Bars and Conditional Formatting

 

n Creating Navigation Forms

 

n Designing Access 2010 Macros

 

n Working with Access 2010 and the Web

 

Whether you work daily with large, sophisticated databases or occasionally create small

data tables to meet a specific need, Microsoft Access 2010 enables you to gather, organize,

analyze, report on, and share your data easily and effectively. New and improved features

in Access 2010 simplify the steps to creating a database by enabling you to add application

parts that include ready-made tables and forms. You can also use Quick Start fields to insert

commonly used fields and add calculated fields to build data analysis directly into your data

tables.

 

On top of the simplified tasks involved in creating and analyzing your data, Access 2010

includes

new data visualizations—including new data bars and improved conditional formatting—

that can tell the story of your data at a glance. You’ll also find plenty of Web support in

Access 2010. With little effort, you can create a Web database and publish your data online

so that it’s always available at any point you have Web access.

 

Starting Out with Access 2010

 

The first thing you’ll notice as you begin to work with Access 2010 is that the application has

the friendly and familiar Office interface that is common to other Office applications you

might use. The Ribbon includes five tabs—File, Home, Create, External Data, and Database

Tools—that offer sets of tools organized according to the data tasks you’ll be performing.

 

 

 


 

In addition to these five tabs, Access 2010 displays the Table Tools contextual tabs (Fields and

Table) when you work with a data table. (See Figure 11-1.)

 


 

FIGURE 11-1 The Access window is designed to make it easy for you to work with the data objects and views

you need.

 

Below the Ribbon, the Access window is divided into two main windows. The All Access

Objects pane on the left side of the screen lists the various elements—tables, reports, forms,

and more—in your current database. To open an object, you double-click it in the All Access

Objects pane; the item then opens in the work area on the right side of the window. You can

have many open objects in Access at one time, and you can change the current display by

clicking the tab of the object you want to see.

 

Tip You can change the elements displayed in the left pane by clicking the arrow to the right of

the pane heading. You can choose to display objects by category or by group. Click the option

you want to display and the pane changes accordingly.

 

 

 


 

Along the bottom of the Access window you find controls that enable you to move through

records in the current data table, search for information in the database, or choose the view

you want to use to work with your data.

 

In addition to the flexible, easy-to-navigate user interface, Access 2010 offers behind-the-scenes

features that help you manage your data files. Backstage view pulls together all

the tools you need to create, share, and set preferences for the various files you create.

(See Figure 11-2.)

 

 

 

FIGURE 11-2 Use Backstage view to work with the files you create in Access 2010.

 

Using Application Parts

 

Designing every database you create from scratch takes a lot of time and effort, and now

with Access 2010 there’s no need to reinvent the wheel every time you create a new data

table. Using the application parts in Access 2010, you can add ready-made forms and tables

to your Access database. By default, Access 2010 includes a number of blank, predesigned

forms as well as Quick Start tables (Comments, Contacts, Issues, Tasks, and Users) you can

add to your database.

 

 

 

Step by Step: Adding an Application Part

 

Follow these steps to add an application part to your Access database:

 

1. Open the database you want to use.

 

2. Click the Create tab.

 

3. Click Application Parts. The gallery appears, as shown here:

 

 

 

4. Click the application part you want to add to your database, and the form or

table you selected is inserted in the All Access Objects pane on the left side of the

work area.

 

Tip You can easily search for Access templates to use as the basis for a new database from

Backstage view. Click File to display Backstage view and then click New. Click in the Office.com

Templates box, and type a word or phrase indicating the type of template you’d like to find.

 

Applying Office Themes

 

Now you can apply the professionally designed Office themes—which include color scheme,

font selections, and styles—to your forms and reports in Access 2010. (See Figure 11-3.) This

level of consistency enables you to create a similar design for all the documents you create

 

 

 

 

in Office 2010. For example, suppose you’re putting together a lengthy sales report that

spotlights key products, provides sales data by region, and includes case studies showing

the ways in which your products are being used by your customers. Using the same Office

theme, you can prepare the case studies in Word 2010, the financial data in Excel 2010, and

the sales reports by region in Access 2010.

 

 

 

FIGURE 11-3 Office themes in Access 2010.

 

Step by Step: Applying an Office Theme

 

Here’s how to apply an Office theme in Access 2010:

 

1. Open the database you want to use in Access.

 

2. Open the form or report to which you want to apply the theme.

 

3. Click the Home tab.

 

4. Click Views, and choose either Design View or Layout View.

 

 

 

 

 

5. In the contextual Design tab, click Themes in the Themes group.

 

6. Preview a theme by pointing to it; the report or form display shows you how the

theme will look when it is applied to your data.

 

7. Click the theme you want to use, and it is applied to the form or report.

 

Adding New Fields

 

New field features in Access 2010 enable you to reduce the amount of time you spend setting

common fields in your databases by using Quick Start fields, and they help you expand

your data processing power by adding calculated fields to your tables.

 

Adding Quick Start Fields

 

New Quick Start fields in Access 2010 enable you to add fields you use regularly to your

data tables with a simple click of the mouse. Instead of adding Address, City, State, ZIP, and

Country codes one by one, for example, you can click the Address Quick Start field to add

all the fields in one click. Access 2010 includes nine Quick Start fields by default: Address,

Category, Name, Payment Type, Phone, Priority, Start And End Dates, Status, and Tag.

 

 

 

To add a Quick Start field to your data table, open the data table you want to use and click

to select the field to the right of the place you want to add the field. In the Table Tools Fields

tab, click More Fields in the Add & Delete group. Scroll down to the bottom of the list to find

the Quick Start fields, and click the one you want to add to the data table. (See Figure 11-4.)

 

 

 

FIGURE 11-4 Find the Quick Start fields in the More Fields list.

 

Each Quick Start field you add has preset field options already included. For example, when

you add the Payment Type Quick Start field, the added field includes Cash, Credit Card,

Check, In Kind, and Debit selections as part of the field. You can customize the field to include

the selections you want by right-clicking the field and choosing Edit List Items. In the

Edit List Items dialog box (shown in Figure 11-5), you can modify, remove, or add values to

the list by clicking and typing the new entry. Click OK to save your changes.

 

Tip What should you try first in Access 2010? Jeff Conrad, author of Microsoft Access 2010 Inside

Out (Microsoft Press, 2010), recommends these three things:

 

n Publish and share your database to Access Services, and view your forms and reports in a

Web browser.

 

n Try the new navigation form to see how simple it makes creating a navigation system.

 

n Attach data macros to table events, and create named data macros to incorporate more

business logic in your data tables.

 

 

 

 

 

FIGURE 11-5 You can easily edit the list items included in a Quick Start field.

 

Inserting Calculated Fields

 

Another new field feature in Access 2010 enables you to easily create and store calculations

that enable you to analyze your data. You can then apply the calculated field throughout

your database as needed. To add a calculated field to your data table, click More Fields in the

Add & Delete group of the Table Tools Fields tab. Then choose the field type for the type of

calculated field you want to create, and Access 2010 displays the Expression Builder dialog

box so that you can choose the elements, categories, and values to use in the calculation.

(See Figure 11-6.)

 

 

 

FIGURE 11-6 Add calculated fields to your data table.
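The calculated field itself is defined through the Expression Builder, but the same kind of expression can also be evaluated in a query from outside Access. The following is a minimal sketch, assuming the pyodbc package and the Microsoft Access ODBC driver are installed; the Sales.accdb database, the Orders table, and its columns are hypothetical names used only for illustration.

import pyodbc

connection = pyodbc.connect(
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\Demo\Sales.accdb;"
)

cursor = connection.cursor()
# The expression mirrors what you might build in the Expression Builder.
cursor.execute("SELECT OrderID, Quantity * UnitPrice AS LineTotal FROM Orders")
for order_id, line_total in cursor.fetchall():
    print(order_id, line_total)

connection.close()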

 

Tip You can change the expression you’ve used to create a calculated field by right-clicking the

field label in the data table and choosing Modify Expression. In the Expression Builder dialog

box, modify the calculation as needed and click OK.

 

 

 

Showing Data Bars and Conditional Formatting

 

Not everyone can look at a table full of data and know instantly what it means. Some of

us need a little help interpreting facts and figures in a table or on a report. For this reason,

Access 2010 includes data visualization features that enable you to include data visualizations

in your tables and reports that will help your readers understand what your data means.

 

Data bar visualizations are helpful when you want to compare data among the records

in your report. For example, if you want to compare projected workshop attendance with actual
attendance data, you can let data bars show you where your marketing efforts were successful
and where they fell short. (See Figure 11-7.)

 

 

 

FIGURE 11-7 Data bars work with numeric fields, enabling you to visually contrast the data you’re reporting.

 

Tip Conditional formatting is easier in Access 2010, thanks to the addition of the Conditional

Formatting Rules Manager. Now you can create new rules that specify conditions for the

conditional

formats and preview the effects of your changes before you apply them.

 

 

 

Creating Navigation Forms

 

When you’re working with forms and reports in Access 2010, you can simply drag fields to

where you want them to create just the type of layout you want. Access 2010 also includes a

new Navigation Forms gallery, with a number of layouts you can customize to make it easy

for others to find the forms and reports they want to view in your database.

 

To create a navigation form, click the Create tab and choose Navigation in the Forms group.

(See Figure 11-8.) Click the navigation form layout you want to use, and then drag the reports

and forms you want to include from the All Access Objects pane to the navigation area

of the new form. Those reviewing your information will be able to click the name of the form

or report to display the data in the Access 2010 window.

 

 

 

FIGURE 11-8 Use the Navigation gallery to choose the layout of the navigation form you want to create.

 

Designing Access 2010 Macros

 

Although the word macros might make your eyes glaze over if you’re not interested in automating the data logic and processing in your database, two improvements in Access

2010 offer good news to macro aficionados. First, new data macros enable you to add data

logic to the actual table of data rather than requiring you to work at a form level. And the

enhanced Macro Designer is now more intuitive than ever, providing a look and feel that

enables you to build macros easily by dragging items where you want them to appear and

arranging them in the proper sequence. (See Figure 11-9.)

 

 

 

 

 

FIGURE 11-9 The improved Macro Designer in Access 2010 makes it easy for you to build macros by selecting

and dragging the items you need to the location you choose.

 

Working with Access 2010 and the Web

 

One of the major stories in Office 2010 is the ability to access your files—documents, presentations, worksheets, notebooks, and databases—anywhere you have access to the Web. Access 2010 enables you to create a Web database so that you can use it with SharePoint Server 2010 to publish your entire database—including tables, forms, and reports—and view it in a browser window.

 

Create a Web database by starting in Backstage view. Simply click File, choose New, click

Blank Web Database, and click Create. You can then create the database as usual, adding

data tables, forms, and reports. When you are ready to publish your database to the Web,

return to Backstage view by clicking File and then click Share. Choose Publish To Access

Services, and type the necessary information for the SharePoint site that will post the file.

 

Tip Before you post your database to the Web, be sure to run the Compatibility Checker to look

for any data items or settings that won’t function properly online. The Compatibility Checker is

found in the Share page of Backstage view.

 

 

 

Adding Web Controls

 

Another great Web feature included in Access 2010 makes it possible for you to incorporate

Web content in the database you are creating. This might enable you, for example, to provide live access to Web 2.0 content from within your database.

 

Display the form on which you want to add the Web Browser control, and then, in the Home

tab, click View and then Layout View or Design View in the Views group. In the Controls

group, choose Web Browser, and then click and drag at the point you want the control to

appear. When you release the mouse button, the Insert Hyperlink dialog box appears so that

you can enter the Web page or choose the item you want to include in the Web Browser

control. After the element is added, you can resize the object as needed in the Access window by simply dragging the corner or side of the object. (See Figure 11-10.)

 

Tip Now it is easier to add databases to your Trusted Documents list. When you open a database created by someone else, a Message Bar appears at the top of the Access 2010 window.

Macros are automatically disabled until you indicate that the database is a Trusted Document. To

enable the full functionality of the database, click Enable Content.

 

 

 

FIGURE 11-10 Incorporate Web content in your database using the Web Browser control in Access 2010.

 

 

 

Using Access 2010 with SharePoint

 

If you work with SharePoint Server 2010, additional Web features are also available to you in

Access 2010. First, you can take a Web database into offline mode so that you can continue

to work on the data as needed; the next time you connect to the Web, any changes you

made to the offline data are automatically synchronized with the database on the server. You

can also synchronize your data manually by clicking File to display Backstage view, and in the

Info page, clicking Sync All.

 

You also have the ability to save your database to SharePoint Workspaces 2010 so that you

can access your data using your Web browser or smartphone.

 

Gathering Data with InfoPath 2010

 

Microsoft InfoPath is a forms-creation and data-gathering tool included with Office

2010 that helps you collect and consolidate the data you need and then share that data

with colleagues in a variety of ways. (See Figure 11-11.)

 

 

 

FIGURE 11-11 InfoPath offers a collection of templates you can use as the basis for your forms, or

you can begin with a blank form and create your own.

 

 

 

InfoPath includes a variety of form templates you can use as the basis of the new forms

you create, or you can choose to start with a blank form and add the fields yourself.

InfoPath 2010 enables you to create professional forms by dragging fields where you

want them to be and arranging them in a way that makes sense for your data. You can

also add pictures and buttons to your forms, guarantee accuracy by running the Spell

Checker, include ScreenTips to prompt people filling out your forms, and share forms

easily using SharePoint Server 2010 and SharePoint Workspace 2010. Also, thanks to

integration with SharePoint Workspace 2010, users can complete the forms you create

online or offline.

 

InfoPath Filler makes it easy for users to open and complete the forms you create. You

set up the way you want users to submit completed forms by choosing to receive the

form by e-mail, have the data sent to a SharePoint library or SharePoint server, or use

Web services to gather the information. You can also include your InfoPath 2010 forms

in Outlook 2010 messages, which makes it possible for you to collect information by

e-mail and store it in the database or SharePoint library you specify.

 

 

 


Part III

 

Next Steps with Office 2010

 

Now that you have explored the new features in each of the core Office 2010 applications,

this part of the book helps you answer the question, “What next?” Because Office 2010 is an

integrated suite of applications, you can use the programs together gracefully to get more

out of the files you create and share. This part also spotlights the security measures you’ll

find in Office 2010 and introduces various ways you can learn more about the programs and increase your own proficiency.

 

You’ll find the following chapters in this closing part of First Look: Microsoft Office 2010:

 

• Chapter 12: Putting It All Together

• Chapter 13: Security in Office 2010

• Chapter 14: Training Made Easy

 

 

 


 

Chapter 12

 

Putting It All Together

 

In this chapter:

 

• Using Excel 2010 Data with Word 2010

• Sharing SmartArt Among Office 2010 Applications

• Dragging Word 2010 Content to PowerPoint 2010

• Mail Merging Word 2010 Documents in Outlook 2010

• Sharing Access 2010 Data with Other Office Applications

• Scheduling a Meeting from a Shared Document

 

One of the great benefits of working with the Office 2010 suite is that no matter which applications you use most often, they all have a similar look and feel. This means that even

if you use PowerPoint 2010 only when you need to make a presentation to the board, or

you use Publisher 2010 only when you need to create business cards, you can find your way

around the programs easily using the familiar Ribbon and Backstage view.

 

Office 2010 enables you to share your work among applications—and among colleagues—

more easily than ever before. A number of integrated features make it simple to use what

you create in one application seamlessly in another. That saves you time and effort, and it

guarantees consistency throughout your creations no matter what kinds of final files you create and share. The examples in this chapter provide just a few of the ways you can share

your data among applications—take the time to discover how you can share what you create

in an application to get more from your work in Office 2010.

 

Using Excel 2010 Data with Word 2010

 

Once upon a time, to include worksheet data in your documents, you needed to create a

table and enter the values by hand. We’ve come a long way since those early days, thankfully,

and Paste with Live Preview makes it easy not only for you to copy and paste (or drag) your

worksheet information into your document, but to paste it just the way you want it—with or

without formatting. (See Figure 12-1.)

 

 

 


 

 

 

FIGURE 12-1 Choose the way you want data to be pasted in your document.

 

The paste options provide you with a variety of choices that control the way the information

is inserted into the document. Table 12-1 explains the various paste options shown in

Figure 12-1.

 

TABLE 12-1 Paste options

Keep Source Formatting: Pastes the copied information using the same formatting options that were applied in the source document.

Use Destination Styles: Pastes the new information into the document using the styles existing in the receiving document.

Link & Keep Source Formatting: Preserves the formatting of the original document, but links the data to the original file so that changes will be reflected in the pasted information if you update the source document.

Link & Use Destination Styles: Uses the style formatting in the receiving document while maintaining a link to the original file so that the data is updated if you change the original document.

Picture: Pastes the information as a picture object in the document.

Keep Text Only: Pastes the data as text only, with no applied formatting.

 

 

 

 

 

 

 


 

Tip In the Paste Options list, you can click Set Default Paste to display the Advanced options and

specify the default pasting styles you want to use in the current application.

 

Sharing SmartArt Among Office 2010 Applications

 

SmartArt in Office 2010 has been enhanced to provide new support for pictures as well as

additional layouts for you to use in the different Office applications. More good news is

that you can create a SmartArt diagram once—in, for example, Word 2010—and use the

same diagram in Excel, PowerPoint, Outlook, OneNote, and even Access. Figure 12-2 shows

a drag operation where a SmartArt diagram is dragged from Word 2010 to an Excel 2010

worksheet.

 

 

 

FIGURE 12-2 Drag SmartArt from Word to Excel.

 

Tip You can also save a SmartArt diagram as a picture in one of four formats (PNG, GIF, TIFF, or

WMF) so that you can easily post it in Web content, import it into a layout program, or use it in

any other application that works with standard graphics files. Note, however, that the picture file

you save will not have the interactive diagramming features that are available when you work

with the SmartArt diagram in your Office 2010 applications.

 

 

 

Dragging Word 2010 Content to PowerPoint 2010

 

Depending on your creative style, you might be more comfortable drafting your ideas in

Word and then porting them into PowerPoint. You can easily create your outline, paragraphs,

or bulleted lists in a Word document and then drag that content into PowerPoint. (See Figure 12-3.)

Again, Paste with Live Preview comes into play.

 

After you place the content in PowerPoint 2010, you might still have some basic formatting

to do to fit the right content on the right pages, but the process is much faster than

retyping the information you need. You can also drag notes from Word into the Notes area

of your PowerPoint slides and then print the notes pages (with text and slide images) from

within PowerPoint 2010.

 

 

 

FIGURE 12-3 Drag an outline from Word to PowerPoint.

 

 

 

Mail Merging Word 2010 Documents in Outlook 2010

 

Suppose that you want to send a new e-mail catalog list to all your customers. Rather than

beginning in Word 2010, drafting your document, and then connecting a contact list you

exported from Outlook 2010 and saved in Excel 2010, you can do the whole process in two

simple steps:

 

1. Create the document you want to send in Word 2010.

 

2. Mail merge the document with your Outlook contacts and send it via e-mail.

 

Nice, right? The new mail merge feature in Outlook 2010 enables you to easily create mail

merge projects you can use to send form letters, mailing labels, envelopes, and catalogs.

(See Figure 12-4.) To find the Outlook Mail Merge feature, click Contacts at the bottom of the

navigation pane on the left side of the Outlook 2010 window and then click Mail Merge in

the Actions group of the Home tab. The Mail Merge Contacts dialog box opens, offering you

a collection of options you can use to set up your merge.

 

 

 

FIGURE 12-4 Use the Outlook Mail Merge feature to simplify mail merge projects.

 

 

 

In the Document File area of the Mail Merge Contacts dialog box, you can create a new

document or choose Existing Document and click Browse to navigate to the document you

created in Word. Click Open to add the file to the Mail Merge Contacts dialog box, and, after

adding the contact data file, click OK to start the merge.

 

Sharing Access 2010 Data with Other Applications

 

Although the features that enable you to use Access data with other Office 2010 applications

aren’t new in this latest release, the program makes it easy for you to share your Access

data with files and messages you create in Excel 2010, Word 2010, Outlook 2010, and even

OneNote 2010. You’ll find the tools you need in the External Data tab of the Access 2010

window. (See Figure 12-5.)

 

[Figure 12-5 callouts: use as data source for Word mail merge; export as e-mail; save as text file; export to an Access database; create a PDF file; create an XML data file; save as Excel workbook]

FIGURE 12-5 Share Access 2010 data with Excel, Word, Outlook, and OneNote 2010.

 

You can easily use portions of your Access data in other applications as well. Suppose that

you’re reviewing your customer database and you see that several customers haven’t been

contacted in more than a year. You can make a note to yourself in your OneNote notebook

to follow up with those customers by simply dragging the highlighted names and e-mail

addresses to your To Do notebook. Or you can create a task in Outlook and assign the follow-up calls to someone else on your team. By simply dragging the content, or cutting and

pasting it (using Paste with Live Preview), you can share the data in ways that enable you to

follow through on action items that might otherwise slip your mind.

 

 

 

Scheduling a Meeting from a Shared Document

 

And of course one of the biggest stories in Office 2010 is the ability you now have to share

your files, in real time, from within core applications. For example, if you co-author a Word

document and notice that one of your co-authors is working on the document while you’re

editing it, you can send her a quick instant message to ask a question about the file.

 

To contact co-authors while you work on a document, click the File tab and then click Info in

Backstage view. In the Author area on the right side of the view, double-click the name of the

co-author you want to contact. The contact card appears, listing various ways you can reach

the co-author. (See Figure 12-6.)

 

 

 

FIGURE 12-6 Contacting a co-author from within a shared document is a great new feature in Office 2010

applications.

 

When you click Schedule A Meeting, an Outlook appointment window appears so that you

can schedule the time and the format for the meeting. You can set up a Live Meeting, schedule

a conference call, or invite your co-author to a face-to-face meeting at a location you

specify. (See Figure 12-7.)

 

 

 

 

 

FIGURE 12-7 Schedule a meeting on a shared document.

 

Meeting scheduling is of course just one aspect of the various ways you can communicate

with co-authors who share your documents. You also can send an instant message to anyone

on your team whose presence shows them to be available; you can launch a video call or

voice-over-Internet call (if your communications client allows this). And, when instant contact

isn’t an option, you can always send an e-mail message.

 

 

 


 

Chapter 13

 

Security in Office 2010

 

In this chapter:

 

• Understanding Security in Office 2010

• Opening Files Safely

• Working with Protected View

• Password Protecting a File

• Limiting File Changes

• Setting Role-Based Permissions

• Recovering Unsaved Versions

• Working with the Trust Center

 

How many attached files do you receive via e-mail every day? How often do you open documents sent to you by people you don’t know? How many times in a week do you forward files along to coworkers for review or correction? Do you regularly trade files with others both inside and outside your organization?

 

This chapter introduces you to the security features in Office 2010. Today it’s more important

than ever to ensure that the files you send and receive are secure, and for this reason, Office

2010 includes new security features that add layers of protection to the files you create and

share. Although much of the file checking that goes on is an invisible part of the process, you

can control many levels of protection in Office 2010 applications, specifying who you want to

have access to your files and what kinds of actions they can perform.

 

Understanding Security in Office 2010

 

Security in Office 2010 is focused on safeguarding your files, and Microsoft has accomplished

that by making Office 2010 more resilient to attack. Office 2010 has a new security workflow

with multiple layers that Office documents go through during the File Open process. This

whole security effort is designed to be invisible to you as a user, so you won’t notice any delays or dialog boxes when you open files you need to use.

 

 

 


 

Opening Files Safely

 

One of the vulnerabilities in previous versions of Office occurred in the file-opening process

when a user went to open a file from a previous version of an Office application. Because hackers often design malicious files that masquerade as previous file types, this left user files unprotected when users worked with legacy formats.

 

The first line of Office 2010 defense includes a new Open File Validation process that checks

to ensure files from previous versions of Office match the required format before the files will

open. This process occurs behind the scenes, but you can open files confidently knowing that

if the files open, they’ve already passed the Office validation check. If the system finds anything

that poses a risk—perhaps an unrecognized file format—the system prompts you with

a Protected View message at the top of the document window. (See Figure 13-1.)

 

 

 

FIGURE 13-1 Protected View lets you know that the file is potentially unsafe.

 

The Open File Validation process now checks every file to validate the legitimacy of the file

type before the file will open in your Word, Excel, or PowerPoint window. This process is

transparent to you—you won’t experience any delays in opening files or have to interact with

any dialog boxes in the process.

 

 

 


 

If you choose, however, you can specify how you want the various applications to handle the

files as they go through the Open File Validation process. You can display the File Blocking

settings by clicking the Click For More Details link in the Message Bar when Protected View

displays a message, or you can follow these steps to open the Trust Center and display the

File Blocking settings in your current application:

 

1. Click the File tab.

 

2. In Backstage view, click Options.

 

3. In the Options dialog box, click the Trust Center category.

 

4. Click the Trust Center Settings button.

 

5. In the Trust Center, click the File Block Settings category. The File Block Settings window

appears, as shown in Figure 13-2.

 

 

 

FIGURE 13-2 You can tailor the way applications open files from different formats.

 

By default, the file types with no check marks are opened in the application; they are not

blocked or displayed in Protected view. The action selected in the Open Behavior For

Selected File Types list that appears below the file types shows you what action is taken if a

file type is selected. Table 13-1 explains a little more about each of these behaviors.

 

 

 

TABLE 13-1 File block behaviors

Do Not Open Selected File Types: The selected files are blocked and will not be opened.

Open Selected File Types In Protected View: The selected file is opened in a safe mode that is protected from other files and processes.

Open Selected File Types In Protected View And Allow Editing: The selected file type is opened in safe mode, but the user is allowed to edit as normal.

 

 

 

 

 

Working with Protected View

 

When you see the Protected View message in your application window, the file you tried

to open has either been blocked or has been determined to be in a file format flagged for

blocking. If you still want to see what’s in the file or find out if it is from a source you trust,

you can open the file in Protected View.

 

Protected View is a safe mode that enables you to display a read-only view of the document.

The file is opened in a protected space called a sandbox, where the file cannot affect your

other files or system data. After you determine that the file is acceptable, you can click Enable

Editing to open the file normally.

 

You also can change the way Protected View is used when questionable files are opened. The

settings for Protected View are available in the Protected View category in the Trust Center.

(See Figure 13-3.)

 

 

 

FIGURE 13-3 You can change the Protected View settings used to protect your computer.

 

 

 

Password Protecting a File

 

Although you’ve been able to add passwords to your Word, Excel, and PowerPoint files for a

while, in Office 2010 the password encryption rules have been changed to account for password

strength as well. You set an encrypted password for your file using Backstage view, as

you see in Figure 13-4.

 

Tip You can also set the password during the Save As process by clicking the Tools button,

choosing General Options, and typing the password required to open the file. If you plan to

share the file with others, you can type a separate password that you share with co-authors to

enable modification and file sharing.

 

 

 

FIGURE 13-4 You can set an encrypted password for your document.

 

Note that when you set an encrypted password for your Word, Excel, or PowerPoint file, the

password cannot be recovered if you forget it later. For this reason, you should

keep a copy of your passwords in a safe place you can access easily.

 

 

 

Limiting File Changes

 

In addition to ensuring the files you open and work with are trustworthy, you also need to be

able to set parameters for the types of modifications that can be made to the file by others

who work with the file as well. Each of the Office 2010 applications includes protection features

that enable you to set safeguards at the file level. You’ll find all the protection features

in Backstage view, in the Info category.

 

Excel 2010 enables you to set several levels of protection for the current worksheet. In the

Info category of Backstage view, you can choose one of the following options to limit the

changes that can be made to the file:

 

• Protect Current Sheet displays the Protect Sheet dialog box, which enables you to create a password other users must enter in order to modify the worksheet. Additionally, you can choose the actions you want to allow users to complete, such as format cells, insert columns, and delete rows. (See Figure 13-5.)

• Protect Workbook Structure lets you safeguard the structure of the workbook—or the Excel window—by prohibiting others from adding new worksheets, for example.

 

 

 

FIGURE 13-5 Excel 2010 enables you to limit the changes you want other users to make in the file.

 

Word 2010 offers several levels of restrictions that enable you to set the level of changes that

can be made in a specific document. (See Figure 13-6.) These features, which were also part

of Word 2007, allow you to limit others’ changes in the following ways:

 

• Formatting restrictions limit users to changing styles used in the document.

• Editing restrictions enable you to specify whether you want users to view the file as read-only, or whether you will allow them to enter tracked changes, enter comments, or complete forms.

 

In addition to these two main restrictions, Word 2010 enables you to choose the parts of the

document and indicate which users have the necessary permissions to edit those parts.

 

 

 

 

 

FIGURE 13-6 Word 2010 file protection enables you to block or restrict editing by type of task or by person.

 

PowerPoint enables you to restrict others’ ability to copy, edit, or print content in

your PowerPoint presentation. You can also mark a file as final so that others can view—but

not modify—the file.

 

Tip Word, Excel, and PowerPoint all give you the option of adding a digital signature to your

document to verify a document’s integrity. The digital signature feature in each of these applications

requires a signature service from a third-party vendor. You can begin the process by clicking

the Add A Digital Signature option in the Protect selection of the Info tab, and a prompt will

offer you the option of finding a signature service online.

 

Setting Role-Based Permissions

 

Another way to limit the access others have to your documents is to restrict permissions

by role. The Restrict Permission By People option in the Protect settings in the Info tab (in

Backstage view) of Word, Excel, and PowerPoint enables you to choose the group of people you want to give access to your document. For example, in Figure 13-7, Unrestricted

Access is still applied to the current file, but you can choose another setting to connect to

your organization’s rights management system, which defines the roles and permissions in

your system. By default, choosing one of these options displays the introductory page of

Information Rights Management, which is a free service from Microsoft that enables you to

authenticate the credentials of others who work with your files.

 

 

 

 

 

FIGURE 13-7 You can set multiple levels of protection in your files, including restricting permissions to a file.

 

Recovering Unsaved Versions

 

The process of retrieving unsaved copies of recent files might not be a security issue in terms

of defending against data loss or attack, but having the ability to easily retrieve these copies

can help you avoid a major headache if you forgot to save a file with business-critical data.

Now Office 2010 applications enable you to recover unsaved versions of files you’ve worked

on and retrieve the information you need.

 

You can find the recovered versions in Backstage view, in the Info category. Click Manage

Versions to display the list of options. (See Figure 13-8.) Click Recover Draft Versions to display the Open dialog box, which lists any available previous versions of the file.

 

Tip You can also display unsaved documents from the Recent category. Scroll to the bottom of

the Recent Documents list, and click Recover Unsaved Documents. Double-click the file you want

to view in the Open dialog box.

 

 

 

 

 

FIGURE 13-8 You can recover unsaved drafts and work with versions in Backstage view.

 

Working with the Trust Center

 

The Office Trust Center was introduced in Office 2007 and has been expanded and improved

in Office 2010. The Trust Center enables you to choose your specifications for the way in

which files are opened, shared, and protected, and it enables you to create lists of trusted

publishers, documents, and locations that don’t have to be authenticated each time you receive a document from them.

 

Display the Trust Center by choosing File and selecting Options in Backstage view. In the

Options window, click Trust Center at the bottom of the category list, and click the Trust

Center Settings button. (See Figure 13-9.) Table 13-2 lists each of the categories in the Trust

Center and explains how you can use those options to safeguard your files.

 

 

 

 

 

FIGURE 13-9 Displaying Trust Center settings.

 

TABLE 13-2 Office 2010 Trust Center

Trusted Publishers: Enables you to create a list of publishers you trust so that any content you receive from the publisher is opened freely without restriction.

Trusted Locations: Gives you the ability to create a list of trustworthy locations—for example, shared folders and SharePoint workspaces.

Trusted Documents: Creates a list of documents you have specified as trusted. After a document is marked as trusted, macros and all content are enabled automatically.

Add-ins: Enables you to specify whether any application add-ins must be signed by a trusted publisher.

ActiveX Settings: Lets you choose whether ActiveX controls will be allowed to play in regular mode or in safe mode. You also set the level of restriction for the running of the controls.

Macro Settings: Sets whether macros are automatically disabled or enabled.

Protected View: Enables you to choose the situations in which Protected View is used.

Message Bar: Lets you show or hide the Message Bar.

File Block Settings: Gives you the ability to choose whether specific file types are blocked from being opened or saved.

Privacy Options: Lets you set privacy options for the current file, run the Document Inspector, and set translation and research options.

 

 

 

 

 

By default, Office 2010 is set to provide a safe, reliable experience for you as you open

and share the files you create. For most settings—including ActiveX Settings and Macro

Settings—leaving the options set to the optimal level will ensure better protection for your

files and system.

 

 

 


 

Chapter 14

 

Training Made Easy

 

In this chapter:

 

• Getting Help in Office 2010

• Finding What You Need on Office Online

• Take Your Learning to the Next Level with Microsoft eLearning

• Continue Learning with Microsoft Press Books

 

Now that you know about all the new and improved features you’ll find throughout

Office 2010, your next step is to begin working with the program, exploring your favorite applications, creating new documents and presentations, and sharing what you create. As your experience with the program grows, you might want to learn about the specific applications in greater detail.

 

Whether you learn best from books, articles, or online courses, you’ll find that many resources are available to help you learn more about Office 2010. This chapter gives you a

quick introduction to a few Help changes in Office 2010 and introduces several of the key

resources available that will help you expand your Office 2010 experience.

 

Getting Help in Office 2010

 

As you get to know the various Office 2010 applications, you’ll notice right away that the

familiar tooltips are still there to help you learn about the tools and window elements.

(See Figure 14-1.) The Office 2010 Help button is also in the same place it occupied in earlier

versions of the program—on the far right side of the screen, just above the Ribbon.

 

The Help tab in Backstage view is a new feature that brings together all your program information into one convenient window. In the Help window, you’ll find information that will help you get product information, access program help, and find what you need to get technical support. (See Figure 14-2.)

 

 

 


 

 

 

FIGURE 14-1 Tooltips provide you with information about specific elements in Office 2010 application

windows.

 

 

 

FIGURE 14-2 The Help tab in Backstage view enables you to get support, access help, and find program

information.

 

 

 


 

You can use the information in the Help tab of Backstage view to do the following things:

 

• Activate your software.

• Take a tour of basic program features.

• Check for program updates.

• Search for information in Microsoft Office help.

• Change program options.

• Contact Microsoft Support.

 

The product information in the right side of the Help window shows you all the programs

you have activated, and it also lists any software you are currently using for a specific trial period. This information will be helpful if you need to call technical support at some point

and will help the technician diagnose any problems you are having with the software.

 

Tip If you miss the good, old-fashioned About window that was available in previous versions of

Office, you can display it by clicking Additional Version And Copyright Information on the right

side of the Help tab in Backstage view.

 

Whether you click Microsoft Office Help in the Help tab to display the Help system or you

click the Help button in the right side of the application window, the results are the same: the

Help window appears. (See Figure 14-3.) Using the Help system is a simple process. It’s similar

in appearance to previous versions, but the Office 2010 Help system pulls information from

http://www.office.com, so you always have access to the most recent help information.

 

You can enter a word or phrase in the search box or click a category that reflects what you

want. The Help window lists specific articles that relate to your selection. Click the item you

want to view and, after reading it, answer the question “Was this information helpful?” by

clicking Yes, No, or I Don’t Know. A comment box appears so that you can add a note about

your experience.

 

Tip Microsoft does use the information you enter to make changes in the Help system, so if you

have a suggestion about how the help system can be improved, share it.

 

 

 

[Figure 14-3 callouts: enter a word or phrase to search Help; click to open Help; click a Help category]

 

FIGURE 14-3 The Office 2010 Help system connects automatically to Office.com so that you have access to

the most recent help articles available.

 

Finding What You Need on Office Online

 

Another great resource—whether you want to complete online tutorials or download templates,

clip art, or presentations for training—is Office Online. (See Figure 14-4.) In April 2009

alone, more than 131 million unique users visited Office Online (www.office.microsoft.com),

downloading more than 500,000 training presentations and logging more than 4 million visits to the training center.

 

 

 

 

 

FIGURE 14-4 Office Online offers a variety of downloads, tutorials, articles, and links to additional resources

and partner sites.

 

On Office Online, you can do the following:

 

• Find information about each of the Office 2010 applications.

• Visit resource centers for small business or home users.

• Take a tour of key Office 2010 features.

• Check for available updates for your Office 2010 programs.

• Learn about the different Office 2010 suites and servers.

• Find Webcasts, podcasts, demonstrations, and more information related to the Office 2010 applications.

• Get links to Microsoft Office product support.

• Download templates, service packs, clip art, and more.

 

 

 

Take Your Learning to the Next Level with

Microsoft eLearning

 

When you’re ready to take your learning to the next level by exploring applications in depth,

you can take any one of a number of online learning courses at Microsoft eLearning

(www.microsoft.com/learning). These online learning courses give you a convenient way to

work at your own pace, prepare for an exam, and earn certifications.

 

Some of the learning courses are clinics that offer interactive games and self-tests, demonstrations,

and hands-on virtual labs. You can create your own learning plan with Microsoft

eLearning and save the courses in a queue to complete as you have time. Courses can be

completed online or downloaded (with a free downloadable viewer) so that you can complete

them offline on your own computer. (See Figure 14-5.) At the time of this writing, new

courses are planned for Microsoft eLearning that provide an overview of Office 2010 and

walk you through key features in a number of the core applications.

 

 

 

FIGURE 14-5 Microsoft eLearning offers more than 1000 online learning courses that can help you learn

more about applications and prepare for projects and certification.

 

 

 

Continue Learning with Microsoft Press Books

 

In addition to the online and in-program offerings you can use to learn more about Office

2010, you’ll find a number of great books that can help you master the various applications.

Whether you’re a new, experienced, or expert user, chances are that Microsoft Press publishes

a book that will speak to you. Here’s a list of the upcoming Office 2010 books that will

be available after the launch of Office 2010:

 

Office 2010

 

Microsoft Office 2010 Plain & Simple

 

Microsoft Office 2010 Step by Step

 

Microsoft Office 2010 Inside Out

 

Microsoft Office 2010 Step by Step Home & Student

 

Access 2010

 

Microsoft Office Access 2010 Step by Step

 

Microsoft Office Access 2010 Inside Out

 

Excel 2010

 

Microsoft Office Excel 2010: Data Analysis and Business Modeling

 

Microsoft Office Excel 2010 Inside Out

 

Microsoft Office Excel 2010 Step by Step

 

Microsoft Office Excel 2010 Plain & Simple

 

PowerPoint 2010

 

Microsoft Office PowerPoint 2010 Step by Step

 

Microsoft Office PowerPoint 2010 Plain & Simple

 

Beyond Bullet Points: Using Microsoft Office PowerPoint 2010 to Create Presentations That Inform, Motivate, and Inspire

 

Project 2010

 

Microsoft Office Project 2010 Inside Out

 

Microsoft Office Project 2010 Step by Step

 

Outlook 2010

 

Microsoft Office Outlook 2010 and OCS Inside Out

 

Microsoft Office Outlook 2010 and LiveMeeting Step by Step

 

Microsoft Office Outlook 2010 Plain & Simple

 

 

 

SharePoint 2010

 

Microsoft Office SharePoint 2010 Inside Out

 

Microsoft Office SharePoint Designer 2010

 

Microsoft Office SharePoint 2010 Step by Step

 

Microsoft Office SharePoint Designer 2010 Step by Step

 

Word 2010

 

Microsoft Office Word 2010 Step by Step

 

Microsoft Office Word 2010 Plain & Simple

 

Microsoft Office Word 2010 Inside Out

 

Visio 2010

 

Microsoft Office Visio 2010 Step by Step

 

 

 

Katherine Murray

 

Katherine Murray has been writing about technology since the mid-1980s, which means she’s

seen a lot of menu bars, nested dialog boxes, and new user interfaces over the years. A big

fan of the changes in Office 2007, Katherine was excited to learn about the new directions

in Office 2010 that enable her to write in coffee shops, access her chapters via the Web, and

trade notes with editors in real time. Katherine has written more than 50 books about technology

since 1988 and specializes in Microsoft Office technologies, but she’s fascinated by

any technology that enhances the way we communicate and work together.

 

 

 

The Abandoned Shopping Cart: Crafting a Better Customer-Monitoring Experience


Brick and mortar retailers can’t offer customers the shopping convenience of an e-commerce website, but they can offer a smoother shopping experience, especially when store management can physically see problems that may lead customers to leave their half-full shopping carts at the door. Having visibility into what customers are experiencing allows retail store managers to move quickly and decisively to rectify problems. They can see not only what their customers are doing, but why they’re doing it.

While many online retailers use Web analytics alone to gain visibility into customer behavior on their Websites, that visibility is limited. They can see what is happening on the site—how many customers are abandoning their shopping carts and exiting the Website, and what they were doing when they left—but they can’t see why this is happening. Reduced sales mean business goals are not being met, and, since e-commerce, or “e-tail,” has become a major source of sales for retail companies—for many, it’s the only source—it’s critical for these organizations to pay very close attention to customer experience.

Having visibility into all aspects of the customer’s Website experience greatly enhances the ability to speed problem resolution and deliver the excellent shopping experience that is critical to sales. Combining Web analytics with a user experience monitoring solution can provide this essential visibility, which allows IT to see everything the customer does on the site, as well as the way the system responds to every mouse click.

An excellent example of the benefits of user experience monitoring can be found in the online travel industry. When your company’s Website accounts for 40 percent or more of your total business sales, identifying and resolving Website performance issues fast enough to keep frustrated customers from exiting the site, and having the visibility needed to understand why customers had departed, are crucial.

For example, a leading global online travel agency decided to employ a user experience monitoring solution to improve its website performance, and, with the resulting visibility into every aspect of the customer experience plus automatic alerts to problems in real time, the company was able to resolve issues 97 percent faster. Because users now have a consistently positive Website experience, bookings have increased by 30 percent, which translates into increased revenue for the company.

Following are four tips for employing user experience monitoring to improve e-commerce Website performance and increase revenue in support of business goals:

• Understand the Customer’s Entire Experience on the Website, Not Just Their Actions
To keep customers shopping, your IT team needs to know not only what the customer is doing on your site, but why they are doing it. There could be hundreds of reasons why customers abandon their online shopping carts—maybe they couldn’t get a good view of the product they want to buy, or maybe their credit card verification took too long, or they got an indecipherable error message. A solution that records and replays every step of every customer’s visit to your site, including what they put in their shopping cart—and works in tandem with your web analytics solution—will give IT the information needed to fix the problem fast.

• Look at the Entire Enterprise
Ensuring application performance and availability has never been more important, especially when those applications serve to generate revenue. Instead of monitoring the disparate components of the IT environment separately, take an application-centric approach that uses customer experience monitoring to look at performance from the application and end-user perspective. Choose a monitoring solution that provides a clear view of the entire environment, with dashboard views that communicate the health of mission-critical applications so IT can quickly see and resolve issues that matter to the business before they impact customer experience.

• It’s Not About Good Performance, It’s About Peak Performance
Evidence shows that the speed of a Website is directly related to the conversion rate. With the understanding that peak performance drives revenue, select a user experience monitoring solution that alerts IT to emerging issues in real time, so problems can be addressed quickly, before they affect customers. When the Website operates at peak performance, you can expect increases in the number of hits, the number of bookings, and the average transaction value.

• Go Beyond—Think About Recapturing Lost Business
Take user experience monitoring a step further by recapturing the revenue lost when customers abandon their shopping carts and leave your site. Hundreds of thousands of dollars in lost business have been recovered by online travel agencies using a solution that automatically sends the shopping details of customers who dropped off the Website to retention teams, who were then able to follow up with emails to those potential customers and facilitate new bookings.

Proactive use of experience monitoring enables IT to gain a clear understanding of the customer’s experience on the website, so quick and decisive action can be taken before the shopping cart is abandoned. If a customer does leave the site with a half-full cart, details of the customer experience can help recover lost business so that revenue and business goals continue to be met.

Getting Started with SQL Azure Development

Microsoft Windows Azure offers several choices for data storage. These include Windows Azure storage and SQL Azure. You may choose to use one or both in your particular project. Windows Azure storage currently contains three types of storage structures: tables, queues and blobs.

SQL Azure is a relational data storage service in the cloud. Some of the benefits of this offering are the ability to use a familiar relational development model that includes much of the standard SQL Server language (T-SQL), tools and utilities. Of course, working with well-understood relational structures in the cloud, such as tables, views and stored procedures, also results in increased developer productivity when working in this new platform. Other benefits include a reduced need for physical database-administration tasks to perform server setup, maintenance and security, as well as built-in support for reliability, high availability and scalability.

I won’t cover Windows Azure storage or make a comparison between the two storage modes here. You can read more about these storage options in Julie Lerman’s July 2010 Data Points column (msdn.microsoft.com/magazine/ff796231). It’s important to note that Windows Azure tables are not relational tables. The focus of this article is on understanding the capabilities included in SQL Azure.

This article will explain the differences between SQL Server and SQL Azure. You need to understand the differences in detail so that you can appropriately leverage your current knowledge of SQL Server as you work on projects that use SQL Azure as a data source.

If you’re new to cloud computing you’ll want to do some background reading on Windows Azure before continuing with this article. A good place to start is the MSDN Developer Cloud Center at msdn.microsoft.com/ff380142.

Getting Started with SQL Azure

To start working with SQL Azure, you’ll first need to set up an account. If you’re an MSDN subscriber, then you can use up to three SQL Azure databases (maximum size 1GB each) for up to 16 months (details at msdn.microsoft.com/subscriptions/ee461076) as a developer sandbox. To sign up for a regular SQL Azure account (storage and data transfer fees apply) go to microsoft.com/windowsazure/offers/.

After you’ve signed up for your SQL Azure account, the simplest way to initially access it is via the Web portal at sql.azure.com. You must sign in with the Windows Live ID that you’ve associated to your Windows Azure account. After you sign in, you can create your server installation and get started developing your application.

An example of the SQL Azure Web management portal is shown in Figure 1. Here you can see a server and its associated databases. You’ll notice that there’s also a tab on the Web portal for managing the Firewall Settings for your particular SQL Azure installation.


Figure 1 Summary Information for a SQL Azure Database

As you initially create your SQL Azure server installation, it will be assigned a random string for the server name. You’ll generally also set the administrator username, password, geographic server location and firewall rules at the time of server creation. You can select the location for your SQL Azure installation at the time of server creation. You will be presented with a list of locations (datacenters) from which to choose. If your application front end is built in Windows Azure, you have the option to locate both that installation and your SQL Azure installation in the same geographic location by associating the two installations.

By default there’s no access to your server, so you’ll have to create firewall rules for all client IPs. SQL Azure uses port 1433, so make sure that port is open for your client application as well. When connecting to SQL Azure you’ll use the username@servername format for your username. SQL Azure supports SQL Server Authentication only; Windows Authentication is not supported. Multiple Active Result Set (MARS) connections are supported.

Open connections will time out after 30 minutes of inactivity. Also, connections can be dropped for long-running queries and transactions or excessive resource usage. Development best practices in your applications around connections are to open, use and then close those connections manually, to include retry connection logic for dropped connections and to avoid caching connections because of these behaviors. For more details about supported client protocols for SQL Azure, see Steve Hale’s blog post at blogs.msdn.com/b/sqlnativeclient/archive/2010/02/12/using-sql-server-client-apis-with-sql-azure-vversion-1-0.aspx.

Another best practice is to encrypt your connection string to prevent man-in-the-middle attacks.

You’ll be connected to the master database by default if you don’t specify a database name in the connection string. In SQL Azure the T-SQL statement USE is not supported for changing databases, so you’ll generally specify the database you want to connect to in the connection string (assuming you want to connect to a database other than master). Here’s an example of an ADO.NET connection:

Server=tcp:server.ctp.database.windows.net;
Database=<databasename>;
User ID=user@server;
Password=password;
Trusted_Connection=False;
Encrypt=true;

Setting up Databases

After you’ve successfully connected to your installation you’ll want to create one or more databases. Although you can create databases using the SQL Azure portal, you may prefer to do so using some of the other tools, such as SQL Server Management Studio 2008 R2. By default, you can create up to 149 databases for each SQL Azure server installation. If you need more databases than that, you must call the Windows Azure business desk to have this limit increased.

When creating a database you must select the maximum size. The current options for sizing (and billing) are Web or Business Edition. Web Edition, the default, supports databases of 1GB or 5GB total. Business Edition supports databases of up to 50GB, sized in increments of 10GB—in other words, 10GB, 20GB, 30GB, 40GB and 50GB.

You set the size limit for your database when you create it by using the MAXSIZE keyword. You can change the size limit or the edition (Web or Business) after the initial creation using the ALTER DATABASE statement. If you reach your size or capacity limit for the edition you’ve selected, then you’ll see the error code 40544. The database size measurement doesn’t include the master database, or any database logs. For more details about sizing and pricing, see microsoft.com/windowsazure/pricing/#sql.
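As a minimal sketch (the database name here is hypothetical, and the statements assume a connection to the master database), the sizing options might be used like this:

-- Create a 5 GB Web Edition database.
CREATE DATABASE SalesData (EDITION = 'web', MAXSIZE = 5 GB);

-- Later, raise the cap and move to Business Edition.
ALTER DATABASE SalesData MODIFY (EDITION = 'business', MAXSIZE = 20 GB);

If the database grows past the configured MAXSIZE, subsequent inserts fail with the 40544 error mentioned above.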

It’s important to realize that when you create a new SQL Azure database, you’re actually creating three replicas of that database. This is done to ensure high availability. These replicas are completely transparent to you. The new database appears as a single unit for your purposes.

Once you’ve created a database, you can quickly get the connection string information for it by selecting the database in the list on the portal and then clicking the Connection Strings button. You can also quickly test connectivity via the portal by clicking the Test Connectivity button for the selected database. For this test to succeed you must enable the Allow Microsoft Services to Connect to this Server option on the Firewall Rules tab of the SQL Azure portal.

Creating Your Application

After you’ve set up your account, created your server, created at least one database and set a firewall rule so that you can connect to the database, you can start developing your application using this data source.

Unlike Windows Azure data storage options such as tables, queues or blobs, when you’re using SQL Azure as a data source for your project, there’s nothing to install in your development environment. If you’re using Visual Studio 2010, you can just get started—no additional SDKs, tools or anything else are needed.

Although many developers will choose to use a Windows Azure front end with a SQL Azure back end, this configuration is not required. You can use any front-end client with a supported connection library such as ADO.NET or ODBC. This could include, for example, an application written in Java or PHP. Connecting to SQL Azure via OLE DB is currently not supported.

If you’re using Visual Studio 2010 to develop your application, you can take advantage of the included ability to view or create many types of objects in your selected SQL Azure database installation directly from the Visual Studio Server Explorer. These objects are Tables, Views, Stored Procedures, Functions and Synonyms. You can also see the data associated with these objects using this viewer. For many developers, using Visual Studio 2010 as the primary tool to view and manage SQL Azure data will be sufficient. The Server Explorer View window is shown in Figure 2. Both a local installation of a database and a cloud-based instance are shown. You’ll see that the tree nodes differ slightly in the two views. For example, there’s no Assemblies node in the cloud installation because custom assemblies are not supported in SQL Azure.


Figure 2 Viewing Data Connections in Visual Studio Server Explorer

As I mentioned earlier, another tool you may want to use to work with SQL Azure is SQL Server Management Studio (SSMS) 2008 R2. With SSMS 2008 R2, you actually have access to a fuller set of operations for SQL Azure databases than in Visual Studio 2010. I find that I use both tools, depending on which operation I’m trying to complete. An example of an operation available in SSMS 2008 R2 (and not in Visual Studio 2010) is creating a new database using a T-SQL script. Another example is the ability to easily perform index operations (create, maintain, delete and so on). An example is shown in Figure 3.


Figure 3 Using SQL Server Management Studio 2008 R2 to Manage SQL Azure

Newly released in SQL Server 2008 R2 is a data-tier application, or DAC. DAC pacs are objects that combine SQL Server or SQL Azure database schemas and objects into a single entity. You can use either Visual Studio 2010 (to build) or SQL Server 2008 R2 SSMS (to extract) to create a DAC from an existing database.

If you wish to use Visual Studio 2010 to work with a DAC, then you’d start by selecting the SQL Server Data-Tier Application project type in Visual Studio 2010. Then, on the Solution Explorer, right-click your project name and click Import Data-Tier Application. A wizard opens to guide you through the import process. If you’re using SSMS, start by right-clicking on the database you want to use in the Object Explorer, click Tasks, then click Extract Data-Tier Application to create the DAC.

The generated DAC is a compressed file that contains multiple T-SQL and XML files. You can work with the contents by right-clicking the .dacpac file and then clicking Unpack. SQL Azure supports deleting, deploying, extracting and registering DAC pacs, but does not support upgrading them.

Another tool you can use to connect to SQL Azure is the latest community technology preview (CTP) release of the tool code-named “Houston.” Houston is a zero-install, Silverlight-based management tool for SQL Azure installations. When you connect to a SQL Azure installation using Houston, you specify the datacenter location (as of this writing North Central U.S., South Central U.S., North Europe, Central Europe, Asia Pacific or Southeast Asia).

Houston is in early beta and the current release (shown in Figure 4) looks somewhat like SSMS. Houston supports working with Tables, Views, Queries and Stored Procedures in a SQL Azure database installation. You can access Houston from the SQL Azure Labs site at sqlazurelabs.com/houston.aspx.


Figure 4 Using Houston to Manage SQL Azure

Another tool you can use to connect to a SQL Azure database is SQLCMD (msdn.microsoft.com/library/ee336280). Even though SQLCMD is supported, the OSQL command-line tool is not supported by SQL Azure.

Using SQL Azure

So now you’ve connected to your SQL Azure installation and have created a new, empty database. What exactly can you do with SQL Azure? Specifically, you may be wondering what the limits are on creating objects. And after those objects have been created, how do you populate those objects with data?

As I mentioned at the beginning of this article, SQL Azure provides relational cloud data storage, but it does have some subtle feature differences from an on-premises SQL Server installation. Starting with object creation, let’s look at some of the key differences between the two.

You can create the most commonly used objects in your SQL Azure database using familiar methods. The most commonly used relational objects (which include tables, views, stored procedures, indices and functions) are all available. There are some differences around object creation, though. Here’s a summary of those differences:

  • SQL Azure tables must contain a clustered index. Non-clustered indices can be subsequently created on selected tables. You can create spatial indices, but you cannot create XML indices.
  • Heap tables are not supported.
  • CLR geo-spatial types (such as Geography and Geometry) are supported, as is the HierarchyID data type. Other CLR types are not supported.
  • View creation must be the first statement in a batch. Also, view (or stored procedure) creation with encryption is not supported.
  • Functions can be scalar, inline or multi-statement table-valued functions, but cannot be any type of CLR function.

    There’s a complete reference of partially supported T-SQL statements for SQL Azure on MSDN at msdn.microsoft.com/library/ee336267.
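
    To make the clustered-index requirement concrete, here’s a minimal sketch of a table definition that SQL Azure will accept (the table and column names are purely illustrative):

    -- Every SQL Azure table needs a clustered index; a clustered primary key satisfies this.
    CREATE TABLE dbo.Customers
    (
      CustomerID int NOT NULL,
      CustomerName nvarchar(100) NOT NULL,
      CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED (CustomerID)
    );

    -- Non-clustered indexes can then be added as usual.
    CREATE NONCLUSTERED INDEX IX_Customers_Name ON dbo.Customers (CustomerName);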

    Before you get started creating your objects, remember that you’ll connect to the master database if you don’t specify a different one in your connection string. In SQL Azure, the USE (database) statement is not supported for changing databases, so if you need to connect to a database other than the master database, then you must explicitly specify that database in your connection string, as shown earlier.

    Data Migration and Loading

    If you plan to create SQL Azure objects using an existing, on-premises database as your source data and structures, then you can simply use SSMS to script an appropriate DDL to create those objects on SQL Azure. Use the Generate Scripts Wizard and set the “Script for the database engine type” option to “for SQL Azure.”

    An even easier way to generate a script is to use the SQL Azure Migration Wizard, available as a download from CodePlex at sqlazuremw.codeplex.com. With this handy tool you can generate a script to create the objects and can also load the data via bulk copy using bcp.exe.

    You could also design a SQL Server Integration Services (SSIS) package to extract and run a DML or DDL script. If you’re using SSIS, you’d most commonly design a package that extracts the DDL from the source database, scripts that DDL for SQL Azure and then executes that script on one or more SQL Azure installations. You might also choose to load the associated data as part of the package’s execution path. For more information about working with SSIS, see msdn.microsoft.com/library/ms141026.

    Also of note regarding DDL creation and data migration is the CTP release of SQL Azure Data Sync Services (sqlazurelabs.com). You can see this service in action in a Channel 9 video, “Using SQL Azure Data Sync Service to provide Geo-Replication of SQL Azure Databases,” at tinyurl.com/2we4d6q. Currently, SQL Azure Data Sync services works via Synchronization Groups (HUB and MEMBER servers) and then via scheduled synchronization at the level of individual tables in the databases selected for synchronization.

    You can use the Microsoft Sync Framework Power Pack for SQL Azure to synchronize data between a data source and a SQL Azure installation. As of this writing, this tool is in CTP release and is available from tinyurl.com/2ecjwku. If you use this framework to perform subsequent or ongoing data synchronization for your application, you may also wish to download the associated SDK.

    What if your source database is larger than the maximum size for your SQL Azure database installation? That is, it could exceed the absolute maximum of 50 GB for the Business Edition, or a smaller limit depending on the edition and size options you’ve purchased.

    Currently, customers must partition (or shard) their data manually if their database size exceeds the program limits. Microsoft has announced that it will be providing an auto-partitioning utility for SQL Azure in the future. In the meantime, it’s important to note that T-SQL table partitioning is not supported in SQL Azure. There’s a free utility called Enzo SQL Shard (enzosqlshard.codeplex.com) that you can use for partitioning your data source.

    You’ll want to take note of some other differences between SQL Server and SQL Azure regarding data loading and data access.

    Added recently is the ability to copy a SQL Azure database via the Database copy command. The syntax for a cross-server copy is as follows:

    CREATE DATABASE DB2A AS COPY OF Server1.DB1A
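
    The copy runs asynchronously, so the statement returns before the new database is ready. One way to check on its progress is to query the sys.dm_database_copies management view from the master database on the destination server — a quick sketch (the column list is assumed from the SQL Azure documentation):

    -- Run against the master database on the destination server.
    -- percent_complete reports how far the asynchronous copy has progressed.
    SELECT database_id, start_date, percent_complete, error_desc
    FROM sys.dm_database_copies;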

    The T-SQL INSERT statement is supported (with the exceptions of updating through views and providing a locking hint inside an INSERT statement).

    Also related to data migration: T-SQL DROP DATABASE and other DDL commands have additional limits when executed against a SQL Azure installation. In addition, the T-SQL RESTORE and ATTACH DATABASE commands are not supported. Finally, the T-SQL statement EXECUTE AS (login) is not supported.

    Data Access and Programmability

    Now let’s take a look at common programming concerns when working with cloud data. First, you’ll want to consider where to set up your development environment. If you’re an MSDN subscriber and can work with a database that’s less than 1GB, then it may well make sense to 
develop using only a cloud installation (sandbox). In this way there will be no issue with migration from local to cloud. Using a regular (non-MSDN subscriber) SQL Azure account, you could develop directly against your cloud instance (most probably using a cloud-located copy of your production database). Of course, developing directly from the cloud is not practical for all situations.

    If you choose to work with an on-premises SQL Server database as your development data source, then you must develop a mechanism for synchronizing your local installation with the cloud installation. You could do that using any of the methods discussed earlier, and tools like Data Sync Services and Sync Framework are being developed with this scenario in mind.

    As long as you use only the supported features, the method for having your application switch from an on-premises SQL Server installation to a SQL Azure database is simple—you need only to change the connection string in your application.

    Regardless of whether you set up your development installation locally or in the cloud, you’ll need to understand some programmability differences between SQL Server and SQL Azure. I’ve already covered the T-SQL and connection string differences. In addition, all tables must have a clustered index at minimum (heap tables are not supported).

    As previously mentioned, the USE statement for changing databases isn’t supported. This also means that there’s no support for distributed (cross-database) transactions or queries, and linked servers are not supported.

    Other options not available when working with a SQL Azure database include:

  • Full-text indexing
  • CLR custom types (however, the built-in Geometry and Geography CLR types are supported)
  • RowGUIDs (use the uniqueidentifier type with the NEWID function instead)
  • XML column indices
  • Filestream datatype
  • Sparse columns

    Default collation is always used for the database. To make collation adjustments, set the column-level collation to the desired value using the T-SQL COLLATE statement.
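
    To illustrate two of the workarounds just mentioned — substituting a uniqueidentifier column with a NEWID default for a ROWGUID column, and overriding collation at the column level — here’s a short sketch (object names are hypothetical):

    CREATE TABLE dbo.Documents
    (
      -- Instead of a ROWGUIDCOL, use uniqueidentifier with NEWID() as the default.
      DocumentID uniqueidentifier NOT NULL DEFAULT NEWID(),
      -- Override the database default collation for this column only.
      Title nvarchar(200) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
      CONSTRAINT PK_Documents PRIMARY KEY CLUSTERED (DocumentID)
    );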

    And finally, you cannot currently use SQL Profiler or the Database Tuning Wizard on your SQL Azure database.

    Some important tools that you can use with SQL Azure for tuning and monitoring include:

  • SSMS Query Optimizer to view estimated or actual query execution plan details and client statistics
  • Select Dynamic Management views to monitor health and status
  • Entity Framework to connect to SQL Azure after the initial model and mapping files have been created by connecting to a local copy of your SQL Azure database.

    Depending on what type of application you’re developing, you may be using SSAS, SSRS, SSIS or PowerPivot. You can also use any of these products as consumers of SQL Azure database data. Simply connect to your SQL Azure server and selected database using the methods already described in this article.

    It’s also important to fully understand the behavior of transactions in SQL Azure. As mentioned, only local (within the same database) transactions are supported. In addition, the only transaction-isolation level available for a database hosted on SQL Azure is READ COMMITTED SNAPSHOT. Using this isolation level, readers get the latest consistent version of data that was available when the statement started.

    SQL Azure doesn’t detect update conflicts. This is also called an optimistic concurrency model, because lost updates, non-repeatable reads and phantoms can occur. Of course, dirty reads cannot occur.

    Database Administration

    Generally, when using SQL Azure, the administrator role becomes one of logical installation management. Physical management is handled by the platform. From a practical standpoint this means there are no physical servers to buy, install, patch, maintain or secure. There’s no ability to physically place files, logs, tempdb and so on in specific physical locations. Because of this, there’s no support for the T-SQL commands USE <database>, FILEGROUP, BACKUP, RESTORE or SNAPSHOT.

    There’s no support for the SQL Agent on SQL Azure. Also, there is no ability (or need) to configure replication, log shipping, database mirroring or clustering. If you need to maintain a local, synchronized copy of SQL Azure schemas and data, then you can use any of the tools discussed earlier for data migration and synchronization—they work both ways. You can also use the DATABASE COPY command.

    Other than keeping data synchronized, what are some other tasks that administrators may need to perform on a SQL Azure installation? 

    Most commonly, there will still be a need to perform logical administration. This includes tasks related to security and performance management. Additionally, you may be involved in monitoring for capacity usage and associated costs. To help you with these tasks, SQL Azure provides a public Status History dashboard that shows current service status and recent history (an example of history is shown in Figure 5) at microsoft.com/windowsazure/support/status/servicedashboard.aspx.


    Figure 5 SQL Azure Status History

    SQL Azure sets a high security bar by default. It forces SSL encryption on all permitted (via firewall rules) client connections. Server-level logins and database-level users and roles are also secured. There are no server-level roles in SQL Azure. Encrypting the connection string is a best practice. Also, you may want to use Windows Azure certificates for additional security. For more details, see blogs.msdn.com/b/sqlazure/archive/2010/09/07/10058942.aspx.

    In the area of performance, SQL Azure includes features such as automatically killing long-running transactions and idle connections (more than 30 minutes). Although you can’t use SQL Profiler or trace flags for performance tuning, you can use SQL Query Optimizer to view query execution plans and client statistics. You can also perform statistics management and index tuning using the standard T-SQL methods.
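
    For example, routine statistics and index maintenance uses the same T-SQL you’d run on-premises — a brief sketch with hypothetical object names:

    -- Refresh the query optimizer's statistics for a table.
    UPDATE STATISTICS dbo.Customers;

    -- Rebuild an existing non-clustered index as part of maintenance.
    ALTER INDEX IX_Customers_Name ON dbo.Customers REBUILD;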

    There’s a select list of dynamic management views (covering database, execution or transaction information) available for database administration as well. These include sys.dm_exec_connections, _requests, _sessions, _tran_database_transactions, _active_transactions and _partition_stats. For a complete list of supported dynamic management views for SQL Azure, see msdn.microsoft.com/library/ee336238.aspx#dmv.
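
    For example, a quick way to see who’s currently connected and what each session is doing uses these views together (a minimal sketch):

    -- One row per current session, with its connection and the active request (if any).
    SELECT s.session_id, s.login_name, c.client_net_address, r.status, r.command
    FROM sys.dm_exec_sessions AS s
    JOIN sys.dm_exec_connections AS c ON s.session_id = c.session_id
    LEFT JOIN sys.dm_exec_requests AS r ON s.session_id = r.session_id;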

    There are also some new views such as sys.database_usage and sys.bandwidth_usage. These show the number, type and size of the databases and the bandwidth usage for each database so that administrators can understand SQL Azure billing. A sample is shown in Figure 6. In this view, quantity is listed in KB. You can monitor space used via this command:

    SELECT SUM(reserved_page_count) * 8192
    FROM sys.dm_db_partition_stats


    Figure 6 Bandwidth Usage in SQL Query
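
    If you’d rather query the billing-related views directly, a sketch along these lines works (the column names are assumptions based on the view output shown in Figure 6):

    -- Run against the master database; quantity is reported in KB.
    SELECT database_name, direction, class, time_period, quantity
    FROM sys.bandwidth_usage;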

    You can also access the current charges for the SQL Azure installation via the SQL Azure portal by clicking on the Billing link at the top-right corner of the screen.

    Learn More

    To learn more about SQL Azure, I suggest you download the Windows Azure Training Kit. This includes SQL Azure hands-on learning, white papers, videos and more. The training kit is available from microsoft.com/downloads/details.aspx?FamilyID=413E88F8-5966-4A83-B309-53B7B77EDF78.

    Also, you’ll want to read the SQL Azure Team Blog at blogs.msdn.com/b/sqlazure/ and check out the MSDN SQL Azure Developer Center at msdn.microsoft.com/windowsazure/sqlazure.

    If you want to continue to preview upcoming features for SQL Azure, be sure to visit SQL Azure Labs at sqlazurelabs.com.                                                                                


Office 365 plan comparison – Enterprise and Kiosk subscriptions (E1, E2, E3, E4, K1, K2)


By: mike | March 13, 2012
Microsoft Office 365 Plans Compared

There is a lot of high-level information about what features are included with each Office 365 plan. However, the details for the Office 365 plans are scattered among numerous Microsoft documents that aren’t easily found, let alone understood. Here I’ve attempted to describe in detail the differences between the Office 365 Kiosk and Enterprise plans. This includes the Office 365 E1, E2, E3, and E4 enterprise plans and the Office 365 K1 and K2 kiosk worker plans.

In addition, the Sharepoint Online component can be accessed by external users who can collaborate on documents and projects within Office 365 Sharepoint sites. These access licenses are called Partner Access License (PAL) and each Office 365 installation is granted 50 Partner Access Licenses by default. Currently Microsoft doesn’t enforce this limit and allows up to 1000 external users per Office 365 installation.

This article doesn’t address the Office 365 P1 plan. This plan is targeted at professionals and small businesses and is roughly equivalent to the Enterprise E3 plan, however, there are many differences within each service. These are addressed in a separate article comparing the Office 365 P1, E1 and E3 plans.

This article covers the three major services included with Office 365:

Office 365 Enterprise and Kiosk plans – Sharepoint Online – (E1, E2, E3, E4, K1, K2 Plans)
Office 365 Enterprise and Kiosk plans – Exchange Online – (E1, E2, E3, E4, K1, K2 Plans)
Office 365 Enterprise and Kiosk plans – Lync Online – (E1, E2, E3, E4, K1, K2 Plans)
Note: Microsoft lowered pricing on their enterprise plans on March 14th, 2012. Those changes are:

 

Office 365 New Pricing – Office 365 Plans E1, E2, E3, E4, K1, and K2
SKU Previous Cost New Cost Reduction
Office 365 K2 $10.00 $8.00 20%
Office 365 E1 $10.00 $8.00 20%
Office 365 E2 $16.00 $14.00 13%
Office 365 E3 $24.00 $20.00 17%
Office 365 E4 $27.00 $22.00 19%
SharePoint Storage (GB) $2.50 $0.20 92%
Exchange Advanced Archiving $3.50 $3.00 14%
Notes:
This pricing applies to new customers only. Customers that are under contract will still pay the same rate that their contract states. Rates will be updated when the contract is renewed.
If a current customer purchases additional seats, the new seats will be subject to the new pricing.

These Office 365 subscription plans are targeted at businesses of all sizes that require maximum flexibility in their online e-mail and collaboration service.

Office 365 Sharepoint Online Features for the Enterprise and Kiosk Plans

 

Office 365 Plan Comparison – Office 365 Plans E1, E2, E3, E4, K1, and K2
Sharepoint Online Features
(Columns: Feature | Office 365 K1 and K2 Plans – SharePoint Online Kiosk 1 and Kiosk 2 | Office 365 E1 and E2 Plans – SharePoint Online 1 | Office 365 E3 and E4 Plans – SharePoint Online 2 | Office 365 Partner Access License (PAL) – external partners)

Can access all team sites by default? | Yes | Yes | Yes | No (1)
My Site | No | Yes | Yes | No
Enterprise Features (Access, Business Connectivity Services (BCS), InfoPath Forms, Excel and Visio Services) | Yes (3) | No | Yes (2) | Yes (2)
Office Web Apps | K1 – View only; K2 – View and edit | E1 – View only; E2 – View and edit | View and edit | View only
Adds storage to the company’s overall pooled quota? | No | Yes (500MB per user subscription license) | Yes (500MB per user subscription license) | No
Can be an administrator of tenant, site or site collection? | No | Yes | Yes | No

(1) External partners can only access the sites they have been invited to by delegated site collection owners.
(2) Can view and upload Visio diagrams, view and build external lists, build and visit Access-based webpages, build and view embedded Excel graphs, and create/publish, fill in and submit InfoPath forms.
(3) Kiosk workers have read-only rights except they can edit web-based and InfoPath forms only.

Office 365 Partner Access License (PAL)

The Office 365 Partner Access License grants access to features like another Office 365 Plan. Below is an excerpt from the “Microsoft SharePoint Online for Enterprises Service Description”:

External sharing: The external sharing capabilities in SharePoint Online enable a company to simply invite external users in to view, share, and collaborate on their sites. Once a SharePoint Online Administrator enables external sharing, a site collection administrator can activate external sharing for the site they manage, and then invite external users to collaborate on sites, lists, and document libraries. An external user has access rights to only the site collection they are invited into. Please also note the external user use rights as explained above.

Note
Every Office 365 SharePoint Online customer (at the tenant level, not per subscription) includes 50 Partner Access Licenses (PALs) that can be leveraged for external sharing. Customers are not currently required to obtain additional PALs for external sharing beyond 50 users (up to a limit of 1,000) until the next major update of the Office 365 service, at which time Microsoft may choose to make additional PALs available as a paid add-on.
Microsoft supports invited external users signing in to the service using a Microsoft Online Services ID.
External sharing also supports Windows Live ID, including @Live.com, @Hotmail.com and @MSN.com user names, plus regional derivations of LiveID user names.
EasiID, the portion of LiveID that allows external users to associate their business email address (ex: user@contoso.com) to the LiveID system, is not supported at this time.

Office 365 Sharepoint Online “Server Resources” quota

From a development perspective SharePoint Online offers a flexible, robust framework for customizing and developing solutions in Office 365. The development features and patterns used to develop for SharePoint Online are a subset of those available in SharePoint 2010 on-premises.

Note
While SharePoint Online offers many opportunities for building customized solutions, the service does not yet support Full Trust Coded (FTC) solutions, sometimes referred to as farm-level solutions. The SharePoint Online development patterns and practices are currently targeted at site-collection-level solutions.
The “Server Resources” quota, which determines the amount of processing power available to sandboxed solutions, is based on the number of licensed user seats in a company’s tenancy. To calculate the server resource quota in Office 365, use the following equation: (#seats × 200) + 300. For example, with a typical 25-seat license, the available server resource quota would be 5,300.
Neither Kiosk 1 (K1) nor Kiosk 2 (K2) licenses add to the overall server resources quota
Companies cannot purchase server resources as a standalone add-on

SharePoint Online key features and specifications

These are common for all Office 365 plans that include Sharepoint Online.

Feature Description
Storage (pooled) 10 gigabytes (GB) base customer storage plus 500 megabytes (MB) per enterprise user
Storage per Kiosk Worker Zero (0). Licensed Kiosk Workers do not bring additional storage allocation.
Storage per external user Zero (0). Licensed external users do not bring additional storage allocation.
Additional storage (per GB per month); no minimum purchase. $2.50USD/GB/month
Site collection storage quotas Up to 100 gigabytes (GB) per site collection
My Site storage allocation (does not count against tenant’s overall storage pool) 500 megabytes (MB) of personal storage per My Site (once provisioned)
*Note: the storage allocation for an individual’s My Site cannot be adjusted.
Site collections (#) per tenant Up to 300 (non-My Site site collections)
Total storage per tenant Up to 5 terabytes (TB) per tenant
File upload limit 250 megabytes (MB) per file
External Users (PALs) 50 PALs are included per tenant. The current “Feature Preview” allows usage rights for up to 1,000 external users without requiring additional PALs. Microsoft reserves the right to charge for additional PALs beyond 50 at the time of the next major Office 365 update.
Microsoft Office support Microsoft Access 2010; Microsoft Excel® 2007 and 2010; Microsoft InfoPath® 2010; Outlook 2007 and 2010; Microsoft OneNote 2010; PowerPoint 2007 and 2010; Microsoft SharePoint Designer 2010; Word 2007 and 2010; SharePoint Workspace 2010; Project Professional 2010
Browser support Internet Explorer 7, 8, and 9; Firefox 3 and higher; Safari 3.1.2 on Macintosh OS X 10.5; Chrome
Mobile device support Windows Phone 7.5 (codenamed “Mango”) or later; Windows Mobile® 6.1 or later; Nokia S60 3.0 or later; Apple iPhone 3.0 or later; Blackberry 4.2 or later; Android 1.5 or later

Office 365 Exchange Online Features for the Enterprise and Kiosk Plans

There are significant differences between the Office 365 kiosk and enterprise plans for Exchange Online. The Kiosk version has limited connectivity options compared with the enterprise plans. The primary benefits of the Office 365 E3 and E4 plans for Exchange Online are unlimited storage, the legal hold feature, and voicemail integration.

Office 365 Plan Comparison – Office 365 Plans E1, E2, E3, E4, K1, and K2
Exchange Online Features
(Columns: Feature | Office 365 K1 and K2 Plans – Exchange Online Kiosk | Office 365 E1 and E2 Plans – Exchange Online (Plan 1) | Office 365 E3 and E4 Plans – Exchange Online (Plan 2))
Note: Office 365 Partner Access License (PAL) users (external partners) have no access to Exchange Online features.

Mailbox size | 500 megabytes (MB) | 25 gigabytes (GB)* | Unlimited**
Outlook Web App (regular and light versions) | Yes | Yes | Yes
POP | Yes | Yes | Yes
IMAP | No | Yes | Yes
Outlook Anywhere (MAPI) | No | Yes | Yes
Microsoft Exchange ActiveSync® | No | Yes | Yes
Exchange Web Services | No*** | Yes | Yes
Inbox rules | No | Yes | Yes
Delegate access | No (cannot access other users’ mailboxes, shared mailboxes, or resource mailboxes) | Yes | Yes
Instant messaging interoperability in OWA | No | Yes (requires Lync Online or Microsoft Lync Server 2010) | Yes (requires Lync Online or Microsoft Lync Server 2010)
SMS notifications | No | Yes | Yes
Custom retention policies | Yes | Yes | Yes
Multi-mailbox search | Yes | Yes | Yes
Personal archive | No | Yes | Yes
Voicemail | No | No | Yes
Legal hold | No | No | Yes

*25 GB of storage apportioned across the user’s primary mailbox and personal archive.
**25 GB of storage in the user’s primary mailbox, plus unlimited storage in the user’s personal archive. Refer to the personal archive section of this document for further information regarding unlimited storage in the archive.
***Direct access to Kiosk user mailboxes via Exchange Web Services is not permitted. However, line-of-business applications can use Exchange Web Services impersonation to access Kiosk user mailboxes.

Office 365 E3 and E4 Plans – Unlimited Personal Archive

In Office 365, a personal archive can only be used to store one user’s messaging data. In the Office 365 E1 and Office 365 E2 plan each user receives 25 gigabytes (GB) of total storage which includes both the user’s primary mailbox and personal archive. This effectively limits the personal archive for an Office 365 E1 and E2 plan user to less than 25 GB.

An Office 365 E3 plan and Office 365 E4 plan user has 25 GB for their primary mailbox, plus unlimited storage in the personal archive. For Office 365 E3 and E4 plan users, the personal archive has a default quota of 100 GB. This is generally large enough for reasonable use, including importing a user’s historical email. In the unlikely event that a user reaches this quota, Office 365 support can increase the quota.

To change the Single Item Recovery period for a mailbox, an administrator must contact Office 365 support. The Office 365 E1 and E2 plans support a Single Item Recovery period of up to 30 days. The Office 365 E3 plan and Office 365 E4 plan both support a Single Item Recovery period of any length.

Office 365 Lync Online Features for the Enterprise and Kiosk Plans

This table comparing Office 365 plans for Lync Online is a little overkill because the Kiosk plans don’t include Lync and external users don’t have Lync rights either. On top of that all Office 365 E plans include Lync Online Plan 2.

(Columns: Feature | Office 365 E1 and E2 Plans – Lync Online (Plan 2) | Office 365 E3 and E4 Plans – Lync Online (Plan 2))
Note: Office 365 Kiosk K1 and K2 plans have no access to the Lync Online features of Office 365. Office 365 Partner Access License (PAL) users (external partners) have no access to Lync Online features.

Instant messaging (IM) and presence | Yes | Yes
Lync-to-Lync audio/video calling (1-to-1) | Yes | Yes
Lync federation (IM/presence/audio/video) | Yes | Yes
Click-to-communicate in Office | Yes | Yes
Authenticated attendee in Lync meetings* | Yes | Yes
Microsoft Exchange ActiveSync® | Yes | Yes
Online Meetings | Yes (up to 250 attendees) | Yes (up to 250 attendees)
Initiate ad-hoc and scheduled online meetings | Yes | Yes
Initiate multiparty (3 or more users) Lync audio/video sessions | Yes | Yes
Initiate interactive data sharing (screen/application/whiteboard) | Yes | Yes
Interop with third-party dial-in audio conferencing services | Yes | Yes

*Unauthenticated attendees who join scheduled Lync meetings do not require a Lync Online license.

 

References:

The information in this article comparing Office 365 subscription plans was gleaned from the following Microsoft documents:

Office 365 Microsoft Sharepoint Online for Enterprises Service Description (Updated 12/6/2011)
Office 365 Microsoft Exchange Online for Enterprises Service Description (Updated 11/18/2011)
Office 365 Microsoft Lync Online for Enterprises Service Description (Updated 7/29/2011)

Microsoft Office 365 for professionals and small businesses (Plan P1) Service Description

    

Microsoft Office 365
for professionals and small businesses
(Plan P1)

 
 

Service Description

 

 

Note: This document is provided for informational purposes only, and Microsoft makes no warranties, express or implied, with respect to this document or the information contained in it.

Published: February 2012
Updated: May 29, 2012

For the latest information, please see http://www.microsoft.com/online.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication and is subject to change at any time without notice to you. This document is provided “as-is.” Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes. This document is confidential and proprietary to Microsoft. It is disclosed and can be used only pursuant to a non-disclosure agreement.

The descriptions of other companies’ products in this document, if any, are provided only as a convenience to you. Any such references should not be considered an endorsement or support by Microsoft. Microsoft cannot guarantee their accuracy, and the products may change over time. Also, the descriptions are intended as brief highlights to aid understanding, rather than as thorough coverage. For authoritative descriptions of these products, please consult their respective manufacturers.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

All trademarks are the property of their respective companies.

©2012 Microsoft Corporation. All rights reserved.

Microsoft, ActiveSync, Backstage, Entourage, Excel, Forefront, Hotmail, InfoPath, Internet Explorer, Lync, MSN Messenger, OneNote, Outlook, PowerPoint, RoundTable, SharePoint, Silverlight, SkyDrive, SQL Server, Visual Studio, Windows, Windows Live, Windows Mobile, Windows Phone, Windows Server, and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Contents    

Introduction
1. Why Office 365 for Your Organization
1.1 Virtually Anytime, Anywhere Access
1.2 Easy to Use
1.3 Improved Collaboration
1.4 Security and Reliability
2. Overview of Services Provided by Office 365
2.1 Email, Calendar, and Contacts
2.2 Team Sites and Public Website
2.3 Office Web Apps
2.4 Instant Messaging and Online Meetings
3. Requirements for Using Office 365
3.1 System Requirements
3.2 Using Office Desktop Applications
3.3 Using Mobile Devices
4. Office 365 Security
5. Email, Calendar, and Contacts
5.1 Access Your Email, Calendar, and Contacts
5.2 Functionality of Your Outlook Email, Calendar, and Contacts
5.3 Large, Easy-to-Use Mailboxes
5.4 Professional Email Addresses
5.5 Automatically Update Your Email, Calendar, and Contacts across Devices
5.6 See Colleagues’ Availability from Your Outlook Calendar
5.7 Antivirus and Anti-Spam Filtering
5.8 Reduce Inbox Overload with Conversation View
5.9 Set Out-of-Office Replies
5.10 Recover Deleted Items
5.11 Access Other Email Accounts through Office 365
5.12 Personal Archive
5.13 Additional Features
6. Team Sites and Public Websites
6.1 Public-Facing Website
6.2 Manage Important Documents
6.3 Plenty of Space for Your Documents and Sites
6.4 External Sharing
6.5 Microsoft Office Integration
6.7 Familiar Look and Feel
6.8 Data Is Highly Secure
7. Office Web Apps
7.1 Never Be without the Tools You Need
7.2 Ensure Consistent Document Views
7.3 Edit Content with Confidence
7.4 Work Easily with Others
8. Instant Messaging and Online Meetings
8.1 Find and Connect with Colleagues and Customers
8.2 Easily Conduct Professional Online Presentations or Spontaneous Online Meetings
8.3 Interoperability with 3rd Party Dial-in Audio Conferencing Services
8.4 View Presence Status and Click-to-Communicate In Microsoft Office Applications
8.5 Communicate with Other Office 365 and Windows Live Users
8.6 Presence with Microsoft Outlook and Other Office Applications
8.7 Presence with Exchange Online
9. Additional Office 365 Service Details
9.1 Administering Office 365
9.2 Getting Help
9.3 Additional Self-help Resources
9.4 Countries Where Office 365 (Plan P1) Is Available
9.5 Languages
9.6 Licensing
9.7 Buying your Office 365 Subscription
9.8 Microsoft Office 365 Marketplace
9.9 Service Level Agreement
9.10 Data Center Locations
Appendix A: Exchange Online Detailed Feature Summary
Appendix B: SharePoint Online Detailed Feature Summary

 

Introduction

 

Office 365 for professionals and small businesses (Plan P1) is a set of web-enabled tools that lets you access your email, important documents, contacts, and calendars from virtually anywhere and on almost any device. Designed for organizations with one to 25 employees (with a technical limit of 50 users maximum), the service brings together online versions of the best business-grade communications and collaboration tools from Microsoft plus Microsoft Office Web Apps at a price that small businesses can afford. Office 365 works seamlessly with the programs you already know and use — Microsoft Outlook, Microsoft Word, Microsoft Excel, and Microsoft PowerPoint. This is the much-anticipated cloud service that gives small businesses the capabilities and efficiencies to grow and target more rapid success.

 

Powerful security features from Microsoft Corporation help protect your data, and the service is backed by a financially backed 99.9 percent uptime guarantee. Office 365 was designed to be easy enough for small businesses to run without specialized IT knowledge.

1. Why Office 365 for Your Organization

1.1 Virtually Anytime, Anywhere Access

Office 365 helps you access your email, important documents, contacts, and calendar on nearly any device from almost anywhere. It frees you to work where and when you choose, allowing you to respond to important requests right away, no matter where you are. Because you can use your mobile device to access email and documents, you won’t have to hurry back to the office (or look for a WIFI hot spot if you are using your computer). When traveling, you can access your email and even edit online documents from most popular web browsers.

1.2 Easy to Use

Office 365 is easy to try, simple to learn, and straightforward to use. It works seamlessly with the programs you know and use most, including Outlook, Word, Excel, OneNote and PowerPoint. With Office 365, you can choose which tools to use.

1.3 Improved Collaboration

With Office 365, you can create a password-protected portal to share large, hard-to-email files both inside and outside your organization, giving you a single location to find the very latest versions of files or documents, no matter how many people are working on them.

1.4 Security and Reliability

Powerful security features from Microsoft help protect your data. Office 365 is backed by a financially backed 99.9-percent uptime guarantee. Office 365 helps safeguard your data with enterprise-grade reliability, disaster recovery capabilities, data centers in multiple locations, and a strict privacy policy. It also helps protect your email environment with up-to-date antivirus and anti-spam solutions.

2. Overview of Services Provided by Office 365

2.1 Email, Calendar, and Contacts

Powered by Microsoft Exchange Online

Office 365 provides you access to email, calendar, and contacts from virtually anywhere at any time on desktops, laptops, and mobile devices—while helping to protect against malicious software and spam.

  • Easily manage your email with 25-gigabyte (GB) mailboxes and send emails up to 25 megabytes (MB) in size
  • Work from almost anywhere with automatically updated email, calendar, and contacts across devices you use most, including PCs, Macintosh computers, iPhone, Android phones, Blackberry smartphones, Microsoft Windows Mobile®, and Windows® Phones
  • Connect with Microsoft Outlook 2010 or Office Outlook 2007 and use all of the rich Outlook functionality you already know and use, whether you are connected to the Internet at home, or in the office, or you are working offline
  • Access your email, calendar, and contacts from nearly any web browser while enjoying a rich, familiar Outlook experience with Outlook Web App
  • Use your existing domain name to create professional email addresses powered by Exchange Online (for example, mark@contoso.com)
  • Easily schedule meetings by sharing calendars and viewing them side by side, seeing your colleagues’ availability, and suggested meeting times from your calendar
  • Help protect your organization from spam and viruses with Microsoft Forefront® Online Protection for Exchange, which includes multiple filters and virus-scanning engines

2.2 Team Sites and Public Website

Powered by Microsoft SharePoint® Online

SharePoint Online helps you create sites to share documents and information with colleagues and customers. It lets you:

  • Work together effectively by sharing team documents and tracking project milestones to keep everyone in sync
  • Keep your team’s important documents online so the latest versions are always at hand
  • Provide all team members with online access to critical business information whenever and wherever they need it
  • Easily protect critical business information by controlling who can access, read, and share documents and information
  • Market your small business using a simple public-facing website with a custom domain name (for example, www.contoso.com)
  • Publish, share and edit Access database applications on your Team Site

2.3 Office Web Apps

Hosted on Microsoft SharePoint Online

Office Web Apps are convenient online companions to Word, Excel, PowerPoint, and OneNote® that offer you an easy way to access, view, and edit documents directly from your web browser.

  • Work with others simultaneously in Excel spreadsheets and in OneNote notebooks while seeing who is editing what parts of the document
  • Access and view Office documents from your mobile device
  • Ensure that viewers experience great fidelity between documents viewed with the Office Web Apps and those viewed in the desktop Office applications

2.4 Instant Messaging and Online Meetings

Powered by Microsoft Lync Online

Microsoft Lync™ Online helps you find and quickly connect with the right person from within the Office applications you already use.

  • Find and connect with colleagues and customers from virtually anywhere via rich presence, instant messaging (IM), audio/video calls, and online meetings
  • Use the Presence indicator to see when coworkers and partners are online and available
  • Make PC-to-PC audio and video calls with colleagues and customers
  • Conduct rich online meetings—including audio, video, and web conferencing—with people both inside and outside your organization
  • Share your desktop, online whiteboards, and presentations with colleagues and partners inside and outside of your organization
  • Click-to-Communicate with other users of Office 365 and Windows Live™ Messenger

3. Requirements for Using Office 365

3.1 System Requirements

Office 365 works effectively with many combinations of browsers, operating systems, and supporting software. Please refer to System Requirements for Office 365 to view the latest software requirements.

3.2 Using Office Desktop Applications

For the best experience with Office 365, a set of software updates must be applied to each PC. These updates are required for all workstations that use rich clients (such as Microsoft Office 2010) and connect to Office 365 services. To apply these updates, each user should run the Office desktop set-up program, which can be found on the Office 365 home page.

3.3 Using Mobile Devices

You can access the Email, Team Sites, and Instant Messaging capabilities of Office 365 from a variety of phones and mobile devices.

Exchange ActiveSync technology synchronizes mailbox data between mobile devices and Exchange Online, so users can access their email, calendar, contacts, and tasks on the go. Exchange Online also provides better data security features on mobile devices with password enforcement and remote data wiping capabilities.

Team Sites (powered by SharePoint Online) give you a central place to share documents and information with colleagues and customers. Team Sites can render on many devices (including Web-enabled mobile phones) using a simplified text-only format.

The Lync Mobile client lets you send and receive instant messages from your mobile device. Lync Mobile clients are available for the leading smart phone platforms, including Windows Phone, iPhone, Android, and Nokia Symbian.

4. Office 365 Security

Powerful security features from Microsoft help protect your data
with security standards that exceed what many businesses can provide for themselves. With high reliability, disaster recovery capabilities, data centers in multiple locations, and a strict privacy policy, your data is more secure. Availability to the services will be backed with a 99.9-percent uptime, financially backed Service Level Agreement (SLA) when the service is released for general availability. The service includes:

  • Access secure features: Exchange Online is accessed through 128-bit Secure Sockets Layer (SSL) or TLS encryption
  • Intrusion monitoring: Microsoft continuously monitors the Office 365 systems for any unusual or suspicious activity. If Microsoft detects such activity, it investigates and responds appropriately
  • Security audits: Microsoft regularly assesses the Office 365 Services infrastructure to ensure that the latest compliance policies and antivirus signatures are installed, along with high-level configuration settings and required security updates. The Office 365 services have:
    • Achieved ISO 27001 certification
    • Completed SAS70 Type I and II audits
    • Added controls that assist customers in complying with certain regulatory requirements
  • High availability: Office 365 has a 99.9-percent scheduled uptime. If a customer’s service is affected, Office 365 offers a service credit subject to the terms and conditions of the SLA.
  • Business continuity: Redundant network architecture is hosted at geographically dispersed Microsoft data centers to handle unscheduled service outages. Data centers act as backups for each other: If one fails, the affected customers are transferred to another data center with limited interruption of service.

5. Email, Calendar, and Contacts

Powered by Microsoft Exchange Online

Key Features and Benefits

Office 365 messaging services, powered by Exchange Online, provide you with a 25 GB mailbox, contacts, and calendar that is available almost any time and from almost anywhere. Read and reply to your email directly from almost any major smartphone, including iPhone, Android, Nokia, Blackberry, and Windows Phone, or use almost any Macintosh computer or PC.

The following details provide a look at some of the key benefits and capabilities of the messaging services provided by Office 365.

5.1 Access Your Email, Calendar, and Contacts

Microsoft Outlook Web App is a web-based version of Outlook that provides the familiar, rich functionality and experience you are accustomed to from the desktop version of Microsoft Outlook. If you are limited by low bandwidth, Outlook Web App is optimized so it minimizes data and bandwidth use. Cross-browser support for Safari, Firefox, Chrome, and Internet Explorer ensures that wherever you are connected to the Internet—at home, at the office, or on the road—you can access your email through Outlook Web App.

Users can access Outlook Web App from a link on the Office 365 Portal.

 

Figure 1: Access your email from a broad range of browsers with Outlook Web App

5.2 Functionality of Your Outlook Email, Calendar, and Contacts

Office 365 is the only set of services designed to be fully compatible with Microsoft Outlook. Exchange Online works with Outlook 2010 or Office Outlook 2007, making it easier to use the familiar desktop application.

5.3 Large, Easy-to-Use Mailboxes

Exchange Online provides you with 25 GB of mailbox storage. This removes the need to archive email locally with PST files and allows real-time access to your messages from Outlook, a browser or a mobile device. Emails have a size limit of 25 MB, allowing you to send large files, including videos and PowerPoint slides.

5.4 Professional Email Addresses

Use your existing domain name to create professional email addresses powered by Exchange Online (for example, mark@contoso.com). Adding your domain to Office 365 means that you can have your own domain name on email addresses, Lync Online accounts, distribution lists and your public website. When adding a domain, you can also choose to continue to host your public website with another provider.

5.5 Automatically Update Your Email, Calendar, and Contacts across Devices

You can access your email, contacts, and calendar from mobile devices that incorporate Exchange ActiveSync® technology. These devices maintain a connection with the service, receiving any new or updated email messages, calendar items, contacts or tasks as soon as they arrive on the service. Mobile access is available from a wide range of
devices including iPhone, Android, Nokia, Blackberry, and Windows Phone.

5.6 See Colleagues’ Availability from Your Outlook Calendar

Exchange Online lets you access a consistent calendar from your multiple devices, share your calendar with people inside and outside your company, view multiple calendars side by side, and use the scheduling assistant to view availability and schedule meetings with people inside and outside your company.

5.7 Antivirus and Anti-Spam Filtering

All messages sent through the Exchange Online service are automatically scanned for viruses and malware to help safeguard your data. Exchange Online uses Forefront Online Protection for Exchange—an enterprise-level email filtering technology—to help protect your incoming and outgoing messages. The service uses proprietary anti-spam technology to help achieve high accuracy rates and uses multiple, complementary antivirus engines. Additionally, internal messages are scanned to protect you from viruses that may be sent through email messages within your organization. Antivirus and anti-spam protections are preconfigured and automatically updated, so there are no steps necessary for setting up, configuring, or maintaining the filtering technology.

5.8 Reduce Inbox Overload with Conversation View

By grouping conversations together, you can view messages in context and narrow the number of relevant messages in your inbox. Messages within the conversation are grouped, no matter where the message exists within the mailbox, which helps you and your employees be more productive.

5.9 Set Out-of-Office Replies

With the Exchange Out-of-office feature, you can see if someone is out of office before sending an email message or scheduling an appointment. You can schedule out-of-office messages in advance with specific start and end times. You can configure separate out-of-office messages for users in your company and for external users such as your customers or partners. Junk email and mailing list awareness prevents external out-of-office messages from being sent to extended mailing lists and spammers. You can also format out-of-office messages as rich HTML messages with hyperlinks rather than as plain text. Exchange Online also gives you the ability to set out-of-office messages from mobile devices that support this Exchange ActiveSync feature.

5.10 Recover Deleted Items

Exchange Online enables you to restore items that have been deleted from any email folder—including the Deleted Items folder—in case you accidentally delete an important item. These items are kept in a Recoverable Items folder for 14 days before being permanently removed. You can recover these items yourself using the Recover Deleted Items feature in Outlook Web App or Outlook.

5.11 Access Other Email Accounts through Office 365

You can connect to as many as five email accounts from Outlook Web App, letting you send, receive, and read email messages from those connected accounts in one place.

  • Windows Live Hotmail: You don’t need to turn on POP or IMAP access for a Windows Live Hotmail® account. If you have folders in your Hotmail account, these folders are copied to your account in Outlook Web App along with the email messages downloaded from your Hotmail account.
  • Gmail: Allow POP access from your Gmail account to download mail from the Gmail account to Outlook Web App.
  • Yahoo Mail Plus, Comcast, AOL: These services give you POP access automatically and don’t support IMAP access.
  • IMAP Access: Outlook Web App supports IMAP access for most services, except Gmail. With IMAP access, your folders and mail items within those folders are downloaded to Outlook Web App the same way you see them in your other account. If your other account allows IMAP access, ensure IMAP access is turned on before you connect to the account.

5.12 Personal Archive

Exchange Online offers archiving through the personal archive capabilities of Exchange 2010 to help you store historical data that you rarely access. A personal archive is a specialized mailbox that appears alongside your primary mailbox folders in Outlook or Outlook Web App similar to a personal folder. You can access the archive in the same way you access your normal mailbox. In addition, you can search both your personal archive and primary mailbox.

Outlook 2010 and Outlook Web App provide you with the full features of the personal archive, as well as related features like retention policies, which can help you organize and clean up your mailbox.

Outlook 2007 provides basic support for the personal archive, but not all features are available in Outlook 2007. For example, with Outlook 2007, you cannot apply retention policies to items in your mailbox.

Administrators can use the Exchange Control Panel to enable the personal archive feature for specific users in your company.

Size of the Personal Archive

Each personal archive can only be used to store one person’s messaging data. You receive 25 GB in storage which can be used across both your primary mailbox and personal archive.

Importing Data to the Personal Archive

You can import historical data to personal archives in the following four ways:    

  • Import data from a .pst file using Outlook’s Import and Export wizard
  • Drag email messages from .pst files into the archive
  • Drag email messages from your primary mailbox into the archive
  • Set retention policies to automatically move certain email messages from your primary mailbox, based on the age of the messages

5.13 Additional Features

  • Global Address List: A Global Address List gives companies a common directory of all email-enabled users, distribution groups, and external contacts, helping to ensure that users can access the most recent contact information.
  • Resource Mailboxes: Use Outlook or Outlook Web App to schedule use of shared resources, such as a conference room. After setting up the room alias (ex. ConfRm1@contoso.com), users can reserve the room by adding the conference room email alias to meeting requests.
  • Distribution Groups: Distribution groups make it easy to send messages to multiple people. Unlike personal distribution groups that individuals create in Outlook, these distribution groups are available to all users through their Global Address List in Outlook.
  • Integrated Instant Messaging and Presence: Outlook Web App has instant messaging capabilities integrated into the web client, connected to Lync Online. Using the colorful status indicator of another person, users can see who is online and quickly decide if they should send an e-mail or just fire off a quick IM to get a fast response.
  • Message Delivery Status Reports: Flexible message tracking capability to search for message delivery status on e-mail sent to or from users in Exchange Online. A web-based user interface also allows administrators to search for delivery reports by subject and within the last two weeks.

 

For a detailed feature summary of Exchange Online, see Appendix A.

6. Team Sites and Public Websites

Powered by Microsoft SharePoint® Online

Key Features and Benefits

Office 365 makes it easy for you to share documents with colleagues, customers, and even trusted business partners. SharePoint Online is designed to work with familiar Office applications. It’s easy to create Office documents and save directly to SharePoint Online or co-author documents with Office Web Apps. Information workers can access important documents offline or from familiar mobile devices and set document-level permissions to protect sensitive content. With one click, it’s possible to communicate in real-time with colleagues and customers from within SharePoint sites.

The following sections provide information about some of the key benefits and capabilities of Team Sites and the public-facing website in Office 365.

6.1 Public-Facing Website

You can easily create a well-designed, public-facing website and apply a custom domain name (for example, www.contoso.com) using the built-in Site Designer tool. The built-in Site Designer tool provides many out-of-the-box templates you can use to personalize your site. Public sites built using SharePoint Online are excellent for small businesses that need a simple and attractive site.

6.2 Manage Important Documents

When a single document has multiple contributors, versioning and control issues can quickly become problematic. SharePoint Online provides your Team Sites with built-in document check-in
and check-out capabilities that work directly in Microsoft Office 2007, Microsoft Office 2010, and Office Professional Plus. In addition, two or more people can co-author a document using Microsoft Office 2010 and Office Professional Plus or Office Web Apps.

SharePoint Online document libraries can be configured so that revision numbers for documents are automatically updated every time a user checks in a document. You can also easily return to any previous version. Document collaboration in SharePoint Online is a well-developed, flexible feature that you can adjust to meet your specific requirements.

6.3 Plenty of Space for Your Documents and Sites

Each subscription to Office 365 comes with a SharePoint Online site collection that can host multiple sub-sites starting with 10 GB of storage plus 500 MB for each subscriber. For example, if you have 10 users, you would have 15 GB total of storage. This is in addition to the 25 GB each user gets for his or her email.

6.4 External Sharing

You can share documents and information easily with trusted business partners. A team site gives your organization a single location to find the latest versions of files or documents. You can access your team sites and the documents they contain from your web browser and your mobile device and work directly with documents from your Office desktop applications. SharePoint Online allows you to share documents and information more securely with colleagues and customers inside or outside your company. Major benefits of SharePoint Online team sites include:

  • Manage and share important documents to help teams work together
  • Track key project milestones and schedules with shared-calendars
  • Create, edit, and review documents and proposals in real-time
  • Share documents and information easily with trusted business partners
  • Manage important meeting notes and project delivery schedules
  • Enable real-time communication with colleagues right from within SharePoint
  • Apply your own unique look and feel to team sites with custom theming and branding

6.5 Microsoft Office Integration

Microsoft Office and SharePoint Online now work better together. In addition to document collaboration and management, new capabilities now enable co-authoring—two or more users can simultaneously work on the same document. With Outlook 2010 or Office Outlook 2007, you can view or edit calendars and contact lists that are stored on SharePoint Online sites and create and manage sites for organizing meetings.

Some highlights of the new functionality in Microsoft Office 2010 and Microsoft Office Professional Plus that interoperate with SharePoint Online include:

  • Backstage View: The Microsoft Office Backstage™ view allows you to manage your documents and related data—you can create, save and send documents, inspect documents for hidden metadata or personal information, and set options such as turning on or off AutoComplete suggestions.
  • Document Co-Authoring: With new co-authoring capabilities, multiple users can edit the same document at the same time, even if they are in different locations. They can even communicate as they work directly from within the desktop application.
  • Outlook: Gain read/write access to SharePoint Online items such as calendars, tasks, contacts, and documents. See complete views of calendars and tasks across multiple lists and sites.
  • Outlook Alerts: You can stay updated on changes to documents and list items on SharePoint sites by receiving notifications of changes as alerts and Really Simple Syndication (RSS).
  • Hosted Access Databases: You can easily publish Access 2010 databases from your desktop to SharePoint Online using Access Services. You now have a way to create web-based Access databases that are as easily accessible to your broader peer group as any other site.

6.7 Familiar Look and Feel

Microsoft understands the value of keeping a consistent look and feel to its menus across different applications. When using SharePoint Online, you will find the familiar Ribbon featured in Office 2007 and Office 2010. The Ribbon has the features and the functionality you expect, saving you the time and frustration you may experience working with different online services.

Figure 2: Familiar look and feel with the SharePoint Online Ribbon

6.8 Data Is Highly Secure

All documents that you or your colleagues add to SharePoint Online are scanned for malware using multiple scanning engines. You can control who can access your documents stored in your password-protected sites, and you can further control access within SharePoint Online to designate who can view and edit documents and information.

 

For a detailed feature summary of SharePoint Online, see Appendix B.

7. Office Web Apps

Hosted on Microsoft SharePoint Online

Key Features and Benefits

Office Web Apps help you work with Office documents directly in a browser when you are away from the office or at a shared PC. They are convenient online companions to Word, Excel, PowerPoint, and OneNote that give you the freedom to view and edit your Office documents from virtually anywhere with a supported browser and to view your documents on a supported mobile device.

The following sections provide information about some of the key benefits and capabilities of Office Web Apps provided by Office 365.

7.1 Never Be without the Tools You Need

If you are away from your office or home, and you find yourself using a computer that doesn’t have Microsoft Office installed, you can use the Office Web Apps to view and edit documents in Word, Excel, PowerPoint, and OneNote. Microsoft SharePoint Online team sites use the Office Web Apps to allow you to access, view, edit, save, and share your stored files from almost any computer with an Internet connection. You can even access and view PowerPoint, Word, and Excel content from a browser on mobile devices.

7.2 Ensure Consistent Document Views

You spend a lot of time making your content look its best, and you want to know that those who view your content are seeing what you intended. Office Web Apps provide professional, high-fidelity viewing of Word, Excel, PowerPoint, and OneNote files. You can take advantage of the rich features in Microsoft Office on your desktop to create content and then share those files online with great document fidelity and consistent formatting.

7.3 Edit Content with Confidence

When you create documents with Microsoft Office on your desktop, you might use rich content and advanced features such as graphics, images, tables of contents, and cross-references to add impact to important information. Document formatting stays intact as you move between Office Web Apps and the corresponding desktop application.

7.4 Work Easily with Others

Office Web Apps make it simple to collaborate on documents with people who use different platforms or different versions of Microsoft Office, or who simply don't have Office installed on their computers. When you give someone access to your Office documents on SharePoint Online, they can view those documents through a supported web browser using Office Web Apps.

 

8. Instant Messaging and Online Meetings

Powered by Microsoft Lync Online

Key Features and Benefits

Microsoft Lync Online is a next-generation online communications service that connects people in new ways anytime from virtually anywhere. Lync Online provides rich and intuitive communications capabilities including presence, IM, audio/video calling, and an online meeting experience that supports audio, video, and web conferencing.

Lync Online transforms interactions with colleagues, customers, and partners from today’s hit-and-miss communications to a more collaborative, engaging, and effective experience that can help your business function more efficiently and cost effectively.

The following sections provide information about some of the key benefits and capabilities of Lync Online provided by Office 365.

8.1 Find and Connect with Colleagues and Customers

Businesses often face communications problems because people must repeatedly attempt to reach each other by phone or email. The problem gets worse when people communicate across geographies and time zones. Lync Online lets you know when a colleague or partner is available to communicate and enables you to choose the proper communications method (IM, audio/video call, and/or data sharing) to resolve critical business discussions or make time-sensitive decisions.

Figure 4: Lync Online meeting with PC audio, video conferencing, and screen sharing

8.2 Easily Conduct Professional Online Presentations or Spontaneous Online Meetings

With Lync Online, you can have more effective interactions with colleagues and partners by escalating IM sessions into spontaneous online meetings including audio, video, and screen sharing in just a few clicks. You can also conduct professional online presentations with external customers, partners, and colleagues that include data, video, and audio with the ability to control content, annotate, and use a virtual whiteboard.

External attendees can join online meetings to view or share a screen and IM through a web browser. Alternatively, attendees can download and install the free Lync attendee software, which provides full-fidelity PC audio, video, and content-sharing capabilities.

8.3 Interoperability with 3rd Party Dial-in Audio Conferencing Services

Dial-in audio conferencing is the ability to dial into a scheduled Lync meeting/conference from fixed-lines or mobile phones. This capability is not provided natively in Lync Online, but can be achieved through leading third-party audio conferencing services. See the Office 365 marketplace listings for more information about this optional interoperability.

8.4 View Presence Status and Click-to-Communicate In Microsoft Office Applications

Collaborating with others can be challenging if your job requires constant use of business productivity applications. Lync Online connects presence and real-time collaboration capabilities with the Microsoft Outlook messaging and collaboration client. This enables higher productivity by allowing you to collaborate using the familiar programs you and your colleagues already use.

8.5 Communicate with Other Office 365 and Windows Live Users

The federation feature of Lync Online establishes trusted relationships between your organization and one or more external organizations. This allows your people to see user presence and communicate across organizational boundaries. Public IM connectivity (PIC) allows your organization to more securely connect its existing base of enterprise-enabled IM users to trusted contacts using public IM services that can be provided by Windows Live Messenger.

All federated communications are encrypted between the IM systems using access proxy servers. Microsoft does not control encryption after messages are passed to the federated partner’s network (if the partner is federated with an on-premises Lync Server or third-party network).

IM federation requires the consent and proper configuration of both parties of the federation relationship. Once the federation is set up by both sides, users in each company can start seeing presence and communicating with users in the other company. Table 2 shows how federation affects IM, presence, and PC-to-PC audio and video.

Table 2: Federation features by link type

Link type | IM and Presence | PC-to-PC Audio and Video
Other companies using Office 365/Lync Online | Yes | Yes
Lync Server 2010 or Office Communications Server on-premises (any version) | Yes | Yes
Windows Live Messenger | Yes | Yes

Works with Office

8.6 Presence with Microsoft Outlook and Other Office Applications

Lync Online can connect presence with Microsoft Office 2007 or Office 2010. You can instantly find and communicate with people from within Office Outlook. This connection occurs wherever you see a colored presence indicator that represents a person’s presence status. You can then click the presence icon and initiate communication using Lync (this feature is called “click-to-communicate”).

8.7 Presence with Exchange Online

Lync Online connects presence with Exchange Online. This includes presence status in Outlook, presence status changes based on Exchange calendar information, IM and presence in Outlook Web App, out-of-office messages in the Lync client, and click-to-communicate from Outlook via Lync.

 

9. Additional Office 365 Service Details

9.1 Administering Office 365

Office 365 is easy to set up and use. Because it was designed for organizations without IT staff, you can focus on your business rather than learning how to navigate menus or deciphering unfamiliar technical terms. Administration is performed using an intuitive, web-based portal accessible only to those you designate.

As an owner of your organization’s account, you are considered a Global Administrator. Global Administrators can create new user accounts, assign administrative roles to others, and configure the different services included in Office 365. You do not need any special technical expertise to be an administrator for Office 365 for professionals and small businesses.

Figure 5: Office 365 Administration Website


9.2 Getting Help

Customers who purchase Microsoft Office 365 for professionals and small businesses use the Microsoft Office 365 Community (www.community.office365.com) as the primary way to resolve technical and billing issues. Telephone support for technical questions is not included in the cost of the subscription.

The Office 365 Community

The Microsoft Office 365 Community is a single destination for self-help support information and community discussion. The Microsoft Office 365 Community has the latest information to help customers find answers to a variety of technical, billing and service questions via support forums, wikis, and blogs.

The Office 365 Community is a public website (community.office365.com) and is available 24 hours a day, 7 days a week. The support forums are staffed and moderated by Microsoft Support Agents. Anyone can view and read the support forums, wikis, and blogs related to Microsoft Office 365. We encourage customers, Microsoft Partners and Microsoft Most Valuable Professionals (MVPs) to engage with the community and contribute to the ongoing discussions. To actively post and reply to discussions within the Community, an individual must register and sign in with a Microsoft Office 365 ID or with a Windows Live™ ID (Hotmail, MSN, Windows Live).

Community Resources

From the Community home page you can access the following resources:

  • Forums are intended to provide Community participants with an online destination where they can post technical support questions and discuss topics related to the Office 365 service. Forums include categories dedicated to each of the individual online services as well as individual topics that our customers find valuable.
  • Wikis include wiki pages created by Microsoft employees and authenticated Community members. This collaborative site encompasses the latest collective content about specific Microsoft Office 365 technical scenarios. Each individual wiki page typically includes links to websites, webcasts, troubleshooting videos, frequently asked questions (FAQ) pages, documents, and downloads about that specific technical scenario. Historical tracking of every revision date and author alias is provided along with the ability to compare versions.
  • Blogs are a good resource for obtaining current information about Microsoft Office 365 online services and for learning about the benefits of Microsoft Office 365 features and functions. Within the Community portal for Microsoft Office 365 are two basic types of blogs: the Microsoft Office 365 Blog and the Microsoft Office 365 Technical Blog.
  • Microsoft Office 365 Blog focuses on the latest news and product information about Microsoft Office 365. The target audience is people interested in Microsoft Office 365. Sample topics include product insights, new product announcements, customer interviews, and a guest blog series.
  • Microsoft Office 365 Technical Blog helps existing customers with technical tasks or in troubleshooting common issues. The target audience consists of people using, selling, supporting, and developing applications for Microsoft Office 365. Sample topics include troubleshooting videos, technical webcasts, announcements about product feature updates, and showcasing of Microsoft partner technical solutions.

 

Help with Your Bill

Although the Community is the primary support vehicle for Office 365 for professionals and small businesses, customers can get help with billing issues by submitting a ticket from the Support Overview page in the Office 365 portal. Customer billing support will respond as appropriate to the severity of the issue by calling the customer, e-mailing FAQs, or pointing to community support.

9.3 Additional Self-help Resources

Virtual Support Agent

The Virtual Support Agent is an automated support agent that provides online support around the clock, interacting in a natural, conversational style. It is located on the Microsoft Office 365 Support Overview page. Customers use a text-chat interface to type questions in their own words and receive immediate responses. The automated agent has access to a variety of databases built on current content about Microsoft Office 365.

Technical Support Videos

The growing library of English-language-only instructional troubleshooting videos has been developed based on the most commonly asked questions from customers.

To view these videos, go to the Community site and search for videos. Customers are encouraged to submit a request for a video through the Community portal. Customers can also navigate to the Microsoft Office 365 YouTube and Showcase channels.

Learn Through Social Media

Following Microsoft Office 365 on Facebook, Twitter, and LinkedIn provides a way for customers and partners to become more educated about Microsoft Office 365. This fast and easy way of learning about Microsoft Office 365 allows customers to listen to what others are saying and be able to add their own comments and tweets. Microsoft support professionals monitor the Microsoft-related Facebook and Twitter activity to assist with any support-related inquiries.

To find the most current Facebook feeds along with the most recent Tweets, go to the bottom of the Community home page and follow the daily discussions among customers and partners.

9.4 Countries Where Office 365 (Plan P1) Is Available

Office 365 is available in 38 countries: Australia, Austria, Belgium, Canada, Colombia, Costa Rica, Cyprus, Czech Republic, Denmark, Finland, France, Germany, Greece, Hong Kong, Hungary, India, Ireland, Israel, Italy, Japan, Luxembourg, Malaysia, Mexico, Netherlands, New Zealand, Norway, Peru, Poland, Portugal, Puerto Rico, Romania, Singapore, Spain, Sweden, Switzerland, Trinidad & Tobago, United States, and UK.

9.5 Languages

Table 3 summarizes the languages supported by the Office 365 platform and related components.

Table 3: Supported languages for components related to Office 365

Component | Supported languages
Office 365 Portal | English, Japanese, German, French, Italian, traditional Chinese, simplified Chinese, Danish, Dutch, Finnish, Norwegian (Bokmal), Spanish, Swedish, Brazilian, Portuguese, Czech, Greek, Hungarian, Polish, Romanian
Help content | English, Japanese, German, French, Italian, traditional Chinese, simplified Chinese, Danish, Dutch, Finnish, Norwegian (Bokmal), Spanish, Swedish, Brazilian, Portuguese, Czech, Greek, Hungarian, Polish, Romanian
Community | English, Japanese, German, French, Italian, Spanish, traditional Chinese, Korean, Russian
Office desktop setup | English, Japanese, German, French, Italian, traditional Chinese, simplified Chinese, Danish, Dutch, Finnish, Norwegian (Bokmal), Spanish, Swedish, Brazilian, Portuguese, Czech, Greek, Hungarian, Polish, Romanian

 

9.6 Licensing

Office 365 for professionals and small businesses (Plan P1) is designed for organizations with 1 to 25 users, although you may purchase up to 50 user licenses. You can add or remove users at any time, up to the 50-user maximum.

Office 365 for professionals and small businesses is not available under Microsoft Volume Licensing. Subscriptions are available on a month-to-month basis and automatically renew each month. You can cancel at any time with no early termination fee.

Office Professional Plus can be licensed separately from Office 365 for professionals and small businesses.

9.7 Buying your Office 365 Subscription

Office 365 for professionals and small businesses gives you the option to sign up for a 30-day trial period or for a paid subscription. Before signing up, you will be required to accept the Microsoft Online Subscription Agreement (MOSA).

The trial is a free period so you can experience Office 365 without purchasing a subscription. The trial provides the full functionality of Plan P1, except that it is limited to 10 users. Customers have a couple of options during or at the end of the trial period:

  • Convert an existing trial to a paid subscription: If you choose to convert your trial subscription to a paid subscription of the same plan, end users on your trial subscription are automatically transferred (with their data) to the paid subscription.

    Figure 6: Click from within Office 365 to purchase during the 30-day trial period

  • Purchase a new paid subscription: If you choose to purchase a new paid subscription on a different plan, unrelated to your trial subscription, you need to manually assign users to the paid subscription. Purchasing a new paid subscription does not automatically move their data.

Your subscription term will begin on the day you convert to or purchase the paid plan subscription. Your first bill will occur on the first day of your subscription and subsequent bills will occur on the same day of each subsequent month. Your subscription will auto-renew each month unless you cancel.

 

Canceling your Office 365 Subscription

You can cancel your Office 365 subscription at any time without a penalty. Cancellation is available through the portal under the manage subscriptions tab.

After cancellation, the subscription/service remains in an active state until the end of the month. At the end of the month in which the subscription is canceled, the account enters a 7-day grace period. During the grace period, a warning message is displayed in the portal, but end users can continue to access the service. After the 7-day grace period, the service goes into a 90-day disabled state. During the disabled state, end users cannot access the service; the administrator can still access the service and retrieve data.
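As a rough illustration of this timeline (not an official tool; the function and example date are hypothetical), the following sketch computes the key dates that follow a cancellation, based on the description above:

    from datetime import date, timedelta
    import calendar

    def cancellation_timeline(cancel_date):
        """Sketch of the cancellation timeline described above (illustrative only).

        Service stays active until the end of the cancellation month, then a
        7-day grace period, then a 90-day disabled state during which only the
        administrator can access the service and retrieve data.
        """
        last_day_of_month = date(
            cancel_date.year,
            cancel_date.month,
            calendar.monthrange(cancel_date.year, cancel_date.month)[1],
        )
        grace_period_ends = last_day_of_month + timedelta(days=7)
        disabled_state_ends = grace_period_ends + timedelta(days=90)
        return {
            "active_until": last_day_of_month,
            "grace_period_ends": grace_period_ends,
            "disabled_state_ends": disabled_state_ends,
        }

    # Example: a subscription canceled on 10 June 2011
    print(cancellation_timeline(date(2011, 6, 10)))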

Visit www.Office365.com for the latest pricing information.

9.8 Microsoft Office 365 Marketplace

The Microsoft Office 365 Marketplace is specifically designed to help customers find trusted Microsoft Office 365 experts as well as applications and services that enhance and easily integrate with the Microsoft Office 365 suite of products. For example, customers can find a partner to purchase a custom domain to associate with their Office 365 website and email or audio conferencing providers to add dial-in phone numbers to Lync online meetings. Partners can also help migrate data and set up Office 365 services, so customers can get up and running more quickly.

Visit the Microsoft Office 365 Marketplace at http://office365.pinpoint.microsoft.com.

9.9 Service Level Agreement

Microsoft Online Services guarantees 99.9 percent uptime for all paid Office 365 subscriptions. These service levels are financially backed: if Microsoft does not meet the terms of the Service Level Agreement (SLA), you are eligible to receive service credits equal to a percentage of your total monthly bill.

The following are the service credit tiers for SLA violation:

Monthly Uptime Percentage | Service Credit
< 99.9% | 25%
< 99% | 50%
< 95% | 100%
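For illustration only, a minimal sketch of how these credit tiers map an observed monthly uptime percentage to a credit. The function name is hypothetical, and this is not the official credit calculation; credits are governed by the SLA terms.

    def service_credit_percent(monthly_uptime_percent):
        """Map a monthly uptime percentage to the SLA credit tiers listed above.

        Illustrative only; the actual credit process is defined by the SLA.
        """
        if monthly_uptime_percent < 95.0:
            return 100
        if monthly_uptime_percent < 99.0:
            return 50
        if monthly_uptime_percent < 99.9:
            return 25
        return 0  # SLA met; no credit

    print(service_credit_percent(99.5))  # 25
    print(service_credit_percent(98.0))  # 50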

 

 

9.10 Data Center Locations

Microsoft data centers are strategically located throughout the world to provide more secure and seamless, around-the-clock access to your data. Data is replicated to a secondary backup data center within the region to help protect against failures across entire data centers. When your company signs up for Office 365, its hosted environment is automatically provisioned in the appropriate data center based on your company’s address. All users for the company are hosted from the same region.

Appendix A: Exchange Online Detailed Feature Summary

This section presents overviews of Exchange Online features and specifications.

Feature | Description
Mailbox size | 25 GB
Message size limits (max email size) | 25 MB
Recipient limits | 1500 recipients/day for each cloud-based mailbox
Deleted item recovery | 14 days
Deleted mailbox recovery | 30 days

CLIENT ACCESS
Outlook 2010 support | Yes
Office Outlook 2007 support | Yes
Outlook Anywhere (RPC over HTTPS) | Yes
Outlook Web App Premium experience | Internet Explorer 7+, Safari 3+, Firefox, Chrome
Outlook Web App Light experience | Most other browsers not supported in the Outlook Web App Premium experience
Outlook Web App: session time-out | 6 hours
WebReady document viewing | Yes
Instant messaging and presence integrated into web email client | Yes, with Lync Online
Macintosh support (rich client) | Outlook 2011 for Mac
IMAP | Yes
POP | Yes

MOBILITY
Mobile devices | Windows Mobile and Windows Phone, Nokia E and N series devices, Palm devices, Apple iPhone and iPad, and BlackBerry (using BlackBerry Internet Service)
Remote device wipe (implementation varies by mobile device manufacturer) | Yes
Disable Exchange ActiveSync access | Yes
Mobile device allow/block/quarantine | Yes
Mobile SMS sync (through Exchange ActiveSync) | Yes
SMS (text messaging) notifications | Yes

EMAIL/INBOX
“Send on behalf of” and “send as” | Yes
Shared mailboxes | Yes
Inbox rules | Yes
Tasks | Yes
Conversation view and actions (such as ignore conversation) | Yes
Connected accounts (aggregate mail from multiple external email accounts) | Yes

CONTACTS/DIRECTORY
Personal contacts | Yes
Personal distribution groups | Yes
Offline Address Book | No
Global Address List (GAL) photos | Yes
External contacts (in GAL) | Yes

CALENDAR
Out-of-office auto replies | Yes
Federated calendar sharing | Yes
Side-by-side calendar view in web client | Yes

SECURITY
Anti-spam (AS) | Forefront Online Protection for Exchange
Antivirus (AV) | Forefront Online Protection for Exchange AV for inbound/outbound, Forefront AV internal

COMPLIANCE/ARCHIVING
Disclaimers | No
Personal archive | Yes

ADMINISTRATION
Administration through a web-based interface (Exchange Control Panel) | Yes

APPLICATION ACCESS/CUSTOMIZATION
Application connectivity through web services | Yes
SMTP relay | Yes
Outlook Web App Web Parts | Yes
Outlook add-ins and Outlook MAPI | Yes

 

 

 

Appendix B: SharePoint Online Detailed Feature Summary

This section presents overviews of SharePoint Online features and specifications.

SharePoint Online feature overview

Feature | Description
Storage | 10 GB with additional 500 MB per user
Buy additional storage | No
Max Org Users | 50
Partner Access Licenses (External Sharing) | Yes – up to 500 external users/month
File upload limit | 250 MB
Works with Microsoft Office 2010 | Access 2010, Excel 2010, Outlook 2010, OneNote 2010, PowerPoint 2010, Microsoft SharePoint Designer 2010, Word 2010, SharePoint Workspace 2010
Browser support | Internet Explorer 7, Internet Explorer 8, Firefox 3, Safari 3.1.2 on Macintosh OS X 10.5
Mobile device support | Windows Mobile 6.5.x, Nokia E series and N series devices, Apple iPhone 2.0
Team Sites | Yes
Simple Public-Facing Website | Basic public site included, vanity URLs (custom domains) are supported
Site Designer | Yes
Sandbox Solutions (Partially Trusted Code) | Yes
Access Services | Yes
Office Web Apps | Available (both read and write access); view only for invited external users

 

Features of Microsoft SharePoint

  • Access Services
  • Accessibility
  • Audience Targeting
  • Basic Sorting
  • Best Bets
  • Blogs
  • Browser-Based Customizations
  • Client Object Model (OM)
  • Cross-Browser Support
  • Discussions
  • Duplicate Detection
  • Enterprise Scale Search
  • External Sharing – Partner Access
  • Improved Governance
  • Integration with Lync Online and Exchange Online
  • Language Integrated Query (LINQ) for SharePoint
  • Large List Scalability and Management
  • Microsoft Visual Studio® 2010 SharePoint Developer Tools (to build and package Sandbox Solutions)
  • Mobile Connectivity & Search
  • Multilingual User Interface
  • Multiple Team Site Templates
  • Office and Office Web Apps integration
  • Out-of-the-Box Web Parts
  • Permissions Management
  • Phonetics and Nickname Search
  • Photos and Presence
  • Recently Authored Content
  • Ribbon and Dialog Framework
  • Sandboxed Solutions
  • Search Scopes
  • Service Application Platform
  • SharePoint Designer
  • SharePoint Lists
  • SharePoint Ribbon and Fluent UI
  • SharePoint Service Architecture
  • SharePoint Workspace
  • Silverlight Web Part
  • Simple Public-Facing Website
  • Single Site Collection Search
  • Support for Accessibility Standards
  • View in Browser within Search Results
  • Wikis
  • Windows 7 Support
  • Workflow
  • Workflow Models

Enterprise-oriented SharePoint capabilities such as My Sites, Custom-code Workflows, InfoPath Forms Services, Excel Services, Visio Services, Business Connectivity Services, Advanced Web Analytics, and full-trust code are not included in Office 365 Plan P1.