Sunday, February 28, 2010

Errors moving mailboxes to an Exchange 2010 DAG

I ran into this today while building my first production DAG, which has a database copy across a fairly slow connection.

Error: Move for mailbox '/o=First Organization/ou=First Administrative Group/cn=Recipients/cn=user' is stalled because DataMoveReplicationConstraint is not satisfied for the database 'Database' (agent MailboxDatabaseReplication). Failure Reason: Database a409ab86-ce24-4fcf-bd2a-14fd633090aa does not satisfy constraint SecondCopy. Some database copies are behind.

Sure enough, a quick check of Get-MailboxDatabaseCopyStatus showed that my CopyQueueLength was fairly high on the server across the WAN. As a result, my mailbox moves were failing with the above error. They don't fail right away, however: the StatusDetail first shows StalledDueToHA. Some stayed there for up to two hours, waiting for log shipping to catch up on the remote server, before failing.

To show a more detailed output on move progress, I was using:
Get-MoveRequest | Get-MoveRequestStatistics | ft displayname,*stat*,perc*,totalmailboxsize

So what Exchange 2010 is doing here is smart. Exchange Active Manager doesn't want that CopyQueueLength to be over 10 files, or the replay queue length over 50. More constraints here.
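As a rough sketch, the copies that are over those two numbers can be listed from the shell. The database name 'Database' is the one from the error message above, and the cutoffs are simply the figures quoted in this post, not values read back from Exchange:

```powershell
# Thresholds as quoted in this post (assumed, not queried from Exchange).
$maxCopyQueue   = 10
$maxReplayQueue = 50

# List every copy of the database whose queues exceed either threshold.
Get-MailboxDatabaseCopyStatus -Identity 'Database' |
    Where-Object { $_.CopyQueueLength -gt $maxCopyQueue -or $_.ReplayQueueLength -gt $maxReplayQueue } |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```

An empty result means no copy is currently past either number.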

The workaround is to disable this limit, so your moves can proceed while the seeding catches up over time. This is one of three Microsoft recommended fixes: the other two are to fix your database copy health or to upgrade your WAN. This third option is only a workaround, and the setting should be restored after the initial mailbox moves.

Set-MailboxDatabase -Identity "Database" -DataMoveReplicationConstraint None

The default here is SecondCopy. More information on the other settings is at the link above. This change DOES require a restart of the Exchange Replication Manager service. Be forewarned: if you already have a queue length, the replication manager will hang while stopping and attempt to complete the copies first, so it might take some time.

Of course, once your moves are done, and your database's CopyQueueLength is normalized, you should re-enable this constraint using:

Set-MailboxDatabase -Identity "Database" -DataMoveReplicationConstraint SecondCopy
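If you would rather not watch the queues by hand, something like the following could wait for them to drain and then restore the constraint. This is only a sketch: it assumes the mailbox moves have already finished, the five minute polling interval is arbitrary, and the 10/50 cutoffs are the figures quoted earlier in this post.

```powershell
$db = 'Database'   # database name taken from the error message above

do {
    Start-Sleep -Seconds 300   # arbitrary polling interval (assumption)
    # Collect any copies that are still behind the quoted thresholds.
    $behind = @(Get-MailboxDatabaseCopyStatus -Identity $db |
        Where-Object { $_.CopyQueueLength -gt 10 -or $_.ReplayQueueLength -gt 50 })
} while ($behind.Count -gt 0)

# All copies have caught up, so put the default constraint back.
Set-MailboxDatabase -Identity $db -DataMoveReplicationConstraint SecondCopy
```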


Wednesday, November 25, 2009

Exchange 2010 - Moving mailboxes Video

I made this as a quick training piece for internal use and for customers. Check it out!




Tuesday, November 10, 2009

Exchange 2010 Storage Calculator released!!

YES! For those planning on moving to Exchange 2010, you can now begin planning your infrastructure's hardware!

Exchange team has the latest:
http://msexchangeteam.com/archive/2009/11/09/453117.aspx

I personally have 3 clients awaiting this - very excited to run through it!

I haven't read all of it yet - will re-post tomorrow with any findings of note.


Wednesday, October 21, 2009

Exchange 2010 - Database Availability Groups - Part 2

In Part 1 of my DAG coverage, I gave some generic design considerations and showed how to configure a 2 server DAG. In Part 2, I will cover adding an additional copy of a database to the DAG.

Before you begin, you need to rapidly deploy your OS and pre-requisites, and then you can deploy an additional DAG copy, with a few mouse clicks!

First, add your new server to the DAG:
Then, select the server we are adding to the DAG:

Then, add a database copy in the database management section of Organization Management:



And after some time seeding:
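For reference, the GUI steps above (joining the new server to the DAG, then adding a database copy on it) have shell equivalents. A minimal sketch, assuming the DAG is still named GEODAG and the database is still geodb1 as in Part 1, and that the new server is EXCH2010EU:

```powershell
# Join the new server to the existing DAG.
Add-DatabaseAvailabilityGroupServer -Identity 'GEODAG' -MailboxServer 'EXCH2010EU'

# Create a copy of the database on the new member; seeding starts automatically.
Add-MailboxDatabaseCopy -Identity 'geodb1' -MailboxServer 'EXCH2010EU'

# Check the seeding and replication state of just that copy.
Get-MailboxDatabaseCopyStatus -Identity 'geodb1\EXCH2010EU' |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength
```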

Now, let's test that UK server! First, I used OWA to send myself a recent email.

Then activate the EU server:

Choosing a lossless activation:
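The same activation can also be requested from the shell. A sketch, assuming the database is geodb1 and the target is EXCH2010EU:

```powershell
# Activate the copy on EXCH2010EU, mounting only if no logs would be lost.
Move-ActiveMailboxDatabase -Identity 'geodb1' -ActivateOnServer 'EXCH2010EU' -MountDialOverride Lossless
```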

Now, this being a different AD site, I saw this:


This was very concerning at first, but after some time (I have my VMs scaled down quite a bit) it synced up:


Confirming via the OWA CAS on EXCH2010EU, my latest email was intact:

That's it. We seeded, and failed over to the UK site. Now we can fail back to the main server:

I performed another lossless activation, and it worked perfectly:


And before I end part two, let's try and BREAK it - here I am hitting the hard power button on the activated DB server for geodb1.


After re-attaching RDP to the next server, and refreshing the view:


The local server mounted, and the UK site is healthy still. OK, let's plug in the power cord I "tripped" over and see how things go.
A few seconds after boot up:

At which point we can choose to move it back, or let Exchange manage it autonomously.


Tuesday, October 13, 2009

Exchange 2010 - Database Availability Groups - Part 1

This is part one of a two part series on Database Availability Groups, which is the Exchange 2010 High Availability and Disaster Recovery technology.


DAGs are similar to Exchange 2007's SCR and CCR technologies, but mixed together, along with a few key benefits like:


Incremental Deployments - You can start with a single server environment and add hardware later to phase in a DR plan or additional high availability. However, you would need to deploy Exchange 2010 on Windows 2008 R2 Enterprise, as DAGs require the Failover Clustering role. In a lab, I wouldn't mind an in-place Standard to Enterprise upgrade, but your opinion may differ on such an upgrade on a server containing production data.


Automated Cluster Management - In Exchange 2007, CCR Cluster management needed to be pre-configured on server nodes, and there was additional complexity and confusion on what tasks you performed in Cluster Administrator, and which in Exchange Management Console or Shell. In Exchange 2010, this is ALL performed from native Exchange tools, and most can be done from a GUI.


Combined HA and DR plan - In Exchange 2007 CCR was the High Availability solution, while SCR was the Disaster Recovery solution. In order to effectively get both, deploying both was required. There was a combined solution known as "Stretched CCR" where a node would be in a different physical site, but this was fairly complicated and had high bandwidth requirements.


Lower Bandwidth requirements - In Exchange 2007, SCR used SMB (Server Message Block) for transaction log shipping. SMB is well known as being a "fat" network protocol. In Exchange 2010, Exchange Administrators can choose a single TCP port (configurable) for direct TCP connectivity for log shipping.


Less PowerShell - If you checked out the Exchange 2007 SCR article I recently published, you saw there is a LOT of PowerShell, which is why I wrote the article - there are a lot of steps and places for confusion, mistyping, and mistakes. And in the event of a disaster, that can be pretty harrowing. The Exchange 2010 DAG process can of course be managed from PowerShell, but PowerShell is not a requirement. Most of the screenshots I post here were taken in the GUI.


There is a LOT to be learned about scaling and planning DAGs, and I am leaving this to Microsoft for now. Below are some valuable links on DAGs, DAG planning, and planning disks for your member servers in a DAG.

Planning for High Availability and Site Resilience - the requirements section here is pretty much required reading if you are considering implementing Database Availability Groups in your environment.
Mailbox Server Storage Design Recommendations - at the time of this posting, this was marked as "in tech review" - about halfway down you can tell the article is talking about Exchange 2007 more than 2010. There is also still no Exchange 2010 storage calculator; this will hopefully come soon.
Deploying High Availability and Site Resilience - much of this how-to is based on information here.

First off, my environment - Exchange 2010 is ONLY 64 bit, so of course, that is the architecture in play here.

Server       OS                           LAN Network      Replication Network
EXCH2010     Windows 2008 R2 Enterprise   192.168.201.62   10.1.1.62
EXCH2010DR   Windows 2008 R2 Enterprise   192.168.201.63   10.1.1.63


Currently, there is one database on EXCH2010 named geodb1 - it contains a single user mailbox. As you can see, I configured this database with non-default paths. While using the default path is fine, if you ever need to specify a path in PowerShell, it's easier if it's not 50 characters of path to type.



Whatever server is the witness needs to be a member of the Exchange Servers domain group, or you will receive an error message. If you are using a domain controller as your file share witness (FSW), you need to read this article by Devin Ganger. You need the DC's computer account to be a member of the Exchange Trusted Subsystem group, and your Exchange Trusted Subsystem group must be a member of your domain's BUILTIN\Administrators group. Thanks to Tom Pacyk for this information! If you receive the warning "Insufficient permissions to access file shares on witness server", please review Tom's article.



Once we create the DAG, we can now add servers to the DAG.


Adding both servers is as simple as right clicking the DAG and choosing "Manage Database Availability Group membership." This process may take a while depending on hardware, because this is the step where Failover Clustering is installed and configured on both servers.
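The DAG creation and membership steps can also be done from the shell. In this sketch the DAG name GEODAG and the two mailbox servers come from this lab, but the witness server name and witness directory are hypothetical placeholders, since the post does not name them:

```powershell
# Create the DAG. 'FSW01' and 'C:\DAGWitness' are hypothetical placeholders.
New-DatabaseAvailabilityGroup -Name 'GEODAG' -WitnessServer 'FSW01' -WitnessDirectory 'C:\DAGWitness'

# Add both mailbox servers; Failover Clustering is installed during this step.
Add-DatabaseAvailabilityGroupServer -Identity 'GEODAG' -MailboxServer 'EXCH2010'
Add-DatabaseAvailabilityGroupServer -Identity 'GEODAG' -MailboxServer 'EXCH2010DR'
```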


This process also created the DAG networks for me, and now my DAG screen looks like this:



I do not want replication enabled on the 192.168.201.x network; I want it explicitly on the 10.1.1.x network. So I disabled replication for DAGNetwork01, and also told the DAG to ignore the network using PowerShell:

Set-DatabaseAvailabilityGroupNetwork -Identity GEODAG\DAGNetwork01 -IgnoreNetwork:$true -ReplicationEnabled:$false

Now we can add a database to our DAG!

Then right click the database you want a copy of and choose "Add Mailbox Database copy" and choose the server you want to add a copy on.





Once this completes, you can see the database re-synchronizing and then become Healthy:
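A shell version of the same step might look like the sketch below, using the database and server names from this lab:

```powershell
# Add a copy of geodb1 on the second DAG member.
Add-MailboxDatabaseCopy -Identity 'geodb1' -MailboxServer 'EXCH2010DR'

# Watch all copies of the database until the new one reports Healthy.
Get-MailboxDatabaseCopyStatus -Identity 'geodb1' |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```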




Here, I decided to reboot EXCH2010, and refreshed the EMC console on EXCH2010DR:



We can see that EXCH2010 failed, and EXCH2010DR now has my data. I logged into OWA and confirmed.

When EXCH2010 came back up, I waited and refreshed again:



Healthy! Very neat.

At this point, we can manage where the data is with the right click context menu on DB copies:



After choosing one, we have several options:



In an "available" scenario Lossless is the best choice.



We can see the status flip and copy queue stack in the other's favor.

In part two of this, I will add a third Exchange server at a DR location to add some DR planning and also show how this affects DAG network planning. In addition, I will hopefully dig in to more failure scenarios and how to address them. Keep in mind, like I said - this is a pretty basic DAG implementation, and no real planning of physical architecture lying underneath it.

Continue Part Two


Thursday, July 09, 2009

Exchange 2010 Mailbox Migration Overview

I decided to dogfood Exchange 2010 internally at Simpler-Webb. This is the pilot group migration. If you are a customer, you may recognize some names here. I did this via the GUI, but PowerShell can be used as well. The best part of mailbox moves, of course, is that they are ONLINE moves if you are moving from Exchange 2007 SP2 or another 2010 server. One caveat: if you have to move anyone back to 2007, that will be an offline move!

First, select your users in the EMC, and then select the "New Local Move Request" action


This then re-displays your selected users, and allows you to select the mailbox database.

Skipped screen - asking about corrupted items, skipping messages - same screen on Exchange 2007 move-mailbox GUI.
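For those who prefer the shell, the wizard steps so far correspond roughly to New-MoveRequest. In this sketch the user aliases and the target database name are hypothetical placeholders, and -BadItemLimit is where the corrupted-item choice from the skipped screen would go:

```powershell
# Hypothetical pilot users and target database (placeholders, not from this post).
$pilotUsers = 'pilotuser1', 'pilotuser2'
$targetDb   = 'TargetDB'

foreach ($user in $pilotUsers) {
    # A BadItemLimit of 0 means the move is not allowed to skip any corrupted items.
    New-MoveRequest -Identity $user -TargetDatabase $targetDb -BadItemLimit 0
}
```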

This is the "Review" screen:

If you are used to seeing and waiting at this screen from 2007, when you click New here you will be pleasantly surprised.

Yep, 7 seconds. Great. So they are moving now. Hitting Finish, you now have to check in on progress. This is under the Recipient Configuration, under Move Request.

Under Recipient Configuration you can see the move request and status.

If you double click on one of these and view properties, you can get more detail on what the status is:


When user moves are completed, you get a status like this:


Armed with this, we can investigate any warnings, then select the requests, right click, and choose "Clear move request" to complete the process. The PowerShell cmdlet for this is Remove-MoveRequest.


Deleting it from here will remove the move request, but we can still view the history of the move using Get-MailboxStatistics with the -IncludeMoveHistory flag:
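A minimal sketch of both steps from the shell follows. The alias 'pilotuser1' is a hypothetical placeholder:

```powershell
# Clear only the requests that finished cleanly; requests that completed
# with warnings have a different status and are left for review.
# Remove-MoveRequest asks for confirmation for each request by default.
Get-MoveRequest -MoveStatus Completed | Remove-MoveRequest

# Show the stored move history for one mailbox after its request is cleared.
(Get-MailboxStatistics -Identity 'pilotuser1' -IncludeMoveHistory).MoveHistory | Format-List
```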

Hopefully this gives you the information you need to manage your initial pilot users successfully. Feel free to comment with your experiences.

Finally - while mailboxes are moving, rather than using the GUI to view the percentage complete, you can use the following PowerShell to output the status of your moves:

Get-MoveRequest | Get-MoveRequestStatistics | ft Alias,Status,PercentComplete -auto
