This article explains the steps involved in using the SharePoint Online Migration API and the factors that influence migration speed at each phase.
The four steps of migration
With the new SharePoint Online Migration API, we have drastically changed the way migrations are done, which also affects the speed you can expect when migrating your data. The goal of this article is not to give a detailed explanation of how the Migration API works, but to examine how much time is spent at each step of the migration process and what factors influence the speed.
Note: Steps 2, 3 and 4 are normally performed by using either the SharePoint Online (SPO) Migration PowerShell cmdlets or a third-party migration tool. It is important to perform Step 1 first so that you know which tool best fits your needs.
1. Scan the Source
The first rule of a good migration is to always know your source. Evaluate your data and triage what your needs are. What content really needs to move? What can be left behind? As you assess your data, you will get a better idea of the speed to expect in the subsequent steps. Use this time to clean up your archives, since the amount of content you are moving will determine the overall size of your project.
2. Package the content
This step is where the chosen tool creates a proper package for the content to be imported into the cloud. It corresponds to the New-SPOMigrationPackage and ConvertTo-SPOMigrationTargetedPackage cmdlets in the SPO Migration PowerShell cmdlets. The speed of this step depends on the efficiency of the tool and the type of content that you package. Splitting your packages in a smart way will also greatly improve the speed of the last step.
3. Upload to Azure
As you move content into SharePoint Online using the new Migration API, Azure is used as a temporary holding place. This step corresponds to the Set-SPOMigrationPackageAzureSource command when using PowerShell. Uploading to Azure is much faster, and you can choose your datacenter. If you have a good connection, you might want to choose the same datacenter location for your Azure and your O365 account. If your network is slow, consider using the Azure datacenter that is geographically closest to you. The last option is to ship physical hard drives to Azure. The speed of this step depends on your internet connection or the time it takes to ship the drives to Microsoft. Resources such as the Microsoft Azure Storage Performance and Scalability Checklist can provide a good idea of what to expect.
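As a rough illustration of how the internet connection bounds this step, the sketch below estimates upload time from package size and uplink speed. The function name, the decimal GB convention, and the 0.8 efficiency factor are assumptions made for this example, not measured values or part of any Microsoft tool.

```python
def upload_hours(size_gb: float, uplink_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to upload size_gb over an uplink_mbps link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable (protocol overhead, shared traffic); 0.8 is a guess.
    """
    if size_gb < 0 or uplink_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    size_megabits = size_gb * 1000 * 8  # decimal gigabytes -> megabits
    seconds = size_megabits / (uplink_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical case: 500 GB of packages over a 100 Mbps uplink.
    print(f"{upload_hours(500, 100):.1f} hours")
```

If the estimate runs to many days, that is a hint that a closer datacenter or shipping drives may be worth considering.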
4. The Migration API
The final step is the migration of data from Azure to SharePoint Online. This action is transparent when using a third-party tool, but corresponds to the Submit-SPOMigrationJob PowerShell command. Microsoft controls this step, so we go into more detail below on what to expect.
Note: For the purpose of this article, each call made to the API to ingest a package into SharePoint Online is called a “Migration Job”.
Once the call is made to the API, the migration job is placed in the queue. Normally, migration jobs are picked up within 1 minute. The number of migration jobs running in parallel against the same O365 tenant can vary depending on the traffic, but 8 to 16 parallel Migration Jobs is what can be expected.
Speed of the API per Migration Job
Once the package is in Azure and the Migration Job has been picked up, here is a graph showing the observed speed of the API per Migration Job.
For example: If 16 packages are being imported at the same time with a mix of content types, the total reaches on average 2 x 16 = 32 GB/h. If there are only SharePoint list items, we normally observe 0.5 x 16 = 8 GB/h, which is still a lot of list items since they are small.
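The arithmetic in this example can be written as a small calculation. The sketch below uses only the per-job rates quoted in the example (2 GB/h for mixed content, 0.5 GB/h for list items); the function names and the 500 GB figure in the usage line are illustrative assumptions, and the estimate covers only this final step, assuming every parallel job stays busy.

```python
# Observed per-job rates from the example above, in GB per hour per job.
RATE_GB_PER_HOUR = {"mixed": 2.0, "list_items": 0.5}


def aggregate_rate(parallel_jobs: int, content_type: str) -> float:
    """Total GB/h when parallel_jobs Migration Jobs run at the same time."""
    if parallel_jobs < 1:
        raise ValueError("parallel_jobs must be at least 1")
    return parallel_jobs * RATE_GB_PER_HOUR[content_type]


def hours_to_migrate(total_gb: float, parallel_jobs: int, content_type: str) -> float:
    """Hours for the Azure-to-SharePoint step, assuming all jobs stay busy."""
    return total_gb / aggregate_rate(parallel_jobs, content_type)


if __name__ == "__main__":
    print(aggregate_rate(16, "mixed"))         # 2 x 16
    print(aggregate_rate(16, "list_items"))    # 0.5 x 16
    print(hours_to_migrate(500, 8, "mixed"))   # hypothetical 500 GB with 8 jobs
```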
Optimizing your migration
Planning is the key to optimizing your migration. Your goal when using the API is to keep as many migration jobs as possible running in parallel at all times to maximize your throughput.
Some tools already split the packages in a smart way, while others leave the splitting of the jobs up to you. It is important to look at the whole process and make sure to always address the bottleneck first. In some cases, the bottleneck will be your internet speed or the manual labor involved in preparing the content.
There will still be a limit to how many jobs can be run against the same site collection. This is why it is very important to run parallel jobs against different site collections as much as possible. Make sure you have pre-partitioned your site collections so that your content is evenly spread out.
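One way to think about spreading content evenly is a simple greedy assignment: place the largest content sets first, each time into the site collection that currently holds the least data. The sketch below is only an illustration of that idea; the source names, sizes, and site collection names are hypothetical, and real planning would also account for permissions, structure, and per-site limits.

```python
import heapq


def partition_evenly(sizes_gb: dict[str, float],
                     site_collections: list[str]) -> dict[str, list[str]]:
    """Greedily assign each source to the currently least-loaded site collection."""
    if not site_collections:
        raise ValueError("at least one site collection is required")
    # Heap entries are (current load in GB, tie-break index, site collection name).
    heap = [(0.0, i, name) for i, name in enumerate(site_collections)]
    heapq.heapify(heap)
    plan: dict[str, list[str]] = {name: [] for name in site_collections}
    # Largest sources first, so the big items are balanced before the small ones.
    for source, size in sorted(sizes_gb.items(), key=lambda kv: kv[1], reverse=True):
        load, order, name = heapq.heappop(heap)
        plan[name].append(source)
        heapq.heappush(heap, (load + size, order, name))
    return plan


if __name__ == "__main__":
    # Hypothetical sources (GB) and target site collections.
    sources = {"finance": 120, "hr": 80, "legal": 60, "marketing": 40, "archive": 100}
    for site, items in partition_evenly(sources, ["sc1", "sc2"]).items():
        print(site, items, sum(sources[i] for i in items), "GB")
```

Jobs for different site collections in the resulting plan can then be submitted in parallel, which is the situation the paragraph above recommends.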
Microsoft Consulting Services has already completed migrations for customers moving from SharePoint on-premises to SharePoint Online using the new Migration API. For one customer, the durations were as follows. The scanning phase took 2 weeks per source going into SPO. On top of this, they added another 3 weeks for enablement and remediation. Basically, doing a good job of analyzing the sources and cleaning up should take more than a month.
An observed speed when moving from on-premises to the cloud was about 500 GB per week. This includes the content move but also things like fixing issues as they occur and other kinds of remediation. Overall, the movement of the content accounts for a very small portion of the time spent on the migration.
However, when bringing a file share into SharePoint Online or OneDrive for Business, things tend to go much faster, since the content is less likely to hit SharePoint-specific issues and customizations. We have observed tenants bringing a 500 GB file share in a normal work day without even calling support for more parallel jobs.
CSOM and Throttling
While the API supports bringing content over, some interactions must still be made using CSOM. It is important to note that the API is not throttled, but CSOM is. We can’t reduce the throttling for CSOM, since it keeps the service healthy for everyone. We have seen successful large-scale migrations avoid being throttled by using the Migration API. They would have been throttled if they had used CSOM.