Dataverse Clone Wars: Bulk-Merging Duplicates with SSIS (KingswaySoft Guide)

A long time ago in a database far, far away… 😁
It is a time of chaos. The mighty Dataverse is under siege by an overwhelming army of duplicate records. Accounts and contacts have been cloned endlessly, threatening to destabilize the CRM galaxy.
Armed only with an SSIS (SQL Server Integration Services) package, a brave data engineer rises to bring balance to the system. Their mission: to bulk merge duplicates, unify scattered data, and restore peace to the Dataverse.
At this point, you might be asking yourself:
“Is SSIS really the only choice? I never heard of SSIS, but Power Automate always makes my dreams come true.”
And I tell you: no. Power Automate can do it, but you might want to go for a walk while it finishes…
TL;DR:
Use the ready-made KingswaySoft SSIS package to bulk-merge duplicates in Dataverse.
Connect via SOAP (or OAuth), define your match logic (e.g., emailaddress1), adjust comparison settings, preview results with Data Viewer, and always test safely in a sandbox before going full Jedi.
Why not do it in Power Automate?
Power Automate can trigger actions that merge records. I tried this myself, but making it efficient is a challenge I’ve yet to accept.
The main issue isn’t functionality — it’s scale. When you’re comparing thousands of records, Power Automate becomes more of a Padawan than a Jedi Master.
So if your duplicates are manageable in number, Power Automate might still be the right tool. I recommend this article:
👉 Automatic Duplicate Account Merge in Dynamics 365 Sales with Power Automate
But if you’re facing 50,000+ contacts or accounts, SSIS will handle the load like a true Sith Lord of data.
Requirements
To prepare your SSIS package for battle, you’ll need the following tools:
- Visual Studio 2019/2022 (Professional or Enterprise edition)
- SQL Server Data Tools (SSDT) for Visual Studio
- KingswaySoft SSIS Productivity Pack
- KingswaySoft SSIS Integration Toolkit for Microsoft Dynamics 365
If it’s your first time working with SSIS — I’m proud of you! 🫡
Start here: How to install SSIS, KingswaySoft, and Visual Studio for Dynamics 365
⚠️ Safety first
- Run in a sandbox environment first.
- Export a CSV of candidate duplicates (source + master).
- Keep an audit log of merges (winnerid → loserid) so you can restore data if needed.
- Start small: test with Top 100 records before a full-scale run.
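The audit log and the "start small" advice can be sketched in a few lines of Python. This is only an illustration outside of SSIS: the `write_merge_audit` helper, the sample IDs, and the in-memory buffer are assumptions of mine, not part of the KingswaySoft package.

```python
import csv
import io

def write_merge_audit(pairs, out):
    """Write one row per planned merge: surviving record first, merged-away record second."""
    writer = csv.writer(out)
    writer.writerow(["winnerid", "loserid"])
    for winner_id, loser_id in pairs:
        writer.writerow([winner_id, loser_id])

# Hypothetical candidate pairs; in practice they would come from your duplicate export.
candidate_pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]

# Start small: keep only the first 100 pairs for the test run.
test_batch = candidate_pairs[:100]

# An in-memory buffer keeps the sketch self-contained; swap in open("audit.csv", "w", newline="").
buffer = io.StringIO()
write_merge_audit(test_batch, buffer)
audit_csv = buffer.getvalue()
```

Keeping the winner and loser IDs side by side gives you something to search for later if a merge turns out to be wrong.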
The Battle
Once we’re all set, the good news is that the nice folks at KingswaySoft already offer a packaged solution to your problem. You just have to set it up 😊
Click here to download the KingswaySoft SSIS Package
Open Visual Studio and create a new project: Integration Services Project.
Ok, now unzip the KingswaySoft package and add it to your project. In the Solution Explorer, right-click SSIS Packages → Add Existing Package.

Choose File System as the Package Location, and then browse to the unzipped package path.

When you see “Succeeded in upgrading the package”, it means you’re on the right track 😉
Next, open the MergeCRMDuplicates.dtsx package, and double-click the Find and Merge CRM Duplicate Contacts task in the Control Flow. You’ll land right in the command center:
Setting up the Connection
Then click on “Dynamics CRM Connection Manager” and set up the connection to your Dynamics environment:
Connection tip: using the Dynamics 365/Dataverse Connection Manager with OAuth (Web API) may seem more appealing than SOAP, but in my experience KingswaySoft’s SOAP connection is the more mature of the two. I have had cases where logic built in SSIS worked perfectly with SOAP but not with the Web API. For reference: Using the CRM/Dataverse/CDS Connection Manager
The Duplicate Logic
In my example, contacts are duplicated by email address.
The KingswaySoft package already retrieves all contacts, so I opened the Duplicate Detection on name component and renamed it to Duplicate Detection on email for clarity.
Here you can define:
- The column to match on (e.g., emailaddress1)
- The match type (Exact Match, Fuzzy Match, etc.)
- The similarity threshold (percentage of similarity to be considered a match)
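To get a feel for what a similarity threshold does, here is a minimal Python sketch. It uses the standard library's `difflib.SequenceMatcher` as a stand-in; I don't know which algorithm the KingswaySoft component uses internally, so treat the scores and the 0.9 threshold as illustrative assumptions only.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Return a score between 0.0 (nothing in common) and 1.0 (identical strings)."""
    return SequenceMatcher(None, a, b).ratio()

def is_match(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two values as duplicates when their similarity reaches the threshold."""
    return similarity(a, b) >= threshold

# Identical values always score 1.0, which is what exact matching boils down to.
exact = similarity("luke@rebels.org", "luke@rebels.org")

# A one-character typo still clears a 0.9 threshold; an unrelated address does not.
typo_matches = is_match("luke@rebels.org", "luke@rebel.org")
stranger_matches = is_match("luke@rebels.org", "vader@empire.com")
```

The higher the threshold, the fewer false merges you risk, at the cost of missing near-duplicates.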
The real power of this step lies in the comparison settings. These options normalize your column values before matching — for example:
- Ignore extra spaces
- Treat uppercase/lowercase as equal
- Trim trailing punctuation
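The three normalizations above can be sketched in Python to show why they matter before matching. This is my own illustration of the idea, not the component's implementation; the `normalize` and `group_duplicates` helpers and the sample contacts are hypothetical.

```python
from collections import defaultdict

def normalize(value: str) -> str:
    """Collapse extra spaces, strip trailing punctuation, and lowercase the value."""
    collapsed = " ".join(value.split())          # ignore extra spaces
    return collapsed.rstrip(" .,;:").lower()     # trim trailing punctuation, ignore case

def group_duplicates(contacts):
    """Group contact IDs by normalized email and keep only groups with more than one record."""
    groups = defaultdict(list)
    for contact in contacts:
        email = contact.get("emailaddress1")
        if email:  # contacts without an email cannot match on it
            groups[normalize(email)].append(contact["contactid"])
    return {email: ids for email, ids in groups.items() if len(ids) > 1}

# Hypothetical sample data
contacts = [
    {"contactid": "1", "emailaddress1": "  Luke@Rebels.org "},
    {"contactid": "2", "emailaddress1": "luke@rebels.org."},
    {"contactid": "3", "emailaddress1": "leia@rebels.org"},
]
duplicate_groups = group_duplicates(contacts)
```

Without normalization, records 1 and 2 would not be an exact match even though they clearly belong to the same person.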
You can also set a Ranking Strategy — this determines which record “wins” when multiple duplicates are found.
Translation: when multiple records match, the ranking defines which one becomes the master.
By default, it uses Similarity Score, but when you use exact matching, they’ll all have a score of 1.
I like to set a custom strategy where the oldest record wins, assuming it’s the original.
In other cases, I prioritize contacts that aren’t “marketing-only” — merging those into the primary one.
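As a rough sketch of such a custom ranking, the Python below picks the master from a group of duplicates. Combining both rules in one sort key is my assumption (the two strategies above are used in different cases), and the `createdon` and `marketingonly` field names are assumed to match your contact columns.

```python
from datetime import datetime

def pick_master(duplicates):
    """Prefer contacts that are not marketing-only, then the oldest by creation date."""
    return min(
        duplicates,
        key=lambda c: (c.get("marketingonly", False), datetime.fromisoformat(c["createdon"])),
    )

def plan_merges(duplicates):
    """Return (winnerid, loserid) pairs: every other record merges into the master."""
    master = pick_master(duplicates)
    return [(master["contactid"], c["contactid"]) for c in duplicates if c is not master]

# Hypothetical duplicate group
group = [
    {"contactid": "new", "createdon": "2023-01-01T00:00:00", "marketingonly": False},
    {"contactid": "old", "createdon": "2019-05-04T00:00:00", "marketingonly": False},
    {"contactid": "oldest_mkt", "createdon": "2015-01-01T00:00:00", "marketingonly": True},
]
merge_plan = plan_merges(group)
```

Here "old" wins: it is the oldest record that is not marketing-only, even though "oldest_mkt" was created earlier.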

Testing and Merging the data
Want to see what’s happening before executing the merge?
Right-click on a connector line (a data flow path) in your data flow and select Enable Data Viewer.
This lets you preview the matching results, review scores, and ensure your logic works as expected.

Now, let the package run — you’ll see the number of merged contacts rising heroically as duplicates fall one by one.

Final Pointers
You merged contacts you weren’t supposed to and you can’t find them?
Don’t panic. They’re not gone — just inactivated.
You can filter by “Last Modified” to the date/time you ran the SSIS package and find them easily.
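If you prefer to look them up outside the UI, the same filter can be expressed as a Dataverse Web API query. The sketch below only builds the URL and does not send anything. The environment URL and the `merged_contacts_query` helper are placeholders of mine, and it assumes inactive contacts have `statecode` 1.

```python
from datetime import datetime, timezone
from urllib.parse import quote, unquote

def merged_contacts_query(base_url: str, run_started: datetime) -> str:
    """Build a query for inactive contacts modified since the package run started."""
    stamp = run_started.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    odata_filter = f"statecode eq 1 and modifiedon ge {stamp}"
    return (
        f"{base_url}/api/data/v9.2/contacts"
        f"?$select=fullname,emailaddress1,modifiedon"
        f"&$filter={quote(odata_filter)}"
    )

# Hypothetical environment URL and run time
query_url = merged_contacts_query(
    "https://example.crm.dynamics.com",
    datetime(2024, 5, 4, 12, 0, tzinfo=timezone.utc),
)
readable = unquote(query_url)  # decoded form, easier to eyeball
```

Cross-checking the results against your audit log tells you which of those inactive contacts were actually merged by the package.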

You want to unmerge them?
Simply reactivate the contact. It won’t be linked anymore — though relationships or field values may already have been consolidated.
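For more than a handful of records, reactivation could also be scripted against the Web API. The sketch below only prepares the request and never sends it. The environment URL, contact ID, and token are placeholders, and it assumes the default Active state for contacts (`statecode` 0, `statuscode` 1).

```python
import json
from urllib.request import Request

def build_reactivate_request(base_url: str, contact_id: str, access_token: str) -> Request:
    """Prepare a PATCH request that sets a contact back to Active."""
    body = json.dumps({"statecode": 0, "statuscode": 1}).encode("utf-8")
    return Request(
        url=f"{base_url}/api/data/v9.2/contacts({contact_id})",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",  # token acquisition is out of scope here
            "Content-Type": "application/json",
        },
    )

# Hypothetical values; pass the request to urllib.request.urlopen to actually send it.
reactivate_request = build_reactivate_request(
    "https://example.crm.dynamics.com",
    "00000000-0000-0000-0000-000000000001",
    "<access-token>",
)
```

As noted above, this only reactivates the record. Any values or relationships that were consolidated into the master stay where they are.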
Conclusion
You might be thinking: “How many times did he write KingswaySoft?”
It was 10 times, counting this one. And no, I don’t work for them 😁
But their SSIS solution is truly plug-and-play. You don’t need to configure much to wipe out duplicates efficiently.
Even though SSIS isn’t the newest tool in the Power Platform arsenal, it remains a battle-tested solution and a reference for Dataverse data migrations and cleansing.
Did you do something similar with duplicates? Have a better approach? Or did you run into issues setting this up?
I’d love to hear your story — email me here or tag me on LinkedIn! 🚀