![](https://i.postimg.cc/Zqw1ZX29/1fd82ce7-ee0a-4903-9664-5336cbe61145.png)
File size: 2.8 MB
Duplicate records in databases not only cause high costs; they can also lead to many other problems.
Duplicate search is an indispensable and critical business process, not least when consolidating different data inventories, e.g. during a merger or when adding value from acquired data. This is why data quality is becoming increasingly valuable for every business.
Fuzzy Duplicate Search
When searching for duplicate records, you must distinguish between exact duplicates (identical values) and similar duplicates.
Grouping by exact matches is something any DBMS can do in seconds.
Detecting similar records, on the other hand, is hard to achieve and computationally intensive. It only succeeds with specialized tools. One problem is that without capable methods you cannot even estimate how many such duplicates exist in your data. You simply cannot see them.
On the Search for Similar Records
For years, phonetic algorithms (e.g. Soundex) have been used to find what sounds similar. This approach already yields some results that go beyond any exact comparison. However, permutations of the strings (such as transposed or reversed characters) are not considered, and too much emphasis is placed on the first letter.
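To make the two weaknesses of phonetic codes concrete, here is a minimal sketch of classic American Soundex in Python. It is a textbook version written for illustration only and is not taken from FuzzyDupes; the function name `soundex` and the sample names are assumptions.

```python
def soundex(name: str) -> str:
    """Classic American Soundex: first letter plus three digits."""
    # Consonant groups that share a digit; vowels, H, W and Y get no digit.
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]                  # the first letter is kept verbatim
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                     # H and W do not separate equal codes
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit                     # a vowel resets prev to ""
    return (result + "000")[:4]          # pad or truncate to four characters


print(soundex("Meyer"), soundex("Meier"))    # M600 M600 (sounds alike, same code)
print(soundex("Kroll"), soundex("Croll"))    # K640 C640 (first letter differs)
print(soundex("Martin"), soundex("Matrin"))  # M635 M365 (two letters swapped)
```

The last two lines show the problems described above: a different first letter or a simple transposition already produces a different code, so such pairs are never compared.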
Pattern-matching algorithms (e.g. Levenshtein distance) work much better. Such algorithms can take permutations into account, but they are computationally very expensive.
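As an illustration of such a pattern-matching measure, the following is a minimal sketch of the standard Levenshtein distance (the number of single-character insertions, deletions and substitutions needed to turn one string into the other). It is the textbook dynamic-programming version, not the algorithm used inside FuzzyDupes; the function name is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance using two rows of the dynamic-programming table."""
    if len(a) < len(b):
        a, b = b, a                      # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))   # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]                    # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


print(levenshtein("Kroll", "Croll"))    # 1
print(levenshtein("Martin", "Matrin"))  # 2
```

Each call costs time proportional to the product of the two string lengths, which is why the number of comparisons matters so much. Note that plain Levenshtein counts a swap of two neighbouring letters as two edits; variants such as Damerau-Levenshtein count it as one.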
Another problem is the total running time of the calculation: when using pattern-matching algorithms, in principle each record must be compared with every other record. For n records this means a total of (n - 1) * n / 2 comparisons. For a data table with 1 million records, that is about half a trillion (0.5 * 10^12) complex calculations. The calculation could take years.
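The arithmetic can be checked with a few lines of Python. The comparison rate of 10,000 per second is purely an assumed figure for illustration; the real rate depends on hardware, record length and the algorithm used.

```python
def pair_count(n: int) -> int:
    """Number of distinct record pairs among n records."""
    return n * (n - 1) // 2


SECONDS_PER_YEAR = 365 * 24 * 3600
ASSUMED_COMPARISONS_PER_SECOND = 10_000   # hypothetical throughput

pairs = pair_count(1_000_000)
years = pairs / ASSUMED_COMPARISONS_PER_SECOND / SECONDS_PER_YEAR

print(f"{pairs:,} comparisons")           # 499,999,500,000 comparisons
print(f"about {years:.1f} years at the assumed rate")
```

Under that assumption a brute-force run over 1 million records takes well over a year, and doubling the number of records roughly quadruples the work.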
The FuzzyDupes Method
The FuzzyDupes method has been developed over 9 years so far, completely and originally by Kroll-Software. The calculation kernel contains more than 7,000 lines of code.
FuzzyDupes uses a Tam-Hashidex to build clusters. This is a mathematically exact and reliable way to preselect good candidates for the deeper search, and it does not depend on phonetic algorithms. The deep-search pattern-matching algorithm was also developed by Kroll-Software and can consider all permutations better than any other known algorithm.
All algorithms used are based entirely on pattern matching and are therefore language- and culture-independent. Unicode is fully supported, so this works not only with Latin characters. It is mathematically verifiable that all similarities are detected consistently and reliably.
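The Tam-Hashidex itself is proprietary and not described here, so it cannot be reproduced. The general idea of preselecting candidates before the expensive comparison can, however, be sketched with a different, generic technique: an inverted index over character bigrams. Everything below (function names, the `min_shared` threshold, the sample records) is an assumption made for illustration and says nothing about how FuzzyDupes works internally.

```python
from collections import defaultdict
from itertools import combinations


def bigrams(value: str) -> set:
    """Set of two-character substrings, ignoring case and non-alphanumerics."""
    s = "".join(c for c in value.lower() if c.isalnum())
    return {s[i:i + 2] for i in range(len(s) - 1)}


def candidate_pairs(records, min_shared=2):
    """Return index pairs of records that share at least min_shared bigrams."""
    index = defaultdict(set)                 # bigram -> ids of records containing it
    for rec_id, value in enumerate(records):
        for gram in bigrams(value):
            index[gram].add(rec_id)

    shared = defaultdict(int)                # (id_a, id_b) -> number of shared bigrams
    for ids in index.values():
        for pair in combinations(sorted(ids), 2):
            shared[pair] += 1
    return {pair for pair, count in shared.items() if count >= min_shared}


records = ["Kroll Software", "Krol Software", "Miller GmbH",
           "Mueller GmbH", "Acme Corp"]
print(sorted(candidate_pairs(records)))      # [(0, 1), (2, 3)]
```

Only 2 of the 10 possible pairs survive the preselection here, and only those would be passed to a detailed comparison. Unlike the method described above, this simple sketch gives no guarantee that every similar pair is found.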
General Features
Fast fuzzy duplicate search in many data sources
Fuzzy merge of two lists
Fuzzy match with external list (Robinson list)
Notably higher power and speed through parallel execution and 64-bit
As a result, practically unlimited data size can be searched
Full use of modern Core iX CPUs and 64-bit systems
Uses .NET Framework 4.0
Display of the match-factor in the results
Support of MS-Access and MS-Excel data sources even on 64-bit systems
The new 32-bit launcher offers access to 32-bit data sources on 64-bit systems
MS-Access, MS-Access 2007* and 2010* on 32-bit and 64-bit systems
MS-Excel, MS-Excel 2007* and 2010* on 32-bit and 64-bit systems
MS SQL-Server
Text/CSV Files
Other data sources with an ODBC driver or OLE DB provider, e.g. Oracle, IBM DB2, MySQL, dBase, FoxPro, Paradox, FileMaker, Cache, PostgreSQL, etc.
Search and deletion from MS-Outlook contact folders.
This makes FuzzyDupes the solution for cleansing Outlook contacts
Windows Addressbook
MS-Sharepoint Server
BulkMailer Address Database
32-bit data sources can be accessed on 64-bit systems using the FuzzyDupes 32-bit launcher
New: 64-bit ODBC data sources
New: Windows Contacts / Windows Mail
Updates: the official site does not provide any information about changes in this version.
DOWNLOAD
uploadgig
Code:
https://uploadgig.com/file/download/534ead1d98DEA1Cd/9KdaFgQY_FuzzyDupes2.rar
rapidgator
Code:
https://rapidgator.net/file/39799d079f795034675b1927b13e2170/9KdaFgQY_FuzzyDupes2.rar
nitroflare
Code:
http://nitroflare.com/view/D72A68D4F100553/9KdaFgQY_FuzzyDupes2.rar