Ways to clean up messy records in SQL
I have the following SQL data:
id       company name      customer         address 1    city       state  zip  date
0108500  aaa test          mish~sara        newa claims  chtiana    co     123  06fe0046
0108500  aaa.test          mish~sara        newa claims  chtiana    co     123  06fe0046
1802600  aaa test company  ban, adj.~gorge  po box 83    moulaurel  ca     153  09js0025
1210600  aaa test company  biwel~brce       97kehst ve   jacn       ca     153  04js0190
aaa test, aaa.test, and aaa test company should all be considered one company.
Since the data is messy, I'm thinking of one of these approaches:
- Is there a way to search the records in the DB for company names that are effectively the same and rename them all to the longest name?
In this case, aaa test and aaa.test would become aaa test company.
- Or is there a way to filter the records whose company names are effectively the same, so that I have the option to change them?
If there's no way to do this via an SQL query, do you have any suggestions for how I can clean up these records? There are 1 million records in the database, and it's hard to clean them manually.
Thank you in advance.
You could use a string matching algorithm such as Jaro-Winkler. I've written an SQL version that I use daily to deduplicate people's names that have been typed in differently. It can take a while to work, but it does the kind of fuzzy match you're looking for.
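To make the idea concrete, here is a minimal sketch in Python with SQLite. It is not the SQL version mentioned above. The table name `records`, the column `company_name`, the 0.85 threshold, and the helper names (`normalize`, `canonical_map`, `dedupe`) are all assumptions for illustration. It normalizes punctuation, scores names with Jaro-Winkler, and renames each close match to the longest name in its group.

```python
import re
import sqlite3

def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    hit1, hit2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not hit2[j] and s2[j] == ch:
                hit1[i] = hit2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len(s1)):
        if hit1[i]:
            while not hit2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3

def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro score boosted for a shared prefix of up to 4 characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * p * (1 - score)

def normalize(name: str) -> str:
    """Lowercase and turn punctuation into spaces ('aaa.test' -> 'aaa test')."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()

def canonical_map(names, threshold=0.85):
    """Map each name to the longest name it fuzzily matches (greedy)."""
    canon, mapping = [], {}
    # Longest names first, so each group's representative is its longest name.
    for name in sorted(set(names), key=lambda n: (-len(n), n)):
        norm = normalize(name)
        for c_norm, c_name in canon:
            if jaro_winkler(norm, c_norm) >= threshold:
                mapping[name] = c_name
                break
        else:
            canon.append((norm, name))
            mapping[name] = name
    return mapping

def dedupe(conn):
    """Rewrite company_name in the (assumed) records table."""
    names = [r[0] for r in conn.execute("SELECT DISTINCT company_name FROM records")]
    mapping = canonical_map(names)
    changes = [(new, old) for old, new in mapping.items() if new != old]
    conn.executemany(
        "UPDATE records SET company_name = ? WHERE company_name = ?", changes
    )
    conn.commit()
    return mapping

# Demo on the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id TEXT, company_name TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [("0108500", "aaa test"), ("0108500", "aaa.test"),
     ("1802600", "aaa test company"), ("1210600", "aaa test company")],
)
mapping = dedupe(conn)
result = [r[0] for r in conn.execute("SELECT DISTINCT company_name FROM records")]
print(result)
```

Comparing every distinct name against every group representative is quadratic in the worst case, so on 1 million records it is worth running this on the distinct company names only and reviewing `mapping` before applying the update. The threshold will probably need tuning on real data.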