Ways to clean up messy records in SQL


I have the following SQL data:

id       company name      customer         address 1    city       state  zip  date
0108500  aaa test          mish~sara        newa claims  chtiana    co     123  06fe0046
0108500  aaa.test          mish~sara        newa claims  chtiana    co     123  06fe0046
1802600  aaa test company  ban, adj.~gorge  po box 83    moulaurel  ca     153  09js0025
1210600  aaa test company  biwel~brce       97kehst ve   jacn       ca     153  04js0190

aaa test, aaa.test, and aaa test company should all be considered one company.

Since the data is messy, I'm thinking of one of the following:

  1. Is there a way to search the records in the DB for company names that refer to the same company and rename them to the longest name?

In this case, aaa test and aaa.test would both become aaa test company.

  2. Or is there a way to filter the records whose company name is the same, so that I have the option to change it?

If there's no way to do this via a SQL query, do you have any suggestions for how I can clean up these records? There are 1 million records in the database, and it's hard to clean them manually.

Thank you in advance.

You can use a string matching algorithm such as Jaro-Winkler. I've written a SQL version that is used daily to deduplicate people's names that have been typed in differently. It can take a while to run, but it should work for the kind of fuzzy match you're looking for.
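
To make the idea concrete, here is a minimal Python sketch of Jaro-Winkler used to map each company name to the longest similar name. It is not the SQL version mentioned above, just an illustration. The helper names (`normalize`, `map_to_longest`), the punctuation-stripping step, and the 0.85 threshold are assumptions chosen for the sample data and would need tuning on real records.

```python
import re


def jaro(s1, s2):
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only count as matching if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


def normalize(name):
    """Assumed cleanup step: lowercase, punctuation to spaces, trim."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def map_to_longest(names, threshold=0.85):
    """Map each distinct name to the longest name it is similar to."""
    canonical = []  # longest names seen so far, one per group
    mapping = {}
    # Visit longer names first so they become the group representatives.
    for name in sorted(set(names), key=lambda n: (-len(n), n)):
        best, best_score = None, threshold
        for rep in canonical:
            score = jaro_winkler(normalize(name), normalize(rep))
            if score >= best_score:
                best, best_score = rep, score
        if best is None:
            canonical.append(name)
            best = name
        mapping[name] = best
    return mapping


if __name__ == "__main__":
    sample = ["aaa test", "aaa.test", "aaa test company", "aaa test company"]
    for old, new in map_to_longest(sample).items():
        print(old, "->", new)
```

The resulting mapping could be reviewed by hand and then applied with an UPDATE per company name. Note that this compares every distinct name against every group representative, so on a million rows it is worth running it only over the distinct company names rather than over all records.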

