DataBlasting

Datablasting refers principally to the extraction of information (the data) from sources that don’t lend themselves to giving up the data easily. For example, if you wanted to send a letter to all the “Smiths” in the phone book, you would logically enough look for the Smiths in the white pages, or maybe look them up at a web site. Now that you see the listings, how can you use them conveniently? You could retype the listings into a spreadsheet, with separate columns for First Name, Last Name, Address, etc. That’s a lot of work.

Automated Data Creation

A faster method is to “blast” the data by scanning it, then letting the computer retype it (using “OCR,” optical character recognition) into a spreadsheet, and then applying parsing formulas which put everything where it belongs: names in the name column, addresses in the address column, etc. From there it is a very short hop to nicely addressed documents, envelopes, labels and so on. Typically, clubs and organizations, and some businesses as well, have lists of members, customers, vendors, etc. that are stored on paper. Our datablasting service will pipeline the information into usable spreadsheets.
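As a rough illustration of the parsing step, here is a minimal Python sketch. It assumes each OCR'd line looks like "Last, First M  Street address  Phone", with two or more spaces between columns; that layout, the column names, and the sample listings are all hypothetical, and real scans will need messier rules.

```python
import csv
import io
import re

# Assumed (hypothetical) layout of one scanned line:
#   "Last, First M  Street address  Phone"
# Columns are separated by two or more spaces.
LISTING = re.compile(
    r"^(?P<last>[^,]+),\s*(?P<first>\S+)"
    r"(?:\s+(?P<middle>[A-Za-z])\.?)?"
    r"\s{2,}(?P<address>.+?)\s{2,}(?P<phone>[\d-]+)\s*$"
)

COLUMNS = ["last", "first", "middle", "address", "phone"]


def parse_listing(line):
    """Split one listing into named columns, or return None if it does not fit."""
    match = LISTING.match(line.strip())
    if match is None:
        return None
    # A missing middle initial comes back as None; store it as an empty cell.
    return {name: (value or "") for name, value in match.groupdict().items()}


def to_csv(lines):
    """Turn the listings that parse cleanly into spreadsheet-ready CSV text."""
    rows = [row for row in map(parse_listing, lines) if row is not None]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = [
    "Smith, John A  12 Oak St  555-0142",
    "Smith, Mary  4 Elm Ave  555-0199",
]
print(to_csv(sample))
```

Lines that do not match the pattern are skipped here; in practice you would probably set them aside for a person to review rather than drop them.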

Normalization – Attributes and Structure

Getting data where it needs to be is one thing; “normalizing” it is another. Normalized data follows predictable patterns both in terms of the attributes of the data and in terms of the structure in which it is stored.

Attributes refer to the way each item of data is formatted. Proper names, for example, are always capitalized; physicians carry an “MD” (two upper case letters) as a degree but “Dr.” (that’s a “D,” then an “r,” and then a period) as a title, and so on.
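The attribute rules just described can be sketched in a few lines of Python. The function names and the small title table are assumptions made for illustration, and the capitalization rule is deliberately naive: names such as "McDonald" or "Smith-Jones" would need extra handling.

```python
# Hypothetical lookup table: bare title -> preferred written form.
TITLES = {"dr": "Dr.", "mr": "Mr.", "mrs": "Mrs.", "ms": "Ms."}


def normalize_name(raw):
    """Capitalize each part of a simple name: ' lOUISE ' -> 'Louise'."""
    return " ".join(part.capitalize() for part in raw.strip().split())


def normalize_degree(raw):
    """Degrees are upper case with no periods or spaces: 'm.d.' -> 'MD'."""
    return raw.replace(".", "").replace(" ", "").upper()


def normalize_title(raw):
    """Titles are capitalized and end with a period: 'DR' -> 'Dr.'."""
    key = raw.strip().rstrip(".").lower()
    # Unknown titles are passed through untouched rather than guessed at.
    return TITLES.get(key, raw.strip())


print(normalize_title("DR"), normalize_name("louise"), normalize_name("ANDERSON"),
      normalize_degree("m.d."))
```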

Structure, on the other hand, maps out the way data is stored, so that first name, last name and middle initial are stored separately from one another, allowing them to be individually retrieved and manipulated. Thus, data for the physician named “Louise Anderson” might be stored under separate data points to carry her title, first name, middle initial, last name, generation, degree, formal salutation, and informal salutation as shown at right.

That breakdown of data is just for the member’s name! Applying the same approach to the address and other information about Dr. Anderson creates a great many “points” of data. As you can see, however, practically any form of communication, from business letter to informal note, can use this data. Handling information in this way means you won’t be sending letters that open with “Dear Dr. Louise S. Anderson MD.” No matter how slick the graphics and color work, the appearance of a name in this way on an envelope, for example, is a cue to toss it aside: it’s just another form letter. Better to say, “Dear Madam or Sir.” Taking another look at Dr. Anderson’s information, you may notice that the formal salutation doesn’t really have any data. The reason is that whenever a formal salutation is used, it can be made up from information which is already on file; in this case, the title and last name.
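One way to picture this structure is a small Python record in which the formal salutation is never stored, only built on demand from the title and last name. The class and field names are illustrative assumptions, as is the informal salutation "Louise" and the choice to fall back to "Madam or Sir" when a title or last name is missing.

```python
from dataclasses import dataclass


@dataclass
class PersonName:
    """Each part of the name is stored as its own data point."""
    title: str = ""
    first: str = ""
    middle_initial: str = ""
    last: str = ""
    generation: str = ""
    degree: str = ""
    informal_salutation: str = ""

    @property
    def formal_salutation(self):
        # Not stored: built from the title and last name already on file.
        if self.title and self.last:
            return f"{self.title} {self.last}"
        return "Madam or Sir"  # assumed fallback when data is incomplete


def greeting(person, formal=True):
    """Opening line of a letter, formal or informal."""
    if formal:
        name = person.formal_salutation
    else:
        name = person.informal_salutation or person.first
    return f"Dear {name},"


anderson = PersonName(title="Dr.", first="Louise", middle_initial="S",
                      last="Anderson", degree="MD",
                      informal_salutation="Louise")
print(greeting(anderson))                # Dear Dr. Anderson,
print(greeting(anderson, formal=False))  # Dear Louise,
```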

This Bears Repeating: “Don’t Repeat.”

And that brings us to a cardinal rule of database design and administration: never enter the same information twice if at all possible. It’s not just more work when you put it in; it’s more work later when changes need to be made in more than one place. If they are NOT made everywhere, you have the same item appearing differently although referring to the same person or thing, a situation that can actually be dangerous under some circumstances (take medical information, for example…).
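A minimal sketch of the rule in Python, using made-up member numbers and list names: each person is entered once, and every other list refers to that single entry by its number instead of copying the name.

```python
# Each member is entered exactly once, keyed by a (hypothetical) member number.
members = {
    101: {"title": "Dr.", "first": "Louise", "last": "Anderson"},
}

# Other lists hold only the member number, never a second copy of the name.
mailing_list = [101]
event_signups = [101]


def label_for(member_id):
    """Build a display name from the single stored record."""
    person = members[member_id]
    return f'{person["title"]} {person["first"]} {person["last"]}'


# One correction in one place...
members[101]["last"] = "Baker"

# ...shows up everywhere the member is referenced.
print([label_for(m) for m in mailing_list])
print([label_for(m) for m in event_signups])
```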

Raw Data is a Diamond in the Rough: How You Polish it Makes it Valuable

Dry as the subject matter may appear at first, the fact is that data management can be applied very creatively to enhance relationships with customers you have, and pull in the ones you don’t. In fact, the field is filled with opportunities because, in general, data is used very poorly, if not insultingly, most of the time. How often have you received an application or form which requires you to provide information the organization already has on file? You would probably be well disposed to anyone who gave you a form already filled with what they do know and asking only what they don’t know. Likewise, how often have you been addressed as “DEAR ALICE SMITH” (ALL IN CAPITAL LETTERS) instead of “Dear Mrs. Smith”? What does that tell you about the care taken by the sender?

Sadly, many organizations actually have the information they need on hand, much of it already “normalized,” but don’t know how to get to it and use it: another mission for datablasting. Call me for a consult in this essential area if you suspect you’re not making the best use of what you already have.


By user