A new spam tactic…

August 9th 2003

Here’s one I haven’t seen before. Of course, it makes no sense to a computer, which sees it as one long string:

________________________R___V_______U____P____E___I___A___N____R____A___A___T
___R____I____L___G_______E____C________R_______A____E________A_______L____S__

which is absolutely nonsensical. But to us, who can read letters in sequence in any direction, it looks like this…

______________________
__R___V_______U____P__
__E___I___A___N____R__
__A___A___T___R____I__
__L___G_______E____C__
______R_______A____E__
______A_______L____S__

Which is readable, of course: read each column from top to bottom and you get "REAL VIAGRA AT UNREAL PRICES".

How do we combat this? A simple program could parse the block for the text:

1) Find the most-used character (the spacer) in the block. In this case, "_".
2) Find the width (number of spacers) of each gap between characters on each line. Example: on the first line that contains letters, the space-array would be (2, 3, 7, 4, 2).
3) Go through each line and look for this array. Example: in the 2nd line of text, the gap of 7 is split by an "A", so we find that a new word starts there.
4) Take the (n+1)th character of each line, where n is the first entry in the space-array. Once that succeeds, remove the first n+1 characters from each line, then repeat with the next entry.
5) The result is a new array, where each index is a word, and the words are in order, from left to right.
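
Here is a rough sketch of those steps in Python. It is a simplification, not the exact procedure above: rather than tracking the space-array and trimming each line, it treats the block as a grid and reads every column from top to bottom. The function names (find_spacer, gap_widths, read_columns) are made up for this example, and it assumes the spam arrives as a list of lines with the layout intact.

```python
import re
from collections import Counter


def find_spacer(lines):
    """Step 1: assume the most frequent character in the block is the spacer."""
    counts = Counter("".join(lines))
    return counts.most_common(1)[0][0]


def gap_widths(line, spacer):
    """Step 2: widths of the runs of spacers on a single line."""
    return [len(run) for run in re.findall(re.escape(spacer) + "+", line)]


def read_columns(lines):
    """Steps 3-5, simplified: read each column top to bottom, skipping spacers."""
    spacer = find_spacer(lines)
    width = max(len(line) for line in lines)
    # Pad short lines so every row has the same number of columns.
    padded = [line.ljust(width, spacer) for line in lines]
    words = []
    for col in range(width):
        word = "".join(row[col] for row in padded if row[col] != spacer)
        if word:  # columns made only of spacers produce no word
            words.append(word)
    return words


grid = [
    "______________________",
    "__R___V_______U____P__",
    "__E___I___A___N____R__",
    "__A___A___T___R____I__",
    "__L___G_______E____C__",
    "______R_______A____E__",
    "______A_______L____S__",
]

print(gap_widths(grid[1], find_spacer(grid)))  # [2, 3, 7, 4, 2]
print(" ".join(read_columns(grid)))            # REAL VIAGRA AT UNREAL PRICES
```

Reading whole columns sidesteps the problem in step 3, where a word like "AT" starts on a later line than the others. It would still be fooled by a spammer who mixes several spacer characters or staggers the columns.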

Simple. I could write it if I wanted… but I have to pack. Blah.


This entry was posted on Saturday, August 9th, 2003 at 2:17 pm and is filed under Life.
