Have you ever wondered what happens when you combine Microsoft Research’s Stroke Width Transform algorithm with Google’s open-source Tesseract OCR engine? No? Well, neither had I until a few days ago, when I came across something called Project Naptha. Still not sure what I’m talking about? That’s OK; take a look and see for yourself (and then come back to read the rest of this!): http://projectnaptha.com/
Project Naptha is a plugin for Google Chrome (and soon Firefox) that takes flat images on the web and separates the text from the picture, making it editable. Have you ever seen a great photo on a website that’s ruined by having text stamped all over it? Perhaps, worryingly for professional photographers, it has the author’s name or initials stamped into the corner. Well now, with no specialist knowledge and no software beyond a web browser, you can remove the unwanted text, or even edit it yourself! This is clearly not a new technique; users of Adobe Photoshop have been doing it for years. The big news is that virtually anyone can now do it with ease.
While removing copyright notices from protected works is one obvious (mis)use of this technology, there are also some really positive uses. For one, it means you can capture text from images and paste it elsewhere. In the ecommerce world in particular, product information is, unfortunately, frequently distributed on the web in image form, and until now the only simple way to get that information out of the image it’s burned into has been to retype it yourself. With this plugin, it’s just a matter of copying and pasting.
Why not try it out for yourself? Check it out at http://projectnaptha.com/