Why Visual Classification Will Change the World

I’ll never forget watching IBM’s Watson computer beat Ken Jennings and Brad Rutter on Jeopardy! to win the $1 million prize in February of 2011. I was still in library school when Watson set a new benchmark for Artificial Intelligence (AI) and made science fiction a bit more of a reality. Even though the rise of the machines seemed more imminent after Watson’s highly publicized win, as a librarian I couldn’t help but cheer on the ultimate oracle. Watson changed my life that day. When machine beat man, I realized that I should probably double major in a technology-related field, which I ultimately did.

Yet I realized almost instantly that Watson, despite its ability to analyze and search through vast tomes of data at the time, could really only relate to text-based content. A computer was obviously programmed to process binary code, which correlated to text and text-based commands. But what about the rest of the data out there that went beyond the traditional 0s and 1s? What about all of the data being imaged as organizations started frantically scanning in endless sheets of paper to capitalize on the purported mantra of the 2000s, “virtual storage is cheap”? What about all of the graphs and charts saved or converted to .jpgs, .tiffs, etc.? And what about being able to classify and retrieve a painting or a photograph without relying on metadata entered by a human? So many what-ifs.

Biometrics, particularly facial recognition and fingerprint analysis, provided some answers to a non-textual classification approach by relying on spatial positioning to accurately identify and classify images. OCR (Optical Character Recognition) and ICR (Intelligent Character Recognition) also furthered the ability for an image to be translated into something of meaning that a computer could understand and classify. Sight existed before any language, and certainly before any written method of communication. Nature, for now, has still prevailed in form; it’s up to us to agree on how to universally classify content.

Leave it to our fellow attorneys to take on quickly classifying documents through a concept known as TAR (Technology Assisted Review) and the resulting, profitable eDiscovery (electronic discovery) boom that followed. What was known as Natural Language Processing (NLP) and standard Boolean search factors were elevated to the next level as computer algorithms advanced and fuzzy logic prevailed. But what about languages outside of the standard Romance-based vernaculars? Russian, Japanese, Tagalog? Did NLP do well with different character sets? Maybe, depending on the system utilized and the amount of time and effort spent training the computer to learn the data, but even then, accuracy rates vary.

So, while we know that TAR and NLP are not perfect, we do know that there are other ways of analyzing and interpreting data. Even with its known limitations, visual classification (VC) can typically process non-textual and imaged objects with higher accuracy and speed than traditional NLP technologies. Although OCR is sometimes still utilized in conjunction with VC, a multi-tiered classification approach can be taken to identify whether, for example, a document has certain text along with other markings, even down to staple marks and/or hole punches, as a classification and retrieval concept. Think of a kicked-up forms recognition tool that can capture handwritten notes or other anomalies in addition to text or images.
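To make the idea of classifying by layout rather than by words a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any particular VC product works: pages are toy 16x16 grids of 0s and 1s, each page is summarized as ink density per grid cell, and an incoming page is matched to the closest template by cosine similarity. The names `ink_grid`, `classify`, `letter`, and `punched_form` are hypothetical and chosen just for this example.

```python
import math

def ink_grid(page, rows=4, cols=4):
    """Summarize a binary page image (list of rows of 0/1) as ink density per grid cell."""
    height, width = len(page), len(page[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = [page[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(cell) / len(cell))
    return features

def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify(page, templates):
    """Return the label of the template whose layout is most similar to the page."""
    feats = ink_grid(page)
    return max(templates, key=lambda label: cosine(feats, ink_grid(templates[label])))

def blank(size=16):
    return [[0] * size for _ in range(size)]

def letter():
    """Toy layout (assumption): evenly spaced lines of text, no other markings."""
    page = blank()
    for y in range(2, 13, 2):
        for x in range(2, 14):
            page[y][x] = 1
    return page

def punched_form():
    """Toy layout (assumption): one header line plus two hole punches in the left margin."""
    page = blank()
    for x in range(4, 12):
        page[1][x] = 1
    for y in (2, 3, 12, 13):
        for x in (0, 1):
            page[y][x] = 1
    return page

templates = {"letter": letter(), "punched_form": punched_form()}

# A punched form carrying a short handwritten mark that the template lacks.
annotated = punched_form()
for x in range(6, 10):
    annotated[9][x] = 1

print(classify(annotated, templates))  # prints: punched_form
```

No text is read at any point: the hole punches and the white space alone are enough to separate the two layouts, and the stray handwritten mark lowers the similarity score without changing the result. A real system would work on scanned images with far richer features, so treat this purely as a toy.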

Visual classification is a subset of visual analytics (VA), which includes graphical representations of data such as heat maps, weather maps and many other graphs and charts that are based on a visual collection of data. In other words, VC can bring to light patterns and groupings that the human eye may have completely glossed over. VC means being able to visualize the big data trend that we in the industry have been watching for years.

I believe in the concept of visual classification so much that I am pursuing my doctoral work in the area, because I know that there is a better, faster and more accurate way that we as industry practitioners and end users can process data while still meeting regulatory and compliance standards. Even after decades of NLP, training a computer to understand our spoken nuances and colloquialisms remains complex and time-consuming, with still too many inaccuracies. So what if we instead learned how computers think and spoke to them in the algorithms that they can understand? We in essence have to begin to put the Dewey and Library of Congress classification systems to the side and learn to classify not just by words but from within a digital object. We have to accept that vectors can speak louder than words. We have to understand that sometimes the white space on a document may mean more than the text.

VC can eventually be utilized to identify stolen artwork, assess DNA sequences, identify potential medical conditions and even quickly cluster and classify an organization’s data in weeks to months vs. years. Visual classification and analytics technology has the potential to positively change the way in which the world processes its data. Trust me, I’ve taken on the scholar/practitioner dichotomy on this one to establish my commitment to the future of visual classification technology as a whole. I’m willing to bet my PhD on VC!

I may not have created Watson or even the concept of visual classification, but I know that further research and technological advancements in the VC field can give AI a set of virtual eyes. Computers will not only be able to compute; imagine being able to teach Watson how to see. That’s even better than winning on Jeopardy!

Ilona Koti
