WakaPic: an intelligent photo gallery

Over the last year, the barrier of 1,000 billion photos was broken. This impressive figure shows not only that people care about storing their images, but also that they need a tool to manage them efficiently and intuitively.

For this reason, the WakaPic development team has built a photo gallery that can recognise content such as people, objects, shapes and colours, and contexts such as places, times of year and events, allowing the user to search their images based on these elements.

WakaPic is based on Presago’s artificial intelligence solutions and is now a cross-platform cloud storage service offered as SaaS (Software as a Service). It uses state-of-the-art machine learning to automatically manage personal photos and those shared by friends. Thanks to its purpose-built search engine, WakaPic indexes large amounts of digital media, interprets searches expressed in natural language and returns the requested photos ranked by semantic and “emotional” relevance.

But let’s see how this result was achieved.

The indexing algorithms are able to “read” the images and extract keywords that represent them. In particular, they include:

  • Neural networks for computer vision that classify photos and videos uploaded by users in real time, automatically extracting tags. The algorithms are trained to recognise more than 20,000 objects, animals, plants, places and monuments;
  • Facial recognition systems that identify faces in images with over 90% accuracy, associate them with users in the owner’s network of friends, and tag them;
  • Algorithms for advanced content analysis to detect violent, offensive or adult content, in order to provide automated support for moderation of publicly shared files;
  • Deep learning algorithms based on language models that generate a natural language description of the photo from the extracted tags.
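
To make the idea of indexing by extracted keywords more concrete, here is a minimal sketch of an inverted index that maps each tag to the photos carrying it. It is only an illustration: the function names, file names and tags are hypothetical, and the tags stand in for whatever the vision models would actually produce.

```python
from collections import defaultdict


def build_tag_index(photo_tags):
    """Map each tag to the set of photo ids carrying it (an inverted index)."""
    index = defaultdict(set)
    for photo_id, tags in photo_tags.items():
        for tag in tags:
            index[tag.lower()].add(photo_id)
    return index


def search_all(index, tags):
    """Return the ids of photos that carry every requested tag."""
    sets = [index.get(tag.lower(), set()) for tag in tags]
    return set.intersection(*sets) if sets else set()


# Hypothetical tags, standing in for the output of the vision models.
photo_tags = {
    "img_001.jpg": ["dog", "beach", "summer"],
    "img_002.jpg": ["dog", "park"],
    "img_003.jpg": ["beach", "sunset"],
}
index = build_tag_index(photo_tags)
print(search_all(index, ["dog", "beach"]))
```

With an index of this kind, a lookup only touches the photos that share at least one tag with the query, which is one common way to keep search fast over large collections.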

Aside from indexing, another challenge was enabling the search engine to interpret complex queries by resolving the ambiguities inherent in natural language, so as to always return results that are relevant to the users’ requests.

In order to contextualise and correctly interpret user queries, NLP (Natural Language Processing) techniques were used to handle the problems inherent in natural language interaction, such as the use of singulars and plurals, complex verb forms and different ways of expressing the same query.
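
As a rough illustration of this kind of normalisation, the sketch below reduces two differently worded queries to the same list of terms. It is deliberately naive and is not WakaPic's actual NLP stack: the stopword list, the synonym table and the plural rules are assumptions made for the example, and irregular plurals are not handled.

```python
import re

# Hypothetical tables; a real system would likely learn or curate these.
SYNONYMS = {"picture": "photo", "pic": "photo", "image": "photo", "puppy": "dog"}
STOPWORDS = {"show", "find", "me", "the", "a", "an", "of", "my", "all",
             "with", "in", "at"}


def singularise(word):
    """Very rough English plural stripping; irregular plurals are ignored."""
    if len(word) > 4 and word.endswith("ies"):
        return word[:-3] + "y"
    if len(word) > 3 and word.endswith(("ches", "shes", "sses", "xes")):
        return word[:-2]
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalise_query(query):
    """Lowercase, drop stopwords, strip plurals and map synonyms."""
    terms = []
    for token in re.findall(r"[a-z]+", query.lower()):
        if token in STOPWORDS:
            continue
        token = singularise(token)
        terms.append(SYNONYMS.get(token, token))
    return terms


print(normalise_query("Show me pictures of my dogs at the beach"))
print(normalise_query("Find the photo with a puppy in the beaches"))
```

Both calls produce the same terms, so the two phrasings can be matched against the same index entries.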

A pipeline of machine learning algorithms was designed to extract entities from complex queries and combine them with each other and with the user’s profile, in order to return the images that best match the original request.
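
The sketch below shows one simple way such a pipeline could be wired together: entities are picked out of the query terms by dictionary lookup, and each matching photo is scored by the entities it contains, with a bonus taken from the user's profile. The `Photo` class, the scoring rule and the profile weights are all hypothetical; the real pipeline uses machine learning models rather than lookups.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    photo_id: str
    tags: set
    people: set = field(default_factory=set)


def extract_entities(terms, known_people, known_tags):
    """Split query terms into people and content tags by dictionary lookup."""
    people = {t for t in terms if t in known_people}
    tags = {t for t in terms if t in known_tags}
    return people, tags


def rank(photos, people, tags, profile_weights):
    """Score each photo by its matched entities, boosted by profile weights."""
    scored = []
    for photo in photos:
        matched = (photo.people & people) | (photo.tags & tags)
        if not matched:
            continue
        score = sum(1.0 + profile_weights.get(e, 0.0) for e in matched)
        scored.append((score, photo.photo_id))
    # Highest score first; ties broken by photo id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [photo_id for _, photo_id in scored]


# Hypothetical data for illustration only.
photos = [
    Photo("a.jpg", {"beach", "dog"}, {"anna"}),
    Photo("b.jpg", {"beach"}),
    Photo("c.jpg", {"mountain"}, {"anna"}),
]
people, tags = extract_entities(
    ["anna", "beach"],
    known_people={"anna", "marco"},
    known_tags={"beach", "dog", "mountain"},
)
print(rank(photos, people, tags, profile_weights={"beach": 0.5}))
```

Here the photo that matches both the person and the place ranks first, and the profile weight decides between the two partial matches.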

But there’s more to building an intelligent photo gallery: it also means returning personalised results. For this reason, an additional self-learning system has been implemented that creates a user profile based on the behaviours, habits and events involving the user and their network of friends. As a result, search results become significantly more accurate as the user adds new photos, sets personalised tags or performs new searches.
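
A profile of this kind can be pictured as a set of per-tag weights that are updated at every interaction. The sketch below is an assumption-laden toy, not the actual self-learning system: the `UserProfile` class, the decay factor and the update rule are hypothetical, and serve only to show how recent behaviour can gradually outweigh older behaviour.

```python
from collections import Counter


class UserProfile:
    """Keeps a decayed count of the tags a user interacts with."""

    def __init__(self, decay=0.9):
        self.decay = decay          # hypothetical forgetting factor
        self.weights = Counter()

    def record(self, tags, strength=1.0):
        """Age all existing weights, then reinforce the tags just seen."""
        for tag in self.weights:
            self.weights[tag] *= self.decay
        for tag in tags:
            self.weights[tag] += strength

    def top(self, n=3):
        """Return the n tags with the highest current weight."""
        return [tag for tag, _ in self.weights.most_common(n)]


profile = UserProfile(decay=0.5)
profile.record(["dog", "beach"])   # e.g. the user uploads a tagged photo
profile.record(["dog"])            # e.g. the user searches for "dog"
print(profile.top(1))
```

Weights like these could then be fed into the ranking step as a per-user boost.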

We plan to add a new algorithm that computes the “emotional score” of each photo. This algorithm will be based on sentiment analysis of the images, the emotions on the faces of the people pictured and the profiling system.
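
Since this algorithm is still planned, the following is purely a sketch of one possible shape it could take: a weighted average of the three signals mentioned above, each assumed to be already normalised to the range 0 to 1. The function name, the signal names and the default weights are hypothetical.

```python
def emotional_score(image_sentiment, face_emotion, profile_affinity,
                    weights=(0.4, 0.4, 0.2)):
    """Weighted average of three signals, each expected in [0, 1].

    The default weights are arbitrary placeholders, not tuned values.
    """
    signals = (image_sentiment, face_emotion, profile_affinity)
    if any(not 0.0 <= s <= 1.0 for s in signals):
        raise ValueError("each signal must lie between 0 and 1")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, signals)) / total


print(emotional_score(0.8, 0.9, 0.5))
```

Dividing by the sum of the weights keeps the result in the same 0 to 1 range whatever weights are chosen.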

Presago can implement advanced computer vision algorithms, which can interpret images and videos in real time to extract strategic information for your business.

Contact us to talk about your project and receive a free consultation.