Lead-generation platform with an email pipeline operated manually.


Sentiment analysis is one of the main tools for measuring how people engage with products, services, and a brand in general. It works well on large volumes of input text that need to be classified (email campaigns, satisfaction surveys, review comments). Multi-language support, an admin interface, and a continuous learning module are additional features that can be customized and implemented alongside the basic solution.

In this particular case, the client approached us with several questions:

How can emails be classified by sentiment?

How can personal data be collected for the CRM?

What is the easiest way to integrate non-invasively with current tools?

The main difficulty in the email classification task was parsing the emails. The dataset contained emails in many different formats, which made them hard to parse automatically.
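To give a sense of what parsing differently formatted emails involves, here is a minimal sketch using only Python's standard `email` package. It is not the parser built for this project; the function name `extract_text`, the crude HTML tag stripping, and the rule for dropping quoted reply lines are all assumptions made for illustration.

```python
from email import message_from_string, policy
import html
import re


def extract_text(raw: str) -> str:
    """Return a best-effort plain-text body from a raw email message."""
    msg = message_from_string(raw, policy=policy.default)
    # Prefer a text/plain part; fall back to HTML if that is all there is.
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    text = part.get_content()
    if part.get_content_type() == "text/html":
        # Crude tag stripping (an assumption; a real parser would do more).
        text = re.sub(r"<[^>]+>", " ", text)
        text = html.unescape(text)
    # Drop quoted reply lines, which usually start with ">".
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith(">")]
    # Collapse all runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", " ".join(lines)).strip()
```

Normalizing every message to one flat string like this means the classifier sees the same kind of input regardless of how the original email was formatted.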

What we did

  • Trained classifiers
  • Integrated Google DLP and built custom NER detectors
  • Built modular APIs
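As an illustration of what a custom detector for personal data can look like, the sketch below uses simple regular expressions. It does not call Google DLP and does not reproduce the NER detectors built for the client; the names `Finding`, `DETECTORS`, and `find_personal_data`, as well as both patterns, are hypothetical and deliberately simplistic.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    kind: str   # detector name, e.g. "EMAIL"
    value: str  # matched text
    start: int  # character offset of the match


# Hypothetical patterns for illustration only; real detectors need far
# more care (international formats, false positives, context).
DETECTORS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def find_personal_data(text: str) -> list[Finding]:
    """Run every detector over the text and return matches in reading order."""
    findings = [
        Finding(kind, match.group(), match.start())
        for kind, pattern in DETECTORS.items()
        for match in pattern.finditer(text)
    ]
    return sorted(findings, key=lambda f: f.start)
```

Returning structured findings rather than raw strings makes it straightforward to map each value onto a CRM field in a later step.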


To create data usable for analysis, we applied a number of data cleaning and processing techniques. Several algorithms were tested for email classification. The dataset was balanced before training, and the highest accuracy was achieved with the Extreme Gradient Boosting (XGB) algorithm.
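The balancing method used in the project is not stated here. One common option is random undersampling, where every class is cut down to the size of the smallest one. The sketch below shows that idea with the standard library only; the function name `undersample` and the fixed seed are assumptions for illustration.

```python
import random
from collections import defaultdict


def undersample(samples, labels, seed=0):
    """Balance classes by randomly keeping as many items per label
    as the rarest label has. Returns shuffled (sample, label) pairs."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    # Size of the smallest class decides how many items each class keeps.
    n = min(len(group) for group in by_label.values())
    balanced = [
        (sample, label)
        for label, group in by_label.items()
        for sample in rng.sample(group, n)
    ]
    rng.shuffle(balanced)
    return balanced
```

The balanced pairs can then be vectorized and passed to whichever classifier is being evaluated, so that accuracy is not inflated by the majority class.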

Email parsing turned out to have a large effect on both classification accuracy and classifier performance.

Our solution saved up to 80 hours per month of manual classification and personal data gathering. Additional outcomes were an improved toolkit and an automated email pipeline.

[Image: a set of yellow smiley stickers. Photo by Nick Fewings / Unsplash]

“Smartcat have delivered our project successfully and on time, they are a pleasure to work with.”

James Isilay, CEO, Cognism