Benefits compared to other approaches

Sight’Up’s “text-mining” technology uses a set of algorithms based on the genetic behaviour of documents. This is a real technological breakthrough compared with traditional technologies that have existed for several decades, such as linguistic approaches, which rely on dictionaries and rules, or statistical approaches.

  • Requires very few learning documents. The main advantage of Sight’Up algorithms is their capacity to generalize from a very small corpus of examples. Just 10 product offers make it possible to manage more than 10,000 products of the same family with a precision of around 98%. This reduces learning, maintenance and installation times.

  • Fast installation and modification compared to rule-based systems. Since learning is limited to a small set of examples, there is no need to design Boolean rules, which are complex and difficult to maintain. Maintenance is extremely fast when organisational requirements change: the learning corpus is simple to modify by adding new examples or removing old ones.

  • More robust than rule-based systems. The emphasis in our approach is on learning to generalize from a finite set of examples, rather than on formulating an algorithmic solution based on prior knowledge and a set of heuristics. In the traditional approach, adding a rule to accommodate a new example can call the whole system into question, whereas with Sight’Up technology it is enough to add the counterexample to the learning corpus.

  • Monitoring without any particular expertise. The system is easy to manage. Staff do not need data-processing expertise: knowledge of the product family being managed and common sense are enough to keep the engines working correctly.

  • Faster processing. On average, Sight’Up technology processes between 1 and 2 million documents per hour, whereas traditional approaches peak at a few tens of thousands. This productivity gain limits investment in equipment and meets customers’ ever-increasing demand for faster access to information.

  • Precision is prioritized. Today, search engines in general and e-commerce engines in particular are often criticized for generating a lot of noise. Our approach favours precision at the expense of recall, while still delivering a result that is often higher than 95%.

    This approach is used in the following Sight’Up products:

    Sightis: categorization engine
    Taggis: characteristic extraction engine
    Dis: glossary construction engine

    MailRelation: management of incoming e-mails
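
The trade-off between precision and recall mentioned in the benefits above can be made concrete with a short Python sketch. It only illustrates the standard definitions of the two measures. The function name and the counts are hypothetical and do not come from Sight’Up or describe how its engines are evaluated.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a single category.

    precision = correct results / all results returned (less noise = higher)
    recall    = correct results / all results that should have been returned
    """
    returned = true_positives + false_positives
    relevant = true_positives + false_negatives
    precision = true_positives / returned if returned else 0.0
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall


# Hypothetical counts for a category containing 120 relevant documents.
# A strict engine returns few wrong documents but misses some relevant ones.
strict = precision_recall(true_positives=96, false_positives=2, false_negatives=24)
# A lenient engine finds almost everything but returns much more noise.
lenient = precision_recall(true_positives=115, false_positives=35, false_negatives=5)

print("strict  precision=%.3f recall=%.3f" % strict)
print("lenient precision=%.3f recall=%.3f" % lenient)
```

With these made-up counts, the strict engine has the higher precision and the lower recall, which is the kind of choice the last benefit describes.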