Generative vs Extractive AI

In the wake of ChatGPT and the large language model ("LLM") boom, most companies are focused on deploying generative AI tools within their enterprises. However, after going through a due diligence process and procuring one or more generative AI products for internal use, companies often find that the product does not yield a viable return on investment ("ROI") because it is not aligned to the enterprise's use case. In this article, we explore both generative and extractive AI and the benefits that can be harnessed through the use of extractive AI.

Generative AI

The main application of generative AI products is the generation of content based on a user prompt or set of instructions. Generative AI is efficient at generating content and tailoring that content to the user's query or instruction. However, generative AI systems are prone to hallucinations and data biases, which can lead to incorrect or inaccurate outputs. Some of the use cases for generative AI include content generation, chatbots, virtual assistants and customer support.

The applicability of a generative AI product can be determined by asking whether the company relies on the generation of new content (where accuracy is not one of the enterprise's key performance metrics) or whether the enterprise is more focused on the accuracy of information. We argue below that generative AI products are more suitable in the former case.

Extractive AI

Extractive AI is a form of natural language processing ("NLP") which focuses on (i) identification and extraction of key information from databases, documents, or other sources of information and (ii) summarising the extracted content with a high level of accuracy. Essentially, extractive AI products are trained on specific data and are aimed at extracting and presenting data and information. Some of the main applications for extractive AI include data and information extraction, identification of keywords or phrases, and summarising information or documents.
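To illustrate the idea of extraction, the following is a minimal sketch in Python of a frequency-based extractive summariser. It is not how any particular commercial product works; the function names, the stopword list, and the scoring rule are assumptions chosen for illustration. The point it shows is that every sentence in the output is copied verbatim from the source and is returned with its position in the source, so the result can be traced back to the original text.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on",
             "for", "by", "be", "will", "at", "was"}


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphanumeric words, minus stopwords.
    return [w for w in re.findall(r"[a-z0-9]+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, k=2):
    """Return the k highest-scoring sentences as (source_index, sentence).

    Each sentence is scored by the average document-wide frequency of its
    words. Sentences are returned unchanged and in their original order.
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))
    scored = []
    for idx, sentence in enumerate(sentences):
        words = tokenize(sentence)
        score = sum(freq[w] for w in words) / len(words) if words else 0.0
        scored.append((score, idx, sentence))
    # Highest score first; ties broken by earlier position in the source.
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:k]
    return [(idx, sentence) for _, idx, sentence in sorted(top, key=lambda t: t[1])]
```

Because the summariser can only select existing sentences, it cannot introduce a fact that is absent from the source, although it can still select sentences that are unhelpful or unrepresentative.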

There is a common misconception that generative AI products are effective tools for information extraction and summarisation. Whilst one can provide text from a document to such a tool and it will likely summarise the content (fairly accurately), this does not change the fact that generative AI tools are not necessarily designed for precise accuracy and are prone to hallucinations and data biases. Where companies rely on the accuracy of information, certain use cases, such as the extraction of specific information, are better suited to extractive AI products.

Benefits of extractive AI

The benefits of extractive AI when compared against generative AI include, inter alia:

higher levels of accuracy: extractive AI products are focused on the extraction of information in a reliable and accurate format, as opposed to generative AI products, which are focused on content generation (and less so on the accuracy of that content);
greater transparency and auditability: the underlying data sources can be traced and accounted for. This greater level of transparency can assist enterprises in complying with any audit requirements imposed on them;
more robust security and data protection: extractive AI products are likely to be ringfenced to enterprise data, insulating personal, sensitive, and commercial data;
customisability: extractive AI products will be implemented within a specific enterprise environment and fine-tuned to the enterprise's data. The enterprise will have greater control over the information shared with the extractive AI product, especially when compared to employees inputting sensitive information into generative AI products; and
ethical considerations: less scope for job displacement as the outputs from extractive AI products are likely to be incorporated into human workflows and deliverables.

Therefore, the key benefit of corporate use of extractive AI products is the accuracy and precision with which they can identify and extract relevant information within specific datasets or documents. However, extractive AI does require a greater degree of implementation and potential customisation in order to fit within an enterprise's environment. Furthermore, enterprises will need to ensure that they have high-quality training and testing data available to train the extractive AI product. Once the extractive AI product is deployed, enterprises will need to maintain high standards of information quality, completeness, and accuracy throughout their organisation (as inaccurate or incomplete information can impact the performance of the extractive AI product).

It is clear that where an enterprise's use case depends more on the extraction of accurate and reliable information and data than on generating new content, the use case falls within the scope of extractive AI. The misconception that generative AI is suited to such use cases, which is largely attributable to the growing popularity of generative AI products, causes tunnel vision on procuring and deploying generative AI products when extractive AI may be more suitable to the enterprise's use case. This could result in a low ROI due to a fundamental disconnect between the use case and the procured AI product.

Furthermore, one should not view these AI products in isolation, as an enterprise's use case could require both generative and extractive functionality. An example of a potential application of both would be where extractive AI is used to extract enterprise information (with high levels of accuracy) and this extracted information is then fed into a generative AI product to generate specific content based on it.
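A combined workflow of this kind can be sketched as follows. This is an illustrative outline only: the invoice layout, the field names, the regular expressions, and the functions extract_fields and build_grounded_prompt are all hypothetical, and no generative model is actually called. The sketch shows an extractive step that copies values verbatim from a document, followed by the construction of a prompt that restricts a downstream generative tool to those extracted values.

```python
import re


def extract_fields(text):
    """Extractive step: copy values verbatim from the source text.

    The patterns assume a simple, hypothetical invoice layout.
    """
    patterns = {
        "invoice_number": r"Invoice\s+No\.?\s*:?\s*([A-Z0-9-]+)",
        "amount_due": r"Amount\s+due\s*:?\s*([A-Z]{1,3}\s?[\d,]+(?:\.\d{2})?)",
        "due_date": r"Due\s+date\s*:?\s*(\d{4}-\d{2}-\d{2})",
    }
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            # Missing fields are left out rather than guessed.
            fields[name] = match.group(1)
    return fields


def build_grounded_prompt(fields, task):
    """Generative step input: a prompt limited to the extracted facts.

    The returned string would be passed to whichever generative AI
    product the enterprise has procured; that call is not shown here.
    """
    facts = "\n".join(f"- {key}: {value}" for key, value in sorted(fields.items()))
    return (
        f"{task}\n"
        "Use only the facts listed below; do not add other figures or dates.\n"
        f"{facts}"
    )


# Example usage with a hypothetical document.
document = "Invoice No: INV-2041\nAmount due: R 12,500.00\nDue date: 2024-07-31"
fields = extract_fields(document)
prompt = build_grounded_prompt(fields, "Draft a polite payment reminder email.")
```

Keeping the two steps separate means the extracted values can be reviewed or audited before any content is generated from them.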

Therefore, companies should consider whether their AI use case falls within (i) generative AI; (ii) extractive AI; or (iii) a combination of both. Only after this step has been completed should companies engage in vendor selection processes.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
