LLM Use Cases = 80% data + 20% LLMs.
Modelling data in a way your LLM will understand.
We've been in the field of Natural Language Processing since long before ChatGPT, and we pioneered open-source data-centric AI. From that experience we know that reliable AI models are built on reliable data. That's why we offer the industry's leading data management capabilities.
Why do LLMs fail?
To understand how to make LLMs more reliable and trustworthy, we first need to understand why they fail.
Example question
I have the following car damage: [...], is it covered by my insurance?
First, you need to tell me how many pets you have.
- Mix-up of data
- Since animals often cause damage to cars, the semantic search used by the AI retrieves the wrong data to answer your question and gets confused, resulting in "funny" answers.
- Wrong data model
- Your AI model identifies only 2 of the 5 documents relevant to your question, leading to incomplete answers (e.g. answering from the base tariff even though a special tariff also applies).
- Blackbox answers
- Your AI model answers your question, but you cannot tell why it answered the way it did.
- Errors in your data
- A lot of your data lives in unstructured documents that are not machine-readable. This introduces errors into your data and thus leads to wrong answers.
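The first failure mode above can be sketched with a toy retrieval example. This is a deliberately crude stand-in: `embed` uses bag-of-words counts instead of a real embedding model, and both policy clauses are invented for illustration. It shows how pure similarity search, with no data model behind it, can rank an animal-related clause above the relevant car clause just because they share words like "damage" and "covered".

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Two invented policy clauses; only the first is relevant to the question.
car_clause = "collision damage to the insured car is covered under the base tariff"
animal_clause = "damage by animals is covered damage to property and damage to contents"

query = "is my car damage covered"
ranked = sorted([car_clause, animal_clause],
                key=lambda d: cosine(embed(query), embed(d)), reverse=True)
# The animal clause outranks the car clause: similarity on shared words
# alone has no notion of which policy section actually applies.
```

With a real embedding model the same effect appears more subtly: semantically adjacent but legally irrelevant clauses crowd out the section that governs the claim.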
Solving these problems makes your AI trustworthy
Piece by piece, we solve the above problems to make your AI trustworthy.
Example question with better answer
I have the following car damage: [...], is it covered by my insurance?
Yes, according to §2.3.4¹ of your insurance policy, your car is covered for damages caused by a collision with another vehicle.
1: §2.3.4 refers to the section in your insurance policy that covers damages caused by a collision with another vehicle.
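The answer above is verifiable because it carries its source. A minimal sketch of that idea, assuming the relevant section has already been retrieved (the `policy` dict and `answer_with_citation` helper are illustrative, not an actual API):

```python
# Invented policy excerpt keyed by section number.
policy = {
    "§2.3.4": "Damage caused by a collision with another vehicle is covered.",
}

def answer_with_citation(section: str, verdict: str) -> dict:
    # Attach the verbatim clause so the user can validate the answer
    # instead of trusting a black box.
    return {
        "answer": f"{verdict}, according to {section} of your insurance policy.",
        "citation": {"section": section, "quote": policy[section]},
    }

result = answer_with_citation("§2.3.4", "Yes, your damage is covered")
```

The design point is that the citation is a verbatim quote, not a paraphrase, so a claims handler can check it against the policy text directly.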
- Modelling a mindmap-like structure
- With our approach, the underlying data sources are first modeled in a mindmap-like structure to highlight the relationships between the data.
- Inferring filters from the question
- We apply customizable intent AI and psychology AI to infer what the user is looking for and filter the data accordingly via the mindmap-like structure.
- Quoting the data
- Given the filtered data, we use a language model to quote the data in a way that is understandable to the user and can be validated.
- Ongoing, automated data quality checks
- We use a data quality AI to ensure that your data is up to date and correct - always.