Our Story

The Origins of Kim

From the first idea to a productive AI of our own, it was a long journey. Kim did not emerge from simply connecting OFORK to external services, but from the clear goal of building an independent, technically robust and credible AI solution for OFORK.

Why We Developed Kim in the First Place

The idea behind Kim was never to connect OFORK to an external AI service and present it as our own solution. Our approach was different from the beginning: we wanted an AI that truly fits OFORK, is technically transparent, and is built on a foundation that we ourselves can understand, develop further and take responsibility for.

This very standard made the path longer, more demanding and technically more complex. But it is also why Kim is now more than a mere front end for someone else’s services.

What Mattered to Us

  • building an independent AI for OFORK
  • not marketing a mere interface to large third-party providers
  • keeping control over technology, training and further development
  • creating a solution that is actually useful in everyday support work
  • taking data security and credibility seriously

The Beginning

The Search for the Right Foundation

At the beginning, the key question was which base model would be suitable. With OpenHermes-2.5-Mistral-7B we found an initial technical foundation. It quickly became clear, however, that base models in this form are primarily intended for GPU environments.

Our goal, however, was a solution that also makes sense for customers who do not necessarily operate a large GPU infrastructure. That made one thing clear very early on: we needed to find a path that is both powerful and flexible in deployment.

September 2024

The First Stable CPU Tests

  • Kim ran stably on a CPU server for the first time – still slow, but functional.
  • However, this basis was not yet sufficient for productive use.
  • Parallel processing, training effort and higher performance requirements made a move to GPU servers necessary.
  • An important factor was a server location in Europe, in line with our requirements for security and availability.
  • With our long-standing provider, we ultimately found a suitable solution.

October 2024

The Right Hardware

Productive GPU Server:

  • AMD EPYC™ 7313P
  • Zen 3 (Milan)
  • 16 C / 32 T
  • 3.0–3.7 GHz
  • 128 GB DDR4 ECC
  • 960 GB NVMe SSD (2 × 960 GB, hardware RAID 1)
  • NVIDIA® A10 GPU

The purchase price for such a server is around €7,000 to €15,000. On this foundation, Kim runs fast and can process many parallel requests.

Additional Test Server

CPU Operation with Smaller Hardware

Test Server with 32 GB:

  • IX6-32 NVMe
  • Intel® Xeon® E-2356G
  • Rocket Lake
  • 6 C / 12 T | 3.2–5.0 GHz
  • 32 GB DDR4 ECC
  • 512 GB NVMe SSD (2 × 512 GB, software RAID 1)

Kim also runs quickly on this server, although it can handle parallel requests only to a limited extent.

November 2024

Training Began

  • the actual training of Kim began
  • many training parameters were interdependent
  • time-consuming tests and constant fine-tuning were required

February 2025

First Real Signs of Learning

  • for the first time, Kim showed behavior that had actually been learned
  • the results were encouraging, but not yet satisfactory
  • datasets, training parameters and prompts had to work together precisely
  • prompts became an important building block for quality

Summer 2025

Refinement and Clarity

  • our understanding of AI training had grown significantly over the preceding months
  • the refinement of our own AI began
  • at the same time, it became increasingly clear to us how many vendors in the market simply relabel third-party AI services
  • that is exactly why our focus on independence remained central

Technical Facts About Kim

  • Our base model is called Llama-3.1-8B-Instruct.
  • The model has 8 billion parameters.
  • It answers questions, supports spell checking and works in multiple languages.
  • Our trained model is called Llama-OFORK.
  • The CPU version for servers with at least 32 GB of RAM is the quantized build Llama-OFORK.Q4_K_M.gguf.
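
For illustration, here is a minimal sketch of how the CPU build could be loaded with llama-cpp-python. The runtime choice and all parameter values are illustrative assumptions; only the model file name comes from this page.

    # Minimal sketch: loading the quantized CPU build with llama-cpp-python.
    # Library choice and parameter values are illustrative assumptions.
    from llama_cpp import Llama

    llm = Llama(
        model_path="Llama-OFORK.Q4_K_M.gguf",  # CPU build named above
        n_ctx=4096,     # context window, illustrative value
        n_threads=8,    # match the available CPU cores
    )

    reply = llm.create_chat_completion(
        messages=[{"role": "user", "content": "How do I merge two tickets in OFORK?"}],
        max_tokens=256,
    )
    print(reply["choices"][0]["message"]["content"])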

Qdrant and Infrastructure

  • In addition, we use Qdrant as a RAG component.
  • This improves Kim’s answers and supports ticket search.
  • Qdrant also searches attachments and runs locally on the same server where OFORK is installed.
  • Questions in the Kim chat and spell-check requests are sent to our GPU server.
  • Kim runs in production there – without permanently storing the content.
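
For illustration, a rough sketch of the retrieval step against the local Qdrant instance follows. The collection name, vector size and payload field are hypothetical; the actual collection layout and embedding model are not described here.

    # Rough sketch of the RAG lookup against the local Qdrant instance.
    # Collection name, vector size and payload field are illustrative assumptions.
    from qdrant_client import QdrantClient

    client = QdrantClient(host="localhost", port=6333)  # Qdrant runs on the OFORK server

    # In production the user's question is embedded first; a placeholder vector stands in here.
    query_vector = [0.0] * 384  # dimensionality depends on the embedding model actually used

    hits = client.search(
        collection_name="ofork_tickets",  # hypothetical collection name
        query_vector=query_vector,
        limit=5,
    )

    # The retrieved ticket passages are added to the prompt that is sent to the GPU server.
    context = "\n".join(hit.payload.get("text", "") for hit in hits)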

Kim Was Built as an Independent Solution Out of Conviction

The journey was technically demanding and much more complex than simply connecting to external AI services. That is exactly why Kim now stands for something that matters to us: independence, transparency and a solution that truly fits OFORK.
