N8N and Qdrant Vector Store: A Tutorial on Retrieval-Augmented Generation

By Iqbal Ali
Updated February 19, 2026

How do you create effective workflows that deal with large volumes of user text data? Say, for example, you want an LLM to analyse large volumes of user feedback for insights as part of a workflow.

In this instance, what you need to learn is a process called Retrieval-Augmented Generation (RAG).

The problem that RAG solves

A typical approach to prompting with LLMs is to throw user feedback into a chat interface, along with your prompt, like this:

Typical prompt, stuffing user feedback into the prompt window

On the face of it, this should scale well, especially as models like Gemini claim to be able to handle the equivalent of all seven Harry Potter books.

Gemini promises it can cope with entire 7 volumes of Harry Potter as context

But the truth is a little more complicated because, in real-world practical usage, Gemini is optimal for only three pages. That’s a heck of a difference, right?

Gemini is optimal for 3 pages of a single book

One way to fix this problem is to send only relevant data to the LLM for processing. Wouldn’t it be nice to retrieve relevant text that matches some search criteria? Not a strict match, like in a traditional database, but a database that is able to handle semantic similarities.

Vector storage databases

This is where a vector storage database comes in. You have your texts (or documents), and these get broken into chunks. Those chunks are then converted into embeddings, which are numerical representations of the text. The embeddings are stored in a database you can search using natural language.

Documents broken into chunks, converted to embeddings, and stored in vector storage
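To make the chunking step a little more concrete, here’s a minimal sketch in Python. It assumes simple fixed-size character windows with a small overlap; the function name chunk_text and its parameters are made up for illustration, and real splitters usually work on sentences or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (illustrative only)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new window moves forward
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Tiny example so the overlap is easy to see.
example = chunk_text("abcdefghij", chunk_size=4, overlap=2)
# example == ["abcd", "cdef", "efgh", "ghij"]
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.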

Once these text chunks are stored, you can then search using natural language:

1) user searches, 2) search goes into vector store, 3) returns text chunks with similarity scores

For example, you might search for text matching “checkout or payment-related feedback”, and you would get returned something like this:

Returned text. Text and similarity scores. 1.0 is exact match, 0.0 is all documents

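Searching for that text will retrieve the text chunks, along with a “similarity score” stating how closely each chunk matches the search term. A score of 1.0 is a direct match, while a score close to 0.0 means the chunk has almost nothing in common with the search. Setting the minimum score to 0.0 would therefore return everything.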
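If you’re curious where a similarity score comes from, here’s a rough sketch. It assumes cosine similarity (one of the distance metrics Qdrant offers) and uses hand-made three-number vectors in place of real embeddings, so the vectors, texts, and function names are all illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks, top_k=3):
    """Return the top_k (score, text) pairs, highest score first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Made-up vectors standing in for real embeddings.
stored = [
    ("Checkout button did not respond", [0.9, 0.1, 0.0]),
    ("Love the new dashboard colours", [0.0, 0.2, 0.9]),
    ("Payment failed twice with my card", [0.8, 0.3, 0.1]),
]
query = [1.0, 0.2, 0.0]  # pretend embedding of "checkout or payment-related feedback"
top_two = rank_chunks(query, stored, top_k=2)
# The checkout and payment chunks rank above the dashboard one.
```

A vector store does the same kind of comparison, only with far longer vectors and an index that avoids checking every stored chunk.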

Having access to a tool like this improves our ability to work with large volumes of text efficiently. So, how do we do this?

Step 1: Install required components

There are two things we need to install. Looking at our diagram again, the following is what we need.

What we need. An embedding model and vector database

We need an embedding model to convert our text to numbers, and we need a vector storage database. Let’s start with the embedding model.

Install the embedding model

As mentioned, an embedding model is a special kind of model that converts text into lists of numbers. The numbers capture relationships between pieces of text, and the length of each list is the model’s number of “dimensions”. More dimensions let the model capture finer distinctions.

The model we’ll use is the Qwen3 embedding model. We’ve been using Qwen3 for the other stuff, so why not? We can install the embedding model into our Ollama instance.

Just head over to ollama.com and search for “qwen3 embedding”:

Ollama search for qwen embedding

The one we’re interested in is the 0.6b variant. It’s small enough for our job. Click on “0.6b” and copy the name to use in Ollama.

Ollama website: Copy model name for qwen3-embedding:0.6b

Now, before we go ahead and pull this model, let’s find out a bit more about it. Scroll down and we see that it states that the embedding dimensions supported are up to 4096 dimensions.

Ollama site says 4096 for embedding dimensions

This figure applies to the 8B version of the model, not the 0.6B one. So, here’s a quick tip for you: always check the model details on Hugging Face. Head to huggingface.co and search for “Qwen3 embedding 0.6b” to get the model-specific details.

Hugging Face is more reliable, identifying 0.6B as 1024 dimensions

The Qwen3 0.6b model actually supports up to 1024 dimensions. This will be useful for us later as we do our setup. But just know that Hugging Face tends to be more reliable for this sort of information, from my experience.

Now, let’s pull this model into our Ollama server. Open the terminal or command line and type:

ollama pull [pasted name]

So, for my example, it’s:

ollama pull qwen3-embedding:0.6b


And that’s it! Our model is pulled and ready to use. Let’s now install our Qdrant vector store.
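If you’d like to double-check the dimension count yourself before moving on, here’s a small sketch that asks Ollama for one embedding and counts the numbers. It assumes Ollama is listening on its default port (11434) and uses its /api/embed endpoint; the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_EMBED_URL = "http://localhost:11434/api/embed"


def build_embed_request(text: str, model: str = "qwen3-embedding:0.6b") -> urllib.request.Request:
    """Prepare (but do not send) an embedding request for one piece of text."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_EMBED_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embedding_dimension(response: dict) -> int:
    """Length of the first embedding in an /api/embed response."""
    return len(response["embeddings"][0])


def fetch_dimension(text: str = "hello") -> int:
    """Send the request and report how many numbers come back."""
    with urllib.request.urlopen(build_embed_request(text)) as resp:
        return embedding_dimension(json.load(resp))


# With Ollama running and the model pulled, call:
#   fetch_dimension()
```

Whatever number this returns is the one to enter when setting up the collection.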

Install Qdrant Vector Store

There are a number of vector storage options out there. I’m going to use one called Qdrant just because I’m more familiar with that one. We install Qdrant using Docker.

In Docker Desktop, search for “qdrant” and install the image with the highest rating and download count (this should appear at the top).

Docker Desktop. Search for Qdrant

When you hit Run, in the pop-up, select “Optional settings” and make sure the port numbers match. You’ll need to copy the number from the right and paste it into the “host port” field. You can optionally give the container a name. Then hit Run again.

Once Docker does its magic, it’ll give you an address to access the web UI.

Copy the URL for the dashboard: http://localhost:6333/dashboard

This should be http://localhost:6333/dashboard. Clicking the link will open the dashboard. Qdrant allows us to create multiple databases (or “collections”). We need to create a collection that is set up for our embedding model.

Click “Create collection”, then give it a simple name. I don’t use spaces, just a habit of mine. So, perhaps pick a name and use dashes instead of spaces. Example settings:

  • Collection name: my-test-collection
  • What’s the use case? Global search
  • What to use for search? Simple Single embedding
  • Choose dimensions: 1024 (what our embedding model supports)

Qdrant collection settings screen

Now, we should have our database ready to populate.
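If you’d rather script this than click through the dashboard, a similar collection can be created through Qdrant’s REST API. This is a sketch, not what the dashboard does under the hood: the “Cosine” distance is my assumption (pick whichever metric you chose in the UI), and the helper name is made up.

```python
import json
import urllib.request

# Assumption: Qdrant is reachable on the same port as the dashboard.
QDRANT_URL = "http://localhost:6333"


def build_create_collection_request(
    name: str, size: int = 1024, distance: str = "Cosine"
) -> urllib.request.Request:
    """Prepare (but do not send) a PUT /collections/<name> request."""
    body = json.dumps({"vectors": {"size": size, "distance": distance}}).encode("utf-8")
    return urllib.request.Request(
        f"{QDRANT_URL}/collections/{name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_collection_request("my-test-collection")
# With Qdrant running, send it with:
#   urllib.request.urlopen(request)
```

The size has to match the embedding model’s dimensions (1024 here), otherwise Qdrant will reject the vectors we try to store later.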

Step 2: Set up your n8n workflow

We previously created a workflow to fetch user feedback text data from Google Sheets and split out the rows, ready for processing:

Previous workflow fetching sheets data. Now we need to feed data into the vector store

Our goal is to insert each row of user feedback into the vector store, so we can conduct searches on that text data. But first, since we’re just testing this out, let’s put a limit on the data we fetch from Google Sheets, just to make it easier and faster to develop.

Insert Limit node between Get row(s) in sheet and Split Out

Add a “Limit” node between the Google Sheets node and Split Out:

Set Limit node to 3 max items

We can set the Max items to 3. So, then our workflow looks like this.

Completed workflow with inserted limit

Next, we want to add our main Qdrant node. The node we want is “Qdrant Vector Store”, and then “Add documents to vector store”.

Add Qdrant Vector Store node

For settings, we don’t need an API key; we just need to enter the Qdrant URL.

Qdrant connection: http://host.docker.internal:6333

We’re using http://host.docker.internal:6333 because n8n is running inside Docker, where “localhost” refers to the n8n container itself. host.docker.internal tells n8n to look at the Docker host (our local machine), and 6333 is the port Qdrant is listening on.

Select the collection from the list

We also need to select the Qdrant collection we’re going to be saving our data to. If we execute, we’ll get an error, because there are two subnodes that we need to add:

Anatomy of Qdrant Vector Store node. Need to add embedding and document

For the embedding, pick the “Embeddings Ollama” subnode, and pick your embedding model (i.e., “qwen3-embedding:0.6b”).

Embeddings Ollama subnode. Select the embedding model

And for the “Document” subnode, pick “Default Data Loader” and stick with the defaults:

Document loader node. Leave as default settings

Our final workflow should look like this:

Final workflow with data flowing into Qdrant

And when we run the workflow, and head over to our Qdrant dashboard, we should see some documents. Yay!

Google Sheet data is going into the database!
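For a feel of what each stored document looks like, here’s a rough sketch of the body you would send to Qdrant’s PUT /collections/<name>/points endpoint. The “content” and “metadata” payload keys are my assumption about how the n8n node lays things out, and build_points is a made-up helper, so check a point in your own dashboard before relying on those names.

```python
import uuid


def build_points(texts, vectors, expected_size=1024):
    """Pair each text with its embedding in Qdrant's upsert format."""
    if len(texts) != len(vectors):
        raise ValueError("need exactly one vector per text")

    points = []
    for text, vector in zip(texts, vectors):
        # The vector length must match the collection's dimensions.
        if len(vector) != expected_size:
            raise ValueError(f"expected {expected_size} dimensions, got {len(vector)}")
        points.append(
            {
                "id": str(uuid.uuid4()),  # Qdrant accepts UUIDs or unsigned integers
                "vector": list(vector),
                # Assumed payload layout: the text plus any extra fields.
                "payload": {"content": text, "metadata": {}},
            }
        )
    return {"points": points}


# Tiny 4-dimension example so it stays readable.
body = build_points(
    ["Checkout was slow", "Great support team"],
    [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]],
    expected_size=4,
)
```

The dimension check mirrors what Qdrant enforces itself: vectors of the wrong length are rejected, which is why we set the collection to 1024.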

Now that we’ve validated it works, we can remove the limit and feed all the data into Qdrant. Then we’re ready for the fun part: searching our database.

Step 3: Search the vector store

Here is what we build in order to fetch items from the vector store.

Chat trigger with another Qdrant Vector Store node, sharing the Embeddings Ollama subnode

In short, we create a new Chat trigger, add a new Qdrant node, this time with the action: “Get ranked documents from vector store”. Then we share the embeddings subnode.

Here’s the Qdrant action node for retrieval:

Use Qdrant Vector Store → Get ranked documents from vector store

For the embedding subnode, we can just share the Embeddings Ollama we set up for our previous Qdrant node. It’s just easier, as they both need to be identical anyway.

Sharing the Embeddings Ollama subnode, as they both need to be the same

Now, trigger the chat with something, like a simple “my prompt”, then edit the retrieval Qdrant node, using the chat input as the prompt (drag your message into the prompt field). You can also set the number of items you want Qdrant to return.

Qdrant retrieve data node screen. Drag ‘chatInput’ to the prompt field. We’ll see text and similarity scores as output

Once you execute, you’ll get an output with the text matching the search, along with the similarity score! Obviously, in the example above, the score is low as we were using some placeholder text as an example.
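Behind the scenes, a retrieval like this boils down to embedding the search text and asking Qdrant for the nearest stored vectors. Here’s a rough sketch of that second half using Qdrant’s REST query endpoint. It assumes Qdrant 1.10 or later (where /points/query is available), and the helper names and sample response are made up for illustration.

```python
import json
import urllib.request

# Assumption: Qdrant is reachable on its default port from where this runs.
QDRANT_URL = "http://localhost:6333"


def build_query_request(collection: str, query_vector, limit: int = 3) -> urllib.request.Request:
    """Prepare (but do not send) a nearest-neighbour query."""
    body = json.dumps(
        {"query": list(query_vector), "limit": limit, "with_payload": True}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{QDRANT_URL}/collections/{collection}/points/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_hits(response: dict) -> list[tuple[float, dict]]:
    """Pull (score, payload) pairs out of a query response."""
    return [
        (point["score"], point.get("payload") or {})
        for point in response["result"]["points"]
    ]


# Hypothetical response, shaped like what the query endpoint returns.
sample_response = {
    "result": {
        "points": [
            {"id": 1, "score": 0.82, "payload": {"content": "Payment failed at checkout"}},
            {"id": 2, "score": 0.41, "payload": {"content": "Nice onboarding emails"}},
        ]
    }
}
hits = extract_hits(sample_response)
```

The query vector has to come from the same embedding model used when storing the documents, which is why sharing the Embeddings Ollama subnode between both Qdrant nodes makes sense.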

Summary

And there we have it! In this guide, we’ve tackled one of the most powerful concepts in modern AI workflows: Retrieval-Augmented Generation (RAG). We’ve explored how to create effective workflows that handle large volumes of user text data. We’ve tackled the problem of scaling LLM prompts by using a vector storage database to retrieve relevant text based on semantic similarity. It’s kind of like a smart library of all our user feedback.

Come back next time to build on these tools and techniques.

Editor’s note: This guide is part of a broader series on building practical AI systems. To understand the full workflow progression, explore our guides on getting started with AI automation in n8n, building your first AI agent, connecting Google Sheets to n8n, extracting themes from user feedback with n8n, quantifying feedback themes, and connecting chat interfaces to real tools.

Originally published December 10, 2025 - Updated February 19, 2026
Written By
Iqbal Ali
Experimentation consultant and coach.
Edited By
Carmen Apostu
Content strategist and growth lead. 1M+ words edited and counting.