Vector search index contains incorrect number of rows

Ensure that the Delta table has a unique primary key.

Written by brock.baurer

Last published at: September 12th, 2024

Problem

Your vector search index contains fewer rows than the source Delta table it was created from.

Example

You upload your data, stored across several spreadsheets, to a Unity Catalog volume. LangChain parses the data out of each spreadsheet, and Apache Spark loads each individual record into a Delta table.

Your Delta table contains 475 rows and two columns. However, when you create a new vector search index from it, the resulting index contains only six rows instead of the expected 475.

Cause

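The vector search index requires a primary key column whose values are unique. If the column selected as the primary key contains duplicate values, rows that share a value are treated as the same record, so the index ends up with one row per distinct key value instead of one row per source row.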
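The effect can be illustrated with a toy model in plain Python. This is a sketch of the behavior, not the actual vector search implementation: it assumes rows with the same key overwrite one another, and the `build_index` function, column names, and data are all hypothetical.

```python
# Toy model of an index keyed on a primary key column.
# Assumption: rows sharing a key overwrite one another (upsert-style behavior).

def build_index(rows, primary_key):
    """Return a dict mapping each primary key value to the last row seen with it."""
    index = {}
    for row in rows:
        index[row[primary_key]] = row  # a repeated key replaces the earlier row
    return index

# Hypothetical data: 475 records whose "source" column has only six distinct values.
rows = [{"source": f"sheet_{i % 6}", "text": f"record {i}"} for i in range(475)]

index = build_index(rows, primary_key="source")
print(len(rows), len(index))  # prints "475 6"
```

In this model, choosing `source` as the key leaves six entries because that column has only six distinct values, which mirrors the row counts in the example.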

Solution

Ensure that the source Delta table already has a column with unique values, or add a new unique column before you create the index. Select that column as the primary key when you create the index.
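One possible way to do this is to check the parsed records for repeated key values and attach a generated ID before loading them into the Delta table. The following is a minimal sketch in plain Python, assuming the records are available as a list of dictionaries. The helper names `duplicate_keys` and `with_unique_ids`, the `id` column, and the sample data are hypothetical.

```python
import uuid
from collections import Counter

def duplicate_keys(records, key):
    """Return the key values that appear on more than one record, with their counts."""
    counts = Counter(r[key] for r in records)
    return {value: n for value, n in counts.items() if n > 1}

def with_unique_ids(records, id_column="id"):
    """Return copies of the records with a random UUID string added in id_column."""
    return [{**r, id_column: str(uuid.uuid4())} for r in records]

# Hypothetical parsed records: 475 rows, but only six distinct "source" values.
records = [{"source": f"sheet_{i % 6}", "text": f"record {i}"} for i in range(475)]

# "source" is not usable as a primary key: every one of its values repeats.
print(len(duplicate_keys(records, "source")))  # prints 6

# After adding a generated ID, no key value repeats.
keyed = with_unique_ids(records)
print(duplicate_keys(keyed, "id"))  # prints {}
```

The keyed records can then be loaded into the Delta table as before, with the new `id` column chosen as the primary key at index creation. For a table that already exists, Spark's `monotonically_increasing_id()` function in `pyspark.sql.functions` is another way to add a column of unique values.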

For more information on creating a vector search index, please review the How to create and query a Vector Search index (AWS | Azure) and Mosaic AI Vector Search (AWS | Azure) documentation.
