Tuesday, April 16, 2024

Me and ChromaDB a Series! - How Do I Backup and Restore ChromaDB?

Hi Friends,

Thanks for continuing to go with me on my ChromaDB journey.  

If you haven't had a chance to read the first two parts, here they are:

Me and ChromaDB - A Series Maybe... Part One - A.I. is Your Friend!

Today I've got a great topic, the one that started me on this ChromaDB, AI, LLM, vector database journey: how do I back up and restore my AI database?

Humans are funny: we leap before we look, and it tends to get us into trouble.  Case in point, new application technology.  I love it, you love it, we all love new toys!  Think of backup and recovery as the batteries for your new toy.  A lot of times we're so excited about the new toy that we forget the boring essentials like batteries.

Unfortunately backup and recovery is often seen as a boring essential, until the data is needed and then you're the most important person in the world.  

Inevitably a test application becomes production overnight, and now it's your responsibility to protect that data.

So let's get ahead of the curve with this new application data and figure out how to back up and recover your new application before it becomes production!

With ChromaDB you have a few options.  You can:

1. Run memory resident.

2. Create a persistent data file.

3. Run in client/server mode.

I'm currently running my ChromaDB with persistence and writing to a data directory so if something happens all my data will be saved.  If you don't specify what type of relational database you want to use for persistence, ChromaDB will use SQLite >3.35 as the default database.
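If you want to confirm that the SQLite library your Python is linked against meets that minimum, here's a minimal sketch using only the standard library.  The helper name sqlite_is_new_enough is mine, not part of ChromaDB, and treating 3.35.0 as the floor is my assumption about how to read ">3.35".

```python
import sqlite3

# Assumed floor for ChromaDB's default persistence, based on the ">3.35" above.
MIN_SQLITE = (3, 35, 0)

def sqlite_is_new_enough(minimum=MIN_SQLITE):
    """Return True if the SQLite library linked into Python meets the minimum."""
    # sqlite_version_info is a tuple of ints, so tuple comparison works here.
    return sqlite3.sqlite_version_info >= minimum

print("SQLite version:", sqlite3.sqlite_version)
print("New enough:", sqlite_is_new_enough())
```

If this prints False, it's worth sorting that out before you start loading data.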

For my first test I wanted to try ChromaDB out of the box, so I used SQLite.  Make sure you create the filesystem first, and when you tell ChromaDB what client you're going to use, type:

persistentClient = chromadb.PersistentClient(path="/where_your_data_filesystem_is")

That's awesome, I've got my ChromaDB set up, it's persistent, I can query my data, but now what?  There are a bunch of different ways to back up your SQLite database, but if you have Veritas NetBackup it's SUPER simple to integrate this new technology into your enterprise backup and restore process.

The cool thing about this is the SQLite agent is already built right into the NetBackup client software and has been since around 10.2.

Let me guide you through the process.

If you haven't downloaded and installed the Veritas NetBackup Client software on your ChromaDB box, you'll need to do that now.  You can download the Veritas software from HERE.

1. Log into the NetBackup WebUI and navigate to Protection > Policies and click on the +Add button.



2. There are going to be a lot of choices in the next section, but here are the ones I want you to focus on:
    Attributes:
    a. The name of the Policy.
    b. The Policy Type = DataStore
    c. Policy Storage = An active storage unit that you have available.



    Schedules:
    This schedule will look a little different from your Standard policy since we're going to initiate the backup and restore from the CLI.



    Clients:
    Let's add our ChromaDB database box as the client to our backup.  Click on the +Add button.

Enter the name of the ChromaDB machine.  I like to click on "Detect client operating system".

   Backup Selections:
   Now we'll choose what we want backed up.  Click on the +Add button.

Select the persistent data path you chose earlier when you told ChromaDB what type of client you were using.


Alright, we now have our policy to back up our ChromaDB SQLite database!

Let's kick off a backup from the CLI on the ChromaDB box.  

1. Log into your ChromaDB box where your database resides and navigate to:

/usr/openv/netbackup/bin

2.  Run the following command:
./nbsqlite -o backup -S nameofyourprimaryserver.com -P nameofyourpolicy -s Default-Application-Backup -z 10M -d /data/chroma.sqlite3

Let's break this down:
   -o backup tells NetBackup we're ready to do a backup
   -S put in the FQDN of your NetBackup Primary server.
   -P the name of the policy you just created. For me it is "chromadb2".
   -s this is the name of the schedule you're using.  I'm using the default one.
   -z here's some Linux magic.  You don't need this setting for Windows, but for Linux you need to tell NetBackup how big you want your LVM snapshot to be.  You can set it in KB, MB or GB.
   -d point NetBackup to where your sqlite3 file is located.
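If you'd rather script this than type it every time, here's a rough sketch that assembles the same command from Python.  It only uses the flags shown above; the function name build_backup_command and the idea of wrapping it in Python are my own, so treat it as a starting point rather than anything official from Veritas.

```python
import shlex

def build_backup_command(primary_server, policy, schedule, snapshot_size, db_path,
                         nbsqlite="/usr/openv/netbackup/bin/nbsqlite"):
    """Assemble the nbsqlite backup command as an argument list."""
    return [
        nbsqlite,
        "-o", "backup",        # operation
        "-S", primary_server,  # FQDN of the NetBackup Primary server
        "-P", policy,          # policy created in the WebUI
        "-s", schedule,        # schedule name
        "-z", snapshot_size,   # LVM snapshot size (Linux only)
        "-d", db_path,         # path to the sqlite3 file
    ]

cmd = build_backup_command(
    primary_server="nameofyourprimaryserver.com",
    policy="chromadb2",
    schedule="Default-Application-Backup",
    snapshot_size="10M",
    db_path="/data/chroma.sqlite3",
)
print(shlex.join(cmd))
# To actually run it on the ChromaDB box: subprocess.run(cmd, check=True)
```

Building the command as a list and handing it to subprocess.run keeps you out of shell quoting trouble if a path ever has a space in it.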

Hit Enter and you should see this:
Backup initiated from XBSA ...
The SQLite database backup is in progress...
File backed up:  /SQLite/chroma.sqlite3
SQLite database backup is successful!
Completed the  backup  operation

Now back to the NetBackup WebUI!

Under the Activity Monitor check the status of the backup:


WOO HOO!  You've successfully backed up your ChromaDB SQLite database!



Now let's say we want to do a restore.  First we'll query to see what backup images are available to us:
1. Go back to the CLI of your ChromaDB box and navigate to:
/usr/openv/netbackup/bin

2. Type the following:
./nbsqlite -o query -S nameofyourprimaryserver.com -C thenameofyourchromadbbox -P chromadb2

chromadb2
nameofyourchromabox
==================================================================================

==================================================================================
1713299663      Linux           SQLite            Tue Apr 16 15:34:23 2024
Completed the  query  operation
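If you want to pick the backup images out of that output in a script, here's a small sketch.  It assumes the column layout matches the single sample line above (ID, OS, database type, date), and parse_query_line is a name I made up.  Reading the ID as a Unix timestamp is my interpretation of this one sample: 1713299663 works out to 20:34:23 UTC on April 16, 2024, which lines up with the 15:34:23 shown if the box is five hours behind UTC.

```python
from datetime import datetime, timezone

def parse_query_line(line):
    """Split one image line from the query output into its fields.

    Assumes the layout of the sample above: ID, OS, database type, date.
    Returns None for separator lines, headers, and status messages.
    """
    parts = line.split(None, 3)
    if len(parts) != 4 or not parts[0].isdigit():
        return None
    backup_id, os_name, db_type, when = parts
    return {
        "backup_id": int(backup_id),
        "os": os_name,
        "db_type": db_type,
        "date_text": when.strip(),
    }

sample = "1713299663      Linux           SQLite            Tue Apr 16 15:34:23 2024"
image = parse_query_line(sample)
# Interpreting the ID as seconds since the Unix epoch (an assumption).
as_utc = datetime.fromtimestamp(image["backup_id"], tz=timezone.utc)
print(image["backup_id"], image["db_type"], as_utc.isoformat())
```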


Check it out, we've got a backup that we can restore!

1. Go back to the CLI of your ChromaDB box and navigate to:
/usr/openv/netbackup/bin

2. Enter the following to restore:

./nbsqlite -o restore -S nameofyourprimaryserver.com -t /data/restore

Let's break this down:
   -o restore tells NetBackup you'd like to do a restore.
   -S the name of your NetBackup Primary server.
   -t the directory you want to restore the file to.

Restore initiated from XBSA
The SQLite database restore is in progress
File restored: /data/restore/chroma.sqlite3
SQLite database restore is successful!
Completed the  restore  operation

Let's go check out the NetBackup Activity Monitor to see how the job did there.

Looking good!  Let's go check out our /data/restore folder:



And there's our ChromaDB SQLite backup restored and ready to be used!
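Before pointing anything at a restored file, I like to know it actually opens.  Here's a minimal sketch using Python's built-in sqlite3 module to run SQLite's own integrity check and list the tables.  The check_restored_db helper is my own, and the demo runs against a throwaway database so it works anywhere; on the ChromaDB box you'd pass /data/restore/chroma.sqlite3 instead.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

def check_restored_db(db_path):
    """Open a SQLite file read-only, run integrity_check, and list its tables."""
    # mode=ro avoids modifying the file or creating an empty one by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        result = conn.execute("PRAGMA integrity_check").fetchone()[0]
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    finally:
        conn.close()
    return result == "ok", tables

# Demo against a throwaway database with one made-up table.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.sqlite3")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE demo (id TEXT PRIMARY KEY)")
    conn.commit()
    conn.close()
    ok, tables = check_restored_db(demo_path)
    print("Integrity OK:", ok, "Tables:", tables)
```

This only tells you the file is a healthy SQLite database.  It doesn't prove ChromaDB will be happy with it, so a quick test query through a PersistentClient pointed at the restore directory is still a good idea.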

Hey, that was fun, wasn't it?!

Stay tuned for the next episode where I'm not sure what I'm going to do yet....
