Pinecone, a vector database for scaling AI, is introducing a new bulk import feature to make it easier to ingest large amounts of data into its serverless infrastructure.
According to the company, the feature, now in public preview, is useful when a team wants to import over 100 million records (though it currently has a 200 million record limit), onboard a known or new tenant, or migrate production workloads from another provider into Pinecone.
The company claims that bulk import results in six times lower ingestion costs than comparable upsert-based processes. It costs $1.00/GB; for instance, ingesting 10 million 768-dimensional records costs about $30 with bulk import.
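The $30 figure can be roughly reproduced with some back-of-the-envelope arithmetic. The sketch below assumes each vector component is stored as a 4-byte float32, that a gigabyte is 10^9 bytes, and that record IDs and metadata add nothing to the billed size; none of these assumptions comes from Pinecone, and the function name is purely illustrative.

```python
# Rough cost estimate for a bulk import, under assumptions that are NOT
# confirmed by Pinecone: float32 components, decimal gigabytes, and no
# overhead for record IDs or metadata.
BYTES_PER_FLOAT32 = 4
PRICE_PER_GB_USD = 1.00  # bulk import price quoted in the article


def estimate_bulk_import_cost(num_records, dimension, price_per_gb=PRICE_PER_GB_USD):
    """Return the estimated import cost in US dollars (hypothetical helper)."""
    total_bytes = num_records * dimension * BYTES_PER_FLOAT32
    gigabytes = total_bytes / 1e9
    return gigabytes * price_per_gb


cost = estimate_bulk_import_cost(num_records=10_000_000, dimension=768)
print(f"Estimated cost: ${cost:.2f}")
```

Under these assumptions the raw vectors come to 30.72 GB, which lines up with the roughly $30 the company cites; real bills would also depend on metadata size and file format.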
Because bulk import is an asynchronous, long-running process, customers don’t have to tune performance or monitor the status of their imports; Pinecone takes care of it in the background.
During the import process, data is read from a secure bucket in the customer’s object storage, which gives the customer control over data access, including the ability to revoke Pinecone’s access at any time.
While in public preview, Pinecone is limiting bulk import to writing records into a new serverless namespace, meaning that data cannot currently be imported into existing namespaces. Additionally, bulk import is limited to Amazon S3 for serverless AWS regions, but the company will be adding support for Google Cloud Storage and Azure Blob Storage in a couple of weeks.
Pinecone serverless now GA on Google Cloud, Microsoft Azure
Adding to the existing AWS support, Pinecone serverless is now generally available on both Google Cloud and Microsoft Azure.
Google Cloud support is available in us-central1 (Iowa) and europe-west4 (Netherlands), and Microsoft Azure support is available in eastus2 (Virginia), with additional regions coming soon to both clouds.
This availability also comes with new features in public preview: backups for serverless indexes on all three clouds (available to Standard and Enterprise users) and more granular access controls for the Control Plane and Data Plane, including NoAccess, ReadOnly, and ReadWrite. Pinecone will also add more user roles (Org Owner, Billing Admin, Org Manager, and Org Member) at the Organization and Project levels in a couple of weeks.
“Bringing Pinecone’s serverless vector database to Google Cloud Marketplace will help customers quickly deploy, manage, and grow the platform on Google Cloud’s trusted, global infrastructure,” said Dai Vu, managing director of Marketplace & ISV GTM Programs at Google Cloud. “Pinecone customers can now easily build knowledgeable AI applications securely and at scale as they progress their digital transformation journeys.”