MySQL HeatWave increases ease-of-use for customers with vector store, AutoML and Lakehouse enhancements, JSON, and JavaScript support
Oracle has announced significant enhancements to MySQL HeatWave, including support for vector store, generative AI, new in-database machine learning features, MySQL Autopilot enhancements, new HeatWave Lakehouse capabilities, support for JavaScript, acceleration of JSON queries, and support for new analytic operators. Currently in private preview, the vector store will enable customers to leverage the power of large language models (LLMs) with their proprietary data to get answers that are more accurate than those from models trained only on public data. With generative AI and vector store capabilities, customers can interact with MySQL HeatWave in natural language and efficiently search documents in various file formats in HeatWave Lakehouse.
“Today’s enhancements to MySQL HeatWave are another significant step on our journey to address pressing customer data, analytics, and AI issues,” said Edward Screven, chief corporate architect, Oracle. “We’ve previously added real-time analytics with the best price-performance in the industry, automated machine learning, lakehouse, and multicloud capabilities to HeatWave. Now vector store and generative AI bring the power of LLMs to customers, providing them with an intuitive way to interact with data in their enterprise and get the accurate answers that they need for their business.”
For customers looking to perform analytics, transaction processing, machine learning, and generative AI across a variety of data types and sources, additional capabilities have been added to MySQL HeatWave—for both MySQL-compatible and non-MySQL workloads.
Generative AI and vector store (private preview)
The vector store ingests documents in a variety of formats such as PDF and stores them as embeddings generated via an encoder model. For a given user query, the vector store identifies the most similar documents by performing a similarity search over the stored embeddings and the embedded query. These documents are used to augment the prompt given to the LLM so that it provides a more contextual answer.
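The retrieval flow described above can be illustrated with a minimal sketch. It is not HeatWave's implementation: the `embed` function is a toy bag-of-words stand-in for a real encoder model, and the names `retrieve`, `build_prompt`, and the sample documents are hypothetical, chosen only to show how a similarity search over stored embeddings can select context for an LLM prompt.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an encoder model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, store, k=2):
    """Return the k document texts whose embeddings are closest to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, store, k=2):
    """Augment the user question with the retrieved documents as context."""
    context = "\n".join(retrieve(query, store, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical documents; a real vector store would ingest files such as PDFs.
documents = [
    "Invoices are due within 30 days of receipt",
    "The cafeteria opens at 8 am on weekdays",
    "Late invoices incur a 2 percent fee",
]
store = [(doc, embed(doc)) for doc in documents]

print(build_prompt("when are invoices due", store))
```

In this sketch the unrelated cafeteria document is left out of the prompt, so the LLM would see only the context that is relevant to the question.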
MySQL HeatWave AutoML
MySQL HeatWave provides in-database machine learning with a fully automated pipeline for training models. Customers don’t need to move data to a separate machine learning service; they can easily and securely apply machine learning training, inference, and explanation to data stored inside MySQL HeatWave. The following new capabilities have been added:
- Support for HeatWave Lakehouse: Customers can now leverage HeatWave AutoML for training, inference, and explanations on data in object storage in addition to data in the MySQL database—and use a much wider set of data for machine learning.
- Text column support: Enables customers to perform machine learning tasks—anomaly detection, forecasting, classification, regression, and recommender system—on text columns, further broadening the corpus of data on which customers can leverage HeatWave AutoML.
- Enhanced recommender system: With support for Bayesian Personalized Ranking (BPR), HeatWave AutoML can now consider both implicit feedback (past purchases, browsing behavior) and explicit feedback (ratings, likes) to generate personalized recommendations. As an example, analysts can predict items a user will like, users who will like a specific item, and ratings items will receive.
- Training progress monitor: Customers can now monitor the progress of model training with HeatWave AutoML, allowing them to better manage resources.
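For readers unfamiliar with Bayesian Personalized Ranking, the idea can be sketched in a few lines. This is a generic illustration of the BPR technique, not HeatWave AutoML's implementation: the function names, hyperparameters, and the tiny data set are all hypothetical. Each training triple `(user, preferred_item, other_item)` says the user interacted with the first item but not the second, and the model learns latent factors that score the preferred item higher.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_bpr(triples, n_users, n_items, dim=4, lr=0.1, reg=0.01, epochs=200, seed=0):
    """Fit user and item factors by stochastic gradient ascent on the BPR objective."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, j in triples:
            # x is the score margin between the preferred item i and the other item j.
            x = sum(U[u][f] * (V[i][f] - V[j][f]) for f in range(dim))
            g = 1.0 - sigmoid(x)  # derivative of ln(sigmoid(x)) with respect to x
            for f in range(dim):
                uf, vi, vj = U[u][f], V[i][f], V[j][f]
                U[u][f] += lr * (g * (vi - vj) - reg * uf)
                V[i][f] += lr * (g * uf - reg * vi)
                V[j][f] += lr * (-g * uf - reg * vj)
    return U, V


def score(U, V, u, i):
    """Predicted preference of user u for item i."""
    return sum(a * b for a, b in zip(U[u], V[i]))


# Hypothetical implicit feedback: user 0 chose item 0, user 1 chose item 2.
triples = [(0, 0, 1), (0, 0, 2), (1, 2, 0), (1, 2, 1)]
U, V = train_bpr(triples, n_users=2, n_items=3)

ranking_for_user_0 = sorted(range(3), key=lambda i: score(U, V, 0, i), reverse=True)
print(ranking_for_user_0)
```

Ranking items by `score` for a given user corresponds to "items a user will like"; ranking users by `score` for a given item corresponds to "users who will like a specific item".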
MySQL Autopilot
MySQL Autopilot is a built-in capability of MySQL HeatWave that uses machine learning-powered automation to help improve performance and scalability without requiring database tuning expertise. It learns from the execution of queries to improve the execution plan of future queries. The latest enhancements to MySQL Autopilot include:
- MySQL Autopilot indexing (limited availability): Helps customers eliminate the time-consuming tasks of creating optimal indexes for their OLTP workloads and maintaining those over time as workloads evolve. MySQL Autopilot automatically determines the indexes customers should create or drop from their tables to optimize their OLTP throughput, using machine learning to make a prediction based on individual application workloads. In addition, Autopilot indexing predicts the expected improvement with the recommended indexes without creating those indexes and without incurring compute or storage overhead on the users’ tenancy.
- Auto compression: Helps customers determine the optimal compression algorithm for each column, which improves load and query performance with faster data compression and decompression. By reducing memory usage, customers can cut costs by up to 25 percent.
- Adaptive query execution: Helps customers optimize the execution plan of a query after the query has started to execute, improving the performance of ad hoc queries by up to 25 percent. It uses information obtained from the partial execution of the query to adjust data structures and system resources and then independently optimizes query execution for each HeatWave node based on actual data distribution at run time.
- Auto load and unload: Autopilot automatically loads the columns being used in an application workload to HeatWave and automatically unloads tables that were never or rarely queried. This frees up memory and reduces costs without requiring customers to perform these tasks manually.
Additional MySQL HeatWave Enhancements
- JavaScript support (limited availability): Enables customers to write stored procedures and functions in JavaScript and execute them inside MySQL HeatWave. This makes it easier for developers to write rich application logic in JavaScript and get high performance by executing the program inside the MySQL database. The performance of JavaScript applications is improved since data is not transferred from the database to the client and code is just-in-time (JIT) compiled in the GraalVM runtime.
- JSON acceleration: Developers and DBAs can now take advantage of HeatWave for real-time analytics on JSON documents stored in the MySQL database, accelerating queries by orders of magnitude.
- New analytic operators: With support for new analytic operators including CUBE, HyperLogLog, QUALIFY, and TABLESAMPLE, customers can migrate more workloads to MySQL HeatWave.
- Bulk ingest into MySQL HeatWave: Support for parallel building of index sub-trees while bulk loading data from CSV files helps customers achieve a 10X improvement in data ingestion performance over Amazon Aurora. As a result, data can be queried sooner and the system resources used for loading data are freed up much faster, lowering costs for customers.
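As background on one of the analytic operators listed above: in standard SQL, CUBE aggregates over every combination of the grouping columns, including the grand total. The sketch below reproduces that behavior in plain Python for illustration only. It says nothing about HeatWave's syntax or execution, and the `cube` function and sample `sales` rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims`, as a GROUP BY CUBE would."""
    totals = defaultdict(int)
    for row in rows:
        for r in range(len(dims) + 1):
            for kept in combinations(dims, r):
                # Columns outside the current grouping set become None (SQL NULL).
                key = tuple(row[d] if d in kept else None for d in dims)
                totals[key] += row[measure]
    return dict(totals)


# Hypothetical sales rows.
sales = [
    {"region": "EU", "product": "A", "amount": 10},
    {"region": "EU", "product": "B", "amount": 5},
    {"region": "US", "product": "A", "amount": 7},
]
result = cube(sales, ["region", "product"], "amount")

for key, total in sorted(result.items(), key=lambda kv: str(kv[0])):
    print(key, total)
```

With two grouping columns, the result holds per-pair totals, per-region subtotals, per-product subtotals, and one grand total under the key `(None, None)`.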
“The MySQL HeatWave engineering team is clearly doubling down on AI and machine learning innovation,” said Steve McDowell, principal analyst and founding partner, NAND Research. “Not only can customers now train ML models on data both in the database and in object storage with full automation, but with the new generative AI and vector store capabilities they’ll be able to interact with HeatWave in natural language, and they’ll receive accurate answers for their own business purposes only—based on their own enterprise data in addition to publicly available data. The flexibility to use whichever LLMs organizations prefer continues to demonstrate the open and collaborative approach of the MySQL HeatWave engineering team.”
MySQL HeatWave is the only cloud service that provides transaction processing, real-time analytics, machine learning, data lake querying, and machine learning-powered automation within a single MySQL database service. A core part of Oracle’s distributed cloud strategy, MySQL HeatWave is available natively on OCI and Amazon Web Services, as part of the Oracle Database Service for Azure, and in customers’ data centers with OCI Dedicated Region.