Real-time artificial intelligence / machine learning (AI/ML) use cases, such as fraud prevention and recommendation, are on the rise, and feature stores play a key role in deploying them successfully to production. According to the popular open source feature store Feast, one of the most common questions users ask in its community Slack is: how scalable/performant is Feast? This is because the most important characteristic of a feature store for real-time AI/ML is the speed of feature serving from the online store to the ML model for online predictions or scoring.

Successful feature stores meet stringent latency requirements (measured in milliseconds) consistently (think p99) and at scale (up to millions of queries per second, with gigabyte- to terabyte-sized datasets), while at the same time maintaining a low total cost of ownership and high accuracy. As we will see in this post, the choice of an online feature store, as well as the architecture of the feature store, plays an important role in determining how performant and cost-effective it is. It's no wonder that companies often perform thorough benchmarking before choosing their online feature store, to see which choice of architecture or online store is the most performant and cost-effective.

In this post, we will review architectures and benchmarks both from DIY feature stores built by companies successfully deploying real-time AI/ML use cases and from open source and commercial feature stores. Let's first have a look at benchmarking data, and then at the data architecture of the Feast open source feature store.

Feast recently ran a benchmark to compare its feature serving latency when using different online stores (Redis vs. alternatives). It also compared the speed of different mechanisms for extracting the features (e.g., a Java gRPC server, a Python HTTP server, a Lambda function, etc.). You can find the full benchmark setup and its results in this blog post. The bottom line: Feast found it was by far the most performant using the Java gRPC server with Redis as the online store.

In the diagram above you can see an example of how an online mortgage company implemented its lead scoring ranking system using the open-source Feast feature store. As presented by Vitaly Sergey, a Senior Software Engineer at the company, the features are materialized from the offline stores (S3, Snowflake, and Redshift) to the online store (Redis). In addition, features are also ingested into the online store from streaming sources (Kafka topics). Feast recently added support for streaming data sources (in addition to batch data sources), which is currently supported only for Redis. Supporting streaming data sources is very important for real-time AI/ML use cases, as these use cases rely on fresh, live data.

In this lead scoring use case, new leads are ingested continuously throughout the day. The features come from many different sources, and both the entities (the leads) and the features used to score them are updated all the time; thus, the leads get ranked and re-ranked. As soon as there is a new lead, it is ingested and scored by the model, and soon after being ingested into the online store it may need to be re-ranked. Leads expire after 48 hours, and this is implemented in the Redis online store simply by setting a time to live (TTL) of 48 hours, which expires the entity (the lead) and its associated feature vectors. In this way the feature store cleans itself automatically, and no stale entities or features take up valuable online storage.

Another interesting implementation of Feast is the Microsoft Azure Feature Store. It runs on the Azure cloud, optimized for low-latency real-time AI/ML use cases, supporting both batch and streaming sources as well as integration into the Azure Data & AI ecosystem. You can have a look at its architecture here.
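With Feast, pointing the feature store at Redis as the online store is mostly a matter of configuration. A minimal `feature_store.yaml` might look like the following sketch — the project name, registry path, and connection string are illustrative assumptions, not taken from the article:

```yaml
# feature_store.yaml -- minimal sketch; project name and connection
# details are illustrative assumptions.
project: lead_scoring
registry: data/registry.db
provider: local
online_store:
  type: redis
  connection_string: "localhost:6379"
```

Batch sources (e.g., S3, Snowflake, or Redshift tables) are then materialized into the Redis online store with the `feast materialize` command, while streaming sources push fresh feature values in as events arrive.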
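Serving features from the online store at scoring time looks roughly like the following with Feast's Python SDK. This is a sketch, not the mortgage company's actual code: the feature view name (`lead_stats`), the feature names, and the entity key (`lead_id`) are hypothetical, and a real deployment would define its own in its feature repository.

```python
# Sketch of low-latency online feature retrieval with the Feast Python SDK.
# Assumes `pip install feast` and an initialized feature repository; the
# feature and entity names below are illustrative assumptions.

def fetch_lead_features(repo_path: str, lead_id: int) -> dict:
    from feast import FeatureStore  # imported lazily so the sketch stays self-contained

    store = FeatureStore(repo_path=repo_path)
    response = store.get_online_features(
        features=[
            "lead_stats:credit_score",     # hypothetical feature_view:feature
            "lead_stats:num_site_visits",  # hypothetical feature_view:feature
        ],
        entity_rows=[{"lead_id": lead_id}],  # hypothetical entity key
    )
    return response.to_dict()
```

At prediction time, the model server would call this for each incoming lead and feed the returned feature vector to the scoring model — which is exactly the round trip the latency benchmark measures.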
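The 48-hour TTL cleanup described above can be sketched with the redis-py client (HSET plus EXPIRE). This is a minimal illustration under stated assumptions — the `lead:<id>` hash key scheme and the feature fields are made up for the example, and Feast manages its own Redis key layout internally:

```python
# Minimal sketch of TTL-based expiry for lead feature vectors in a Redis
# online store. Key naming ("lead:<id>") and feature fields are
# illustrative assumptions, not Feast's actual storage layout.

LEAD_TTL_SECONDS = 48 * 60 * 60  # leads expire after 48 hours

def store_lead_features(client, lead_id, features):
    """Write a lead's feature vector as a Redis hash and set a 48h TTL,
    so the entity (lead) and its feature vector expire together."""
    key = f"lead:{lead_id}"
    client.hset(key, mapping=features)    # HSET lead:<id> field value ...
    client.expire(key, LEAD_TTL_SECONDS)  # EXPIRE lead:<id> 172800
    return key

# Usage against a real server (requires redis-py and a running Redis):
#   import redis
#   r = redis.Redis(host="localhost", port=6379, decode_responses=True)
#   store_lead_features(r, 1001, {"credit_score": "720", "num_site_visits": "3"})
```

Because the TTL is set on the same key that holds the feature hash, no separate cleanup job is needed: Redis drops the lead and its features together once the 48 hours elapse.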