Articles tagged with "machine-learning"

Embedded Embeddings Database: Building a low cost serverless RAG solution

Retrieval-Augmented Generation (RAG) solutions are an impressive way to talk to one’s data. One of the challenges of RAG solutions is the associated cost, often driven by the vector database. In a previous blog article I presented how to tackle this issue by using Athena with Locality Sensitive Hashing (LSH) as a knowledge database. One the of the main limitations with Athena is the latency and the low number of concurrent queries. In this new blog article, I present a new low-cost serverless solution that makes use of an embedded vector database, SQLite, to achieve a low cost while maintaining high concurrency.

Building a low cost serverless Retrieval-Augmented Generation (RAG) solution

Large language models (LLMs) can generate complex text and solve numerous tasks such as question-answering, information extraction, and text summarization. However, they may suffer from issues such as information gaps or hallucinations. In this blog article, we will explore how to mitigate these issues using Retrieval Augmented Generation (RAG) and build a low-cost solution in the process.

An unsung hero of Amazon SageMaker: Local Mode

Amazon SageMaker offers a highly customizable platform for machine learning at scale. Job execution within Amazon SageMaker can take some time to set up, which can be inconvenient or even time consuming during development and debugging phases. Running training and processing jobs locally can greatly increase the speed of development and debugging before running them at scale on AWS.

Understanding Iterations in Ray RLlib

Recently I’ve been engaged in my first reinforcement learning project using Ray’s RLlib and Sagemaker. I had dabbled in machine learning before, but one of the nice things about this project is that it allows me to dive deep into something unfamiliar. Naturally, that results in some mistakes being made. Today I want to share a bit about my experience in trying to improve the iteration time for the IMPALA algorithm in Ray’s RLlib.