Pondhouse Data Blog
Azure OpenAI Content Filters: The Good, The Bad, and The Workarounds
Azure OpenAI is one of the most widely used AI platforms among enterprise users. The Azure environment is relatively well trusted and provides access to all OpenAI LLMs. By default, Azure offers 'Content Filters' for its AI API. What are they, and how do you use them properly so they do not get in your way?
Crawl4AI Tutorial: Build a Powerful Web Crawler for AI Applications Using Docker
Learn how to set up and use Crawl4AI's web scraping capabilities using Docker. This step-by-step tutorial shows you how to set up, configure, and deploy your first AI-powered web crawler in minutes. Free, open-source, and faster than many paid alternatives.
Late Chunking: Improving RAG Performance with Context-Aware Embeddings
Searching for the right documents and chunks is the most important aspect of a RAG application. However, every chunking method comes with its own drawbacks. With long-context embedding models, we can use 'late chunking', the best chunking method for search quality so far.
DSPy: Build Better AI Systems with Automated Prompt Optimization
Learn how to use DSPy to automate and optimize your LLM prompts. This step-by-step tutorial shows you how to build more reliable AI systems without manual prompt engineering and serves as an introduction to DSPy.
What is LibreChat? The best alternative to Open WebUI and ChatGPT
Looking for an open-source alternative to ChatGPT or Open WebUI? LibreChat offers a versatile, free AI platform that supports models from multiple providers, such as OpenAI and Anthropic, as well as self-hosted ones. It’s fully customizable, secure, and perfect for self-hosting. In this post, learn how LibreChat stands out as a flexible, privacy-focused AI chat solution.
How to create LLM tools from any Python SDK using langchain-autotools
LLMs can use tools to interact with other technology, like databases or APIs. Creating such tools, however, requires matching the semantics of the LLM with those of the tool. Learn here how langchain-autotools makes this very easy.
How to crawl websites for LLMs - using Firecrawl
Learn how to crawl websites and extract data perfectly prepared for LLM usage. We'll introduce Firecrawl, a powerful tool for crawling websites and extracting data from them. We'll also demonstrate how to use Firecrawl in combination with LangChain.
Azure AI Studio: How to evaluate and upgrade your models, using the Prompt Flow SDK
Learn how to effectively evaluate and upgrade your AI models using Azure AI Studio and Prompt Flow, making sure that the new model does not introduce any regressions.
RAG, Embeddings and Vector Search with Google BigQuery
Google BigQuery recently got support for vector embeddings. Learn how to use them and how to create embeddings of your texts without leaving BigQuery. No external systems required.