How do Azure, Databricks, and UbiOps fit together?
In this article, we will analyze some of the drawbacks of Databricks for machine learning – specifically at the deployment stage of a model. Following this, we’ll explain...
View Article
OpenAI vs. open-source LLM: Which model is best for your use case?
Introduction In a recent workshop on Large Language Models (LLMs), we asked attendees which LLMs they’re using or considering for their use cases. It turns out everyone was primarily focused on...
View Article
Deploy Gemma 2B in under 15 minutes for free using UbiOps
What can you get out of this guide? In this guide, we explain how to:
- Create a UbiOps trial account
- Retrieve your Hugging Face token with access to Gemma
- Create your Gemma deployment
- Create a Gemma...
View Article
Revolutionizing AI model serving: unlocking the power of global cloud-based...
18 March 2024, Amsterdam. UbiOps, a leading AI and machine learning deployment and serving platform, is thrilled to announce its strategic partnership with US-based GMI Cloud, a global...
View Article
What is model serving?
Model deployment or model serving designates the stage in which a trained model is brought to production and made readily usable. A model-serving platform allows you to easily deploy and monitor your...
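The serving pattern described above (a trained model exposed behind a request/response interface) can be sketched roughly as follows. This is a minimal illustration only; the model weights, feature format, and handler are hypothetical placeholders, not UbiOps APIs.

```python
# Minimal sketch of model serving: a trained model wrapped behind a
# request/response handler. In a real platform, handle_request would sit
# behind an HTTP endpoint; here we call it directly for illustration.
import json


def predict(features):
    """Stand-in for a trained model: a simple linear scorer with assumed weights."""
    weights = [0.5, -0.2, 1.0]  # hypothetical pre-trained weights
    return sum(w * x for w, x in zip(weights, features))


def handle_request(payload: str) -> str:
    """Serving layer: parse the request, run inference, serialize the result."""
    request = json.loads(payload)
    score = predict(request["features"])
    return json.dumps({"prediction": score})


# A client would send this JSON over HTTP; the serving layer returns JSON back.
response = handle_request('{"features": [1.0, 2.0, 3.0]}')
```

The point of a serving platform is that everything around `predict` (the endpoint, scaling, monitoring) is handled for you, so only the model code itself has to be written.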
View Article
Fine-tune a model on your own documentation
In this article, we will be creating a chatbot which is fine-tuned on custom documentation. We’ll use UbiOps—which is an AI deployment, serving and management platform—to fine-tune and deploy the...
View Article
New UbiOps features April 2024
On the 9th of April 2024, we released new functionality and made improvements to our UbiOps SaaS product. An overview of the changes is given below. Python client library version for this release:...
View Article
Deploy Gemma 7B in under 15 minutes with UbiOps
What can you get out of this guide? In this guide, we explain how to:
- Create a UbiOps trial account
- Create a code environment
- Retrieve your Hugging Face token and accept Google’s license
- Create your...
View Article
How to build a RAG query engine with LlamaIndex and UbiOps
Large Language Models (LLMs) are trained on vast datasets with data sourced from the public internet. But these datasets, of course, do not include specific datapoints regarding your business or use...
View Article
Deploy Llama 3 8B in under 15 minutes using UbiOps
What can you get out of this guide? In this guide, we explain how to:
- Create a UbiOps account
- Create a code environment
- Accept Meta’s license agreement and retrieve your Hugging Face token
- Create...
View Article
How to benchmark and optimize LLM inference performance (for data scientists)
Introduction Optimizing inference is a machine learning (ML) engineer’s task. In a lot of cases, though, it tends to fall into the hands of data scientists. Whether you’re a data scientist deploying...
View Article
How to optimize inference speed using batching, vLLM, and UbiOps
In this guide, we will show you how to increase data throughput for LLMs using batching, specifically by utilizing the vLLM library. We will explain some of the techniques it leverages and show why...
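The core idea behind batching can be sketched without vLLM itself: group queued prompts so that one model call serves many requests at once, instead of running a separate forward pass per prompt. The model function and batch size below are hypothetical placeholders (vLLM additionally does continuous batching and PagedAttention internally, which this toy does not attempt).

```python
# Conceptual sketch of batched inference for throughput: drain a request
# queue in groups, so each "forward pass" handles several prompts at once.
from typing import Callable, List


def fake_batch_model(prompts: List[str]) -> List[str]:
    """Stand-in for an LLM forward pass over a whole batch of prompts."""
    return [f"response to: {p}" for p in prompts]


def serve_batched(queue: List[str],
                  model: Callable[[List[str]], List[str]],
                  batch_size: int = 4) -> List[str]:
    """Process the queue in fixed-size batches instead of one prompt at a time."""
    outputs: List[str] = []
    for i in range(0, len(queue), batch_size):
        outputs.extend(model(queue[i:i + batch_size]))  # one call, many prompts
    return outputs


results = serve_batched([f"prompt {n}" for n in range(10)], fake_batch_model)
```

With a real GPU-backed model, each batched call amortizes the fixed per-call overhead across all prompts in the batch, which is where the throughput gain comes from.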
View Article
Creating a front-end for your Mistral RAG
In a previous article, we showed how you can set up a Retrieval-Augmented Generation (RAG) framework for the Mistral-7B-v0.2 Instruct LLM using the UbiOps WebApp. In this article we’ll go a step...
View Article
Reducing inference costs for GenAI
For users of GenAI models, especially large language models (LLMs), inference remains one of the largest costs of using GenAI for business operations. What is inference? Inference in the GenAI...
View Article
Deploy Mistral 7B v0.3 (Function Calling)
When Mistral released their Mistral 7B v0.2 model, it was claimed to be the most powerful 7B Large Language Model (LLM) at that time. Now Mistral has released a new version, called Mistral 7B v0.3. The...
View Article
Managing and monitoring your LLM applications
How UbiOps and Arize help you stay in control. LLMs are all the rage at the moment, and the APIs of closed-source models like GPT-4 have made it easier than ever to leverage the power of AI. However,...
View Article
What is multi-model routing?
Multi-model routing is the process of linking multiple AI models together. The routing can be done either in series or in parallel, meaning that you use a router to send prompts to specific models....
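The series/parallel distinction described above can be sketched in a few lines. The specialist models and the keyword-based routing rule here are hypothetical placeholders; a real router would typically use a classifier or an LLM to pick the target model.

```python
# Toy sketch of multi-model routing: in parallel, a router picks one
# specialist model per prompt; in series, models are chained together.
def code_model(prompt: str) -> str:
    """Stand-in for a code-specialized model."""
    return f"[code-model] {prompt}"


def chat_model(prompt: str) -> str:
    """Stand-in for a general chat model."""
    return f"[chat-model] {prompt}"


def route(prompt: str) -> str:
    """Parallel-style routing: send each prompt to exactly one specialist."""
    if "code" in prompt.lower():  # hypothetical routing rule
        return code_model(prompt)
    return chat_model(prompt)


def route_in_series(prompt: str) -> str:
    """Series-style routing: feed one model's output into the next."""
    return chat_model(code_model(prompt))
```

Usage: `route("write code to sort a list")` goes to the code model, while `route("hello")` goes to the chat model; `route_in_series` instead passes every prompt through both models in order.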
View Article
Accelerate time-to-results for European NVIDIA AI Sovereign-Hybrid Cloud,...
27 June 2024, Amsterdam. UbiOps, an AI serving and orchestration platform, has partnered with NEBUL, a modern HPC solutions provider and an official NVIDIA Partner for NVIDIA DGX, GPU and...
View Article
New UbiOps features July 2024
On the 11th of July 2024, we released new functionality and made improvements to our UbiOps SaaS product. An overview of the changes is given below. Python client library version for this release:...
View Article
UbiOps vs standard Model Serving Platforms
What does UbiOps deliver beyond standard model serving platforms? Model serving is the process of providing end-users or applications with access to production-level models, meaning that they will be...
View Article