
Analytics and metrics are the cornerstone of understanding what is happening in your deployment. Are your inference endpoints overloaded? How many requests are they handling? Having relevant metrics that are well visualized and displayed in real time is essential for monitoring and debugging.
We realized that our analytics dashboard needed a refresh. Because we debug many endpoints ourselves, we have felt the same pain as our users. That’s why we planned and made several improvements to give you a better experience.
Real-time metrics: Data now updates in real time, giving you an accurate, up-to-date view of your endpoint’s performance. Whether you are monitoring request latency, response times, or error rates, you can now see events as they occur. We also rebuilt the backend of the analytics dashboard so that data loads quickly, especially for high-traffic endpoints. No more waiting for metrics to appear. Just open the dashboard and get instant insights.
Customizable time ranges & auto-refresh: We know that different users need different views, so we have made it easier to zoom in on a specific time range or track long-term trends. You can also enable auto-refresh to keep your dashboard up to date without reloading it manually.
Replica lifecycle view: Because it is important to understand what’s going on with your replicas, we’ve introduced a detailed view of each replica’s lifecycle. You can now track a replica from initialization to termination and observe every state transition in between. This helps you understand what’s happening with your endpoint, even when there are many moving parts.
We have shipped these updates, and we are still actively iterating on them. Things will continue to improve, and all feedback is welcome.
Tell us what works, what doesn’t, and what you want to see next! 🙌
Head over to Inference Endpoints to see the changes!