The Beginning

While advancing the technology behind my AI-powered product knowledge base chatbot, which is built on Django/LangChain/OpenAI/Chroma/Gradio and sits at the AI application/framework layer, I have also kept an eye on how to build a pipeline for assessing the accuracy of machine learning models, which is part of AI DevOps/infra. But I realized that I have no idea how to measure a model's accuracy. This made me upset, so I started looking for answers.

My first Google search on this was "how to measure llm accuracy", which brought me to Evaluating Large Language Models (LLMs): A Standard Set of Metrics for Accurate Assessment. It's informative and not lengthy, so I read through it. It opened a new world to me. There is a standard set of metrics for evaluating LLMs. I don't know all of them, or where to start!

I had to tell myself, "Man, you don't know machine learning..." So my next search was "machine learning course", and Andrew Ng's Supervised Machine Learning: Regression and Classification came out on top of the Google search results! It's so famous, and I had known about it before! Then I made a decision: I want to take action now and finish it thoroughly! I immediately enrolled in the course.

Now let's start the journey!

Day 1 Started

Basics

1. What is ML?

Defined by Arthur Samuel back in the 1950s 😯

"Field of study that gives computers the ability to learn without being explicitly programmed."

The above claim gives the key point (the highlighted part), which could answer the question from…