Microsoft on Tuesday released Phi-3, its smallest artificial intelligence (AI) language model to date. Small AI models matter because they are compact enough to run on smartphones. The new model succeeds Phi-2, released in December 2023, and was trained on a larger dataset with more parameters. The additional parameters help the model understand and answer more complex questions than its predecessor. Microsoft also claims its performance rivals that of models trained with more than 10 times the number of parameters used for Phi-3.
Details of the small language model (SLM) have been published in a preprint paper on arXiv. However, since arXiv does not conduct peer review, the validity of the claims has not yet been independently verified. AI enthusiasts can test the model through Azure and Ollama. A Hugging Face listing has also been created for Phi-3-Mini, but the weights have not been released yet.
Phi-3 is here, and it’s… good :-).
I made a quick demo to give you a feel of what Phi-3-Mini (3.8B) can do. Stay tuned for the open weight release tomorrow morning and more announcements!
(And it wouldn’t be complete without the usual table of benchmarks!) pic.twitter.com/AWA7Km59rp
– Sébastien Bubeck (@SebastienBubeck) 23 April 2024
According to the paper, the AI model has been trained on 3.3 trillion tokens – units of data consisting of words, phrases, or sub-word segments that are fed into the system during training. The model also has 3.8 billion parameters, a measure of the complexity a chatbot can handle. Parameters are, in essence, the neural connections of the model: each point holds knowledge about a certain topic and links to other points containing related information.
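To get a rough sense of why a 3.8-billion-parameter model can fit on a smartphone, a back-of-the-envelope memory calculation helps (a sketch with illustrative numbers only; real footprints depend on the quantization scheme and runtime overhead):

```python
# Approximate memory needed just to store 3.8B model weights
# at different numeric precisions. Illustrative assumptions,
# not measured figures for Phi-3-Mini.
PARAMS = 3.8e9  # parameter count reported for Phi-3-Mini

for precision, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    gib = PARAMS * bytes_per_param / 2**30  # convert bytes to GiB
    print(f"{precision}: ~{gib:.1f} GiB")
```

At 4-bit quantization the weights shrink to under 2 GiB, which is within reach of a modern phone's memory, while full fp16 weights would need around 7 GiB.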
Based on internal benchmarking, Microsoft claims the chatbot competes with models like Mixtral 8x7B and GPT-3.5, which are much larger than the SLM. The AI is aligned for the chat format, meaning it can answer conversational questions. “We also provide some initial parameter-scaling results with 7B and 14B models trained for 4.8T tokens, called phi-3-small and phi-3-medium, both significantly more capable than phi-3-mini,” the tech giant says.
According to a Reuters report, the AI model, designed to perform simpler tasks, is also hosted on Microsoft Azure and Ollama. The company has not yet shared details about the open-source license for Phi-3-Mini. Notably, the Apache 2.0 license, under which Grok AI was recently released, allows both academic and commercial use.
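For readers who want to try the model locally once it lands in Ollama's registry, the usual Ollama workflow would look like this (a sketch; the `phi3` model tag is an assumption about the registry name, and the commands require a running local Ollama installation):

```shell
# Download the Phi-3-Mini weights from Ollama's model registry
# ("phi3" tag is assumed; check `ollama list` or the registry for the exact name)
ollama pull phi3

# Ask the model a one-off question from the command line
ollama run phi3 "Explain what a language-model parameter is in one sentence."
```

Running `ollama run phi3` with no prompt instead opens an interactive chat session.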