Inference

In [[machine learning]], [[inference]] is the process of making [[prediction]]s or [[generate content|generating content]] by applying a [[trained model]] to [[new data]], such as [[unlabeled examples]] or [[prompts]].


==Inference Process==
Inference in machine learning involves several steps. First, the trained model is loaded into memory; then new data is fed into it. The model then applies the [[parameters]] and [[functions]] it learned from its [[training data]] to make predictions or decisions about the new data.
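These steps can be sketched in a few lines of Python. The model below is a hypothetical linear classifier with hard-coded weights, standing in for a real model that would be deserialized from disk:

```python
# Minimal sketch of the inference process: load learned parameters,
# feed in new data, and produce a prediction. The weights here are
# hypothetical stand-ins for parameters learned during training.

def load_model():
    # In practice this would deserialize a trained model from disk.
    return {"weights": [0.8, -0.5], "bias": 0.1}

def predict(model, features):
    # Apply the learned parameters to a new, unseen example.
    score = sum(w * x for w, x in zip(model["weights"], features))
    score += model["bias"]
    return 1 if score > 0 else 0

model = load_model()                 # step 1: load the trained model
new_example = [1.5, 0.4]             # step 2: new data arrives
label = predict(model, new_example)  # step 3: make a prediction
print(label)  # → 1
```

Note that no learning happens here: inference only *applies* fixed parameters, which is why it is typically much cheaper than training.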


==Types of Inference==
In machine learning, there are two main types of inference: [[real-time inference]] and [[batch inference]].


#[[Batch inference]], on the other hand, involves making predictions for a large [[dataset]] at once. It is commonly employed when a model does not need to respond in real time, as is often the case for [[recommendation system]]s.
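The two modes can be contrasted with a toy model (a hypothetical threshold classifier; the data is made up for illustration):

```python
# Sketch contrasting real-time and batch inference with a toy model.

def predict_one(model, x):
    # Real-time inference: score a single example as it arrives.
    return 1 if x * model["weight"] > model["threshold"] else 0

def predict_batch(model, dataset):
    # Batch inference: score an entire dataset in one pass.
    return [predict_one(model, x) for x in dataset]

model = {"weight": 2.0, "threshold": 1.0}

# Real-time: one request at a time, answered immediately.
print(predict_one(model, 0.7))              # → 1

# Batch: e.g. precomputing recommendations for every user overnight.
print(predict_batch(model, [0.1, 0.7, 0.9]))  # → [0, 1, 1]
```

In practice the same trained model can serve both modes; the difference lies in how requests are scheduled, not in the model itself.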


==Considerations for Inference==
The speed and accuracy of inference are critical factors when deploying machine learning models. Inference speed is especially crucial in real-time applications, since it determines how quickly the model can respond to changing data. Inference accuracy, on the other hand, matters in every application, since it determines the usefulness and dependability of the model's predictions.
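Both considerations can be measured directly. The sketch below times a batch of predictions with Python's standard-library `time.perf_counter` and computes accuracy as the fraction of correct predictions; the model and data are hypothetical placeholders:

```python
import time

# Measuring the two considerations: inference latency (speed) and
# prediction accuracy against known ground-truth labels.

def predict(x):
    # Hypothetical threshold model standing in for a trained model.
    return 1 if x > 0.5 else 0

examples = [0.2, 0.6, 0.9, 0.4]
labels   = [0,   1,   1,   1]   # ground truth; the last example is misclassified

start = time.perf_counter()
preds = [predict(x) for x in examples]
latency = time.perf_counter() - start    # speed: wall-clock time to serve predictions

correct = sum(p == y for p, y in zip(preds, labels))
accuracy = correct / len(labels)         # accuracy: fraction predicted correctly
print(f"latency: {latency:.6f}s, accuracy: {accuracy:.2f}")
```

Real deployments often trade these off, for example by [[quantization|quantizing]] or distilling a model to speed up inference at some cost in accuracy.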