**Machine learning** (**ML**) is a subfield of [[Artificial intelligence|artificial intelligence]] (AI) concerned with the development of algorithms and [[Statistical model|statistical models]] that enable computer systems to learn from and make predictions or decisions based on [[Data|data]], without being explicitly programmed for each task. Rather than following a fixed set of rules, machine learning systems identify patterns in training data and generalize from those patterns to perform tasks on new, unseen inputs. The field draws on [[Statistics|statistics]], [[Mathematical optimization|mathematical optimization]], [[Probability theory|probability theory]], and [[Computer science|computer science]], and has applications across a wide range of domains including [[Computer vision|computer vision]], [[Natural language processing|natural language processing]], [[Recommender system|recommender systems]], and [[Medical diagnosis|medical diagnosis]].
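The idea of learning a pattern from data rather than encoding a fixed rule can be illustrated with a minimal sketch: an ordinary least-squares fit of a line to a handful of points, written here in plain Python (the data and function name are illustrative, not drawn from any particular library).

```python
# Fit a line y = w*x + b to example points by ordinary least squares,
# then use the fitted model on an input it has never seen.

def fit_line(xs, ys):
    """Return slope w and intercept b minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for a single feature.
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - w * mean_x
    return w, b

# "Training data" roughly following y = 2x + 1, with noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
w, b = fit_line(xs, ys)

# The learned parameters generalize to a new, unseen input.
prediction = w * 4.0 + b
```

No rule mapping inputs to outputs was written by hand; the parameters `w` and `b` are recovered entirely from the training examples, which is the essential distinction the definition above draws.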
Machine learning approaches are broadly categorized by the nature of the training signal available to the learning algorithm. In **[[Supervised learning|supervised learning]]**, models are trained on labeled datasets in which each input is paired with a known output, enabling the system to learn mappings from inputs to outputs for tasks such as [[Statistical classification|classification]] and [[Regression analysis|regression]]. **[[Unsupervised learning|Unsupervised learning]]** involves finding structure in unlabeled data, with common tasks including [[Cluster analysis|clustering]], [[Dimensionality reduction|dimensionality reduction]], and [[Anomaly detection|anomaly detection]]. **[[Reinforcement learning|Reinforcement learning]]** trains agents to make sequences of decisions by rewarding desired behaviors and penalizing undesired ones, and has achieved notable results in game playing and [[Robotics|robotics]]. **[[Semi-supervised learning|Semi-supervised learning]]** occupies an intermediate position, leveraging small quantities of labeled data alongside larger unlabeled datasets, while **[[Self-supervised learning|self-supervised learning]]** derives its training signal from the structure of the data itself, without requiring human-provided labels.
The field has been shaped by several major algorithmic developments. [[Decision tree|Decision trees]], [[Support vector machine|support vector machines]], and [[Bayesian network|Bayesian]] methods were prominent from the 1980s through the 2000s. The resurgence of [[Artificial neural network|artificial neural networks]] through [[Deep learning|deep learning]]—characterized by networks with many layers trained using [[Backpropagation|backpropagation]] and large datasets—produced dramatic advances beginning in the early 2010s, particularly in image recognition following the success of [[AlexNet]] in the [[ImageNet]] competition in 2012. The development of the [[Transformer (machine learning model)|transformer]] architecture in 2017 enabled breakthroughs in [[Natural language processing|natural language processing]], leading to large [[Language model|language models]] such as [[GPT (language model)|GPT]], [[BERT]], and subsequent systems that demonstrated general-purpose language understanding and generation capabilities. These advances have driven widespread adoption of machine learning in industry, research, and public-facing products, while also raising significant questions regarding [[Algorithmic bias|algorithmic bias]], [[Explainability|model interpretability]], [[AI safety|AI safety]], and the [[Environmental impact of computing|environmental impact]] of large-scale model training.