{"id":2080,"date":"2024-07-09T11:28:19","date_gmt":"2024-07-09T11:28:19","guid":{"rendered":"https:\/\/www.w3computing.com\/articles\/?p=2080"},"modified":"2024-07-09T11:28:22","modified_gmt":"2024-07-09T11:28:22","slug":"implementing-support-vector-machines-svms-from-scratch","status":"publish","type":"post","link":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/","title":{"rendered":"Implementing Support Vector Machines (SVMs) from Scratch"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Support Vector Machines (SVMs) are a powerful set of supervised learning methods used for classification, regression, and outlier detection. This tutorial will guide you through implementing SVMs from scratch, focusing on classification. By the end, you&#8217;ll understand the theory behind SVMs and how to code the classifier yourself with NumPy, without relying on a prebuilt SVM implementation for the core algorithm (scikit-learn is used only for data handling, evaluation metrics, and a parameter-tuning example). This tutorial assumes you have a good grasp of Python and linear algebra.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. Introduction to Support Vector Machines<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What is a Support Vector Machine?<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for classification or regression tasks. For classification, it works by finding the hyperplane that separates the classes with the largest possible margin. 
In a two-dimensional space, this hyperplane is a line dividing a plane into two parts where each class lies on either side.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Why Use SVM?<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Effective in high-dimensional spaces<\/strong>: Particularly useful when the number of dimensions exceeds the number of samples.<\/li>\n\n\n\n<li><strong>Memory efficient<\/strong>: Uses a subset of training points in the decision function (called support vectors).<\/li>\n\n\n\n<li><strong>Versatile<\/strong>: Different kernel functions can be specified for the decision function. Common kernels include linear, polynomial, RBF (Gaussian), and sigmoid.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">2. Mathematical Foundation of SVM<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Hyperplanes and Support Vectors<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A hyperplane in an <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=n&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"n\" class=\"latex\" \/>-dimensional space is a flat affine subspace of dimension <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=n-1&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"n-1\" class=\"latex\" \/>. For a 2D space, the hyperplane is a line. In SVM, we aim to find a hyperplane that maximizes the margin between the two classes. The points lying closest to the hyperplane are called support vectors.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Margin<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The margin is the distance between the hyperplane and the closest data points from either class. 
Maximizing the margin helps improve the model&#8217;s generalization ability.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Mathematical Formulation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Given a training dataset <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5C%7B%28x_i%2C+y_i%29%5C%7D_%7Bi%3D1%7D%5En&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;{(x_i, y_i)&#92;}_{i=1}^n\" class=\"latex\" \/> where <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x_i+%5Cin+%5Cmathbb%7BR%7D%5En&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"x_i &#92;in &#92;mathbb{R}^n\" class=\"latex\" \/> and <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=y_i+%5Cin+%5C%7B-1%2C+1%5C%7D&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"y_i &#92;in &#92;{-1, 1&#92;}\" class=\"latex\" \/>, the decision function for a linear SVM is:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=f%28x%29+%3D+%5Cmathbf%7Bw%7D%5ET+%5Cmathbf%7Bx%7D+%2B+b&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"f(x) = &#92;mathbf{w}^T &#92;mathbf{x} + b\" class=\"latex\" \/><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The goal is to find <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cmathbf%7Bw%7D&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;mathbf{w}\" class=\"latex\" \/> and <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=b&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"b\" class=\"latex\" \/> such that the margin is maximized. 
This can be formulated as a constrained optimization problem:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Minimize <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B1%7D%7B2%7D+%7C%7C%5Cmathbf%7Bw%7D%7C%7C%5E2&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{1}{2} ||&#92;mathbf{w}||^2\" class=\"latex\" \/><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Subject to <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=y_i+%28%5Cmathbf%7Bw%7D%5ET+%5Cmathbf%7Bx%7D_i+%2B+b%29+%5Cgeq+1&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"y_i (&#92;mathbf{w}^T &#92;mathbf{x}_i + b) &#92;geq 1\" class=\"latex\" \/> for all <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"i\" class=\"latex\" \/>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The Dual Problem<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Using Lagrange multipliers, the above problem can be converted into its dual form, which is easier to solve:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Maximize:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=L%28%5Calpha%29+%3D+%5Csum_%7Bi%3D1%7D%5En+%5Calpha_i+-+%5Cfrac%7B1%7D%7B2%7D+%5Csum_%7Bi%3D1%7D%5En+%5Csum_%7Bj%3D1%7D%5En+%5Calpha_i+%5Calpha_j+y_i+y_j+%5Cmathbf%7Bx%7D_i%5ET+%5Cmathbf%7Bx%7D_j&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"L(&#92;alpha) = &#92;sum_{i=1}^n &#92;alpha_i - &#92;frac{1}{2} &#92;sum_{i=1}^n &#92;sum_{j=1}^n &#92;alpha_i &#92;alpha_j y_i y_j &#92;mathbf{x}_i^T &#92;mathbf{x}_j\" class=\"latex\" \/><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Subject to:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Csum_%7Bi%3D1%7D%5En+%5Calpha_i+y_i+%3D+0&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;sum_{i=1}^n &#92;alpha_i y_i = 0\" class=\"latex\" 
\/><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Calpha_i+%5Cgeq+0+%5Ctext%7B+for+all+%7D+i&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;alpha_i &#92;geq 0 &#92;text{ for all } i\" class=\"latex\" \/><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">where <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Calpha_i&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;alpha_i\" class=\"latex\" \/> are the Lagrange multipliers.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Kernel Trick<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The kernel trick allows SVM to create non-linear decision boundaries by transforming the input space into a higher-dimensional space where a linear separation is possible. Common kernels include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Linear Kernel: <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K%28%5Cmathbf%7Bx%7D%2C+%5Cmathbf%7Bx%7D%27%29+%3D+%5Cmathbf%7Bx%7D%5ET+%5Cmathbf%7Bx%7D%27&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"K(&#92;mathbf{x}, &#92;mathbf{x}&#039;) = &#92;mathbf{x}^T &#92;mathbf{x}&#039;\" class=\"latex\" \/><\/li>\n\n\n\n<li>Polynomial Kernel: <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K%28%5Cmathbf%7Bx%7D%2C+%5Cmathbf%7Bx%7D%27%29+%3D+%28%5Cmathbf%7Bx%7D%5ET+%5Cmathbf%7Bx%7D%27+%2B+c%29%5Ed&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"K(&#92;mathbf{x}, &#92;mathbf{x}&#039;) = (&#92;mathbf{x}^T &#92;mathbf{x}&#039; + c)^d\" class=\"latex\" \/><\/li>\n\n\n\n<li>Radial Basis Function (RBF) Kernel: <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K%28%5Cmathbf%7Bx%7D%2C+%5Cmathbf%7Bx%7D%27%29+%3D+%5Cexp%28-%5Cgamma+%7C%7C%5Cmathbf%7Bx%7D+-+%5Cmathbf%7Bx%7D%27%7C%7C%5E2%29&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"K(&#92;mathbf{x}, &#92;mathbf{x}&#039;) = &#92;exp(-&#92;gamma ||&#92;mathbf{x} - 
&#92;mathbf{x}&#039;||^2)\" class=\"latex\" \/><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">3. Implementing SVM from Scratch<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Data Preprocessing<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Before we start coding the SVM, let&#8217;s preprocess the data. We&#8217;ll use the Iris dataset for simplicity.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-1\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> numpy <span class=\"hljs-keyword\">as<\/span> np\n<span class=\"hljs-keyword\">from<\/span> sklearn.datasets <span class=\"hljs-keyword\">import<\/span> load_iris\n<span class=\"hljs-keyword\">from<\/span> sklearn.model_selection <span class=\"hljs-keyword\">import<\/span> train_test_split\n<span class=\"hljs-keyword\">from<\/span> sklearn.preprocessing <span class=\"hljs-keyword\">import<\/span> StandardScaler\n\n<span class=\"hljs-comment\"># Load the Iris dataset<\/span>\niris = load_iris()\nX = iris.data&#91;:<span class=\"hljs-number\">100<\/span>, :<span class=\"hljs-number\">2<\/span>]  <span class=\"hljs-comment\"># We will use only the first two features and two classes for simplicity<\/span>\ny = iris.target&#91;:<span class=\"hljs-number\">100<\/span>]\n\n<span class=\"hljs-comment\"># Convert the labels to {-1, 1}<\/span>\ny = np.where(y == <span class=\"hljs-number\">0<\/span>, <span class=\"hljs-number\">-1<\/span>, <span class=\"hljs-number\">1<\/span>)\n\n<span class=\"hljs-comment\"># Split the data into training and testing sets<\/span>\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=<span class=\"hljs-number\">0.2<\/span>, random_state=<span class=\"hljs-number\">42<\/span>)\n\n<span class=\"hljs-comment\"># Standardize the features<\/span>\nscaler = StandardScaler()\nX_train = scaler.fit_transform(X_train)\nX_test = 
scaler.transform(X_test)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-1\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Kernel Functions<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Let&#8217;s implement some kernel functions:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-2\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">linear_kernel<\/span><span class=\"hljs-params\">(x1, x2)<\/span>:<\/span>\n    <span class=\"hljs-keyword\">return<\/span> np.dot(x1, x2)\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">polynomial_kernel<\/span><span class=\"hljs-params\">(x1, x2, degree=<span class=\"hljs-number\">3<\/span>, coef0=<span class=\"hljs-number\">1<\/span>)<\/span>:<\/span>\n    <span class=\"hljs-keyword\">return<\/span> (np.dot(x1, x2) + coef0) ** degree\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">rbf_kernel<\/span><span class=\"hljs-params\">(x1, x2, gamma=<span class=\"hljs-number\">0.1<\/span>)<\/span>:<\/span>\n    <span class=\"hljs-keyword\">return<\/span> np.exp(-gamma * np.linalg.norm(x1 - x2) ** <span class=\"hljs-number\">2<\/span>)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-2\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 
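class=\"wp-block-heading\">Optimization Problem<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The optimization problem involves finding the values of <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Calpha_i&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;alpha_i\" class=\"latex\" \/> that maximize the dual objective. The implementation below uses the soft-margin form of the dual, in which each multiplier is also bounded above by a regularization parameter <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=C&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"C\" class=\"latex\" \/>, so that <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0+%5Cleq+%5Calpha_i+%5Cleq+C&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"0 &#92;leq &#92;alpha_i &#92;leq C\" class=\"latex\" \/> and some training points are allowed to violate the margin. We will use the Sequential Minimal Optimization (SMO) algorithm to solve this.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Sequential Minimal Optimization (SMO)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">SMO breaks the problem into the smallest possible subproblems: at each step it selects a pair of multipliers, holds all the others fixed, and solves for that pair analytically, clipping the result so that the constraints stay satisfied. The class below is a simplified variant that picks the second multiplier at random instead of using a selection heuristic, and it stops after max_passes consecutive sweeps over the data in which no multiplier changes.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-3\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-class\"><span class=\"hljs-keyword\">class<\/span> <span class=\"hljs-title\">SVM<\/span>:<\/span>\n    <span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">__init__<\/span><span class=\"hljs-params\">(self, kernel=linear_kernel, C=<span class=\"hljs-number\">1.0<\/span>, tol=<span class=\"hljs-number\">1e-3<\/span>, max_passes=<span class=\"hljs-number\">5<\/span>)<\/span>:<\/span>\n        self.kernel = kernel\n        self.C = C\n        self.tol = tol\n        self.max_passes = max_passes\n\n    <span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">fit<\/span><span class=\"hljs-params\">(self, X, y)<\/span>:<\/span>\n        n_samples, n_features = X.shape\n        self.alpha = np.zeros(n_samples)\n        self.b = <span class=\"hljs-number\">0<\/span>\n        self.X = X\n        self.y = y\n\n        passes = <span class=\"hljs-number\">0<\/span>\n        <span class=\"hljs-keyword\">while<\/span> passes &lt; self.max_passes:\n            num_changed_alphas = <span class=\"hljs-number\">0<\/span>\n            <span class=\"hljs-keyword\">for<\/span> i <span class=\"hljs-keyword\">in<\/span> range(n_samples):\n                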
E_i = self._decision_function(X&#91;i]) - y&#91;i]\n                <span class=\"hljs-keyword\">if<\/span> (y&#91;i] * E_i &lt; -self.tol <span class=\"hljs-keyword\">and<\/span> self.alpha&#91;i] &lt; self.C) <span class=\"hljs-keyword\">or<\/span> (y&#91;i] * E_i &gt; self.tol <span class=\"hljs-keyword\">and<\/span> self.alpha&#91;i] &gt; <span class=\"hljs-number\">0<\/span>):\n                    j = np.random.randint(<span class=\"hljs-number\">0<\/span>, n_samples)\n                    <span class=\"hljs-keyword\">while<\/span> j == i:\n                        j = np.random.randint(<span class=\"hljs-number\">0<\/span>, n_samples)\n                    E_j = self._decision_function(X&#91;j]) - y&#91;j]\n\n                    alpha_i_old = self.alpha&#91;i]\n                    alpha_j_old = self.alpha&#91;j]\n\n                    <span class=\"hljs-keyword\">if<\/span> y&#91;i] != y&#91;j]:\n                        L = max(<span class=\"hljs-number\">0<\/span>, self.alpha&#91;j] - self.alpha&#91;i])\n                        H = min(self.C, self.C + self.alpha&#91;j] - self.alpha&#91;i])\n                    <span class=\"hljs-keyword\">else<\/span>:\n                        L = max(<span class=\"hljs-number\">0<\/span>, self.alpha&#91;j] + self.alpha&#91;i] - self.C)\n                        H = min(self.C, self.alpha&#91;j] + self.alpha&#91;i])\n\n                    <span class=\"hljs-keyword\">if<\/span> L == H:\n                        <span class=\"hljs-keyword\">continue<\/span>\n\n                    eta = <span class=\"hljs-number\">2<\/span> * self.kernel(X&#91;i], X&#91;j]) - self.kernel(X&#91;i], X&#91;i]) - self.kernel(X&#91;j], X&#91;j])\n                    <span class=\"hljs-keyword\">if<\/span> eta &gt;= <span class=\"hljs-number\">0<\/span>:\n                        <span class=\"hljs-keyword\">continue<\/span>\n\n                    self.alpha&#91;j] -= y&#91;j] * (E_i - E_j) \/ eta\n                    self.alpha&#91;j] = 
np.clip(self.alpha&#91;j], L, H)\n\n                    <span class=\"hljs-keyword\">if<\/span> abs(self.alpha&#91;j] - alpha_j_old) &lt; <span class=\"hljs-number\">1e-5<\/span>:\n                        <span class=\"hljs-keyword\">continue<\/span>\n\n                    self.alpha&#91;i] += y&#91;i] * y&#91;j] * (alpha_j_old - self.alpha&#91;j])\n\n                    b1 = self.b - E_i - y&#91;i] * (self.alpha&#91;i] - alpha_i_old) * self.kernel(X&#91;i], X&#91;i]) - y&#91;j] * (self.alpha&#91;j] - alpha_j_old) * self.kernel(X&#91;i], X&#91;j])\n                    b2 = self.b - E_j - y&#91;i] * (self.alpha&#91;i] - alpha_i_old) * self.kernel(X&#91;i], X&#91;j]) - y&#91;j] * (self.alpha&#91;j] - alpha_j_old) * self.kernel(X&#91;j], X&#91;j])\n\n                    <span class=\"hljs-keyword\">if<\/span> <span class=\"hljs-number\">0<\/span> &lt; self.alpha&#91;i] &lt; self.C:\n                        self.b = b1\n                    <span class=\"hljs-keyword\">elif<\/span> <span class=\"hljs-number\">0<\/span> &lt; self.alpha&#91;j] &lt; self.C:\n                        self.b = b2\n                    <span class=\"hljs-keyword\">else<\/span>:\n                        self.b = (b1 + b2) \/ <span class=\"hljs-number\">2<\/span>\n\n                    num_changed_alphas += <span class=\"hljs-number\">1<\/span>\n\n            <span class=\"hljs-keyword\">if<\/span> num_changed_alphas == <span class=\"hljs-number\">0<\/span>:\n                passes += <span class=\"hljs-number\">1<\/span>\n            <span class=\"hljs-keyword\">else<\/span>:\n                passes = <span class=\"hljs-number\">0<\/span>\n\n    <span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">_decision_function<\/span><span class=\"hljs-params\">(self, X)<\/span>:<\/span>\n        <span class=\"hljs-comment\"># Accept a single sample (1-D) or a batch of samples (2-D)<\/span>\n        single = np.ndim(X) == <span class=\"hljs-number\">1<\/span>\n        X = np.atleast_2d(X)\n        <span class=\"hljs-comment\"># Evaluate the kernel pair by pair so that non-linear kernels also work<\/span>\n        K = np.array(&#91;&#91;self.kernel(x_i, x) <span class=\"hljs-keyword\">for<\/span> x_i <span class=\"hljs-keyword\">in<\/span> self.X] <span class=\"hljs-keyword\">for<\/span> x <span class=\"hljs-keyword\">in<\/span> X])\n        scores = np.dot(K, self.alpha * self.y) + self.b\n        <span class=\"hljs-keyword\">return<\/span> scores&#91;<span class=\"hljs-number\">0<\/span>] <span class=\"hljs-keyword\">if<\/span> single <span class=\"hljs-keyword\">else<\/span> scores\n\n    <span class=\"hljs-function\"><span 
class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">predict<\/span><span class=\"hljs-params\">(self, X)<\/span>:<\/span>\n        <span class=\"hljs-keyword\">return<\/span> np.sign(self._decision_function(X))<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-3\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Training and Testing the SVM<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Now, let&#8217;s train our SVM model and test it on the test data.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-4\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-comment\"># Initialize the SVM with a linear kernel<\/span>\nsvm = SVM(kernel=linear_kernel, C=<span class=\"hljs-number\">1.0<\/span>)\n\n<span class=\"hljs-comment\"># Train the SVM<\/span>\nsvm.fit(X_train, y_train)\n\n<span class=\"hljs-comment\"># Predict the test data<\/span>\ny_pred = svm.predict(X_test)\n\n<span class=\"hljs-comment\"># Calculate the accuracy<\/span>\naccuracy = np.mean(y_pred == y_test)\nprint(<span class=\"hljs-string\">f'Accuracy: <span class=\"hljs-subst\">{accuracy * <span class=\"hljs-number\">100<\/span>:<span class=\"hljs-number\">.2<\/span>f}<\/span>%'<\/span>)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-4\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\">6. 
Evaluation Metrics<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">While accuracy is a common metric, it&#8217;s important to consider other metrics, especially when dealing with imbalanced datasets.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Precision, Recall, and F1-Score<\/h3>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-5\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">from<\/span> sklearn.metrics <span class=\"hljs-keyword\">import<\/span> precision_score, recall_score, f1_score\n\nprecision = precision_score(y_test, y_pred)\nrecall = recall_score(y_test, y_pred)\nf1 = f1_score(y_test, y_pred)\n\nprint(<span class=\"hljs-string\">f'Precision: <span class=\"hljs-subst\">{precision:<span class=\"hljs-number\">.2<\/span>f}<\/span>'<\/span>)\nprint(<span class=\"hljs-string\">f'Recall: <span class=\"hljs-subst\">{recall:<span class=\"hljs-number\">.2<\/span>f}<\/span>'<\/span>)\nprint(<span class=\"hljs-string\">f'F1-Score: <span class=\"hljs-subst\">{f1:<span class=\"hljs-number\">.2<\/span>f}<\/span>'<\/span>)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-5\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Confusion Matrix<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A confusion matrix provides a detailed breakdown of prediction results.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-6\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">from<\/span> sklearn.metrics <span class=\"hljs-keyword\">import<\/span> confusion_matrix\n\nconf_matrix = 
confusion_matrix(y_test, y_pred)\nprint(<span class=\"hljs-string\">'Confusion Matrix:'<\/span>)\nprint(conf_matrix)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-6\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\">7. Optimizations and Practical Tips<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Handling Imbalanced Data<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">When one class is much rarer than the other, you can weight the classes so that mistakes on the minority class cost more. The version below takes a class_weight dictionary that maps each label to a weight (for example, {-1: 1.0, 1: 5.0}, an illustrative choice) and scales the upper bound C on each multiplier by the weight of its sample&#8217;s class. Only __init__ and fit change; the _decision_function and predict methods are the same as in the earlier class and must be carried over for the class to be usable.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-7\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-class\"><span class=\"hljs-keyword\">class<\/span> <span class=\"hljs-title\">SVM<\/span>:<\/span>\n    <span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">__init__<\/span><span class=\"hljs-params\">(self, kernel=linear_kernel, C=<span class=\"hljs-number\">1.0<\/span>, tol=<span class=\"hljs-number\">1e-3<\/span>, max_passes=<span class=\"hljs-number\">5<\/span>, class_weight=None)<\/span>:<\/span>\n        self.kernel = kernel\n        self.C = C\n        self.tol = tol\n        self.max_passes = max_passes\n        self.class_weight = class_weight\n\n    <span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">fit<\/span><span class=\"hljs-params\">(self, X, y)<\/span>:<\/span>\n        n_samples, n_features = X.shape\n        self.alpha = np.zeros(n_samples)\n        self.b = <span class=\"hljs-number\">0<\/span>\n        self.X = X\n        self.y = y\n\n        <span class=\"hljs-keyword\">if<\/span> self.class_weight:\n            weight = 
np.vectorize(self.class_weight.get)(y)\n        <span class=\"hljs-keyword\">else<\/span>:\n            weight = np.ones(n_samples)\n\n        passes = <span class=\"hljs-number\">0<\/span>\n        <span class=\"hljs-keyword\">while<\/span> passes &lt; self.max_passes:\n            num_changed_alphas = <span class=\"hljs-number\">0<\/span>\n            <span class=\"hljs-keyword\">for<\/span> i <span class=\"hljs-keyword\">in<\/span> range(n_samples):\n                E_i = self._decision_function(X&#91;i]) - y&#91;i]\n                <span class=\"hljs-keyword\">if<\/span> (y&#91;i] * E_i &lt; -self.tol <span class=\"hljs-keyword\">and<\/span> self.alpha&#91;i] &lt; self.C * weight&#91;i]) <span class=\"hljs-keyword\">or<\/span> (y&#91;i] * E_i &gt; self.tol <span class=\"hljs-keyword\">and<\/span> self.alpha&#91;i] &gt; <span class=\"hljs-number\">0<\/span>):\n                    j = np.random.randint(<span class=\"hljs-number\">0<\/span>, n_samples)\n                    <span class=\"hljs-keyword\">while<\/span> j == i:\n                        j = np.random.randint(<span class=\"hljs-number\">0<\/span>, n_samples)\n                    E_j = self._decision_function(X&#91;j]) - y&#91;j]\n\n                    alpha_i_old = self.alpha&#91;i]\n                    alpha_j_old = self.alpha&#91;j]\n\n                    <span class=\"hljs-keyword\">if<\/span> y&#91;i] != y&#91;j]:\n                        L = max(<span class=\"hljs-number\">0<\/span>, self.alpha&#91;j] - self.alpha&#91;i])\n                        H = min(self.C * weight&#91;j], self.C * weight&#91;j] + self.alpha&#91;j] - self.alpha&#91;i])\n                    <span class=\"hljs-keyword\">else<\/span>:\n                        L = max(<span class=\"hljs-number\">0<\/span>, self.alpha&#91;j] + self.alpha&#91;i] - self.C * weight&#91;j])\n                        H = min(self.C * weight&#91;j], self.alpha&#91;j] + self.alpha&#91;i])\n\n                    <span class=\"hljs-keyword\">if<\/span> L == 
H:\n                        <span class=\"hljs-keyword\">continue<\/span>\n\n                    eta = <span class=\"hljs-number\">2<\/span> * self.kernel(X&#91;i], X&#91;j]) - self.kernel(X&#91;i], X&#91;i]) - self.kernel(X&#91;j], X&#91;j])\n                    <span class=\"hljs-keyword\">if<\/span> eta &gt;= <span class=\"hljs-number\">0<\/span>:\n                        <span class=\"hljs-keyword\">continue<\/span>\n\n                    self.alpha&#91;j] -= y&#91;j] * (E_i - E_j) \/ eta\n                    self.alpha&#91;j] = np.clip(self.alpha&#91;j], L, H)\n\n                    <span class=\"hljs-keyword\">if<\/span> abs(self.alpha&#91;j] - alpha_j_old) &lt; <span class=\"hljs-number\">1e-5<\/span>:\n                        <span class=\"hljs-keyword\">continue<\/span>\n\n                    self.alpha&#91;i] += y&#91;i] * y&#91;j] * (alpha_j_old - self.alpha&#91;j])\n\n                    b1 = self.b - E_i - y&#91;i] * (self.alpha&#91;i] - alpha_i_old) * self.kernel(X&#91;i], X&#91;i]) - y&#91;j] * (self.alpha&#91;j] - alpha_j_old) * self.kernel(X&#91;i], X&#91;j])\n                    b2 = self.b - E_j - y&#91;i] * (self.alpha&#91;i] - alpha_i_old) * self.kernel(X&#91;i], X&#91;j]) - y&#91;j] * (self.alpha&#91;j] - alpha_j_old) * self.kernel(X&#91;j], X&#91;j])\n\n                    <span class=\"hljs-keyword\">if<\/span> <span class=\"hljs-number\">0<\/span> &lt; self.alpha&#91;i] &lt; self.C * weight&#91;i]:\n                        self.b = b1\n                    <span class=\"hljs-keyword\">elif<\/span> <span class=\"hljs-number\">0<\/span> &lt; self.alpha&#91;j] &lt; self.C * weight&#91;j]:\n                        self.b = b2\n                    <span class=\"hljs-keyword\">else<\/span>:\n                        self.b = (b1 + b2) \/ <span class=\"hljs-number\">2<\/span>\n\n                    num_changed_alphas += <span class=\"hljs-number\">1<\/span>\n\n            <span class=\"hljs-keyword\">if<\/span> num_changed_alphas == <span 
class=\"hljs-number\">0<\/span>:\n                passes += <span class=\"hljs-number\">1<\/span>\n            <span class=\"hljs-keyword\">else<\/span>:\n                passes = <span class=\"hljs-number\">0<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-7\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h3 class=\"wp-block-heading\">Feature Scaling<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Feature scaling can significantly impact the performance of SVMs. Ensure that your data is standardized or normalized before training.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Parameter Tuning<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Tuning parameters like <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=C&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"C\" class=\"latex\" \/>, kernel type, and kernel parameters is crucial for achieving optimal performance. 
Use grid search or randomized search to find the best parameters.<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-8\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">from<\/span> sklearn.model_selection <span class=\"hljs-keyword\">import<\/span> GridSearchCV\n<span class=\"hljs-keyword\">from<\/span> sklearn.svm <span class=\"hljs-keyword\">import<\/span> SVC\n\n<span class=\"hljs-comment\"># Define the parameter grid<\/span>\nparam_grid = {\n    <span class=\"hljs-string\">'C'<\/span>: &#91;<span class=\"hljs-number\">0.1<\/span>, <span class=\"hljs-number\">1<\/span>, <span class=\"hljs-number\">10<\/span>],\n    <span class=\"hljs-string\">'kernel'<\/span>: &#91;<span class=\"hljs-string\">'linear'<\/span>, <span class=\"hljs-string\">'poly'<\/span>, <span class=\"hljs-string\">'rbf'<\/span>],\n    <span class=\"hljs-string\">'gamma'<\/span>: &#91;<span class=\"hljs-number\">0.001<\/span>, <span class=\"hljs-number\">0.01<\/span>, <span class=\"hljs-number\">0.1<\/span>, <span class=\"hljs-number\">1<\/span>],\n    <span class=\"hljs-string\">'degree'<\/span>: &#91;<span class=\"hljs-number\">2<\/span>, <span class=\"hljs-number\">3<\/span>, <span class=\"hljs-number\">4<\/span>]\n}\n\n<span class=\"hljs-comment\"># Initialize the SVM model<\/span>\nsvc = SVC()\n\n<span class=\"hljs-comment\"># Initialize the grid search<\/span>\ngrid_search = GridSearchCV(svc, param_grid, cv=<span class=\"hljs-number\">5<\/span>, scoring=<span class=\"hljs-string\">'accuracy'<\/span>)\n\n<span class=\"hljs-comment\"># Fit the grid search<\/span>\ngrid_search.fit(X_train, y_train)\n\n<span class=\"hljs-comment\"># Print the best parameters and score<\/span>\nprint(<span class=\"hljs-string\">f'Best Parameters: <span class=\"hljs-subst\">{grid_search.best_params_}<\/span>'<\/span>)\nprint(<span class=\"hljs-string\">f'Best Score: <span 
class=\"hljs-subst\">{grid_search.best_score_}<\/span>'<\/span>)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-8\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\">8. Conclusion<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">In this tutorial, we&#8217;ve implemented a Support Vector Machine (SVM) from scratch. We&#8217;ve covered the mathematical foundations, implemented the SMO algorithm, and explored how to handle real-world challenges like imbalanced data and feature scaling. By understanding the inner workings of SVMs, you can better leverage their power and apply them effectively in various machine learning tasks.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Further Reading<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Pattern Recognition and Machine Learning<\/strong> by Christopher M. 
Bishop<\/li>\n\n\n\n<li><strong>The Elements of Statistical Learning<\/strong> by Trevor Hastie, Robert Tibshirani, and Jerome Friedman<\/li>\n\n\n\n<li><strong>Support Vector Machines<\/strong> by Ingo Steinwart and Andreas Christmann<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Practice Problems<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Implement SVM with a different kernel (e.g., polynomial or RBF) from scratch.<\/li>\n\n\n\n<li>Apply your SVM implementation to a different dataset (e.g., the digits dataset).<\/li>\n\n\n\n<li>Experiment with different values of <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=C&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"C\" class=\"latex\" \/> and observe how they affect the decision boundary and performance.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">By following this tutorial and experimenting further, you&#8217;ll gain a deep understanding of SVMs and be well-equipped to use them in your machine learning projects.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Support Vector Machines (SVMs) are a powerful set of supervised learning methods used for classification, regression, and outlier detection. This tutorial will guide you through implementing SVMs from scratch, focusing on classification. By the end, you&#8217;ll understand the theory behind SVMs and how to code them without relying on external libraries. 
This tutorial assumes you [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_genesis_hide_title":false,"_genesis_hide_breadcrumbs":false,"_genesis_hide_singular_image":false,"_genesis_hide_footer_widgets":false,"_genesis_custom_body_class":"","_genesis_custom_post_class":"","_genesis_layout":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[18,4,6],"tags":[],"class_list":["post-2080","post","type-post","status-publish","format-standard","category-artificial-intelligence","category-programming-languages","category-python","entry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.6 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Implementing Support Vector Machines (SVMs) from Scratch<\/title>\n<meta name=\"description\" content=\"Support Vector Machines (SVMs) are a powerful set of supervised learning methods used for classification, regression, and outlier detection.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Implementing Support Vector Machines (SVMs) from Scratch\" \/>\n<meta property=\"og:description\" content=\"Support Vector Machines (SVMs) are a powerful set of supervised learning methods used for classification, regression, and outlier detection.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-07-09T11:28:19+00:00\" \/>\n<meta 
property=\"article:modified_time\" content=\"2024-07-09T11:28:22+00:00\" \/>\n<meta name=\"author\" content=\"w3compadmin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"w3compadmin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"TechArticle\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/implementing-support-vector-machines-svms-from-scratch\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/implementing-support-vector-machines-svms-from-scratch\\\/\"},\"author\":{\"name\":\"w3compadmin\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#\\\/schema\\\/person\\\/a550b3e20d78bb4f79b7c6b7b53f0561\"},\"headline\":\"Implementing Support Vector Machines (SVMs) from Scratch\",\"datePublished\":\"2024-07-09T11:28:19+00:00\",\"dateModified\":\"2024-07-09T11:28:22+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/implementing-support-vector-machines-svms-from-scratch\\\/\"},\"wordCount\":944,\"articleSection\":[\"Artificial Intelligence\",\"Programming Languages\",\"Python\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/implementing-support-vector-machines-svms-from-scratch\\\/\",\"url\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/implementing-support-vector-machines-svms-from-scratch\\\/\",\"name\":\"Implementing Support Vector Machines (SVMs) from 
Scratch\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#website\"},\"datePublished\":\"2024-07-09T11:28:19+00:00\",\"dateModified\":\"2024-07-09T11:28:22+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#\\\/schema\\\/person\\\/a550b3e20d78bb4f79b7c6b7b53f0561\"},\"description\":\"Support Vector Machines (SVMs) are a powerful set of supervised learning methods used for classification, regression, and outlier detection.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/implementing-support-vector-machines-svms-from-scratch\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/implementing-support-vector-machines-svms-from-scratch\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/implementing-support-vector-machines-svms-from-scratch\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Articles Home\",\"item\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Artificial Intelligence\",\"item\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/artificial-intelligence\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Implementing Support Vector Machines (SVMs) from Scratch\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#website\",\"url\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/\",\"name\":\"Developer Articles Hub\",\"description\":\"\",\"alternateName\":\"Developer 
Articles\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/#\\\/schema\\\/person\\\/a550b3e20d78bb4f79b7c6b7b53f0561\",\"name\":\"w3compadmin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/wp-content\\\/litespeed\\\/avatar\\\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266\",\"url\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/wp-content\\\/litespeed\\\/avatar\\\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266\",\"contentUrl\":\"https:\\\/\\\/www.w3computing.com\\\/articles\\\/wp-content\\\/litespeed\\\/avatar\\\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266\",\"caption\":\"w3compadmin\"},\"sameAs\":[\"http:\\\/\\\/w3computing.com\\\/articles\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Implementing Support Vector Machines (SVMs) from Scratch","description":"Support Vector Machines (SVMs) are a powerful set of supervised learning methods used for classification, regression, and outlier detection.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/","og_locale":"en_US","og_type":"article","og_title":"Implementing Support Vector Machines (SVMs) from Scratch","og_description":"Support Vector Machines (SVMs) are a powerful set of supervised learning methods used for classification, regression, and outlier detection.","og_url":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/","article_published_time":"2024-07-09T11:28:19+00:00","article_modified_time":"2024-07-09T11:28:22+00:00","author":"w3compadmin","twitter_card":"summary_large_image","twitter_misc":{"Written by":"w3compadmin","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"TechArticle","@id":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/#article","isPartOf":{"@id":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/"},"author":{"name":"w3compadmin","@id":"https:\/\/www.w3computing.com\/articles\/#\/schema\/person\/a550b3e20d78bb4f79b7c6b7b53f0561"},"headline":"Implementing Support Vector Machines (SVMs) from Scratch","datePublished":"2024-07-09T11:28:19+00:00","dateModified":"2024-07-09T11:28:22+00:00","mainEntityOfPage":{"@id":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/"},"wordCount":944,"articleSection":["Artificial Intelligence","Programming Languages","Python"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/","url":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/","name":"Implementing Support Vector Machines (SVMs) from Scratch","isPartOf":{"@id":"https:\/\/www.w3computing.com\/articles\/#website"},"datePublished":"2024-07-09T11:28:19+00:00","dateModified":"2024-07-09T11:28:22+00:00","author":{"@id":"https:\/\/www.w3computing.com\/articles\/#\/schema\/person\/a550b3e20d78bb4f79b7c6b7b53f0561"},"description":"Support Vector Machines (SVMs) are a powerful set of supervised learning methods used for classification, regression, and outlier 
detection.","breadcrumb":{"@id":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.w3computing.com\/articles\/implementing-support-vector-machines-svms-from-scratch\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Articles Home","item":"https:\/\/www.w3computing.com\/articles\/"},{"@type":"ListItem","position":2,"name":"Artificial Intelligence","item":"https:\/\/www.w3computing.com\/articles\/artificial-intelligence\/"},{"@type":"ListItem","position":3,"name":"Implementing Support Vector Machines (SVMs) from Scratch"}]},{"@type":"WebSite","@id":"https:\/\/www.w3computing.com\/articles\/#website","url":"https:\/\/www.w3computing.com\/articles\/","name":"Developer Articles Hub","description":"","alternateName":"Developer 
Articles","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.w3computing.com\/articles\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.w3computing.com\/articles\/#\/schema\/person\/a550b3e20d78bb4f79b7c6b7b53f0561","name":"w3compadmin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.w3computing.com\/articles\/wp-content\/litespeed\/avatar\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266","url":"https:\/\/www.w3computing.com\/articles\/wp-content\/litespeed\/avatar\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266","contentUrl":"https:\/\/www.w3computing.com\/articles\/wp-content\/litespeed\/avatar\/bd481d404e42caa2763662a3bfe825f8.jpg?ver=1780141266","caption":"w3compadmin"},"sameAs":["http:\/\/w3computing.com\/articles"]}]}},"featured_image_src":null,"featured_image_src_square":null,"author_info":{"display_name":"w3compadmin","author_link":"https:\/\/www.w3computing.com\/articles\/author\/w3compadmin\/"},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/posts\/2080","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/comments?post=2080"}],"version-history":[{"count":11,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/posts\/2080\/revisions"}],"predecessor-version":[{"id":2091,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/posts\/2080\/revisions\/
2091"}],"wp:attachment":[{"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/media?parent=2080"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/categories?post=2080"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.w3computing.com\/articles\/wp-json\/wp\/v2\/tags?post=2080"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}