{"id":32,"date":"2015-10-09T04:09:07","date_gmt":"2015-10-09T04:09:07","guid":{"rendered":"http:\/\/www.aarondefazio.com\/tangentially\/?p=32"},"modified":"2016-11-10T08:42:16","modified_gmt":"2016-11-10T08:42:16","slug":"the-many-ways-to-analyse-gradient-descent","status":"publish","type":"post","link":"https:\/\/www.aarondefazio.com\/tangentially\/?p=32","title":{"rendered":"The Many Ways to Analyse Gradient Descent"},"content":{"rendered":"<p>Consider the classical gradient descent method:<\/p>\n<div class=\"par-math-display\"><!--l. 32--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                      <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-rel\">=<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 34--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 37--><\/p>\n<p class=\"indent\" >   It&#x2019;s a thing of beauty isn&#x2019;t it? While it&#x2019;s not used directly in practice any more,<br \/>\nthe proof techniques used in its analysis are the building blocks behind the theory of<br \/>\nmore advanced optimization methods. I know of 8 di\ufb00erent ways of proving its<br \/>\nconvergence rate. Each of the proof techniques are interesting in their own right, but<br \/>\nmost books on convex optimization give just a single proof of convergence, then move<br \/>\nonto greater things. But to do research in modern convex optimization you should<br \/>\nknow them all.\n<\/p>\n<p><!--l. 46--><\/p>\n<p class=\"indent\" >   The purpose of this series of posts is to detail each of these proof techniques and<br \/>\nwhat applications they have to more advanced methods. This post will cover the<br \/>\nproofs under strong convexity assumptions, and the next post will cover the<br \/>\nnon-strongly convex case. Unlike most proofs in the literature, we will go into detail<br \/>\nof every step, so that these proofs can be used as a reference (don&#x2019;t cite this post<br \/>\ndirectly though, cite the original source preferably, or the technical notes version). If<br \/>\nyou are aware of any methods I&#x2019;ve not covered, please leave a comment with a<br \/>\nreference so I can update this post.<\/p>\n<p><!--l. 56--><\/p>\n<p class=\"indent\" >   For most of the proofs we end with a statement like<br \/>\n<!--l. 56--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>A<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b3<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><msub><mrow \n><mi \n>A<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/math>,<br \/>\nwhere <!--l. 57--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>A<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/math><br \/>\nis some quantity of interest, like distance to solution or function value<br \/>\nsub-optimality. A full proof requires chaining these inequalities for each<br \/>\n<!--l. 59--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>k<\/mi><\/math>, giving something<br \/>\nof the form <!--l. 59--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>A<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <msup><mrow \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b3<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msup \n><msub><mrow \n><mi \n>A<\/mi><\/mrow><mrow \n><mn>0<\/mn><\/mrow><\/msub \n><\/math>.<br \/>\nWe leave this step as a given.\n<\/p>\n<h3 class=\"likesectionHead\"><a \n id=\"x1-1000\"><\/a>Basic lemmas<\/h3>\n<p><!--l. 65--><\/p>\n<p class=\"noindent\" >These hold for any <!--l. 65--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>x<\/mi><\/math><br \/>\nand <!--l. 65--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>y<\/mi><\/math>. Here<br \/>\n<!--l. 65--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03bc<\/mi><\/math> is the strong convexity<br \/>\nconstant and <!--l. 66--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>L<\/mi><\/math><br \/>\nthe Lipschitz smoothness constant. These are completely<br \/>\nstandard, see Nesterov&#x2019;s book [<a \nhref=\"#Xnes-book\">7<\/a>] for proofs. We use the notation<br \/>\n<!--l. 68--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/math> for the unique<br \/>\nminimizer of <!--l. 68--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>f<\/mi><\/math><br \/>\n(for strongly convex problems).\n<\/p>\n<table class=\"equation\">\n<tr>\n<td> <a \n id=\"x1-1001r1\"><\/a><br \/>\n<!--l. 71--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" class=\"equation\">\n                <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfrac><mrow \n><mi \n>L<\/mi><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/math><\/td>\n<td class=\"eq-no\">(1)<\/td>\n<\/tr>\n<\/table>\n<table class=\"equation\">\n<tr>\n<td> <a \n id=\"x1-1002r2\"><\/a><\/p>\n<p><!--l. 76--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" class=\"equation\">\n                <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/math><\/td>\n<td class=\"eq-no\">(2)<\/td>\n<\/tr>\n<\/table>\n<table class=\"equation\">\n<tr>\n<td> <a \n id=\"x1-1003r3\"><\/a><br \/>\n<!--l. 81--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" class=\"equation\">\n             <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo>  <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>L<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/math><\/td>\n<td class=\"eq-no\">(3)<\/td>\n<\/tr>\n<\/table>\n<table class=\"equation\">\n<tr>\n<td> <a \n id=\"x1-1004r4\"><\/a><br \/>\n<!--l. 86--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" class=\"equation\">\n             <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo>  <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/math><\/td>\n<td class=\"eq-no\">(4)<\/td>\n<\/tr>\n<\/table>\n<table class=\"equation\">\n<tr>\n<td> <a \n id=\"x1-1005r5\"><\/a><br \/>\n<!--l. 91--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" class=\"equation\">\n                <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/math><\/td>\n<td class=\"eq-no\">(5)<\/td>\n<\/tr>\n<\/table>\n<table class=\"equation\">\n<tr>\n<td> <a \n id=\"x1-1006r6\"><\/a><\/p>\n<p><!--l. 96--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" class=\"equation\">\n                    <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mi \n>\u03bc<\/mi><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/math><\/td>\n<td class=\"eq-no\">(6)<\/td>\n<\/tr>\n<\/table>\n<p><!--l. 102--><\/p>\n<p class=\"noindent\" >\n<h3 class=\"sectionHead\"><span class=\"titlemark\">1   <\/span> <a \n id=\"x1-20001\"><\/a>Function Value Descent<\/h3>\n<p><!--l. 104--><\/p>\n<p class=\"noindent\" >There is a very simple proof involving just the function values. We start<br \/>\nby showing that the function value descent is controlled by the gradient<br \/>\nnorm:\n<\/p>\n<div class=\"newtheorem\">\n<!--l. 107--><\/p>\n<p class=\"noindent\" ><span class=\"head\"><br \/>\n<a \n id=\"x1-2001r1\"><\/a><br \/>\n<span \nclass=\"ecbx-1000\">Lemma 1.<\/span>  <\/span><span \nclass=\"ecti-1000\">For any given <\/span><!--l. 108--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi><\/math><span \nclass=\"ecti-1000\">,<\/span><br \/>\n<span \nclass=\"ecti-1000\">the change in function value between steps can be bounded as follows:<\/span>\n<\/p>\n<div class=\"math-display\"><!--l. 110--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                        <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><mi \n>\u03b1<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 112--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 115--><\/p>\n<p class=\"indent\" >   <span \nclass=\"ecti-1000\">in particular, if <\/span><!--l. 115--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/math><br \/>\n<span \nclass=\"ecti-1000\">we have <\/span><!--l. 115--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>L<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math><span \nclass=\"ecti-1000\">.<\/span>\n<\/p>\n<\/p><\/div>\n<p><!--l. 116--><\/p>\n<p class=\"indent\" >\n<div class=\"proof\">\n<!--l. 117--><\/p>\n<p class=\"indent\" >   <span class=\"head\"><br \/>\n<span \nclass=\"ecti-1000\">Proof.<\/span> <\/span>We start with (<a \nhref=\"#x1-1001r1\">1<!--tex4ht:ref: eq:lip-upper --><\/a>), the Lipschitz upper bound about <!--l. 118--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/math>:\n<\/p>\n<div class=\"math-display\"><!--l. 119--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfrac><mrow \n><mi \n>L<\/mi><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 121--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 124--><\/p>\n<p class=\"indent\" >   Now          we          plug          in          the          step          equation<br \/>\n<!--l. 124--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-rel\">=<\/mo> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mi \n>\u03b1<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-punc\">:<\/mo><\/math>\n<\/p>\n<div class=\"par-math-display\"><!--l. 126--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                      <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mfrac><mrow \n><mi \n>L<\/mi><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 128--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 131--><\/p>\n<p class=\"indent\" >   Negating and rearranging gives:\n<\/p>\n<div class=\"math-display\"><!--l. 132--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                        <mo \nclass=\"MathClass-rel\">\u2234<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><mi \n>\u03b1<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 134--><\/p>\n<p class=\"nopar\" >\n                                                                 \u25a1\n<\/p>\n<\/p><\/div>\n<p><!--l. 137--><\/p>\n<p class=\"indent\" >   Now since we are considering strongly convex problems, we actually have found<br \/>\na bound on the gradient norm in terms of function value. We apply (<a \nhref=\"#x1-1004r4\">4<!--tex4ht:ref: eq:sc-upper --><\/a>):<br \/>\n<!--l. 139--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo>  <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math> using<br \/>\n<!--l. 140--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>x<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/math>,<br \/>\n<!--l. 140--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>y<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/math>:\n<\/p>\n<div class=\"math-display\"><!--l. 141--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                 <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo>  <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 143--><\/p>\n<p class=\"nopar\" >\n<div class=\"math-display\"><!--l. 144--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                           <mo \nclass=\"MathClass-rel\">\u2234<\/mo> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mn>2<\/mn><mi \n>\u03bc<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 146--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 149--><\/p>\n<p class=\"indent\" >   So combining these two results:\n<\/p>\n<div class=\"math-display\"><!--l. 150--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                  <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>L<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 152--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 155--><\/p>\n<p class=\"indent\" >   We then negate, add &#x0026; subtract <!--l. 155--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/math>,<br \/>\nthen rearrange:\n<\/p>\n<div class=\"par-math-display\"><!--l. 157--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                          <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo><mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow>\n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 159--><\/p>\n<p class=\"nopar\" >\n<div class=\"math-display\"><!--l. 160--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n            <mo \nclass=\"MathClass-rel\">\u2234<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo><mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow>\n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 162--><\/p>\n<p class=\"nopar\" >\n<div class=\"math-display\"><!--l. 163--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                    <mo \nclass=\"MathClass-rel\">\u2234<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 165--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 168--><\/p>\n<p class=\"indent\" >   Note that this function value style proof requires the step size<\/p>\n<p><!--l. 168--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/math> or smaller,<br \/>\ninstead of <!--l. 169--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo>   <mfrac><mrow \n><mn>2<\/mn><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mi \n>L<\/mi><\/mrow><\/mfrac><\/math>,<br \/>\nwhich we shall see gives the fastest convergence when using some of the other proof<br \/>\ntechniques below.\n<\/p>\n<p><!--l. 174--><\/p>\n<p class=\"noindent\" >\n<h4 class=\"likesubsectionHead\"><a \n id=\"x1-30001\"><\/a>Comments<\/h4>\n<p><!--l. 176--><\/p>\n<p class=\"noindent\" >This proof (when <!--l. 176--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/math><br \/>\nis used) treats gradient descent as an upper bound minimization scheme. Such<br \/>\nmethods, sometimes known under the Majorization-Minimization nomenclature [<a \nhref=\"#Xmm\">3<\/a>],<br \/>\nare quite widespread in optimization. They can be applied to non-convex problems<br \/>\neven, although the convergence rates in that case are necessarily weak. Likewise this<br \/>\nproof gives the weakest convergence rate of the proof techniques presented in this<br \/>\npost, but it is perhaps the simplest. Upper bound minimization techniques<br \/>\nhave recently seen interesting applications in 2nd order optimization, in the<br \/>\nform of Nesterov&#x2019;s cubicly regularized Newton&#x2019;s method [<a \nhref=\"#Xnes-cubic\">9<\/a>]. For stochastic<br \/>\noptimization, the MISO method is also a upper bound minimization scheme [<a \nhref=\"#Xmiso2\">6<\/a>]. For<br \/>\nnon-smooth problems, an interesting application of the MM approach is<br \/>\nin minimizing convex problems with non-convex regularizers of the form<br \/>\n<!--l. 189--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03bb<\/mi><mo class=\"qopname\">log<\/mo><!--nolimits--> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mfenced separators=\"\" \nopen=\"|\"  close=\"|\" ><mrow><mi \n>x<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo> <mn>1<\/mn><\/mrow><\/mfenced><\/math>, in<br \/>\nthe form of reweighted L1 regularization [<a \nhref=\"#Xreweighted-l1\">5<\/a>].\n<\/p>\n<p><!--l. 193--><\/p>\n<p class=\"noindent\" >\n<h3 class=\"sectionHead\"><span class=\"titlemark\">2   <\/span> <a \n id=\"x1-40002\"><\/a>Iterate Descent<\/h3>\n<p><!--l. 195--><\/p>\n<p class=\"noindent\" >There is also a simple proof involving just the distance of the iterates<br \/>\n<!--l. 196--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/math> to the solution. Using<br \/>\nthe de\ufb01nition of the step <!--l. 196--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-rel\">=<\/mo> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mi \n>\u03b1<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/math>:<\/p>\n<p><!--tex4ht:inline--><\/p>\n<p><!--l. 197--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n>                     <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">           <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>2<\/mn><mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo> <msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>          <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 200--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 203--><\/p>\n<p class=\"indent\" >   We now apply both the inner product bounds (<a \nhref=\"#x1-1005r5\">5<!--tex4ht:ref: eq:ip-lip --><\/a>)<br \/>\n<!--l. 203--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" > <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math>and (<a \nhref=\"#x1-1006r6\">6<!--tex4ht:ref: eq:ip-sc --><\/a>)<br \/>\n<!--l. 204--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" > <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mi \n>\u03bc<\/mi><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math> , in the following<br \/>\nnegated forms, using <!--l. 205--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">=<\/mo> <mn>0<\/mn><\/math>:\n<\/p>\n<div class=\"par-math-display\"><!--l. 207--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                              <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><mo \nclass=\"MathClass-rel\">\u2264<\/mo><mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>1<\/mn><\/mrow>\n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 209--><\/p>\n<p class=\"nopar\" >\n<div class=\"math-display\"><!--l. 210--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                         <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><mo \nclass=\"MathClass-rel\">\u2264<\/mo><mo \nclass=\"MathClass-bin\">\u2212<\/mo><mi \n>\u03bc<\/mi><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 212--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 215--><\/p>\n<p class=\"indent\" >   The inner product term has a weight<br \/>\n<!--l. 215--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mn>2<\/mn><mi \n>\u03b1<\/mi><\/math>, and we apply each<br \/>\nof these with weight <!--l. 216--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi><\/math>,<br \/>\ngiving:\n<\/p>\n<div class=\"par-math-display\"><!--l. 218--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n                 <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>\u03bc<\/mi><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 220--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 223--><\/p>\n<p class=\"indent\" >   Now if we take <!--l. 223--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><mo \nclass=\"MathClass-punc\">,<\/mo><\/math><br \/>\nthen the last term cancels and we have:\n<\/p>\n<div class=\"math-display\"><!--l. 225--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n                               <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 227--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 230--><\/p>\n<p class=\"indent\" >   This proof is not as tight as possible. Instead of splitting the inner product term<br \/>\nand applying both bounds (5) and (6), we can apply the following stronger combined<br \/>\nbound from Nesterov&#x2019;s Book [<a \nhref=\"#Xnes-book\">7<\/a>]: <\/p>\n<table class=\"equation\">\n<tr>\n<td> <a \n id=\"x1-4002r7\"><\/a><br \/>\n<!--l. 233--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" class=\"equation\">\n       <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2265<\/mo> <mfrac><mrow \n><mi \n>\u03bc<\/mi><mi \n>L<\/mi><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo>    <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/math><\/td>\n<td class=\"eq-no\">(7)<\/td>\n<\/tr>\n<\/table>\n<p><!--l. 238--><\/p>\n<p class=\"indent\" >   Doing so yields:\n<\/p>\n<div class=\"par-math-display\"><!--l. 240--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n            <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>2<\/mn><mi \n>\u03b1<\/mi><mi \n>\u03bc<\/mi><mi \n>L<\/mi><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo>  <mfrac><mrow \n><mn>2<\/mn><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 242--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 245--><\/p>\n<p class=\"indent\" >   Now clearly to cancel out the gradient norm term we can take<br \/>\n<!--l. 245--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo>   <mfrac><mrow \n><mn>2<\/mn><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mi \n>L<\/mi><\/mrow><\/mfrac><\/math>,<br \/>\nwhich yields the convergence rate:<br \/>\n<!--tex4ht:inline--><\/p>\n<p><!--l. 247--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo>  <mfrac><mrow \n><mn>4<\/mn><mi \n>\u03bc<\/mi><mi \n>L<\/mi><\/mrow> \n<mrow \n><msup><mrow \n> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mrow><\/mfrac><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">           <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2248<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>4<\/mn><mi \n>\u03bc<\/mi><\/mrow> \n <mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac> <\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>    <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>                             <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 250--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 254--><\/p>\n<p class=\"noindent\" >\n<h4 class=\"likesubsectionHead\"><a \n id=\"x1-50002\"><\/a>Comments<\/h4>\n<p><!--l. 256--><\/p>\n<p class=\"noindent\" >This proof technique is the building block of the standard stochastic<br \/>\ngradient descent (SGD) proof. The above proof is mostly based on<br \/>\nNesterov&#x2019;s book, I&#x2019;m not sure what the original citation is. It has a<br \/>\nnice geometric interpretation, as the bound on the inner product term<br \/>\n<!--l. 259--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" > <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/math> can<br \/>\neasily be illustrated in 2 dimensions, say on a white-board. It&#x2019;s e\ufb00ectively a statement<br \/>\non the angles that gradients in convex problems can take. To get the strongest<br \/>\nbound using this technique, the complex bound in Equation <a \nhref=\"#x1-4002r7\">7<!--tex4ht:ref: eq:nes-ip-bound --><\/a> has to be<br \/>\nused. That stronger bound is not really straight-forward, and perhaps too<br \/>\ntechnical (in my opinion) to use in a textbook proof of the convergence<br \/>\nrate.<\/p>\n<p><!--l. 268--><\/p>\n<p class=\"noindent\" >\n<h3 class=\"sectionHead\"><span class=\"titlemark\">3   <\/span> <a \n id=\"x1-60003\"><\/a>Using the Second Fundamental Theorem of Calculus<\/h3>\n<p><!--l. 270--><\/p>\n<p class=\"noindent\" >Recall the second fundamental theorem of calculus:\n<\/p>\n<div class=\"math-display\"><!--l. 271--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                  <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">=<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo><msubsup><mrow \n><mo \nclass=\"MathClass-op\"> \u222b\n <!--nolimits--><\/mo><!--nolimits--><\/mrow><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>y<\/mi><\/mrow><\/msubsup \n><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>z<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mi \n>d<\/mi><mi \n>z<\/mi><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 273--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 276--><\/p>\n<p class=\"indent\" >   This can be applied along intervals in higher dimensions.<br \/>\nThe case we care about is applying it to the \ufb01rst derivatives of<br \/>\n<!--l. 277--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>f<\/mi><\/math>,<br \/>\ngiving an integral involving the Hessian:\n<\/p>\n<div class=\"math-display\"><!--l. 279--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                     <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">=<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">+<\/mo><msubsup><mrow \n><mo \nclass=\"MathClass-op\"> \u222b\n <!--nolimits--><\/mo><!--nolimits--><\/mrow><mrow \n><mn>0<\/mn><\/mrow><mrow \n><mn>1<\/mn><\/mrow><\/msubsup \n> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03c4<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mspace width=\"0.3em\" class=\"thinspace\"\/><mo \nclass=\"MathClass-punc\">,<\/mo><mspace width=\"0.3em\" class=\"thinspace\"\/><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced><mi \n>d<\/mi><mi \n>\u03c4<\/mi><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 281--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 284--><\/p>\n<p class=\"indent\" >   We abuse the angle bracket notation here to apply to matrix-vector products as<br \/>\nwell as the usual dot-product. Using this result gives an interesting proof of<br \/>\nconvergence of gradient descent that doesn&#x2019;t rely on the usual convexity<br \/>\nlemmas. This proof bounds the distance to solution, just like the previous<br \/>\nproof.\n<\/p>\n<div class=\"newtheorem\">\n<p><!--l. 289--><\/p>\n<p class=\"noindent\" ><span class=\"head\"><br \/>\n<a \n id=\"x1-6001r2\"><\/a><br \/>\n<span \nclass=\"ecbx-1000\">Lemma 2.<\/span>  <\/span><span \nclass=\"ecti-1000\">For any positive <\/span><!--l. 290--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>t<\/mi><\/math><span \nclass=\"ecti-1000\">:<\/span>\n<\/p>\n<div class=\"math-display\"><!--l. 291--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>t<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2264<\/mo><mo class=\"qopname\"> max<\/mo> <mfenced separators=\"\" \nopen=\"{\"  close=\"}\" ><mrow><mfenced separators=\"\" \nopen=\"|\"  close=\"|\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>t<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">,<\/mo> <mfenced separators=\"\" \nopen=\"|\"  close=\"|\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>t<\/mi><mi \n>\u03bc<\/mi><\/mrow><\/mfenced><\/mrow><\/mfenced> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 293--><\/p>\n<p class=\"nopar\" >\n<\/p><\/div>\n<p><!--l. 295--><\/p>\n<p class=\"indent\" >\n<div class=\"proof\">\n<!--l. 296--><\/p>\n<p class=\"indent\" >   <span class=\"head\"><br \/>\n<span \nclass=\"ecti-1000\">Proof.<\/span> <\/span>We start by applying the second fundamental theorem of calculus in the above<br \/>\nform:<\/p>\n<p><!--tex4ht:inline--><\/p>\n<p><!--l. 298--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>t<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><\/mfenced><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>t<\/mi><msubsup><mrow \n><mo \nclass=\"MathClass-op\">\u222b\n <!--nolimits--><\/mo><!--nolimits--><\/mrow><mrow \n><mn>0<\/mn><\/mrow><mrow \n><mn>1<\/mn><\/mrow><\/msubsup \n> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03c4<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mspace width=\"0.3em\" class=\"thinspace\"\/><mo \nclass=\"MathClass-punc\">,<\/mo><mspace width=\"0.3em\" class=\"thinspace\"\/><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced><mi \n>d<\/mi><mi \n>\u03c4<\/mi><\/mrow><\/mfenced><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">                    <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msubsup><mrow \n><mo \nclass=\"MathClass-op\">\u222b\n <!--nolimits--><\/mo><!--nolimits--><\/mrow><mrow \n><mn>0<\/mn><\/mrow><mrow \n><mn>1<\/mn><\/mrow><\/msubsup \n> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><mi \n>t<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03c4<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>I<\/mi><mspace width=\"0.3em\" class=\"thinspace\"\/><mo \nclass=\"MathClass-punc\">,<\/mo><mspace width=\"0.3em\" class=\"thinspace\"\/><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced><mi \n>d<\/mi><mi \n>\u03c4<\/mi><\/mrow><\/mfenced>    <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">                    <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <msubsup><mrow \n><mo \nclass=\"MathClass-op\">\u222b\n <!--nolimits--><\/mo><!--nolimits--><\/mrow><mrow \n><mn>0<\/mn><\/mrow><mrow \n><mn>1<\/mn><\/mrow><\/msubsup \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><mi \n>t<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03c4<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>I<\/mi><mspace width=\"0.3em\" class=\"thinspace\"\/><mo \nclass=\"MathClass-punc\">,<\/mo><mspace width=\"0.3em\" class=\"thinspace\"\/><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><\/mfenced><\/mrow><\/mfenced><mi \n>d<\/mi><mi \n>\u03c4<\/mi>     <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">                    <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <msubsup><mrow \n><mo \nclass=\"MathClass-op\">\u222b\n <!--nolimits--><\/mo><!--nolimits--><\/mrow><mrow \n><mn>0<\/mn><\/mrow><mrow \n><mn>1<\/mn><\/mrow><\/msubsup \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>t<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03c4<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>y<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>x<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>I<\/mi><\/mrow><\/mfenced> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced><mi \n>d<\/mi><mi \n>\u03c4<\/mi>     <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">                    <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <munder class=\"msub\"><mrow \n><mo class=\"qopname\">max<\/mo><\/mrow><mrow \n><mi \n>z<\/mi><\/mrow><\/munder \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>t<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>z<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>I<\/mi><\/mrow><\/mfenced> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><mi \n>x<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>y<\/mi><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>              <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>    <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 304--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 307--><\/p>\n<p class=\"indent\" >   Now we examine the eigenvalues of<br \/>\n<!--l. 307--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>z<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/math>. the minimum one<br \/>\nis at least <!--l. 308--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03bc<\/mi><\/math> and the<br \/>\nmaximum at most <!--l. 308--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>L<\/mi><\/math>.<br \/>\nAn examination of the possible range of the eigenvalues of<br \/>\n<!--l. 309--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>t<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>z<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>I<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/math> gives<br \/>\n<!--l. 310--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mo class=\"qopname\"> max<\/mo> <mfenced separators=\"\" \nopen=\"{\"  close=\"}\" ><mrow><mfenced separators=\"\" \nopen=\"|\"  close=\"|\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>t<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">,<\/mo> <mfenced separators=\"\" \nopen=\"|\"  close=\"|\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>t<\/mi><mi \n>\u03bc<\/mi><\/mrow><\/mfenced><\/mrow><\/mfenced><\/math>.   \u25a1\n<\/p>\n<\/p><\/div>\n<p><!--l. 312--><\/p>\n<p class=\"indent\" >   Using this lemma gives a simple proof along the lines of the iterate descent<br \/>\nproof.\n<\/p>\n<p><!--l. 315--><\/p>\n<p class=\"indent\" >   First, note that <!--l. 315--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" > <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/math><br \/>\nis in the right form for direct application of this lemma after substituting in the step<br \/>\nequation:<\/p>\n<p><!--tex4ht:inline--><\/p>\n<p><!--l. 318--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><\/mfenced>  <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">         <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mo class=\"qopname\">max<\/mo> <mfenced separators=\"\" \nopen=\"{\"  close=\"}\" ><mrow><mfenced separators=\"\" \nopen=\"|\"  close=\"|\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">,<\/mo> <mfenced separators=\"\" \nopen=\"|\"  close=\"|\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>\u03bc<\/mi><\/mrow><\/mfenced><\/mrow><\/mfenced> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>                       <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 321--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 324--><\/p>\n<p class=\"indent\" >   Note we introduced <!--l. 324--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/math><br \/>\nfor \u201cfree\u201d, as it&#x2019;s of course equal to zero. The next step is optimize this bound in terms of<br \/>\n<!--l. 326--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi><\/math>. Note that<br \/>\n<!--l. 326--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>L<\/mi><\/math> is always<br \/>\nlarger than <!--l. 326--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03bc<\/mi><\/math>, so<br \/>\nwe take the <!--l. 327--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" > <mfenced separators=\"\" \nopen=\"|\"  close=\"|\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><\/math><br \/>\nabsolute value as negative, and the other positive, and match their magnitudes:\n<\/p>\n<div class=\"math-display\"><!--l. 329--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                       <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03b1<\/mi><mi \n>L<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>\u03bc<\/mi><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 331--><\/p>\n<p class=\"nopar\" >\n<div class=\"math-display\"><!--l. 332--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                   <mo \nclass=\"MathClass-rel\">\u2234<\/mo> <mi \n>\u03b1<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>L<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03bc<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">=<\/mo> <mn>2<\/mn><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 334--><\/p>\n<p class=\"nopar\" >\n<div class=\"math-display\"><!--l. 335--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                    <mo \nclass=\"MathClass-rel\">\u2234<\/mo> <mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo>    <mfrac><mrow \n><mn>2<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03bc<\/mi><\/mrow><\/mfrac><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 337--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 340--><\/p>\n<p class=\"indent\" >   Which gives the convergence rate:\n<\/p>\n<div class=\"math-display\"><!--l. 341--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                 <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mfrac><mrow \n><mi \n>L<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03bc<\/mi><\/mrow>\n<mrow \n><mi \n>L<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03bc<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 343--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 346--><\/p>\n<p class=\"indent\" >   Note that this rate is in terms of the distance to solution directly, rather than its<br \/>\nsquare like in the previous proof. Converting to squared norm gives the same rate as<br \/>\nbefore.\n<\/p>\n<p><!--l. 351--><\/p>\n<p class=\"noindent\" >\n<h4 class=\"likesubsectionHead\"><a \n id=\"x1-70003\"><\/a>Comments<\/h4>\n<p><!--l. 353--><\/p>\n<p class=\"noindent\" >This proof technique has a linear-algebra feel to it, and is perhaps most comfortable<br \/>\nto people with that background. The absolute values make it ugly in my opinion<br \/>\nthough. This proof technique is the building block used in the standard proof of the<br \/>\nconvergence of the heavy ball method for strongly convex problems [<a \nhref=\"#Xpolyak-book\">10<\/a>]. It doesn&#x2019;t<br \/>\nappear to have many other applications, and so is probably the least seen of<br \/>\nthe techniques in this document. The main use of this kind of argument<br \/>\nis in lower complexity bounds, where we often do some sort of eigenvalue<br \/>\nanalysis.\n<\/p>\n<p><!--l. 364--><\/p>\n<p class=\"noindent\" >\n<h3 class=\"sectionHead\"><span class=\"titlemark\">4   <\/span> <a \n id=\"x1-80004\"><\/a>Lyapunov Style<\/h3>\n<p><!--l. 366--><\/p>\n<p class=\"noindent\" >The above results prove convergence of either the iterates or the function value<br \/>\nseparately. There is an interesting proof involving the sum of the two quantities. First<br \/>\nwe start with with the iterate convergence:\n<\/p>\n<p><!--l. 371--><\/p>\n<p class=\"indent\" >\n<!--tex4ht:inline--><\/p>\n<p><!--l. 371--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n>                     <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">           <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>2<\/mn><mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo> <msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>          <\/mtr><\/mtable>\n<\/math><\/p>\n<p><!--l. 374--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 377--><\/p>\n<p class=\"indent\" >   Now we use the function descent amount equation (Lemma <a \nhref=\"#x1-2001r1\">1<!--tex4ht:ref: lem:func-descent-amt --><\/a>) to bound the gradient norm<br \/>\nterm: <!--l. 378--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mfrac><mrow \n><mn>1<\/mn><\/mrow>\n<mrow \n><mi \n>c<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/math> , where we<br \/>\nhave de\ufb01ned <!--l. 379--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>c<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mn>1<\/mn><mo \nclass=\"MathClass-bin\">\u2215<\/mo> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><mi \n>\u03b1<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><\/mrow><\/mfenced><\/math>:\n<\/p>\n<div class=\"math-display\"><!--l. 380--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n      <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>2<\/mn><mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 382--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 385--><\/p>\n<p class=\"indent\" >   Now we use the strong convexity lower bound (<a \nhref=\"#x1-1002r2\">2<!--tex4ht:ref: eq:sc-lower --><\/a>) in a rearranged form:<br \/>\n<!--tex4ht:inline--><\/p>\n<p><!--l. 387--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">,<\/mo><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>                     <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 389--><\/p>\n<p class=\"nopar\" >\nto simplify:\n<\/p>\n<div class=\"par-math-display\"><!--l. 392--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n  <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>\u03bc<\/mi><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-bin\">+<\/mo> <mn>2<\/mn><mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 394--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 397--><\/p>\n<p class=\"indent\" >   Now rearranging further:\n<\/p>\n<div class=\"math-display\"><!--l. 398--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n<mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-bin\">+<\/mo><mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>\u03bc<\/mi><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-bin\">+<\/mo><mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>2<\/mn><mi \n>\u03b1<\/mi><\/mrow><\/mfenced> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 400--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 403--><\/p>\n<p class=\"indent\" >   Now this equation gives a descent rate for the weighted sum of<br \/>\n<!--l. 403--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math> and<br \/>\n<!--l. 404--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">.<\/mo><\/math> The<br \/>\nbest rate is given by matching the two convergence rates, that of the iterate distance<br \/>\nterms:\n<\/p>\n<div class=\"math-display\"><!--l. 406--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                              <mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><mi \n>\u03bc<\/mi><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 408--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 411--><\/p>\n<p class=\"indent\" >   and that of the function value terms, which changes from<br \/>\n<!--l. 411--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math> to<br \/>\n<!--l. 412--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>2<\/mn><mi \n>\u03b1<\/mi><\/math>:<br \/>\n<!--tex4ht:inline--><\/p>\n<p><!--l. 413--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"> <mfrac><mrow \n><mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>2<\/mn><msub><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow>\n    <mrow \n><mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mrow><\/mfrac>   <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mfrac><mrow \n><mn>2<\/mn><\/mrow> \n<mrow \n><mi \n>c<\/mi><mi \n>\u03b1<\/mi><\/mrow><\/mfrac>       <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">         <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>2<\/mn> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><mi \n>\u03b1<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">         <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mi \n>\u03b1<\/mi><mi \n>L<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>1<\/mn><mo \nclass=\"MathClass-punc\">.<\/mo>      <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>                                     <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 417--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 420--><\/p>\n<p class=\"indent\" >   Matching these two rates:\n<\/p>\n<div class=\"math-display\"><!--l. 421--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                        <mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mi \n>\u03b1<\/mi><mi \n>L<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mn>1<\/mn><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 423--><\/p>\n<p class=\"nopar\" >\n<div class=\"math-display\"><!--l. 424--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                   <mo \nclass=\"MathClass-rel\">\u2234<\/mo> <mn>2<\/mn> <mo \nclass=\"MathClass-rel\">=<\/mo> <mi \n>\u03b1<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 426--><\/p>\n<p class=\"nopar\" >\n<div class=\"math-display\"><!--l. 427--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                    <mo \nclass=\"MathClass-rel\">\u2234<\/mo> <mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo>    <mfrac><mrow \n><mn>2<\/mn><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfrac><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 429--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 432--><\/p>\n<p class=\"indent\" >   Using this derived value for <!--l. 432--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi><\/math><\/p>\n<p>gives a convergence rate of <!--l. 433--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mfrac><mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mi \n>L<\/mi><\/mrow><\/mfrac><\/math>.<br \/>\nI.e.\n<\/p>\n<div class=\"math-display\"><!--l. 434--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n<mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-bin\">+<\/mo><mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mfrac><mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><msup><mrow \n><mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 436--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 439--><\/p>\n<p class=\"indent\" >   and therefore after <!--l. 439--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>k<\/mi><\/math><br \/>\nsteps:\n<\/p>\n<div class=\"math-display\"><!--l. 440--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced> <mo \nclass=\"MathClass-rel\">\u2264<\/mo><msup><mrow \n> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mfrac><mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><msup><mrow \n><mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mn>0<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mn>0<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 442--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 445--><\/p>\n<p class=\"indent\" >   The constants can be simpli\ufb01ed to:\n<\/p>\n<p><!--l. 447--><\/p>\n<p class=\"indent\" >\n<p><!--tex4ht:inline--><\/p>\n<p><!--l. 447--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"> <mi \n>c<\/mi><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">      <mfrac><mrow \n><msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mrow><\/p>\n<p><mrow \n><mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><mi \n>\u03b1<\/mi><mi \n>L<\/mi><\/mrow><\/mfenced><\/mrow><\/mfrac><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">    <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">      <mfrac><mrow \n><mi \n>\u03b1<\/mi><\/mrow>\n<mrow \n><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><mi \n>\u03b1<\/mi><mi \n>L<\/mi><\/mrow><\/mfrac>   <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">    <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">      <mfrac><mrow \n><mi \n>\u03b1<\/mi><\/mrow>\n<mrow \n><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mfrac><mrow \n><mi \n>L<\/mi><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfrac>   <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">    <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">    <mfrac><mrow \n><mi \n>\u03b1<\/mi><\/mrow>\n<mrow \n>  <mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow>\n<mrow \n><mi \n>\u03bc<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfrac>      <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">    <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\">   <mfrac><mrow \n><mn>2<\/mn><\/mrow>\n<mrow \n><mi \n>\u03bc<\/mi><\/mrow><\/mfrac><mo \nclass=\"MathClass-punc\">.<\/mo>        <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>                                             <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 453--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 456--><\/p>\n<p class=\"indent\" >   Now we use: <!--l. 456--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mn>0<\/mn><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfrac><mrow \n><mi \n>L<\/mi><\/mrow> \n<mrow \n><mn>2<\/mn><\/mrow><\/mfrac><msup><mrow \n>  <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mn>0<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math><br \/>\non the right, and we just drop the function value term altogether on the<br \/>\nleft:\n<\/p>\n<p><!--l. 460--><\/p>\n<p class=\"indent\" >\n<p><!--tex4ht:inline--><\/p>\n<p><!--l. 460--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mfrac><mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msup \n><mfrac><mrow \n><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow>\n  <mrow \n><mi \n>\u03bc<\/mi><\/mrow><\/mfrac><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mn>0<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>                          <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 462--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 465--><\/p>\n<p class=\"indent\" >   If we instead use the more robust step size<br \/>\n<!--l. 465--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" > <mfrac><mrow \n><mn>1<\/mn><\/mrow>\n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/math>, which doesn&#x2019;t<br \/>\nrequire knowledge of <!--l. 466--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03bc<\/mi><\/math>,<br \/>\nthen a simple calculation shows that we instead get<br \/>\n<!--l. 467--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>c<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mn>2<\/mn><mi \n>L<\/mi><\/math>, and<br \/>\nso:<br \/>\n<!--tex4ht:inline--><\/p>\n<p><!--l. 468--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msup \n> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><msup><mrow \n><mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mn>0<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mfrac><mrow \n><mn>2<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac> <mfenced separators=\"\" \nopen=\"[\"  close=\"]\" ><mrow><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mn>0<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><\/mfenced><mo \nclass=\"MathClass-punc\">,<\/mo><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">         <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msup \n><mn>2<\/mn><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mn>0<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>                <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>                 <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 471--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 474--><\/p>\n<p class=\"indent\" >   The right hand side is obviously a much tighter bound then when<br \/>\n<!--l. 474--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mn>2<\/mn><mo \nclass=\"MathClass-bin\">\u2215<\/mo><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>\u03bc<\/mi> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>L<\/mi><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/math> is<\/p>\n<p>used, but the geometric rate is roughly twice as slow.\n<\/p>\n<p><!--l. 478--><\/p>\n<p class=\"noindent\" >\n<h4 class=\"likesubsectionHead\"><a \n id=\"x1-90004\"><\/a>Comments<\/h4>\n<p><!--l. 480--><\/p>\n<p class=\"noindent\" >This proof technique has seen a lot of application lately. It is used for the SAGA<br \/>\n[<a \nhref=\"#Xadefazio-nips2014\">2<\/a>]and SVRG [<a \nhref=\"#Xsvrg\">4<\/a>] methods, and can be applied to accelerated method even, such as<br \/>\nthe accelerated coordinate descent theory [<a \nhref=\"#Xnes-coord\">8<\/a>]. The Lyapunov function analysis<br \/>\ntechnique is of great general utility, and so it is worth studying carefully. It is covered<br \/>\nperhaps best in Polyak&#x2019;s book [<a \nhref=\"#Xpolyak-book\">10<\/a>].\n<\/p>\n<p><!--l. 488--><\/p>\n<p class=\"noindent\" >\n<h3 class=\"sectionHead\"><span class=\"titlemark\">5   <\/span> <a \n id=\"x1-100005\"><\/a>Gradient Norm Descent<\/h3>\n<p><!--l. 490--><\/p>\n<p class=\"noindent\" >In the strongly convex case, it is actually possible to show that the gradient norm decreases<br \/>\nat least linearly as well as the function value and iterates. This requires a \ufb01xed step size<br \/>\nof <!--l. 492--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/math>, as<br \/>\nit is not true when line searches are used.\n<\/p>\n<div class=\"newtheorem\">\n<!--l. 494--><\/p>\n<p class=\"noindent\" ><span class=\"head\"><br \/>\n<a \n id=\"x1-10001r3\"><\/a><br \/>\n<span \nclass=\"ecbx-1000\">Lemma 3.<\/span>  <\/span><span \nclass=\"ecti-1000\">For <\/span><!--l. 495--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/math><span \nclass=\"ecti-1000\">:<\/span>\n<\/p>\n<div class=\"math-display\"><!--l. 496--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n                          <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>2<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 498--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 501--><\/p>\n<p class=\"indent\" >   <span \nclass=\"ecti-1000\">Note that <\/span><!--l. 501--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>2<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">=<\/mo>  <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><msup><mrow \n><mi \n>L<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mrow><\/mfrac><msup><mrow \n>  <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math><br \/>\n<span \nclass=\"ecti-1000\">and <\/span><!--l. 502--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">=<\/mo>  <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><msup><mrow \n><mi \n>L<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mrow><\/mfrac><msup><mrow \n>  <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/math><span \nclass=\"ecti-1000\">.<\/span><\/p>\n<\/p><\/div>\n<p><!--l. 503--><\/p>\n<p class=\"indent\" >\n<div class=\"proof\">\n<!--l. 504--><\/p>\n<p class=\"indent\" >   <span class=\"head\"><br \/>\n<span \nclass=\"ecti-1000\">Proof.<\/span> <\/span>We start by expanding in terms of the step equation<br \/>\n<!--l. 504--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-rel\">=<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mo \nclass=\"MathClass-punc\">.<\/mo><\/math><br \/>\n<!--tex4ht:inline--><\/p>\n<p><!--l. 505--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>2<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03b1<\/mi><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n>  <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">             <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <msup><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">             <\/mtd><mtd \nclass=\"eqnarray-2\">    <\/mtd><mtd \nclass=\"eqnarray-3\">   <mo \nclass=\"MathClass-bin\">+<\/mo><mn>2<\/mn><mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mspace width=\"0.3em\" class=\"thinspace\"\/><mo \nclass=\"MathClass-punc\">,<\/mo><mspace width=\"0.3em\" class=\"thinspace\"\/><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <mo \nclass=\"MathClass-punc\">.<\/mo>  <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>               <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 509--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 512--><\/p>\n<p class=\"indent\" >   Now applying both inner product bounds (5) and (6):\n<\/p>\n<div class=\"math-display\"><!--l. 513--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n        <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>w<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>w<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>\u03b1<\/mi><mi \n>\u03bc<\/mi><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-bin\">+<\/mo> <mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 515--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 518--><\/p>\n<p class=\"indent\" >   So for <!--l. 518--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi> <mo \nclass=\"MathClass-rel\">=<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/math><br \/>\nthis simpli\ufb01es to:\n<\/p>\n<div class=\"math-display\"><!--l. 519--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n><msup><mrow \n>\n                            <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>2<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 521--><\/p>\n<p class=\"nopar\" >\n                                                                 \u25a1\n<\/p>\n<\/p><\/div>\n<p><!--l. 524--><\/p>\n<p class=\"indent\" >   Chaining this result (Lemma <a \nhref=\"#x1-10001r3\">3<!--tex4ht:ref: lem:norm-descends --><\/a>) over<br \/>\n<!--l. 524--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>k<\/mi><\/math><br \/>\ngives:<\/p>\n<p><!--tex4ht:inline--><\/p>\n<p><!--l. 525--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" >\n<mtable \nclass=\"eqnarray-star\" columnalign=\"right center left\" >\n<mtr><mtd \nclass=\"eqnarray-1\"><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">\u2264<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msup \n><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mn>0<\/mn><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n>  <\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>\n<\/mtr><mtr><mtd \nclass=\"eqnarray-1\">           <\/mtd><mtd \nclass=\"eqnarray-2\">   <mo \nclass=\"MathClass-rel\">=<\/mo><\/mtd><mtd \nclass=\"eqnarray-3\"><msup><mrow \n>   <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msup \n> <mfrac><mrow \n><mn>1<\/mn><\/mrow>\n<mrow \n><msup><mrow \n><mi \n>L<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mn>0<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo><\/mtd><mtd \nclass=\"eqnarray-4\"> <mtext class=\"eqnarray\"><\/mtext><\/mtd>                               <\/mtr><\/mtable>\n<\/math><br \/>\n<!--l. 528--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 531--><\/p>\n<p class=\"indent\" >   We now use <!--l. 531--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-rel\">=<\/mo> <mfrac><mrow \n><msup><mrow \n><mi \n>L<\/mi><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n> <mo \nclass=\"MathClass-punc\">:<\/mo><\/math>\n<\/p>\n<div class=\"math-display\"><!--l. 532--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                           <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <mi \n>f<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-rel\">\u2264<\/mo><msup><mrow \n> <mfenced separators=\"\" \nopen=\"(\"  close=\")\" ><mrow><mn>1<\/mn> <mo \nclass=\"MathClass-bin\">\u2212<\/mo><mfrac><mrow \n><mi \n>\u03bc<\/mi><\/mrow> \n<mrow \n><mi \n>L<\/mi><\/mrow><\/mfrac><\/mrow><\/mfenced><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msup \n> <mfrac><mrow \n><mn>1<\/mn><\/mrow> \n<mrow \n><mn>2<\/mn><mi \n>\u03bc<\/mi><\/mrow><\/mfrac><msup><mrow \n> <mfenced separators=\"\" \nopen=\"\u2225\"  close=\"\u2225\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mn>0<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfenced><\/mrow><mrow \n><mn>2<\/mn><\/mrow><\/msup \n><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 534--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 538--><\/p>\n<p class=\"noindent\" >\n<h4 class=\"likesubsectionHead\"><a \n id=\"x1-110005\"><\/a>Comments<\/h4>\n<p><!--l. 540--><\/p>\n<p class=\"noindent\" >This technique is probably the weirdest of those listed here. It has seen<br \/>\napplication in proving the convergence rate of MISO under some di\ufb00erent<br \/>\nstochastic orderings [<a \nhref=\"#Xadefazio-thesis2014\">1<\/a>]. While clearly a primal result, this proof has some<br \/>\ncomponents normally seen in the proof for a dual method. The gradient<br \/>\n<!--l. 544--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/math> is<br \/>\ne\ufb00ectively the dual iterate. Another interesting property is that the portion of the<br \/>\nproof concerning the gradient&#x2019;s convergence uses the strong convexity between<\/p>\n<p><!--l. 547--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/math> and<br \/>\n<!--l. 547--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/math>,<br \/>\nwhereas the other proofs considered all use the degree of strong convexity between<br \/>\n<!--l. 549--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/math> and<br \/>\n<!--l. 549--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msup><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mo \nclass=\"MathClass-bin\">\u2217<\/mo><\/mrow><\/msup \n><\/math>.\n<\/p>\n<p><!--l. 551--><\/p>\n<p class=\"indent\" >   This proof technique can&#x2019;t work when line searches are used, as bounding the<br \/>\ninner product:\n<\/p>\n<div class=\"math-display\"><!--l. 553--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                               <mi \n>\u03b1<\/mi> <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n>\n<mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mspace width=\"0.3em\" class=\"thinspace\"\/><mo \nclass=\"MathClass-punc\">,<\/mo><mspace width=\"0.3em\" class=\"thinspace\"\/><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced> <mo \nclass=\"MathClass-punc\">,<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 555--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 558--><\/p>\n<p class=\"indent\" >   would fail if <!--l. 558--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><mi \n>\u03b1<\/mi><\/math><br \/>\nchanged between steps, as it would become<br \/>\n<!--l. 559--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" > <mfenced separators=\"\" \nopen=\"\u27e8\"  close=\"\u27e9\" ><mrow><msub><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>\u03b1<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><msup><mrow \n><mi \n>f<\/mi><\/mrow><mrow \n><mi \n>\u2032<\/mi><\/mrow><\/msup \n><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><mspace width=\"0.3em\" class=\"thinspace\"\/><mo \nclass=\"MathClass-punc\">,<\/mo><mspace width=\"0.3em\" class=\"thinspace\"\/><msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><mo \nclass=\"MathClass-bin\">+<\/mo><mn>1<\/mn><\/mrow><\/msub \n> <mo \nclass=\"MathClass-bin\">\u2212<\/mo> <msub><mrow \n><mi \n>x<\/mi><\/mrow><mrow \n><mi \n>k<\/mi><\/mrow><\/msub \n><\/mrow><\/mfenced><\/math>,<br \/>\nwhich is a weird expression to work with.\n<\/p>\n<p><!--l. 1--><\/p>\n<p class=\"noindent\" >\n<h3 class=\"likesectionHead\"><a \n id=\"x1-120005\"><\/a>References<\/h3>\n<p><!--l. 1--><\/p>\n<p class=\"noindent\" >\n<div class=\"thebibliography\">\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xadefazio-thesis2014\"><\/a>[1] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Aaron Defazio. <span \nclass=\"ecti-1000\">New Optimization Methods for Machine Learning<\/span>. PhD<br \/>\n    thesis, Australian National University, 2014.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xadefazio-nips2014\"><\/a>[2] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Aaron  Defazio,  Francis  Bach,  and  Simon  Lacoste-Julien.   Saga:  A<br \/>\n    fast incremental gradient method with support for non-strongly convex<br \/>\n    composite objectives. <span \nclass=\"ecti-1000\">Advances in Neural Information Processing Systems<\/span><br \/>\n    <span \nclass=\"ecti-1000\">27 (NIPS 2014)<\/span>, 2014.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xmm\"><\/a>[3] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>David\u00a0R. Hunter and Kenneth Lange.  Quantile regression via an mm<br \/>\n    algorithm. <span \nclass=\"ecti-1000\">Journal of Computational and Graphical Statistics<\/span>, 9, 2000.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xsvrg\"><\/a>[4] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Rie Johnson and Tong Zhang. Accelerating stochastic gradient descent<br \/>\n    using predictive variance reduction. <span \nclass=\"ecti-1000\">NIPS<\/span>, 2013.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xreweighted-l1\"><\/a>[5] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Qiang  Liu  and  Alexander  Ihler.    Learning  scale  free  networks  by<br \/>\n    reweighted l1 regularization. <span \nclass=\"ecti-1000\">AISTATS<\/span>, 2011.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xmiso2\"><\/a>[6] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Julien  Mairal.   Incremental  majorization-minimization  optimization<br \/>\n    with application to large-scale machine learning. Technical report, INRIA<br \/>\n    Grenoble Rh\u00c3\u017dne-Alpes \/ LJK Laboratoire Jean Kuntzmann, 2014.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xnes-book\"><\/a>[7] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Yu. Nesterov. <span \nclass=\"ecti-1000\">Introductory Lectures On Convex Programming<\/span>. Springer,<br \/>\n    1998.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xnes-coord\"><\/a>[8] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Yu. Nesterov.  E\ufb03ciency of coordinate descent methods on huge-scale<br \/>\n    optimization problems. Technical report, CORE, 2010.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n <a \n id=\"Xnes-cubic\"><\/a>[9] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Yu. Nesterov and B.T. Polyak. Cubic regularization of newton method<br \/>\n    and its global performance.  <span \nclass=\"ecti-1000\">Mathematical Programming<\/span>, 108(1):177\u2013205,<br \/>\n    2006.\n    <\/p>\n<p class=\"bibitem\" ><span class=\"biblabel\"><br \/>\n<a \n id=\"Xpolyak-book\"><\/a>[10] <span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Boris Polyak.  <span \nclass=\"ecti-1000\">Introduction to Optimization<\/span>.  Optimization Software,<br \/>\n    Inc., Publications Division., 1987.\n<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Consider the classical gradient descent method: xk+1 = xk \u2212 \u03b1f\u2032(x k). It&#x2019;s a thing of beauty isn&#x2019;t it? While it&#x2019;s not used directly in practice any more, the proof techniques used in its analysis are the building blocks behind the theory of more advanced optimization methods. I know of 8 di\ufb00erent ways of proving &hellip; <a href=\"https:\/\/www.aarondefazio.com\/tangentially\/?p=32\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">The Many Ways to Analyse Gradient Descent<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/32"}],"collection":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=32"}],"version-history":[{"count":4,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/32\/revisions"}],"predecessor-version":[{"id":101,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/32\/revisions\/101"}],"wp:attachment":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=32"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=32"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=32"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}