{"id":76,"date":"2016-06-28T23:58:44","date_gmt":"2016-06-28T23:58:44","guid":{"rendered":"http:\/\/www.aarondefazio.com\/tangentially\/?p=76"},"modified":"2016-06-28T23:58:44","modified_gmt":"2016-06-28T23:58:44","slug":"a-curated-list-of-interesting-icml-2016-papers","status":"publish","type":"post","link":"https:\/\/www.aarondefazio.com\/tangentially\/?p=76","title":{"rendered":"A curated list of interesting ICML 2016 papers"},"content":{"rendered":"<p>I&#8217;ve gone through the hundreds of ICML 2016 papers and curated a subset that looks interesting to me.<br \/>\nIn no particular order:<\/p>\n<hr \/>\n<div class=\"paper\">\n<p class=\"title\">Faster Convex Optimization: Simulated Annealing with an Efficient Universal Barrier<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tJacob Abernethy,<\/p>\n<p>\t\t\tElad Hazan<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/abernethy16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/abernethy16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/abernethy16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Variance Reduction for Faster Non-Convex Optimization<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tZeyuan Allen-Zhu,<\/p>\n<p>\t\t\tElad Hazan<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/allen-zhua16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/allen-zhua16.pdf\">pdf<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Improved SVRG for Non-Strongly-Convex or Sum-of-Non-Convex Objectives<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tZeyuan Allen-Zhu,<\/p>\n<p>\t\t\tYang Yuan<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a 
href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/allen-zhub16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/allen-zhub16.pdf\">pdf<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tZeyuan Allen-Zhu,<\/p>\n<p>\t\t\tZheng Qu,<\/p>\n<p>\t\t\tPeter Richtarik,<\/p>\n<p>\t\t\tYang Yuan<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/allen-zhuc16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/allen-zhuc16.pdf\">pdf<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tDario Amodei,<\/p>\n<p>\t\t\tRishita Anubhai,<\/p>\n<p>\t\t\tEric Battenberg,<\/p>\n<p>\t\t\tCarl Case,<\/p>\n<p>\t\t\tJared Casper,<\/p>\n<p>\t\t\tBryan Catanzaro,<\/p>\n<p>\t\t\tJingDong Chen,<\/p>\n<p>\t\t\tMike Chrzanowski,<\/p>\n<p>\t\t\tAdam Coates,<\/p>\n<p>\t\t\tGreg Diamos,<\/p>\n<p>\t\t\tErich Elsen,<\/p>\n<p>\t\t\tJesse Engel,<\/p>\n<p>\t\t\tLinxi Fan,<\/p>\n<p>\t\t\tChristopher Fougner,<\/p>\n<p>\t\t\tAwni Hannun,<\/p>\n<p>\t\t\tBilly Jun,<\/p>\n<p>\t\t\tTony Han,<\/p>\n<p>\t\t\tPatrick LeGresley,<\/p>\n<p>\t\t\tXiangang Li,<\/p>\n<p>\t\t\tLibby Lin,<\/p>\n<p>\t\t\tSharan Narang,<\/p>\n<p>\t\t\tAndrew Ng,<\/p>\n<p>\t\t\tSherjil Ozair,<\/p>\n<p>\t\t\tRyan Prenger,<\/p>\n<p>\t\t\tSheng Qian,<\/p>\n<p>\t\t\tJonathan Raiman,<\/p>\n<p>\t\t\tSanjeev Satheesh,<\/p>\n<p>\t\t\tDavid Seetapun,<\/p>\n<p>\t\t\tShubho Sengupta,<\/p>\n<p>\t\t\tChong Wang,<\/p>\n<p>\t\t\tYi Wang,<\/p>\n<p>\t\t\tZhiqian Wang,<\/p>\n<p>\t\t\tBo Xiao,<\/p>\n<p>\t\t\tYan Xie,<\/p>\n<p>\t\t\tDani Yogatama,<\/p>\n<p>\t\t\tJun Zhan,<\/p>\n<p>\t\t\tZhenyao Zhu<br \/>\n\t<\/span>\n\t<\/p>\n<p 
class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/amodei16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/amodei16.pdf\">pdf<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">On the Iteration Complexity of Oblivious First-Order Optimization Algorithms<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tYossi Arjevani,<\/p>\n<p>\t\t\tOhad Shamir<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/arjevani16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/arjevani16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/arjevani16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Black-box Optimization with a Politician<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tSebastien Bubeck,<\/p>\n<p>\t\t\tYin Tat Lee<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/bubeck16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/bubeck16.pdf\">pdf<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Importance Sampling Tree for Large-scale Empirical Expectation<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tOlivier Canevet,<\/p>\n<p>\t\t\tCijo Jose,<\/p>\n<p>\t\t\tFrancois Fleuret<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/canevet16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/canevet16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/canevet16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy<\/p>\n<p 
class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tRan Gilad-Bachrach,<\/p>\n<p>\t\t\tNathan Dowlin,<\/p>\n<p>\t\t\tKim Laine,<\/p>\n<p>\t\t\tKristin Lauter,<\/p>\n<p>\t\t\tMichael Naehrig,<\/p>\n<p>\t\t\tJohn Wernsing<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/gilad-bachrach16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/gilad-bachrach16.pdf\">pdf<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Solving Ridge Regression using Sketched Preconditioned SVRG<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tAlon Gonen,<\/p>\n<p>\t\t\tFrancesco Orabona,<\/p>\n<p>\t\t\tShai Shalev-Shwartz<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/gonen16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/gonen16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/gonen16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Variance-Reduced and Projection-Free Stochastic Optimization<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tElad Hazan,<\/p>\n<p>\t\t\tHaipeng Luo<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/hazana16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/hazana16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/hazana16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">On Graduated Optimization for Stochastic Non-Convex Problems<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tElad Hazan,<\/p>\n<p>\t\t\tKfir Yehuda Levy,<\/p>\n<p>\t\t\tShai Shalev-Shwartz<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a 
href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/hazanb16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/hazanb16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/hazanb16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Doubly Robust Off-policy Value Evaluation for Reinforcement Learning<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tNan Jiang,<\/p>\n<p>\t\t\tLihong Li<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/jiang16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/jiang16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/jiang16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Stochastic Variance Reduced Optimization for Nonconvex Sparse Learning<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tXingguo Li,<\/p>\n<p>\t\t\tTuo Zhao,<\/p>\n<p>\t\t\tRaman Arora,<\/p>\n<p>\t\t\tHan Liu,<\/p>\n<p>\t\t\tJarvis Haupt<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/lid16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/lid16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/lid16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">A Variational Analysis of Stochastic Gradient Algorithms<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tStephan Mandt,<\/p>\n<p>\t\t\tMatthew Hoffman,<\/p>\n<p>\t\t\tDavid Blei<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/mandt16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/mandt16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a 
href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/mandt16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Stochastic Variance Reduction for Nonconvex Optimization<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tSashank J. Reddi,<\/p>\n<p>\t\t\tAhmed Hefny,<\/p>\n<p>\t\t\tSuvrit Sra,<\/p>\n<p>\t\t\tBarnabas Poczos,<\/p>\n<p>\t\t\tAlex Smola<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/reddi16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/reddi16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/reddi16-supp.zip\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">A Superlinearly-Convergent Proximal Newton-type Method for the Optimization of Finite Sums<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tAnton Rodomanov,<\/p>\n<p>\t\t\tDmitry Kropotov<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/rodomanov16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/rodomanov16.pdf\">pdf<\/a>]<\/p>\n<p>\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/rodomanov16-supp.pdf\">supplementary<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">SDCA without Duality, Regularization, and Individual Convexity<\/p>\n<p class=\"details\">\n\t<span class=\"authors\"><\/p>\n<p>\t\t\tShai Shalev-Shwartz<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/shalev-shwartza16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/shalev-shwartza16.pdf\">pdf<\/a>]<\/p>\n<\/div>\n<div class=\"paper\">\n<p class=\"title\">Training Neural Networks Without Gradients: A Scalable ADMM Approach<\/p>\n<p class=\"details\">\n\t<span 
class=\"authors\"><\/p>\n<p>\t\t\tGavin Taylor,<\/p>\n<p>\t\t\tRyan Burmeister,<\/p>\n<p>\t\t\tZheng Xu,<\/p>\n<p>\t\t\tBharat Singh,<\/p>\n<p>\t\t\tAnkit Patel,<\/p>\n<p>\t\t\tTom Goldstein<br \/>\n\t<\/span>\n\t<\/p>\n<p class=\"links\">\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/taylor16.html\">abs<\/a>]<br \/>\n\t\t[<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/taylor16.pdf\">pdf<\/a>]<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;ve gone through the hundreds of ICML 2016 papers and curated a subset that looks interesting to me. In no particular order: Faster Convex Optimization: Simulated Annealing with an Efficient Universal Barrier Jacob Abernethy, Elad Hazan [abs] [pdf] [supplementary] Variance Reduction for Faster Non-Convex Optimization Zeyuan Allen-Zhu, Elad Hazan [abs] [pdf] Improved SVRG for Non-Strongly-Convex &hellip; <a href=\"https:\/\/www.aarondefazio.com\/tangentially\/?p=76\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">A curated list of interesting ICML 2016 papers<\/span> <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/76"}],"collection":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=76"}],"version-history":[{"count":3,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/76\/revisions"}],"predecessor-version":[{"id":79,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/76\/revisions\/79"}],"wp:attachment":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=76"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=76"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=76"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}