Jump to content

Convex function

From Wikipedia, the free encyclopedia
(Redirected fromStrictly convex function)
Convex function on aninterval.
A function (in black) is convex if and only if the region above itsgraph(in green) is aconvex set.
A graph of thebivariateconvex functionx2+xy+y2.
Convex vs. Not convex

Inmathematics,areal-valued functionis calledconvexif theline segmentbetween any two distinct points on thegraph of the functionlies above the graph between the two points. Equivalently, a function is convex if itsepigraph(the set of points on or above the graph of the function) is aconvex set. In simple terms, a convex function graph is shaped like a cup(or a straight line like a linear function), while aconcave function's graph is shaped like a cap.

A twice-differentiablefunction of a single variable is convexif and only ifitssecond derivativeis nonnegative on its entiredomain.[1]Well-known examples of convex functions of a single variable include alinear function(whereis areal number), aquadratic function(as a nonnegative real number) and anexponential function(as a nonnegative real number).

Convex functions play an important role in many areas of mathematics. They are especially important in the study ofoptimizationproblems where they are distinguished by a number of convenient properties. For instance, a strictly convex function on anopen sethas no more than oneminimum.Even in infinite-dimensional spaces, under suitable additional hypotheses, convex functions continue to satisfy such properties and as a result, they are the most well-understood functionals in thecalculus of variations.Inprobability theory,a convex function applied to theexpected valueof arandom variableis always bounded above by the expected value of the convex function of the random variable. This result, known asJensen's inequality,can be used to deduceinequalitiessuch as thearithmetic–geometric mean inequalityandHölder's inequality.

Definition[edit]

Visualizing a convex function and Jensen's Inequality

Letbe aconvex subsetof a realvector spaceand letbe a function.

Thenis calledconvexif and only if any of the following equivalent conditions hold:

  1. For alland all:
    The right hand side represents the straight line betweenandin the graph ofas a function ofincreasingfromtoor decreasingfromtosweeps this line. Similarly, the argument of the functionin the left hand side represents the straight line betweenandinor the-axis of the graph ofSo, this condition requires that the straight line between any pair of points on the curve ofto be above or just meets the graph.[2]
  2. For alland allsuch that:
    The difference of this second condition with respect to the first condition above is that this condition does not include the intersection points (for example,and) between the straight line passing through a pair of points on the curve of(the straight line is represented by the right hand side of this condition) and the curve ofthe first condition includes the intersection points as it becomesoratororIn fact, the intersection points do not need to be considered in a condition of convex using
    becauseandare always true (so not useful to be a part of a condition).

The second statement characterizing convex functions that are valued in the real lineis also the statement used to defineconvex functionsthat are valued in theextended real number linewhere such a functionis allowed to takeas a value. The first statement is not used because it permitsto takeoras a value, in which case, iforrespectively, thenwould be undefined (because the multiplicationsandare undefined). The sumis also undefined so a convex extended real-valued function is typically only allowed to take exactly one ofandas a value.

The second statement can also be modified to get the definition ofstrict convexity,where the latter is obtained by replacingwith the strict inequality Explicitly, the mapis calledstrictly convexif and only if for all realand allsuch that:

A strictly convex functionis a function that the straight line between any pair of points on the curveis above the curveexcept for the intersection points between the straight line and the curve. An example of a function which is convex but not strictly convex is.This function is not strictly convex because any two points sharing an x coordinate will have a straight line between them, while any two points NOT sharing an x coordinate will have a greater value of the function than the points between them.

The functionis said to beconcave(resp.strictly concave) if(multiplied by −1) is convex (resp. strictly convex).

Alternative naming[edit]

The termconvexis often referred to asconvex downorconcave upward,and the termconcaveis often referred asconcave downorconvex upward.[3][4][5]If the term "convex" is used without an "up" or "down" keyword, then it refers strictly to a cup shaped graph.As an example,Jensen's inequalityrefers to an inequality involving a convex or convex-(down), function.[6]

Properties[edit]

Many properties of convex functions have the same simple formulation for functions of many variables as for functions of one variable. See below the properties for the case of many variables, as some of them are not listed for functions of one variable.

Functions of one variable[edit]

  • Supposeis a function of onerealvariable defined on an interval, and let
    (note thatis the slope of the purple line in the first drawing; the functionissymmetricinmeans thatdoes not change by exchangingand).is convex if and only ifismonotonically non-decreasinginfor every fixed(or vice versa). This characterization of convexity is quite useful to prove the following results.
  • A convex functionof one real variable defined on someopen intervaliscontinuousonadmitsleft and right derivatives,and these aremonotonically non-decreasing.In addition, the left derivative is left-continuous and the right-derivative is right-continuous. As a consequence,isdifferentiableat all but at mostcountably manypoints, the set on whichis not differentiable can however still be dense. Ifis closed, thenmay fail to be continuous at the endpoints of(an example is shown in theexamples section).
  • Adifferentiablefunction of one variable is convex on an interval if and only if itsderivativeismonotonically non-decreasingon that interval. If a function is differentiable and convex then it is alsocontinuously differentiable.
  • A differentiable function of one variable is convex on an interval if and only if its graph lies above all of itstangents:[7]: 69 
    for allandin the interval.
  • A twice differentiable function of one variable is convex on an interval if and only if itssecond derivativeis non-negative there; this gives a practical test for convexity. Visually, a twice differentiable convex function "curves up", without any bends the other way (inflection points). If its second derivative is positive at all points then the function is strictly convex, but theconversedoes not hold. For example, the second derivative ofis,which is zero forbutis strictly convex.
    • This property and the above property in terms of "...its derivative is monotonically non-decreasing..." are not equal since ifis non-negative on an intervalthenis monotonically non-decreasing onwhile its converse is not true, for example,is monotonically non-decreasing onwhile its derivativeis not defined at some points on.
  • Ifis a convex function of one real variable, and,thenissuperadditiveon thepositive reals,that isfor positive real numbersand.
Proof

Sinceis convex, by using one of the convex function definitions above and lettingit follows that for all real

From,it follows that
Namely,.
  • A functionis midpoint convex on an intervalif for all
    This condition is only slightly weaker than convexity. For example, a real-valuedLebesgue measurable functionthat is midpoint-convex is convex: this is a theorem ofSierpiński.[8]In particular, a continuous function that is midpoint convex will be convex.

Functions of several variables[edit]

  • A function that is marginally convex in each individual variable is not necessarily (jointly) convex. For example, the functionismarginally linear,and thus marginally convex, in each variable, but not (jointly) convex.
  • A functionvalued in theextended real numbersis convex if and only if itsepigraph
    is a convex set.
  • A differentiable functiondefined on a convex domain is convex if and only ifholds for allin the domain.
  • A twice differentiable function of several variables is convex on a convex set if and only if itsHessian matrixof secondpartial derivativesispositive semidefiniteon the interior of the convex set.
  • For a convex functionthesublevel setsandwithare convex sets. A function that satisfies this property is called aquasiconvex functionand may fail to be a convex function.
  • Consequently, the set ofglobal minimisersof a convex functionis a convex set:- convex.
  • Anylocal minimumof a convex function is also aglobal minimum.Astrictlyconvex function will have at most one global minimum.[9]
  • Jensen's inequalityapplies to every convex function.Ifis a random variable taking values in the domain ofthenwheredenotes themathematical expectation.Indeed, convex functions are exactly those that satisfies the hypothesis ofJensen's inequality.
  • A first-orderhomogeneous functionof two positive variablesand(that is, a function satisfyingfor all positive real) that is convex in one variable must be convex in the other variable.[10]

Operations that preserve convexity[edit]

  • is concave if and only ifis convex.
  • Ifis any real number thenis convex if and only ifis convex.
  • Nonnegative weighted sums:
    • ifandare all convex, then so isIn particular, the sum of two convex functions is convex.
    • this property extends to infinite sums, integrals and expected values as well (provided that they exist).
  • Elementwise maximum: letbe a collection of convex functions. Thenis convex. The domain ofis the collection of points where the expression is finite. Important special cases:
    • Ifare convex functions then so is
    • Danskin's theorem:Ifis convex inthenis convex ineven ifis not a convex set.
  • Composition:
    • Ifandare convex functions andis non-decreasing over a univariate domain, thenis convex. For example, ifis convex, then so isbecauseis convex and monotonically increasing.
    • Ifis concave andis convex and non-increasing over a univariate domain, thenis convex.
    • Convexity is invariant under affine maps: that is, ifis convex with domain,then so is,wherewith domain
  • Minimization: Ifis convex inthenis convex inprovided thatis a convex set and that
  • Ifis convex, then its perspectivewith domainis convex.
  • Letbe a vector space.is convex and satisfiesif and only iffor anyand any non-negative real numbersthat satisfy

Strongly convex functions[edit]

The concept of strong convexity extends and parametrizes the notion of strict convexity. Intuitively, a strongly-convex function is a function that grows as fast as a quadratic function.[11]A strongly convex function is also strictly convex, but not vice versa. If a one-dimensional functionis twice continuously differentiable and the domain is the real line, then we can characterize it as follows:

  • convex if and only iffor all
  • strictly convex iffor all(note: this is sufficient, but not necessary).
  • strongly convex if and only iffor all

For example, letbe strictly convex, and suppose there is a sequence of pointssuch that.Even though,the function is not strongly convex becausewill become arbitrarily small.

More generally, a differentiable functionis called strongly convex with parameterif the following inequality holds for all pointsin its domain:[12]

or, more generally,
whereis anyinner product,andis the correspondingnorm.Some authors, such as[13]refer to functions satisfying this inequality asellipticfunctions.

An equivalent condition is the following:[14]

It is not necessary for a function to be differentiable in order to be strongly convex. A third definition[14]for a strongly convex function, with parameteris that, for allin the domain and

Notice that this definition approaches the definition for strict convexity asand is identical to the definition of a convex function whenDespite this, functions exist that are strictly convex but are not strongly convex for any(see example below).

If the functionis twice continuously differentiable, then it is strongly convex with parameterif and only iffor allin the domain, whereis the identity andis theHessian matrix,and the inequalitymeans thatispositive semi-definite.This is equivalent to requiring that the minimumeigenvalueofbe at leastfor allIf the domain is just the real line, thenis just the second derivativeso the condition becomes.Ifthen this means the Hessian is positive semidefinite (or if the domain is the real line, it means that), which implies the function is convex, and perhaps strictly convex, but not strongly convex.

Assuming still that the function is twice continuously differentiable, one can show that the lower bound ofimplies that it is strongly convex. UsingTaylor's Theoremthere exists

such that
Then
by the assumption about the eigenvalues, and hence we recover the second strong convexity equation above.

A functionis strongly convex with parametermif and only if the function

is convex.

A twice continuously differentiable functionon a compact domainthat satisfiesfor allis strongly convex. The proof of this statement follows from theextreme value theorem,which states that a continuous function on a compact set has a maximum and minimum.

Strongly convex functions are in general easier to work with than convex or strictly convex functions, since they are a smaller class. Like strictly convex functions, strongly convex functions have unique minima on compact sets.

Properties of strongly-convex functions[edit]

Iffis a strongly-convex function with parameterm,then:[15]: Prop.6.1.4 

Uniformly convex functions[edit]

A uniformly convex function,[16][17]with modulus,is a functionthat, for allin the domain andsatisfies

whereis a function that is non-negative and vanishes only at 0. This is a generalization of the concept of strongly convex function; by takingwe recover the definition of strong convexity.

It is worth noting that some authors require the modulusto be an increasing function,[17]but this condition is not required by all authors.[16]

Examples[edit]

Functions of one variable[edit]

  • The functionhas,sofis a convex function. It is also strongly convex (and hence strictly convex too), with strong convexity constant 2.
  • The functionhas,sofis a convex function. It is strictly convex, even though the second derivative is not strictly positive at all points. It is not strongly convex.
  • Theabsolute valuefunctionis convex (as reflected in thetriangle inequality), even though it does not have a derivative at the pointIt is not strictly convex.
  • The functionforis convex.
  • Theexponential functionis convex. It is also strictly convex, since,but it is not strongly convex since the second derivative can be arbitrarily close to zero. More generally, the functionislogarithmically convexifis a convex function. The term "superconvex" is sometimes used instead.[18]
  • The functionwith domain [0,1] defined byforis convex; it is continuous on the open intervalbut not continuous at 0 and 1.
  • The functionhas second derivative;thus it is convex on the set whereandconcaveon the set where
  • Examples of functions that aremonotonically increasingbut not convex includeand.
  • Examples of functions that are convex but notmonotonically increasingincludeand.
  • The functionhaswhich is greater than 0 ifsois convex on the interval.It is concave on the interval.
  • The functionwith,is convex on the intervaland convex on the interval,but not convex on the interval,because of the singularity at

Functions ofnvariables[edit]

  • LogSumExpfunction, also called softmax function, is a convex function.
  • The functionon the domain ofpositive-definite matricesis convex.[7]: 74 
  • Every real-valuedlinear transformationis convex but not strictly convex, since ifis linear, then.This statement also holds if we replace "convex" by "concave".
  • Every real-valuedaffine function,that is, each function of the formis simultaneously convex and concave.
  • Everynormis a convex function, by thetriangle inequalityandpositive homogeneity.
  • Thespectral radiusof anonnegative matrixis a convex function of its diagonal elements.[19]

See also[edit]

Notes[edit]

  1. ^"Lecture Notes 2"(PDF).www.stat.cmu.edu.Retrieved3 March2017.
  2. ^"Concave Upward and Downward".Archivedfrom the original on 2013-12-18.
  3. ^Stewart, James (2015).Calculus(8th ed.). Cengage Learning. pp. 223–224.ISBN978-1305266643.
  4. ^W. Hamming, Richard (2012).Methods of Mathematics Applied to Calculus, Probability, and Statistics(illustrated ed.). Courier Corporation. p. 227.ISBN978-0-486-13887-9.Extract of page 227
  5. ^Uvarov, Vasiliĭ Borisovich (1988).Mathematical Analysis.Mir Publishers. p. 126-127.ISBN978-5-03-000500-3.
  6. ^Prügel-Bennett, Adam (2020).The Probability Companion for Engineering and Computer Science(illustrated ed.). Cambridge University Press. p. 160.ISBN978-1-108-48053-6.Extract of page 160
  7. ^abBoyd, Stephen P.; Vandenberghe, Lieven (2004).Convex Optimization(pdf).Cambridge University Press.ISBN978-0-521-83378-3.RetrievedOctober 15,2011.
  8. ^Donoghue, William F. (1969).Distributions and Fourier Transforms.Academic Press. p. 12.ISBN9780122206504.RetrievedAugust 29,2012.
  9. ^"If f is strictly convex in a convex set, show it has no more than 1 minimum".Math StackExchange. 21 Mar 2013.Retrieved14 May2016.
  10. ^Altenberg, L., 2012. Resolvent positive linear operators exhibit the reduction phenomenon. Proceedings of the National Academy of Sciences, 109(10), pp.3705-3710.
  11. ^"Strong convexity · Xingyu Zhou's blog".xingyuzhou.org.Retrieved2023-09-27.
  12. ^Dimitri Bertsekas (2003).Convex Analysis and Optimization.Contributors: Angelia Nedic and Asuman E. Ozdaglar. Athena Scientific. p.72.ISBN9781886529458.
  13. ^Philippe G. Ciarlet (1989).Introduction to numerical linear algebra and optimisation.Cambridge University Press.ISBN9780521339841.
  14. ^abYurii Nesterov (2004).Introductory Lectures on Convex Optimization: A Basic Course.Kluwer Academic Publishers. pp.63–64.ISBN9781402075537.
  15. ^Nemirovsky and Ben-Tal (2023)."Optimization III: Convex Optimization"(PDF).
  16. ^abC. Zalinescu (2002).Convex Analysis in General Vector Spaces.World Scientific.ISBN9812380671.
  17. ^abH. Bauschke and P. L. Combettes (2011).Convex Analysis and Monotone Operator Theory in Hilbert Spaces.Springer. p.144.ISBN978-1-4419-9467-7.
  18. ^Kingman, J. F. C. (1961). "A Convexity Property of Positive Matrices".The Quarterly Journal of Mathematics.12:283–284.doi:10.1093/qmath/12.1.283.
  19. ^Cohen, J.E., 1981.Convexity of the dominant eigenvalue of an essentially nonnegative matrix.Proceedings of the American Mathematical Society, 81(4), pp.657-658.

References[edit]

  • Bertsekas, Dimitri(2003).Convex Analysis and Optimization.Athena Scientific.
  • Borwein, Jonathan,and Lewis, Adrian. (2000). Convex Analysis and Nonlinear Optimization. Springer.
  • Donoghue, William F. (1969).Distributions and Fourier Transforms.Academic Press.
  • Hiriart-Urruty, Jean-Baptiste, andLemaréchal, Claude.(2004). Fundamentals of Convex analysis. Berlin: Springer.
  • Krasnosel'skii M.A.,Rutickii Ya.B. (1961).Convex Functions and Orlicz Spaces.Groningen: P.Noordhoff Ltd.
  • Lauritzen, Niels (2013).Undergraduate Convexity.World Scientific Publishing.
  • Luenberger, David(1984).Linear and Nonlinear Programming.Addison-Wesley.
  • Luenberger, David(1969).Optimization by Vector Space Methods.Wiley & Sons.
  • Rockafellar, R. T.(1970).Convex analysis.Princeton: Princeton University Press.
  • Thomson, Brian (1994).Symmetric Properties of Real Functions.CRC Press.
  • Zălinescu, C. (2002).Convex analysis in general vector spaces.River Edge, NJ: World Scientific Publishing Co., Inc. pp. xx+367.ISBN981-238-067-1.MR1921556.

External links[edit]