
Total Variation Denoising via the Moreau Envelope
Ivan Selesnick
IEEE Signal Processing Letters (preprint)

The author is with the Tandon School of Engineering, New York University, New York, USA. Email: [email protected]. This work was supported by NSF under grant 1525398 and ONR under grant N00014-15-1-2314.

Abstract—Total variation denoising is a nonlinear filtering method well suited for the estimation of piecewise-constant signals observed in additive white Gaussian noise. The method is defined by the minimization of a particular non-differentiable convex cost function. This paper describes a generalization of this cost function that can yield more accurate estimation of piecewise-constant signals. The new cost function involves a non-convex penalty (regularizer) designed to maintain the convexity of the cost function. The new penalty is based on the Moreau envelope. The proposed total variation denoising method can be implemented using forward-backward splitting.

I. INTRODUCTION

Total variation (TV) denoising is a nonlinear filtering method based on the assumption that the underlying signal is piecewise constant (equivalently, the derivative of the underlying signal is sparse) [45]. Such signals arise in geoscience, biophysics, and other areas [31]. The TV denoising technique is also used in conjunction with other methods in order to process more general types of signals [20], [23], [24], [26].

Total variation denoising is prototypical of methods based on sparse signal models. It is defined by the minimization of a convex cost function comprising a quadratic data fidelity term and a non-differentiable convex penalty term. The penalty term is the composition of a linear operator and the $\ell_1$ norm. Although the $\ell_1$ norm stands out as the convex penalty that most effectively induces sparsity [27], non-convex penalties can lead to more accurate estimation of the underlying signal [37], [38], [40], [44], [50].

A few recent papers consider the prescription of non-convex penalties that maintain the convexity of the TV denoising cost function [1], [30], [32], [48]. (The motivation for this is to leverage the benefits of both non-convex penalization and convex optimization, e.g., to accurately estimate the amplitudes of jump discontinuities while guaranteeing the uniqueness of the solution.) The penalties considered in these works are separable (additive). But non-separable penalties can outperform separable penalties in this context. This is because preserving the convexity of the cost function is a severely limiting requirement. Non-separable penalties can more successfully meet this requirement because they are more general than separable penalties [47].

This paper proposes a non-separable non-convex penalty for total variation denoising that generalizes the standard penalty and maintains the convexity of the cost function to be minimized.¹ The new penalty, which is based on the Moreau envelope, can more accurately estimate the amplitudes of jump discontinuities in an underlying piecewise-constant signal.

¹Software is available at http://eeweb.poly.edu/iselesni/mtvd

A. Relation to Prior Work

Numerous non-convex penalties and algorithms have been proposed to outperform $\ell_1$-norm regularization for the estimation of sparse signals, e.g., [8], [10], [12], [13], [15], [33], [34], [39], [43], [51], [53]. However, few of these methods maintain the convexity of the cost function. The prescription of non-convex penalties maintaining cost function convexity was pioneered by Blake, Zisserman, and Nikolova [6], [36], [39], [40], and further developed in Refs. [3], [4], [14], [20], [28], [30], [32], [42], [46], [48]. These works rely on the presence of both strongly and weakly convex terms, which is also exploited in [35].

The proposed penalty is expressed as a differentiable convex function subtracted from the standard penalty (i.e., the $\ell_1$ norm). Previous works also use this idea [41], [42], [47]. But the differentiable convex functions used therein are either separable [41], [42] or sums of bivariate functions [47].

In parallel with the submission of this paper, Carlsson has also proposed using Moreau envelopes to prescribe non-trivial convex cost functions [9]. While the approach in [9] starts with a given non-convex cost function (e.g., with the $\ell_0$ pseudo-norm penalty) and seeks the convex envelope, our approach starts with the $\ell_1$-norm penalty and seeks a class of convexity-preserving penalties.

Some forms of generalized TV are based on infimal convolution (related to the Moreau envelope) [5], [7], [11], [49]. But these works propose convex penalties suitable for non-piecewise-constant signals, whereas we propose non-convex penalties suitable for piecewise-constant signals.
II. TOTAL VARIATION DENOISING

Definition 1. Given $y \in \mathbb{R}^N$ and $\lambda > 0$, total variation denoising is defined as

$$ \mathrm{tvd}(y; \lambda) = \arg\min_{x \in \mathbb{R}^N} \Big\{ \tfrac{1}{2} \| y - x \|_2^2 + \lambda \| D x \|_1 \Big\} \qquad (1) $$
$$ \hphantom{\mathrm{tvd}(y; \lambda)} = \mathrm{prox}_{\lambda \| D \cdot \|_1}(y) \qquad (2) $$

where $D$ is the $(N-1) \times N$ matrix

$$ D = \begin{bmatrix} -1 & 1 & & & \\ & -1 & 1 & & \\ & & \ddots & \ddots & \\ & & & -1 & 1 \end{bmatrix}. \qquad (3) $$

As indicated in (2), TV denoising is the proximity operator [16] of the function $x \mapsto \lambda \| D x \|_1$. It is convenient that TV denoising can be calculated exactly in finite time [17], [18], [22], [29].

III. MOREAU ENVELOPE

Before we define the non-differentiable non-convex penalty in Sec. IV, we first define a differentiable convex function. We use the Moreau envelope from convex analysis [2].

Definition 2. Let $\alpha \ge 0$. We define $S_\alpha : \mathbb{R}^N \to \mathbb{R}$ as

$$ S_\alpha(x) = \min_{v \in \mathbb{R}^N} \Big\{ \| D v \|_1 + \tfrac{\alpha}{2} \| x - v \|_2^2 \Big\} \qquad (4) $$

where $D$ is the first-order difference matrix (3).

If $\alpha > 0$, then $S_\alpha$ is the Moreau envelope of index $\alpha^{-1}$ of the function $x \mapsto \| D x \|_1$.

Proposition 1. The function $S_\alpha$ can be calculated by

$$ S_0(x) = 0 \qquad (5) $$
$$ S_\alpha(x) = \| D \, \mathrm{tvd}(x; 1/\alpha) \|_1 + \tfrac{\alpha}{2} \| x - \mathrm{tvd}(x; 1/\alpha) \|_2^2, \quad \alpha > 0. \qquad (6) $$

Proof: For $\alpha = 0$: setting $\alpha = 0$ and $v = 0$ in (4) gives (5). For $\alpha > 0$: by the definition of TV denoising, the $v \in \mathbb{R}^N$ minimizing the function in (4) is the TV denoising of $x$, i.e., $v_{\mathrm{opt}} = \mathrm{tvd}(x; 1/\alpha)$.

Proposition 2. Let $\alpha \ge 0$. The function $S_\alpha$ satisfies

$$ 0 \le S_\alpha(x) \le \| D x \|_1, \quad \forall x \in \mathbb{R}^N. \qquad (7) $$

Proof: From (4), we have $S_\alpha(x) \le \| D v \|_1 + (\alpha/2) \| x - v \|_2^2$ for all $v \in \mathbb{R}^N$. In particular, $v = x$ leads to $S_\alpha(x) \le \| D x \|_1$. Also, $S_\alpha(x) \ge 0$ since $S_\alpha(x)$ is defined as the minimum of a non-negative function.

Proposition 3. Let $\alpha \ge 0$. The function $S_\alpha$ is convex and differentiable.

Proof: It follows from Proposition 12.15 in Ref. [2].

Proposition 4. Let $\alpha \ge 0$. The gradient of $S_\alpha$ is given by

$$ \nabla S_0(x) = 0 \qquad (8) $$
$$ \nabla S_\alpha(x) = \alpha \big( x - \mathrm{tvd}(x; 1/\alpha) \big), \quad \alpha > 0 \qquad (9) $$

where tvd denotes total variation denoising (1).

Proof: Since $S_\alpha$ is the Moreau envelope of index $\alpha^{-1}$ of the function $x \mapsto \| D x \|_1$ when $\alpha > 0$, it follows by Proposition 12.29 in Ref. [2] that

$$ \nabla S_\alpha(x) = \alpha \big( x - \mathrm{prox}_{(1/\alpha) \| D \cdot \|_1}(x) \big). \qquad (10) $$

This proximity operator is TV denoising, giving (9).

IV. NON-CONVEX PENALTY

To strongly induce sparsity of $Dx$, we define a non-convex generalization of the standard TV penalty. The new penalty is defined by subtracting a differentiable convex function from the standard penalty.

Definition 3. Let $\alpha \ge 0$. We define the penalty $\psi_\alpha : \mathbb{R}^N \to \mathbb{R}$ as

$$ \psi_\alpha(x) = \| D x \|_1 - S_\alpha(x) \qquad (11) $$

where $D$ is the matrix (3) and $S_\alpha$ is defined by (4).

The proposed penalty is upper bounded by the standard TV penalty, which is recovered as a special case.

Proposition 5. Let $\alpha \ge 0$. The penalty $\psi_\alpha$ satisfies

$$ \psi_0(x) = \| D x \|_1, \quad \forall x \in \mathbb{R}^N \qquad (12) $$

and

$$ 0 \le \psi_\alpha(x) \le \| D x \|_1, \quad \forall x \in \mathbb{R}^N. \qquad (13) $$

Proof: It follows from (5) and (7).

When a convex function is subtracted from another convex function [as in (11)], the resulting function may well be negative on part of its domain. Inequality (13) states that the proposed penalty $\psi_\alpha$ avoids this fate. This is relevant because the penalty function should be non-negative.

Figures in the supplemental material show examples of the proposed penalty $\psi_\alpha$ and the function $S_\alpha$.
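Since (6) expresses $S_\alpha$ entirely in terms of TV denoising, the penalty (11) is easy to evaluate once a TV solver is available. The following is a minimal Python sketch of this computation; the solver `tvd` below uses projected gradient on the dual problem rather than the exact finite-time solvers cited above (the paper uses Condat's C program [17]), and all function names are our own.

```python
import numpy as np

def Dt(u):
    # D^T u for the (N-1) x N first-order difference matrix D in (3)
    return np.concatenate(([-u[0]], -np.diff(u), [u[-1]]))

def tvd(y, lam, n_iter=1000):
    # TV denoising (1) via projected gradient on the dual problem:
    # min_u 0.5*||y - D^T u||^2 subject to |u_n| <= lam, then x = y - D^T u.
    # (An iterative sketch; exact direct solvers exist [17], [22].)
    u = np.zeros(len(y) - 1)
    for _ in range(n_iter):
        u = np.clip(u + 0.25 * np.diff(y - Dt(u)), -lam, lam)  # step 1/4 <= 1/||D D^T||
    return y - Dt(u)

def S(x, alpha):
    # Moreau envelope S_alpha(x), evaluated via Proposition 1, eqns (5)-(6)
    if alpha == 0:
        return 0.0
    v = tvd(x, 1.0 / alpha)
    return np.sum(np.abs(np.diff(v))) + 0.5 * alpha * np.sum((x - v) ** 2)

def psi(x, alpha):
    # Proposed penalty (11): standard TV penalty minus its Moreau envelope
    return np.sum(np.abs(np.diff(x))) - S(x, alpha)
```

For random test vectors, `psi(x, alpha)` should lie in the interval $[0, \|Dx\|_1]$, as asserted by (13), up to the tolerance of the iterative solver.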
V. ENHANCED TV DENOISING

We define 'Moreau-enhanced' TV denoising. If $\alpha > 0$, then the proposed penalty penalizes large amplitude values of $Dx$ less than the $\ell_1$ norm does (i.e., $\psi_\alpha(x) \le \| D x \|_1$), hence it is less likely to underestimate jump discontinuities.

Definition 4. Given $y \in \mathbb{R}^N$, $\lambda > 0$, and $\alpha \ge 0$, we define Moreau-enhanced total variation denoising as

$$ \mathrm{mtvd}(y; \lambda, \alpha) = \arg\min_{x \in \mathbb{R}^N} \Big\{ \tfrac{1}{2} \| y - x \|_2^2 + \lambda \psi_\alpha(x) \Big\} \qquad (14) $$

where $\psi_\alpha$ is given by (11).

The parameter $\alpha$ controls the non-convexity of the penalty. If $\alpha = 0$, then the penalty is convex and Moreau-enhanced TV denoising reduces to TV denoising. Greater values of $\alpha$ make the penalty more non-convex. What is the greatest value of $\alpha$ that maintains convexity of the cost function? The critical value is given by Theorem 1.

Theorem 1. Let $\lambda > 0$ and $\alpha \ge 0$. Define $F_\alpha : \mathbb{R}^N \to \mathbb{R}$ as

$$ F_\alpha(x) = \tfrac{1}{2} \| y - x \|_2^2 + \lambda \psi_\alpha(x) \qquad (15) $$

where $\psi_\alpha$ is given by (11). If

$$ 0 \le \alpha \le 1/\lambda \qquad (16) $$

then $F_\alpha$ is convex. If $0 \le \alpha < 1/\lambda$, then $F_\alpha$ is strongly convex.

Proof: We write the cost function as

$$ F_\alpha(x) = \tfrac{1}{2} \| y - x \|_2^2 + \lambda \| D x \|_1 - \lambda S_\alpha(x) \qquad (17) $$
$$ = \tfrac{1}{2} \| y - x \|_2^2 + \lambda \| D x \|_1 - \lambda \min_{v \in \mathbb{R}^N} \Big\{ \| D v \|_1 + \tfrac{\alpha}{2} \| x - v \|_2^2 \Big\} \qquad (18) $$
$$ = \max_{v \in \mathbb{R}^N} \Big\{ \tfrac{1}{2} \| y - x \|_2^2 + \lambda \| D x \|_1 - \lambda \| D v \|_1 - \tfrac{\lambda\alpha}{2} \| x - v \|_2^2 \Big\} \qquad (19) $$
$$ = \max_{v \in \mathbb{R}^N} \Big\{ \tfrac{1}{2} (1 - \lambda\alpha) \| x \|_2^2 + \lambda \| D x \|_1 + g(x, v) \Big\} \qquad (20) $$
$$ = \tfrac{1}{2} (1 - \lambda\alpha) \| x \|_2^2 + \lambda \| D x \|_1 + \max_{v \in \mathbb{R}^N} g(x, v) \qquad (21) $$

where $g(x, v)$ is affine in $x$. The last term is convex as it is the point-wise maximum of a set of convex functions. Hence, $F_\alpha$ is a convex function if $1 - \lambda\alpha \ge 0$. If $1 - \lambda\alpha > 0$, then $F_\alpha$ is strongly convex (and strictly convex).
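The critical value in Theorem 1 can also be observed numerically. The probe below is our own construction (not from the paper): it reuses `S` from the earlier sketch and evaluates, with $y = 0$ and $N = 2$, the differentiable part $f_1(x) = \tfrac{1}{2}\|x\|_2^2 - \lambda S_\alpha(x)$ of (15) along the slice $x = (t, -t)$, mirroring supplemental Figs. 4 and 5. Second differences should stay non-negative for $\alpha \le 1/\lambda$ and turn negative for $\alpha > 1/\lambda$.

```python
# Numerical probe of Theorem 1 (assumes S() and tvd() from the sketch in Sec. IV).
import numpy as np

lam, h = 1.0, 0.1
ts = np.arange(-2.0, 2.0 + h, h)
for alpha in (1.0, 2.0):  # alpha = 1/lam (critical value) and alpha = 2/lam (too large)
    f1 = np.array([t**2 - lam * S(np.array([t, -t]), alpha) for t in ts])
    d2 = f1[:-2] - 2 * f1[1:-1] + f1[2:]  # second differences; convexity => d2 >= 0
    # expect ~0 (up to solver tolerance) for alpha = 1, clearly negative for alpha = 2
    print(f"alpha = {alpha}: min second difference = {d2.min():.2e}")
```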
VI. ALGORITHM

Proposition 6. Let $y \in \mathbb{R}^N$, $\lambda > 0$, and $0 < \alpha < 1/\lambda$. Then $x^{(k)}$ produced by the iteration

$$ z^{(k)} = y + \lambda\alpha \big( x^{(k)} - \mathrm{tvd}(x^{(k)}; 1/\alpha) \big) \qquad (22a) $$
$$ x^{(k+1)} = \mathrm{tvd}(z^{(k)}; \lambda) \qquad (22b) $$

converges to the solution of the Moreau-enhanced TV denoising problem (14).

Proof: If the cost function (15) is strongly convex, then the minimizer can be calculated using the forward-backward splitting (FBS) algorithm [2], [16]. This algorithm minimizes a function of the form

$$ F(x) = f_1(x) + f_2(x) \qquad (23) $$

where both $f_1$ and $f_2$ are convex and $\nabla f_1$ is additionally Lipschitz continuous. The FBS algorithm is given by

$$ z^{(k)} = x^{(k)} - \mu \nabla f_1(x^{(k)}) \qquad (24a) $$
$$ x^{(k+1)} = \arg\min_x \Big\{ \tfrac{1}{2} \| z^{(k)} - x \|_2^2 + \mu f_2(x) \Big\} \qquad (24b) $$

where $0 < \mu < 2/\rho$ and $\rho$ is the Lipschitz constant of $\nabla f_1$. The iterates $x^{(k)}$ converge to a minimizer of $F$.

To apply the FBS algorithm to the proposed cost function (15), we write it as

$$ F_\alpha(x) = \tfrac{1}{2} \| y - x \|_2^2 + \lambda \psi_\alpha(x) \qquad (25) $$
$$ = \tfrac{1}{2} \| y - x \|_2^2 + \lambda \| D x \|_1 - \lambda S_\alpha(x) \qquad (26) $$
$$ = f_1(x) + f_2(x) \qquad (27) $$

where

$$ f_1(x) = \tfrac{1}{2} \| y - x \|_2^2 - \lambda S_\alpha(x) \qquad (28a) $$
$$ f_2(x) = \lambda \| D x \|_1. \qquad (28b) $$

The gradient of $f_1$ is given by

$$ \nabla f_1(x) = x - y - \lambda \nabla S_\alpha(x) \qquad (29) $$
$$ = x - y - \lambda\alpha \big( x - \mathrm{tvd}(x; 1/\alpha) \big) \qquad (30) $$

using Proposition 4. Subtracting $S_\alpha$ from $f_1$ does not increase the Lipschitz constant of $\nabla f_1$, the value of which is 1. Hence, we may set $0 < \mu < 2$. Using (28), the FBS algorithm (24) becomes

$$ z^{(k)} = x^{(k)} - \mu \big[ x^{(k)} - y - \lambda\alpha \big( x^{(k)} - \mathrm{tvd}(x^{(k)}; 1/\alpha) \big) \big] \qquad (31a) $$
$$ x^{(k+1)} = \arg\min_x \Big\{ \tfrac{1}{2} \| z^{(k)} - x \|_2^2 + \mu\lambda \| D x \|_1 \Big\}. \qquad (31b) $$

Note that (31b) is TV denoising (1). Using the value $\mu = 1$ gives iteration (22). (Experimentally, we found this value yields fast convergence.)

Each iteration of (22) entails solving two standard TV denoising problems. In this work, we calculate TV denoising using the fast exact C language program by Condat [17]. Like the iterative shrinkage/thresholding algorithm (ISTA) [19], [25], algorithm (22) can be accelerated in various ways. We suggest not setting $\alpha$ too close to the critical value $1/\lambda$, because the FBS algorithm generally converges faster when the cost function is more strongly convex ($\lambda\alpha < 1$).

In summary, the proposed Moreau-enhanced TV denoising method comprises the following steps:
1) Set the regularization parameter $\lambda$ ($\lambda > 0$).
2) Set the non-convexity parameter $\alpha$ ($0 \le \alpha < 1/\lambda$).
3) Initialize $x^{(0)} = 0$.
4) Run iteration (22) until convergence.

VII. OPTIMALITY CONDITION

To avoid terminating the iterative algorithm too early, it is useful to verify convergence using an optimality condition.

Proposition 7. Let $y \in \mathbb{R}^N$, $\lambda > 0$, and $0 < \alpha < 1/\lambda$. If $x$ is a solution to (14), then

$$ \Big[ C \big( (x - y)/\lambda + \alpha ( \mathrm{tvd}(x; 1/\alpha) - x ) \big) \Big]_n \in \mathrm{sign}([Dx]_n) \qquad (32) $$

for $n = 0, \ldots, N-2$, where $C \in \mathbb{R}^{(N-1) \times N}$ is given by

$$ C_{m,n} = \begin{cases} 1, & m \ge n \\ 0, & m < n \end{cases} \qquad \text{i.e.,} \qquad [Cx]_n = \sum_{m \le n} x_m \qquad (33) $$

and sign is the set-valued signum function

$$ \mathrm{sign}(t) = \begin{cases} \{-1\}, & t < 0 \\ [-1, 1], & t = 0 \\ \{1\}, & t > 0. \end{cases} \qquad (34) $$

According to (32), if $x \in \mathbb{R}^N$ is a minimizer, then the points $([Dx]_n, u_n) \in \mathbb{R}^2$ must lie on the graph of the signum function, where $u_n$ denotes the value on the left-hand side of (32). Hence, the optimality condition can be depicted as a scatter plot. Figures in the supplemental material show how the points in the scatter plot converge to the signum function as the algorithm (22) progresses.

Proof of Proposition 7: A vector $x$ minimizes a convex function $F$ if $0 \in \partial F(x)$, where $\partial F(x)$ is the subdifferential of $F$ at $x$. The subdifferential of the cost function (15) is given by

$$ \partial F_\alpha(x) = x - y - \lambda \nabla S_\alpha(x) + \partial ( \lambda \| D \cdot \|_1 )(x) \qquad (35) $$

which can be written as

$$ \partial F_\alpha(x) = \big\{ x - y - \lambda \nabla S_\alpha(x) + \lambda D^{\mathsf T} u : u_n \in \mathrm{sign}([Dx]_n),\ u \in \mathbb{R}^{N-1} \big\}. \qquad (36) $$

Hence, the condition $0 \in \partial F_\alpha(x)$ can be written as

$$ (y - x)/\lambda + \nabla S_\alpha(x) \in \big\{ D^{\mathsf T} u : u_n \in \mathrm{sign}([Dx]_n),\ u \in \mathbb{R}^{N-1} \big\}. \qquad (37) $$

Let $C$ be a matrix of size $(N-1) \times N$ such that $C D^{\mathsf T} = -I$, e.g., (33). It follows that the condition $0 \in \partial F_\alpha(x)$ implies that

$$ \big[ C \big( (x - y)/\lambda - \nabla S_\alpha(x) \big) \big]_n \in \mathrm{sign}([Dx]_n) \qquad (38) $$

for $n = 0, \ldots, N-2$. Using Proposition 4 gives (32).
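Iteration (22) and the scatter data of the optimality condition (32) translate directly into code. Below is a minimal sketch building on the `tvd` function from Sec. IV; the names `mtvd` and `opt_points` are our own, and a fixed iteration count stands in for a proper convergence test.

```python
import numpy as np  # assumes tvd() from the sketch in Sec. IV

def mtvd(y, lam, alpha, n_iter=40):
    # Moreau-enhanced TV denoising (14) via forward-backward splitting, iteration (22)
    x = np.zeros_like(y)                                  # step 3: x^(0) = 0
    for _ in range(n_iter):
        z = y + lam * alpha * (x - tvd(x, 1.0 / alpha))   # (22a): gradient step
        x = tvd(z, lam)                                   # (22b): proximal step
    return x

def opt_points(x, y, lam, alpha):
    # Points ([Dx]_n, u_n) of the optimality condition (32); at a solution
    # they lie on the graph of the set-valued signum function (34).
    w = (x - y) / lam + alpha * (tvd(x, 1.0 / alpha) - x)
    u = np.cumsum(w)[:-1]   # multiplication by the running-sum matrix C in (33)
    return np.diff(x), u
```

Plotting the two arrays returned by `opt_points` against each other gives the scatter plot described above; points drifting off the signum graph indicate the iteration was stopped too early.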
VIII. EXAMPLE

This example applies TV denoising to the noisy piecewise-constant signal shown in Fig. 1(a). This is the 'blocks' signal (length $N = 256$) generated by the Wavelab [21] function MakeSignal, with additive white Gaussian noise ($\sigma = 0.5$). We set the regularization parameter to $\lambda = \sqrt{N}\,\sigma/4$ following a discussion in Ref. [22]. For Moreau-enhanced TV denoising, we set the non-convexity parameter to $\alpha = 0.7/\lambda$.

Figure 1 shows the result of TV denoising with three different penalties. In each case, a convex cost function is minimized. Figure 1(b) shows the result using standard TV denoising (i.e., using the $\ell_1$ norm). This denoised signal consistently underestimates the amplitudes of jump discontinuities, especially those occurring near other jump discontinuities of opposite sign. Figure 1(c) shows the result using a separable non-convex penalty [48]. This method can use any non-convex scalar penalty satisfying a prescribed set of properties. Here we use the minimax-concave (MC) penalty [3], [52] with the non-convexity parameter set to maintain cost function convexity. This result significantly improves the root-mean-square error (RMSE) and mean absolute error (MAE), but still underestimates the amplitudes of jump discontinuities.

Moreau-enhanced TV denoising, shown in Fig. 1(d), further reduces the RMSE and MAE and more accurately estimates the amplitudes of jump discontinuities. The proposed non-separable non-convex penalty avoids the consistent underestimation of discontinuities seen in Figs. 1(b) and 1(c).

[Fig. 1. Total variation denoising using three different penalties; the dashed line is the true noise-free signal. (a) Noisy signal ($\sigma = 0.50$). (b) Standard convex penalty ($\ell_1$ norm): RMSE = 0.318, MAE = 0.225. (c) Scalar MC penalty: RMSE = 0.240, MAE = 0.172. (d) Proposed penalty: RMSE = 0.200, MAE = 0.130.]

To further compare the denoising capability of the considered penalties, we calculate the average RMSE as a function of the noise level. We let the noise standard deviation span the interval $0.2 \le \sigma \le 1.0$. For each $\sigma$ value, we calculate the average RMSE over 100 noise realizations. Figure 2 shows that the proposed penalty yields the lowest average RMSE for all $\sigma \ge 0.4$. However, at low noise levels, separable convexity-preserving penalties [48] perform better than the proposed non-separable convexity-preserving penalty.

[Fig. 2. TV denoising using four penalties ($\ell_1$ norm, scalar log, scalar MC, proposed): average RMSE as a function of the noise standard deviation $\sigma$.]
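A usage sketch of this experimental setup follows, using `tvd` and `mtvd` from the earlier sketches. The jump positions and amplitudes below are invented stand-ins for the Wavelab 'blocks' signal, which we do not reproduce here; only the length, noise level, and parameter settings match the text above.

```python
import numpy as np  # assumes tvd() and mtvd() from the sketches above

rng = np.random.default_rng(0)
N, sigma = 256, 0.5
s = np.zeros(N)
for pos, amp in [(40, 4.0), (90, -3.0), (130, 2.5), (180, -4.0), (220, 3.0)]:
    s[pos:] += amp                       # hypothetical piecewise-constant test signal
y = s + sigma * rng.standard_normal(N)   # additive white Gaussian noise

lam = np.sqrt(N) * sigma / 4             # regularization parameter, as in Sec. VIII
x_l1 = tvd(y, lam)                       # standard TV denoising (1)
x_me = mtvd(y, lam, alpha=0.7 / lam)     # Moreau-enhanced TV denoising (14)

for name, x in [("L1", x_l1), ("proposed", x_me)]:
    print(name, "RMSE:", np.sqrt(np.mean((x - s) ** 2)))
```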
IX. CONCLUSION

This paper demonstrates the use of the Moreau envelope to define a non-separable non-convex TV denoising penalty that maintains the convexity of the TV denoising cost function. The basic idea is to subtract from a convex penalty its Moreau envelope. This idea should also be useful for other problems, e.g., analysis tight-frame denoising [41]. Separable convexity-preserving penalties [48] outperformed the proposed one at low noise levels in the example. It remains to be determined whether a more general class of convexity-preserving penalties can outperform both across all noise levels.

REFERENCES

[1] F. Åström and C. Schnörr. On coupled regularization for non-convex variational image enhancement. In IAPR Asian Conf. on Pattern Recognition (ACPR), pages 786–790, November 2015.
[2] H. H. Bauschke and P. L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, 2011.
[3] İ. Bayram. Penalty functions derived from monotone mappings. IEEE Signal Processing Letters, 22(3):265–269, March 2015.
[4] İ. Bayram. On the convergence of the iterative shrinkage/thresholding algorithm with a weakly convex penalty. IEEE Trans. Signal Process., 64(6):1597–1608, March 2016.
[5] S. Becker and P. L. Combettes. An algorithm for splitting parallel sums of linearly composed monotone operators, with applications to signal recovery. J. Nonlinear and Convex Analysis, 15(1):137–159, 2014.
[6] A. Blake and A. Zisserman. Visual Reconstruction. MIT Press, 1987.
[7] M. Burger, K. Papafitsoros, E. Papoutsellis, and C.-B. Schönlieb. Infimal convolution regularisation functionals of BV and Lp spaces. J. Math. Imaging and Vision, 55(3):343–369, 2016.
[8] E. J. Candès, M. B. Wakin, and S. Boyd. Enhancing sparsity by reweighted l1 minimization. J. Fourier Anal. Appl., 14(5):877–905, December 2008.
[9] M. Carlsson. On convexification/optimization of functionals including an l2-misfit term. https://arxiv.org/abs/1609.09378, September 2016.
[10] M. Castella and J.-C. Pesquet. Optimization of a Geman-McClure like criterion for sparse signal deconvolution. In IEEE Int. Workshop Comput. Adv. Multi-Sensor Adaptive Proc., pages 309–312, December 2015.
[11] A. Chambolle and P.-L. Lions. Image recovery via total variation minimization and related problems. Numerische Mathematik, 76:167–188, 1997.
[12] R. Chartrand. Shrinkage mappings and their induced penalty functions. In Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pages 1026–1029, May 2014.
[13] L. Chen and Y. Gu. The convergence guarantees of a non-convex approach for sparse recovery. IEEE Trans. Signal Process., 62(15):3754–3767, August 2014.
[14] P.-Y. Chen and I. W. Selesnick. Group-sparse signal denoising: Non-convex regularization, convex optimization. IEEE Trans. Signal Process., 62(13):3464–3478, July 2014.
[15] E. Chouzenoux, A. Jezierska, J.-C. Pesquet, and H. Talbot. A majorize-minimize subspace approach for l2–l0 image regularization. SIAM J. Imag. Sci., 6(1):563–591, 2013.
[16] P. L. Combettes and J.-C. Pesquet. Proximal splitting methods in signal processing. In H. H. Bauschke et al., editors, Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pages 185–212. Springer-Verlag, 2011.
[17] L. Condat. A direct algorithm for 1-D total variation denoising. IEEE Signal Processing Letters, 20(11):1054–1057, November 2013.
[18] J. Darbon and M. Sigelle. Image restoration with discrete constrained total variation Part I: Fast and exact optimization. J. Math. Imaging and Vision, 26(3):261–276, 2006.
[19] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math., 57(11):1413–1457, 2004.
[20] Y. Ding and I. W. Selesnick. Artifact-free wavelet denoising: Non-convex sparse regularization, convex optimization. IEEE Signal Processing Letters, 22(9):1364–1368, September 2015.
[21] D. Donoho, A. Maleki, and M. Shahram. Wavelab 850, 2005. http://www-stat.stanford.edu/%7Ewavelab/.
[22] L. Dümbgen and A. Kovac. Extensions of smoothing via taut strings. Electron. J. Statist., 3:41–75, 2009.
[23] S. Durand and J. Froment. Reconstruction of wavelet coefficients using total variation minimization. SIAM J. Sci. Comput., 24(5):1754–1767, 2003.
[24] G. R. Easley, D. Labate, and F. Colonna. Shearlet-based total variation diffusion for denoising. IEEE Trans. Image Process., 18(2):260–268, February 2009.
[25] M. Figueiredo and R. Nowak. An EM algorithm for wavelet-based image restoration. IEEE Trans. Image Process., 12(8):906–916, August 2003.
[26] A. Gholami and S. M. Hosseini. A balanced combination of Tikhonov and total variation regularizations for reconstruction of piecewise-smooth signals. Signal Processing, 93(7):1945–1960, 2013.
[27] T. Hastie, R. Tibshirani, and M. Wainwright. Statistical Learning with Sparsity: The Lasso and Generalizations. CRC Press, 2015.
[28] W. He, Y. Ding, Y. Zi, and I. W. Selesnick. Sparsity-based algorithm for detecting faults in rotating machines. Mechanical Systems and Signal Processing, 72-73:46–64, May 2016.
[29] N. A. Johnson. A dynamic programming algorithm for the fused lasso and L0-segmentation. J. Computat. Graph. Stat., 22(2):246–260, 2013.
[30] A. Lanza, S. Morigi, and F. Sgallari. Convex image denoising via non-convex regularization with parameter selection. J. Math. Imaging and Vision, pages 1–26, 2016.
[31] M. A. Little and N. S. Jones. Generalized methods and solvers for noise removal from piecewise constant signals: Part I – background theory. Proc. R. Soc. A, 467:3088–3114, 2011.
[32] M. Malek-Mohammadi, C. R. Rojas, and B. Wahlberg. A class of nonconvex penalties preserving overall convexity in optimization-based mean filtering. IEEE Trans. Signal Process., 64(24):6650–6664, December 2016.
[33] Y. Marnissi, A. Benazza-Benyahia, E. Chouzenoux, and J.-C. Pesquet. Generalized multivariate exponential power prior for wavelet-based multichannel image restoration. In Proc. IEEE Int. Conf. Image Processing (ICIP), pages 2402–2406, September 2013.
[34] H. Mohimani, M. Babaie-Zadeh, and C. Jutten. A fast approach for overcomplete sparse decomposition based on smoothed l0 norm. IEEE Trans. Signal Process., 57(1):289–301, January 2009.
[35] T. Möllenhoff, E. Strekalovskiy, M. Moeller, and D. Cremers. The primal-dual hybrid gradient method for semiconvex splittings. SIAM J. Imag. Sci., 8(2):827–857, 2015.
[36] M. Nikolova. Estimation of binary images by minimizing convex criteria. In Proc. IEEE Int. Conf. Image Processing (ICIP), pages 108–112, vol. 2, 1998.
[37] M. Nikolova. Local strong homogeneity of a regularized estimator. SIAM J. Appl. Math., 61(2):633–658, 2000.
[38] M. Nikolova. Analysis of the recovery of edges in images and signals by minimizing nonconvex regularized least-squares. Multiscale Model. Simul., 4(3):960–991, 2005.
[39] M. Nikolova. Energy minimization methods. In O. Scherzer, editor, Handbook of Mathematical Methods in Imaging, chapter 5, pages 138–186. Springer, 2011.
[40] M. Nikolova, M. K. Ng, and C.-P. Tam. Fast nonconvex nonsmooth minimization methods for image restoration and reconstruction. IEEE Trans. Image Process., 19(12):3073–3088, December 2010.
[41] A. Parekh and I. W. Selesnick. Convex denoising using non-convex tight frame regularization. IEEE Signal Processing Letters, 22(10):1786–1790, October 2015.
[42] A. Parekh and I. W. Selesnick. Enhanced low-rank matrix approximation. IEEE Signal Processing Letters, 23(4):493–497, April 2016.
[43] J. Portilla and L. Mancera. L0-based sparse approximation: two alternative methods and some applications. In Proceedings of SPIE, volume 6701 (Wavelets XII), San Diego, CA, USA, 2007.
[44] P. Rodriguez and B. Wohlberg. Efficient minimization method for a generalized total variation functional. IEEE Trans. Image Process., 18(2):322–332, February 2009.
[45] L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, 60:259–268, 1992.
[46] I. W. Selesnick and İ. Bayram. Sparse signal estimation by maximally sparse convex optimization. IEEE Trans. Signal Process., 62(5):1078–1092, March 2014.
[47] I. W. Selesnick and İ. Bayram. Enhanced sparsity by non-separable regularization. IEEE Trans. Signal Process., 64(9):2298–2313, May 2016.
[48] I. W. Selesnick, A. Parekh, and İ. Bayram. Convex 1-D total variation denoising with non-convex regularization. IEEE Signal Processing Letters, 22(2):141–144, February 2015.
[49] S. Setzer, G. Steidl, and T. Teuber. Infimal convolution regularizations with discrete l1-type functionals. Commun. Math. Sci., 9(3):797–827, 2011.
[50] M. Storath, A. Weinmann, and L. Demaret. Jump-sparse and sparse recovery using Potts functionals. IEEE Trans. Signal Process., 62(14):3654–3666, July 2014.
[51] D. P. Wipf, B. D. Rao, and S. Nagarajan. Latent variable Bayesian models for promoting sparsity. IEEE Trans. Inform. Theory, 57(9):6236–6255, September 2011.
[52] C.-H. Zhang. Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, pages 894–942, 2010.
[53] H. Zou and R. Li. One-step sparse estimates in nonconcave penalized likelihood models. Ann. Statist., 36(4):1509–1533, 2008.

X. SUPPLEMENTAL FIGURES

To gain intuition about the proposed penalty function and how it induces sparsity of $Dx$ while maintaining convexity of the cost function, a few illustrations are useful. Figure 3 illustrates the proposed penalty, its sparsity-inducing behavior, and its relationship to the differentiable convex function $S_\alpha$. Figure 4 illustrates how the proposed penalty is able to maintain the convexity of the cost function.

Figure 3 shows the proposed penalty $\psi_\alpha$ defined in (11) for $\alpha = 1$ and $N = 2$. It can be seen that the penalty approximates the standard TV penalty $\| D \cdot \|_1$ for signals $x$ for which $Dx$ is approximately zero. But it increases more slowly than the standard TV penalty as $\| Dx \| \to \infty$. In that sense, it penalizes large values less than the standard TV penalty.

As shown in Fig. 3, the proposed penalty is expressed as the standard TV penalty minus the differentiable convex non-negative function $S_\alpha$. Since $S_\alpha$ is flat around the null space of $D$, the penalty $\psi_\alpha$ approximates the standard TV penalty around the null space of $D$. In addition, since $S_\alpha$ is non-negative, the penalty $\psi_\alpha$ lies below the standard TV penalty.

Figure 4 shows the differentiable part of the cost function $F_\alpha$ for $\alpha = 1$, $\lambda = 1$, and $N = 2$. The differentiable part is given by $f_1$ in (28a). The total cost function is obtained by adding the standard TV penalty to $f_1$; see (23). Hence, $F_\alpha$ is convex if the differentiable part $f_1$ is convex. As can be seen in Fig. 4, the function $f_1$ is convex. We note that the function $f_1$ in this figure is not strongly convex. This is because we have used $\alpha = 1/\lambda$. If $\alpha < 1/\lambda$, then the function $f_1$ will be strongly convex (and hence $F_\alpha$ will also be strongly convex and have a unique minimizer). We recommend $\alpha < 1/\lambda$.

Figure 5 shows the differentiable part $f_1$ of the cost function $F_\alpha$ for $\alpha = 2$, $\lambda = 1$, and $N = 2$. Here, the function $f_1$ is non-convex because $\alpha > 1/\lambda$, which violates the convexity condition.

In order to simplify the illustration, we have set $y = 0$ in Fig. 4. In practice $y \neq 0$. But the only difference between the cases $y = 0$ and $y \neq 0$ is an additive affine function, which does not alter the convexity properties of the function.

In practice we are interested in the case $N \gg 2$, i.e., signals much longer than two samples.
However, in order to illustrate the functions, we are limited to the case of $N = 2$. We note that the case $N = 2$ does not fully illustrate the behavior of the proposed penalty. In particular, when $N = 2$ the penalty is simply a linear transformation of a scalar function, which does not convey the non-separable behavior of the penalty for $N > 2$.

A separate document has additional supplemental figures illustrating the convergence of the iterative algorithm (22). These figures show the optimality condition as a scatter plot. The points in the scatter plot converge to the signum function as the algorithm converges.

[Fig. 3. Penalty $\psi_\alpha$ with $\alpha = 1$ for $N = 2$: surface plots of $\| Dx \|_1$, $S_\alpha(x)$, and $\psi_\alpha(x) = \| Dx \|_1 - S_\alpha(x)$ over $[-2, 2]^2$.]

[Fig. 4. Differentiable part $f_1(x) = \tfrac{1}{2} \| x \|_2^2 - \lambda S_\alpha(x)$ of the cost function with $\lambda = 1$, $\alpha = 1$ for $N = 2$. The function is convex since $\alpha \le 1/\lambda$.]

[Fig. 5. Differentiable part $f_1$ of the cost function with $\lambda = 1$, $\alpha = 2$ for $N = 2$. The function is non-convex since $\alpha > 1/\lambda$.]
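The $N = 2$ surfaces of Fig. 3 can be regenerated from the earlier sketch. The snippet below (grid range and resolution are our own choices) evaluates $\psi_\alpha$ on a grid and, as a byproduct, checks the bounds (13) numerically.

```python
import numpy as np  # assumes psi() from the sketch in Sec. IV

ts = np.linspace(-2.0, 2.0, 41)
Z = np.array([[psi(np.array([x1, x2]), alpha=1.0) for x1 in ts] for x2 in ts])
TV = np.array([[abs(x2 - x1) for x1 in ts] for x2 in ts])  # ||Dx||_1 for N = 2

# Proposition 5 / inequality (13): 0 <= psi_alpha(x) <= ||Dx||_1 on the grid,
# up to the tolerance of the iterative tvd solver.
print("min psi:", Z.min(), " max violation of upper bound:", (Z - TV).max())
# Z can be rendered as a surface, e.g., with matplotlib's plot_surface, to match Fig. 3.
```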
