NONNEGATIVE MATRIX AND TENSOR FACTORIZATIONS NONNEGATIVE MATRIX AND TENSOR FACTORIZATIONS APPLICATIONS TO EXPLORATORY MULTI-WAY DATA ANALYSIS AND BLIND SOURCE SEPARATION AndrzejCichocki LaboratoryforAdvancedBrainSignalProcessing,RikenBrainScienceInstitute,Japan;andWarsaw UniversityofTechnologyandSystemsResearchInstitute,PAN,Poland RafalZdunek InstituteofTelecommunications,TeleinformaticsandAcoustics,WroclawUniversityofTechnology, Poland;andRikenBrainScienceInstitute,Japan AnhHuyPhan LaboratoryforAdvancedBrainSignalProcessing,RikenBrainScienceInstitute,Japan Shun-ichiAmari ResearchUnitforMathematicalNeuroscience,RikenBrainScienceInstitute,Japan Thiseditionfirstpublished2009 ©2009JohnWiley&Sons,Ltd Registeredoffice JohnWiley&SonsLtd,TheAtrium,SouthernGate,Chichester,WestSussex,PO198SQ,UnitedKingdom Fordetailsofourglobaleditorialoffices,forcustomerservicesandforinformationabouthowtoapplyforpermissionto reuse the copyright material in this book please see our website at www.wiley.com. TherightoftheauthortobeidentifiedastheauthorofthisworkhasbeenassertedinaccordancewiththeCopyright, DesignsandPatentsAct1988. Allrightsreserved.Nopartofthispublicationmaybereproduced,storedinaretrievalsystem,ortransmitted,inany formorbyanymeans,electronic,mechanical,photocopying,recordingorotherwise,exceptaspermittedbytheUK Copyright,DesignsandPatentsAct1988,withoutthepriorpermissionofthepublisher. Wileyalsopublishesitsbooksinavarietyofelectronicformats.Somecontentthatappearsinprintmaynotbeavailable inelectronicbooks. Designationsusedbycompaniestodistinguishtheirproductsareoftenclaimedastrademarks.Allbrandnamesand productnamesusedinthisbookaretradenames,servicemarks,trademarksorregisteredtrademarksoftheirrespective owners.Thepublisherisnotassociatedwithanyproductorvendormentionedinthisbook.Thispublicationisdesigned toprovideaccurateandauthoritativeinformationinregardtothesubjectmattercovered.Itissoldontheunderstanding thatthepublisherisnotengagedinrenderingprofessionalservices.Ifprofessionaladviceorotherexpertassistanceis required,theservicesofacompetentprofessionalshouldbesought. MATLAB®MATLABandanyassociatedtrademarksusedinthisbookaretheregisteredtrademarksofThe MathWorks,Inc. ForMATLAB®productinformation,pleasecontact: TheMathWorks,Inc. 3AppleHillDrive Natick,MA,01760-2098USA Tel:508-647-7000 Fax:508-647-7001 E-mail:[email protected] Web:www.mathworks.com LibraryofCongressCataloging-in-PublicationData Nonnegativematrixandtensorfactorizations:applicationstoexploratorymultiwaydataanalysis andblindsourceseparation/AndrzejCichocki...[etal.]. p.cm. Includesbibliographicalreferencesandindex. ISBN978-0-470-74666-0(cloth) 1. Computeralgorithms.2. Datamining.3.Machinelearning4. Datastructures(Computerscience) I. Cichocki,Andrzej. QA76.9.A43N652009 005.1–dc22 2009016049 AcataloguerecordforthisbookisavailablefromtheBritishLibrary. ISBN:978-0-470-74666-0 Setin9/11ptTimesbyThomsonDigital,Noida,India PrintedinSingaporebyFabulous Contents Preface xi Acknowledgments xv GlossaryofSymbolsandAbbreviations xvii 1 Introduction–ProblemStatementsandModels 1 1.1 BlindSourceSeparationandLinearGeneralizedComponentAnalysis 2 1.2 MatrixFactorizationModelswithNonnegativityandSparsityConstraints 7 1.2.1 WhyNonnegativityandSparsityConstraints? 7 1.2.2 BasicNMFModel 8 1.2.3 SymmetricNMF 9 1.2.4 Semi-OrthogonalNMF 10 1.2.5 Semi-NMFandNonnegativeFactorizationofArbitraryMatrix 10 1.2.6 Three-factorNMF 10 1.2.7 NMFwithOffset(AffineNMF) 13 1.2.8 Multi-layerNMF 14 1.2.9 SimultaneousNMF 14 1.2.10 ProjectiveandConvexNMF 15 1.2.11 KernelNMF 16 1.2.12 ConvolutiveNMF 16 1.2.13 OverlappingNMF 17 1.3 BasicApproachestoEstimateParametersofStandardNMF 18 1.3.1 Large-scaleNMF 21 1.3.2 Non-uniquenessofNMFandTechniquestoAlleviatetheAmbiguityProblem 22 1.3.3 InitializationofNMF 24 1.3.4 StoppingCriteria 25 1.4 TensorPropertiesandBasisofTensorAlgebra 26 1.4.1 Tensors(Multi-wayArrays)–Preliminaries 26 1.4.2 Subarrays,TubesandSlices 27 1.4.3 Unfolding–Matricization 28 1.4.4 Vectorization 31 1.4.5 Outer,Kronecker,Khatri-RaoandHadamardProducts 32 1.4.6 Mode-nMultiplicationofTensorbyMatrixandTensor byVector,ContractedTensorProduct 34 1.4.7 SpecialFormsofTensors 38 1.5 TensorDecompositionsandFactorizations 39 1.5.1 WhyMulti-wayArrayDecompositionsandFactorizations? 40 1.5.2 PARAFACandNonnegativeTensorFactorization 42 1.5.3 NTF1Model 47 vi Contents 1.5.4 NTF2Model 49 1.5.5 IndividualDifferencesinScaling(INDSCAL)andImplicitSliceCanonical DecompositionModel(IMCAND) 52 1.5.6 ShiftedPARAFACandConvolutiveNTF 53 1.5.7 NonnegativeTuckerDecompositions 55 1.5.8 BlockComponentDecompositions 59 1.5.9 Block-OrientedDecompositions 62 1.5.10 PARATUCK2andDEDICOMModels 63 1.5.11 HierarchicalTensorDecomposition 65 1.6 DiscussionandConclusions 66 Appendix1.A:UniquenessConditionsforThree-wayTensorFactorizations 66 Appendix1.B:SingularValueDecomposition(SVD)andPrincipalComponent Analysis(PCA)withSparsityand/orNonnegativityConstraints 67 1.B.1 StandardSVDandPCA 68 1.B.2 SparsePCA 70 1.B.3 NonnegativePCA 71 Appendix1.C:DeterminingaTrueNumberofComponents 71 Appendix1.D:NonnegativeRankFactorizationUsingWedderbornTheorem–Estimation oftheNumberofComponents 74 References 75 2 SimilarityMeasuresandGeneralizedDivergences 81 2.1 Error-inducedDistanceandRobustRegressionTechniques 82 2.2 RobustEstimation 84 2.3 Csisza´rDivergences 90 2.4 BregmanDivergence 96 2.4.1 BregmanMatrixDivergences 103 2.5 Alpha-Divergences 104 2.5.1 AsymmetricAlpha-Divergences 104 2.5.2 SymmetricAlpha-Divergences 110 2.6 Beta-Divergences 112 2.7 Gamma-Divergences 116 2.8 DivergencesDerivedfromTsallisandRe´nyiEntropy 118 2.8.1 ConcludingRemarks 119 Appendix2.A:InformationGeometry,CanonicalDivergence,andProjection 120 2.A.1 SpaceofProbabilityDistributions 120 2.A.2 GeometryofSpaceofPositiveMeasures 123 Appendix2.B:ProbabilityDensityFunctionsforVariousDistributions 125 References 127 3 MultiplicativeIterativeAlgorithmsforNMFwithSparsityConstraints 131 3.1 ExtendedISRAandEMMLAlgorithms:RegularizationandSparsity 132 3.1.1 MultiplicativeNMFAlgorithmsBasedontheSquaredEuclideanDistance 132 3.1.2 MultiplicativeNMFAlgorithmsBasedonKullback-LeiblerI-Divergence 139 3.2 MultiplicativeAlgorithmsBasedonAlpha-Divergence 143 3.2.1 MultiplicativeAlphaNMFAlgorithm 143 3.2.2 GeneralizedMultiplicativeAlphaNMFAlgorithms 147 3.3 AlternatingSMART:SimultaneousMultiplicativeAlgebraicReconstructionTechnique 148 3.3.1 AlphaSMARTAlgorithm 148 3.3.2 GeneralizedSMARTAlgorithms 150 Contents vii 3.4 MultiplicativeNMFAlgorithmsBasedonBeta-Divergence 151 3.4.1 MultiplicativeBetaNMFAlgorithm 151 3.4.2 MultiplicativeAlgorithmBasedontheItakura-SaitoDistance 156 3.4.3 GeneralizedMultiplicativeBetaAlgorithmforNMF 156 3.5 AlgorithmsforSemi-orthogonalNMFandOrthogonalThree-FactorNMF 157 3.6 MultiplicativeAlgorithmsforAffineNMF 159 3.7 MultiplicativeAlgorithmsforConvolutiveNMF 160 3.7.1 MultiplicativeAlgorithmforConvolutiveNMFBasedonAlpha-Divergence 162 3.7.2 MultiplicativeAlgorithmforConvolutiveNMFBasedonBeta-Divergence 162 3.7.3 EfficientImplementationofCNMFAlgorithm 165 3.8 SimulationExamplesforStandardNMF 166 3.9 ExamplesforAffineNMF 170 3.10 MusicAnalysisandDecompositionUsingConvolutiveNMF 176 3.11 DiscussionandConclusions 184 Appendix3.A:FastAlgorithmsforLarge-scaleData 187 3.A.1 RandomBlock-wiseProcessingApproach–Large-scaleNMF 187 3.A.2 Multi-layerProcedure 188 3.A.3 ParallelProcessing 188 Appendix3.B:PerformanceEvaluation 188 3.B.1 Signal-to-Interference-Ratio-SIR 188 3.B.2 PeakSignal-to-Noise-Ratio(PSNR) 190 Appendix3.C:ConvergenceAnalysisoftheMultiplicativeAlphaNMFAlgorithm 191 Appendix3.D:MATLABImplementationoftheMultiplicativeNMFAlgorithms 193 3.D.1 AlphaAlgorithm 193 3.D.2 SMARTAlgorithm 195 3.D.3 ISRAAlgorithmforNMF 197 Appendix3.E:AdditionalMATLABFunctions 198 3.E.1 Multi-layerNMF 198 3.E.2 MCAnalysiswithDistributedComputingTool 199 References 199 4 AlternatingLeastSquaresandRelatedAlgorithmsforNMFandSCAProblems 203 4.1 StandardALSAlgorithm 203 4.1.1 MultipleLinearRegression–VectorizedVersionofALSUpdateFormulas 206 4.1.2 WeightedALS 206 4.2 MethodsforImprovingPerformanceandConvergenceSpeedofALSAlgorithms 207 4.2.1 ALSAlgorithmforVeryLarge-scaleNMF 207 4.2.2 ALSAlgorithmwithLine-Search 208 4.2.3 AccelerationofALSAlgorithmviaSimpleRegularization 208 4.3 ALSAlgorithmwithFlexibleandGeneralizedRegularizationTerms 209 4.3.1 ALSwithTikhonovTypeRegularizationTerms 210 4.3.2 ALSAlgorithmswithSparsityControlandDecorrelation 211 4.4 CombinedGeneralizedRegularizedALSAlgorithms 212 4.5 Wang-HancewiczModifiedALSAlgorithm 213 4.6 ImplementationofRegularizedALSAlgorithmsforNMF 213 4.7 HALSAlgorithmanditsExtensions 214 4.7.1 ProjectedGradientLocalHierarchicalAlternatingLeastSquares(HALS) Algorithm 214 4.7.2 ExtensionsandImplementationsoftheHALSAlgorithm 216 4.7.3 FastHALSNMFAlgorithmforLarge-scaleProblems 217 viii Contents 4.7.4 HALSNMFAlgorithmwithSparsity,SmoothnessandUncorrelatedness Constraints 220 4.7.5 HALSAlgorithmforSparseComponentAnalysisandFlexible ComponentAnalysis 222 4.7.6 SimplifiedHALSAlgorithmforDistributedandMulti-taskCompressedSensing 227 4.7.7 GeneralizedHALS-CSAlgorithm 231 4.7.8 GeneralizedHALSAlgorithmsUsingAlpha-Divergence 233 4.7.9 GeneralizedHALSAlgorithmsUsingBeta-Divergence 234 4.8 SimulationResults 236 4.8.1 UnderdeterminedBlindSourceSeparationExamples 236 4.8.2 NMFwithSparseness,OrthogonalityandSmoothnessConstraints 237 4.8.3 SimulationsforLarge-scaleNMF 239 4.8.4 IllustrativeExamplesforCompressedSensing 241 4.9 DiscussionandConclusions 249 Appendix4.A:MATLABSourceCodeforALSAlgorithm 252 Appendix4.B:MATLABSourceCodeforRegularizedALSAlgorithms 253 Appendix4.C:MATLABSourceCodeforMixedALS-HALSAlgorithms 256 Appendix4.D:MATLABSourceCodeforHALSCSAlgorithm 259 Appendix4.E:AdditionalMATLABFunctions 261 References 264 5 ProjectedGradientAlgorithms 267 5.1 ObliqueProjectedLandweber(OPL)Method 268 5.2 Lin’sProjectedGradient(LPG)AlgorithmwithArmijoRule 270 5.3 Barzilai-BorweinGradientProjectionforSparseReconstruction(GPSR-BB) 271 5.4 ProjectedSequentialSubspaceOptimization(PSESOP) 273 5.5 InteriorPointGradient(IPG)Algorithm 275 5.6 InteriorPointNewton(IPN)Algorithm 276 5.7 RegularizedMinimalResidualNormSteepestDescentAlgorithm(RMRNSD) 279 5.8 SequentialCoordinate-WiseAlgorithm(SCWA) 281 5.9 Simulations 283 5.10 Discussions 289 Appendix5.A:StoppingCriteria 290 Appendix5.B:MATLABSourceCodeforLin’sPGAlgorithm 292 References 293 6 Quasi-NewtonAlgorithmsforNonnegativeMatrixFactorization 295 6.1 ProjectedQuasi-NewtonOptimization 296 6.1.1 ProjectedQuasi-NewtonforFrobeniusNorm 296 6.1.2 ProjectedQuasi-NewtonforAlpha-Divergence 298 6.1.3 ProjectedQuasi-NewtonforBeta-Divergence 303 6.1.4 PracticalImplementation 305 6.2 GradientProjectionConjugateGradient 305 6.3 FNMAalgorithm 308 6.4 NMFwithQuadraticProgramming 310 6.4.1 NonlinearProgramming 311 6.4.2 QuadraticProgramming 312 6.4.3 Trust-regionSubproblem 314 6.4.4 UpdatesforA 316 6.5 HybridUpdates 318