Evaluating Component Solver Contributions to Portfolio-Based Algorithm Selectors
Lin Xu, Frank Hutter, Holger Hoos, Kevin Leyton-Brown
BETA Lab, Department of Computer Science, University of British Columbia, Canada

SAT Competitions help...
- establish benchmarks
- assess state of the art
- promote solvers, solver development
Xu, Hutter, Hoos, Leyton-Brown: Evaluating Portfolio-Based Algorithm Selectors

2009: 5 of 27 medals
2011: 30 of 54 medals

What is the state of the art in SAT solving?
- single best solver, SBS (= winner of competition category)?
- virtual best solver, VBS (= oracle) over winners of competition categories?
- virtual best solver, VBS (= oracle) over all solvers from competition?
- portfolio-based selector over all solvers from competition?
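
The two baselines above can be made concrete with a small sketch. The solver names and runtimes below are hypothetical, and "best" is assumed to mean "solves the most instances within the cutoff"; the competition's actual ranking rules may differ.

```python
# Hypothetical runtimes in CPU seconds; None marks an instance not solved
# within the cutoff. One list entry per instance, same order for all solvers.
runtimes = {
    "solver_a": [12.0, None, 430.0, None, None],
    "solver_b": [None, 95.0, 610.0, 3000.0, None],
    "solver_c": [40.0, None, None, None, None],
}

def solved_fraction(times):
    """Fraction of instances solved within the cutoff."""
    return sum(t is not None for t in times) / len(times)

def single_best_solver(runtimes):
    """SBS: the one solver that solves the most instances on its own."""
    return max(runtimes, key=lambda name: solved_fraction(runtimes[name]))

def virtual_best_solver(runtimes):
    """VBS: per-instance oracle that always picks the fastest solver.
    An instance stays unsolved (None) only if no solver solves it."""
    per_instance = zip(*runtimes.values())
    return [min((t for t in times if t is not None), default=None)
            for times in per_instance]

sbs = single_best_solver(runtimes)
vbs = virtual_best_solver(runtimes)
print(sbs, solved_fraction(runtimes[sbs]))  # the single best solver
print(vbs, solved_fraction(vbs))            # the oracle over all solvers
```

The VBS is an upper bound on what any selector over the same solvers can reach, which is why it serves as a reference point rather than as a runnable solver.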

Meta-algorithmic techniques rule!
- instance-based solver selection (e.g., SATzilla, 3S)
- parallel solver portfolios (e.g., ManySAT, ppfolio, plingeling)
- sequential solver schedules (used in SATzilla, 3S)

- SATzilla-2009: 3+2 = 5/27 medals in 2009 SAT Competition
- ppfolio: 5+5+6 = 16/54 medals in 2011 SAT Competition
- 3S: 2+1+3 = 7/54 medals in 2011 SAT Competition

When does portfolio-based selection work well?
- several strong & weakly/un-correlated component solvers
- informative & cheaply computable features
- effective selector construction technique
- informative set of training data

How to improve the state of the art in SAT solving?
- several strong & weakly/un-correlated component solvers
- informative & cheaply computable features
- effective selector construction technique
- informative set of training data

Goals in (non-portfolio) solver development:
- (A) better all-round performance (required to do well in competition under current scoring)
- (B) better performance on certain types of instances (rewarded under purse-based scoring, van Gelder et al., 2005)

If state of the art (SOTA) = portfolio-based selector, then (B) is more effective in improving SOTA.

SOTA portfolio
- strongest portfolio-based solver that can be fully automatically constructed from available solvers

Marginal contribution of solver S to SOTA portfolio P
- difference in performance of P with and without S (trained separately)
- frequency of selecting S
- fraction of instances solved by S
- contribution of S to VBS
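
The "with and without S" definition can be sketched as a leave-out computation. Everything named here is an assumption for illustration: `build_and_score` stands for "construct a portfolio from the given solvers and measure its performance", and the demo replaces that step with a simple oracle count over hypothetical solved-instance sets instead of training a real selector.

```python
from typing import Callable, Iterable, Sequence

def marginal_contribution(
    removed: Iterable[str],
    solvers: Sequence[str],
    build_and_score: Callable[[Sequence[str]], float],
) -> float:
    """Score of the portfolio built from all solvers minus the score of a
    portfolio rebuilt from scratch without the solvers in `removed`."""
    removed = set(removed)
    remaining = [s for s in solvers if s not in removed]
    return build_and_score(solvers) - build_and_score(remaining)

# Hypothetical data: indices of the instances each solver solves.
SOLVED = {
    "cdcl_1": {0, 1, 2, 3, 4},
    "cdcl_2": {0, 1, 2, 3, 5},
    "ls_1": {6, 7},
    "ls_2": {6, 7},
}

def oracle_score(solvers):
    """Stand-in for training and evaluating a selector: the number of
    instances solved by at least one of `solvers` (an upper bound)."""
    return len(set().union(*(SOLVED[s] for s in solvers)))

solvers = list(SOLVED)
single = {s: marginal_contribution([s], solvers, oracle_score) for s in solvers}
joint_ls = marginal_contribution(["ls_1", "ls_2"], solvers, oracle_score)
print(single)    # each solver removed on its own
print(joint_ls)  # both ls_* solvers removed together
```

In this toy data the two `ls_*` solvers are perfectly correlated, so removing either one alone costs nothing, while removing both loses two instances. Passing a group to `removed` is one way to measure such a joint contribution.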

How SATzilla Works
[Diagram components: Instance; Feature extractor; Algorithm selector; Minimal cost feature extractor; Feature cost predictor]

SATzilla 2011 uses...
- cost-sensitive decision forests for every pair of solvers
- voting to select solver to be run
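
The pairwise-voting idea can be illustrated with a minimal sketch. This is not SATzilla's implementation: a weighted decision stump on a single hypothetical feature stands in for a cost-sensitive decision forest, and weighting each training instance by the runtime gap between the two solvers is an assumption about what "cost-sensitive" means here. All data and names are made up.

```python
from collections import Counter
from itertools import combinations

def train_stump(xs, labels, weights):
    """Weighted decision stump on one feature. Returns (threshold, left, right)
    minimising the total weight of misclassified training instances."""
    best = None
    classes = sorted(set(labels))
    for thr in sorted(set(xs)):
        for left in classes:
            for right in classes:
                cost = sum(w for x, y, w in zip(xs, labels, weights)
                           if y != (left if x <= thr else right))
                if best is None or cost < best[0]:
                    best = (cost, thr, left, right)
    return best[1:]

def train_pairwise(features, runtimes):
    """One model per solver pair, predicting which of the two is faster."""
    models = {}
    for a, b in combinations(sorted(runtimes), 2):
        pairs = list(zip(runtimes[a], runtimes[b]))
        labels = [a if ra <= rb else b for ra, rb in pairs]
        # Assumed cost of a wrong prediction: the runtime lost by it.
        weights = [abs(ra - rb) for ra, rb in pairs]
        models[(a, b)] = train_stump(features, labels, weights)
    return models

def select(models, x):
    """Each pairwise model casts one vote; the most-voted solver is run.
    Ties are broken arbitrarily in this sketch."""
    votes = Counter(left if x <= thr else right
                    for thr, left, right in models.values())
    return votes.most_common(1)[0][0]

# Hypothetical training data: one feature value and one runtime per instance.
features = [0.1, 0.2, 0.8, 0.9]
runtimes = {
    "A": [10.0, 20.0, 4000.0, 5000.0],
    "B": [3000.0, 2500.0, 30.0, 50.0],
    "C": [500.0, 600.0, 700.0, 800.0],
}
models = train_pairwise(features, runtimes)
print(select(models, 0.15), select(models, 0.85))
```

With k solvers this trains k(k-1)/2 models, which stays manageable for portfolios of a few dozen solvers.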

Empirical Analysis
- all instances from 2011 SAT Competition: 300 Application; 300 Crafted; 300 Random
- candidate solvers from 2011 SAT Competition:
  - for determining VBS and SBS: all solvers from Phase 2 of competition: 31 Application; 25 Crafted; 17 Random
  - for building SATzilla: all sequential, non-portfolio solvers from Phase 2: 18 Application; 15 Crafted; 9 Random
- SATzilla assessed by 10-fold cross-validation
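
A minimal sketch of 10-fold cross-validation over one category of 300 instances follows. The callback `train_and_test` is hypothetical: it stands for "build a selector on the training instances and count how many test instances it solves". How the talk's experiments shuffled or stratified the folds is not stated, so a plain seeded shuffle is assumed.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle instance indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(n, train_and_test, k=10, seed=0):
    """Each fold is held out once for testing while the other k-1 folds are
    used for training. Returns the overall fraction of instances solved."""
    folds = k_fold_indices(n, k, seed)
    solved = 0
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        solved += train_and_test(train_idx, test_idx)
    return solved / n

def dummy_train_and_test(train_idx, test_idx):
    """Placeholder that ignores training and 'solves' even-numbered instances."""
    return sum(1 for j in test_idx if j % 2 == 0)

folds = k_fold_indices(300)
score = cross_validate(300, dummy_train_and_test)
print(len(folds), [len(f) for f in folds], score)
```

Because every instance is tested exactly once by a selector that never saw it during training, the reported percentage estimates performance on unseen instances.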

SATzilla 2011 Performance (Inst. Solved)

Solver                Application   Crafted   Random
VBS                   84.7%         76.3%     82.2%
SATzilla 2011         75.3%         66.0%     80.8%
SATzilla 2009         70.3%         63.0%     80.3%
Gold medalist (SBS)   71.7%         54.3%     68.0%

SATzilla 2011 vs 3S
Fair Comparison
- same 26 candidate solvers, features, training data
- 5000 sec cutoff time
- same machine, instance set for testing

               SATzilla 2011   3S       SBS = MXC09   VBS
Inst. Solved   68.3%           67.4%    38.2%         76.9%
PAR10          16 166          16 442   31 185        11 836
(combined results for all 3 categories)
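
PAR10 is commonly defined as the penalized average runtime in which every unsolved instance counts as ten times the cutoff. A minimal sketch under that assumption, with hypothetical runtimes and the 5000 sec cutoff from the slide:

```python
def par10(runtimes, cutoff=5000.0):
    """Penalized average runtime: unsolved instances (None, or slower than
    the cutoff) are counted as 10 * cutoff."""
    penalised = [t if t is not None and t <= cutoff else 10 * cutoff
                 for t in runtimes]
    return sum(penalised) / len(penalised)

# Hypothetical runtimes in seconds; None marks a timeout.
example = [100.0, 900.0, None, None]
print(par10(example))  # (100 + 900 + 50000 + 50000) / 4
```

Under this definition timeouts dominate the score: with a 5000 sec cutoff, each percentage point of unsolved instances adds 500 to PAR10, so the PAR10 row largely tracks the Inst. Solved row.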

Performance of Individual Solvers: Application
[Figure: percentage of instances solved by each component solver, 5000 CPU sec cutoff. Solvers: RestartSAT, Cirminisat, Minisat, EBMinisat, Minisatagile, Glueminisat, LR GL SHR, Precosat, Lingeling, MPhaseSAT64, Contrasat, Minisat_psm, Rcl, Glucose1, Glucose2, EBGlucose, CryptoMinisat, QuteRSat]

Correlation of Solver Performance: Application
[Figure: solver-by-solver correlation matrix; darker = higher Spearman correlation coefficient. Solvers on both axes: RestartSAT, Cirminisat, Minisat, EBMinisat, Minisatagile, Glueminisat, LR GL SHR, Precosat, Lingeling, MPhaseSAT64, Contrasat, Minisat_psm, Rcl, Glucose1, Glucose2, EBGlucose, CryptoMinisat, QuteRSat]

Correlation of Solver Performance: Random
[Figure: solver-by-solver correlation matrix; darker = higher Spearman correlation coefficient. Solvers on both axes: Sparrow, EagleUP, Gnovelty+2, TNM, Sattime11, Adaptg2wsat11, MPhaseSAT_M, March_rw, March_hi]
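
The Spearman coefficient used in these matrices is the Pearson correlation of the rank-transformed values. The sketch below computes it for pairs of solvers from hypothetical per-instance runtimes; recording timeouts at the cutoff value (which creates ties) is an assumption, since the slides do not say how timeouts were handled.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical runtimes (seconds) on four instances; 5000 marks a timeout.
a = [10.0, 200.0, 5000.0, 5000.0]
b = [20.0, 100.0, 4000.0, 5000.0]
c = [5000.0, 4000.0, 50.0, 10.0]
print(spearman(a, b), spearman(a, c))
```

Here `a` and `b` find the same instances easy and hard, so they are strongly positively correlated, while `c` is strong exactly where `a` is weak. A pair like `a` and `c` is the kind a selector can exploit.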

Solver Selection Frequency in SATzilla 2011: Application
[Figure: share of instances per category: Glucose2 (Backup), Solved by Presolvers, Glucose2, Glueminisat, QuteRSat, Precosat, Other Solvers]

Instances Solved by SATzilla 2011 Components: Application
[Figure: share of instances per category: Glucose2 (Backup), Unsolved, Glucose2 (Pre1), Other Solvers, Glucose2, Glueminisat (Pre1), Glueminisat, QuteRSat, EBGlucose, EBGlucose (Pre1), Precosat, Minisat_psm, Minisat_psm (Pre1)]

Marginal Contribution of Components: Application
[Figure: marginal contribution (%) of each component solver. Solvers: RestartSAT, Cirminisat, Minisat, EBMinisat, Minisatagile, Glueminisat, LR GL SHR, Precosat, Lingeling, MPhaseSAT64, Contrasat, Minisat_psm, Rcl, Glucose1, Glucose2, EBGlucose, CryptoMinisat, QuteRSat]

Instances Solved vs Marginal Contribution of Components: Application
[Figure: marginal contribution (y-axis) against % solved by component solver (x-axis); labeled points: MPhaseSAT64, Glueminisat]

Instances Solved vs Marginal Contribution of Components: Crafted
[Figure: marginal contribution (y-axis) against % solved by component solver (x-axis); labeled points: Sattime, MPhaseSAT, Sol, Clasp2]
Joint contributions:
- 2 Clasp variants = 6.3%
- 2 Sattime variants = 5.4%

Instances Solved vs Marginal Contribution of Components: Random
[Figure: marginal contribution (y-axis) against % solved by component solver (x-axis); labeled points: Sparrow, EagleUP, March_rw]
Joint contributions:
- 2 March variants = 4%
- 6 LS solvers = 22.5%

Conclusions
- State of the art in SAT solving: portfolio-based algorithm selectors
- Use marginal contributions to SOTA portfolio to assess value of solvers for improving state of the art.
- To promote development of strong, uncorrelated solvers:
  - Give formal recognition to solvers contributing most to SOTA portfolio(s).
  - Evaluate portfolio-based solvers separately.