Problem 1. Spline approximation for one factor (one independent variable)
Mathematical Problem Statement
Problem dimension and solving time
Solution in Run-File Environment
Solution in MATLAB Environment
Problem 2. Sum of splines approximation for a set of factors
Mathematical Problem Statement
Problem dimension and solving time
Solution in Run-File Environment
Solution in MATLAB Environment
Problem 3. Sum of splines approximation with 4-fold Cross Validation
Mathematical Problem Statement
Problem dimension and solving time
Solution in Run-File Environment
Solution in MATLAB Environment
This case study demonstrates a binary classifier based on approximating multidimensional data (with several independent variables) by a sum of splines, using the PSG function spline_sum.
The PSG function logexp_sum (Maximum Likelihood for Logistic Regression) is maximized to find the spline variables that give the best approximation of the data for one factor (Problem 1) and for a set of factors (Problem 2). An estimated spline may overfit the in-sample data, which may result in poor out-of-sample performance. Cross-validation is used to check for overfitting (Problem 3). To prepare data for cross-validation we use the PSG matrix operation Crossvalidation(N,Matrix), which splits the input Matrix of Scenarios into N pairs of complementary sub-matrices.
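As a rough illustration of the objective, the sketch below computes an average logistic-regression log-likelihood in plain Python. It assumes 0/1 labels, equal scenario weights, and one real-valued score per scenario standing in for the spline_sum value. The name avg_log_likelihood is hypothetical, and the exact scaling PSG uses for logexp_sum may differ.

```python
import math

def avg_log_likelihood(scores, labels):
    """Average log-likelihood of 0/1 labels under a logistic model.

    scores: one real-valued score per scenario (assumed here to play the
            role of the spline_sum value); labels: 0 or 1 per scenario.
    """
    total = 0.0
    for s, y in zip(scores, labels):
        # ln(1 + e^s), computed in a form that does not overflow for large |s|
        log1pexp = max(s, 0.0) + math.log1p(math.exp(-abs(s)))
        total += y * s - log1pexp
    return total / len(scores)

# A model that gives score 0 to every scenario predicts probability 0.5,
# so its average log-likelihood equals -ln(2).
baseline = avg_log_likelihood([0.0, 0.0, 0.0, 0.0], [0, 1, 1, 0])
```

Under these assumptions an uninformative model scores -ln 2 (about -0.6931), and values closer to zero indicate a better fit.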
Problem 1. Spline approximation for one factor (one independent variable).
Maximize logexp_sum(spline_sum) (maximize Logarithms Exponents Sum applied to Spline Sum)
Calculate:
logexp_sum(spline_sum) (function Logarithms Exponents Sum applied to Spline Sum)
L(spline_sum) (function L applied to Spline Sum)
where
logexp_sum = Logarithms Exponents Sum
spline_sum = Spline Sum; calculates spline values for every scenario as a function of the regression variables
L = Linear Loss for Spline Sum
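To make the idea of a one-factor spline concrete, here is a minimal Python sketch of evaluating a piecewise polynomial. It is an illustration only: the function eval_piecewise_poly, the knot layout, and the coefficient ordering are assumptions, and the sketch does not reproduce the parameterization or smoothness constraints that spline_sum applies internally.

```python
from bisect import bisect_right

def eval_piecewise_poly(x, knots, coefs):
    """Evaluate a piecewise polynomial at x.

    knots: sorted breakpoints between pieces (length = number of pieces - 1).
    coefs: one coefficient list per piece, lowest degree first.
    """
    piece = bisect_right(knots, x)      # index of the piece that contains x
    value = 0.0
    for c in reversed(coefs[piece]):    # Horner's rule
        value = value * x + c
    return value

# Two linear pieces joined at 0 that together form |x|.
knots = [0.0]
coefs = [[0.0, -1.0],   # piece 0: -x for x < 0
         [0.0, 1.0]]    # piece 1:  x for x >= 0
left = eval_piecewise_poly(-2.0, knots, coefs)
right = eval_piecewise_poly(3.0, knots, coefs)

# In this layout the number of free coefficients is pieces * (degree + 1).
n_coefs = sum(len(c) for c in coefs)
```

In this toy layout the coefficients play the role of the regression variables that the optimizer would adjust.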
Mathematical Problem Statement
Problem dimension and solving time
Number of Variables | 20
Number of Scenarios | 4000
Objective Value | -0.6890
Solving Time (sec) | 0.11
Solution in Run-File Environment
Input Files to run CS:
Output Files:
Solution in MATLAB Environment
Solved with PSG MATLAB function tbpsg_run (General (Text) Format of PSG in MATLAB):
Input Files to run CS:
Problem 2. Sum of splines approximation for a set of factors. A spline is built for each factor, and the sum of these splines is fitted to the dependent variable.
Maximize logexp_sum(spline_sum) (maximize Logarithms Exponents Sum applied to Spline Sum)
Calculate:
logexp_sum(spline_sum) (function Logarithms Exponents Sum applied to Spline Sum)
logistic(spline_sum) (function Logistic applied to Spline Sum)
L(spline_sum) (function L applied to Spline Sum)
where
logexp_sum = Logarithms Exponents Sum
spline_sum = Spline Sum; calculates spline values for every scenario as a function of the regression variables
logistic = Logistic; calculates values of the logistic function of the spline regression for every scenario
L = Linear Loss for Spline Sum
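The following Python sketch shows how a sum of per-factor splines can be turned into a class probability with the logistic function. All names here (eval_piece, spline_sum, logistic, the toy splines) are hypothetical stand-ins rather than PSG calls, and the 0.5 classification threshold is an assumption made for the example.

```python
import math
from bisect import bisect_right

def eval_piece(x, knots, coefs):
    """Piecewise polynomial: coefs[i] holds piece i, lowest degree first."""
    piece = bisect_right(knots, x)
    return sum(c * x ** p for p, c in enumerate(coefs[piece]))

def spline_sum(row, splines):
    """Sum of one spline per factor, evaluated on one scenario (row)."""
    return sum(eval_piece(x, knots, coefs)
               for x, (knots, coefs) in zip(row, splines))

def logistic(s):
    """Logistic function 1 / (1 + e^-s), written to avoid overflow."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

# Toy model with two factors (assumed values, not fitted).
splines = [
    ([0.0], [[0.0, -1.0], [0.0, 1.0]]),  # factor 1: |x|
    ([],    [[-1.0, 0.5]]),              # factor 2: -1 + 0.5 * x
]
row = [-2.0, 2.0]
score = spline_sum(row, splines)         # sum of the two spline values
prob = logistic(score)                   # estimated probability of class 1
label = 1 if prob >= 0.5 else 0          # assumed threshold of 0.5
```

Because the model is additive, each factor's spline can be inspected on its own to see how that factor shifts the score.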
Mathematical Problem Statement
Problem dimension and solving time
Number of Variables | 286
Number of Scenarios | 4000
Objective Value | -0.6781
Solving Time (sec) | 19.77
Solution in Run-File Environment
Input Files to run CS:
Output Files:
Solution in MATLAB Environment
Solved with PSG MATLAB function tbpsg_run (General (Text) Format of PSG in MATLAB):
Input Files to run CS:
Problem 3. Sum of splines approximation with 4-fold cross-validation (4 in-sample and 4 out-of-sample data sets).
4-fold crossvalidation
Maximize logexp_sum(spline_sum) (maximize Logarithms Exponents Sum applied to Spline Sum)
Calculate:
logistic(spline_sum) (function Logistic applied to Spline Sum on the in-sample data)
logistic(spline_sum) (function Logistic applied to Spline Sum on the out-of-sample data)
logexp_sum(spline_sum) (function Logarithms Exponents Sum applied to Spline Sum on the in-sample data)
logexp_sum(spline_sum) (function Logarithms Exponents Sum applied to Spline Sum on the out-of-sample data)
where
crossvalidation(N,Matrix) = matrix operation that splits the input Matrix into N pairs of complementary sub-matrices
logexp_sum = Logarithms Exponents Sum
spline_sum = Spline Sum; calculates spline values for every scenario as a function of the regression variables
logistic = Logistic; calculates values of the logistic function of the spline regression for every scenario
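The splitting step can be sketched in Python as follows. This is an assumption-laden illustration: the function crossvalidation_split is hypothetical, and it uses contiguous blocks of rows as the out-of-sample parts, which may not match how the PSG crossvalidation operation selects rows.

```python
def crossvalidation_split(n_folds, rows):
    """Split rows into n_folds (in_sample, out_of_sample) pairs.

    Each pair is complementary: the out-of-sample part is one block of
    rows and the in-sample part is everything else.
    """
    rows = list(rows)
    pairs = []
    for k in range(n_folds):
        start = k * len(rows) // n_folds
        stop = (k + 1) * len(rows) // n_folds
        out_of_sample = rows[start:stop]
        in_sample = rows[:start] + rows[stop:]
        pairs.append((in_sample, out_of_sample))
    return pairs

# 4-fold split of eight scenario indices.
pairs = crossvalidation_split(4, range(8))
first_in, first_out = pairs[0]
```

The model would then be fitted on each in-sample part and the objective evaluated on the matching out-of-sample part; a large gap between the two values suggests overfitting.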
Mathematical Problem Statement
Problem dimension and solving time
For one problem in Cross-validation:
Number of Variables | 286
Number of Scenarios | 4000
Objective Value | -0.6769
Solving Time (sec) | 7.48
Solution in Run-File Environment
Input Files to run CS:
Output Files:
Solution in MATLAB Environment
Solved with PSG MATLAB function tbpsg_run (General (Text) Format of PSG in MATLAB):
Input Files to run CS: