Excel process to find optimal range to maximize data
I have a spreadsheet with 5 columns, A, B, C, D and E, all the same length and all containing numerical data.
Columns A, B, and C all have values from -10 to +10, and column D holds the results. In column E I have an IF/AND formula that shows the corresponding result from D only if A, B, and C each fall within a range. Each range is set by separate cells that the IF/AND formula references.
I want to find a separate range for each of the 3 columns that will maximize the average in the adjacent column E. I have tried to use the evolutionary solver but it gives me different results every time. The other solver methods don't seem to work but I am new to the solver function so I may be using it wrong.
An example of the result I am looking for is: A must be between 0 and 5, B can be between -10 and +10, and C must be between -2 and 0. These criteria would maximize the average in column E.
I don't know whether the solver is the best approach here, but if anyone has advice on how to approach this problem without tedious trial and error, it would be very helpful.
My spreadsheet looks as follows:
A   | B  | C    | D  | E
1   | 3  | 4    | 6  | 6
3   | -5 | -0.2 | -2 | -2
0.5 | -1 | 2    | 1  | 1
2   | 4  | 6    | 2  | 2
-1  | 2  | 1    | 10 | 10

    | A   | B   | C
Max | 10  | 10  | 10
Min | -10 | -10 | -10

E count: 5
E avg: 3.4
A, B, C and D are all pasted values. E uses an IF statement to check whether the relevant columns (A, B, C) are within the specified max and min range. If A, B and C are all within the range given by the MAX and MIN rows, it puts the value of D in E; otherwise it leaves the cell blank. I was using the solver to change the values in the MAX / MIN rows to try to find the combination of max and min for A, B and C that results in the highest average for column E.
The only constraints I set were that the max and min must be within -10 to 10, that the minimum count is 10, and that max must be > min (in the actual spreadsheet the data goes on for 100 samples).
The way this is implemented makes this a nasty problem. The IF/AND operations make the objective highly nonlinear, discontinuous and non-differentiable, i.e. not very easy to optimize using traditional gradient-based methods. The evolutionary algorithm looks to me like the correct thing to try. By default it uses a different seed each time it is run, which is why you get different results every time. You can change that in the solver options by setting Random Seed to a positive number.
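To make the discontinuity concrete, here is a minimal Python sketch of what column E computes, using the five sample rows from the question. The function names (`column_e`, `average_e`) are made up for illustration, and treating the bounds as inclusive is an assumption, since the question does not say whether the formula uses `<=` or `<`.

```python
# Sample rows from the question: (A, B, C, D)
ROWS = [
    (1.0, 3.0, 4.0, 6.0),
    (3.0, -5.0, -0.2, -2.0),
    (0.5, -1.0, 2.0, 1.0),
    (2.0, 4.0, 6.0, 2.0),
    (-1.0, 2.0, 1.0, 10.0),
]


def column_e(rows, lo, hi):
    """Return the D values of rows whose A, B and C all lie in [lo, hi].

    lo and hi are 3-element sequences (one bound per column A, B, C).
    Inclusive bounds are an assumption about the spreadsheet formula.
    """
    return [r[3] for r in rows
            if all(lo[j] <= r[j] <= hi[j] for j in range(3))]


def average_e(rows, lo, hi):
    """Return (average of column E, count of non-blank E cells)."""
    kept = column_e(rows, lo, hi)
    if not kept:
        return None, 0
    return sum(kept) / len(kept), len(kept)


if __name__ == "__main__":
    lo = (-10.0, -10.0, -10.0)
    # Moving A's max across the data value 3 drops one row at once,
    # so the average jumps instead of changing smoothly.
    for a_max in (10.0, 3.0, 2.999):
        print(a_max, average_e(ROWS, lo, (a_max, 10.0, 10.0)))
```

Between data values the average is completely flat, and it jumps whenever a bound crosses a data value. That gives a gradient-based method nothing to follow.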
I think this problem can be solved as a mixed-integer linear programming (MIP) problem. It requires a little bit of work, but it would make it possible to find guaranteed optimal solutions. I'll see if I can try this out.
The mathematical model is somewhat complex as we need to do several linearizations. Here is what I have:
- the variable b indicates if a data cell is too low or too high
- the variable delta indicates if a data row has all its values inside the bounds
- the calculation of the mean is complicated as we divide a variable by a variable, so we need to linearize
- we added a constraint so that we can require a minimum number K of selected rows
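As a cross-check on any solver result, here is a brute-force sketch in Python. It is not the MIP model described above but a swapped-in exhaustive search, and the function name `best_box` and the inclusive bounds are assumptions. It relies on the fact that any set of bounds can be shrunk to the nearest data values without changing which rows are selected, so only data values need to be tried as min/max candidates. The cost grows roughly with the sixth power of the number of rows, so it suits small samples like the one in the question rather than the full 100-row sheet.

```python
from itertools import combinations_with_replacement, product

# Sample rows from the question: (A, B, C, D)
ROWS = [
    (1.0, 3.0, 4.0, 6.0),
    (3.0, -5.0, -0.2, -2.0),
    (0.5, -1.0, 2.0, 1.0),
    (2.0, 4.0, 6.0, 2.0),
    (-1.0, 2.0, 1.0, 10.0),
]


def best_box(rows, min_count=1):
    """Exhaustively search min/max bounds for columns A, B, C.

    Returns (best average of D, number of selected rows, bounds), where
    bounds is ((minA, maxA), (minB, maxB), (minC, maxC)), or None if no
    choice of bounds selects at least min_count rows.
    """
    # Candidate bounds per column: the distinct data values, sorted.
    values = [sorted({r[j] for r in rows}) for j in range(3)]
    # All (min, max) pairs with min <= max for each column.
    ranges = [list(combinations_with_replacement(v, 2)) for v in values]

    best = None
    for box in product(*ranges):
        kept = [r[3] for r in rows
                if all(box[j][0] <= r[j] <= box[j][1] for j in range(3))]
        if len(kept) < min_count:
            continue  # violates the minimum-count constraint K
        avg = sum(kept) / len(kept)
        if best is None or avg > best[0]:
            best = (avg, len(kept), box)
    return best


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, best_box(ROWS, min_count=k))
```

On the five sample rows, `min_count=1` selects only the row with D = 10, `min_count=2` gives an average of 8 (the rows with D = 6 and D = 10), and `min_count=5` forces all rows in for the original average of 3.4.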
Some results for different K:
This dataset is not so interesting. It would be more interesting if increasing the size of the min/max window suddenly caused low-value points to enter the solution.