Loop through all Combinations of Transformations of DataFrame Columns

For a regression problem, I want to try out several transformations (log, exp, sqrt, **2, custom transformation) on the columns of a pandas.DataFrame df.

If df has columns A, B, and Y, how can I write a loop that fits a regression for every possible combination of transformations applied to the columns of df?

For example:

sm.ols(formula="Y ~ np.log(A) + B", data=df).fit()
sm.ols(formula="Y ~ np.log(A) + np.log(B)", data=df).fit()
sm.ols(formula="Y ~ np.log(A) + np.exp(B)", data=df).fit()
sm.ols(formula="Y ~ np.exp(A) + B", data=df).fit()
sm.ols(formula="Y ~ np.exp(A) + np.exp(B)", data=df).fit()
...
sm.ols(formula="transform1(Y) ~ transform1(A) + transform1(B)", data=df).fit()

Answers


Create a list of the variables and their transformations, and use itertools.product to generate every combination that takes one transformation per variable:

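import itertools

# One tuple per column: the untransformed column followed by its transformations.
variables = [('A', 'np.log(A)', 'np.exp(A)'), ('B', 'np.log(B)', 'np.exp(B)')]
for combination in itertools.product(*variables):
    sm.ols(formula="Y ~ {0} + {1}".format(combination[0], combination[1]), data=df).fit()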
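
The question also asks about transforming Y and about more than two predictors. One way to generalize the same idea is sketched below. It only builds the formula strings, so it runs without pandas or statsmodels. The helper name build_formulas and the particular list of templates are assumptions made for illustration, not part of the original answer.

```python
import itertools

# Candidate transformations; "{}" is replaced by the column name.
# This particular list is an illustrative assumption, extend it as needed.
templates = ["{}", "np.log({})", "np.exp({})", "np.sqrt({})", "I({} ** 2)"]

def build_formulas(response, predictors, templates):
    """Return one formula string per combination of transformations."""
    # One list of transformed terms per column, with the response first.
    columns = [response] + list(predictors)
    options = [[t.format(col) for t in templates] for col in columns]
    formulas = []
    for lhs, *rhs in itertools.product(*options):
        formulas.append("{} ~ {}".format(lhs, " + ".join(rhs)))
    return formulas

formulas = build_formulas("Y", ["A", "B"], templates)
print(len(formulas))   # 5 ** 3 = 125
print(formulas[0])     # Y ~ A + B
```

Each string can then be passed to sm.ols(formula=..., data=df).fit() as in the question. The number of models grows as the number of transformations raised to the number of columns, and log and sqrt only make sense for columns whose values are in their domain, so it may be worth filtering the list before fitting.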
