Spark: groupBy with conditions
I have a groupBy on a DataFrame that is based on 3 columns. I am doing something like this:
myDf.groupBy($"col1", $"col2", $"col3")
However, I am not sure how this works.
Is it case-insensitive? For each column I need "FOO" and "foo" to be considered the same, and likewise "" and null.
If this is not the default behavior, how can I add it? In the API docs I can see something about apply on a column, but I could not find any example.
You can run functions inside of your groupBy statement, so by default the grouping is exact: "FOO" and "foo" (and "" and null) land in different groups. In this case you will want to convert the strings to lower case when you are grouping. Check out the lower function; you can combine it with coalesce to map null to "" so those two are also grouped together.
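A minimal sketch of what that could look like, assuming a local SparkSession and a made-up DataFrame (myDf and the column names are just placeholders from the question):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lower, coalesce, lit}

object GroupByNormalized {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("groupby-normalized")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample data: "FOO"/"foo" and null/"" should end up in the same group
    val myDf = Seq(("FOO", "a", null: String), ("foo", "a", "")).toDF("col1", "col2", "col3")

    // Normalize a key column: replace null with "" and lower-case it
    def normalized(name: String) = lower(coalesce(col(name), lit("")))

    val grouped = myDf
      .groupBy(normalized("col1"), normalized("col2"), normalized("col3"))
      .count()

    grouped.show()
    spark.stop()
  }
}
```

With the two sample rows above, both normalize to ("foo", "a", ""), so the result is a single group with count 2. Note that lower(null) is null in Spark, so the coalesce is what makes null and "" collapse into one group.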