Spark: groupBy with conditions

I have a groupBy on a DataFrame based on 3 columns. I am doing something like this:

myDf.groupBy($"col1", $"col2", $"col3")

However, I am not sure exactly how this works.

Is it case-insensitive? For each column I need "FOO" and "foo" to be treated as the same value, and likewise "" and null.

If this is not the default behavior, how can I add it? From the API docs I can see something about apply on a column, but I could not find any example.

Any idea?

Answers


You can call functions inside your groupBy expression. In this case it sounds like you will want to convert the strings to lower case when grouping. Check out the lower function:

https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.sql.functions$
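A minimal sketch of that idea, combining lower (for case-insensitivity) with coalesce (to fold null into "") inside the groupBy. The column names, the count aggregation, and the spark session name are assumptions for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lower, coalesce, lit, count}

val spark = SparkSession.builder().appName("groupby-normalized").getOrCreate()
import spark.implicits._  // enables the $"col" syntax

// Hypothetical DataFrame with the three grouping columns from the question.
// lower(...) makes "FOO" and "foo" group together; coalesce(..., lit(""))
// maps null to "" so null and "" land in the same group.
val grouped = myDf.groupBy(
  lower(coalesce($"col1", lit(""))).as("col1"),
  lower(coalesce($"col2", lit(""))).as("col2"),
  lower(coalesce($"col3", lit(""))).as("col3")
).agg(count("*").as("cnt"))
```

Note that the normalized expressions become the output columns of the grouped result, so the aliases keep the original names.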

