Rename the columns after split in R
Hi I am aware that there are similar questions, but the solutions did not seem to address my problem, so I wonder if anyone may help.
I have a large data frame, inside which there is a column like this:
result
A, B-C
A, C-D
E, F-G
...
I managed to split the column into three using:
df$new_result <- str_match(df$result, "^(.*),(.*)-(.*)$")[,-1]
Now part of the data frame looks like:
result    new_result.1  new_result.2  new_result.3
A, B-C    A             B             C
A, C-D    A             C             D
E, F-G    E             F             G
...
However, when I tried to call:
R gave me an error stating that "new_result.1" could not be found.
I have tried the following, but neither attempt worked.
with(df, colsplit(df$result, pattern = "^(.*),(.*)-(.*)$", names = c('a', 'b', 'c')))
names(df)[names(df) == 'new_result.1'] <- 'a'
I think the problem is that "new_result.1", "new_result.2", and "new_result.3" cannot be found in the data frame; instead, they are referred to together as "new_result". Any idea how I can separate them so that I can later refer to the columns individually? Thanks!
Following your approach, when we look at 'str(df)' we get this:
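> str(df)
'data.frame':	3 obs. of  2 variables:
 $ result    : chr  "A, B-C" "A, C-D" "E, F-G"
 $ new_result: chr [1:3, 1:3] "A" "A" "E" " B" ...
This is not surprising: str_match returns a matrix, so the assignment stores the whole matrix as a single column named new_result. The names new_result.1, new_result.2 and new_result.3 only appear when the data frame is printed; they are not actual column names, which is why they cannot be found or renamed.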
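To see this for yourself, here is a minimal sketch that rebuilds the three example rows from the question (the small `df` below is an assumption standing in for your real data) and inspects the matrix column:

```r
library(stringr)

# Toy data frame assumed to mirror the example rows in the question
df <- data.frame(result = c("A, B-C", "A, C-D", "E, F-G"),
                 stringsAsFactors = FALSE)

# Same assignment as in the question: a 3 x 3 character matrix
# ends up stored as ONE column of the data frame
df$new_result <- str_match(df$result, "^(.*),(.*)-(.*)$")[, -1]

names(df)                 # only "result" and "new_result"
is.matrix(df$new_result)  # the column itself is a matrix

# The pieces are still reachable by indexing into the matrix column
first_part <- df$new_result[, 1]
```

So `df$new_result[, 1]` works, while `df$new_result.1` does not, because no column with that name exists.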
An approach to fix this is the following:
Create a 'splitted' dataframe with relevant column names
splitted <- data.frame(str_match(df$result, "^(.*),(.*)-(.*)$")[,-1], stringsAsFactors=F)
colnames(splitted) <- paste0("new_result.",1:ncol(splitted))
And cbind everything together
df <- cbind(df,splitted)
> str(df)
'data.frame':	3 obs. of  4 variables:
 $ result      : chr  "A, B-C" "A, C-D" "E, F-G"
 $ new_result.1: chr  "A" "A" "E"
 $ new_result.2: chr  " B" " C" " F"
 $ new_result.3: chr  "C" "D" "G"
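Note that the second piece keeps its leading space (" B"), because the pattern captures everything after the comma. If that is not wanted, one possible variant (a sketch only; the `\\s*` in the pattern and the names 'a', 'b', 'c' taken from the question are my assumptions about what you are after) is to consume the whitespace in the regex and assign the final names directly:

```r
library(stringr)

# Toy data frame assumed to mirror the example rows in the question
df <- data.frame(result = c("A, B-C", "A, C-D", "E, F-G"),
                 stringsAsFactors = FALSE)

# \\s* swallows the space after the comma so it is not captured
parts <- str_match(df$result, "^(.*),\\s*(.*)-(.*)$")[, -1]

# Turn the matrix into real, individually named columns
splitted <- as.data.frame(parts, stringsAsFactors = FALSE)
names(splitted) <- c("a", "b", "c")

df <- cbind(df, splitted)
str(df)
```

After this, `df$a`, `df$b` and `df$c` are ordinary character columns and can be referred to or renamed individually.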