Some String Functions in R, String Manipulation in R
By Güngör Budak
I have programmed in Perl, Python, and PHP before, and string manipulation felt more direct and easier in them than in R. Still, R has useful functions for string manipulation. I'm not an expert in R, but I've been working with it for a while and have picked up some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as comma-separated arguments, plus an optional separator (sep). The default separator is a single space. It returns the result as a new string and leaves the input strings unchanged.
str1 <- "Gungor"
str2 <- "Budak"
paste(str1, str2, sep = "_") # Output: [1] "Gungor_Budak"
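paste also works element by element on vectors, and it has a close relative, paste0, that uses no separator at all. Here is a small sketch of both, plus the collapse argument for joining a whole vector into one string. The variable names (first, last, and so on) are only made up for illustration.

```r
# Hypothetical example vectors
first <- c("Gungor", "Ada")
last <- c("Budak", "Lovelace")

# Element-wise concatenation with the default separator (one space)
full <- paste(first, last)

# paste0 concatenates without any separator
ids <- paste0("id_", 1:3)

# collapse joins the elements of one vector into a single string
one_line <- paste(first, collapse = ", ")

print(full)     # [1] "Gungor Budak" "Ada Lovelace"
print(ids)      # [1] "id_1" "id_2" "id_3"
print(one_line) # [1] "Gungor, Ada"
```

Note that sep goes between the arguments, while collapse goes between the elements of the result.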
More on the documentation for paste
Extract character(s) from strings
This can be done with the substr function. It takes the string and the start and stop indices for the extraction; indices start at 1 and both ends are included. It returns the result as a new string and leaves the original string unchanged.
str1 <- "Gungor"
substr(str1, 1, 3) # Output: [1] "Gun"
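Two related things I find handy: combining substr with nchar to count from the end of the string, and using substr on the left side of an assignment to overwrite part of a copy. This is only a minimal sketch; last3 and str2 are names I picked for the example.

```r
str1 <- "Gungor"

# nchar gives the length, so we can take the last three characters
n <- nchar(str1)
last3 <- substr(str1, n - 2, n)

# Replacement form: overwrite the first character of a copy
str2 <- str1
substr(str2, 1, 1) <- "g"

print(last3) # [1] "gor"
print(str2)  # [1] "gungor"
print(str1)  # [1] "Gungor" (the original is untouched)
```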
More on the documentation for substr
Split characters from strings
One function for this is strsplit, which takes the string and the character(s) to split on as arguments. It returns a list with one element per input string, and each element is a character vector of the pieces. To get the pieces directly as a vector, you can use unlist.
str1 <- "Gungor"
unlist(strsplit(str1, "ng")) # Output: [1] "Gu" "or"
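The list return value makes more sense once you pass several strings at once. The sketch below shows that, and also uses fixed = TRUE, because strsplit treats the split argument as a regular expression by default and a character like "." would otherwise match any character. The input strings here are made up for the example.

```r
# Splitting two strings at once gives a list with two elements
parts <- strsplit(c("a.b.c", "d.e"), ".", fixed = TRUE)

# Splitting on the empty string gives the individual characters
chars <- strsplit("Gungor", "")[[1]]

print(parts[[1]]) # [1] "a" "b" "c"
print(parts[[2]]) # [1] "d" "e"
print(chars)      # [1] "G" "u" "n" "g" "o" "r"
```

With a single input string, taking [[1]] does the same job as unlist.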
More on the documentation for strsplit
Check if character(s) are present in strings
Here, you can use the grep family of functions. grepl takes a pattern and a string as arguments and returns TRUE if the pattern matches and FALSE otherwise.
str1 <- "Gungor"
grepl("A", str1) # Output: [1] FALSE
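grepl is vectorized and case-sensitive by default, and the pattern is a regular expression. Here is a rough sketch of those three points, along with grep, which returns the positions of the matching elements instead of TRUE/FALSE. The vector called names is just an example.

```r
# Hypothetical example vector
names <- c("Gungor", "Budak")

# One result per element
has_u <- grepl("u", names)

# Matching is case-sensitive unless ignore.case = TRUE
has_b <- grepl("b", names)
has_b_any_case <- grepl("b", names, ignore.case = TRUE)

# The pattern is a regular expression: ^ anchors at the start
starts_gun <- grepl("^Gun", names)

# grep returns the indices of matching elements
idx <- grep("^Gun", names)

print(has_u)          # [1] TRUE TRUE
print(has_b)          # [1] FALSE FALSE
print(has_b_any_case) # [1] FALSE  TRUE
print(starts_gun)     # [1]  TRUE FALSE
print(idx)            # [1] 1
```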
More on the documentation for grep
Replace character(s) in strings
gsub is the function I use for this. It takes a pattern, a replacement, and a string as arguments. It replaces every match of the pattern and returns the changed string directly.
str1 <- "Gungor"
gsub("ng", "gn", str1) # Output: [1] "Gugnor"
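gsub has a sibling, sub, that replaces only the first match. The sketch below compares the two, and shows fixed = TRUE for when the pattern should be taken literally rather than as a regular expression. The example strings are my own.

```r
str1 <- "Gungor Budak"

# sub replaces only the first match
first_only <- sub("u", "U", str1)

# gsub replaces every match
all_matches <- gsub("u", "U", str1)

# fixed = TRUE treats "." as a literal dot, not "any character"
dashed <- gsub(".", "-", "a.b", fixed = TRUE)

print(first_only)  # [1] "GUngor Budak"
print(all_matches) # [1] "GUngor BUdak"
print(dashed)      # [1] "a-b"
```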
More on the documentation for gsub
These are the ones I use most, but there are many more and I will be adding them to this post in the future.