Converting Code into Reusable Functions in R for Easier Maintenance and Repetition Reduction

Converting Code into a Function in R

=====================================================

As data scientists and analysts, we often find ourselves working with complex code to extract relevant information from various sources. In this blog post, we’ll explore how to convert your code into a function in R, making it easier to reuse and maintain.

Introduction to Functions in R


In R, a function is a block of code that can be executed multiple times with different inputs. Functions are essential for organizing your code, reducing repetition, and improving readability. In this section, we’ll introduce the basics of functions in R and how to create them.

Creating a Function in R

To create a function in R, you use the function keyword followed by a parameter list and a body, and assign the result to a name with <-. The parameters are the input values that will be passed to the function when it’s called.

my_function <- function(input) {
  # code here
}

In the example above, we create a function named my_function that takes one parameter, input.
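As a minimal sketch of a function with a real body, the hypothetical to_percent below (not part of the original example) formats a proportion as a percentage string. It shows a default argument, and it relies on the fact that R returns the value of the last evaluated expression, so an explicit return() is optional.

```r
# Hypothetical helper, for illustration only:
# format a proportion such as 0.256 as a percentage string.
to_percent <- function(x, digits = 1) {
  # round() and paste0() are vectorised, so x may hold several values
  pct <- round(x * 100, digits)
  paste0(pct, "%")  # last expression, so this is the return value
}

single <- to_percent(0.256)                        # uses the default digits = 1
several <- to_percent(c(0.5, 0.0375), digits = 2)  # overrides the default
print(single)
print(several)
```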

Calling a Function

To call a function in R, you simply type its name followed by parentheses containing any necessary input values.

result <- my_function(5)
print(result)

In this example, we call my_function with an input value of 5 and store the result in a variable named result. Because the placeholder body above contains only a comment, result is NULL until real code is added.
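Arguments can be matched by position or by name, and one definition can be reused for many inputs. The sketch below uses a hypothetical add_fee function, invented here purely to show the calling conventions.

```r
# Hypothetical function used only to illustrate calling conventions.
add_fee <- function(amount, fee = 10) {
  amount + fee
}

by_position  <- add_fee(100, 25)                # matched by position
by_name      <- add_fee(fee = 25, amount = 100) # matched by name, order is free
with_default <- add_fee(100)                    # fee falls back to its default

# Apply the same function to several inputs in one go
many <- sapply(c(100, 200, 300), add_fee)
print(many)
```

Naming the arguments costs a few extra characters but makes calls easier to read once a function has more than one or two parameters.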

Converting Code into a Function


Now that we’ve covered the basics of functions in R, let’s explore how to convert your code into a function. The running example is a script that uses the stringr and dplyr packages to pull figures out of text extracted from PDF files.

Step 1: Identify the Main Logic

The main logic takes the first PDF file name in the working directory, extracts the account number from the leading digits of that name, and then pulls the relevant figures out of the document text with regular expressions.

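library(stringr)  # provides str_extract() and str_detect()
files <- list.files(pattern = "pdf$")[1]  # first PDF file name, NA if none
acc_num <- str_extract(files, "^\\d+")    # leading digits of the file name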
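To see what the account-number pattern does without needing any PDFs on disk, here is a small sketch with hypothetical file names (the real names come from list.files()). It assumes, as the original code does, that the account number is the run of digits at the start of the name.

```r
library(stringr)

# Hypothetical file names, standing in for the output of list.files()
example_files <- c("1002345_offer_letter.pdf", "offer_letter.pdf")

# "^\\d+" matches one or more digits anchored at the start of the string
acc_nums <- str_extract(example_files, "^\\d+")
print(acc_nums)
```

The second name has no leading digits, so str_extract() yields NA for it. The same happens when the working directory contains no PDF at all, because indexing an empty vector with [1] gives NA.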

Step 2: Define Regular Expressions

We’ll define four regular expressions to extract the relevant information:

protec_per_reg <- "Protected\\sP\\w+\\sof"
Arr_Fee_reg <- "^The\\sArrangement\\sF\\w+"
Fix_inter_reg <- "Fixed\\sI\\w+\\sR\\w+"
Bench_rate_reg <- "Benchmark\\sR\\w+\\sthat"
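A quick way to check patterns like these is to run them against a few lines of text. The lines below are hypothetical, written to resemble the wording the patterns appear to target; the real document text may differ.

```r
library(stringr)

protec_per_reg <- "Protected\\sP\\w+\\sof"
Arr_Fee_reg <- "^The\\sArrangement\\sF\\w+"

# Hypothetical lines of extracted text
sample_text <- c(
  "Protected Period of 24 months",
  "The Arrangement Fee is 1500.00",
  "Note: The Arrangement Fee is waived"
)

prot_hits <- str_detect(sample_text, protec_per_reg)
# The leading ^ anchors the match to the start of the line,
# so the "Note: ..." line is not a hit
arr_hits <- str_detect(sample_text, Arr_Fee_reg)
print(prot_hits)
print(arr_hits)
```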

Step 3: Create a Function

We’ll create a function named dummyFunc that takes a data frame with page_id and text columns as input and performs the main logic:

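library(dplyr)  # provides %>% and filter()

dummyFunc <- function(df) {
  # Account number from the first PDF file name (computed but not returned)
  files <- list.files(pattern = "pdf$")[1]
  acc_num <- str_extract(files, "^\\d+")

  protec_per_reg <- "Protected\\sP\\w+\\sof"
  Arr_Fee_reg <- "^The\\sArrangement\\sF\\w+"
  Fix_inter_reg <- "Fixed\\sI\\w+\\sR\\w+"
  Bench_rate_reg <- "Benchmark\\sR\\w+\\sthat"

  # Keep page 3 rows whose text matches any pattern; inside filter(),
  # columns are referenced by name rather than as df$text
  Off_let <- df %>% filter(page_id == 3, str_detect(text, protec_per_reg) |
                             str_detect(text, Arr_Fee_reg) | str_detect(text, Fix_inter_reg) |
                             str_detect(text, Bench_rate_reg))

  # Numbers with at least two digits, such as "24" or "3.75"
  off_let_num <- str_extract(Off_let$text, "\\d+\\.?\\d+")

  # Where nothing matched, fall back to a percentage such as "5%" from the same row
  no_match <- is.na(off_let_num)
  off_let_num[no_match] <- str_extract(Off_let$text[no_match], "\\d+%")
  return(off_let_num)
}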

Step 4: Test the Function

We’ll test our function by passing a sample data frame to it:

dummyFunc(Off_let_data)
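The article does not show Off_let_data itself. As a sketch, assuming it holds one row per line of extracted text with page_id and text columns, the block below builds a hypothetical sample and walks through the same filtering and extraction steps outside the function, so each intermediate result can be inspected. The fallback to a percentage is applied row by row here.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for Off_let_data
Off_let_data <- data.frame(
  page_id = c(3, 3, 3, 3, 3, 2),
  text = c(
    "Protected Period of 24 months",
    "The Arrangement Fee is 1500.00",
    "Fixed Interest Rate 3.75 per annum",
    "Benchmark Rate that applies is 5%",
    "Unrelated line 999",
    "Fixed Interest Rate 9.99 on another page"
  ),
  stringsAsFactors = FALSE
)

# Keep page 3 rows that match at least one of the four patterns
wanted <- Off_let_data %>%
  filter(
    page_id == 3,
    str_detect(text, "Protected\\sP\\w+\\sof") |
      str_detect(text, "^The\\sArrangement\\sF\\w+") |
      str_detect(text, "Fixed\\sI\\w+\\sR\\w+") |
      str_detect(text, "Benchmark\\sR\\w+\\sthat")
  )

# First pass: numbers with at least two digits
nums <- str_extract(wanted$text, "\\d+\\.?\\d+")

# Second pass: for rows still NA, look for a percentage in that same row
missing <- is.na(nums)
nums[missing] <- str_extract(wanted$text[missing], "\\d+%")
print(nums)
```

The unrelated line and the page 2 row are filtered out, and the single-digit "5%" is only picked up by the second pass because the first pattern requires at least two digits.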

Extending the Function


To make our function more flexible and reusable, we can add parameters for the values that are currently hard-coded: the regular expressions and the page number. Let’s create an extended version of the function named dummyFuncExt.

Step 1: Define Additional Parameters

We’ll collect the four regular expressions defined earlier into a list that can be passed to the function:

regexprlist <- list(protec_per_reg, Arr_Fee_reg,
                    Fix_inter_reg, Bench_rate_reg)

Step 2: Modify the Function Logic

We’ll modify the function logic to use the new parameters: regexp, a list of four patterns, and page_id, the page to search:

dummyFuncExt <- function(df, regexp, page_id) {
  files <- list.files(pattern = "pdf$")[1]
  acc_num <- str_extract(files, "^\\d+")

  # .env$page_id is the function argument; a bare page_id on both sides
  # would compare the column with itself and keep every page
  Off_let <- df %>% filter(page_id == .env$page_id, str_detect(text, regexp[[1]]) |
                             str_detect(text, regexp[[2]]) | str_detect(text, regexp[[3]]) |
                             str_detect(text, regexp[[4]]))

  off_let_num <- str_extract(Off_let$text, "\\d+\\.?\\d+")

  # Where nothing matched, fall back to a percentage from the same row
  no_match <- is.na(off_let_num)
  off_let_num[no_match] <- str_extract(Off_let$text[no_match], "\\d+%")
  return(off_let_num)
}

Step 3: Test the Extended Function

We’ll test our extended function by passing it the sample data frame, the list of patterns, and the page number:

dummyFuncExt(Off_let_data, regexprlist, 3)
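One further step, sketched here as a possible variation rather than as part of the original code, is to stop assuming exactly four patterns. The hypothetical extract_numbers below joins however many patterns it receives into a single alternation with paste(), and uses the .data and .env pronouns so that column names and function arguments cannot be confused inside filter(). The sample data frame is also invented for the example.

```r
library(dplyr)
library(stringr)

# Hypothetical variant that accepts any number of patterns
extract_numbers <- function(df, patterns, page) {
  # Join the patterns into one alternation: "p1|p2|..."
  combined <- paste(unlist(patterns), collapse = "|")

  matched <- df %>%
    filter(.data$page_id == .env$page, str_detect(.data$text, combined))

  nums <- str_extract(matched$text, "\\d+\\.?\\d+")
  missing <- is.na(nums)
  nums[missing] <- str_extract(matched$text[missing], "\\d+%")
  nums
}

# Hypothetical sample data
sample_df <- data.frame(
  page_id = c(3, 3, 2),
  text = c(
    "Fixed Interest Rate 3.75 per annum",
    "Benchmark Rate that applies is 5%",
    "Fixed Interest Rate 9.99 on another page"
  ),
  stringsAsFactors = FALSE
)

two_patterns <- list("Fixed\\sI\\w+\\sR\\w+", "Benchmark\\sR\\w+\\sthat")
page3 <- extract_numbers(sample_df, two_patterns, page = 3)
page2 <- extract_numbers(sample_df, two_patterns, page = 2)
print(page3)
print(page2)
```

With this shape, adding a fifth pattern means appending to the list rather than editing the function body.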

Conclusion

In this article, we’ve covered how to create functions in R and convert your code into a function. We’ve also explored how to extend the function by turning hard-coded values into parameters. By following these steps, you can write reusable functions in R that simplify your workflow and reduce repetition.


Last modified on 2023-06-15