Building an R Package with LLMs: Insights and Observations
I just watched a YouTube video by Jared P. Lander on building an R package with the help of a Large Language Model (LLM). The video highlights several intriguing aspects of using LLMs for programming tasks, particularly in R.
Basic Function Creation
The initial task involved creating a function that counts different file types in a specified folder. When the model was instructed with the prompt, “write a function whose input is a path to a folder and the output is a count of each of the different file types in the folder,” it produced a function using base R.
Here’s where things get interesting: the function not only performs the task but also includes error handling, checking if the given path exists. It’s essential to note that LLMs often default to base R, even when the prompt doesn’t specify that as a requirement. This bias could lead to inefficiencies and missed opportunities for leveraging more modern packages like the Tidyverse.
Key feature of the function:
- If no value is supplied for the file-types argument, the function detects this and avoids erroring out. That kind of robustness is crucial in production environments (a sketch of such a function follows below).
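To make this concrete, here is a minimal sketch of what such a base R function might look like. This is illustrative only, not the exact code produced in the video, and the function and argument names are my own guesses:

```r
# Count files of each type in a folder (base R sketch)
count_file_types <- function(path, types = NULL) {
  # Guard against a non-existent folder, as the LLM's version did
  if (!dir.exists(path)) {
    stop("The specified path does not exist: ", path)
  }

  files <- list.files(path)
  extensions <- tools::file_ext(files)

  # If specific file types were requested, keep only those
  if (!is.null(types)) {
    extensions <- extensions[extensions %in% types]
  }

  # Base R idiom: return a named vector of counts per extension
  table(extensions)
}
```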
Transition to Tidyverse
The next step involved rewriting the function in the Tidyverse paradigm. The model switched from `list.files()` to `fs::dir_ls()`, showcasing its adaptability. However, it retained the older anonymous function syntax rather than the newer `~` formula notation in its map calls, an important distinction for R developers concerned with code clarity and conciseness.
The output changed to a tibble, in keeping with Tidyverse standards. The earlier named-vector output is idiomatic base R but can be less intuitive, particularly for those used to tibbles.
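A rough sketch of what the Tidyverse rewrite might look like (again illustrative, not verbatim from the video):

```r
library(fs)
library(dplyr)

# Tidyverse sketch: same task, but returning a tibble of counts
count_file_types_tidy <- function(path) {
  if (!dir_exists(path)) {
    stop("The specified path does not exist: ", path)
  }

  tibble::tibble(file = dir_ls(path, type = "file")) |>
    mutate(extension = path_ext(file)) |>
    count(extension, name = "n")
}
```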
Optimizing Code Efficiency
When prompted to use regex instead of iterating through each file type, the LLM made the necessary changes to streamline the code. It utilized the `glob` pattern argument effectively, showcasing the strength of pattern matching for file types over iterating through each type separately, an excellent example of efficient coding practice:
The time complexity dropped from $O(n \cdot m)$ to $O(n)$, where $n$ is the number of files and $m$ is the number of specified file types.
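As a sketch of the single-pass idea, one can build a single regular expression from the requested extensions and let `fs::dir_ls()` filter in one pass (the `regexp` argument is one way to do this; the video's exact pattern and argument choice may differ, and the extensions here are illustrative):

```r
# Match several extensions in one regex pass instead of
# calling dir_ls() once per file type
count_selected_types <- function(path, types = c("R", "Rmd", "csv")) {
  pattern <- paste0("\\.(", paste(types, collapse = "|"), ")$")
  files <- fs::dir_ls(path, regexp = pattern, type = "file")
  table(fs::path_ext(files))
}
```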
Documentation with Roxygen
Once the core functionality was established, the next phase involved writing documentation for the function in Roxygen format. The model succeeded, albeit with minor omissions in the explicit annotations. It correctly wrapped the examples in `\dontrun{}` to avoid executing examples that might not work on CRAN.
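For reference, a Roxygen header for this kind of function would look roughly like the following (a hedged sketch; the actual tags and wording in the video may differ):

```r
#' Count File Types in a Folder
#'
#' Counts how many files of each extension exist in a given folder.
#'
#' @param path Path to a folder.
#' @param types Optional character vector of file extensions to keep.
#'
#' @return A named vector (or tibble) of counts per file extension.
#' @export
#'
#' @examples
#' \dontrun{
#' count_file_types("~/projects")
#' }
count_file_types <- function(path, types = NULL) {
  # body as sketched earlier
}
```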
Test Writing Challenges
Development then matured into testing. The model produced tests, but with inconsistencies in naming conventions (switching between camel case and snake case), suggesting the lack of a stable programming style. This inconsistency can dilute the integrity of a codebase:
- Example test function naming:
  - Original: `countFileTypes()`
  - After LLM: `count_file_types()`

These differences can lead to confusion, especially when collaborating with other developers who might have strong preferences for camel case or snake case.
After additional prompts, tests were generated that, while functional, didn't fully cover the various output types or check for the expected classes. Class checks for outputs such as tibbles versus data frames can provide additional robustness. Nonetheless, the LLM's guidance on testing structures like setup and teardown provided a solid pathway for managing temporary files.
Final Steps: Packaging
As it wrapped up development, the LLM provided practical steps for packaging the R functions. It generated candidate names and a description for the package, allowing the creator to finalize the project efficiently. However, it also suggested unnecessary dependencies, which points to the model's limitations in keeping track of a growing codebase.
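The standard workflow for these steps looks roughly like this with `usethis` and `devtools` (a sketch of the usual tooling, not the exact commands from the video; the package name is hypothetical):

```r
usethis::create_package("filecounter")  # hypothetical package name
usethis::use_r("count_file_types")      # add the function's source file
usethis::use_package("fs")              # declare runtime dependencies
usethis::use_package("dplyr")
usethis::use_testthat()                 # set up the tests/ directory
devtools::document()                    # render the Roxygen docs
devtools::check()                       # run R CMD check
```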
Findings and Reflections
- Speed vs. Accuracy: The LLM demonstrates an exceptional ability to generate functional R code quickly. However, as complexity increases, it requires human intervention to ensure code fidelity and efficiency.
- Style Consistency: The model lacks a firm grasp of style guidelines. This inconsistency can lead to issues in collaborative environments where coding style is a significant concern.
- Iterative Improvement: The iterative cycle of testing and refining functions showed how LLMs can aid the development process, but also how human oversight is crucial in catching errors. In particular, the capacity of LLMs to adapt to feedback remains a valuable asset.
- Documentation and Testing: Automated documentation generation is beneficial, but attention to specifics like input-output formats is essential. The generated tests can be improved to ensure they align with expected outputs, which is critical in any agile development cycle.
In summary, leveraging LLMs in function creation can expedite the coding process, but these outcomes must be tempered with a human touch to guide stylistic choices, ensure accuracy, and refine output. The implications of such collaborations will likely shape how we design and develop in R moving forward.