Stop the working directory insanity

There are packages for this now!

2017-08-03: Since I wrote this in 2014, the universe, specifically Kirill Müller (https://github.com/krlmlr), has provided better solutions to this problem. I now recommend that you use one of these two packages:

I love these packages so much I wrote an ode to here.

I use these packages now instead of what I describe below. I'll leave this gist up for historical interest. 😆

TL;DR

Include this in the .Rprofile in the top-level directory of an RStudio project:

    RPROJ <- list(PROJHOME = normalizePath(getwd()))
    attach(RPROJ)
    rm(RPROJ)

Then build paths like so:

file.path(PROJHOME, , )

and never worry about working directory again(?). Read on for the problem I am trying to solve.

Problem statement

My near-daily dilemma

So, what's the working directory going to be, folks? The above rules and needs admit no obvious solution. Is everyone else faffing around with working directory as much as I am?

Define the home directory for a Project

In the pre-RStudio era, I used to define a path object at the top of every file, whereAmI, and I constructed absolute paths based on that. I'm returning to this idea but want to upgrade the smarts, so the solution is more general. I think I've answered my own question.

I define the Project home directory to be the directory where the <project_name>.Rproj file sits.

Store Project home directory in an environment

I cannot believe I am using attach() but here goes.

Create a .Rprofile file in the Project home directory that includes these lines:

    RPROJ <- list(PROJHOME = normalizePath(getwd()))
    attach(RPROJ)

    cat("sourcing Project-specific .Rprofile\n")
    cat('retrieve the top-level Project directory at any time with PROJHOME or via get("PROJHOME", "RPROJ"):\n',
        get("PROJHOME", "RPROJ"), "\n")

    rm(RPROJ)

This creates a new environment on the search path, named RPROJ, containing an object PROJHOME that holds the normalized absolute path to Project home. Since the value is determined when the R session starts, this should work for different collaborators/machines/OSes. In theory.
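To see what those lines leave behind, here is a minimal sketch that runs the same three statements and then inspects the search path. It assumes nothing beyond base R, and it uses whatever the current working directory happens to be as a stand-in for Project home.

```R
# Same statements as in the .Rprofile; getwd() stands in for Project home
RPROJ <- list(PROJHOME = normalizePath(getwd()))
attach(RPROJ)   # copies the list into a new environment named "RPROJ"
rm(RPROJ)       # the list is no longer needed in the workspace

# The attached environment shows up by name on the search path
"RPROJ" %in% search()

# It holds exactly one object
ls("RPROJ")

# Ordinary lookup walks the search path and finds it
PROJHOME
```

The environment can be removed again with `detach("RPROJ")`, which is handy when experimenting.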

Always specify paths relative to Project home

The easiest way to retrieve the Project home is simply via PROJHOME, though in theory that could be masked by objects with the same name earlier in the search path. The most proper way to access is via get("PROJHOME", "RPROJ").
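A small sketch of the masking scenario, assuming a careless assignment in the workspace (the value "oops" is made up for illustration):

```R
# Set up the attached environment as in the .Rprofile
RPROJ <- list(PROJHOME = normalizePath(getwd()))
attach(RPROJ)
rm(RPROJ)

# Hypothetical accident: an object in the workspace reuses the name
PROJHOME <- "oops"

PROJHOME                   # the workspace object wins: "oops"
get("PROJHOME", "RPROJ")   # the attached environment is untouched

# Removing the mask makes the bare name resolve to Project home again
rm(PROJHOME)
PROJHOME
```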

Now I can build absolute-but-portable paths like so:

file.path(PROJHOME, , )
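For a concrete illustration, here is a sketch with made-up names: the subdirectory `data` and the file `raw_data.csv` are hypothetical, and PROJHOME is defined inline only so the snippet runs on its own (in practice it comes from the .Rprofile).

```R
# Stand-in for the value normally set in the Project's .Rprofile
PROJHOME <- normalizePath(getwd())

# "data" and "raw_data.csv" are hypothetical names for illustration
data_file <- file.path(PROJHOME, "data", "raw_data.csv")
data_file

# The path is absolute on this machine, yet nothing machine-specific
# is written in the script itself
basename(data_file)
```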

Here's what I see at R session start:

    sourcing Project-specific .Rprofile
    retrieve the top-level Project directory at any time with PROJHOME or via get("PROJHOME", "RPROJ"):
    /Users/jenny/path/to/my-project

The interactive workspace is how I left it the last time I worked on this Project; in particular, it's not cluttered up with PROJHOME. The working directory of R Console is also how I left it, though this suddenly becomes much less important and I think this work style should eliminate fussing around with working directory. I can clean out the workspace with rm(list = ls()) or RStudio's broom button without harming my ability to build robust paths.

R processes launched from outside RStudio

Added 2014-12-12, after using the above approach for a couple of months. Since people from #rrhack showed a glimmer of interest, I want to add this missing piece.

The above approach works if and only if path/to/my-project/.Rprofile is processed at R startup. When does that happen?

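What other situations are likely to arise in practice? When will path/to/my-project/.Rprofile not get processed, leaving PROJHOME undefined? One such case is launching Rscript from a subdirectory, because R looks for .Rprofile in the directory it is started from: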

~/path/to/my-project/code$ Rscript my_script.R  

My current solution: create an additional .Rprofile in any subdirectory that holds R scripts or RMarkdown files. Continuing the above example, we create path/to/my-project/code/.Rprofile. The only difference is that PROJHOME is specified as the parent of the working directory:

    ```R
    RPROJ <- list(PROJHOME = normalizePath(".."))
    attach(RPROJ)
    rm(RPROJ)
    ```
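A variation worth sketching, which is not what is described above: instead of hard-coding `".."` per subdirectory, climb upward from the working directory until a directory containing a `.Rproj` file is found. The helper name `find_proj_home` is hypothetical, and the demo builds a throwaway project under `tempdir()` so it can run anywhere.

```R
# Hypothetical helper: climb from `start` until a directory holding a
# *.Rproj file is found; stop with an error at the filesystem root.
find_proj_home <- function(start = getwd()) {
  dir <- normalizePath(start)
  repeat {
    if (length(list.files(dir, pattern = "\\.Rproj$")) > 0) {
      return(dir)
    }
    parent <- dirname(dir)
    if (identical(parent, dir)) {
      stop("no .Rproj file found at or above ", start)
    }
    dir <- parent
  }
}

# Demo on a throwaway project: my-project/my-project.Rproj and my-project/code/
proj <- file.path(tempdir(), "my-project")
dir.create(file.path(proj, "code"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(proj, "my-project.Rproj"))

# Starting from the code/ subdirectory still lands on the Project home
find_proj_home(file.path(proj, "code"))
```

With such a helper, the same `.Rprofile` body could be copied into any subdirectory at any depth by writing `RPROJ <- list(PROJHOME = find_proj_home())`.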

Similarity to path handling in Jekyll

Jekyll is a static website generator. It supports the construction of relative paths through the notion that a website has a root directory. Within files for individual webpages, the path to root can be specified via YAML frontmatter. This, in turn, allows the construction of paths relative to root. The rationale is to encourage use of relative paths over absolute and to make it easy to develop content before the entire directory structure of a site is fixed.

Example of YAML frontmatter specifying relative path to website root:

    ---
    title: My Page title
    root: "../"
    ---

and here's how links would be built within a page:

    <img src="{{ post.root }}images/happy.png" />
    <a href="{{ post.root }}2010/01/01/another_post">Relative link to another post</a>

The Project home directory PROJHOME is equivalent to Jekyll's website root directory post.root. The use of .Rprofile to define PROJHOME is equivalent to Jekyll's use of root: "../" in YAML frontmatter.

Example from this Stack Overflow thread.

This Stack Overflow thread is kind of relevant.

It was helpful to re-read the Environments chapter of Hadley's Advanced R book.

Should I just go ahead and set an environment variable, i.e. via Sys.setenv()?
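For comparison, a minimal sketch of what the environment-variable route might look like. Reusing the name PROJHOME for the variable is an assumption, as is the hypothetical `data` subdirectory.

```R
# Would go in the Project's .Rprofile in place of the attach() lines;
# the variable name PROJHOME is an assumption carried over from above
Sys.setenv(PROJHOME = normalizePath(getwd()))

# Retrieve it anywhere in the session
Sys.getenv("PROJHOME")

# Build a path from it; "data" is a hypothetical subdirectory
file.path(Sys.getenv("PROJHOME"), "data")
```

Two differences from the attached environment: the value cannot be masked by an object in the workspace, and it is inherited by child processes started from the session.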

Should I worry about where the RPROJ environment ends up in the search path?

This seems tied up with other issues, like building whole websites with rmarkdown, which currently also has a very "one directory to rule them all" approach (I'm looking at you _output.yaml, libs, include). It needs to be easier to designate a home directory for a project or website and then write paths relative to that. The way Jekyll works seems worth considering.