(original) (raw)

On Fri, Nov 10, 2017 at 6:13 AM, Rafael Avila de Espindola <rafael.espindola@gmail.com> wrote:
Currently a power failure or other hard crash can cause lld leave a temporary
file around. The same is true for other llvm tools.

As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and
restart the run a few times. You will get

t.tmp43a735a t.tmp4deeabb t.tmp9bacdd3 t.tmpe4115c4 t.tmpeb01fff

The same would happen if there was a fatal error between the
FileOutputBuffer creation and commit. I don't think that is a code path
where that is possible right now, but it would be an easy thing to miss
in a code review.

I was hopping the OS could help us manage the temporary file so that
there was no way to accidentally leave it behind.

On linux there is O\_TMPFILE, which allows us to create a file with no
name in the file system. A name can be given with linkat. Unfortunately
we can't use

linkat(fd, "", AT\_FDCWD, "destination", AT\_EMPTY\_PATH)

Without special permissions and have instead to depend on proc:

linkat(AT\_FDCWD, "/proc/self/fd/", AT\_FDCWD, "destination",
AT\_SYMLINK\_FOLLOW)

I often execute lld in a chroot environment in which /proc is not mounted, and I expect lld would work just fine in that environment. So the presence of /proc should be considered optional even on Linux.

Another annoyance is that linkat will not replace the destination and
renameat2 doesn't support AT\_EMPTY\_PATH. The result is that we have to
use unlink+linkat and loop.

I think you can avoid unlink+linkat. You can instead give a temporary file name to an unnamed file and then rename the temporary file the destination file. There's a small chance between linkat and rename, but that race condition is probably better than unlink+linkat.

On windows there is FILE\_FLAG\_DELETE\_ON\_CLOSE, but there seems to be no
way to cancel it. If a file is created with it I can rename it, but it
is still deleted in the end.

I couldn't find any support for this on FreeBSD.

This suggest that we cannot just have a createUniqueEntity that returs
just an FD. We will need a class that stores a temporary name too for
the systems where we cannot give a filename to a FD.

I have attached the patch I got so far, but given the existing
restrictions I would say it is not worth it.

It will try to just make it more obvious that lld's FileOutputBuffer is
deleted on all code paths.


diff --git a/include/llvm/Support/FileSystem.h b/include/llvm/Support/FileSystem.h
index 03015a0ca3b..5d80db1d5a2 100644
\--- a/include/llvm/Support/FileSystem.h
+++ b/include/llvm/Support/FileSystem.h
@@ -31,6 +31,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Twine.h"
#include "llvm/Support/Chrono.h"
+#include "llvm/Support/Error.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/ErrorOr.h"
#include "llvm/Support/MD5.h"
@@ -694,6 +695,26 @@ std::error\_code createUniqueFile(const Twine &Model, int &ResultFD,
std::error\_code createUniqueFile(const Twine &Model,
SmallVectorImpl &ResultPath);

+// Represents a temporary file and hides the name. This allows not even having
+// no name when O\_TMPFILE is supported.
+class TempFile {
\+ // Temporary name used when O\_TMPFILE is not avilable.
\+ std::string TmpName;
+
+public:
\+ TempFile(StringRef Name, int FD);
\+ TempFile(TempFile &&Other);
+
\+ // The open file descriptor.
\+ int FD;
+
\+ Error keep(const Twine &Name);
\+ \~TempFile();
+};
+
+Expected createUniqueFile(const Twine &Directory,
\+ unsigned Mode = all\_read | all\_write);
+
/// @brief Create a file in the system temporary directory.
///
/// The filename is of the form prefix-random\_chars.suffix. Since the directory
diff --git a/lib/Support/FileOutputBuffer.cpp b/lib/Support/FileOutputBuffer.cpp
index db8ff38e5b5..af61af3c32e 100644
\--- a/lib/Support/FileOutputBuffer.cpp
+++ b/lib/Support/FileOutputBuffer.cpp
@@ -34,9 +34,9 @@ using namespace llvm::sys;
// with the temporary file on commit().
class OnDiskBuffer : public FileOutputBuffer {
public:
\- OnDiskBuffer(StringRef Path, StringRef TempPath,
\+ OnDiskBuffer(StringRef Path, fs::TempFile Temp,
std::unique\_ptrfile\_region> Buf)
\- : FileOutputBuffer(Path), Buffer(std::move(Buf)), TempPath(TempPath) {}
\+ : FileOutputBuffer(Path), Buffer(std::move(Buf)), Temp(std::move(Temp)) {}

uint8\_t \*getBufferStart() const override { return (uint8\_t \*)Buffer->data(); }

@@ -51,21 +51,18 @@ public:
Buffer.reset();

// Atomically replace the existing file with the new one.
\- auto EC = fs::rename(TempPath, FinalPath);
\- sys::DontRemoveFileOnSignal(TempPath);
\- return errorCodeToError(EC);
\+ return Temp.keep(FinalPath);
}

\~OnDiskBuffer() override {
// Close the mapping before deleting the temp file, so that the removal
// succeeds.
Buffer.reset();
\- fs::remove(TempPath);
}

private:
std::unique\_ptrfile\_region> Buffer;
\- std::string TempPath;
\+ fs::TempFile Temp;
};

// A FileOutputBuffer which keeps data in memory and writes to the final
@@ -110,13 +107,13 @@ createInMemoryBuffer(StringRef Path, size\_t Size, unsigned Mode) {

static ExpectedOnDiskBuffer>>
createOnDiskBuffer(StringRef Path, size\_t Size, unsigned Mode) {
\- // Create new file in same directory but with random name.
\- SmallString<128> TempPath;
\- int FD;
\- if (auto EC = fs::createUniqueFile(Path + ".tmp%%%%%%%", FD, TempPath, Mode))
\- return errorCodeToError(EC);
\-
\- sys::RemoveFileOnSignal(TempPath);
\+ StringRef Directory = path::parent\_path(Path);
\+ if (Directory.empty())
\+ Directory = ".";
\+ Expected FileOrErr = fs::createUniqueFile(Directory, Mode);
\+ if (!FileOrErr)
\+ return FileOrErr.takeError();
\+ fs::TempFile File = std::move(\*FileOrErr);

#ifndef LLVM\_ON\_WIN32
// On Windows, CreateFileMapping (the mmap function on Windows)
@@ -124,18 +121,18 @@ createOnDiskBuffer(StringRef Path, size\_t Size, unsigned Mode) {
// extend the file beforehand. \_chsize (ftruncate on Windows) is
// pretty slow just like it writes specified amount of bytes,
// so we should avoid calling that function.
\- if (auto EC = fs::resize\_file(FD, Size))
\+ if (auto EC = fs::resize\_file(File.FD, Size))
return errorCodeToError(EC);
#endif

// Mmap it.
std::error\_code EC;
auto MappedFile = llvm::make\_uniquefile\_region>(
\- FD, fs::mapped\_file\_region::readwrite, Size, 0, EC);
\- close(FD);
\+ File.FD, fs::mapped\_file\_region::readwrite, Size, 0, EC);
if (EC)
return errorCodeToError(EC);
\- return llvm::make\_unique<OnDiskBuffer>(Path, TempPath, std::move(MappedFile));
\+ return llvm::make\_unique<OnDiskBuffer>(Path, std::move(File),
\+ std::move(MappedFile));
}

// Create an instance of FileOutputBuffer.
diff --git a/lib/Support/Path.cpp b/lib/Support/Path.cpp
index 9692acb5283..d3a356cecde 100644
\--- a/lib/Support/Path.cpp
+++ b/lib/Support/Path.cpp
@@ -18,8 +18,12 @@
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Process.h"
+#include "llvm/Support/Signals.h"
#include
#include
+#include
+#include
+#include

#if !defined(\_MSC\_VER) && !defined(\_\_MINGW32\_\_)
#include
@@ -759,6 +763,73 @@ std::error\_code createUniqueFile(const Twine &Model,
return createUniqueEntity(Model, Dummy, ResultPath, false, 0, FS\_Name);
}

+TempFile::TempFile(StringRef Name, int FD) : TmpName(Name), FD(FD) {}
+TempFile::TempFile(TempFile &&Other) {
\+ TmpName = std::move(Other.TmpName);
\+ FD = Other.FD;
\+ Other.FD = -1;
+}
+
+TempFile::\~TempFile() {
\+ // FIXME: error handling. Move this to a discard method?
\+ if (FD != -1)
\+ close(FD);
\+ if (!TmpName.empty())
\+ fs::remove(TmpName);
\+ sys::DontRemoveFileOnSignal(TmpName);
+}
+
+Error TempFile::keep(const Twine &Name) {
\+ if (TmpName.empty()) {
\+ // We can't do it atomically.
\+ // FIXME: should we loop?
\+ if (std::error\_code EC = fs::remove(Name))
\+ return errorCodeToError(EC);
\+ SmallString<128> Storage;
\+ StringRef NameP = Name.toNullTerminatedStringRef(Storage);
\+ std::string ProcFD = "/proc/self/fd/" + std::to\_string(FD);
\+ if (linkat(AT\_FDCWD, ProcFD.c\_str(), AT\_FDCWD, NameP.begin(),
\+ AT\_SYMLINK\_FOLLOW) == -1) {
\+ std::error\_code EC(errno, std::generic\_category());
\+ return errorCodeToError(EC);
\+ }
\+ } else {
\+ if (std::error\_code EC = fs::rename(TmpName, Name))
\+ return errorCodeToError(EC);
\+ sys::DontRemoveFileOnSignal(TmpName);
\+ TmpName = "";
\+ }
+
\+ if (close(FD) == -1) {
\+ std::error\_code EC(errno, std::generic\_category());
\+ return errorCodeToError(EC);
\+ }
\+ FD = -1;
+
\+ return Error::success();
+}
+
+Expected createUniqueFile(const Twine &Directory, unsigned Mode) {
+#if 1
\+ SmallString<128> Storage;
\+ StringRef P = Directory.toNullTerminatedStringRef(Storage);
\+ int FD = open(P.begin(), O\_RDWR | O\_TMPFILE | O\_CLOEXEC, Mode);
\+ if (FD == -1) {
\+ std::error\_code EC(errno, std::generic\_category());
\+ return errorCodeToError(EC);
\+ }
\+ return TempFile("", FD);
+#else
\+ int FD;
\+ SmallString<128> ResultPath;
\+ if (std::error\_code EC =
\+ createUniqueFile(Directory + "/tmp%%%%%%%", FD, ResultPath, Mode))
\+ return errorCodeToError(EC);
\+ sys::RemoveFileOnSignal(ResultPath);
\+ return TempFile(ResultPath, FD);
+#endif
+}
+
static std::error\_code
createTemporaryFile(const Twine &Model, int &ResultFD,
llvm::SmallVectorImpl &ResultPath, FSEntity Type) {
diff --git a/lib/Support/Unix/Path.inc b/lib/Support/Unix/Path.inc
index 2ecb97316c8..3099b75a1b8 100644
\--- a/lib/Support/Unix/Path.inc
+++ b/lib/Support/Unix/Path.inc
@@ -779,6 +779,8 @@ std::error\_code openFileForWrite(const Twine &Name, int &ResultFD,

int OpenFlags = O\_CREAT;

\+ // add O\_TMPFILE?
+
#ifdef O\_CLOEXEC
OpenFlags |= O\_CLOEXEC;
#endif


Cheers,
Rafael