[llvm-dev] Contributing a new sanitizer for pointer casts

Stephen Kell via llvm-dev llvm-dev at lists.llvm.org
Tue Apr 25 06:54:12 PDT 2017


Hi all,

Some of you might remember that at EuroLLVM last year in Barcelona, Chris Diamand and I gave a talk about Clang/libcrunch, a run-time checking system which can be thought of as another flavour of sanitizer. It checks pointer casts, using run-time type information. Roughly, the check is that the pointer really points to an instance of the target type, with refinements to accommodate various idioms that would otherwise violate it. <http://www.llvm.org/devmtg/2016-03/#presentation9>
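To make the idea of the check concrete, here is a minimal sketch in C. It is not libcrunch's actual implementation or API: the names toy_register, toy_is_a and toy_demo, and the registry keyed by type-name strings, are hypothetical and exist only for illustration. The sketch also ignores the refinements mentioned above (for example, it only recognises pointers to the start of a registered object).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy registry: maps an object's start address to the
 * name of the type it was allocated as. Illustration only. */
struct toy_alloc_record {
    const void *base;
    const char *type_name;
};

#define TOY_MAX_RECORDS 16
static struct toy_alloc_record toy_records[TOY_MAX_RECORDS];
static size_t toy_record_count;

/* Record that the object at `base` has type `type_name`.
 * Returns 1 on success, 0 if the registry is full. */
int toy_register(const void *base, const char *type_name)
{
    assert(type_name != NULL);
    if (toy_record_count == TOY_MAX_RECORDS)
        return 0;
    toy_records[toy_record_count].base = base;
    toy_records[toy_record_count].type_name = type_name;
    toy_record_count++;
    return 1;
}

/* Return 1 if `p` is the start of an object registered as `type_name`,
 * 0 otherwise. Unknown pointers count as failures in this sketch. */
int toy_is_a(const void *p, const char *type_name)
{
    for (size_t i = 0; i < toy_record_count; i++) {
        if (toy_records[i].base == p)
            return strcmp(toy_records[i].type_name, type_name) == 0;
    }
    return 0;
}

struct point  { int x, y; };
struct widget { double weight; };

/* Checks two casts of the same pointer; returns the number that fail. */
int toy_demo(void)
{
    static struct point pt = { 1, 2 };
    void *generic = &pt;
    int failures = 0;

    toy_register(&pt, "struct point");

    /* A cast to (struct point *) would be fine: the object is a point. */
    if (!toy_is_a(generic, "struct point"))
        failures++;

    /* A cast to (struct widget *) would be flagged: wrong target type. */
    if (!toy_is_a(generic, "struct widget")) {
        failures++;
        fprintf(stderr, "toy check: %p is not a struct widget\n", generic);
    }
    return failures;
}
```

Calling toy_demo() reports one failed check, for the cast to struct widget. In an instrumenting compiler, a call along the lines of toy_is_a would be emitted at each pointer cast rather than written by hand.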

(I dropped a mention of this in the recent TBAA sanitizer thread, but consensus was that on balance it's a different enough tool to want both.)

My current research funding has some room for tech transfer activity, so I've been spending some time on improving the code, with a hope of eventually contributing it to LLVM.

This mail is just to get a handle on two questions: how much interest is there in this, and what changes are most important in order to get something contributable?

The system is a bit complex, so let me give you an overview of how it currently works. (If you want full technical details, there are a couple of research papers you could read -- see the bottom.)

Currently, my plan in a nutshell is to eliminate the C inline helpers in favour of fully IR-level instrumentation, and also eliminate the compiler wrapper in favour of a gold plugin (and maybe a bit of help in the clang driver). This should result in a contributable diff that adds a new sanitizer option (currently "-fsanitize=crunch", but name negotiable :-). Binaries built this way will also require the gold plugin and runtime (both out-of-tree) to do useful checking.

I don't intend to port the runtime. Although in principle this could share code with the sanitizer runtimes, that's a lot of work and I don't have the resources to tackle it right now. Barring major rewrites, the runtime pretty much has to be GPL-licensed anyway, since it borrows code from glibc and Xen (for purposes I'm pretty sure are not covered by the sanitizer runtimes).

So my questions for you are whether this contribution would be welcome, and in particular whether there are any red lines about how to do instrumentation, how to factor everything, and how to deal with the external dependencies. As I currently envisage things, the gold plugin must live out-of-tree since it will require my libraries to build; I don't believe equivalent library support exists within LLVM. Keeping the plugin out-of-tree doesn't seem a huge loss, given that the runtime will be out-of-tree too.

Oh, and runtime support exists for x86-64/Linux only at the moment, though there is a bit of code for FreeBSD.

For the interested, here are the research papers I mentioned.

"Dynamically diagnosing run-time type errors in unsafe code" (OOPSLA '16) http://www.cl.cam.ac.uk/~srk31/#oopsla16a

"Towards a dynamic object model within Unix processes" (Onward! '15) http://www.cl.cam.ac.uk/~srk31/#onward15

Code: <https://github.com/stephenrkell/liballocs> <https://github.com/stephenrkell/libcrunch> <https://github.com/stephenrkell/clangcrunch>.

All thoughts appreciated... let me know if you see any obstacles to contribution, or if you're able to help, or just if you have questions. Much obliged,

Stephen.


