Automatic extraction of function bodies from software binaries
Related papers
Estimation of similarity between functions extracted from x86 executable files
Serbian Journal of Electrical Engineering, 2015
Comparison of functions is required in various domains of software engineering. In most domains, comparison is done using source code, but in some, such as license-violation or malware analysis, only binary code is available. The goal of this paper is to evaluate whether an existing solution designed for the ARM architecture can be applied to the x86 architecture. The existing solution encompasses multiple approaches; for the purposes of this paper, three representative approaches are implemented: two based on machine learning, and a third that requires no prior knowledge. Results show that the best recalls obtained for the first ten positions on the two architectures are comparable and do not differ significantly. The results confirm that adapting all approaches of the existing solution is not only possible but also promising, and they represent an adequate basis for future research.
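To make the evaluation metric above concrete, here is a minimal sketch (not from the paper) of per-query recall over the first ten ranked positions; the candidate rankings and ground-truth matches are hypothetical.

```python
def recall_at_k(ranked_candidates, true_match, k=10):
    """Per-query recall: 1.0 if the true matching function appears
    among the top-k ranked candidates, else 0.0."""
    return 1.0 if true_match in ranked_candidates[:k] else 0.0

# Hypothetical (ranking, true match) pairs for two query functions:
queries = [
    (["sub_4011a0", "sub_402f30", "memcpy_like_1"], "memcpy_like_1"),
    (["sub_9910c0", "sub_9804a0", "sub_977f00"], "sub_401000"),  # miss
]
mean_recall = sum(recall_at_k(r, t) for r, t in queries) / len(queries)
print(mean_recall)  # 0.5 -- averaged over queries, this is the recall@10 such studies report
```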
A lightweight framework for function name reassignment based on large-scale stripped binaries
Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2021
Software in the wild is usually released as stripped binaries that contain no debug information (e.g., function names). This paper studies the problem of reassigning descriptive names to functions in order to facilitate reverse engineering. Since the essence of this problem is a data-driven prediction task, persuasive research should be based on sufficiently large-scale and diverse data. However, prior studies were limited to small-scale datasets because their techniques rely on heavyweight binary analysis, leaving them powerless in the face of large binaries at scale. This paper presents the Neural Function Rename Engine (NFRE), a lightweight framework for function name reassignment that utilizes both the sequential and the structural information of assembly code. NFRE uses fine-grained and easily acquired features to model assembly code, making it more effective and efficient than existing techniques. In addition, we construct a large-scale dataset and present two data-preprocessing approaches to improve its usability. Benefiting from the lightweight design, NFRE can be trained efficiently on the large-scale dataset and therefore generalizes better to unknown functions. Comparative experiments show that NFRE outperforms two existing techniques by relative improvements of 32% and 16%, respectively, while its time cost for binary analysis is much lower.
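The abstract does not spell out what "fine-grained and easily acquired features" look like; a common lightweight choice is to normalize instruction operands so a sequence model sees a small token vocabulary. The sketch below illustrates that general idea under assumed token rules; it is not NFRE's actual feature design.

```python
import re

def normalize_instruction(ins: str) -> str:
    """Map one x86 assembly instruction to a coarse feature token:
    immediates become IMM, memory operands become MEM, registers stay."""
    ins = re.sub(r"0x[0-9a-fA-F]+", "IMM", ins)  # hex immediates
    ins = re.sub(r"\[[^\]]*\]", "MEM", ins)      # memory operands
    ins = re.sub(r"\b\d+\b", "IMM", ins)         # decimal immediates
    return ins

# Hypothetical function body; the token stream would feed a sequence model
# that predicts name tokens such as ["read", "config"].
body = ["push rbp", "mov rbp, rsp", "mov eax, [rbp-0x8]", "add eax, 0x10", "ret"]
print([normalize_instruction(i) for i in body])
# ['push rbp', 'mov rbp, rsp', 'mov eax, MEM', 'add eax, IMM', 'ret']
```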
Reuse-oriented reverse engineering of functional components from x86 binaries
Proceedings of the 36th International Conference on Software Engineering (ICSE), 2014
Locating, extracting, and reusing the implementation of a feature within an existing binary program is challenging. This paper proposes a novel algorithm to identify modular functions corresponding to such features and to provide usable interfaces for the extracted functions. We represent a desired feature with two executions that both exercise the feature but with different inputs. Instead of reverse engineering the interface of a function, we wrap the existing interface and provide a simpler, more intuitive interface for the function through concretization and redirection. Experiments show that our technique can extract varied features from several real-world applications, including a malicious one.
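A minimal sketch of the two-execution idea: basic blocks executed by both feature-exercising runs are a first approximation of the feature's code, and subtracting a baseline run (an extra assumption here, not stated in the abstract) removes shared startup code. The traces are hypothetical.

```python
def feature_blocks(trace_a: set, trace_b: set, baseline: set) -> set:
    """Blocks common to two feature-exercising traces, minus blocks
    also executed when the feature is not used."""
    return (trace_a & trace_b) - baseline

# Hypothetical basic-block address traces:
run1 = {0x401000, 0x401020, 0x401400, 0x401440}  # feature, input A
run2 = {0x401000, 0x401030, 0x401400, 0x401440}  # feature, input B
base = {0x401000, 0x401020, 0x401030}            # feature not exercised
print(sorted(hex(b) for b in feature_blocks(run1, run2, base)))
# ['0x401400', '0x401440']
```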
Automatic Retargeting of Binary Utilities for Embedded Code Generation
IEEE Computer Society Annual Symposium on VLSI (ISVLSI '07), 2007
Contemporary SoC design involves the proper selection of cores from a reference platform. Such selection implies design exploration of alternative CPUs, which requires the generation of binary code for each possible target. However, the embedded-computing market shows a broad spectrum of instruction-set architectures, ranging from microcontrollers to RISCs and ASIPs. As a consequence, binary utilities cannot always rely on pre-existing tools within standard packages. Moreover, manually retargeting every binary utility is not acceptable under time-to-market pressure. This paper describes a technique for the automatic generation of binary utilities from an abstract model of the target CPU, which can be synthesized from an arbitrary ADL. The technique is based upon two key mechanisms: model provision for tool generation (at the front-end) and automatic library modification (at the back-end). To illustrate the technique's effectiveness, we describe the generation of assemblers, linkers, and disassemblers. We have successfully compared the files produced by the generated tools to those produced by conventional tools. Moreover, to give proper evidence of retargetability, we present results for MIPS, SPARC, PowerPC, and i8051.
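To make the generation idea concrete, here is a hedged, toy sketch of a table-driven disassembler whose instruction table is exactly the kind of artifact that could be emitted from an ADL model of the CPU; the two-byte, three-instruction ISA is invented for illustration.

```python
# Toy instruction table for a hypothetical fixed-width 2-byte ISA.
# In the paper's setting, such a table would be generated from the ADL model.
OPCODES = {
    0x01: ("add", 2),  # (mnemonic, number of register operands)
    0x02: ("ld",  2),
    0x03: ("jmp", 1),
}

def disassemble(code: bytes) -> list:
    """Decode fixed-width instructions laid out as [opcode][operand byte],
    where the operand byte packs up to two 4-bit register indices."""
    out = []
    for i in range(0, len(code), 2):
        mnemonic, nregs = OPCODES[code[i]]
        operand = code[i + 1]
        regs = [f"r{(operand >> 4) & 0xF}", f"r{operand & 0xF}"][:nregs]
        out.append(f"{mnemonic} " + ", ".join(regs))
    return out

print(disassemble(bytes([0x01, 0x12, 0x03, 0x20])))
# ['add r1, r2', 'jmp r2']
```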
Learning to Find Usages of Library Functions in Optimized Binaries
IEEE Transactions on Software Engineering, 2022
Much software, whether beneficent or malevolent, is distributed only as binaries, sans source code. Absent source code, understanding binaries' behavior can be quite challenging, especially when they are compiled under higher levels of compiler optimization. These optimizations can transform comprehensible, "natural" source constructions into something entirely unrecognizable. Reverse engineering binaries, especially those suspected of being malevolent or guilty of intellectual property theft, is an important and time-consuming task. There is a great deal of interest in tools that "decompile" binaries back into more natural source code to aid reverse engineering. Decompilation involves several desirable steps, including recreating source-language constructions, variable names, and perhaps even comments. One central step in creating binaries is optimizing function calls, using transformations such as inlining. Recovering these (possibly inlined) function calls from optimized binaries is an essential task that most state-of-the-art decompilers attempt but do not perform very well. In this paper, we evaluate a supervised learning approach to the problem of recovering optimized function calls. We leverage open-source software and develop an automated labeling scheme to generate a reasonably large dataset of binaries labeled with actual function usages. We augment this large but limited labeled dataset with a pre-training step, which learns decompiled-code statistics from a much larger unlabeled dataset. Thus augmented, our learned labeling model can be combined with an existing decompilation tool, Ghidra, to achieve substantially improved performance in function call recovery, especially at higher levels of optimization.
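The pre-training step described above is, in broad strokes, a masked-token objective over decompiled code. A minimal sketch of preparing such training pairs follows; the tokenization, mask rate, and mask symbol are assumptions for illustration, not the paper's exact setup.

```python
import random

MASK = "<MASK>"

def make_pretraining_pair(tokens, mask_rate=0.15, rng=random.Random(0)):
    """Randomly mask tokens of decompiled code; a model is pre-trained
    to recover the original tokens from the masked sequence."""
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            masked.append(MASK)
            targets.append(tok)   # loss computed here
        else:
            masked.append(tok)
            targets.append(None)  # no loss on unmasked positions
    return masked, targets

# Hypothetical Ghidra-style decompiled-code token stream:
decompiled = "uVar1 = strlen ( param_1 ) ;".split()
masked, targets = make_pretraining_pair(decompiled)
print(masked)
print(targets)
```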
Towards automatic program partitioning
Proceedings of the 6th ACM Conference on Computing Frontiers (CF), 2009
There is a trend towards using accelerators to increase performance and energy efficiency of general-purpose processors. Adoption of accelerators, however, depends on the availability of tools to facilitate programming these devices.
Component Identification Through Program Slicing
Electronic Notes in Theoretical Computer Science, 2006
This paper reports on the development of slicing techniques specific to functional programs and their use in identifying coherent component candidates within monolithic code. An associated tool is also introduced. This piece of research is part of a broader project on program understanding and re-engineering of legacy code supported by formal methods.
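As a rough illustration of slicing at the level of top-level definitions (a simplification of what the paper does for functional programs), a component candidate can be taken as everything a seed definition transitively depends on. The dependency graph below is hypothetical.

```python
def slice_from(seed: str, deps: dict) -> set:
    """Transitive closure of 'uses' edges from a seed definition:
    a crude component/slice candidate."""
    component, stack = set(), [seed]
    while stack:
        d = stack.pop()
        if d not in component:
            component.add(d)
            stack.extend(deps.get(d, ()))
    return component

# Hypothetical definition-level dependency graph of a functional program:
deps = {
    "render": {"layout", "format"},
    "layout": {"measure"},
    "format": set(),
    "parse":  {"lex"},
}
print(sorted(slice_from("render", deps)))
# ['format', 'layout', 'measure', 'render']  -- 'parse'/'lex' are excluded
```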
Extracting Classes from Routine Calls in Legacy Software
Extracting object-oriented design from procedural code is an important issue in software maintenance. Existing research in this direction puts a heavy burden on the experts of the system being studied. To automate the process, we propose a new method for clustering together routines that are semantically related. The method is based on routine-call analysis. Experiments on a subset of the system we are studying (23 KLOC) are discussed; they give very promising results.

While maintaining legacy applications, a large portion of software engineers' effort is spent trying to understand the program and its data [11]. To help software engineers in this task, we have built a tool that lets them easily browse through the code and find what they are looking for. One component of this browsing tool is an "object-oriented browser": a browser that presents the (procedural) code as a hierarchy of classes. A number of researchers have tried to migrate p...
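A hedged sketch of the routine-call idea: routines that share many callees are candidates to belong to the same class. The similarity measure (Jaccard over callee sets), the greedy grouping, and the threshold are assumptions for illustration, not the paper's exact method.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap of two callee sets, 0.0 when both are empty."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_routines(calls: dict, threshold: float = 0.5):
    """Greedy single-link grouping of routines whose callee sets overlap."""
    clusters = []
    for r in calls:
        for c in clusters:
            if any(jaccard(calls[r], calls[o]) >= threshold for o in c):
                c.add(r)
                break
        else:
            clusters.append({r})
    return clusters

# Hypothetical routine -> callee-set map extracted from legacy code:
calls = {
    "acct_open":  {"lock", "read_rec", "log"},
    "acct_close": {"lock", "write_rec", "log"},
    "ui_draw":    {"blit", "font"},
}
print(cluster_routines(calls))
# e.g. [{'acct_open', 'acct_close'}, {'ui_draw'}] (set print order may vary)
```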