buffer: add Buffer.from(), Buffer.alloc() and Buffer.allocUnsafe(), soft-deprecate Buffer(num) by jasnell · Pull Request #4682 · nodejs/node (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation231 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

jasnell

Node-EPS: nodejs/node-eps#7
Ref: #4660,

Note: PR description updated to reflect new naming and implementation details

@jasnell jasnell added wip

Issues and PRs that are still a work in progress.

buffer

Issues and PRs related to the buffer subsystem.

semver-major

PRs that contain breaking changes and should be released in the next major version.

labels

Jan 14, 2016

@silverwind

+1 on the current approach, but can we please find another name? safe and unsafe might be immediately clear in the context of the current issue, but imagine someone who is unfamiliar reading the docs. How about something like

Buffer.allocate(num[, unsafe])

where unsafe is a boolean. Maybe even find a better name for the argument.

@jasnell

-1 on Buffer.allocate(num[, unsafe])

Part of the motivation for this change is to make the choice between using
a prefilled buffer or not more explicit and obvious. We don't achieve that
using a single method with an optional Boolean.

We can bikeshed the names but safe and unsafe are concise and to the
point.
On Jan 13, 2016 9:57 PM, "silverwind" notifications@github.com wrote:

+1 on the approach, but can we please find another name? safe and unsafe
might be immediately clear in the context of the current issue, but imagine
someone fresh reading the docs. How about something like

Buffer.allocate(num[, unsafe])

where unsafe is a boolean.


Reply to this email directly or view it on GitHub
#4682 (comment).

@jasnell

@ChALkeR

@jasnell I might miss the point of this a bit, but just to be sure:

  1. This commit features hard deprecation and does not touch the docs at all. The proposed solution was soft (doc only) deprecation at first.
  2. The deprecation message tells people to use «Buffer.safe or Buffer.unsafe» with no warnings against the latter. The latter should be either removed from this notice or reworded, see Buffer(number) is unsafe #4660 (comment) and Buffer(number) is unsafe #4660 (comment).
  3. Buffer.safe(number) probably needs new tests, including those that make sure that it's zero-filled.

@saghul

Just a quick note/question: with this approach a safe Buffer would be basically built with malloc + memset, right? (I'm looking here)

In order to make safe buffers faster it might be a good idea to use calloc, since the zeroing could essentially be free in some cases. Repeating the benchmarks @mscdex did afterwards would be nice.

Related question: is there a reason why the ArrayBuffer here gets the data allocated right above instead of just passing the size and letting it use its constructor which would already use calloc if the zero-fill flag was set? (see here).

PS: I just glanced throught the code out of curiosity, I might be totally off-base here. Sorry for the noise if that is the case.

@ChALkeR

@silverwind
-1 on Buffer.allocate(num[, unsafe]).

Several reasons:

  1. That is a bit harder to grep and to automatically find all the places where you (or someone else) is using unitialized Buffers.
  2. That flag changes the behaviour of a function in an unobvious way (I have seen people failing to understand all impications of using unitialized buffers).
  3. The names for these functions should better be self-describing.
  4. Merging two different methods with different behaviour into a one that changes it's behaviour depending on a flag isn't a good design move.
  5. (Pretty important). The documentation on those methods should be separate, and it's very important. While Buffer.safe could be documented as usually, Buffer.unsafe documentation should describe all the dangers of using unitialized Buffers.

@ChALkeR

Once more notice: as it looks now, Buffer.safe completely ignores the pooled behaviour. What impications would that have on performance of Buffer.safe(number) vs Buffer.unsafe(number).fill(0)?

/cc @trevnorris

@rvagg

if someone wants to quickly do some benchmarks for us, you can grab N|Solid and run the buffer benchmarks. It's based on the latest v4.x and comes with a "policies" feature that you can use to zero-fill by default—it switches out the malloc with a calloc when enabled, see: https://docs.nodesource.com/nsolid/1.2/docs/policies. The performance profile for this would be very close to what we'd end up with by doing this by default in core because it's basically the same thing.

@jasnell

@ChALkeR ... This PR is still a work in process so some pieces are still missing or not quite complete. It's intended to facilitate more constructive conversation about what changes we need to make. I fully intend to iterate on it more and add the necessary docs.

@jasnell

@saghul ... with the approach currently implemented by this PR, calling Buffer.safe() allocates a new Buffer using calloc (see https://github.com/nodejs/node/pull/4682/files#diff-196d056a936b6d2649721eb639e0442bR90). Calling createBuffer() with flags[kNoZeroFill] set to 0 will cause the ArrayBufferAllocator::Allocate() function in node.cc to use calloc(size,1) as opposed to malloc.

Currently, when Node starts up, a large buffer pool is created. New Buffer instances created using Buffer(num) are allocated off that pool and the memory is NOT zeroed out each time. Calling Buffer.safe() bypasses this pool and ensures that a new zero-filled allocation is created each time. You don't get the performance benefits of the pool but it's safer overall. And calling calloc should be a bit faster than calling fill(0) after the memory is already allocated, but that needs to be benchmarked (I'm sure @trevnorris could say for sure).

I'll be looking for ways of making this more efficient but given that the intent is to provide a more obvious way of creating "safe" buffers, going straight to calloc each time likely makes the most sense. We'll see tho.

@saghul

@jasnell

Looking at the implementation for the --safe-buffers command line flag now... there are a couple of design questions. The intent of --safe-buffers is to force all newly created Buffers to be zero filled automatically. Given that, there are three approaches I can take with this:

  1. Still use the slab buffer pool and fill(0) all Buffers after they are created. This works ok for smallish Buffers but could have significant performance impact on large buffers.
  2. Skip the slab buffer pool and simply calloc all new Buffers. This is more efficient than calling fill(0) but means we don't enjoy the benefits of the slab allocated pool.
  3. Use a hybrid approach... for Buffers under a certain size, we allocate off the pool and fill(0), but for Buffers over that size we fall back to calloc.

Regardless of the option chosen, --safe-buffers is going to cause a significant performance degradation, it's just a matter of figuring out how to minimize it.

@ChALkeR

@jasnell Btw, about --safe-buffers — I don't like that name. Buffers are generally safe, careless (or accidential) usage of Buffer(number) isn't. Also, that flag probably should be there only until Buffer(number) is hard deprecated (or removed). Perhaps giving it a more explicit name, like --safe-buffer-from-number or --safe-buffer-number would be better.

@ChALkeR

@jasnell About the Buffer(number) speed under the flag:

  1. Buffer.safe(number) should try best to be fast with various number values. That could involve calloc-ing for all number values, or callocing for large number values and using a pooled approach (like Buffer.unsafe(number).fill(0)) for smaller number values, whichever is faster.
  2. Buffer(number) under the --safe-buffer-number flag (or however it would be called) should behave exactly like Buffer.safe(number) and reuse that code. That will guarantee that it's as fast as Buffer(number) is and minimize the codebase, lowering the probability of introducing bugs.

@jasnell

@trevnorris @silverwind @saghul @ChALkeR ... ok, pushed an update:

Still todo: add the --safe-buffer command line flag... per @ChALkeR's comments I'll likely rename it to --zero-fill-buffers

@Fishrock123

This is moving a bit quick, but some general things:

  1. If we are going to do this, Buffer(number) should be deprecated, probably first only in docs. I prefer runtime deprecations, since you can ignore them --no-deprecation, but I understand some users don't.
  2. safe() and unsafe() are not actually that descriptive. These sound like you are turning something on or off. I suggest going for allocSafe() and allocUnsafe().
  3. --zero-fill-buffers is a better naming, yes.

You mentioned the pooling aspect. Would it be worthwhile to have a safe pool, or does that defeat the purpose?

@jasnell Could you make sure to split up these commits, one replacing usage in core should be separate as it is quite large and makes this difficult to review.

@jasnell

@Fishrock123 ... I have absolutely no intention of landing this any time soon so there shouldn't be a worry about it moving too quickly :-) ... the intent right now is to get concrete options in the form of working code on the table for discussion as opposed to going around and around discussing whether or not it's a bug or not.

@jasnell

oh, another point, having a safe pool likely isn't worthwhile because you'd end up having to zero fill every time anyway to reset from the last time :-) An argument could be made that if --zero-fill-buffers is set, the pool should be zero filled on creation but given that we end up having to zero fill later anyway, it's a bit redundant and wouldn't buy us anything.

Also, the current implementation of Buffer.safe() simply does a Buffer.unsafe().fill(), which shouldn't be more efficient that simply doing a calloc but benchmarks are showing that it is... which baffles me a bit. I'd like to get that figured out.

@jasnell

@Fishrock123 ... I reswizzled the commits to make it easier to review.

@MylesBorins

@jasnell

Pushed a new commit to use the names zalloc() and alloc() instead

@jasnell jasnell changed the titlebuffer: add Buffer.safe() and Buffer.unsafe(), deprecate Buffer(num) buffer: add Buffer.zalloc() and Buffer.alloc(), deprecate Buffer(num)

Jan 14, 2016

@Fishrock123

@jasnell

zalloc == 'zeroed-allocation'

jasnell added a commit that referenced this pull request

Apr 26, 2016

@jasnell

The following significant (semver-major) changes have been made since the previous Node v5.0.0 release.

jasnell added a commit that referenced this pull request

Apr 26, 2016

@jasnell

The following significant (semver-major) changes have been made since the previous Node v5.0.0 release.

jasnell added a commit that referenced this pull request

Apr 26, 2016

@jasnell

The following significant (semver-major) changes have been made since the previous Node v5.0.0 release.

ChALkeR added a commit to ChALkeR/io.js that referenced this pull request

Jul 8, 2016

@ChALkeR

This backports the new Buffer.alloc(), Buffer.allocUnsafe(), Buffer.from(), and Buffer.allocUnsafeSlow() APIs for v4.

Some backported tests are disabled, but those are not related to the new API.

Note that Buffer.from(arrayBuffer[, byteOffset [, length]]) is not supported in v4.x, only Buffer.from(arrayBuffer) is.

Refs: nodejs#4682 Refs: nodejs#5833 Refs: nodejs#7475 PR-URL: nodejs#7562 Reviewed-By: Trevor Norris trev.norris@gmail.com Reviewed-By: James M Snell jasnell@gmail.com Reviewed-By: Anna Henningsen anna@addaleax.net Reviewed-By: Nikolai Vavilov vvnicholas@gmail.com

MylesBorins pushed a commit that referenced this pull request

Jul 8, 2016

@ChALkeR

This backports the new Buffer.alloc(), Buffer.allocUnsafe(), Buffer.from(), and Buffer.allocUnsafeSlow() APIs for v4.

Some backported tests are disabled, but those are not related to the new API.

Note that Buffer.from(arrayBuffer[, byteOffset [, length]]) is not supported in v4.x, only Buffer.from(arrayBuffer) is.

Refs: #4682 Refs: #5833 Refs: #7475 PR-URL: #7562 Reviewed-By: Trevor Norris trev.norris@gmail.com Reviewed-By: James M Snell jasnell@gmail.com Reviewed-By: Anna Henningsen anna@addaleax.net Reviewed-By: Nikolai Vavilov vvnicholas@gmail.com

MylesBorins pushed a commit that referenced this pull request

Jul 11, 2016

@ChALkeR

This backports the new Buffer.alloc(), Buffer.allocUnsafe(), Buffer.from(), and Buffer.allocUnsafeSlow() APIs for v4.

Some backported tests are disabled, but those are not related to the new API.

Note that Buffer.from(arrayBuffer[, byteOffset [, length]]) is not supported in v4.x, only Buffer.from(arrayBuffer) is.

Refs: #4682 Refs: #5833 Refs: #7475 PR-URL: #7562 Reviewed-By: Trevor Norris trev.norris@gmail.com Reviewed-By: James M Snell jasnell@gmail.com Reviewed-By: Anna Henningsen anna@addaleax.net Reviewed-By: Nikolai Vavilov vvnicholas@gmail.com

MylesBorins pushed a commit that referenced this pull request

Jul 12, 2016

@ChALkeR

This backports the new Buffer.alloc(), Buffer.allocUnsafe(), Buffer.from(), and Buffer.allocUnsafeSlow() APIs for v4.

Some backported tests are disabled, but those are not related to the new API.

Note that Buffer.from(arrayBuffer[, byteOffset [, length]]) is not supported in v4.x, only Buffer.from(arrayBuffer) is.

Refs: #4682 Refs: #5833 Refs: #7475 PR-URL: #7562 Reviewed-By: Trevor Norris trev.norris@gmail.com Reviewed-By: James M Snell jasnell@gmail.com Reviewed-By: Anna Henningsen anna@addaleax.net Reviewed-By: Nikolai Vavilov vvnicholas@gmail.com

MylesBorins pushed a commit that referenced this pull request

Jul 14, 2016

@ChALkeR

This backports the new Buffer.alloc(), Buffer.allocUnsafe(), Buffer.from(), and Buffer.allocUnsafeSlow() APIs for v4.

Some backported tests are disabled, but those are not related to the new API.

Note that Buffer.from(arrayBuffer[, byteOffset [, length]]) is not supported in v4.x, only Buffer.from(arrayBuffer) is.

Refs: #4682 Refs: #5833 Refs: #7475 PR-URL: #7562 Reviewed-By: Trevor Norris trev.norris@gmail.com Reviewed-By: James M Snell jasnell@gmail.com Reviewed-By: Anna Henningsen anna@addaleax.net Reviewed-By: Nikolai Vavilov vvnicholas@gmail.com

MylesBorins pushed a commit that referenced this pull request

Jul 14, 2016

@ChALkeR

This backports the new Buffer.alloc(), Buffer.allocUnsafe(), Buffer.from(), and Buffer.allocUnsafeSlow() APIs for v4.

Some backported tests are disabled, but those are not related to the new API.

Note that Buffer.from(arrayBuffer[, byteOffset [, length]]) is not supported in v4.x, only Buffer.from(arrayBuffer) is.

Refs: #4682 Refs: #5833 Refs: #7475 PR-URL: #7562 Reviewed-By: Trevor Norris trev.norris@gmail.com Reviewed-By: James M Snell jasnell@gmail.com Reviewed-By: Anna Henningsen anna@addaleax.net Reviewed-By: Nikolai Vavilov vvnicholas@gmail.com

@bminer

Just realized the Buffer.from is defined in Node 4.x and it cannot be overwritten without Object.defineProperty call. Makes it a tad annoying for Node 4.x polyfills.

Just thought I'd mention it since new Buffer is technically deprecated now and many users are still using Node 4.x.

@MylesBorins

@bminer it is defined because we backported the function. No need to polyfill afaik

@bminer

Right, but I have systems running 6.x and Node 4.2.2, for example.

So, I have to do this to use Buffer.from everywhere:

if(parseInt(process.versions.node.split(".")[0]) < 6) { Object.defineProperty(Buffer, "from", {"value": function() { return Buffer.apply(this, arguments); } }); }

@MylesBorins

is there a particular reason you are running on v4. 2.2? this is an
understatement supported and insecure version of v4

On Thu, Oct 6, 2016, 8:26 PM Blake Miner notifications@github.com wrote:

Right, but I have systems running 6.x and Node 4.2.2, for example.

So, I have to do this to use Buffer.from everywhere:

if(parseInt(process.versions.node.split(".")[0]) < 6) {
Object.defineProperty(Buffer, "from", {"value": function() {
return Buffer.apply(this, arguments);
} });
}


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#4682 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAecV7Iyx0ZkNw5MllbyrNIY49E8VUgaks5qxZGqgaJpZM4HEl70
.

@bminer

@thealphanerd - Good point. I should probably upgrade to latest 4.x. Just ignore me. :)

gluons referenced this pull request in gluons/gulp-json2cson

Nov 22, 2016

@gluons

ChALkeR

flags[kNoZeroFill] = 1;
// Only pay attention to encoding if it's a string. This
// prevents accidentally sending in a number that would
// be interpretted as a start offset.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Labels

buffer

Issues and PRs related to the buffer subsystem.

semver-major

PRs that contain breaking changes and should be released in the next major version.