From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id A05701F55B;
	Thu, 11 Jun 2020 19:39:21 +0000 (UTC)
Date: Thu, 11 Jun 2020 19:39:21 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: amusing CoW string dedupe example
Message-ID: <20200611193921.GA17563@dcvr>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
List-Id:

I've known for a while that Perl deduplicates hash keys to save
RAM, but it turns out it's possible to (ab)use them for dynamic
strings even when the hash doesn't live beyond a subroutine scope.

Looks like we'll be able to save RAM in some places :D

The following script takes between 5M and 9M of RAM depending on
the version of Perl I'm using (tested down to 5.16.3), but nearly
1G if the early return is uncommented in the `dedupe' sub:

----8<----
#!perl -w
use strict;
use Devel::Peek;
sub dedupe {
	my ($k) = @_;
	#return $k; # uncomment to disable CoW dedupe, needs ~1G of RAM
	my %cow = ( $k => undef ); # short-lived hash, $k becomes a shared key
	((keys %cow)[0]); # return the key as stored by the hash
}

my $n = 3_600_000;
my $k = pack('S*',($n + 1)..($n + 50000)); # uint16_t array[50000]
my @x = map { dedupe($k) } (1..10000);
{
	no warnings 'once';
	$Devel::Peek::pv_limit = 10;
}
Dump(\@x);
print 'length: ',length($k), "\n";
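
For a quicker check than reading the Dump output, here is a minimal
sketch that counts how many distinct string buffers back the
deduplicated elements. It assumes pack('p', ...) returns the address
of a scalar's existing buffer without copying it; `dedupe_str',
`%bufs' and `$nbuf' are names made up for this example, and the
string is kept small so it runs instantly:

```perl
#!perl -w
use strict;

# Hypothetical helper using the same trick as `dedupe' above:
# round-trip the string through a short-lived hash and return the
# key as the hash stored it.
sub dedupe_str {
	my ($str) = @_;
	my %tmp = ($str => undef);
	(keys %tmp)[0];
}

my $k = 'abc' x 1000; # 3000-byte string, small enough for a quick test
my @x = map { dedupe_str($k) } (1..100);

# Assumption: pack('p', $sv) packs the raw pointer to the scalar's
# string buffer, so equal packed values mean a shared buffer.
my %bufs;
$bufs{pack('p', $_)}++ for @x;
my $nbuf = scalar(keys %bufs);
print 'elements: ', scalar(@x), " distinct buffers: $nbuf\n";
```

If the keys are shared as described, all 100 elements should report
the same buffer. Note that hash keys are always strings, so a helper
like this stringifies whatever it is given.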