From: Thien-Thi Nguyen <ttn@gnuvola.org>
To: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
Cc: emacs-devel@gnu.org
Subject: Re: Untagging by subtraction instead of masking on USE_LSB_TAG
Date: Mon, 28 Jan 2008 04:52:22 +0100
Message-ID: <87ve5ezjwp.fsf@ambire.localdomain>
In-Reply-To: <wl8x2aaejj.wl%mituharu@math.s.chiba-u.ac.jp> (YAMAMOTO Mitsuharu's message of "Mon, 28 Jan 2008 11:07:28 +0900")

() YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
() Mon, 28 Jan 2008 11:07:28 +0900

     _cons_to_long:                  _cons_to_long:
             andi. r0,r3,7                   andi. r0,r3,7
             srawi r0,r3,3                   srawi r0,r3,3
             beq cr0,L592                    beq cr0,L592
             rlwinm r2,r3,0,0,28
A            lwz r9,4(r2)                    lwz r9,-1(r3)
B            lwz r3,0(r2)                    lwz r3,-5(r3)
             rlwinm r0,r9,0,29,31            rlwinm r0,r9,0,29,31
             cmpwi cr7,r0,5                  cmpwi cr7,r0,5
             bne cr7,L593                    bne cr7,L593
             rlwinm r2,r9,0,0,28
C            lwz r9,0(r2)                    lwz r9,-5(r9)
     L593:                           L593:
             rlwinm r2,r3,13,0,15            rlwinm r2,r3,13,0,15
             srawi r0,r9,3                   srawi r0,r9,3
             or r0,r2,r0                     or r0,r2,r0
     L592:                           L592:
             mr r3,r0                        mr r3,r0
             blr                             blr

   This would make sense if the latency of load/store does not
   depend on its displacement (I'm not sure if that is the case in
   general).  Comments?

For masking, I see offsets (lwz) of 4,0,0 (lines A,B,C).
For subtraction, -1,-5,-5.

It's very possible that the machine can handle 4,0,0 more
efficiently; they are all even (0, modulo 2) and in two cases
"nothing"!  Furthermore, the maximum absolute offset for the
subtraction method is 5, which is larger (faaarther away) than 4.
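
A side note on those displacements: if the cons tag is 5 (which the
"cmpwi cr7,r0,5" suggests, but that is an assumption here, as are the
three tag bits), then folding "- tag" into the displacement should
yield the same effective addresses as masking first.  A minimal C
sketch with made-up helper names, not code from Emacs:

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration (not taken from the Emacs sources):
   3 low tag bits, and a cons tag of 5 as "cmpwi cr7,r0,5" suggests. */
enum { TAG_BITS = 3, TAG_MASK = (1 << TAG_BITS) - 1, CONS_TAG = 5 };

/* Masking: clear the tag bits, then add the field offset,
   as in "rlwinm r2,r3,0,0,28" followed by "lwz r9,4(r2)". */
static uintptr_t ea_by_masking(uintptr_t tagged, unsigned field_offset)
{
    return (tagged & ~(uintptr_t)TAG_MASK) + field_offset;
}

/* Subtraction: fold "- tag" into the displacement,
   as in "lwz r9,-1(r3)" where -1 is 4 - 5. */
static uintptr_t ea_by_subtraction(uintptr_t tagged, unsigned field_offset)
{
    /* Only valid when the tag is already known to be CONS_TAG. */
    assert((tagged & TAG_MASK) == CONS_TAG);
    return tagged + field_offset - CONS_TAG;
}
```

Under these assumptions, ea_by_masking(p, 4) and
ea_by_subtraction(p, 4) agree for any cons-tagged p, so only the
encoded displacement differs between the two columns, not the
address that reaches the data cache.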

Anyway, here is an excerpt from p.532 of "PowerPC 405, Embedded
Processor Core, User's Manual":

| C.2.6     Alignment in Scalar Load and Store Instructions
| 
| The PPC405 requires an extra cycle to execute scalar loads and
| stores having unaligned big or little endian data (except for
| lwarx and stwcx., which require word-aligned operands). If the
| target data is not operand aligned, and the sum of the least two
| significant bits of the effective address (EA) and the byte count
| is greater than four, the PPC405 decomposes a load or store scalar
| into two load or store operations. That is, the PPC405 never
| presents the DCU with a request for a transfer that crosses a word
| boundary. For example, a lwz with an EA of 0b11 causes the PPC405
| to decompose the lwz into two load operations. The first load
| operation is for a byte at the starting effective address; the
| second load operation is for three bytes, starting at the next
| word address.
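
The rule in that excerpt can be written down as a small predicate.
This is only my reading of the quoted paragraph, restricted to byte
counts of 1, 2 and 4; the function names are hypothetical and
nothing here comes from the manual beyond the quoted text:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if, per the quoted rule, the access would be decomposed
   into two operations: the data is not operand aligned AND
   (low two bits of EA) + (byte count) > 4. */
static int ppc405_splits_access(uintptr_t ea, unsigned byte_count)
{
    assert(byte_count == 1 || byte_count == 2 || byte_count == 4);
    unsigned low2 = (unsigned)(ea & 3u);
    int operand_aligned = (ea % byte_count) == 0;
    return !operand_aligned && (low2 + byte_count > 4);
}

/* Size of the first of the two operations: the bytes that remain
   before the next word boundary.  Meaningful only for a split access. */
static unsigned ppc405_first_part_bytes(uintptr_t ea)
{
    return 4u - (unsigned)(ea & 3u);
}
```

For the manual's own example, a lwz (4 bytes) at an EA ending in
0b11 is split, with 1 byte in the first operation and the other 3 in
the second.  A lwz whose EA is a multiple of 4 is not split, whatever
displacement was used to form that EA.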

But don't heed my (mostly) ignorant gut feelings!  Experience sez:
isolate the variable; build two versions; compare on "typical"
workload; if (dis)advantage is under some "wow!" threshold, write
down your findings in the notebook (for Emacs, comments would be
fine), but prioritize maintainability (i.e., refrain from
implementing).
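
As a rough illustration of "isolate the variable; build two
versions", here is a toy harness in C.  Everything in it is
hypothetical: struct cell stands in for a cons, the tag value 5 and
the 3-bit mask are assumed from the listing, nil is represented as 0,
and a synthetic list walk is certainly not a "typical" workload.  It
only shows the shape of such a comparison:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

enum { TAG_MASK = 7, CONS_TAG = 5 };     /* assumed, as in the listing */

struct cell { uintptr_t car, cdr; };     /* hypothetical stand-in for a cons */

static uintptr_t tag_cell(struct cell *c)
{
    assert(((uintptr_t)c & TAG_MASK) == 0);   /* needs 8-byte alignment */
    return (uintptr_t)c | CONS_TAG;
}

/* Version 1: untag by masking off the low bits. */
static uintptr_t sum_by_masking(uintptr_t list)
{
    uintptr_t sum = 0;
    while (list != 0) {
        struct cell *c = (struct cell *)(list & ~(uintptr_t)TAG_MASK);
        sum += c->car;
        list = c->cdr;
    }
    return sum;
}

/* Version 2: untag by subtracting the known tag. */
static uintptr_t sum_by_subtraction(uintptr_t list)
{
    uintptr_t sum = 0;
    while (list != 0) {
        struct cell *c = (struct cell *)(list - CONS_TAG);
        sum += c->car;
        list = c->cdr;
    }
    return sum;
}

/* Chain cells[0..n-1] into a list whose cars are 0, 1, ..., n-1. */
static uintptr_t build_list(struct cell *cells, size_t n)
{
    uintptr_t list = 0;
    for (size_t i = n; i-- > 0; ) {
        cells[i].car = i;
        cells[i].cdr = list;
        list = tag_cell(&cells[i]);
    }
    return list;
}

/* Run one version `reps` times; return CPU seconds, store the checksum. */
static double seconds_for(uintptr_t (*walk)(uintptr_t), uintptr_t list,
                          int reps, uintptr_t *checksum)
{
    clock_t start = clock();
    uintptr_t total = 0;
    for (int i = 0; i < reps; i++)
        total += walk(list);
    *checksum = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call seconds_for once with each walker on the same list, check that
the two checksums match, and compare the times.  Whether the
difference clears the "wow!" threshold is for the real workload to
say, not this loop.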

I am interested in how you define "typical" and "wow!".

Seasons change, pipelines change.  Keep in mind that sometimes
optimization now translates to pessimization down the road.

thi

