dolphin/Source/Core
Sintendo b968120f8a Jit64: srawx - Optimize shift by constant
More efficient code can be generated if the shift amount is known at
compile time. We can once again take advantage of shifts with the shift
amount in an 8-bit immediate to eliminate ECX as a scratch register,
reducing register pressure and removing the occasional spill. We can
also do 32-bit shifts instead of 64-bit operations.

We recognize four distinct cases:

- The special case where we're dealing with the PowerPC's quirky shift
  amount masking. If the shift amount is a number from 32 to 63, all
  bits are shifted out and the result it either all zeroes or all ones.

Before:
B9 F0 FF FF FF       mov         ecx,0FFFFFFF0h
8B F7                mov         esi,edi
48 C1 E6 20          shl         rsi,20h
48 D3 FE             sar         rsi,cl
8B C6                mov         eax,esi
48 C1 EE 20          shr         rsi,20h
85 F0                test        eax,esi
0F 95 45 58          setne       byte ptr [rbp+58h]

After:
8B F7                mov         esi,edi
C1 FE 1F             sar         esi,1Fh
0F 95 45 58          setne       byte ptr [rbp+58h]

- The shift amount is zero. Not calculation needs to be done, just clear
  the carry flag.

Before:
B9 00 00 00 00       mov         ecx,0
49 C1 E5 20          shl         r13,20h
49 D3 FD             sar         r13,cl
41 8B C5             mov         eax,r13d
49 C1 ED 20          shr         r13,20h
44 85 E8             test        eax,r13d
0F 95 45 58          setne       byte ptr [rbp+58h]

After:
C6 45 58 00          mov         byte ptr [rbp+58h],0

- The carry flag doesn't need to be computed. Just do the arithmetic
  shift.

Before:
B9 02 00 00 00       mov         ecx,2
48 C1 E7 20          shl         rdi,20h
48 D3 FF             sar         rdi,cl
48 C1 EF 20          shr         rdi,20h

After:
C1 FF 02             sar         edi,2

- The carry flag must be computed. In addition to the arithmetic shift,
  we do a shift to the left and and them together to know if any ones
  were shifted out. It's still better than before, because we can do
  32-bit shifts.

Before:
B9 02 00 00 00       mov         ecx,2
49 C1 E5 20          shl         r13,20h
49 D3 FD             sar         r13,cl
41 8B C5             mov         eax,r13d
49 C1 ED 20          shr         r13,20h
44 85 E8             test        eax,r13d
0F 95 45 58          setne       byte ptr [rbp+58h]

After:
41 8B C5             mov         eax,r13d
41 C1 FD 02          sar         r13d,2
C1 E0 1E             shl         eax,1Eh
44 85 E8             test        eax,r13d
0F 95 45 58          setne       byte ptr [rbp+58h]
2020-12-25 19:30:51 +01:00
..
AudioCommon AudioCommon: Convert alerts over to fmt-based variants 2020-11-27 10:10:11 -05:00
Common Core: Add new Free Look settings and config 2020-12-24 13:49:25 -06:00
Core Jit64: srawx - Optimize shift by constant 2020-12-25 19:30:51 +01:00
DiscIO DiscIO: Fix recursive directory extraction 2020-12-03 21:13:53 +01:00
DolphinNoGUI Clean up screen saver inhibition and apply setting change immediately. 2020-10-18 16:31:48 -05:00
DolphinQt DolphinQt: Only trigger Free Look mouse movement when the Free Look camera is active 2020-12-24 13:49:25 -06:00
InputCommon InputCommon: Fix callback dispatch deadlock 2020-12-13 00:30:27 +00:00
MacUpdater Reformat repo to clang-format 7.0 rules 2019-05-06 18:48:04 +00:00
UICommon Turn Config::Info into a class with getters 2020-12-11 19:54:16 +01:00
UpdaterCommon Fix updater not always cleaning up temp directory 2020-11-13 12:25:53 -08:00
VideoBackends Core: Remove ImageWrite and get rid of -Wmissing-declarations warnings 2020-12-16 16:04:19 +01:00
VideoCommon VideoCommon: Update active config when we check for config changes, this ensures Free Look settings are copied at the start of the frame. Also update the camera's controller type at this time 2020-12-24 13:51:46 -06:00
WinUpdater DolphinQt: Handle non-ASCII characters in Windows cmd arguments 2020-09-21 17:26:29 +02:00
CMakeLists.txt WinUpdater: Add CMakeLists.txt 2019-05-08 23:59:04 +02:00