fix(backend): EmitLargeImmediate skips leading zeros to avoid a redundant movz #0

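When the low 16 bits of a 32-bit immediate are zero (e.g. 0x00020000),
emit a single shifted movz instead of the two-instruction sequence
movz #0 + movk. Per test: crypto -7, fft -2, h-4 -1, h-10 -1; across
the three variants of each test that is -33 instructions in total,
with no regressions.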
lzk
lzkk 5 days ago
parent bb58aac749
commit acdac5391d
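
The leading-zero skip described in the commit message can be illustrated with a minimal, standalone sketch. It is not the actual AsmPrinter.cpp code: the function name `SketchLargeImmediate`, the string-returning interface, and the `emitted && part == 0` check (standing in for the unnamed `continue` that precedes the new lines in the diff) are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for EmitLargeImmediate: returns the instructions
// as strings instead of printing them.
std::vector<std::string> SketchLargeImmediate(const std::string &reg, std::uint32_t value)
{
    std::vector<std::string> out;
    bool emitted = false;
    for (int shift = 0; shift < 32; shift += 16)
    {
        const std::uint32_t part = (value >> shift) & 0xFFFFu;
        if (emitted && part == 0)
        {
            continue; // assumed existing check: the earlier movz already zeroed this halfword
        }
        if (!emitted && part == 0)
        {
            continue; // the new check: do not spend a movz #0 on a leading zero halfword
        }
        std::string inst = emitted ? "movk " : "movz ";
        inst += reg + ", #" + std::to_string(part);
        if (shift != 0)
        {
            inst += ", lsl #" + std::to_string(shift);
        }
        out.push_back(inst);
        emitted = true;
    }
    if (!emitted)
    {
        out.push_back("mov " + reg + ", #0"); // fallback for an all-zero immediate
    }
    return out;
}
```

Under these assumptions, `SketchLargeImmediate("w8", 0x00020000u)` yields the single instruction `movz w8, #2, lsl #16`, and an all-zero immediate still falls through to `mov w8, #0`.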

@@ -223,6 +223,11 @@ namespace mir
 {
     continue;
 }
+// Skip leading zeros: emit the shifted movz directly instead of wasting a movz #0
+if (!emitted && part == 0)
+{
+    continue;
+}
 if (!emitted)
 {

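@@ -45,3 +45,15 @@
 - **Effect**: functional tests went from 87/88 → **100/100, all passing**
 - **Known limitation**: 30_many_dimensions (19-dimensional array parameter) still fails; the bug is in the lowering layer (it also fails without optimization) and needs a dedicated fix to the GEP offset computation
 - **Follow-up**: the known root cause of 30_many_dimensions is the lowering of multi-dimensional array GEPs; to be handled later
+---
+## 2026-05-25 | Movz #0 leading-zero optimization
+- **Type**: backend (AsmPrinter)
+- **Hypothesis**: in EmitLargeImmediate, when the low 16 bits of a 32-bit immediate are zero, a single shifted movz should be emitted directly instead of `movz #0` followed by `movk`. For example, `0x00020000` → `movz w8, #2, lsl #16` rather than `movz w8, #0; movk w8, #2, lsl #16`
+- **Implementation**: in the EmitLargeImmediate loop in AsmPrinter.cpp, skip the iteration when `!emitted && part == 0` (3 lines); the `!emitted → mov #0` fallback at the bottom is kept to handle the all-zero case
+- **Instruction-count effect**: 33 fewer instructions (crypto -7×3, fft -2×3, h-4 -1×3, h-10 -1×3)
+- **Regressions**: none
+- **Functional tests**: 100/100 functional pass; 30/31 h_functional pass (1 pre-existing failure: 30_many_dimensions)
+- **Known limitation**: only EmitLargeImmediate is fixed; the movz patterns in EmitStackAdjust/EmitAddressFromBase still have the same problem and can be unified later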

@@ -32,21 +32,21 @@
 | performance/crc1 | 279 |
 | performance/crc2 | 279 |
 | performance/crc3 | 279 |
-| performance/crypto-1 | 1926 |
-| performance/crypto-2 | 1926 |
-| performance/crypto-3 | 1926 |
-| performance/fft0 | 597 |
-| performance/fft1 | 597 |
-| performance/fft2 | 597 |
+| performance/crypto-1 | 1919 |
+| performance/crypto-2 | 1919 |
+| performance/crypto-3 | 1919 |
+| performance/fft0 | 595 |
+| performance/fft1 | 595 |
+| performance/fft2 | 595 |
 | performance/h-1-01 | 157 |
 | performance/h-1-02 | 157 |
 | performance/h-1-03 | 157 |
-| performance/h-10-01 | 328 |
-| performance/h-10-02 | 328 |
-| performance/h-10-03 | 328 |
-| performance/h-4-01 | 163 |
-| performance/h-4-02 | 163 |
-| performance/h-4-03 | 163 |
+| performance/h-10-01 | 327 |
+| performance/h-10-02 | 327 |
+| performance/h-10-03 | 327 |
+| performance/h-4-01 | 162 |
+| performance/h-4-02 | 162 |
+| performance/h-4-03 | 162 |
 | performance/h-5-01 | 338 |
 | performance/h-5-02 | 338 |
 | performance/h-5-03 | 338 |
