; RUN: llc < %s -x86-experimental-vector-shuffle-lowering=false -mattr=+avx2 | FileCheck %s

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-darwin"

; PR21876
; The old shuffle lowering sometimes generates VZEXT nodes with both input
; and output same-sized types, here 256-bits. For instance, a v8i8 to v8i32
; zero-extend would become a (v8i32 (VZEXT v32i8)) node, which can't happen
; otherwise. The companion commit r223996 added those patterns temporarily.
; This test, along with the VR256 for AVX2 PMOVXrr instructions, should be
; removed once the old vector shuffle lowering goes away.

define void @test_avx2_pmovx_256(<8 x i8>* %tmp64, <8 x float>* %tmp75) {
; CHECK-LABEL: test_avx2_pmovx_256
; We really don't care about the generated code.
; CHECK: vpmovzxbd
; CHECK: vpbroadcastd
; CHECK: vpand
; CHECK: vcvtdq2ps
; CHECK: vmovups
; CHECK: vzeroupper
; CHECK: retq

  %wide.load458 = load <8 x i8>* %tmp64, align 1
  %tmp68 = uitofp <8 x i8> %wide.load458 to <8 x float>
  store <8 x float> %tmp68, <8 x float>* %tmp75, align 4
  ret void
}