Warning: Permanently added '13.216.3.209' (ED25519) to the list of known hosts.
You can reproduce this build on your computer by running:
  sudo dnf install copr-rpmbuild
  /usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/9598689-fedora-43-x86_64 --chroot fedora-43-x86_64
Version: 1.5
PID: 86475
Logging PID: 86477
Task: {'allow_user_ssh': False, 'appstream': False, 'background': False, 'build_id': 9598689, 'buildroot_pkgs': [], 'chroot': 'fedora-43-x86_64', 'enable_net': False, 'fedora_review': False, 'git_hash': '450753dbd937afb35cffb74e3108030675abc223', 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel', 'isolation': 'default', 'memory_reqs': 2048, 'package_name': 'composable_kernel', 'package_version': '6.4.2-1', 'project_dirname': 'RH', 'project_name': 'RH', 'project_owner': '@rocm-packagers-sig', 'repo_priority': None, 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/@rocm-packagers-sig/RH/fedora-43-x86_64/', 'id': 'copr_base', 'name': 'Copr repository', 'priority': None}], 'sandbox': '@rocm-packagers-sig/RH--trix', 'source_json': {}, 'source_type': None, 'ssh_public_keys': None, 'storage': 0, 'submitter': 'trix', 'tags': [], 'task_id': '9598689-fedora-43-x86_64', 'timeout': 180000, 'uses_devel_repo': False, 'with_opts': [], 'without_opts': []}
Running: git clone https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel /var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel --depth 500 --no-single-branch --recursive
cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel', '/var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr: Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel'...
Running: git checkout 450753dbd937afb35cffb74e3108030675abc223 --
cmd: ['git', 'checkout', '450753dbd937afb35cffb74e3108030675abc223', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel
rc: 0
stdout:
stderr: Note: switching to '450753dbd937afb35cffb74e3108030675abc223'.
You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may do so (now or later) by using -c with the switch command. Example:
  git switch -c
Or undo this operation with:
  git switch -
Turn off this advice by setting config variable advice.detachedHead to false
HEAD is now at 450753d automatic import of composable_kernel
Running: dist-git-client sources
cmd: ['dist-git-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel
rc: 0
stdout:
stderr: INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
INFO: Downloading composable_kernel-6.4.2.tar.gz
INFO: Reading stdout from command: curl --help all
INFO: Calling: curl -H Pragma: -o composable_kernel-6.4.2.tar.gz --location --connect-timeout 60 --retry 3 --retry-delay 10 --remote-time --show-error --fail --retry-all-errors https://copr-dist-git.fedorainfracloud.org/repo/pkgs/@rocm-packagers-sig/RH/composable_kernel/composable_kernel-6.4.2.tar.gz/md5/88eeca911106ed5ced4721bb650ce57e/composable_kernel-6.4.2.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4270k 100 4270k 0 0 141M 0 --:--:-- --:--:-- --:--:-- 143M
INFO: Reading stdout from command: md5sum composable_kernel-6.4.2.tar.gz
tail: /var/lib/copr-rpmbuild/main.log: file truncated
Running (timeout=180000): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel/composable_kernel.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1758760224.854724 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
INFO: mock.py version 6.3 starting (python version = 3.13.7, NVR = mock-6.3-1.fc42), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel/composable_kernel.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1758760224.854724 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel/composable_kernel.spec) Config(fedora-43-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 6.3
INFO: Mock Version: 6.3
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1758760224.854724/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using container image: registry.fedoraproject.org/fedora:43
INFO: Pulling image: registry.fedoraproject.org/fedora:43
INFO: Tagging container image as mock-bootstrap-100615d0-62ec-4733-bd98-8550eacc1f51
INFO: Checking that b3e51017c664f74e59c94e690758e8f8152a7fe2f80ec6d551de6a072b676324 image matches host's architecture
INFO: Copy content of container b3e51017c664f74e59c94e690758e8f8152a7fe2f80ec6d551de6a072b676324 to /var/lib/mock/fedora-43-x86_64-bootstrap-1758760224.854724/root
INFO: mounting b3e51017c664f74e59c94e690758e8f8152a7fe2f80ec6d551de6a072b676324 with podman image mount
INFO: image b3e51017c664f74e59c94e690758e8f8152a7fe2f80ec6d551de6a072b676324 as /var/lib/containers/storage/overlay/43eb459982dd22378aabc82e313eee5f8df7ae485fe25fb51b7f22db7ca9dbcd/merged
INFO: umounting image b3e51017c664f74e59c94e690758e8f8152a7fe2f80ec6d551de6a072b676324 (/var/lib/containers/storage/overlay/43eb459982dd22378aabc82e313eee5f8df7ae485fe25fb51b7f22db7ca9dbcd/merged) with podman image umount
INFO: Removing image mock-bootstrap-100615d0-62ec-4733-bd98-8550eacc1f51
INFO: Package manager dnf5 detected and used (fallback)
INFO: Not updating bootstrap chroot, bootstrap_image_ready=True
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1758760224.854724/root.
INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Package manager dnf5 detected and used (direct choice) INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-5.99.91-5.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.2.17.0-1.fc43.x86_64 dnf5-plugins-5.2.17.0-1.fc43.x86_64 Start: installing minimal buildroot with dnf5 Updating and loading repositories: Copr repository 100% | 109.4 KiB/s | 1.5 KiB | 00m00s fedora 100% | 60.0 KiB/s | 26.0 KiB | 00m00s updates 100% | 221.8 KiB/s | 30.2 KiB | 00m00s Repositories loaded. Package Arch Version Repository Size Installing group/module packages: bash x86_64 5.3.0-2.fc43 fedora 8.4 MiB bzip2 x86_64 1.0.8-21.fc43 fedora 95.3 KiB coreutils x86_64 9.7-5.fc43 fedora 5.4 MiB cpio x86_64 2.15-6.fc43 fedora 1.1 MiB diffutils x86_64 3.12-3.fc43 fedora 1.6 MiB fedora-release-common noarch 43-0.22 fedora 20.4 KiB findutils x86_64 1:4.10.0-6.fc43 fedora 1.8 MiB gawk x86_64 5.3.2-2.fc43 fedora 1.8 MiB glibc-minimal-langpack x86_64 2.42-4.fc43 fedora 0.0 B grep x86_64 3.12-2.fc43 fedora 1.0 MiB gzip x86_64 1.13-4.fc43 fedora 388.8 KiB info x86_64 7.2-6.fc43 fedora 353.9 KiB patch x86_64 2.8-2.fc43 fedora 222.8 KiB redhat-rpm-config noarch 343-11.fc43 fedora 182.9 KiB rpm-build x86_64 5.99.91-5.fc43 fedora 285.5 KiB sed x86_64 4.9-5.fc43 fedora 857.3 KiB shadow-utils x86_64 2:4.18.0-3.fc43 fedora 3.9 MiB tar x86_64 2:1.35-6.fc43 fedora 2.9 MiB unzip x86_64 6.0-67.fc43 fedora 386.3 KiB util-linux x86_64 2.41.1-16.fc43 fedora 3.5 MiB which x86_64 2.23-3.fc43 fedora 83.5 KiB xz x86_64 1:5.8.1-2.fc43 fedora 1.3 MiB Installing dependencies: add-determinism x86_64 0.6.0-2.fc43 fedora 2.4 MiB alternatives x86_64 1.33-2.fc43 fedora 62.2 KiB ansible-srpm-macros noarch 1-18.1.fc43 fedora 35.7 KiB audit-libs x86_64 4.1.1-2.fc43 fedora 378.8 KiB binutils x86_64 
2.45-1.fc43 fedora 26.5 MiB build-reproducibility-srpm-macros noarch 0.6.0-2.fc43 fedora 735.0 B bzip2-libs x86_64 1.0.8-21.fc43 fedora 80.6 KiB ca-certificates noarch 2025.2.80_v9.0.304-1.1.fc43 fedora 2.7 MiB coreutils-common x86_64 9.7-5.fc43 fedora 11.3 MiB crypto-policies noarch 20250714-4.gitcd6043a.fc43 fedora 146.9 KiB curl x86_64 8.15.0-2.fc43 fedora 473.6 KiB cyrus-sasl-lib x86_64 2.1.28-33.fc43 fedora 2.3 MiB debugedit x86_64 5.2-3.fc43 fedora 214.0 KiB dwz x86_64 0.16-2.fc43 fedora 287.1 KiB ed x86_64 1.22.2-1.fc43 fedora 148.1 KiB efi-srpm-macros noarch 6-4.fc43 fedora 40.1 KiB elfutils x86_64 0.193-3.fc43 fedora 2.9 MiB elfutils-debuginfod-client x86_64 0.193-3.fc43 fedora 83.9 KiB elfutils-default-yama-scope noarch 0.193-3.fc43 fedora 1.8 KiB elfutils-libelf x86_64 0.193-3.fc43 fedora 1.2 MiB elfutils-libs x86_64 0.193-3.fc43 fedora 683.4 KiB fedora-gpg-keys noarch 43-0.4 fedora 131.2 KiB fedora-release noarch 43-0.22 fedora 0.0 B fedora-release-identity-basic noarch 43-0.22 fedora 658.0 B fedora-repos noarch 43-0.4 fedora 4.9 KiB file x86_64 5.46-7.fc43 fedora 100.2 KiB file-libs x86_64 5.46-7.fc43 fedora 11.9 MiB filesystem x86_64 3.18-50.fc43 fedora 112.0 B filesystem-srpm-macros noarch 3.18-50.fc43 fedora 38.2 KiB fonts-srpm-macros noarch 1:2.0.5-23.fc43 fedora 55.8 KiB forge-srpm-macros noarch 0.4.0-3.fc43 fedora 38.9 KiB fpc-srpm-macros noarch 1.3-15.fc43 fedora 144.0 B gap-srpm-macros noarch 1-1.fc43 fedora 2.0 KiB gdb-minimal x86_64 16.3-5.fc43 fedora 13.3 MiB gdbm-libs x86_64 1:1.23-10.fc43 fedora 129.9 KiB ghc-srpm-macros noarch 1.9.2-3.fc43 fedora 779.0 B glibc x86_64 2.42-4.fc43 fedora 6.7 MiB glibc-common x86_64 2.42-4.fc43 fedora 1.0 MiB glibc-gconv-extra x86_64 2.42-4.fc43 fedora 7.2 MiB gmp x86_64 1:6.3.0-4.fc43 fedora 811.2 KiB gnat-srpm-macros noarch 6-8.fc43 fedora 1.0 KiB gnupg2 x86_64 2.4.8-4.fc43 fedora 6.5 MiB gnupg2-dirmngr x86_64 2.4.8-4.fc43 fedora 618.4 KiB gnupg2-gpg-agent x86_64 2.4.8-4.fc43 fedora 671.4 KiB 
gnupg2-gpgconf x86_64 2.4.8-4.fc43 fedora 250.0 KiB gnupg2-keyboxd x86_64 2.4.8-4.fc43 fedora 201.4 KiB gnupg2-verify x86_64 2.4.8-4.fc43 fedora 348.5 KiB gnutls x86_64 3.8.10-3.fc43 fedora 3.8 MiB go-srpm-macros noarch 3.8.0-1.fc43 fedora 61.9 KiB gpgverify noarch 2.2-3.fc43 fedora 8.7 KiB ima-evm-utils-libs x86_64 1.6.2-6.fc43 fedora 60.7 KiB jansson x86_64 2.14-3.fc43 fedora 89.1 KiB java-srpm-macros noarch 1-7.fc43 fedora 870.0 B json-c x86_64 0.18-7.fc43 fedora 82.7 KiB kernel-srpm-macros noarch 1.0-27.fc43 fedora 1.9 KiB keyutils-libs x86_64 1.6.3-6.fc43 fedora 54.3 KiB krb5-libs x86_64 1.21.3-7.fc43 fedora 2.3 MiB libacl x86_64 2.3.2-4.fc43 fedora 35.9 KiB libarchive x86_64 3.8.1-3.fc43 fedora 951.1 KiB libassuan x86_64 2.5.7-4.fc43 fedora 163.8 KiB libattr x86_64 2.5.2-6.fc43 fedora 24.4 KiB libblkid x86_64 2.41.1-16.fc43 fedora 262.4 KiB libbrotli x86_64 1.1.0-9.fc43 fedora 833.3 KiB libcap x86_64 2.76-3.fc43 fedora 209.1 KiB libcap-ng x86_64 0.8.5-7.fc43 fedora 68.9 KiB libcom_err x86_64 1.47.3-2.fc43 fedora 63.1 KiB libcurl x86_64 8.15.0-2.fc43 fedora 903.2 KiB libeconf x86_64 0.7.9-2.fc43 fedora 64.9 KiB libevent x86_64 2.1.12-16.fc43 fedora 883.1 KiB libfdisk x86_64 2.41.1-16.fc43 fedora 380.4 KiB libffi x86_64 3.5.1-2.fc43 fedora 83.6 KiB libfsverity x86_64 1.6-3.fc43 fedora 28.5 KiB libgcc x86_64 15.2.1-1.fc43.2 copr_base 266.6 KiB libgcrypt x86_64 1.11.1-2.fc43 fedora 1.6 MiB libgomp x86_64 15.2.1-1.fc43.2 copr_base 541.1 KiB libgpg-error x86_64 1.55-2.fc43 fedora 915.3 KiB libidn2 x86_64 2.3.8-2.fc43 fedora 552.5 KiB libksba x86_64 1.6.7-4.fc43 fedora 398.5 KiB liblastlog2 x86_64 2.41.1-16.fc43 fedora 33.9 KiB libmount x86_64 2.41.1-16.fc43 fedora 372.7 KiB libnghttp2 x86_64 1.66.0-2.fc43 fedora 162.2 KiB libpkgconf x86_64 2.3.0-3.fc43 fedora 78.1 KiB libpsl x86_64 0.21.5-6.fc43 fedora 76.4 KiB libselinux x86_64 3.9-4.fc43 fedora 193.1 KiB libsemanage x86_64 3.9-3.fc43 fedora 308.5 KiB libsepol x86_64 3.9-2.fc43 fedora 822.0 KiB libsmartcols x86_64 
2.41.1-16.fc43 fedora 180.5 KiB libssh x86_64 0.11.3-1.fc43 fedora 567.1 KiB libssh-config noarch 0.11.3-1.fc43 fedora 277.0 B libstdc++ x86_64 15.2.1-1.fc43.2 copr_base 2.8 MiB libtasn1 x86_64 4.20.0-2.fc43 fedora 176.3 KiB libtool-ltdl x86_64 2.5.4-7.fc43 fedora 70.1 KiB libunistring x86_64 1.1-10.fc43 fedora 1.7 MiB libusb1 x86_64 1.0.29-4.fc43 fedora 171.3 KiB libuuid x86_64 2.41.1-16.fc43 fedora 37.4 KiB libverto x86_64 0.3.2-11.fc43 fedora 25.4 KiB libxcrypt x86_64 4.4.38-8.fc43 fedora 284.5 KiB libxml2 x86_64 2.12.10-4.fc43 fedora 1.7 MiB libzstd x86_64 1.5.7-2.fc43 fedora 799.9 KiB lua-libs x86_64 5.4.8-2.fc43 fedora 280.8 KiB lua-srpm-macros noarch 1-16.fc43 fedora 1.3 KiB lz4-libs x86_64 1.10.0-3.fc43 fedora 161.4 KiB mpfr x86_64 4.2.2-2.fc43 fedora 832.8 KiB ncurses-base noarch 6.5-7.20250614.fc43 fedora 328.1 KiB ncurses-libs x86_64 6.5-7.20250614.fc43 fedora 946.3 KiB nettle x86_64 3.10.1-2.fc43 fedora 790.6 KiB npth x86_64 1.8-3.fc43 fedora 49.6 KiB ocaml-srpm-macros noarch 11-2.fc43 fedora 1.9 KiB openblas-srpm-macros noarch 2-20.fc43 fedora 112.0 B openldap x86_64 2.6.10-4.fc43 fedora 659.9 KiB openssl-libs x86_64 1:3.5.1-2.fc43 fedora 8.9 MiB p11-kit x86_64 0.25.8-1.fc43 fedora 2.3 MiB p11-kit-trust x86_64 0.25.8-1.fc43 fedora 446.5 KiB package-notes-srpm-macros noarch 0.5-14.fc43 fedora 1.6 KiB pam-libs x86_64 1.7.1-3.fc43 fedora 126.8 KiB pcre2 x86_64 10.46-1.fc43 fedora 697.7 KiB pcre2-syntax noarch 10.46-1.fc43 fedora 275.3 KiB perl-srpm-macros noarch 1-60.fc43 fedora 861.0 B pkgconf x86_64 2.3.0-3.fc43 fedora 88.5 KiB pkgconf-m4 noarch 2.3.0-3.fc43 fedora 14.4 KiB pkgconf-pkg-config x86_64 2.3.0-3.fc43 fedora 989.0 B popt x86_64 1.19-9.fc43 fedora 132.8 KiB publicsuffix-list-dafsa noarch 20250616-2.fc43 fedora 69.1 KiB pyproject-srpm-macros noarch 1.18.4-1.fc43 fedora 1.9 KiB python-srpm-macros noarch 3.14-5.fc43 fedora 51.5 KiB qt5-srpm-macros noarch 5.15.17-2.fc43 fedora 500.0 B qt6-srpm-macros noarch 6.9.2-1.fc43 fedora 464.0 B readline 
x86_64 8.3-2.fc43 fedora 511.7 KiB rpm x86_64 5.99.91-5.fc43 fedora 3.0 MiB rpm-build-libs x86_64 5.99.91-5.fc43 fedora 268.4 KiB rpm-libs x86_64 5.99.91-5.fc43 fedora 933.7 KiB rpm-sequoia x86_64 1.9.0-2.fc43 fedora 2.5 MiB rpm-sign-libs x86_64 5.99.91-5.fc43 fedora 39.7 KiB rust-srpm-macros noarch 26.4-1.fc43 fedora 4.8 KiB setup noarch 2.15.0-26.fc43 fedora 725.0 KiB sqlite-libs x86_64 3.50.2-2.fc43 fedora 1.5 MiB systemd-libs x86_64 258-1.fc43 fedora 2.3 MiB systemd-standalone-sysusers x86_64 258-1.fc43 fedora 293.5 KiB tpm2-tss x86_64 4.1.3-8.fc43 fedora 1.6 MiB tree-sitter-srpm-macros noarch 0.4.2-1.fc43 fedora 8.3 KiB util-linux-core x86_64 2.41.1-16.fc43 fedora 1.5 MiB xxhash-libs x86_64 0.8.3-3.fc43 fedora 90.2 KiB xz-libs x86_64 1:5.8.1-2.fc43 fedora 217.8 KiB zig-srpm-macros noarch 1-5.fc43 fedora 1.1 KiB zip x86_64 3.0-44.fc43 fedora 694.5 KiB zlib-ng-compat x86_64 2.2.5-2.fc43 fedora 137.6 KiB zstd x86_64 1.5.7-2.fc43 fedora 1.7 MiB Installing groups: Buildsystem building group Transaction Summary: Installing: 169 packages Total size of inbound packages is 59 MiB. Need to download 0 B. After this operation, 198 MiB extra will be used (install 198 MiB, remove 0 B). 
[ 1/169] tar-2:1.35-6.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 2/169] bzip2-0:1.0.8-21.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 3/169] redhat-rpm-config-0:343-11.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 4/169] rpm-build-0:5.99.91-5.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 5/169] unzip-0:6.0-67.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 6/169] cpio-0:2.15-6.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 7/169] which-0:2.23-3.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 8/169] bash-0:5.3.0-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 9/169] coreutils-0:9.7-5.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 10/169] grep-0:3.12-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 11/169] patch-0:2.8-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 12/169] sed-0:4.9-5.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 13/169] shadow-utils-2:4.18.0-3.fc43. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 14/169] diffutils-0:3.12-3.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 15/169] fedora-release-common-0:43-0. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 16/169] findutils-1:4.10.0-6.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 17/169] glibc-minimal-langpack-0:2.42 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 18/169] gzip-0:1.13-4.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 19/169] info-0:7.2-6.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 20/169] xz-1:5.8.1-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 21/169] util-linux-0:2.41.1-16.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 22/169] gawk-0:5.3.2-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 23/169] glibc-0:2.42-4.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 24/169] libacl-0:2.3.2-4.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 25/169] libselinux-0:3.9-4.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 26/169] bzip2-libs-0:1.0.8-21.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 27/169] ansible-srpm-macros-0:1-18.1. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 28/169] build-reproducibility-srpm-ma 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 29/169] dwz-0:0.16-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 30/169] efi-srpm-macros-0:6-4.fc43.no 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 31/169] file-0:5.46-7.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 32/169] filesystem-srpm-macros-0:3.18 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 33/169] fonts-srpm-macros-1:2.0.5-23. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 34/169] forge-srpm-macros-0:0.4.0-3.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 35/169] fpc-srpm-macros-0:1.3-15.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 36/169] gap-srpm-macros-0:1-1.fc43.no 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 37/169] ghc-srpm-macros-0:1.9.2-3.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 38/169] gnat-srpm-macros-0:6-8.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 39/169] go-srpm-macros-0:3.8.0-1.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 40/169] java-srpm-macros-0:1-7.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 41/169] kernel-srpm-macros-0:1.0-27.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 42/169] lua-srpm-macros-0:1-16.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 43/169] ocaml-srpm-macros-0:11-2.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 44/169] openblas-srpm-macros-0:2-20.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 45/169] package-notes-srpm-macros-0:0 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 46/169] perl-srpm-macros-0:1-60.fc43. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 47/169] pyproject-srpm-macros-0:1.18. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 48/169] python-srpm-macros-0:3.14-5.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 49/169] qt5-srpm-macros-0:5.15.17-2.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 50/169] qt6-srpm-macros-0:6.9.2-1.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 51/169] rpm-0:5.99.91-5.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 52/169] rust-srpm-macros-0:26.4-1.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 53/169] tree-sitter-srpm-macros-0:0.4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 54/169] zig-srpm-macros-0:1-5.fc43.no 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 55/169] zip-0:3.0-44.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 56/169] debugedit-0:5.2-3.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 57/169] elfutils-0:0.193-3.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 58/169] elfutils-libelf-0:0.193-3.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 59/169] libarchive-0:3.8.1-3.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 60/169] popt-0:1.19-9.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 61/169] readline-0:8.3-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 62/169] rpm-build-libs-0:5.99.91-5.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 63/169] rpm-libs-0:5.99.91-5.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 64/169] zstd-0:1.5.7-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 65/169] filesystem-0:3.18-50.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 66/169] ncurses-libs-0:6.5-7.20250614 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 67/169] coreutils-common-0:9.7-5.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 68/169] gmp-1:6.3.0-4.fc43.x86_64 100% | 0.0 B/s | 0.0 B 
| 00m00s >>> Already downloaded [ 69/169] libattr-0:2.5.2-6.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 70/169] libcap-0:2.76-3.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 71/169] openssl-libs-1:3.5.1-2.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 72/169] systemd-libs-0:258-1.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 73/169] pcre2-0:10.46-1.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 74/169] ed-0:1.22.2-1.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 75/169] audit-libs-0:4.1.1-2.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 76/169] libeconf-0:0.7.9-2.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 77/169] libsemanage-0:3.9-3.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 78/169] libxcrypt-0:4.4.38-8.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 79/169] pam-libs-0:1.7.1-3.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 80/169] setup-0:2.15.0-26.fc43.noarch 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 81/169] fedora-repos-0:43-0.4.noarch 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 82/169] glibc-common-0:2.42-4.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 83/169] xz-libs-1:5.8.1-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 84/169] libblkid-0:2.41.1-16.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 85/169] libcap-ng-0:0.8.5-7.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 86/169] libfdisk-0:2.41.1-16.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 87/169] liblastlog2-0:2.41.1-16.fc43. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 88/169] libmount-0:2.41.1-16.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 89/169] libsmartcols-0:2.41.1-16.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 90/169] libuuid-0:2.41.1-16.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 91/169] util-linux-core-0:2.41.1-16.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 92/169] zlib-ng-compat-0:2.2.5-2.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 93/169] mpfr-0:4.2.2-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 94/169] glibc-gconv-extra-0:2.42-4.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 95/169] libsepol-0:3.9-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 96/169] add-determinism-0:0.6.0-2.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 97/169] file-libs-0:5.46-7.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 98/169] curl-0:8.15.0-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 99/169] elfutils-libs-0:0.193-3.fc43. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [100/169] elfutils-debuginfod-client-0: 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [101/169] libzstd-0:1.5.7-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [102/169] libxml2-0:2.12.10-4.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [103/169] lz4-libs-0:1.10.0-3.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [104/169] lua-libs-0:5.4.8-2.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [105/169] rpm-sign-libs-0:5.99.91-5.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [106/169] rpm-sequoia-0:1.9.0-2.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [107/169] sqlite-libs-0:3.50.2-2.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [108/169] ncurses-base-0:6.5-7.20250614 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [109/169] ca-certificates-0:2025.2.80_v 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [110/169] crypto-policies-0:20250714-4. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [111/169] pcre2-syntax-0:10.46-1.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [112/169] fedora-gpg-keys-0:43-0.4.noar 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [113/169] elfutils-default-yama-scope-0 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [114/169] json-c-0:0.18-7.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [115/169] gnupg2-0:2.4.8-4.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [116/169] ima-evm-utils-libs-0:1.6.2-6. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [117/169] libfsverity-0:1.6-3.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [118/169] gpgverify-0:2.2-3.fc43.noarch 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [119/169] gnupg2-dirmngr-0:2.4.8-4.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [120/169] gnupg2-gpg-agent-0:2.4.8-4.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [121/169] gnupg2-gpgconf-0:2.4.8-4.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [122/169] gnupg2-keyboxd-0:2.4.8-4.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [123/169] gnupg2-verify-0:2.4.8-4.fc43. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [124/169] libassuan-0:2.5.7-4.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [125/169] libgcrypt-0:1.11.1-2.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [126/169] libgpg-error-0:1.55-2.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [127/169] npth-0:1.8-3.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [128/169] tpm2-tss-0:4.1.3-8.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [129/169] gnutls-0:3.8.10-3.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [130/169] libksba-0:1.6.7-4.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [131/169] openldap-0:2.6.10-4.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [132/169] libusb1-0:1.0.29-4.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [133/169] libidn2-0:2.3.8-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [134/169] libtasn1-0:4.20.0-2.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [135/169] libunistring-0:1.1-10.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [136/169] nettle-0:3.10.1-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [137/169] p11-kit-0:0.25.8-1.fc43.x86_6 100% | 
0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [138/169] cyrus-sasl-lib-0:2.1.28-33.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [139/169] libevent-0:2.1.12-16.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [140/169] libtool-ltdl-0:2.5.4-7.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [141/169] libffi-0:3.5.1-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [142/169] gdbm-libs-1:1.23-10.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [143/169] libgcc-0:15.2.1-1.fc43.2.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [144/169] binutils-0:2.45-1.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [145/169] p11-kit-trust-0:0.25.8-1.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [146/169] alternatives-0:1.33-2.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [147/169] jansson-0:2.14-3.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [148/169] pkgconf-pkg-config-0:2.3.0-3. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [149/169] pkgconf-0:2.3.0-3.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [150/169] pkgconf-m4-0:2.3.0-3.fc43.noa 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [151/169] libpkgconf-0:2.3.0-3.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [152/169] libstdc++-0:15.2.1-1.fc43.2.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [153/169] libgomp-0:15.2.1-1.fc43.2.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [154/169] fedora-release-0:43-0.22.noar 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [155/169] systemd-standalone-sysusers-0 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [156/169] gdb-minimal-0:16.3-5.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [157/169] xxhash-libs-0:0.8.3-3.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [158/169] fedora-release-identity-basic 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [159/169] libcurl-0:8.15.0-2.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [160/169] krb5-libs-0:1.21.3-7.fc43.x86 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [161/169] libbrotli-0:1.1.0-9.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [162/169] libnghttp2-0:1.66.0-2.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [163/169] libpsl-0:0.21.5-6.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [164/169] libssh-0:0.11.3-1.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [165/169] keyutils-libs-0:1.6.3-6.fc43. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [166/169] libcom_err-0:1.47.3-2.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [167/169] libverto-0:0.3.2-11.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [168/169] publicsuffix-list-dafsa-0:202 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [169/169] libssh-config-0:0.11.3-1.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded -------------------------------------------------------------------------------- [169/169] Total 100% | 0.0 B/s | 0.0 B | 00m00s Running transaction Importing OpenPGP key 0x31645531: UserID : "Fedora (43) " Fingerprint: C6E7F081CF80E13146676E88829B606631645531 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-43-primary The key was successfully imported. [ 1/171] Verify package files 100% | 751.0 B/s | 169.0 B | 00m00s >>> Running %pretrans scriptlet: filesystem-0:3.18-50.fc43.x86_64 >>> Finished %pretrans scriptlet: filesystem-0:3.18-50.fc43.x86_64 >>> [RPM] /var/lib/mock/fedora-43-x86_64-1758760224.854724/root/var/cache/dnf/co [ 2/171] Prepare transaction 100% | 3.8 KiB/s | 169.0 B | 00m00s [ 3/171] Installing libgcc-0:15.2.1-1. 100% | 261.9 MiB/s | 268.2 KiB | 00m00s [ 4/171] Installing libssh-config-0:0. 100% | 0.0 B/s | 816.0 B | 00m00s [ 5/171] Installing publicsuffix-list- 100% | 0.0 B/s | 69.8 KiB | 00m00s [ 6/171] Installing fedora-release-ide 100% | 0.0 B/s | 916.0 B | 00m00s [ 7/171] Installing fedora-gpg-keys-0: 100% | 43.7 MiB/s | 179.0 KiB | 00m00s [ 8/171] Installing fedora-repos-0:43- 100% | 0.0 B/s | 5.7 KiB | 00m00s [ 9/171] Installing fedora-release-com 100% | 24.2 MiB/s | 24.7 KiB | 00m00s [ 10/171] Installing fedora-release-0:4 100% | 20.2 KiB/s | 124.0 B | 00m00s >>> Running sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch >>> Finished sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch >>> Scriptlet output: >>> Creating group 'adm' with GID 4. >>> Creating group 'audio' with GID 63. 
>>> Creating group 'cdrom' with GID 11. >>> Creating group 'clock' with GID 103. >>> Creating group 'dialout' with GID 18. >>> Creating group 'disk' with GID 6. >>> Creating group 'floppy' with GID 19. >>> Creating group 'ftp' with GID 50. >>> Creating group 'games' with GID 20. >>> Creating group 'input' with GID 104. >>> Creating group 'kmem' with GID 9. >>> Creating group 'kvm' with GID 36. >>> Creating group 'lock' with GID 54. >>> Creating group 'lp' with GID 7. >>> Creating group 'mail' with GID 12. >>> Creating group 'man' with GID 15. >>> Creating group 'mem' with GID 8. >>> Creating group 'nobody' with GID 65534. >>> Creating group 'render' with GID 105. >>> Creating group 'root' with GID 0. >>> Creating group 'sgx' with GID 106. >>> Creating group 'sys' with GID 3. >>> Creating group 'tape' with GID 33. >>> Creating group 'tty' with GID 5. >>> Creating group 'users' with GID 100. >>> Creating group 'utmp' with GID 22. >>> Creating group 'video' with GID 39. >>> Creating group 'wheel' with GID 10. >>> Creating user 'adm' (adm) with UID 3 and GID 4. >>> Creating group 'bin' with GID 1. >>> Creating user 'bin' (bin) with UID 1 and GID 1. >>> Creating group 'daemon' with GID 2. >>> Creating user 'daemon' (daemon) with UID 2 and GID 2. >>> Creating user 'ftp' (FTP User) with UID 14 and GID 50. >>> Creating user 'games' (games) with UID 12 and GID 100. >>> Creating user 'halt' (halt) with UID 7 and GID 0. >>> Creating user 'lp' (lp) with UID 4 and GID 7. >>> Creating user 'mail' (mail) with UID 8 and GID 12. >>> Creating user 'nobody' (Kernel Overflow User) with UID 65534 and GID 65534. >>> Creating user 'operator' (operator) with UID 11 and GID 0. >>> Creating user 'root' (Super User) with UID 0 and GID 0. >>> Creating user 'shutdown' (shutdown) with UID 6 and GID 0. >>> Creating user 'sync' (sync) with UID 5 and GID 0. >>> [ 11/171] Installing setup-0:2.15.0-26. 
100% | 51.0 MiB/s | 730.6 KiB | 00m00s >>> [RPM] /etc/hosts created as /etc/hosts.rpmnew [ 12/171] Installing filesystem-0:3.18- 100% | 2.8 MiB/s | 212.8 KiB | 00m00s [ 13/171] Installing pkgconf-m4-0:2.3.0 100% | 0.0 B/s | 14.8 KiB | 00m00s [ 14/171] Installing pcre2-syntax-0:10. 100% | 271.2 MiB/s | 277.8 KiB | 00m00s [ 15/171] Installing ncurses-base-0:6.5 100% | 86.3 MiB/s | 353.5 KiB | 00m00s [ 16/171] Installing bash-0:5.3.0-2.fc4 100% | 263.4 MiB/s | 8.4 MiB | 00m00s [ 17/171] Installing glibc-common-0:2.4 100% | 63.8 MiB/s | 1.0 MiB | 00m00s [ 18/171] Installing glibc-gconv-extra- 100% | 281.1 MiB/s | 7.3 MiB | 00m00s [ 19/171] Installing glibc-0:2.42-4.fc4 100% | 186.2 MiB/s | 6.7 MiB | 00m00s [ 20/171] Installing ncurses-libs-0:6.5 100% | 310.1 MiB/s | 952.8 KiB | 00m00s [ 21/171] Installing glibc-minimal-lang 100% | 0.0 B/s | 124.0 B | 00m00s [ 22/171] Installing zlib-ng-compat-0:2 100% | 135.2 MiB/s | 138.4 KiB | 00m00s [ 23/171] Installing bzip2-libs-0:1.0.8 100% | 79.8 MiB/s | 81.7 KiB | 00m00s [ 24/171] Installing libgpg-error-0:1.5 100% | 60.0 MiB/s | 921.1 KiB | 00m00s [ 25/171] Installing libstdc++-0:15.2.1 100% | 406.3 MiB/s | 2.8 MiB | 00m00s [ 26/171] Installing xz-libs-1:5.8.1-2. 100% | 213.8 MiB/s | 218.9 KiB | 00m00s [ 27/171] Installing libassuan-0:2.5.7- 100% | 161.7 MiB/s | 165.6 KiB | 00m00s [ 28/171] Installing libgcrypt-0:1.11.1 100% | 393.8 MiB/s | 1.6 MiB | 00m00s [ 29/171] Installing readline-0:8.3-2.f 100% | 250.9 MiB/s | 513.9 KiB | 00m00s [ 30/171] Installing gmp-1:6.3.0-4.fc43 100% | 397.2 MiB/s | 813.5 KiB | 00m00s [ 31/171] Installing libuuid-0:2.41.1-1 100% | 0.0 B/s | 38.3 KiB | 00m00s [ 32/171] Installing popt-0:1.19-9.fc43 100% | 68.1 MiB/s | 139.4 KiB | 00m00s [ 33/171] Installing npth-0:1.8-3.fc43. 
100% | 0.0 B/s | 50.7 KiB | 00m00s [ 34/171] Installing libblkid-0:2.41.1- 100% | 257.4 MiB/s | 263.5 KiB | 00m00s [ 35/171] Installing libxcrypt-0:4.4.38 100% | 280.4 MiB/s | 287.2 KiB | 00m00s [ 36/171] Installing libzstd-0:1.5.7-2. 100% | 391.2 MiB/s | 801.1 KiB | 00m00s [ 37/171] Installing elfutils-libelf-0: 100% | 388.8 MiB/s | 1.2 MiB | 00m00s [ 38/171] Installing sqlite-libs-0:3.50 100% | 379.1 MiB/s | 1.5 MiB | 00m00s [ 39/171] Installing gnupg2-gpgconf-0:2 100% | 20.5 MiB/s | 252.0 KiB | 00m00s [ 40/171] Installing libattr-0:2.5.2-6. 100% | 0.0 B/s | 25.4 KiB | 00m00s [ 41/171] Installing libacl-0:2.3.2-4.f 100% | 0.0 B/s | 36.8 KiB | 00m00s [ 42/171] Installing libtasn1-0:4.20.0- 100% | 173.9 MiB/s | 178.1 KiB | 00m00s [ 43/171] Installing libunistring-0:1.1 100% | 345.3 MiB/s | 1.7 MiB | 00m00s [ 44/171] Installing libidn2-0:2.3.8-2. 100% | 54.6 MiB/s | 558.7 KiB | 00m00s [ 45/171] Installing crypto-policies-0: 100% | 42.0 MiB/s | 172.0 KiB | 00m00s [ 46/171] Installing dwz-0:0.16-2.fc43. 100% | 20.1 MiB/s | 288.5 KiB | 00m00s [ 47/171] Installing gnupg2-verify-0:2. 100% | 28.5 MiB/s | 349.9 KiB | 00m00s [ 48/171] Installing mpfr-0:4.2.2-2.fc4 100% | 271.6 MiB/s | 834.4 KiB | 00m00s [ 49/171] Installing gawk-0:5.3.2-2.fc4 100% | 106.8 MiB/s | 1.8 MiB | 00m00s [ 50/171] Installing libksba-0:1.6.7-4. 100% | 195.8 MiB/s | 401.1 KiB | 00m00s [ 51/171] Installing unzip-0:6.0-67.fc4 100% | 29.3 MiB/s | 389.8 KiB | 00m00s [ 52/171] Installing file-libs-0:5.46-7 100% | 658.7 MiB/s | 11.9 MiB | 00m00s [ 53/171] Installing file-0:5.46-7.fc43 100% | 8.3 MiB/s | 101.7 KiB | 00m00s [ 54/171] Installing pcre2-0:10.46-1.fc 100% | 341.4 MiB/s | 699.1 KiB | 00m00s [ 55/171] Installing grep-0:3.12-2.fc43 100% | 66.8 MiB/s | 1.0 MiB | 00m00s [ 56/171] Installing xz-1:5.8.1-2.fc43. 
100% | 78.3 MiB/s | 1.3 MiB | 00m00s [ 57/171] Installing libeconf-0:0.7.9-2 100% | 65.0 MiB/s | 66.5 KiB | 00m00s [ 58/171] Installing libcap-ng-0:0.8.5- 100% | 69.2 MiB/s | 70.8 KiB | 00m00s [ 59/171] Installing audit-libs-0:4.1.1 100% | 186.3 MiB/s | 381.5 KiB | 00m00s [ 60/171] Installing pam-libs-0:1.7.1-3 100% | 126.0 MiB/s | 129.0 KiB | 00m00s [ 61/171] Installing libcap-0:2.76-3.fc 100% | 17.4 MiB/s | 214.3 KiB | 00m00s [ 62/171] Installing systemd-libs-0:258 100% | 387.5 MiB/s | 2.3 MiB | 00m00s [ 63/171] Installing libsmartcols-0:2.4 100% | 177.3 MiB/s | 181.6 KiB | 00m00s [ 64/171] Installing libsepol-0:3.9-2.f 100% | 401.8 MiB/s | 822.9 KiB | 00m00s [ 65/171] Installing libselinux-0:3.9-4 100% | 189.8 MiB/s | 194.4 KiB | 00m00s [ 66/171] Installing sed-0:4.9-5.fc43.x 100% | 56.3 MiB/s | 865.5 KiB | 00m00s [ 67/171] Installing findutils-1:4.10.0 100% | 109.3 MiB/s | 1.9 MiB | 00m00s [ 68/171] Installing libmount-0:2.41.1- 100% | 364.9 MiB/s | 373.7 KiB | 00m00s [ 69/171] Installing lz4-libs-0:1.10.0- 100% | 158.6 MiB/s | 162.5 KiB | 00m00s [ 70/171] Installing lua-libs-0:5.4.8-2 100% | 275.3 MiB/s | 281.9 KiB | 00m00s [ 71/171] Installing json-c-0:0.18-7.fc 100% | 82.0 MiB/s | 84.0 KiB | 00m00s [ 72/171] Installing libffi-0:3.5.1-2.f 100% | 83.0 MiB/s | 85.0 KiB | 00m00s [ 73/171] Installing p11-kit-0:0.25.8-1 100% | 120.6 MiB/s | 2.3 MiB | 00m00s [ 74/171] Installing alternatives-0:1.3 100% | 5.7 MiB/s | 63.8 KiB | 00m00s [ 75/171] Installing p11-kit-trust-0:0. 100% | 20.8 MiB/s | 448.2 KiB | 00m00s [ 76/171] Installing zstd-0:1.5.7-2.fc4 100% | 106.9 MiB/s | 1.7 MiB | 00m00s [ 77/171] Installing util-linux-core-0: 100% | 82.2 MiB/s | 1.5 MiB | 00m00s [ 78/171] Installing tar-2:1.35-6.fc43. 
100% | 147.9 MiB/s | 3.0 MiB | 00m00s [ 79/171] Installing libsemanage-0:3.9- 100% | 303.0 MiB/s | 310.3 KiB | 00m00s [ 80/171] Installing systemd-standalone 100% | 23.9 MiB/s | 294.1 KiB | 00m00s [ 81/171] Installing libusb1-0:1.0.29-4 100% | 168.9 MiB/s | 172.9 KiB | 00m00s [ 82/171] Installing zip-0:3.0-44.fc43. 100% | 52.5 MiB/s | 698.4 KiB | 00m00s [ 83/171] Installing gnupg2-keyboxd-0:2 100% | 33.0 MiB/s | 202.7 KiB | 00m00s [ 84/171] Installing libpsl-0:0.21.5-6. 100% | 75.7 MiB/s | 77.5 KiB | 00m00s [ 85/171] Installing liblastlog2-0:2.41 100% | 7.0 MiB/s | 36.0 KiB | 00m00s [ 86/171] Installing libfdisk-0:2.41.1- 100% | 186.3 MiB/s | 381.5 KiB | 00m00s [ 87/171] Installing nettle-0:3.10.1-2. 100% | 258.4 MiB/s | 793.7 KiB | 00m00s [ 88/171] Installing gnutls-0:3.8.10-3. 100% | 349.0 MiB/s | 3.8 MiB | 00m00s [ 89/171] Installing libxml2-0:2.12.10- 100% | 100.3 MiB/s | 1.7 MiB | 00m00s [ 90/171] Installing bzip2-0:1.0.8-21.f 100% | 8.1 MiB/s | 99.8 KiB | 00m00s [ 91/171] Installing add-determinism-0: 100% | 135.8 MiB/s | 2.4 MiB | 00m00s [ 92/171] Installing build-reproducibil 100% | 0.0 B/s | 1.0 KiB | 00m00s [ 93/171] Installing cpio-0:2.15-6.fc43 100% | 73.3 MiB/s | 1.1 MiB | 00m00s [ 94/171] Installing diffutils-0:3.12-3 100% | 97.6 MiB/s | 1.6 MiB | 00m00s [ 95/171] Installing ed-0:1.22.2-1.fc43 100% | 12.2 MiB/s | 150.4 KiB | 00m00s [ 96/171] Installing patch-0:2.8-2.fc43 100% | 18.3 MiB/s | 224.3 KiB | 00m00s [ 97/171] Installing libtool-ltdl-0:2.5 100% | 69.6 MiB/s | 71.2 KiB | 00m00s [ 98/171] Installing gdbm-libs-1:1.23-1 100% | 128.5 MiB/s | 131.6 KiB | 00m00s [ 99/171] Installing cyrus-sasl-lib-0:2 100% | 135.1 MiB/s | 2.3 MiB | 00m00s [100/171] Installing jansson-0:2.14-3.f 100% | 88.3 MiB/s | 90.5 KiB | 00m00s [101/171] Installing libpkgconf-0:2.3.0 100% | 0.0 B/s | 79.2 KiB | 00m00s [102/171] Installing pkgconf-0:2.3.0-3. 
100% | 7.4 MiB/s | 91.0 KiB | 00m00s [103/171] Installing pkgconf-pkg-config 100% | 161.2 KiB/s | 1.8 KiB | 00m00s [104/171] Installing libgomp-0:15.2.1-1 100% | 264.9 MiB/s | 542.5 KiB | 00m00s [105/171] Installing xxhash-libs-0:0.8. 100% | 89.4 MiB/s | 91.6 KiB | 00m00s [106/171] Installing libbrotli-0:1.1.0- 100% | 272.0 MiB/s | 835.6 KiB | 00m00s [107/171] Installing libnghttp2-0:1.66. 100% | 159.5 MiB/s | 163.3 KiB | 00m00s [108/171] Installing keyutils-libs-0:1. 100% | 0.0 B/s | 55.7 KiB | 00m00s [109/171] Installing libcom_err-0:1.47. 100% | 0.0 B/s | 64.2 KiB | 00m00s [110/171] Installing libverto-0:0.3.2-1 100% | 0.0 B/s | 27.2 KiB | 00m00s [111/171] Installing filesystem-srpm-ma 100% | 0.0 B/s | 38.9 KiB | 00m00s [112/171] Installing elfutils-default-y 100% | 408.6 KiB/s | 2.0 KiB | 00m00s [113/171] Installing elfutils-libs-0:0. 100% | 223.1 MiB/s | 685.2 KiB | 00m00s [114/171] Installing coreutils-common-0 100% | 389.4 MiB/s | 11.3 MiB | 00m00s [115/171] Installing openssl-libs-1:3.5 100% | 423.9 MiB/s | 8.9 MiB | 00m00s [116/171] Installing coreutils-0:9.7-5. 100% | 165.0 MiB/s | 5.4 MiB | 00m00s [117/171] Installing ca-certificates-0: 100% | 2.0 MiB/s | 2.5 MiB | 00m01s [118/171] Installing libarchive-0:3.8.1 100% | 232.7 MiB/s | 953.1 KiB | 00m00s [119/171] Installing krb5-libs-0:1.21.3 100% | 152.8 MiB/s | 2.3 MiB | 00m00s >>> Running sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64 >>> Finished sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64 >>> Scriptlet output: >>> Creating group 'tss' with GID 59. >>> Creating user 'tss' (Account used for TPM access) with UID 59 and GID 59. >>> [120/171] Installing tpm2-tss-0:4.1.3-8 100% | 262.0 MiB/s | 1.6 MiB | 00m00s [121/171] Installing ima-evm-utils-libs 100% | 60.5 MiB/s | 62.0 KiB | 00m00s [122/171] Installing gnupg2-gpg-agent-0 100% | 31.4 MiB/s | 675.4 KiB | 00m00s [123/171] Installing libssh-0:0.11.3-1. 
100% | 277.9 MiB/s | 569.2 KiB | 00m00s [124/171] Installing gzip-0:1.13-4.fc43 100% | 29.6 MiB/s | 394.4 KiB | 00m00s [125/171] Installing rpm-sequoia-0:1.9. 100% | 354.1 MiB/s | 2.5 MiB | 00m00s [126/171] Installing rpm-libs-0:5.99.91 100% | 304.4 MiB/s | 935.3 KiB | 00m00s [127/171] Installing libfsverity-0:1.6- 100% | 0.0 B/s | 29.5 KiB | 00m00s [128/171] Installing libevent-0:2.1.12- 100% | 288.7 MiB/s | 886.8 KiB | 00m00s [129/171] Installing openldap-0:2.6.10- 100% | 216.0 MiB/s | 663.7 KiB | 00m00s [130/171] Installing libcurl-0:8.15.0-2 100% | 294.4 MiB/s | 904.3 KiB | 00m00s [131/171] Installing elfutils-debuginfo 100% | 6.5 MiB/s | 86.2 KiB | 00m00s [132/171] Installing elfutils-0:0.193-3 100% | 153.6 MiB/s | 2.9 MiB | 00m00s [133/171] Installing binutils-0:2.45-1. 100% | 327.7 MiB/s | 26.5 MiB | 00m00s [134/171] Installing gdb-minimal-0:16.3 100% | 288.2 MiB/s | 13.3 MiB | 00m00s [135/171] Installing debugedit-0:5.2-3. 100% | 16.3 MiB/s | 217.3 KiB | 00m00s [136/171] Installing curl-0:8.15.0-2.fc 100% | 21.1 MiB/s | 476.3 KiB | 00m00s [137/171] Installing rpm-0:5.99.91-5.fc 100% | 77.9 MiB/s | 2.5 MiB | 00m00s [138/171] Installing efi-srpm-macros-0: 100% | 40.2 MiB/s | 41.1 KiB | 00m00s [139/171] Installing java-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [140/171] Installing lua-srpm-macros-0: 100% | 0.0 B/s | 1.9 KiB | 00m00s [141/171] Installing tree-sitter-srpm-m 100% | 0.0 B/s | 9.3 KiB | 00m00s [142/171] Installing zig-srpm-macros-0: 100% | 0.0 B/s | 1.7 KiB | 00m00s [143/171] Installing gnupg2-dirmngr-0:2 100% | 30.3 MiB/s | 621.1 KiB | 00m00s [144/171] Installing gnupg2-0:2.4.8-4.f 100% | 218.4 MiB/s | 6.6 MiB | 00m00s [145/171] Installing rpm-sign-libs-0:5. 100% | 39.6 MiB/s | 40.6 KiB | 00m00s [146/171] Installing rpm-build-libs-0:5 100% | 262.9 MiB/s | 269.2 KiB | 00m00s [147/171] Installing gpgverify-0:2.2-3. 
100% | 0.0 B/s | 9.4 KiB | 00m00s [148/171] Installing rust-srpm-macros-0 100% | 0.0 B/s | 5.6 KiB | 00m00s [149/171] Installing qt6-srpm-macros-0: 100% | 0.0 B/s | 740.0 B | 00m00s [150/171] Installing qt5-srpm-macros-0: 100% | 0.0 B/s | 776.0 B | 00m00s [151/171] Installing perl-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [152/171] Installing package-notes-srpm 100% | 0.0 B/s | 2.0 KiB | 00m00s [153/171] Installing openblas-srpm-macr 100% | 0.0 B/s | 392.0 B | 00m00s [154/171] Installing ocaml-srpm-macros- 100% | 0.0 B/s | 2.1 KiB | 00m00s [155/171] Installing kernel-srpm-macros 100% | 0.0 B/s | 2.3 KiB | 00m00s [156/171] Installing gnat-srpm-macros-0 100% | 0.0 B/s | 1.3 KiB | 00m00s [157/171] Installing ghc-srpm-macros-0: 100% | 0.0 B/s | 1.0 KiB | 00m00s [158/171] Installing gap-srpm-macros-0: 100% | 0.0 B/s | 2.6 KiB | 00m00s [159/171] Installing fpc-srpm-macros-0: 100% | 0.0 B/s | 420.0 B | 00m00s [160/171] Installing ansible-srpm-macro 100% | 0.0 B/s | 36.2 KiB | 00m00s [161/171] Installing rpm-build-0:5.99.9 100% | 20.5 MiB/s | 294.4 KiB | 00m00s [162/171] Installing pyproject-srpm-mac 100% | 2.4 MiB/s | 2.5 KiB | 00m00s [163/171] Installing redhat-rpm-config- 100% | 92.3 MiB/s | 189.1 KiB | 00m00s [164/171] Installing forge-srpm-macros- 100% | 0.0 B/s | 40.3 KiB | 00m00s [165/171] Installing fonts-srpm-macros- 100% | 55.7 MiB/s | 57.0 KiB | 00m00s [166/171] Installing go-srpm-macros-0:3 100% | 0.0 B/s | 63.0 KiB | 00m00s [167/171] Installing python-srpm-macros 100% | 0.0 B/s | 52.8 KiB | 00m00s [168/171] Installing which-0:2.23-3.fc4 100% | 6.4 MiB/s | 85.7 KiB | 00m00s [169/171] Installing util-linux-0:2.41. 100% | 102.1 MiB/s | 3.6 MiB | 00m00s [170/171] Installing shadow-utils-2:4.1 100% | 136.9 MiB/s | 4.0 MiB | 00m00s [171/171] Installing info-0:7.2-6.fc43. 100% | 215.0 KiB/s | 354.3 KiB | 00m02s Warning: skipped OpenPGP checks for 3 packages from repository: copr_base Complete! 
Finish: installing minimal buildroot with dnf5 Start: creating root cache Finish: creating root cache Finish: chroot init INFO: Installed packages: INFO: add-determinism-0.6.0-2.fc43.x86_64 alternatives-1.33-2.fc43.x86_64 ansible-srpm-macros-1-18.1.fc43.noarch audit-libs-4.1.1-2.fc43.x86_64 bash-5.3.0-2.fc43.x86_64 binutils-2.45-1.fc43.x86_64 build-reproducibility-srpm-macros-0.6.0-2.fc43.noarch bzip2-1.0.8-21.fc43.x86_64 bzip2-libs-1.0.8-21.fc43.x86_64 ca-certificates-2025.2.80_v9.0.304-1.1.fc43.noarch coreutils-9.7-5.fc43.x86_64 coreutils-common-9.7-5.fc43.x86_64 cpio-2.15-6.fc43.x86_64 crypto-policies-20250714-4.gitcd6043a.fc43.noarch curl-8.15.0-2.fc43.x86_64 cyrus-sasl-lib-2.1.28-33.fc43.x86_64 debugedit-5.2-3.fc43.x86_64 diffutils-3.12-3.fc43.x86_64 dwz-0.16-2.fc43.x86_64 ed-1.22.2-1.fc43.x86_64 efi-srpm-macros-6-4.fc43.noarch elfutils-0.193-3.fc43.x86_64 elfutils-debuginfod-client-0.193-3.fc43.x86_64 elfutils-default-yama-scope-0.193-3.fc43.noarch elfutils-libelf-0.193-3.fc43.x86_64 elfutils-libs-0.193-3.fc43.x86_64 fedora-gpg-keys-43-0.4.noarch fedora-release-43-0.22.noarch fedora-release-common-43-0.22.noarch fedora-release-identity-basic-43-0.22.noarch fedora-repos-43-0.4.noarch file-5.46-7.fc43.x86_64 file-libs-5.46-7.fc43.x86_64 filesystem-3.18-50.fc43.x86_64 filesystem-srpm-macros-3.18-50.fc43.noarch findutils-4.10.0-6.fc43.x86_64 fonts-srpm-macros-2.0.5-23.fc43.noarch forge-srpm-macros-0.4.0-3.fc43.noarch fpc-srpm-macros-1.3-15.fc43.noarch gap-srpm-macros-1-1.fc43.noarch gawk-5.3.2-2.fc43.x86_64 gdb-minimal-16.3-5.fc43.x86_64 gdbm-libs-1.23-10.fc43.x86_64 ghc-srpm-macros-1.9.2-3.fc43.noarch glibc-2.42-4.fc43.x86_64 glibc-common-2.42-4.fc43.x86_64 glibc-gconv-extra-2.42-4.fc43.x86_64 glibc-minimal-langpack-2.42-4.fc43.x86_64 gmp-6.3.0-4.fc43.x86_64 gnat-srpm-macros-6-8.fc43.noarch gnupg2-2.4.8-4.fc43.x86_64 gnupg2-dirmngr-2.4.8-4.fc43.x86_64 gnupg2-gpg-agent-2.4.8-4.fc43.x86_64 gnupg2-gpgconf-2.4.8-4.fc43.x86_64 gnupg2-keyboxd-2.4.8-4.fc43.x86_64 
gnupg2-verify-2.4.8-4.fc43.x86_64 gnutls-3.8.10-3.fc43.x86_64 go-srpm-macros-3.8.0-1.fc43.noarch gpg-pubkey-c6e7f081cf80e13146676e88829b606631645531-66b6dccf gpgverify-2.2-3.fc43.noarch grep-3.12-2.fc43.x86_64 gzip-1.13-4.fc43.x86_64 ima-evm-utils-libs-1.6.2-6.fc43.x86_64 info-7.2-6.fc43.x86_64 jansson-2.14-3.fc43.x86_64 java-srpm-macros-1-7.fc43.noarch json-c-0.18-7.fc43.x86_64 kernel-srpm-macros-1.0-27.fc43.noarch keyutils-libs-1.6.3-6.fc43.x86_64 krb5-libs-1.21.3-7.fc43.x86_64 libacl-2.3.2-4.fc43.x86_64 libarchive-3.8.1-3.fc43.x86_64 libassuan-2.5.7-4.fc43.x86_64 libattr-2.5.2-6.fc43.x86_64 libblkid-2.41.1-16.fc43.x86_64 libbrotli-1.1.0-9.fc43.x86_64 libcap-2.76-3.fc43.x86_64 libcap-ng-0.8.5-7.fc43.x86_64 libcom_err-1.47.3-2.fc43.x86_64 libcurl-8.15.0-2.fc43.x86_64 libeconf-0.7.9-2.fc43.x86_64 libevent-2.1.12-16.fc43.x86_64 libfdisk-2.41.1-16.fc43.x86_64 libffi-3.5.1-2.fc43.x86_64 libfsverity-1.6-3.fc43.x86_64 libgcc-15.2.1-1.fc43.2.x86_64 libgcrypt-1.11.1-2.fc43.x86_64 libgomp-15.2.1-1.fc43.2.x86_64 libgpg-error-1.55-2.fc43.x86_64 libidn2-2.3.8-2.fc43.x86_64 libksba-1.6.7-4.fc43.x86_64 liblastlog2-2.41.1-16.fc43.x86_64 libmount-2.41.1-16.fc43.x86_64 libnghttp2-1.66.0-2.fc43.x86_64 libpkgconf-2.3.0-3.fc43.x86_64 libpsl-0.21.5-6.fc43.x86_64 libselinux-3.9-4.fc43.x86_64 libsemanage-3.9-3.fc43.x86_64 libsepol-3.9-2.fc43.x86_64 libsmartcols-2.41.1-16.fc43.x86_64 libssh-0.11.3-1.fc43.x86_64 libssh-config-0.11.3-1.fc43.noarch libstdc++-15.2.1-1.fc43.2.x86_64 libtasn1-4.20.0-2.fc43.x86_64 libtool-ltdl-2.5.4-7.fc43.x86_64 libunistring-1.1-10.fc43.x86_64 libusb1-1.0.29-4.fc43.x86_64 libuuid-2.41.1-16.fc43.x86_64 libverto-0.3.2-11.fc43.x86_64 libxcrypt-4.4.38-8.fc43.x86_64 libxml2-2.12.10-4.fc43.x86_64 libzstd-1.5.7-2.fc43.x86_64 lua-libs-5.4.8-2.fc43.x86_64 lua-srpm-macros-1-16.fc43.noarch lz4-libs-1.10.0-3.fc43.x86_64 mpfr-4.2.2-2.fc43.x86_64 ncurses-base-6.5-7.20250614.fc43.noarch ncurses-libs-6.5-7.20250614.fc43.x86_64 nettle-3.10.1-2.fc43.x86_64 
npth-1.8-3.fc43.x86_64 ocaml-srpm-macros-11-2.fc43.noarch openblas-srpm-macros-2-20.fc43.noarch openldap-2.6.10-4.fc43.x86_64 openssl-libs-3.5.1-2.fc43.x86_64 p11-kit-0.25.8-1.fc43.x86_64 p11-kit-trust-0.25.8-1.fc43.x86_64 package-notes-srpm-macros-0.5-14.fc43.noarch pam-libs-1.7.1-3.fc43.x86_64 patch-2.8-2.fc43.x86_64 pcre2-10.46-1.fc43.x86_64 pcre2-syntax-10.46-1.fc43.noarch perl-srpm-macros-1-60.fc43.noarch pkgconf-2.3.0-3.fc43.x86_64 pkgconf-m4-2.3.0-3.fc43.noarch pkgconf-pkg-config-2.3.0-3.fc43.x86_64 popt-1.19-9.fc43.x86_64 publicsuffix-list-dafsa-20250616-2.fc43.noarch pyproject-srpm-macros-1.18.4-1.fc43.noarch python-srpm-macros-3.14-5.fc43.noarch qt5-srpm-macros-5.15.17-2.fc43.noarch qt6-srpm-macros-6.9.2-1.fc43.noarch readline-8.3-2.fc43.x86_64 redhat-rpm-config-343-11.fc43.noarch rpm-5.99.91-5.fc43.x86_64 rpm-build-5.99.91-5.fc43.x86_64 rpm-build-libs-5.99.91-5.fc43.x86_64 rpm-libs-5.99.91-5.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 rpm-sign-libs-5.99.91-5.fc43.x86_64 rust-srpm-macros-26.4-1.fc43.noarch sed-4.9-5.fc43.x86_64 setup-2.15.0-26.fc43.noarch shadow-utils-4.18.0-3.fc43.x86_64 sqlite-libs-3.50.2-2.fc43.x86_64 systemd-libs-258-1.fc43.x86_64 systemd-standalone-sysusers-258-1.fc43.x86_64 tar-1.35-6.fc43.x86_64 tpm2-tss-4.1.3-8.fc43.x86_64 tree-sitter-srpm-macros-0.4.2-1.fc43.noarch unzip-6.0-67.fc43.x86_64 util-linux-2.41.1-16.fc43.x86_64 util-linux-core-2.41.1-16.fc43.x86_64 which-2.23-3.fc43.x86_64 xxhash-libs-0.8.3-3.fc43.x86_64 xz-5.8.1-2.fc43.x86_64 xz-libs-5.8.1-2.fc43.x86_64 zig-srpm-macros-1-5.fc43.noarch zip-3.0-44.fc43.x86_64 zlib-ng-compat-2.2.5-2.fc43.x86_64 zstd-1.5.7-2.fc43.x86_64 Start: buildsrpm Start: rpmbuild -bs Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1753920000 Wrote: /builddir/build/SRPMS/composable_kernel-6.4.2-1.fc43.src.rpm Finish: rpmbuild -bs INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: 
/var/lib/mock/fedora-43-x86_64-1758760224.854724/root/var/log/dnf5.log INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz /bin/tar: Removing leading `/' from member names Finish: buildsrpm INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-i_m0k2ik/composable_kernel/composable_kernel.spec) Config(child) 0 minutes 13 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. Finish: clean chroot INFO: Start(/var/lib/copr-rpmbuild/results/composable_kernel-6.4.2-1.fc43.src.rpm) Config(fedora-43-x86_64) Start(bootstrap): chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1758760224.854724/root. INFO: reusing tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1758760224.854724/root. INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start(bootstrap): cleaning package manager metadata Finish(bootstrap): cleaning package manager metadata Finish(bootstrap): chroot init Start: chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1758760224.854724/root. 
INFO: calling preinit hooks INFO: enabled root cache Start: unpacking root cache Finish: unpacking root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-5.99.91-5.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.2.17.0-1.fc43.x86_64 dnf5-plugins-5.2.17.0-1.fc43.x86_64 Finish: chroot init Start: build phase for composable_kernel-6.4.2-1.fc43.src.rpm Start: build setup for composable_kernel-6.4.2-1.fc43.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1753920000 Wrote: /builddir/build/SRPMS/composable_kernel-6.4.2-1.fc43.src.rpm Updating and loading repositories: Copr repository 100% | 102.1 KiB/s | 1.5 KiB | 00m00s fedora 100% | 203.4 KiB/s | 26.0 KiB | 00m00s updates 100% | 424.8 KiB/s | 30.2 KiB | 00m00s Repositories loaded. Package Arch Version Repository Size Installing: cmake x86_64 3.31.6-4.fc43 fedora 34.5 MiB fdupes x86_64 1:2.4.0-2.fc43 fedora 118.1 KiB gcc-c++ x86_64 15.2.1-1.fc43.2 copr_base 41.4 MiB git x86_64 2.51.0-2.fc43 fedora 56.4 KiB rocm-cmake noarch 7.0.1-1.fc43 copr_base 129.5 KiB rocm-comgr-devel x86_64 20-2.rocm7.0.0.fc43 copr_base 100.5 KiB rocm-compilersupport-macros noarch 20-2.rocm7.0.0.fc43 copr_base 160.0 B rocm-hip-devel x86_64 7.0.1-1.fc43 copr_base 3.0 MiB rocm-rpm-macros noarch 6.4.2-1.fc43 fedora 18.9 KiB rocm-runtime-devel x86_64 7.0.1-1.fc43 copr_base 678.5 KiB Installing dependencies: annobin-docs noarch 12.99-1.fc43 fedora 98.9 KiB annobin-plugin-gcc x86_64 12.99-1.fc43 fedora 1.0 MiB cmake-data noarch 3.31.6-4.fc43 fedora 8.5 MiB cmake-filesystem x86_64 3.31.6-4.fc43 fedora 0.0 B cmake-rpm-macros noarch 3.31.6-4.fc43 fedora 7.7 KiB cpp x86_64 15.2.1-1.fc43.2 copr_base 37.9 MiB emacs-filesystem noarch 1:30.0-5.fc43 fedora 0.0 B environment-modules x86_64 5.6.0-1.fc43 fedora 1.9 MiB expat 
x86_64 2.7.1-3.fc43 fedora 294.2 KiB gcc x86_64 15.2.1-1.fc43.2 copr_base 111.9 MiB gcc-plugin-annobin x86_64 15.2.1-1.fc43.2 copr_base 57.2 KiB git-core x86_64 2.51.0-2.fc43 fedora 23.6 MiB git-core-doc noarch 2.51.0-2.fc43 fedora 17.7 MiB glibc-devel x86_64 2.42-4.fc43 fedora 2.3 MiB groff-base x86_64 1.23.0-10.fc43 fedora 3.8 MiB hipcc x86_64 20-2.rocm7.0.0.fc43 copr_base 634.5 KiB hwdata noarch 0.399-1.fc43 fedora 9.6 MiB jsoncpp x86_64 1.9.6-2.fc43 fedora 257.6 KiB kernel-headers x86_64 6.17.0-0.rc6.49.fc43 fedora 6.7 MiB less x86_64 679-2.fc43 fedora 406.1 KiB libcbor x86_64 0.12.0-6.fc43 fedora 77.8 KiB libdrm x86_64 2.4.125-2.fc43 fedora 395.8 KiB libedit x86_64 3.1-56.20250104cvs.fc43 fedora 240.1 KiB libfido2 x86_64 1.16.0-3.fc43 fedora 238.5 KiB libmpc x86_64 1.3.1-8.fc43 fedora 160.6 KiB libpciaccess x86_64 0.16-16.fc43 fedora 44.5 KiB libpipeline x86_64 1.5.8-3.fc43 fedora 145.1 KiB libstdc++-devel x86_64 15.2.1-1.fc43.2 copr_base 37.3 MiB libtommath x86_64 1.3.1~rc1-6.fc43 fedora 126.4 KiB libuv x86_64 1:1.51.0-2.fc43 fedora 570.2 KiB libxcrypt-devel x86_64 4.4.38-8.fc43 fedora 30.8 KiB make x86_64 1:4.4.1-11.fc43 fedora 1.8 MiB man-db x86_64 2.13.1-2.fc43 fedora 2.9 MiB mpdecimal x86_64 4.0.1-2.fc43 fedora 217.2 KiB ncurses x86_64 6.5-7.20250614.fc43 fedora 609.8 KiB numactl-libs x86_64 2.0.19-3.fc43 fedora 56.9 KiB openssh x86_64 10.0p1-5.fc43 fedora 1.4 MiB openssh-clients x86_64 10.0p1-5.fc43 fedora 2.6 MiB pcre2-utf32 x86_64 10.46-1.fc43 fedora 602.2 KiB perl-AutoLoader noarch 5.74-520.fc43 fedora 20.6 KiB perl-B x86_64 1.89-520.fc43 fedora 501.3 KiB perl-Carp noarch 1.54-520.fc43 fedora 46.6 KiB perl-Class-Struct noarch 0.68-520.fc43 fedora 25.4 KiB perl-Data-Dumper x86_64 2.191-521.fc43 fedora 115.6 KiB perl-Digest noarch 1.20-520.fc43 fedora 35.3 KiB perl-Digest-MD5 x86_64 2.59-520.fc43 fedora 59.7 KiB perl-DynaLoader x86_64 1.57-520.fc43 fedora 32.1 KiB perl-Encode x86_64 4:3.21-520.fc43 fedora 4.7 MiB perl-Errno x86_64 1.38-520.fc43 fedora 
8.4 KiB perl-Error noarch 1:0.17030-2.fc43 fedora 76.7 KiB perl-Exporter noarch 5.79-520.fc43 fedora 54.3 KiB perl-Fcntl x86_64 1.20-520.fc43 fedora 48.8 KiB perl-File-Basename noarch 2.86-520.fc43 fedora 14.0 KiB perl-File-Copy noarch 2.41-520.fc43 fedora 19.7 KiB perl-File-Path noarch 2.18-520.fc43 fedora 63.5 KiB perl-File-Temp noarch 1:0.231.100-520.fc43 fedora 162.3 KiB perl-File-Which noarch 1.27-14.fc43 fedora 30.4 KiB perl-File-stat noarch 1.14-520.fc43 fedora 12.5 KiB perl-FileHandle noarch 2.05-520.fc43 fedora 9.4 KiB perl-Getopt-Long noarch 1:2.58-520.fc43 fedora 144.5 KiB perl-Getopt-Std noarch 1.14-520.fc43 fedora 11.2 KiB perl-Git noarch 2.51.0-2.fc43 fedora 64.4 KiB perl-HTTP-Tiny noarch 0.090-521.fc43 fedora 154.4 KiB perl-IO x86_64 1.55-520.fc43 fedora 147.4 KiB perl-IO-Socket-IP noarch 0.43-521.fc43 fedora 100.3 KiB perl-IO-Socket-SSL noarch 2.095-2.fc43 fedora 714.5 KiB perl-IPC-Open3 noarch 1.24-520.fc43 fedora 27.7 KiB perl-MIME-Base32 noarch 1.303-24.fc43 fedora 30.7 KiB perl-MIME-Base64 x86_64 3.16-520.fc43 fedora 42.0 KiB perl-Net-SSLeay x86_64 1.94-11.fc43 fedora 1.3 MiB perl-POSIX x86_64 2.23-520.fc43 fedora 231.4 KiB perl-PathTools x86_64 3.94-520.fc43 fedora 180.0 KiB perl-Pod-Escapes noarch 1:1.07-520.fc43 fedora 24.9 KiB perl-Pod-Perldoc noarch 3.28.01-521.fc43 fedora 163.7 KiB perl-Pod-Simple noarch 1:3.47-3.fc43 fedora 565.3 KiB perl-Pod-Usage noarch 4:2.05-520.fc43 fedora 86.3 KiB perl-Scalar-List-Utils x86_64 5:1.70-1.fc43 fedora 144.9 KiB perl-SelectSaver noarch 1.02-520.fc43 fedora 2.2 KiB perl-Socket x86_64 4:2.040-2.fc43 fedora 120.3 KiB perl-Storable x86_64 1:3.37-521.fc43 fedora 231.2 KiB perl-Symbol noarch 1.09-520.fc43 fedora 6.8 KiB perl-Term-ANSIColor noarch 5.01-521.fc43 fedora 97.5 KiB perl-Term-Cap noarch 1.18-520.fc43 fedora 29.3 KiB perl-TermReadKey x86_64 2.38-26.fc43 fedora 64.0 KiB perl-Text-ParseWords noarch 3.31-520.fc43 fedora 13.6 KiB perl-Text-Tabs+Wrap noarch 2024.001-520.fc43 fedora 22.6 KiB perl-Time-Local 
noarch 2:1.350-520.fc43 fedora 69.0 KiB perl-URI noarch 5.32-2.fc43 fedora 261.2 KiB perl-base noarch 2.27-520.fc43 fedora 12.6 KiB perl-constant noarch 1.33-521.fc43 fedora 26.2 KiB perl-if noarch 0.61.000-520.fc43 fedora 5.8 KiB perl-interpreter x86_64 4:5.42.0-520.fc43 fedora 118.6 KiB perl-lib x86_64 0.65-520.fc43 fedora 8.5 KiB perl-libnet noarch 3.15-521.fc43 fedora 289.4 KiB perl-libs x86_64 4:5.42.0-520.fc43 fedora 11.5 MiB perl-locale noarch 1.13-520.fc43 fedora 6.1 KiB perl-mro x86_64 1.29-520.fc43 fedora 41.6 KiB perl-overload noarch 1.40-520.fc43 fedora 71.6 KiB perl-overloading noarch 0.02-520.fc43 fedora 4.9 KiB perl-parent noarch 1:0.244-520.fc43 fedora 10.3 KiB perl-podlators noarch 1:6.0.2-520.fc43 fedora 317.5 KiB perl-vars noarch 1.05-520.fc43 fedora 3.9 KiB procps-ng x86_64 4.0.4-7.fc43 fedora 1.0 MiB python-pip-wheel noarch 25.1.1-16.fc43 fedora 1.2 MiB python3 x86_64 3.14.0~rc2-1.fc43 fedora 28.9 KiB python3-libs x86_64 3.14.0~rc2-1.fc43 fedora 42.9 MiB rhash x86_64 1.4.5-3.fc43 fedora 351.1 KiB rocm-clang x86_64 20-2.rocm7.0.0.fc43 copr_base 68.5 MiB rocm-clang-devel x86_64 20-2.rocm7.0.0.fc43 copr_base 26.1 MiB rocm-clang-libs x86_64 20-2.rocm7.0.0.fc43 copr_base 94.1 MiB rocm-clang-runtime-devel x86_64 20-2.rocm7.0.0.fc43 copr_base 8.4 MiB rocm-comgr x86_64 20-2.rocm7.0.0.fc43 copr_base 126.3 MiB rocm-device-libs x86_64 20-2.rocm7.0.0.fc43 copr_base 3.2 MiB rocm-hip x86_64 7.0.1-1.fc43 copr_base 26.7 MiB rocm-libc++ x86_64 20-2.rocm7.0.0.fc43 copr_base 1.3 MiB rocm-libc++-devel x86_64 20-2.rocm7.0.0.fc43 copr_base 15.0 MiB rocm-lld x86_64 20-2.rocm7.0.0.fc43 copr_base 5.9 MiB rocm-llvm x86_64 20-2.rocm7.0.0.fc43 copr_base 52.5 MiB rocm-llvm-devel x86_64 20-2.rocm7.0.0.fc43 copr_base 28.3 MiB rocm-llvm-filesystem x86_64 20-2.rocm7.0.0.fc43 copr_base 0.0 B rocm-llvm-libs x86_64 20-2.rocm7.0.0.fc43 copr_base 91.6 MiB rocm-llvm-static x86_64 20-2.rocm7.0.0.fc43 copr_base 1.9 GiB rocm-runtime x86_64 7.0.1-1.fc43 copr_base 3.2 MiB tcl x86_64 
1:9.0.2-1.fc43 fedora 4.3 MiB tzdata noarch 2025b-3.fc43 fedora 1.6 MiB vim-filesystem noarch 2:9.1.1775-1.fc43 fedora 40.0 B zlib-ng-compat-devel x86_64 2.2.5-2.fc43 fedora 107.0 KiB Transaction Summary: Installing: 137 packages Total size of inbound packages is 537 MiB. Need to download 107 MiB. After this operation, 3 GiB extra will be used (install 3 GiB, remove 0 B). [ 1/137] git-0:2.51.0-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 2/137] rocm-comgr-devel-0:20-2.rocm7 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 3/137] rocm-runtime-devel-0:7.0.1-1. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 4/137] git-core-0:2.51.0-2.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 5/137] git-core-doc-0:2.51.0-2.fc43. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 6/137] perl-File-Basename-0:2.86-520 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 7/137] perl-Getopt-Long-1:2.58-520.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 8/137] perl-Git-0:2.51.0-2.fc43.noar 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 9/137] perl-IPC-Open3-0:1.24-520.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 10/137] perl-PathTools-0:3.94-520.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 11/137] perl-TermReadKey-0:2.38-26.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 12/137] perl-interpreter-4:5.42.0-520 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 13/137] perl-lib-0:0.65-520.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 14/137] cmake-filesystem-0:3.31.6-4.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 15/137] expat-0:2.7.1-3.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 16/137] perl-File-Copy-0:2.41-520.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 17/137] perl-File-Which-0:1.27-14.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 18/137] 
perl-Getopt-Std-0:1.14-520.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 19/137] perl-Scalar-List-Utils-5:1.70 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 20/137] perl-URI-0:5.32-2.fc43.noarch 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 21/137] less-0:679-2.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 22/137] openssh-clients-0:10.0p1-5.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 23/137] perl-Carp-0:1.54-520.fc43.noa 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 24/137] perl-Exporter-0:5.79-520.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 25/137] perl-Pod-Usage-4:2.05-520.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 26/137] perl-Text-ParseWords-0:3.31-5 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 27/137] perl-base-0:2.27-520.fc43.noa 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 28/137] perl-constant-0:1.33-521.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 29/137] perl-overload-0:1.40-520.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 30/137] perl-Error-1:0.17030-2.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 31/137] perl-Fcntl-0:1.20-520.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 32/137] perl-IO-0:1.55-520.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 33/137] perl-POSIX-0:2.23-520.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 34/137] perl-Symbol-0:1.09-520.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 35/137] perl-Errno-0:1.38-520.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 36/137] perl-libs-4:5.42.0-520.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 37/137] perl-DynaLoader-0:1.57-520.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 38/137] perl-vars-0:1.05-520.fc43.noa 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 39/137] 
perl-Data-Dumper-0:2.191-521. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 40/137] perl-MIME-Base32-0:1.303-24.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 41/137] perl-MIME-Base64-0:3.16-520.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 42/137] perl-libnet-0:3.15-521.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 43/137] perl-parent-1:0.244-520.fc43. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 44/137] libedit-0:3.1-56.20250104cvs. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 45/137] libfido2-0:1.16.0-3.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 46/137] openssh-0:10.0p1-5.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 47/137] perl-Pod-Perldoc-0:3.28.01-52 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 48/137] perl-podlators-1:6.0.2-520.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 49/137] perl-mro-0:1.29-520.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 50/137] perl-overloading-0:0.02-520.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 51/137] perl-File-stat-0:1.14-520.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 52/137] perl-SelectSaver-0:1.02-520.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 53/137] perl-Socket-4:2.040-2.fc43.x8 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 54/137] perl-locale-0:1.13-520.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 55/137] perl-B-0:1.89-520.fc43.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 56/137] perl-Digest-MD5-0:2.59-520.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 57/137] perl-FileHandle-0:2.05-520.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 58/137] perl-IO-Socket-IP-0:0.43-521. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 59/137] perl-Time-Local-2:1.350-520.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 60/137] groff-base-0:1.23.0-10.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 61/137] libcbor-0:0.12.0-6.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 62/137] perl-File-Temp-1:0.231.100-52 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 63/137] perl-HTTP-Tiny-0:0.090-521.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 64/137] perl-Pod-Simple-1:3.47-3.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 65/137] perl-Term-ANSIColor-0:5.01-52 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 66/137] perl-Term-Cap-0:1.18-520.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 67/137] perl-Class-Struct-0:0.68-520. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 68/137] perl-if-0:0.61.000-520.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 69/137] perl-Digest-0:1.20-520.fc43.n 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 70/137] perl-File-Path-0:2.18-520.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 71/137] perl-IO-Socket-SSL-0:2.095-2. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 72/137] perl-Net-SSLeay-0:1.94-11.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 73/137] perl-Pod-Escapes-1:1.07-520.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 74/137] perl-Text-Tabs+Wrap-0:2024.00 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 75/137] ncurses-0:6.5-7.20250614.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 76/137] perl-AutoLoader-0:5.74-520.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 77/137] perl-Encode-4:3.21-520.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 78/137] perl-Storable-1:3.37-521.fc43 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 79/137] rocm-runtime-0:7.0.1-1.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 80/137] libdrm-0:2.4.125-2.fc43.x86_6 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 81/137] numactl-libs-0:2.0.19-3.fc43. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 82/137] libpciaccess-0:0.16-16.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 83/137] hwdata-0:0.399-1.fc43.noarch 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 84/137] hipcc-0:20-2.rocm7.0.0.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 85/137] rocm-comgr-0:20-2.rocm7.0.0.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 86/137] rocm-device-libs-0:20-2.rocm7 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 87/137] rocm-clang-devel-0:20-2.rocm7 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 88/137] rocm-lld-0:20-2.rocm7.0.0.fc4 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 89/137] rocm-llvm-static-0:20-2.rocm7 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 90/137] rocm-clang-0:20-2.rocm7.0.0.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 91/137] rocm-clang-libs-0:20-2.rocm7. 
100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 92/137] rocm-clang-runtime-devel-0:20 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 93/137] rocm-libc++-devel-0:20-2.rocm 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 94/137] rocm-llvm-libs-0:20-2.rocm7.0 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 95/137] python3-0:3.14.0~rc2-1.fc43.x 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 96/137] python3-libs-0:3.14.0~rc2-1.f 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 97/137] mpdecimal-0:4.0.1-2.fc43.x86_ 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 98/137] python-pip-wheel-0:25.1.1-16. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [ 99/137] tzdata-0:2025b-3.fc43.noarch 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [100/137] rocm-llvm-devel-0:20-2.rocm7. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [101/137] rocm-libc++-0:20-2.rocm7.0.0. 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [102/137] rocm-llvm-filesystem-0:20-2.r 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [103/137] rocm-llvm-0:20-2.rocm7.0.0.fc 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [104/137] zlib-ng-compat-devel-0:2.2.5- 100% | 0.0 B/s | 0.0 B | 00m00s >>> Already downloaded [105/137] gcc-c++-0:15.2.1-1.fc43.2.x86 100% | 200.7 MiB/s | 15.3 MiB | 00m00s [106/137] rocm-cmake-0:7.0.1-1.fc43.noa 100% | 18.6 MiB/s | 38.1 KiB | 00m00s [107/137] rocm-compilersupport-macros-0 100% | 5.1 MiB/s | 15.5 KiB | 00m00s [108/137] rocm-hip-devel-0:7.0.1-1.fc43 100% | 125.7 MiB/s | 257.5 KiB | 00m00s [109/137] fdupes-1:2.4.0-2.fc43.x86_64 100% | 531.6 KiB/s | 59.0 KiB | 00m00s [110/137] rocm-rpm-macros-0:6.4.2-1.fc4 100% | 268.2 KiB/s | 16.1 KiB | 00m00s [111/137] pcre2-utf32-0:10.46-1.fc43.x8 100% | 4.1 MiB/s | 228.9 KiB | 00m00s [112/137] jsoncpp-0:1.9.6-2.fc43.x86_64 100% | 3.9 MiB/s | 101.1 KiB | 00m00s [113/137] libuv-1:1.51.0-2.fc43.x86_64 100% | 5.0 MiB/s | 266.1 KiB | 00m00s 
[114/137] make-1:4.4.1-11.fc43.x86_64 100% | 11.2 MiB/s | 585.2 KiB | 00m00s [115/137] cmake-0:3.31.6-4.fc43.x86_64 100% | 36.7 MiB/s | 12.2 MiB | 00m00s [116/137] rhash-0:1.4.5-3.fc43.x86_64 100% | 5.2 MiB/s | 197.9 KiB | 00m00s [117/137] libstdc++-devel-0:15.2.1-1.fc 100% | 161.0 MiB/s | 5.2 MiB | 00m00s [118/137] cmake-data-0:3.31.6-4.fc43.no 100% | 11.0 MiB/s | 2.5 MiB | 00m00s [119/137] libmpc-0:1.3.1-8.fc43.x86_64 100% | 2.5 MiB/s | 70.4 KiB | 00m00s [120/137] environment-modules-0:5.6.0-1 100% | 21.6 MiB/s | 795.3 KiB | 00m00s [121/137] emacs-filesystem-1:30.0-5.fc4 100% | 234.0 KiB/s | 7.5 KiB | 00m00s [122/137] vim-filesystem-2:9.1.1775-1.f 100% | 453.9 KiB/s | 15.4 KiB | 00m00s [123/137] gcc-0:15.2.1-1.fc43.2.x86_64 100% | 219.8 MiB/s | 39.6 MiB | 00m00s [124/137] cpp-0:15.2.1-1.fc43.2.x86_64 100% | 131.8 MiB/s | 12.9 MiB | 00m00s [125/137] man-db-0:2.13.1-2.fc43.x86_64 100% | 13.0 MiB/s | 1.4 MiB | 00m00s [126/137] libpipeline-0:1.5.8-3.fc43.x8 100% | 1.8 MiB/s | 59.9 KiB | 00m00s [127/137] procps-ng-0:4.0.4-7.fc43.x86_ 100% | 11.1 MiB/s | 364.5 KiB | 00m00s [128/137] libtommath-0:1.3.1~rc1-6.fc43 100% | 2.2 MiB/s | 64.3 KiB | 00m00s [129/137] tcl-1:9.0.2-1.fc43.x86_64 100% | 30.1 MiB/s | 1.2 MiB | 00m00s [130/137] rocm-hip-0:7.0.1-1.fc43.x86_6 100% | 247.0 MiB/s | 10.1 MiB | 00m00s [131/137] libxcrypt-devel-0:4.4.38-8.fc 100% | 1.1 MiB/s | 29.2 KiB | 00m00s [132/137] gcc-plugin-annobin-0:15.2.1-1 100% | 27.6 MiB/s | 56.6 KiB | 00m00s [133/137] glibc-devel-0:2.42-4.fc43.x86 100% | 13.5 MiB/s | 565.9 KiB | 00m00s [134/137] annobin-docs-0:12.99-1.fc43.n 100% | 3.6 MiB/s | 89.5 KiB | 00m00s [135/137] annobin-plugin-gcc-0:12.99-1. 
100% | 20.3 MiB/s | 996.0 KiB | 00m00s [136/137] cmake-rpm-macros-0:3.31.6-4.f 100% | 616.9 KiB/s | 14.8 KiB | 00m00s [137/137] kernel-headers-0:6.17.0-0.rc6 100% | 23.9 MiB/s | 1.7 MiB | 00m00s -------------------------------------------------------------------------------- [137/137] Total 100% | 100.4 MiB/s | 106.8 MiB | 00m01s Running transaction [ 1/139] Verify package files 100% | 63.0 B/s | 137.0 B | 00m02s [ 2/139] Prepare transaction 100% | 1.3 KiB/s | 137.0 B | 00m00s [ 3/139] Installing cmake-filesystem-0 100% | 7.4 MiB/s | 7.6 KiB | 00m00s [ 4/139] Installing less-0:679-2.fc43. 100% | 28.6 MiB/s | 409.4 KiB | 00m00s [ 5/139] Installing libmpc-0:1.3.1-8.f 100% | 158.3 MiB/s | 162.1 KiB | 00m00s [ 6/139] Installing expat-0:2.7.1-3.fc 100% | 22.3 MiB/s | 296.3 KiB | 00m00s [ 7/139] Installing rocm-llvm-filesyst 100% | 6.2 MiB/s | 19.1 KiB | 00m00s [ 8/139] Installing rocm-libc++-0:20-2 100% | 46.0 MiB/s | 1.3 MiB | 00m00s [ 9/139] Installing rocm-llvm-libs-0:2 100% | 75.6 MiB/s | 91.6 MiB | 00m01s [ 10/139] Installing rocm-clang-libs-0: 100% | 74.2 MiB/s | 94.1 MiB | 00m01s [ 11/139] Installing rocm-comgr-0:20-2. 100% | 71.8 MiB/s | 126.3 MiB | 00m02s [ 12/139] Installing numactl-libs-0:2.0 100% | 8.1 MiB/s | 57.8 KiB | 00m00s [ 13/139] Installing groff-base-0:1.23. 100% | 113.1 MiB/s | 3.8 MiB | 00m00s [ 14/139] Installing vim-filesystem-2:9 100% | 4.6 MiB/s | 4.7 KiB | 00m00s [ 15/139] Installing emacs-filesystem-1 100% | 0.0 B/s | 544.0 B | 00m00s [ 16/139] Installing make-1:4.4.1-11.fc 100% | 105.9 MiB/s | 1.8 MiB | 00m00s [ 17/139] Installing rocm-lld-0:20-2.ro 100% | 67.7 MiB/s | 5.9 MiB | 00m00s [ 18/139] Installing rocm-libc++-devel- 100% | 116.3 MiB/s | 15.4 MiB | 00m00s [ 19/139] Installing cpp-0:15.2.1-1.fc4 100% | 335.9 MiB/s | 38.0 MiB | 00m00s [ 20/139] Installing zlib-ng-compat-dev 100% | 106.0 MiB/s | 108.5 KiB | 00m00s [ 21/139] Installing annobin-docs-0:12. 
100% | 24.4 MiB/s | 100.1 KiB | 00m00s [ 22/139] Installing tzdata-0:2025b-3.f 100% | 65.2 MiB/s | 1.9 MiB | 00m00s [ 23/139] Installing python-pip-wheel-0 100% | 622.5 MiB/s | 1.2 MiB | 00m00s [ 24/139] Installing mpdecimal-0:4.0.1- 100% | 35.6 MiB/s | 218.8 KiB | 00m00s [ 25/139] Installing python3-libs-0:3.1 100% | 338.4 MiB/s | 43.3 MiB | 00m00s [ 26/139] Installing python3-0:3.14.0~r 100% | 2.3 MiB/s | 30.7 KiB | 00m00s [ 27/139] Installing cmake-rpm-macros-0 100% | 8.1 MiB/s | 8.3 KiB | 00m00s [ 28/139] Installing rocm-llvm-0:20-2.r 100% | 70.2 MiB/s | 52.5 MiB | 00m01s [ 29/139] Installing rocm-llvm-devel-0: 100% | 96.1 MiB/s | 28.7 MiB | 00m00s [ 30/139] Installing rocm-llvm-static-0 100% | 95.9 MiB/s | 1.9 GiB | 00m20s [ 31/139] Installing rocm-clang-runtime 100% | 130.5 MiB/s | 8.5 MiB | 00m00s [ 32/139] Installing kernel-headers-0:6 100% | 196.4 MiB/s | 6.9 MiB | 00m00s [ 33/139] Installing glibc-devel-0:2.42 100% | 168.1 MiB/s | 2.4 MiB | 00m00s [ 34/139] Installing libxcrypt-devel-0: 100% | 32.3 MiB/s | 33.1 KiB | 00m00s [ 35/139] Installing gcc-0:15.2.1-1.fc4 100% | 369.3 MiB/s | 111.9 MiB | 00m00s [ 36/139] Installing hwdata-0:0.399-1.f 100% | 456.9 MiB/s | 9.6 MiB | 00m00s [ 37/139] Installing libpciaccess-0:0.1 100% | 44.8 MiB/s | 45.9 KiB | 00m00s [ 38/139] Installing libdrm-0:2.4.125-2 100% | 195.1 MiB/s | 399.7 KiB | 00m00s [ 39/139] Installing rocm-runtime-0:7.0 100% | 459.8 MiB/s | 3.2 MiB | 00m00s [ 40/139] Installing rocm-runtime-devel 100% | 333.4 MiB/s | 682.8 KiB | 00m00s [ 41/139] Installing libtommath-0:1.3.1 100% | 124.5 MiB/s | 127.5 KiB | 00m00s [ 42/139] Installing tcl-1:9.0.2-1.fc43 100% | 160.6 MiB/s | 4.3 MiB | 00m00s [ 43/139] Installing procps-ng-0:4.0.4- 100% | 63.1 MiB/s | 1.0 MiB | 00m00s [ 44/139] Installing ncurses-0:6.5-7.20 100% | 40.1 MiB/s | 616.4 KiB | 00m00s [ 45/139] Installing perl-Digest-0:1.20 100% | 36.2 MiB/s | 37.1 KiB | 00m00s [ 46/139] Installing perl-Digest-MD5-0: 100% | 60.1 MiB/s | 61.6 KiB | 00m00s [ 
47/139] Installing perl-B-0:1.89-520. 100% | 246.4 MiB/s | 504.7 KiB | 00m00s [ 48/139] Installing perl-FileHandle-0: 100% | 0.0 B/s | 9.8 KiB | 00m00s [ 49/139] Installing perl-libnet-0:3.15 100% | 143.9 MiB/s | 294.7 KiB | 00m00s [ 50/139] Installing perl-Data-Dumper-0 100% | 114.8 MiB/s | 117.5 KiB | 00m00s [ 51/139] Installing perl-MIME-Base32-0 100% | 0.0 B/s | 32.2 KiB | 00m00s [ 52/139] Installing perl-AutoLoader-0: 100% | 0.0 B/s | 21.0 KiB | 00m00s [ 53/139] Installing perl-URI-0:5.32-2. 100% | 89.2 MiB/s | 274.1 KiB | 00m00s [ 54/139] Installing perl-IO-Socket-IP- 100% | 99.8 MiB/s | 102.2 KiB | 00m00s [ 55/139] Installing perl-Net-SSLeay-0: 100% | 271.7 MiB/s | 1.4 MiB | 00m00s [ 56/139] Installing perl-IO-Socket-SSL 100% | 350.9 MiB/s | 718.6 KiB | 00m00s [ 57/139] Installing perl-Text-Tabs+Wra 100% | 0.0 B/s | 23.9 KiB | 00m00s [ 58/139] Installing perl-Pod-Escapes-1 100% | 0.0 B/s | 25.9 KiB | 00m00s [ 59/139] Installing perl-File-Path-0:2 100% | 0.0 B/s | 64.5 KiB | 00m00s [ 60/139] Installing perl-if-0:0.61.000 100% | 0.0 B/s | 6.2 KiB | 00m00s [ 61/139] Installing perl-Time-Local-2: 100% | 0.0 B/s | 70.6 KiB | 00m00s [ 62/139] Installing perl-locale-0:1.13 100% | 0.0 B/s | 6.5 KiB | 00m00s [ 63/139] Installing perl-Pod-Simple-1: 100% | 280.7 MiB/s | 574.9 KiB | 00m00s [ 64/139] Installing perl-HTTP-Tiny-0:0 100% | 152.8 MiB/s | 156.4 KiB | 00m00s [ 65/139] Installing perl-File-Temp-1:0 100% | 160.2 MiB/s | 164.1 KiB | 00m00s [ 66/139] Installing perl-Class-Struct- 100% | 0.0 B/s | 25.9 KiB | 00m00s [ 67/139] Installing perl-IPC-Open3-0:1 100% | 0.0 B/s | 28.5 KiB | 00m00s [ 68/139] Installing perl-Term-Cap-0:1. 
100% | 0.0 B/s | 30.6 KiB | 00m00s [ 69/139] Installing perl-Term-ANSIColo 100% | 96.9 MiB/s | 99.2 KiB | 00m00s [ 70/139] Installing perl-POSIX-0:2.23- 100% | 227.2 MiB/s | 232.6 KiB | 00m00s [ 71/139] Installing perl-podlators-1:6 100% | 22.4 MiB/s | 321.4 KiB | 00m00s [ 72/139] Installing perl-Pod-Perldoc-0 100% | 12.7 MiB/s | 169.2 KiB | 00m00s [ 73/139] Installing perl-File-stat-0:1 100% | 0.0 B/s | 13.1 KiB | 00m00s [ 74/139] Installing perl-Socket-4:2.04 100% | 119.4 MiB/s | 122.3 KiB | 00m00s [ 75/139] Installing perl-SelectSaver-0 100% | 0.0 B/s | 2.6 KiB | 00m00s [ 76/139] Installing perl-Symbol-0:1.09 100% | 0.0 B/s | 7.3 KiB | 00m00s [ 77/139] Installing perl-Pod-Usage-4:2 100% | 6.6 MiB/s | 87.9 KiB | 00m00s [ 78/139] Installing perl-IO-0:1.55-520 100% | 148.1 MiB/s | 151.7 KiB | 00m00s [ 79/139] Installing perl-overloading-0 100% | 0.0 B/s | 5.6 KiB | 00m00s [ 80/139] Installing perl-mro-0:1.29-52 100% | 0.0 B/s | 42.7 KiB | 00m00s [ 81/139] Installing perl-Fcntl-0:1.20- 100% | 0.0 B/s | 49.9 KiB | 00m00s [ 82/139] Installing perl-base-0:2.27-5 100% | 0.0 B/s | 13.0 KiB | 00m00s [ 83/139] Installing perl-Text-ParseWor 100% | 0.0 B/s | 14.6 KiB | 00m00s [ 84/139] Installing perl-File-Basename 100% | 0.0 B/s | 14.6 KiB | 00m00s [ 85/139] Installing perl-Getopt-Long-1 100% | 143.8 MiB/s | 147.2 KiB | 00m00s [ 86/139] Installing perl-Storable-1:3. 100% | 227.4 MiB/s | 232.8 KiB | 00m00s [ 87/139] Installing perl-overload-0:1. 100% | 0.0 B/s | 72.0 KiB | 00m00s [ 88/139] Installing perl-parent-1:0.24 100% | 0.0 B/s | 11.0 KiB | 00m00s [ 89/139] Installing perl-MIME-Base64-0 100% | 43.2 MiB/s | 44.3 KiB | 00m00s [ 90/139] Installing perl-vars-0:1.05-5 100% | 0.0 B/s | 4.3 KiB | 00m00s [ 91/139] Installing perl-Errno-0:1.38- 100% | 0.0 B/s | 8.8 KiB | 00m00s [ 92/139] Installing perl-constant-0:1. 
100% | 0.0 B/s | 27.4 KiB | 00m00s [ 93/139] Installing perl-Scalar-List-U 100% | 145.2 MiB/s | 148.7 KiB | 00m00s [ 94/139] Installing perl-Getopt-Std-0: 100% | 0.0 B/s | 11.8 KiB | 00m00s [ 95/139] Installing perl-Encode-4:3.21 100% | 187.8 MiB/s | 4.7 MiB | 00m00s [ 96/139] Installing perl-DynaLoader-0: 100% | 0.0 B/s | 32.5 KiB | 00m00s [ 97/139] Installing perl-PathTools-0:3 100% | 180.2 MiB/s | 184.6 KiB | 00m00s [ 98/139] Installing perl-Exporter-0:5. 100% | 0.0 B/s | 55.6 KiB | 00m00s [ 99/139] Installing perl-Carp-0:1.54-5 100% | 23.3 MiB/s | 47.7 KiB | 00m00s [100/139] Installing perl-libs-4:5.42.0 100% | 291.2 MiB/s | 11.6 MiB | 00m00s [101/139] Installing perl-interpreter-4 100% | 9.0 MiB/s | 120.3 KiB | 00m00s [102/139] Installing perl-TermReadKey-0 100% | 64.6 MiB/s | 66.2 KiB | 00m00s [103/139] Installing perl-lib-0:0.65-52 100% | 0.0 B/s | 8.9 KiB | 00m00s [104/139] Installing perl-File-Copy-0:2 100% | 0.0 B/s | 20.2 KiB | 00m00s [105/139] Installing perl-File-Which-0: 100% | 0.0 B/s | 31.4 KiB | 00m00s [106/139] Installing perl-Error-1:0.170 100% | 78.1 MiB/s | 80.0 KiB | 00m00s [107/139] Installing libcbor-0:0.12.0-6 100% | 77.3 MiB/s | 79.2 KiB | 00m00s [108/139] Installing libfido2-0:1.16.0- 100% | 234.4 MiB/s | 240.0 KiB | 00m00s [109/139] Installing libpipeline-0:1.5. 100% | 13.0 MiB/s | 146.6 KiB | 00m00s [110/139] Installing man-db-0:2.13.1-2. 
100% | 80.9 MiB/s | 2.9 MiB | 00m00s [111/139] Installing environment-module 100% | 62.9 MiB/s | 1.9 MiB | 00m00s [112/139] Installing openssh-0:10.0p1-5 100% | 81.9 MiB/s | 1.4 MiB | 00m00s [113/139] Installing libedit-0:3.1-56.2 100% | 118.1 MiB/s | 241.8 KiB | 00m00s [114/139] Installing openssh-clients-0: 100% | 96.6 MiB/s | 2.6 MiB | 00m00s [115/139] Installing git-core-0:2.51.0- 100% | 328.5 MiB/s | 23.7 MiB | 00m00s [116/139] Installing git-core-doc-0:2.5 100% | 344.1 MiB/s | 17.9 MiB | 00m00s [117/139] Installing git-0:2.51.0-2.fc4 100% | 56.4 MiB/s | 57.7 KiB | 00m00s [118/139] Installing perl-Git-0:2.51.0- 100% | 63.8 MiB/s | 65.4 KiB | 00m00s [119/139] Installing rocm-clang-0:20-2. 100% | 74.7 MiB/s | 68.5 MiB | 00m01s [120/139] Installing rocm-clang-devel-0 100% | 123.3 MiB/s | 26.3 MiB | 00m00s [121/139] Installing rocm-device-libs-0 100% | 93.0 MiB/s | 3.3 MiB | 00m00s [122/139] Installing rocm-comgr-devel-0 100% | 99.5 MiB/s | 101.9 KiB | 00m00s [123/139] Installing hipcc-0:20-2.rocm7 100% | 29.6 MiB/s | 635.9 KiB | 00m00s [124/139] Installing rocm-hip-0:7.0.1-1 100% | 342.1 MiB/s | 26.7 MiB | 00m00s [125/139] Installing libstdc++-devel-0: 100% | 441.0 MiB/s | 37.5 MiB | 00m00s [126/139] Installing rhash-0:1.4.5-3.fc 100% | 23.2 MiB/s | 356.4 KiB | 00m00s [127/139] Installing libuv-1:1.51.0-2.f 100% | 279.8 MiB/s | 573.0 KiB | 00m00s [128/139] Installing jsoncpp-0:1.9.6-2. 100% | 126.5 MiB/s | 259.2 KiB | 00m00s [129/139] Installing cmake-0:3.31.6-4.f 100% | 290.0 MiB/s | 34.5 MiB | 00m00s [130/139] Installing cmake-data-0:3.31. 
100% | 119.3 MiB/s | 9.1 MiB | 00m00s [131/139] Installing pcre2-utf32-0:10.4 100% | 294.5 MiB/s | 603.1 KiB | 00m00s [132/139] Installing fdupes-1:2.4.0-2.f 100% | 8.4 MiB/s | 120.0 KiB | 00m00s [133/139] Installing rocm-cmake-0:7.0.1 100% | 131.4 MiB/s | 134.6 KiB | 00m00s [134/139] Installing gcc-c++-0:15.2.1-1 100% | 318.2 MiB/s | 41.4 MiB | 00m00s [135/139] Installing rocm-hip-devel-0:7 100% | 159.9 MiB/s | 3.0 MiB | 00m00s [136/139] Installing rocm-rpm-macros-0: 100% | 19.0 MiB/s | 19.5 KiB | 00m00s [137/139] Installing gcc-plugin-annobin 100% | 4.1 MiB/s | 58.8 KiB | 00m00s [138/139] Installing annobin-plugin-gcc 100% | 58.1 MiB/s | 1.0 MiB | 00m00s [139/139] Installing rocm-compilersuppo 100% | 2.8 KiB/s | 440.0 B | 00m00s Warning: skipped OpenPGP checks for 27 packages from repository: copr_base Complete! Finish: build setup for composable_kernel-6.4.2-1.fc43.src.rpm Start: rpmbuild composable_kernel-6.4.2-1.fc43.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1753920000 Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.kc1FEU Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.tR8G35 + umask 022 + cd /builddir/build/BUILD/composable_kernel-6.4.2-build + cd /builddir/build/BUILD/composable_kernel-6.4.2-build + rm -rf composable_kernel-rocm-6.4.2 + /usr/lib/rpm/rpmuncompress -x /builddir/build/SOURCES/composable_kernel-6.4.2.tar.gz + STATUS=0 + '[' 0 -ne 0 ']' + cd composable_kernel-rocm-6.4.2 + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w . 
+ sed -i -e 's@add_compile_options(-Werror)@#add_compile_options(-Werror)@' CMakeLists.txt + sed -i -e /-Werror/d cmake/EnableCompilerWarnings.cmake + sed -i -e 's@add_compile_options(-Weverything)@#add_compile_options(-Weverything)@' CMakeLists.txt + sed -i -e /-Wextra/d cmake/EnableCompilerWarnings.cmake + sed -i -e /-Wunused/d cmake/EnableCompilerWarnings.cmake + sed -i -e /-Weverything/d cmake/EnableCompilerWarnings.cmake + sed -i -e 's@-Wno-unknown-warning-option@-Wno-unknown-warning-option -Wno-unused-parameter@' cmake/EnableCompilerWarnings.cmake + sed -i -e 's@CK_TIME_KERNEL 1@CK_TIME_KERNEL 0@' include/ck/ck.hpp + sed -i -e 's@add_subdirectory(example)@#add_subdirectory(example)@' CMakeLists.txt + sed -i -e 's@add_subdirectory(profiler)@#add_subdirectory(profiler)@' CMakeLists.txt + sed -i -e s@STATIC@SHARED@ library/src/utility/CMakeLists.txt library/src/tensor_operation_instance/gpu/CMakeLists.txt + sed -i -e 's@POSITION_INDEPENDENT_CODE ON@POSITION_INDEPENDENT_CODE ON SOVERSION \"6.4.2\"@' library/src/utility/CMakeLists.txt library/src/tensor_operation_instance/gpu/CMakeLists.txt + RPM_EC=0 ++ jobs -p + exit 0 Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.VtIQxz + umask 022 + cd /builddir/build/BUILD/composable_kernel-6.4.2-build + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=thin -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic 
-fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + cd composable_kernel-rocm-6.4.2 ++ cat /proc/cpuinfo ++ grep -m 1 'cpu cores' ++ awk '{ print $4 }' + COMPILE_JOBS=2 + '[' 2x = x ']' + '[' 2 = 1 ']' + BUILD_MEM=6 + MEM_KB=0 ++ cat /proc/meminfo ++ grep MemTotal ++ awk '{ print $2 }' + MEM_KB=7953344 ++ eval 'expr 7953344 / 1024' +++ expr 7953344 / 1024 + MEM_MB=7766 ++ eval 'expr 7766 / 1024' +++ expr 7766 / 1024 + MEM_GB=7 ++ eval 'expr 1 + 7 / 6' +++ expr 1 + 7 / 6 + 
COMPILE_JOBS_MEM=2 + '[' 2 -lt 2 ']' + LINK_MEM=12 ++ eval 'expr 1 + 7 / 12' +++ expr 1 + 7 / 12 + LINK_JOBS=1 + JOBS=2 + '[' 1 -lt 2 ']' + JOBS=1 + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=thin -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 
-Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + /usr/bin/cmake -S . -B redhat-linux-build -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON -DBUILD_TESTING=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_COMPILER=/usr/lib64/rocm/llvm/bin/clang++ -DCMAKE_CXX_FLAGS=-fuse-ld=bfd -DCMAKE_EXPORT_COMPILE_COMMANDS=OFF -DCMAKE_HIP_ARCHITECTURES=gfx1100 -DCMAKE_HIP_COMPILER=/usr/lib64/rocm/llvm/bin/clang++ -DCMAKE_INSTALL_LIBDIR=/usr/lib64 -DGPU_TARGETS=gfx1100 -DHIP_PLATFORM=amd -DROCM_SYMLINK_LIBS=OFF -- The CXX compiler identification is Clang 20.0.0 -- The HIP compiler identification is Clang 20.0.0 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/lib64/rocm/llvm/bin/clang++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Detecting HIP compiler ABI info -- Detecting HIP compiler ABI info - done -- Check for working HIP compiler: /usr/lib64/rocm/llvm/bin/clang++ - skipped -- Detecting HIP compile features -- Detecting HIP compile features - done -- Found Python3: /usr/bin/python3.14 (found suitable version "3.14.0", minimum required is 
"3.8") found components: Interpreter -- Found Git: /usr/bin/git (found version "2.51.0") fatal: not a git repository (or any of the parent directories): .git CMake Deprecation Warning at /usr/share/rocm/cmake/ROCMConfig.cmake:12 (message): Use of find_package(ROCM) is deprecated as of ROCm 6.4. Please use find_package(ROCmCMakeBuildTools) Call Stack (most recent call first): CMakeLists.txt:124 (find_package) GPU_TARGETS= gfx1100 GPU_ARCHS= -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success hip_version_flat=700051831 checking which targets are supported -- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 - Success Building CK for the following targets: gfx1100 Enabling WMMA instances -- Performing Test HAS_NO_OFFLOAD_UNIFORM_BLOCK -- Performing Test HAS_NO_OFFLOAD_UNIFORM_BLOCK - Success Adding the fno-offload-uniform-block compiler flag -- Performing Test HAS_LSR_DROP_SOLUTION -- Performing Test HAS_LSR_DROP_SOLUTION - Success Adding the lsr-drop-solution=1 compiler flag -- Performing Test HAS_ENABLE_POST_MISCHED -- Performing Test HAS_ENABLE_POST_MISCHED - Success Adding the enable-post-misched=0 compiler flag -- Performing Test check-coerce -- Performing Test check-coerce - Success Adding the amdgpu-coerce-illegal-types=1 Adding -amdgpu-early-inline-all=true and -amdgpu-function-calls=false CMAKE_CXX_COMPILER: /usr/lib64/rocm/llvm/bin/clang++ CMAKE_HIP_COMPILER: /usr/lib64/rocm/llvm/bin/clang++ OpenMP_CXX_LIB_NAMES: libomp;libgomp;libiomp5 OpenMP_gomp_LIBRARY: OpenMP_pthread_LIBRARY: OpenMP_CXX_FLAGS: -fopenmp=libomp -Wno-unused-command-line-argument -- Build with HIP -- Clang tidy found: 20.0.0git -- Clang tidy checks: 
*,-abseil-*,-android-cloexec-fopen,-cert-msc30-c,-bugprone-exception-escape,-bugprone-macro-parentheses,-cert-env33-c,-cert-msc32-c,-cert-msc50-cpp,-cert-msc51-cpp,-cert-dcl37-c,-cert-dcl51-cpp,-clang-analyzer-alpha.core.CastToStruct,-clang-analyzer-optin.performance.Padding,-clang-diagnostic-deprecated-declarations,-clang-diagnostic-extern-c-compat,-clang-diagnostic-unused-command-line-argument,-cppcoreguidelines-avoid-c-arrays,-cppcoreguidelines-avoid-magic-numbers,-cppcoreguidelines-explicit-virtual-functions,-cppcoreguidelines-init-variables,-cppcoreguidelines-macro-usage,-cppcoreguidelines-non-private-member-variables-in-classes,-cppcoreguidelines-pro-bounds-array-to-pointer-decay,-cppcoreguidelines-pro-bounds-constant-array-index,-cppcoreguidelines-pro-bounds-pointer-arithmetic,-cppcoreguidelines-pro-type-member-init,-cppcoreguidelines-pro-type-reinterpret-cast,-cppcoreguidelines-pro-type-union-access,-cppcoreguidelines-pro-type-vararg,-cppcoreguidelines-special-member-functions,-fuchsia-*,-google-explicit-constructor,-google-readability-braces-around-statements,-google-readability-todo,-google-runtime-int,-google-runtime-references,-hicpp-vararg,-hicpp-braces-around-statements,-hicpp-explicit-conversions,-hicpp-named-parameter,-hicpp-no-array-decay,-hicpp-avoid-c-arrays,-hicpp-signed-bitwise,-hicpp-special-member-functions,-hicpp-uppercase-literal-suffix,-hicpp-use-auto,-hicpp-use-equals-default,-hicpp-use-override,-llvm-header-guard,-llvm-include-order,-llvmlibc-restrict-system-libc-headers,-llvmlibc-callee-namespace,-llvmlibc-implementation-in-namespace,-llvm-else-after-return,-llvm-qualified-auto,-misc-misplaced-const,-misc-non-private-member-variables-in-classes,-misc-no-recursion,-modernize-avoid-bind,-modernize-avoid-c-arrays,-modernize-pass-by-value,-modernize-use-auto,-modernize-use-default-member-init,-modernize-use-equals-default,-modernize-use-trailing-return-type,-modernize-use-transparent-functors,-performance-unnecessary-value-param,-readability-braces-around-statements,-readability-else-after-return,-readability-function-cognitive-complexity,-readability-isolate-declaration,-readability-magic-numbers,-readability-named-parameter,-readability-uppercase-literal-suffix,-readability-convert-member-functions-to-static,-readability-qualified-auto,-readability-redundant-string-init,-bugprone-narrowing-conversions,-cppcoreguidelines-narrowing-conversions,-altera-struct-pack-align,-cppcoreguidelines-prefer-member-initializer,-modernize-use-override,-readability-non-const-parameter
CMAKE_CXX_FLAGS: -fuse-ld=bfd
adding instance device_avg_pool2d_bwd_instance
add_instance_library device_avg_pool2d_bwd_instance
add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd
adding instance device_avg_pool3d_bwd_instance
add_instance_library device_avg_pool3d_bwd_instance
add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/avg_pool3d_bwd
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batched_gemm
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batched_gemm_add_relu_gemm_add
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batched_gemm_bias_permute
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batched_gemm_gemm Found only dl instances, but DL_KERNELS is not set. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batched_gemm_reduce Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute adding instance device_batchnorm_instance add_instance_library device_batchnorm_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/batchnorm instance should be built for all types! adding instance device_column_to_image_instance add_instance_library device_column_to_image_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/column_to_image Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/contraction_bilinear Found only xdl instances, but gfx9 is not on the targets list. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/contraction_scale Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/conv1d_bwd_data Found only xdl and dl instances, but gfx9 is not on the targets listand DL_KERNELS is not set. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/conv2d_bwd_data Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/conv2d_fwd Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/conv2d_fwd_bias_relu Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/conv2d_fwd_bias_relu_add Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/conv3d_bwd_data instance should be built for all types! 
adding instance device_elementwise_instance add_instance_library device_elementwise_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/elementwise adding instance device_elementwise_normalization_instance add_instance_library device_elementwise_normalization_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/elementwise_normalization adding instance device_gemm_instance removing dpp instance device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp removing dpp instance device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp removing dpp instance device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp removing dpp instance device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp removing dpp instance device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp removing dpp instance device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp removing dpp instance device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp removing dpp instance device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp removing dl instance device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp removing dl instance device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp removing dl instance device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp removing dl instance device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp 
removing dl instance device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp removing xdl instance device_gemm_xdl_f64_f64_f64_mk_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_f64_f64_f64_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_f64_f64_f64_km_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_f64_f64_f64_km_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_f32_f32_f32_mk_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_f32_f32_f32_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_f32_f32_f32_km_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_f32_f32_f32_km_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_f32_f32_f32_mk_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_f32_f32_f32_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_f32_f32_f32_km_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_f32_f32_f32_km_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_lds_direct_load_f32_f32_f32_km_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_lds_direct_load_f32_f32_f32_km_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_lds_direct_load_f32_f32_f32_mk_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_lds_direct_load_f32_f32_f32_mk_nk_mn_instance.cpp removing xdl instance 
device_gemm_xdl_c_shuffle_f16_f16_f16_mk_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_f16_f16_f16_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_f16_f16_f16_km_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_f16_f16_f16_km_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_2_stage_f16_f16_f16_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_lds_direct_load_f16_f16_f16_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp removing xdl instance 
device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp removing xdl instance device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_bf16_bf16_bf16_mk_kn_mn_instance.cpp 
removing xdl instance device_gemm_xdl_c_shuffle_bf16_bf16_bf16_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_bf16_bf16_bf16_km_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_bf16_bf16_bf16_km_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_default_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_interwave_default_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v2_default_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_padded_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_interwave_padded_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v2_padded_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_nk_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_km_kn_mn_instance.cpp removing xdl instance device_gemm_xdl_c_shuffle_fp8_fp8_fp8_km_nk_mn_instance.cpp add_instance_library device_gemm_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_ab_scale Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_add Found only xdl instances, but gfx9 is not on the targets list. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_add_add_fastgelu Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_add_fastgelu Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_add_multiply Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_add_relu Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_add_relu_add_layernorm Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_add_silu Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_b_scale Found only xdl instances, but gfx9 is not on the targets list. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_bias_add_reduce adding instance device_gemm_bilinear_instance removing xdl instance device_gemm_bilinear_xdl_c_shuffle_f16_f16_f16_f16_km_kn_mn_mn_instance.cpp removing xdl instance device_gemm_bilinear_xdl_c_shuffle_f16_f16_f16_f16_km_nk_mn_mn_instance.cpp removing xdl instance device_gemm_bilinear_xdl_c_shuffle_f16_f16_f16_f16_mk_kn_mn_mn_instance.cpp removing xdl instance device_gemm_bilinear_xdl_c_shuffle_f16_f16_f16_f16_mk_nk_mn_mn_instance.cpp add_instance_library device_gemm_bilinear_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_bilinear Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_fastgelu Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_multi_abd Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_multiply_add Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_multiply_multiply Found only xdl instances, but gfx9 is not on the targets list. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_reduce Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_splitk Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_streamk Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_universal Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_universal_batched Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_universal_reduce Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/gemm_universal_streamk Found only xdl and dl instances, but gfx9 is not on the targets listand DL_KERNELS is not set. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight Found only xdl instances, but gfx9 is not on the targets list. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv1d_fwd adding instance device_grouped_conv2d_bwd_data_instance removing xdl instance xdl/device_grouped_conv2d_bwd_data_xdl_gnhwc_gkyxc_gnhwk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_bwd_data_xdl_gnhwc_gkyxc_gnhwk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_bwd_data_xdl_gnhwc_gkyxc_gnhwk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv2d_bwd_data_xdl_nhwgc_gkyxc_nhwgk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_bwd_data_xdl_nhwgc_gkyxc_nhwgk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_bwd_data_xdl_nhwgc_gkyxc_nhwgk_f32_instance.cpp add_instance_library device_grouped_conv2d_bwd_data_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data Found only xdl and dl instances, but gfx9 is not on the targets listand DL_KERNELS is not set. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight adding instance device_grouped_conv2d_fwd_instance removing dl instance dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp removing dl instance dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp removing dl instance dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp removing dl instance dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_gnhwc_gkyxc_gnhwk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_gnhwc_gkyxc_gnhwk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_gnhwc_gkyxc_gnhwk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_f16_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_f32_instance.cpp removing xdl instance xdl/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_int8_instance.cpp removing xdl instance xdl/large_tensor/device_grouped_conv2d_fwd_xdl_large_tensor_nhwgc_gkyxc_nhwgk_bf16_instance.cpp removing xdl instance xdl/large_tensor/device_grouped_conv2d_fwd_xdl_large_tensor_nhwgc_gkyxc_nhwgk_f16_instance.cpp removing xdl instance xdl/large_tensor/device_grouped_conv2d_fwd_xdl_large_tensor_nhwgc_gkyxc_nhwgk_f32_instance.cpp removing xdl instance 
xdl/large_tensor/device_grouped_conv2d_fwd_xdl_large_tensor_nhwgc_gkyxc_nhwgk_int8_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv2d_fwd_xdl_merged_groups_nhwgc_gkyxc_nhwgk_bf16_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv2d_fwd_xdl_merged_groups_nhwgc_gkyxc_nhwgk_f16_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv2d_fwd_xdl_merged_groups_nhwgc_gkyxc_nhwgk_f32_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv2d_fwd_xdl_merged_groups_nhwgc_gkyxc_nhwgk_int8_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv2d_fwd_xdl_merged_groups_ngchw_gkyxc_ngkhw_bf16_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv2d_fwd_xdl_merged_groups_ngchw_gkyxc_ngkhw_f16_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv2d_fwd_xdl_merged_groups_ngchw_gkyxc_ngkhw_f32_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv2d_fwd_xdl_merged_groups_ngchw_gkyxc_ngkhw_int8_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_bf16_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_f16_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_f32_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_bf16_mem_inter_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_f16_mem_inter_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_f32_mem_inter_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_inter_instance.cpp removing xdl instance 
xdl/mem/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_bf16_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_f16_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_f32_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_int8_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_bf16_mem_inter_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_f16_mem_inter_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_f32_mem_inter_instance.cpp removing xdl instance xdl/mem/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_int8_mem_inter_instance.cpp removing xdl instance xdl/comp/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_bf16_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_f16_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_f32_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_bf16_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_f16_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_f32_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv2d_fwd_xdl_ngchw_gkyxc_ngkhw_int8_comp_instance.cpp add_instance_library device_grouped_conv2d_fwd_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd Found only xdl instances, but gfx9 is not on the targets list. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd_dynamic_op adding instance device_grouped_conv3d_bwd_data_instance removing xdl instance xdl/device_grouped_conv3d_bwd_data_xdl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_data_xdl_gndhwc_gkzyxc_gndhwk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_data_xdl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_data_xdl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_data_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_data_xdl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_data_xdl_ndhwgc_gkzyxc_ndhwgk_input_f16_comp_bf8_f8_instance.cpp add_instance_library device_grouped_conv3d_bwd_data_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data_bilinear Found only xdl instances, but gfx9 is not on the targets list. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data_scale adding instance device_grouped_conv3d_bwd_weight_instance removing xdl instance xdl/device_grouped_conv3d_bwd_weight_xdl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_xdl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_xdl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_xdl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_xdl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_f16_pipev2_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_f16_pipev5_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ngcdhw_gkzyxc_ngkdhw_f16_pipev2_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ngcdhw_gkzyxc_ngkdhw_f16_pipev5_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_pipev2_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_pipev5_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ngcdhw_gkzyxc_ngkdhw_bf16_pipev2_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ngcdhw_gkzyxc_ngkdhw_bf16_pipev5_instance.cpp removing xdl instance 
xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_f16_pipev1_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ngcdhw_gkzyxc_ngkdhw_f16_pipev1_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_pipev1_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ngcdhw_gkzyxc_ngkdhw_bf16_pipev1_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_f16_pipev2_irregular_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_f16_pipev5_irregular_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_pipev2_irregular_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_two_stage_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_pipev5_irregular_instance.cpp removing xdl instance xdl/device_grouped_conv3d_bwd_weight_xdl_ndhwgc_gkzyxc_ndhwgk_f16_comp_bf8_fp8_instance.cpp add_instance_library device_grouped_conv3d_bwd_weight_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight_bilinear Found only xdl instances, but gfx9 is not on the targets list. Skipping. 
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight_scale adding instance device_grouped_conv3d_fwd_instance removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_gndhwc_gkzyxc_gndhwk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_gndhwc_gkzyxc_gndhwk_int8_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_int8_instance.cpp removing xdl instance xdl/large_tensor/device_grouped_conv3d_fwd_xdl_large_tensor_ndhwgc_gkzyxc_ndhwgk_bf16_instance.cpp removing xdl instance xdl/large_tensor/device_grouped_conv3d_fwd_xdl_large_tensor_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp removing xdl instance xdl/large_tensor/device_grouped_conv3d_fwd_xdl_large_tensor_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv3d_fwd_xdl_merged_groups_ndhwgc_gkzyxc_ndhwgk_bf16_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv3d_fwd_xdl_merged_groups_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp removing xdl instance xdl/merged_groups/device_grouped_conv3d_fwd_xdl_merged_groups_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp removing xdl instance xdl/mem/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_mem_inter_instance.cpp removing xdl instance xdl/mem/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f16_mem_inter_instance.cpp removing xdl instance 
xdl/mem/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f32_mem_inter_instance.cpp removing xdl instance xdl/mem/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f16_mem_intra_instance.cpp removing xdl instance xdl/mem/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f32_mem_intra_instance.cpp removing xdl instance xdl/comp/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f16_comp_instance.cpp removing xdl instance xdl/comp/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f32_comp_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f16_comp_fp8_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_fp8_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_bf8_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_fp8_bf8_instance.cpp removing xdl instance xdl/device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_bf8_fp8_instance.cpp add_instance_library device_grouped_conv3d_fwd_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_bilinear Found only xdl instances, but gfx9 is not on the targets list. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_convinvscale Found only xdl instances, but gfx9 is not on the targets list. 
Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_convscale
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_convscale_add
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_convscale_relu
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_dynamic_op
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_scale
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_scaleadd_ab
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_scaleadd_scaleadd_relu
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_gemm
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_gemm_bias
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_gemm_fastgelu
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_gemm_fixed_nk
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_gemm_fixed_nk_multi_abd
Found only xdl instances, but gfx9 is not on the targets list. Skipping.
skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/grouped_gemm_tile_loop
instance should be built for all types!
adding instance device_image_to_column_instance
add_instance_library device_image_to_column_instance
add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/image_to_column
adding instance device_max_pool_bwd_instance
add_instance_library device_max_pool_bwd_instance
add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/max_pool_bwd
instance should be built for all types!
-- Could NOT find Python3 (missing: Python3_INCLUDE_DIRS Python3_LIBRARIES Development Development.Module Development.Embed) (found version "3.14.0") adding instance device_mha_instance removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv.cpp removing mha instance 
fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias.cpp removing mha 
instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp 
removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance 
fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_dropout.cpp removing mha instance 
fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse.cpp removing mha instance 
fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_fp16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance 
fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance 
fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance 
fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_fp16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance 
fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv.cpp removing mha instance 
fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias.cpp removing mha 
instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi.cpp removing mha instance 
fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp 
removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_dropout.cpp removing mha instance 
fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_dropout.cpp removing mha instance 
fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha 
instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance 
fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance 
fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_fp16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse_dropout.cpp removing mha instance 
fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance 
fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse_dropout.cpp removing mha instance 
fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi.cpp removing mha instance 
fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask.cpp removing mha instance 
fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_dropout.cpp removing mha instance 
fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_dropout.cpp removing mha instance 
fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_fp16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance 
fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance 
fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance 
fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_fp16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_lse_dropout.cpp removing mha instance 
fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse_dropout.cpp removing mha instance 
fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance 
fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi.cpp removing mha instance 
fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_mask.cpp removing mha instance 
fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask.cpp removing 
mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_dropout.cpp removing mha instance 
fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d256_fp16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance 
fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_lse_dropout.cpp removing mha instance 
fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance 
fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d256_fp16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse.cpp removing mha instance 
fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance 
fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse_dropout.cpp removing mha instance 
fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp 
removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse_dropout.cpp 
removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance 
fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance 
fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_bf16_batch_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance 
fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing 
mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance 
fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d32_bf16_group_b128x64x16x32x32x32_r2x1x1_r2x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_dropout.cpp removing 
mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance 
fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse.cpp removing mha instance 
fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp 
removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse.cpp removing 
mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha 
instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_bf16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance 
fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance 
fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance 
fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance 
fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_bf16_group_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_dropout.cpp removing mha instance 
fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance 
fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_dropout.cpp 
removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_lse.cpp removing mha instance 
fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse.cpp removing mha instance 
fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse_dropout.cpp removing mha instance 
fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_bf16_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse_dropout.cpp removing mha instance 
fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance 
fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask_dropout.cpp removing mha instance 
fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_lse.cpp removing mha instance 
fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d128_bf16_group_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_dropout.cpp removing mha instance 
fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_dropout.cpp removing mha instance 
fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_dropout.cpp removing mha instance 
fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_lse.cpp removing mha instance 
fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance 
fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_bias_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_bias_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_mask_lse.cpp 
removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_alibi_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_alibi_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d256_bf16_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_dropout.cpp removing 
mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance 
fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_mask.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_mask.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_lse.cpp 
removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_bias_mask.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_lse_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask_dropout.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d256_bf16_group_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_vc_psskddv_alibi_mask.cpp removing mha instance fmha_fwd_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_squant.cpp removing mha instance 
fmha_fwd_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_bias_squant.cpp removing mha instance fmha_fwd_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_alibi_squant.cpp removing mha instance fmha_fwd_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_mask_squant.cpp removing mha instance fmha_fwd_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_bias_mask_squant.cpp removing mha instance fmha_fwd_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_alibi_mask_squant.cpp removing mha instance fmha_fwd_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_squant.cpp removing mha instance fmha_fwd_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_bias_squant.cpp removing mha instance fmha_fwd_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_alibi_squant.cpp removing mha instance fmha_fwd_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_mask_squant.cpp removing mha instance fmha_fwd_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_bias_mask_squant.cpp removing mha instance fmha_fwd_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_alibi_mask_squant.cpp removing mha instance fmha_fwd_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_squant.cpp removing mha instance fmha_fwd_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_bias_squant.cpp removing mha instance fmha_fwd_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_alibi_squant.cpp removing mha instance fmha_fwd_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_mask_squant.cpp removing mha instance fmha_fwd_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_bias_mask_squant.cpp removing mha instance 
fmha_fwd_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_alibi_mask_squant.cpp removing mha instance fmha_fwd_api.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_batch_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_batch_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_batch_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_batch_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_batch_b32_unused_pdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_batch_b32_unused_pdv.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_batch_b32_unused_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_batch_b32_unused.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_group_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_group_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_group_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_fp16_group_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_batch_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_batch_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_batch_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_batch_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_batch_b32_unused_pdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_batch_b32_unused_pdv.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_batch_b32_unused_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_batch_b32_unused.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_group_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_group_b32_unused_psdv.cpp removing mha instance 
fmha_fwd_splitkv_combine_d64_fp16_group_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp16_group_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_batch_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_batch_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_batch_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_batch_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_batch_b32_unused_pdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_batch_b32_unused_pdv.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_batch_b32_unused_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_batch_b32_unused.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_group_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_group_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_group_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp16_group_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_batch_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_batch_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_batch_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_batch_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_batch_b32_unused_pdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_batch_b32_unused_pdv.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_batch_b32_unused_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_batch_b32_unused.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_group_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_group_b32_unused_psdv.cpp removing mha instance 
fmha_fwd_splitkv_combine_d256_fp16_group_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp16_group_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_batch_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_batch_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_batch_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_batch_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_batch_b32_unused_pdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_batch_b32_unused_pdv.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_batch_b32_unused_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_batch_b32_unused.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_group_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_group_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_group_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d32_bf16_group_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_batch_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_batch_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_batch_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_batch_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_batch_b32_unused_pdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_batch_b32_unused_pdv.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_batch_b32_unused_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_batch_b32_unused.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_group_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_group_b32_unused_psdv.cpp removing mha instance 
fmha_fwd_splitkv_combine_d64_bf16_group_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d64_bf16_group_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_batch_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_batch_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_batch_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_batch_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_batch_b32_unused_pdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_batch_b32_unused_pdv.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_batch_b32_unused_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_batch_b32_unused.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_group_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_group_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_group_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d128_bf16_group_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_batch_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_batch_b32_unused_psdv.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_batch_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_batch_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_batch_b32_unused_pdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_batch_b32_unused_pdv.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_batch_b32_unused_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_batch_b32_unused.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_group_b32_unused_psdv_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_group_b32_unused_psdv.cpp removing mha instance 
fmha_fwd_splitkv_combine_d256_bf16_group_b32_unused_ps_lse.cpp removing mha instance fmha_fwd_splitkv_combine_d256_bf16_group_b32_unused_ps.cpp removing mha instance fmha_fwd_splitkv_combine_d64_fp8_batch_b32_unused_squant.cpp removing mha instance fmha_fwd_splitkv_combine_d128_fp8_batch_b32_unused_squant.cpp removing mha instance fmha_fwd_splitkv_combine_d256_fp8_batch_b32_unused_squant.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_fp16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance 
fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance 
fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance 
fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse.cpp removing mha instance 
fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_fp16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing 
mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance 
fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_fp16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse_pagedkv.cpp 
removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance 
fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_batch_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d32_bf16_group_b32x64x16x32x32x32_r2x1x1_r2x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse.cpp removing mha 
instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_batch_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance 
fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_bf16_group_b64x64x32x64x32x64_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_batch_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha 
instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing 
mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d128_bf16_group_b64x128x32x128x32x128_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse_pagedkv.cpp removing mha instance 
fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance 
fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_bias_mask_lse.cpp removing mha instance 
fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psk_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_batch_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_lse.cpp removing mha 
instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse_pagedkv.cpp 
removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_bias_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse_pagedkv.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vr_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d256_bf16_group_b64x128x32x256x32x256_r4x1x1_r4x1x1_w16x16x16_w16x16x16_qr_vc_psskddv_alibi_mask_lse.cpp removing mha instance fmha_fwd_splitkv_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_bias_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_alibi_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_bias_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d64_fp8_batch_b128x64x32x64x32x64_r2x1x1_r2x1x1_w32x32x32_w32x32x32_qr_vc_alibi_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_bias_lse_squant.cpp removing mha instance 
fmha_fwd_splitkv_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_alibi_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_bias_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d128_fp8_batch_b128x128x32x128x32x128_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_alibi_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_bias_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_alibi_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_bias_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_d256_fp8_batch_b128x128x32x256x32x256_r4x1x1_r4x1x1_w32x32x32_w32x32x32_qr_vc_alibi_mask_lse_squant.cpp removing mha instance fmha_fwd_splitkv_api.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_psskddv_half_pagedkv.cpp removing mha instance 
fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_psk.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_psskddv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vr_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_psk.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d32_fp16_b64x64x32x32_vc_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_psskddv_half_pagedkv.cpp removing mha instance 
fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_psk.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_psskddv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vr_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_psk.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d64_fp16_b64x64x64x64_vc_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_psskddv_half_pagedkv.cpp removing 
mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_psk.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_psskddv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vr_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_psk.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d128_fp16_b64x64x128x128_vc_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_pskd_half_pagedkv.cpp removing mha instance 
fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_psk.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_psskddv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vr_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_psk.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d256_fp16_b64x64x256x256_vc_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_psskddv_inter_pagedkv.cpp removing mha instance 
fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_psk.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_psskddv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vr_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_psk.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d32_bf16_b64x64x32x32_vc_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_psskddv_inter_pagedkv.cpp removing mha instance 
fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_psk.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_psskddv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vr_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_psk.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d64_bf16_b64x64x64x64_vc_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_psskddv_inter_pagedkv.cpp removing mha 
instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_psk.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_psskddv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vr_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_psk.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d128_bf16_b64x64x128x128_vc_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_pskd_inter_pagedkv.cpp removing mha instance 
fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_psk.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_psskddv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vr_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_psk_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_psskddv_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_pskd_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_psskddv_inter_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_pskd_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_psskddv_half_pagedkv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_psk.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_pskd_inter.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_psskddv_inter.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_pskd_half.cpp removing mha instance fmha_fwd_appendkv_d256_bf16_b64x64x256x256_vc_psskddv_half.cpp removing mha instance fmha_fwd_appendkv_d64_fp8_b64x64x64x64_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_d128_fp8_b64x64x128x128_vc_psskddv.cpp removing mha instance 
fmha_fwd_appendkv_d256_fp8_b64x64x256x256_vc_psskddv.cpp removing mha instance fmha_fwd_appendkv_api.cpp removing mha instance fmha_bwd_dot_do_o_d32_fp16_batch_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d32_fp16_batch_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d32_fp16_batch_o2_pdv.cpp removing mha instance fmha_bwd_dot_do_o_d32_fp16_batch_o2.cpp removing mha instance fmha_bwd_dot_do_o_d32_fp16_group_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d32_fp16_group_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d64_fp16_batch_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d64_fp16_batch_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d64_fp16_batch_o2_pdv.cpp removing mha instance fmha_bwd_dot_do_o_d64_fp16_batch_o2.cpp removing mha instance fmha_bwd_dot_do_o_d64_fp16_group_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d64_fp16_group_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d128_fp16_batch_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d128_fp16_batch_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d128_fp16_batch_o2_pdv.cpp removing mha instance fmha_bwd_dot_do_o_d128_fp16_batch_o2.cpp removing mha instance fmha_bwd_dot_do_o_d128_fp16_group_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d128_fp16_group_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d256_fp16_batch_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d256_fp16_batch_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d256_fp16_batch_o2_pdv.cpp removing mha instance fmha_bwd_dot_do_o_d256_fp16_batch_o2.cpp removing mha instance fmha_bwd_dot_do_o_d256_fp16_group_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d256_fp16_group_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d32_bf16_batch_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d32_bf16_batch_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d32_bf16_batch_o2_pdv.cpp removing mha instance fmha_bwd_dot_do_o_d32_bf16_batch_o2.cpp removing mha instance 
fmha_bwd_dot_do_o_d32_bf16_group_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d32_bf16_group_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d64_bf16_batch_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d64_bf16_batch_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d64_bf16_batch_o2_pdv.cpp removing mha instance fmha_bwd_dot_do_o_d64_bf16_batch_o2.cpp removing mha instance fmha_bwd_dot_do_o_d64_bf16_group_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d64_bf16_group_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d128_bf16_batch_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d128_bf16_batch_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d128_bf16_batch_o2_pdv.cpp removing mha instance fmha_bwd_dot_do_o_d128_bf16_batch_o2.cpp removing mha instance fmha_bwd_dot_do_o_d128_bf16_group_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d128_bf16_group_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d256_bf16_batch_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d256_bf16_batch_o2_ps.cpp removing mha instance fmha_bwd_dot_do_o_d256_bf16_batch_o2_pdv.cpp removing mha instance fmha_bwd_dot_do_o_d256_bf16_batch_o2.cpp removing mha instance fmha_bwd_dot_do_o_d256_bf16_group_o2_psdv.cpp removing mha instance fmha_bwd_dot_do_o_d256_bf16_group_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_batch_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_batch_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_batch_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_batch_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_batch_o2_pd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_batch_o2_pd.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_batch_o2_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_batch_o2.cpp removing mha instance 
fmha_bwd_convert_dq_d32_fp16_b64x128_group_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_group_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_group_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_fp16_b64x128_group_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_batch_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_batch_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_batch_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_batch_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_batch_o2_pd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_batch_o2_pd.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_batch_o2_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_batch_o2.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_group_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_group_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_group_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_fp16_b64x128_group_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_batch_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_batch_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_batch_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_batch_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_batch_o2_pd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_batch_o2_pd.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_batch_o2_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_batch_o2.cpp removing mha instance 
fmha_bwd_convert_dq_d128_fp16_b64x128_group_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_group_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_group_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_fp16_b64x128_group_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_batch_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_batch_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_batch_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_batch_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_batch_o2_pd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_batch_o2_pd.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_batch_o2_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_batch_o2.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_group_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_group_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_group_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_fp16_b64x64_group_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_batch_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_batch_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_batch_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_batch_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_batch_o2_pd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_batch_o2_pd.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_batch_o2_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_batch_o2.cpp removing mha instance 
fmha_bwd_convert_dq_d32_bf16_b64x128_group_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_group_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_group_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d32_bf16_b64x128_group_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_batch_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_batch_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_batch_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_batch_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_batch_o2_pd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_batch_o2_pd.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_batch_o2_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_batch_o2.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_group_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_group_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_group_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d64_bf16_b64x128_group_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_batch_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_batch_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_batch_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_batch_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_batch_o2_pd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_batch_o2_pd.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_batch_o2_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_batch_o2.cpp removing mha instance 
fmha_bwd_convert_dq_d128_bf16_b64x128_group_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_group_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_group_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d128_bf16_b64x128_group_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_batch_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_batch_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_batch_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_batch_o2_ps.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_batch_o2_pd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_batch_o2_pd.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_batch_o2_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_batch_o2.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_group_o2_psd_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_group_o2_psd.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_group_o2_ps_deterministic.cpp removing mha instance fmha_bwd_convert_dq_d256_bf16_b64x64_group_o2_ps.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv.cpp 
removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha 
instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi.cpp removing mha instance 
fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask.cpp removing mha instance 
fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask.cpp removing mha instance 
fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_fp16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv.cpp removing mha instance 
fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi.cpp removing mha instance 
fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha 
instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance 
fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance 
fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance 
fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_fp16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing 
mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance 
fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask.cpp removing mha instance 
fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16_storerandval.cpp 
removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask.cpp removing mha instance 
fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp 
removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing 
mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_fp16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv.cpp removing mha instance 
fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi.cpp removing mha instance 
fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask.cpp removing mha instance 
fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask.cpp removing mha instance 
fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_fp16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv.cpp removing mha instance 
fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi.cpp 
removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask.cpp 
removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask.cpp removing mha instance 
fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_batch_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance 
fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance 
fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d32_bf16_group_b32x128x32x32x32x32x64x32x32_r1x4x1_r4x1x1_r2x2x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp.cpp removing mha instance 
fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance 
fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance 
fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance 
fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_batch_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance 
fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance 
fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d64_bf16_group_b32x128x64x32x64x32x32x64x64_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha 
instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance 
fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance 
fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask.cpp removing mha instance 
fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_batch_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp 
removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing 
mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d128_bf16_group_b16x128x128x16x128x16x32x128x128_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv.cpp removing mha instance 
fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi.cpp removing mha instance 
fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask.cpp removing mha instance 
fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask.cpp removing mha instance 
fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_ps_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_psk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_pddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_batch_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_mask_dropout_wg16_storerandval.cpp removing mha instance 
fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_psskddv_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_d256_bf16_group_b16x64x256x16x256x16x32x256x256_r1x4x1_r4x1x1_r1x4x1_w16x16x32_w16x16x16_o1_kr_ktr_vr_iglp_pssk_alibi_mask_dropout_wg16_storerandval.cpp removing mha instance fmha_bwd_api.cpp skip_instance_libary device_mha_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/mha adding instance device_normalization_bwd_data_instance add_instance_library device_normalization_bwd_data_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/normalization_bwd_data adding instance device_normalization_bwd_gamma_beta_instance add_instance_library device_normalization_bwd_gamma_beta_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta adding 
instance device_normalization_fwd_instance add_instance_library device_normalization_fwd_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/normalization_fwd adding instance device_permute_scale_instance add_instance_library device_permute_scale_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/permute_scale adding instance device_pool2d_fwd_instance add_instance_library device_pool2d_fwd_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/pool2d_fwd adding instance device_pool3d_fwd_instance add_instance_library device_pool3d_fwd_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/pool3d_fwd Found only xdl and dl instances, but gfx9 is not on the targets listand DL_KERNELS is not set. Skipping. skip_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/quantization adding instance device_reduce_instance add_instance_library device_reduce_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/reduce adding instance device_softmax_instance add_instance_library device_softmax_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/softmax instance should be built for all types! 
adding instance device_transpose_instance add_instance_library device_transpose_instance add_instance_directory /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/transpose -- Configuring done (5.5s) -- Generating done (0.3s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_CXX_FLAGS_RELEASE CMAKE_C_FLAGS_RELEASE CMAKE_Fortran_FLAGS_RELEASE CMAKE_INSTALL_DO_STRIP LIB_SUFFIX SHARE_INSTALL_PREFIX SYSCONF_INSTALL_DIR -- Build files have been written to: /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build + /usr/bin/cmake --build redhat-linux-build --verbose -j 1 Change Dir: '/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build' Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j1 /usr/bin/cmake -S/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2 -B/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build/CMakeFiles /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build//CMakeFiles/progress.marks /usr/bin/gmake -f CMakeFiles/Makefile2 all gmake[1]: Entering directory '/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build' /usr/bin/gmake -f library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/build.make library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/depend gmake[2]: Entering directory '/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build' cd 
/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2 /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/transpose /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build/library/src/tensor_operation_instance/gpu/transpose /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build/library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/DependInfo.cmake "--color=" gmake[2]: Leaving directory '/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build' /usr/bin/gmake -f library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/build.make library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/build gmake[2]: Entering directory '/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build' [ 0%] Building CXX object library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o cd /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build/library/src/tensor_operation_instance/gpu/transpose && /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_WMMA -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/include 
-I/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/include -I/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++17 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics --offload-compress -x hip --offload-arch=gfx1100 --offload-arch=gfx1100 -MD -MT library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o -MF CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o.d -o CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o 
-c /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/transpose/device_transpose_instances_3d.cpp In file included from /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/src/tensor_operation_instance/gpu/transpose/device_transpose_instances_3d.cpp:5: In file included from /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/library/include/ck/library/tensor_operation_instance/gpu/transpose/device_transpose_instance.hpp:6: In file included from /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/include/ck/tensor_operation/gpu/device/impl/device_elementwise_dynamic_vector_dims_impl.hpp:12: In file included from /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/include/ck/tensor_operation/gpu/grid/gridwise_elementwise_2d.hpp:6: In file included from /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/include/ck/tensor_description/cluster_descriptor.hpp:6: In file included from /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/include/ck/utility/common_header.hpp:33: In file included from /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/include/ck/utility/thread_group.hpp:6: /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/include/ck/utility/get_id.hpp:10:39: error: constexpr function never produces a constant expression [-Winvalid-constexpr] 10 | __host__ __device__ constexpr index_t get_warp_size() | ^~~~~~~~~~~~~ /builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/include/ck/utility/get_id.hpp:13:12: note: non-constexpr function 'operator int' cannot be used in a constant expression 13 | return warpSize; | ^ /usr/include/hip/amd_detail/amd_warp_functions.h:89:5: note: declared here 89 | operator int() const noexcept { | 
^ 1 error generated when compiling for gfx1100. gmake[2]: *** [library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/build.make:82: library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o] Error 1 gmake[2]: Leaving directory '/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build' gmake[1]: Leaving directory '/builddir/build/BUILD/composable_kernel-6.4.2-build/composable_kernel-rocm-6.4.2/redhat-linux-build' gmake[1]: *** [CMakeFiles/Makefile2:11179: library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/all] Error 2 gmake: *** [Makefile:159: all] Error 2 error: Bad exit status from /var/tmp/rpm-tmp.VtIQxz (%build) Bad exit status from /var/tmp/rpm-tmp.VtIQxz (%build) RPM build errors: Finish: rpmbuild composable_kernel-6.4.2-1.fc43.src.rpm Finish: build phase for composable_kernel-6.4.2-1.fc43.src.rpm INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: /var/lib/mock/fedora-43-x86_64-1758760224.854724/root/var/log/dnf5.log INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz /bin/tar: Removing leading `/' from member names ERROR: Exception(/var/lib/copr-rpmbuild/results/composable_kernel-6.4.2-1.fc43.src.rpm) Config(fedora-43-x86_64) 0 minutes 56 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_failure=True') Start: clean chroot INFO: unmounting tmpfs. 
Finish: clean chroot ERROR: Command failed: # /usr/bin/systemd-nspawn -q -M 3a38d3f573ba4f86a36cdc165ee73236 -D /var/lib/mock/fedora-43-x86_64-1758760224.854724/root -a -u mockbuild --capability=cap_ipc_lock --rlimit=RLIMIT_NOFILE=10240 --capability=cap_ipc_lock --bind=/tmp/mock-resolv.rwc1mprd:/etc/resolv.conf --bind=/dev/btrfs-control --bind=/dev/mapper/control --bind=/dev/fuse --bind=/dev/loop-control --bind=/dev/loop0 --bind=/dev/loop1 --bind=/dev/loop2 --bind=/dev/loop3 --bind=/dev/loop4 --bind=/dev/loop5 --bind=/dev/loop6 --bind=/dev/loop7 --bind=/dev/loop8 --bind=/dev/loop9 --bind=/dev/loop10 --bind=/dev/loop11 --console=pipe --setenv=TERM=vt100 --setenv=SHELL=/bin/bash --setenv=HOME=/builddir --setenv=HOSTNAME=mock --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin '--setenv=PROMPT_COMMAND=printf "\033]0;\007"' '--setenv=PS1= \s-\v\$ ' --setenv=LANG=C.UTF-8 --resolv-conf=off bash --login -c '/usr/bin/rpmbuild -bb --target x86_64 --nodeps /builddir/build/originals/composable_kernel.spec' Copr build error: Build failed
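Note on the failing step: the only compiler error in this log is the `-Winvalid-constexpr` diagnostic at `ck/utility/get_id.hpp:10`, where the `constexpr` function `get_warp_size()` returns `warpSize`, whose `operator int` (`amd_warp_functions.h:89`) is not `constexpr`. The sketch below reproduces that shape in plain C++17 without HIP. It is an illustration under stated assumptions, not the upstream fix: `WarpSizeLike`, `warp_size_like`, `get_warp_size_runtime`, and `get_warp_size_constexpr` are hypothetical names, and the value 32 is assumed only for the example.

```cpp
// Hypothetical stand-in for HIP's warpSize object. Per the log, the real
// object's conversion operator is declared without constexpr.
struct WarpSizeLike {
    int value;
    operator int() const noexcept { return value; } // deliberately not constexpr
};

// C++17 inline variable playing the role of the global warpSize.
inline const WarpSizeLike warp_size_like{32};

// Marking a function like this one constexpr is what the log's diagnostic
// rejects: its only return path calls the non-constexpr conversion operator.
// Without constexpr it compiles, but the value is only known at run time.
inline int get_warp_size_runtime() { return warp_size_like; }

// Assumed alternative: return a literal so the function stays usable in
// constant expressions. The literal 32 is an assumption for this sketch.
constexpr int get_warp_size_constexpr() { return 32; }

static_assert(get_warp_size_constexpr() == 32,
              "literal-returning version is a constant expression");
```

Either direction (dropping `constexpr`, or returning a value known at compile time) would avoid this class of diagnostic; which one suits composable_kernel 6.4.2 against this HIP version is not something the log determines.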