Warning: Permanently added '44.205.20.13' (ED25519) to the list of known hosts.

You can reproduce this build on your computer by running:

  sudo dnf install copr-rpmbuild
  /usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/9808137-fedora-43-x86_64 --chroot fedora-43-x86_64

Version: 1.6
PID: 8774
Logging PID: 8776
Task:
{'allow_user_ssh': False,
 'appstream': False,
 'background': True,
 'build_id': 9808137,
 'buildroot_pkgs': [],
 'chroot': 'fedora-43-x86_64',
 'enable_net': False,
 'fedora_review': False,
 'git_hash': 'c764b47d888ca7c9122e784f68f8b958d584340d',
 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel',
 'isolation': 'default',
 'memory_reqs': 2048,
 'package_name': 'composable_kernel',
 'package_version': '7.1.0-2',
 'project_dirname': 'RH',
 'project_name': 'RH',
 'project_owner': '@rocm-packagers-sig',
 'repo_priority': None,
 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/@rocm-packagers-sig/RH/fedora-43-x86_64/',
            'id': 'copr_base',
            'name': 'Copr repository',
            'priority': None}],
 'sandbox': '@rocm-packagers-sig/RH--https://src.fedoraproject.org/user/trix',
 'source_json': {},
 'source_type': None,
 'ssh_public_keys': None,
 'storage': 0,
 'submitter': 'https://src.fedoraproject.org/user/trix',
 'tags': [],
 'task_id': '9808137-fedora-43-x86_64',
 'timeout': 180000,
 'uses_devel_repo': False,
 'with_opts': [],
 'without_opts': []}

Running: git clone https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel /var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel --depth 500 --no-single-branch --recursive

cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel', '/var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr: Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel'...

Running: git checkout c764b47d888ca7c9122e784f68f8b958d584340d --

cmd: ['git', 'checkout', 'c764b47d888ca7c9122e784f68f8b958d584340d', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel
rc: 0
stdout:
stderr: Note: switching to 'c764b47d888ca7c9122e784f68f8b958d584340d'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at c764b47 automatic import of composable_kernel

Running: dist-git-client sources

cmd: ['dist-git-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel
rc: 0
stdout:
stderr: INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
INFO: Downloading composable_kernel-7.1.0.tar.gz
INFO: Reading stdout from command: curl --help all
INFO: Calling: curl -H Pragma: -o composable_kernel-7.1.0.tar.gz --location --connect-timeout 60 --retry 3 --retry-delay 10 --remote-time --show-error --fail --retry-all-errors https://copr-dist-git.fedorainfracloud.org/repo/pkgs/@rocm-packagers-sig/RH/composable_kernel/composable_kernel-7.1.0.tar.gz/md5/1d8f397b684a9582489474a9e94ce7bd/composable_kernel-7.1.0.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 5238k  100 5238k    0     0  17.9M      0 --:--:-- --:--:-- --:--:-- 18.0M
INFO: Reading stdout from command: md5sum composable_kernel-7.1.0.tar.gz
tail: /var/lib/copr-rpmbuild/main.log: file truncated
Running (timeout=180000): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel/composable_kernel.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1763473447.786242 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
INFO: mock.py version 6.5 starting (python version = 3.13.7, NVR = mock-6.5-1.fc42), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel/composable_kernel.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1763473447.786242 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel/composable_kernel.spec) Config(fedora-43-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 6.5
INFO: Mock Version: 6.5
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1763473447.786242/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using container image: registry.fedoraproject.org/fedora:43
INFO: Pulling image: registry.fedoraproject.org/fedora:43
INFO: Tagging container image as mock-bootstrap-b2e0efa1-3c7d-4b4d-b575-add257564e3b
INFO: Checking that e28bdd8606c81a9f135f4da84f825e68b8fe0855ecdf0ca2d75ab44fbf1deb18 image matches host's architecture
INFO: Copy content of container e28bdd8606c81a9f135f4da84f825e68b8fe0855ecdf0ca2d75ab44fbf1deb18 to /var/lib/mock/fedora-43-x86_64-bootstrap-1763473447.786242/root
INFO: mounting e28bdd8606c81a9f135f4da84f825e68b8fe0855ecdf0ca2d75ab44fbf1deb18 with podman image mount
INFO: image e28bdd8606c81a9f135f4da84f825e68b8fe0855ecdf0ca2d75ab44fbf1deb18 as /var/lib/containers/storage/overlay/bc703f52c5bbbeafdacb6c17c66dcb3a22df36ef9642dba23b1fff0135b08cbe/merged
INFO: umounting image e28bdd8606c81a9f135f4da84f825e68b8fe0855ecdf0ca2d75ab44fbf1deb18 (/var/lib/containers/storage/overlay/bc703f52c5bbbeafdacb6c17c66dcb3a22df36ef9642dba23b1fff0135b08cbe/merged) with podman image umount
INFO: Removing image mock-bootstrap-b2e0efa1-3c7d-4b4d-b575-add257564e3b
INFO: Package manager dnf5 detected and used (fallback)
INFO: Not updating bootstrap chroot, bootstrap_image_ready=True
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1763473447.786242/root.
INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Package manager dnf5 detected and used (direct choice) INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.2.17.0-2.fc43.x86_64 dnf5-plugins-5.2.17.0-2.fc43.x86_64 Start: installing minimal buildroot with dnf5 Updating and loading repositories: Copr repository 100% | 534.3 KiB/s | 223.3 KiB | 00m00s fedora 100% | 26.0 MiB/s | 35.4 MiB | 00m01s updates 100% | 5.4 MiB/s | 6.0 MiB | 00m01s Repositories loaded. Package Arch Version Repository Size Installing group/module packages: bash x86_64 5.3.0-2.fc43 fedora 8.4 MiB bzip2 x86_64 1.0.8-21.fc43 fedora 95.3 KiB coreutils x86_64 9.7-6.fc43 fedora 5.4 MiB cpio x86_64 2.15-6.fc43 fedora 1.1 MiB diffutils x86_64 3.12-3.fc43 fedora 1.6 MiB fedora-release-common noarch 43-25 fedora 20.6 KiB findutils x86_64 1:4.10.0-6.fc43 fedora 1.8 MiB gawk x86_64 5.3.2-2.fc43 fedora 1.8 MiB glibc-minimal-langpack x86_64 2.42-4.fc43 fedora 0.0 B grep x86_64 3.12-2.fc43 fedora 1.0 MiB gzip x86_64 1.13-4.fc43 fedora 388.8 KiB info x86_64 7.2-6.fc43 fedora 353.9 KiB patch x86_64 2.8-2.fc43 fedora 222.8 KiB redhat-rpm-config noarch 343-11.fc43 fedora 182.9 KiB rpm-build x86_64 6.0.0-1.fc43 fedora 287.4 KiB sed x86_64 4.9-5.fc43 fedora 857.3 KiB shadow-utils x86_64 2:4.18.0-3.fc43 fedora 3.9 MiB tar x86_64 2:1.35-6.fc43 fedora 2.9 MiB unzip x86_64 6.0-67.fc43 fedora 386.3 KiB util-linux x86_64 2.41.1-17.fc43 fedora 3.5 MiB which x86_64 2.23-3.fc43 fedora 83.5 KiB xz x86_64 1:5.8.1-2.fc43 fedora 1.3 MiB Installing dependencies: add-determinism x86_64 0.6.0-2.fc43 fedora 2.4 MiB alternatives x86_64 1.33-3.fc43 updates 62.2 KiB ansible-srpm-macros noarch 1-18.1.fc43 fedora 35.7 KiB audit-libs x86_64 4.1.2-2.fc43 updates 378.8 KiB binutils x86_64 
2.45.50-9.fc43 copr_base 27.0 MiB build-reproducibility-srpm-macros noarch 0.6.0-2.fc43 fedora 735.0 B bzip2-libs x86_64 1.0.8-21.fc43 fedora 80.6 KiB ca-certificates noarch 2025.2.80_v9.0.304-1.1.fc43 fedora 2.7 MiB coreutils-common x86_64 9.7-6.fc43 fedora 11.3 MiB crypto-policies noarch 20250714-5.gitcd6043a.fc43 fedora 146.9 KiB curl x86_64 8.15.0-3.fc43 updates 461.5 KiB cyrus-sasl-lib x86_64 2.1.28-33.fc43 fedora 2.3 MiB debugedit x86_64 5.2-3.fc43 fedora 214.0 KiB dwz x86_64 0.16-2.fc43 fedora 287.1 KiB ed x86_64 1.22.2-1.fc43 fedora 148.1 KiB efi-srpm-macros noarch 6-4.fc43 fedora 40.1 KiB elfutils x86_64 0.194-1.fc43 updates 2.9 MiB elfutils-debuginfod-client x86_64 0.194-1.fc43 updates 84.0 KiB elfutils-default-yama-scope noarch 0.194-1.fc43 updates 1.8 KiB elfutils-libelf x86_64 0.194-1.fc43 updates 1.1 MiB elfutils-libs x86_64 0.194-1.fc43 updates 687.5 KiB fedora-gpg-keys noarch 43-1 fedora 131.2 KiB fedora-release noarch 43-25 fedora 0.0 B fedora-release-identity-basic noarch 43-25 fedora 631.0 B fedora-repos noarch 43-1 fedora 4.9 KiB file x86_64 5.46-8.fc43 fedora 100.2 KiB file-libs x86_64 5.46-8.fc43 fedora 11.9 MiB filesystem x86_64 3.18-50.fc43 fedora 112.0 B filesystem-srpm-macros noarch 3.18-50.fc43 fedora 38.2 KiB fonts-srpm-macros noarch 1:2.0.5-23.fc43 fedora 55.8 KiB forge-srpm-macros noarch 0.4.0-3.fc43 fedora 38.9 KiB fpc-srpm-macros noarch 1.3-15.fc43 fedora 144.0 B gap-srpm-macros noarch 2-1.fc43 fedora 2.1 KiB gdb-minimal x86_64 16.3-6.fc43 fedora 13.3 MiB gdbm-libs x86_64 1:1.23-10.fc43 fedora 129.9 KiB ghc-srpm-macros noarch 1.9.2-3.fc43 fedora 779.0 B glibc x86_64 2.42-4.fc43 fedora 6.7 MiB glibc-common x86_64 2.42-4.fc43 fedora 1.0 MiB glibc-gconv-extra x86_64 2.42-4.fc43 fedora 7.2 MiB gmp x86_64 1:6.3.0-4.fc43 fedora 811.2 KiB gnat-srpm-macros noarch 6-8.fc43 fedora 1.0 KiB gnulib-l10n noarch 20241231-1.fc43 fedora 655.0 KiB gnupg2 x86_64 2.4.8-4.fc43 fedora 6.5 MiB gnupg2-dirmngr x86_64 2.4.8-4.fc43 fedora 618.4 KiB 
gnupg2-gpg-agent x86_64 2.4.8-4.fc43 fedora 671.4 KiB gnupg2-gpgconf x86_64 2.4.8-4.fc43 fedora 250.0 KiB gnupg2-keyboxd x86_64 2.4.8-4.fc43 fedora 201.4 KiB gnupg2-verify x86_64 2.4.8-4.fc43 fedora 348.5 KiB gnutls x86_64 3.8.10-3.fc43 fedora 3.8 MiB go-srpm-macros noarch 3.8.0-1.fc43 fedora 61.9 KiB gpgverify noarch 2.2-3.fc43 fedora 8.7 KiB ima-evm-utils-libs x86_64 1.6.2-6.fc43 fedora 60.7 KiB jansson x86_64 2.14-3.fc43 fedora 89.1 KiB java-srpm-macros noarch 1-7.fc43 fedora 870.0 B json-c x86_64 0.18-7.fc43 fedora 82.7 KiB kernel-srpm-macros noarch 1.0-27.fc43 fedora 1.9 KiB keyutils-libs x86_64 1.6.3-6.fc43 fedora 54.3 KiB krb5-libs x86_64 1.21.3-7.fc43 fedora 2.3 MiB libacl x86_64 2.3.2-4.fc43 fedora 35.9 KiB libarchive x86_64 3.8.1-3.fc43 fedora 951.1 KiB libassuan x86_64 2.5.7-4.fc43 fedora 163.8 KiB libattr x86_64 2.5.2-6.fc43 fedora 24.4 KiB libblkid x86_64 2.41.1-17.fc43 fedora 262.4 KiB libbrotli x86_64 1.1.0-10.fc43 fedora 833.3 KiB libcap x86_64 2.76-3.fc43 fedora 209.1 KiB libcap-ng x86_64 0.8.5-8.fc43 fedora 68.9 KiB libcom_err x86_64 1.47.3-2.fc43 fedora 63.1 KiB libcurl x86_64 8.15.0-3.fc43 updates 903.2 KiB libeconf x86_64 0.7.9-2.fc43 fedora 64.9 KiB libevent x86_64 2.1.12-16.fc43 fedora 883.1 KiB libfdisk x86_64 2.41.1-17.fc43 fedora 380.4 KiB libffi x86_64 3.5.1-2.fc43 fedora 83.6 KiB libfsverity x86_64 1.6-3.fc43 fedora 28.5 KiB libgcc x86_64 15.2.1-4.fc43 copr_base 266.6 KiB libgcrypt x86_64 1.11.1-2.fc43 fedora 1.6 MiB libgomp x86_64 15.2.1-4.fc43 copr_base 541.6 KiB libgpg-error x86_64 1.55-2.fc43 fedora 915.3 KiB libidn2 x86_64 2.3.8-2.fc43 fedora 552.5 KiB libksba x86_64 1.6.7-4.fc43 fedora 398.5 KiB liblastlog2 x86_64 2.41.1-17.fc43 fedora 33.9 KiB libmount x86_64 2.41.1-17.fc43 fedora 372.7 KiB libnghttp2 x86_64 1.66.0-2.fc43 fedora 162.2 KiB libpkgconf x86_64 2.3.0-3.fc43 fedora 78.1 KiB libpsl x86_64 0.21.5-6.fc43 fedora 76.4 KiB libselinux x86_64 3.9-5.fc43 fedora 193.1 KiB libsemanage x86_64 3.9-4.fc43 fedora 308.5 KiB libsepol 
x86_64 3.9-2.fc43 fedora 822.0 KiB libsmartcols x86_64 2.41.1-17.fc43 fedora 180.5 KiB libssh x86_64 0.11.3-1.fc43 fedora 567.1 KiB libssh-config noarch 0.11.3-1.fc43 fedora 277.0 B libstdc++ x86_64 15.2.1-4.fc43 copr_base 2.8 MiB libtasn1 x86_64 4.20.0-2.fc43 fedora 176.3 KiB libtool-ltdl x86_64 2.5.4-7.fc43 fedora 70.1 KiB libunistring x86_64 1.1-10.fc43 fedora 1.7 MiB libusb1 x86_64 1.0.29-4.fc43 fedora 171.3 KiB libuuid x86_64 2.41.1-17.fc43 fedora 37.4 KiB libverto x86_64 0.3.2-11.fc43 fedora 25.4 KiB libxcrypt x86_64 4.5.2-1.fc43 updates 285.3 KiB libxml2 x86_64 2.12.10-5.fc43 fedora 1.7 MiB libzstd x86_64 1.5.7-2.fc43 fedora 799.9 KiB lua-libs x86_64 5.4.8-3.fc43 updates 280.8 KiB lua-srpm-macros noarch 1-16.fc43 fedora 1.3 KiB lz4-libs x86_64 1.10.0-3.fc43 fedora 161.4 KiB mpfr x86_64 4.2.2-2.fc43 fedora 832.8 KiB ncurses-base noarch 6.5-7.20250614.fc43 fedora 328.1 KiB ncurses-libs x86_64 6.5-7.20250614.fc43 fedora 946.3 KiB nettle x86_64 3.10.1-2.fc43 fedora 790.6 KiB npth x86_64 1.8-3.fc43 fedora 49.6 KiB ocaml-srpm-macros noarch 11-2.fc43 fedora 1.9 KiB openblas-srpm-macros noarch 2-20.fc43 fedora 112.0 B openldap x86_64 2.6.10-4.fc43 fedora 659.9 KiB openssl-libs x86_64 1:3.5.4-1.fc43 updates 8.9 MiB p11-kit x86_64 0.25.8-1.fc43 fedora 2.3 MiB p11-kit-trust x86_64 0.25.8-1.fc43 fedora 446.5 KiB package-notes-srpm-macros noarch 0.5-14.fc43 fedora 1.6 KiB pam-libs x86_64 1.7.1-3.fc43 fedora 126.8 KiB pcre2 x86_64 10.46-1.fc43 fedora 697.7 KiB pcre2-syntax noarch 10.46-1.fc43 fedora 275.3 KiB perl-srpm-macros noarch 1-60.fc43 fedora 861.0 B pkgconf x86_64 2.3.0-3.fc43 fedora 88.5 KiB pkgconf-m4 noarch 2.3.0-3.fc43 fedora 14.4 KiB pkgconf-pkg-config x86_64 2.3.0-3.fc43 fedora 989.0 B popt x86_64 1.19-9.fc43 fedora 132.8 KiB publicsuffix-list-dafsa noarch 20250616-2.fc43 fedora 69.1 KiB pyproject-srpm-macros noarch 1.18.5-1.fc43 updates 1.9 KiB python-srpm-macros noarch 3.14-5.fc43 fedora 51.5 KiB qt5-srpm-macros noarch 5.15.18-1.fc43 updates 500.0 B 
qt6-srpm-macros noarch 6.10.0-1.fc43 updates 464.0 B readline x86_64 8.3-2.fc43 fedora 511.7 KiB rpm x86_64 6.0.0-1.fc43 fedora 3.1 MiB rpm-build-libs x86_64 6.0.0-1.fc43 fedora 268.4 KiB rpm-libs x86_64 6.0.0-1.fc43 fedora 933.7 KiB rpm-sequoia x86_64 1.9.0-2.fc43 fedora 2.5 MiB rpm-sign-libs x86_64 6.0.0-1.fc43 fedora 39.7 KiB rust-srpm-macros noarch 26.4-1.fc43 fedora 4.8 KiB setup noarch 2.15.0-26.fc43 fedora 725.0 KiB sqlite-libs x86_64 3.50.2-2.fc43 fedora 1.5 MiB systemd-libs x86_64 258.2-1.fc43 updates 2.3 MiB systemd-standalone-sysusers x86_64 258.2-1.fc43 updates 293.5 KiB tpm2-tss x86_64 4.1.3-8.fc43 fedora 1.6 MiB tree-sitter-srpm-macros noarch 0.4.2-1.fc43 fedora 8.3 KiB util-linux-core x86_64 2.41.1-17.fc43 fedora 1.5 MiB xxhash-libs x86_64 0.8.3-3.fc43 fedora 90.2 KiB xz-libs x86_64 1:5.8.1-2.fc43 fedora 217.8 KiB zig-srpm-macros noarch 1-5.fc43 fedora 1.1 KiB zip x86_64 3.0-44.fc43 fedora 694.5 KiB zlib-ng-compat x86_64 2.2.5-2.fc43 fedora 137.6 KiB zstd x86_64 1.5.7-2.fc43 fedora 1.7 MiB Installing groups: Buildsystem building group Transaction Summary: Installing: 170 packages Total size of inbound packages is 59 MiB. Need to download 59 MiB. After this operation, 199 MiB extra will be used (install 199 MiB, remove 0 B). 
[ 1/170] bzip2-0:1.0.8-21.fc43.x86_64 100% | 3.6 MiB/s | 51.6 KiB | 00m00s [ 2/170] bash-0:5.3.0-2.fc43.x86_64 100% | 77.9 MiB/s | 1.9 MiB | 00m00s [ 3/170] cpio-0:2.15-6.fc43.x86_64 100% | 28.6 MiB/s | 293.1 KiB | 00m00s [ 4/170] fedora-release-common-0:43-25 100% | 12.0 MiB/s | 24.6 KiB | 00m00s [ 5/170] diffutils-0:3.12-3.fc43.x86_6 100% | 95.8 MiB/s | 392.3 KiB | 00m00s [ 6/170] coreutils-0:9.7-6.fc43.x86_64 100% | 38.0 MiB/s | 1.1 MiB | 00m00s [ 7/170] glibc-minimal-langpack-0:2.42 100% | 12.5 MiB/s | 38.3 KiB | 00m00s [ 8/170] findutils-1:4.10.0-6.fc43.x86 100% | 89.5 MiB/s | 550.0 KiB | 00m00s [ 9/170] grep-0:3.12-2.fc43.x86_64 100% | 73.0 MiB/s | 299.1 KiB | 00m00s [ 10/170] gzip-0:1.13-4.fc43.x86_64 100% | 27.7 MiB/s | 170.1 KiB | 00m00s [ 11/170] info-0:7.2-6.fc43.x86_64 100% | 44.6 MiB/s | 182.9 KiB | 00m00s [ 12/170] patch-0:2.8-2.fc43.x86_64 100% | 27.8 MiB/s | 113.8 KiB | 00m00s [ 13/170] redhat-rpm-config-0:343-11.fc 100% | 25.8 MiB/s | 79.1 KiB | 00m00s [ 14/170] rpm-build-0:6.0.0-1.fc43.x86_ 100% | 44.9 MiB/s | 138.0 KiB | 00m00s [ 15/170] sed-0:4.9-5.fc43.x86_64 100% | 77.4 MiB/s | 317.1 KiB | 00m00s [ 16/170] tar-2:1.35-6.fc43.x86_64 100% | 119.5 MiB/s | 856.4 KiB | 00m00s [ 17/170] unzip-0:6.0-67.fc43.x86_64 100% | 29.9 MiB/s | 183.7 KiB | 00m00s [ 18/170] shadow-utils-2:4.18.0-3.fc43. 
100% | 128.3 MiB/s | 1.3 MiB | 00m00s [ 19/170] which-0:2.23-3.fc43.x86_64 100% | 20.4 MiB/s | 41.7 KiB | 00m00s [ 20/170] xz-1:5.8.1-2.fc43.x86_64 100% | 111.8 MiB/s | 572.5 KiB | 00m00s [ 21/170] util-linux-0:2.41.1-17.fc43.x 100% | 119.1 MiB/s | 1.2 MiB | 00m00s [ 22/170] gawk-0:5.3.2-2.fc43.x86_64 100% | 93.7 MiB/s | 1.1 MiB | 00m00s [ 23/170] filesystem-0:3.18-50.fc43.x86 100% | 102.6 MiB/s | 1.3 MiB | 00m00s [ 24/170] ncurses-libs-0:6.5-7.20250614 100% | 46.4 MiB/s | 332.7 KiB | 00m00s [ 25/170] bzip2-libs-0:1.0.8-21.fc43.x8 100% | 10.5 MiB/s | 43.1 KiB | 00m00s [ 26/170] glibc-0:2.42-4.fc43.x86_64 100% | 157.4 MiB/s | 2.2 MiB | 00m00s [ 27/170] gmp-1:6.3.0-4.fc43.x86_64 100% | 52.0 MiB/s | 319.3 KiB | 00m00s [ 28/170] libacl-0:2.3.2-4.fc43.x86_64 100% | 7.9 MiB/s | 24.3 KiB | 00m00s [ 29/170] coreutils-common-0:9.7-6.fc43 100% | 175.0 MiB/s | 2.1 MiB | 00m00s [ 30/170] libattr-0:2.5.2-6.fc43.x86_64 100% | 5.8 MiB/s | 17.9 KiB | 00m00s [ 31/170] libcap-0:2.76-3.fc43.x86_64 100% | 28.3 MiB/s | 86.9 KiB | 00m00s [ 32/170] fedora-repos-0:43-1.noarch 100% | 3.0 MiB/s | 9.1 KiB | 00m00s [ 33/170] libselinux-0:3.9-5.fc43.x86_6 100% | 31.8 MiB/s | 97.7 KiB | 00m00s [ 34/170] glibc-common-0:2.42-4.fc43.x8 100% | 105.8 MiB/s | 325.2 KiB | 00m00s [ 35/170] pcre2-0:10.46-1.fc43.x86_64 100% | 85.4 MiB/s | 262.2 KiB | 00m00s [ 36/170] ansible-srpm-macros-0:1-18.1. 
100% | 9.7 MiB/s | 19.9 KiB | 00m00s [ 37/170] ed-0:1.22.2-1.fc43.x86_64 100% | 20.4 MiB/s | 83.7 KiB | 00m00s [ 38/170] build-reproducibility-srpm-ma 100% | 5.8 MiB/s | 11.8 KiB | 00m00s [ 39/170] efi-srpm-macros-0:6-4.fc43.no 100% | 10.9 MiB/s | 22.4 KiB | 00m00s [ 40/170] dwz-0:0.16-2.fc43.x86_64 100% | 44.1 MiB/s | 135.5 KiB | 00m00s [ 41/170] filesystem-srpm-macros-0:3.18 100% | 25.8 MiB/s | 26.4 KiB | 00m00s [ 42/170] file-0:5.46-8.fc43.x86_64 100% | 23.8 MiB/s | 48.8 KiB | 00m00s [ 43/170] fpc-srpm-macros-0:1.3-15.fc43 100% | 3.9 MiB/s | 7.9 KiB | 00m00s [ 44/170] forge-srpm-macros-0:0.4.0-3.f 100% | 6.5 MiB/s | 20.1 KiB | 00m00s [ 45/170] fonts-srpm-macros-1:2.0.5-23. 100% | 6.6 MiB/s | 27.2 KiB | 00m00s [ 46/170] gap-srpm-macros-0:2-1.fc43.no 100% | 4.4 MiB/s | 9.0 KiB | 00m00s [ 47/170] ghc-srpm-macros-0:1.9.2-3.fc4 100% | 8.5 MiB/s | 8.7 KiB | 00m00s [ 48/170] gnat-srpm-macros-0:6-8.fc43.n 100% | 4.1 MiB/s | 8.5 KiB | 00m00s [ 49/170] java-srpm-macros-0:1-7.fc43.n 100% | 1.6 MiB/s | 7.9 KiB | 00m00s [ 50/170] go-srpm-macros-0:3.8.0-1.fc43 100% | 3.9 MiB/s | 28.3 KiB | 00m00s [ 51/170] kernel-srpm-macros-0:1.0-27.f 100% | 1.5 MiB/s | 8.9 KiB | 00m00s [ 52/170] lua-srpm-macros-0:1-16.fc43.n 100% | 2.9 MiB/s | 8.8 KiB | 00m00s [ 53/170] ocaml-srpm-macros-0:11-2.fc43 100% | 3.0 MiB/s | 9.3 KiB | 00m00s [ 54/170] openblas-srpm-macros-0:2-20.f 100% | 2.5 MiB/s | 7.6 KiB | 00m00s [ 55/170] package-notes-srpm-macros-0:0 100% | 4.4 MiB/s | 9.0 KiB | 00m00s [ 56/170] perl-srpm-macros-0:1-60.fc43. 
100% | 4.0 MiB/s | 8.3 KiB | 00m00s [ 57/170] python-srpm-macros-0:3.14-5.f 100% | 11.4 MiB/s | 23.4 KiB | 00m00s [ 58/170] tree-sitter-srpm-macros-0:0.4 100% | 4.3 MiB/s | 13.4 KiB | 00m00s [ 59/170] rust-srpm-macros-0:26.4-1.fc4 100% | 2.7 MiB/s | 11.1 KiB | 00m00s [ 60/170] rpm-0:6.0.0-1.fc43.x86_64 100% | 80.4 MiB/s | 576.3 KiB | 00m00s [ 61/170] zig-srpm-macros-0:1-5.fc43.no 100% | 2.1 MiB/s | 8.4 KiB | 00m00s [ 62/170] zip-0:3.0-44.fc43.x86_64 100% | 42.6 MiB/s | 261.6 KiB | 00m00s [ 63/170] debugedit-0:5.2-3.fc43.x86_64 100% | 27.9 MiB/s | 85.6 KiB | 00m00s [ 64/170] libarchive-0:3.8.1-3.fc43.x86 100% | 102.8 MiB/s | 421.1 KiB | 00m00s [ 65/170] popt-0:1.19-9.fc43.x86_64 100% | 21.4 MiB/s | 65.7 KiB | 00m00s [ 66/170] readline-0:8.3-2.fc43.x86_64 100% | 54.8 MiB/s | 224.6 KiB | 00m00s [ 67/170] rpm-build-libs-0:6.0.0-1.fc43 100% | 41.6 MiB/s | 127.9 KiB | 00m00s [ 68/170] rpm-libs-0:6.0.0-1.fc43.x86_6 100% | 97.7 MiB/s | 400.2 KiB | 00m00s [ 69/170] libeconf-0:0.7.9-2.fc43.x86_6 100% | 34.4 MiB/s | 35.2 KiB | 00m00s [ 70/170] zstd-0:1.5.7-2.fc43.x86_64 100% | 94.9 MiB/s | 485.9 KiB | 00m00s [ 71/170] pam-libs-0:1.7.1-3.fc43.x86_6 100% | 28.1 MiB/s | 57.5 KiB | 00m00s [ 72/170] libsemanage-0:3.9-4.fc43.x86_ 100% | 40.2 MiB/s | 123.5 KiB | 00m00s [ 73/170] setup-0:2.15.0-26.fc43.noarch 100% | 51.2 MiB/s | 157.3 KiB | 00m00s [ 74/170] xz-libs-1:5.8.1-2.fc43.x86_64 100% | 36.8 MiB/s | 112.9 KiB | 00m00s [ 75/170] mpfr-0:4.2.2-2.fc43.x86_64 100% | 112.9 MiB/s | 347.0 KiB | 00m00s [ 76/170] libcap-ng-0:0.8.5-8.fc43.x86_ 100% | 15.7 MiB/s | 32.1 KiB | 00m00s [ 77/170] libblkid-0:2.41.1-17.fc43.x86 100% | 30.1 MiB/s | 123.1 KiB | 00m00s [ 78/170] libfdisk-0:2.41.1-17.fc43.x86 100% | 52.5 MiB/s | 161.3 KiB | 00m00s [ 79/170] liblastlog2-0:2.41.1-17.fc43. 
100% | 7.6 MiB/s | 23.2 KiB | 00m00s [ 80/170] libmount-0:2.41.1-17.fc43.x86 100% | 39.7 MiB/s | 162.5 KiB | 00m00s [ 81/170] libsmartcols-0:2.41.1-17.fc43 100% | 20.5 MiB/s | 84.0 KiB | 00m00s [ 82/170] libuuid-0:2.41.1-17.fc43.x86_ 100% | 12.8 MiB/s | 26.2 KiB | 00m00s [ 83/170] util-linux-core-0:2.41.1-17.f 100% | 134.5 MiB/s | 550.9 KiB | 00m00s [ 84/170] zlib-ng-compat-0:2.2.5-2.fc43 100% | 19.3 MiB/s | 79.2 KiB | 00m00s [ 85/170] ncurses-base-0:6.5-7.20250614 100% | 17.2 MiB/s | 88.2 KiB | 00m00s [ 86/170] gnulib-l10n-0:20241231-1.fc43 100% | 20.9 MiB/s | 150.2 KiB | 00m00s [ 87/170] glibc-gconv-extra-0:2.42-4.fc 100% | 121.9 MiB/s | 1.6 MiB | 00m00s [ 88/170] libsepol-0:3.9-2.fc43.x86_64 100% | 56.2 MiB/s | 345.4 KiB | 00m00s [ 89/170] fedora-gpg-keys-0:43-1.noarch 100% | 27.1 MiB/s | 138.9 KiB | 00m00s [ 90/170] pcre2-syntax-0:10.46-1.fc43.n 100% | 39.6 MiB/s | 162.2 KiB | 00m00s [ 91/170] add-determinism-0:0.6.0-2.fc4 100% | 128.3 MiB/s | 919.3 KiB | 00m00s [ 92/170] file-libs-0:5.46-8.fc43.x86_6 100% | 103.8 MiB/s | 850.3 KiB | 00m00s [ 93/170] libxml2-0:2.12.10-5.fc43.x86_ 100% | 75.2 MiB/s | 692.7 KiB | 00m00s [ 94/170] libzstd-0:1.5.7-2.fc43.x86_64 100% | 38.4 MiB/s | 314.6 KiB | 00m00s [ 95/170] lz4-libs-0:1.10.0-3.fc43.x86_ 100% | 10.9 MiB/s | 78.0 KiB | 00m00s [ 96/170] rpm-sign-libs-0:6.0.0-1.fc43. 100% | 9.2 MiB/s | 28.2 KiB | 00m00s [ 97/170] rpm-sequoia-0:1.9.0-2.fc43.x8 100% | 151.9 MiB/s | 933.3 KiB | 00m00s [ 98/170] sqlite-libs-0:3.50.2-2.fc43.x 100% | 92.8 MiB/s | 760.5 KiB | 00m00s [ 99/170] gnupg2-0:2.4.8-4.fc43.x86_64 100% | 137.0 MiB/s | 1.6 MiB | 00m00s [100/170] ima-evm-utils-libs-0:1.6.2-6. 
100% | 4.8 MiB/s | 29.3 KiB | 00m00s [101/170] libfsverity-0:1.6-3.fc43.x86_ 100% | 4.5 MiB/s | 18.6 KiB | 00m00s [102/170] gpgverify-0:2.2-3.fc43.noarch 100% | 5.4 MiB/s | 11.1 KiB | 00m00s [103/170] gnupg2-dirmngr-0:2.4.8-4.fc43 100% | 67.1 MiB/s | 274.6 KiB | 00m00s [104/170] gnupg2-gpg-agent-0:2.4.8-4.fc 100% | 53.3 MiB/s | 272.9 KiB | 00m00s [105/170] gnupg2-gpgconf-0:2.4.8-4.fc43 100% | 28.1 MiB/s | 115.0 KiB | 00m00s [106/170] gnupg2-keyboxd-0:2.4.8-4.fc43 100% | 46.2 MiB/s | 94.7 KiB | 00m00s [107/170] gnupg2-verify-0:2.4.8-4.fc43. 100% | 83.6 MiB/s | 171.2 KiB | 00m00s [108/170] libassuan-0:2.5.7-4.fc43.x86_ 100% | 32.9 MiB/s | 67.4 KiB | 00m00s [109/170] libgcrypt-0:1.11.1-2.fc43.x86 100% | 194.0 MiB/s | 595.8 KiB | 00m00s [110/170] libgpg-error-0:1.55-2.fc43.x8 100% | 79.5 MiB/s | 244.3 KiB | 00m00s [111/170] npth-0:1.8-3.fc43.x86_64 100% | 12.5 MiB/s | 25.7 KiB | 00m00s [112/170] tpm2-tss-0:4.1.3-8.fc43.x86_6 100% | 104.0 MiB/s | 425.9 KiB | 00m00s [113/170] libksba-0:1.6.7-4.fc43.x86_64 100% | 31.3 MiB/s | 160.4 KiB | 00m00s [114/170] gnutls-0:3.8.10-3.fc43.x86_64 100% | 155.8 MiB/s | 1.4 MiB | 00m00s [115/170] openldap-0:2.6.10-4.fc43.x86_ 100% | 36.2 MiB/s | 259.6 KiB | 00m00s [116/170] json-c-0:0.18-7.fc43.x86_64 100% | 8.8 MiB/s | 45.0 KiB | 00m00s [117/170] libusb1-0:1.0.29-4.fc43.x86_6 100% | 26.0 MiB/s | 79.9 KiB | 00m00s [118/170] crypto-policies-0:20250714-5. 
100% | 48.1 MiB/s | 98.5 KiB | 00m00s [119/170] libidn2-0:2.3.8-2.fc43.x86_64 100% | 85.4 MiB/s | 174.9 KiB | 00m00s [120/170] libtasn1-0:4.20.0-2.fc43.x86_ 100% | 36.4 MiB/s | 74.5 KiB | 00m00s [121/170] libunistring-0:1.1-10.fc43.x8 100% | 176.7 MiB/s | 542.9 KiB | 00m00s [122/170] nettle-0:3.10.1-2.fc43.x86_64 100% | 138.1 MiB/s | 424.2 KiB | 00m00s [123/170] p11-kit-0:0.25.8-1.fc43.x86_6 100% | 123.0 MiB/s | 503.8 KiB | 00m00s [124/170] cyrus-sasl-lib-0:2.1.28-33.fc 100% | 153.9 MiB/s | 787.9 KiB | 00m00s [125/170] libevent-0:2.1.12-16.fc43.x86 100% | 50.3 MiB/s | 257.8 KiB | 00m00s [126/170] libtool-ltdl-0:2.5.4-7.fc43.x 100% | 11.8 MiB/s | 36.2 KiB | 00m00s [127/170] libffi-0:3.5.1-2.fc43.x86_64 100% | 20.0 MiB/s | 40.9 KiB | 00m00s [128/170] gdbm-libs-1:1.23-10.fc43.x86_ 100% | 27.7 MiB/s | 56.8 KiB | 00m00s [129/170] libxcrypt-0:4.5.2-1.fc43.x86_ 100% | 62.5 MiB/s | 128.1 KiB | 00m00s [130/170] systemd-libs-0:258.2-1.fc43.x 100% | 201.0 MiB/s | 823.5 KiB | 00m00s [131/170] audit-libs-0:4.1.2-2.fc43.x86 100% | 67.6 MiB/s | 138.4 KiB | 00m00s [132/170] libgcc-0:15.2.1-4.fc43.x86_64 100% | 6.6 MiB/s | 134.6 KiB | 00m00s [133/170] openssl-libs-1:3.5.4-1.fc43.x 100% | 238.0 MiB/s | 2.6 MiB | 00m00s [134/170] ca-certificates-0:2025.2.80_v 100% | 119.1 MiB/s | 975.4 KiB | 00m00s [135/170] lua-libs-0:5.4.8-3.fc43.x86_6 100% | 25.8 MiB/s | 131.8 KiB | 00m00s [136/170] elfutils-libelf-0:0.194-1.fc4 100% | 100.2 MiB/s | 205.2 KiB | 00m00s [137/170] elfutils-libs-0:0.194-1.fc43. 100% | 132.6 MiB/s | 271.5 KiB | 00m00s [138/170] elfutils-0:0.194-1.fc43.x86_6 100% | 140.5 MiB/s | 575.6 KiB | 00m00s [139/170] libgomp-0:15.2.1-4.fc43.x86_6 100% | 33.3 MiB/s | 375.2 KiB | 00m00s [140/170] elfutils-debuginfod-client-0: 100% | 22.9 MiB/s | 46.9 KiB | 00m00s [141/170] jansson-0:2.14-3.fc43.x86_64 100% | 22.1 MiB/s | 45.3 KiB | 00m00s [142/170] pkgconf-pkg-config-0:2.3.0-3. 
100% | 4.7 MiB/s | 9.6 KiB | 00m00s [143/170] pkgconf-0:2.3.0-3.fc43.x86_64 100% | 21.8 MiB/s | 44.6 KiB | 00m00s [144/170] pkgconf-m4-0:2.3.0-3.fc43.noa 100% | 6.8 MiB/s | 13.9 KiB | 00m00s [145/170] libpkgconf-0:2.3.0-3.fc43.x86 100% | 18.5 MiB/s | 37.9 KiB | 00m00s [146/170] curl-0:8.15.0-3.fc43.x86_64 100% | 74.3 MiB/s | 228.2 KiB | 00m00s [147/170] pyproject-srpm-macros-0:1.18. 100% | 13.0 MiB/s | 13.3 KiB | 00m00s [148/170] qt5-srpm-macros-0:5.15.18-1.f 100% | 8.4 MiB/s | 8.6 KiB | 00m00s [149/170] qt6-srpm-macros-0:6.10.0-1.fc 100% | 9.1 MiB/s | 9.4 KiB | 00m00s [150/170] p11-kit-trust-0:0.25.8-1.fc43 100% | 45.4 MiB/s | 139.6 KiB | 00m00s [151/170] alternatives-0:1.33-3.fc43.x8 100% | 19.8 MiB/s | 40.6 KiB | 00m00s [152/170] elfutils-default-yama-scope-0 100% | 6.0 MiB/s | 12.4 KiB | 00m00s [153/170] fedora-release-0:43-25.noarch 100% | 13.2 MiB/s | 13.5 KiB | 00m00s [154/170] systemd-standalone-sysusers-0 100% | 71.2 MiB/s | 145.8 KiB | 00m00s [155/170] gdb-minimal-0:16.3-6.fc43.x86 100% | 275.4 MiB/s | 4.4 MiB | 00m00s [156/170] xxhash-libs-0:0.8.3-3.fc43.x8 100% | 12.5 MiB/s | 38.5 KiB | 00m00s [157/170] fedora-release-identity-basic 100% | 4.6 MiB/s | 14.3 KiB | 00m00s [158/170] libcurl-0:8.15.0-3.fc43.x86_6 100% | 131.6 MiB/s | 404.3 KiB | 00m00s [159/170] krb5-libs-0:1.21.3-7.fc43.x86 100% | 185.3 MiB/s | 758.9 KiB | 00m00s [160/170] libbrotli-0:1.1.0-10.fc43.x86 100% | 55.2 MiB/s | 339.1 KiB | 00m00s [161/170] libnghttp2-0:1.66.0-2.fc43.x8 100% | 35.4 MiB/s | 72.5 KiB | 00m00s [162/170] libpsl-0:0.21.5-6.fc43.x86_64 100% | 21.1 MiB/s | 65.0 KiB | 00m00s [163/170] libssh-0:0.11.3-1.fc43.x86_64 100% | 75.8 MiB/s | 232.8 KiB | 00m00s [164/170] keyutils-libs-0:1.6.3-6.fc43. 
100% | 15.3 MiB/s | 31.4 KiB | 00m00s [165/170] libcom_err-0:1.47.3-2.fc43.x8 100% | 26.2 MiB/s | 26.8 KiB | 00m00s [166/170] libverto-0:0.3.2-11.fc43.x86_ 100% | 10.1 MiB/s | 20.7 KiB | 00m00s [167/170] publicsuffix-list-dafsa-0:202 100% | 14.4 MiB/s | 59.2 KiB | 00m00s [168/170] libssh-config-0:0.11.3-1.fc43 100% | 8.9 MiB/s | 9.1 KiB | 00m00s [169/170] libstdc++-0:15.2.1-4.fc43.x86 100% | 6.5 MiB/s | 921.9 KiB | 00m00s [170/170] binutils-0:2.45.50-9.fc43.x86 100% | 20.2 MiB/s | 5.9 MiB | 00m00s -------------------------------------------------------------------------------- [170/170] Total 100% | 103.3 MiB/s | 59.1 MiB | 00m01s Running transaction Importing OpenPGP key 0x31645531: UserID : "Fedora (43) " Fingerprint: C6E7F081CF80E13146676E88829B606631645531 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-43-primary The key was successfully imported. [ 1/172] Verify package files 100% | 699.0 B/s | 170.0 B | 00m00s >>> Running %pretrans scriptlet: filesystem-0:3.18-50.fc43.x86_64 >>> Finished %pretrans scriptlet: filesystem-0:3.18-50.fc43.x86_64 >>> [RPM] /var/lib/mock/fedora-43-x86_64-1763473447.786242/root/var/cache/dnf/co [ 2/172] Prepare transaction 100% | 3.7 KiB/s | 170.0 B | 00m00s [ 3/172] Installing libgcc-0:15.2.1-4. 100% | 262.0 MiB/s | 268.3 KiB | 00m00s [ 4/172] Installing libssh-config-0:0. 
100% | 0.0 B/s | 816.0 B | 00m00s [ 5/172] Installing publicsuffix-list- 100% | 0.0 B/s | 69.8 KiB | 00m00s [ 6/172] Installing fedora-release-ide 100% | 0.0 B/s | 888.0 B | 00m00s [ 7/172] Installing fedora-gpg-keys-0: 100% | 43.7 MiB/s | 179.0 KiB | 00m00s [ 8/172] Installing fedora-repos-0:43- 100% | 0.0 B/s | 5.7 KiB | 00m00s [ 9/172] Installing fedora-release-com 100% | 24.3 MiB/s | 24.9 KiB | 00m00s [ 10/172] Installing fedora-release-0:4 100% | 15.1 KiB/s | 124.0 B | 00m00s >>> Running sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch >>> Finished sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch >>> Scriptlet output: >>> Creating group 'adm' with GID 4. >>> Creating group 'audio' with GID 63. >>> Creating group 'cdrom' with GID 11. >>> Creating group 'clock' with GID 103. >>> Creating group 'dialout' with GID 18. >>> Creating group 'disk' with GID 6. >>> Creating group 'floppy' with GID 19. >>> Creating group 'ftp' with GID 50. >>> Creating group 'games' with GID 20. >>> Creating group 'input' with GID 104. >>> Creating group 'kmem' with GID 9. >>> Creating group 'kvm' with GID 36. >>> Creating group 'lock' with GID 54. >>> Creating group 'lp' with GID 7. >>> Creating group 'mail' with GID 12. >>> Creating group 'man' with GID 15. >>> Creating group 'mem' with GID 8. >>> Creating group 'nobody' with GID 65534. >>> Creating group 'render' with GID 105. >>> Creating group 'root' with GID 0. >>> Creating group 'sgx' with GID 106. >>> Creating group 'sys' with GID 3. >>> Creating group 'tape' with GID 33. >>> Creating group 'tty' with GID 5. >>> Creating group 'users' with GID 100. >>> Creating group 'utmp' with GID 22. >>> Creating group 'video' with GID 39. >>> Creating group 'wheel' with GID 10. >>> Creating user 'adm' (adm) with UID 3 and GID 4. >>> Creating group 'bin' with GID 1. >>> Creating user 'bin' (bin) with UID 1 and GID 1. >>> Creating group 'daemon' with GID 2. >>> Creating user 'daemon' (daemon) with UID 2 and GID 2. 
>>> Creating user 'ftp' (FTP User) with UID 14 and GID 50. >>> Creating user 'games' (games) with UID 12 and GID 100. >>> Creating user 'halt' (halt) with UID 7 and GID 0. >>> Creating user 'lp' (lp) with UID 4 and GID 7. >>> Creating user 'mail' (mail) with UID 8 and GID 12. >>> Creating user 'nobody' (Kernel Overflow User) with UID 65534 and GID 65534. >>> Creating user 'operator' (operator) with UID 11 and GID 0. >>> Creating user 'root' (Super User) with UID 0 and GID 0. >>> Creating user 'shutdown' (shutdown) with UID 6 and GID 0. >>> Creating user 'sync' (sync) with UID 5 and GID 0. >>> [ 11/172] Installing setup-0:2.15.0-26. 100% | 47.6 MiB/s | 730.6 KiB | 00m00s >>> [RPM] /etc/hosts created as /etc/hosts.rpmnew [ 12/172] Installing filesystem-0:3.18- 100% | 2.7 MiB/s | 212.8 KiB | 00m00s [ 13/172] Installing qt6-srpm-macros-0: 100% | 0.0 B/s | 740.0 B | 00m00s [ 14/172] Installing qt5-srpm-macros-0: 100% | 0.0 B/s | 776.0 B | 00m00s [ 15/172] Installing pkgconf-m4-0:2.3.0 100% | 0.0 B/s | 14.8 KiB | 00m00s [ 16/172] Installing pcre2-syntax-0:10. 
100% | 271.2 MiB/s | 277.8 KiB | 00m00s [ 17/172] Installing gnulib-l10n-0:2024 100% | 161.6 MiB/s | 661.9 KiB | 00m00s [ 18/172] Installing coreutils-common-0 100% | 376.4 MiB/s | 11.3 MiB | 00m00s [ 19/172] Installing ncurses-base-0:6.5 100% | 86.3 MiB/s | 353.5 KiB | 00m00s [ 20/172] Installing bash-0:5.3.0-2.fc4 100% | 255.5 MiB/s | 8.4 MiB | 00m00s [ 21/172] Installing glibc-common-0:2.4 100% | 60.0 MiB/s | 1.0 MiB | 00m00s [ 22/172] Installing glibc-gconv-extra- 100% | 270.7 MiB/s | 7.3 MiB | 00m00s [ 23/172] Installing glibc-0:2.42-4.fc4 100% | 171.9 MiB/s | 6.7 MiB | 00m00s [ 24/172] Installing ncurses-libs-0:6.5 100% | 232.6 MiB/s | 952.8 KiB | 00m00s [ 25/172] Installing glibc-minimal-lang 100% | 0.0 B/s | 124.0 B | 00m00s [ 26/172] Installing zlib-ng-compat-0:2 100% | 135.2 MiB/s | 138.4 KiB | 00m00s [ 27/172] Installing bzip2-libs-0:1.0.8 100% | 79.8 MiB/s | 81.7 KiB | 00m00s [ 28/172] Installing libgpg-error-0:1.5 100% | 56.2 MiB/s | 921.1 KiB | 00m00s [ 29/172] Installing libstdc++-0:15.2.1 100% | 355.5 MiB/s | 2.8 MiB | 00m00s [ 30/172] Installing xz-libs-1:5.8.1-2. 100% | 213.8 MiB/s | 218.9 KiB | 00m00s [ 31/172] Installing libassuan-0:2.5.7- 100% | 161.7 MiB/s | 165.6 KiB | 00m00s [ 32/172] Installing libgcrypt-0:1.11.1 100% | 393.8 MiB/s | 1.6 MiB | 00m00s [ 33/172] Installing readline-0:8.3-2.f 100% | 250.9 MiB/s | 513.9 KiB | 00m00s [ 34/172] Installing gmp-1:6.3.0-4.fc43 100% | 397.2 MiB/s | 813.5 KiB | 00m00s [ 35/172] Installing libuuid-0:2.41.1-1 100% | 37.6 MiB/s | 38.5 KiB | 00m00s [ 36/172] Installing popt-0:1.19-9.fc43 100% | 68.1 MiB/s | 139.4 KiB | 00m00s [ 37/172] Installing npth-0:1.8-3.fc43. 100% | 0.0 B/s | 50.7 KiB | 00m00s [ 38/172] Installing libblkid-0:2.41.1- 100% | 257.2 MiB/s | 263.4 KiB | 00m00s [ 39/172] Installing libzstd-0:1.5.7-2. 
100% | 391.2 MiB/s | 801.1 KiB | 00m00s [ 40/172] Installing elfutils-libelf-0: 100% | 373.7 MiB/s | 1.1 MiB | 00m00s [ 41/172] Installing sqlite-libs-0:3.50 100% | 379.1 MiB/s | 1.5 MiB | 00m00s [ 42/172] Installing libxcrypt-0:4.5.2- 100% | 281.3 MiB/s | 288.0 KiB | 00m00s [ 43/172] Installing gnupg2-gpgconf-0:2 100% | 17.6 MiB/s | 252.0 KiB | 00m00s [ 44/172] Installing libattr-0:2.5.2-6. 100% | 0.0 B/s | 25.4 KiB | 00m00s [ 45/172] Installing libacl-0:2.3.2-4.f 100% | 0.0 B/s | 36.8 KiB | 00m00s [ 46/172] Installing libtasn1-0:4.20.0- 100% | 173.9 MiB/s | 178.1 KiB | 00m00s [ 47/172] Installing libunistring-0:1.1 100% | 345.3 MiB/s | 1.7 MiB | 00m00s [ 48/172] Installing libidn2-0:2.3.8-2. 100% | 54.6 MiB/s | 558.7 KiB | 00m00s [ 49/172] Installing crypto-policies-0: 100% | 33.6 MiB/s | 172.0 KiB | 00m00s [ 50/172] Installing dwz-0:0.16-2.fc43. 100% | 18.8 MiB/s | 288.5 KiB | 00m00s [ 51/172] Installing gnupg2-verify-0:2. 100% | 26.3 MiB/s | 349.9 KiB | 00m00s [ 52/172] Installing mpfr-0:4.2.2-2.fc4 100% | 271.6 MiB/s | 834.4 KiB | 00m00s [ 53/172] Installing gawk-0:5.3.2-2.fc4 100% | 100.9 MiB/s | 1.8 MiB | 00m00s [ 54/172] Installing libksba-0:1.6.7-4. 100% | 195.8 MiB/s | 401.1 KiB | 00m00s [ 55/172] Installing unzip-0:6.0-67.fc4 100% | 29.3 MiB/s | 389.8 KiB | 00m00s [ 56/172] Installing file-libs-0:5.46-8 100% | 624.1 MiB/s | 11.9 MiB | 00m00s [ 57/172] Installing file-0:5.46-8.fc43 100% | 7.6 MiB/s | 101.7 KiB | 00m00s [ 58/172] Installing pcre2-0:10.46-1.fc 100% | 341.4 MiB/s | 699.1 KiB | 00m00s [ 59/172] Installing grep-0:3.12-2.fc43 100% | 62.7 MiB/s | 1.0 MiB | 00m00s [ 60/172] Installing xz-1:5.8.1-2.fc43. 
100% | 74.0 MiB/s | 1.3 MiB | 00m00s [ 61/172] Installing libeconf-0:0.7.9-2 100% | 65.0 MiB/s | 66.5 KiB | 00m00s [ 62/172] Installing libcap-ng-0:0.8.5- 100% | 69.2 MiB/s | 70.8 KiB | 00m00s [ 63/172] Installing audit-libs-0:4.1.2 100% | 186.3 MiB/s | 381.5 KiB | 00m00s [ 64/172] Installing pam-libs-0:1.7.1-3 100% | 126.0 MiB/s | 129.0 KiB | 00m00s [ 65/172] Installing libcap-0:2.76-3.fc 100% | 16.1 MiB/s | 214.3 KiB | 00m00s [ 66/172] Installing systemd-libs-0:258 100% | 333.8 MiB/s | 2.3 MiB | 00m00s [ 67/172] Installing libsmartcols-0:2.4 100% | 177.3 MiB/s | 181.6 KiB | 00m00s [ 68/172] Installing libsepol-0:3.9-2.f 100% | 267.9 MiB/s | 822.9 KiB | 00m00s [ 69/172] Installing libselinux-0:3.9-5 100% | 189.8 MiB/s | 194.4 KiB | 00m00s [ 70/172] Installing findutils-1:4.10.0 100% | 103.2 MiB/s | 1.9 MiB | 00m00s [ 71/172] Installing sed-0:4.9-5.fc43.x 100% | 52.8 MiB/s | 865.5 KiB | 00m00s [ 72/172] Installing libmount-0:2.41.1- 100% | 182.5 MiB/s | 373.7 KiB | 00m00s [ 73/172] Installing lz4-libs-0:1.10.0- 100% | 158.6 MiB/s | 162.5 KiB | 00m00s [ 74/172] Installing json-c-0:0.18-7.fc 100% | 82.0 MiB/s | 84.0 KiB | 00m00s [ 75/172] Installing libffi-0:3.5.1-2.f 100% | 83.0 MiB/s | 85.0 KiB | 00m00s [ 76/172] Installing p11-kit-0:0.25.8-1 100% | 114.5 MiB/s | 2.3 MiB | 00m00s [ 77/172] Installing lua-libs-0:5.4.8-3 100% | 275.4 MiB/s | 282.0 KiB | 00m00s [ 78/172] Installing alternatives-0:1.3 100% | 5.2 MiB/s | 63.8 KiB | 00m00s [ 79/172] Installing p11-kit-trust-0:0. 100% | 20.8 MiB/s | 448.2 KiB | 00m00s [ 80/172] Installing openssl-libs-1:3.5 100% | 356.5 MiB/s | 8.9 MiB | 00m00s [ 81/172] Installing coreutils-0:9.7-6. 100% | 155.8 MiB/s | 5.5 MiB | 00m00s [ 82/172] Installing ca-certificates-0: 100% | 1.8 MiB/s | 2.5 MiB | 00m01s [ 83/172] Installing gzip-0:1.13-4.fc43 100% | 25.7 MiB/s | 394.4 KiB | 00m00s [ 84/172] Installing rpm-sequoia-0:1.9. 
100% | 354.1 MiB/s | 2.5 MiB | 00m00s [ 85/172] Installing libfsverity-0:1.6- 100% | 28.8 MiB/s | 29.5 KiB | 00m00s [ 86/172] Installing libevent-0:2.1.12- 100% | 288.7 MiB/s | 886.8 KiB | 00m00s [ 87/172] Installing zstd-0:1.5.7-2.fc4 100% | 100.6 MiB/s | 1.7 MiB | 00m00s [ 88/172] Installing util-linux-core-0: 100% | 82.2 MiB/s | 1.5 MiB | 00m00s [ 89/172] Installing tar-2:1.35-6.fc43. 100% | 147.9 MiB/s | 3.0 MiB | 00m00s [ 90/172] Installing libsemanage-0:3.9- 100% | 151.5 MiB/s | 310.2 KiB | 00m00s [ 91/172] Installing systemd-standalone 100% | 22.1 MiB/s | 294.1 KiB | 00m00s [ 92/172] Installing rpm-libs-0:6.0.0-1 100% | 304.4 MiB/s | 935.2 KiB | 00m00s [ 93/172] Installing libusb1-0:1.0.29-4 100% | 18.8 MiB/s | 172.9 KiB | 00m00s >>> Running sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64 >>> Finished sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64 >>> Scriptlet output: >>> Creating group 'tss' with GID 59. >>> Creating user 'tss' (Account used for TPM access) with UID 59 and GID 59. >>> [ 94/172] Installing tpm2-tss-0:4.1.3-8 100% | 262.0 MiB/s | 1.6 MiB | 00m00s [ 95/172] Installing ima-evm-utils-libs 100% | 60.5 MiB/s | 62.0 KiB | 00m00s [ 96/172] Installing gnupg2-gpg-agent-0 100% | 30.0 MiB/s | 675.4 KiB | 00m00s [ 97/172] Installing zip-0:3.0-44.fc43. 100% | 45.5 MiB/s | 698.4 KiB | 00m00s [ 98/172] Installing gnupg2-keyboxd-0:2 100% | 28.3 MiB/s | 202.7 KiB | 00m00s [ 99/172] Installing libpsl-0:0.21.5-6. 100% | 75.7 MiB/s | 77.5 KiB | 00m00s [100/172] Installing liblastlog2-0:2.41 100% | 5.8 MiB/s | 35.9 KiB | 00m00s [101/172] Installing libfdisk-0:2.41.1- 100% | 186.2 MiB/s | 381.4 KiB | 00m00s [102/172] Installing nettle-0:3.10.1-2. 100% | 258.4 MiB/s | 793.7 KiB | 00m00s [103/172] Installing gnutls-0:3.8.10-3. 
100% | 349.0 MiB/s | 3.8 MiB | 00m00s [104/172] Installing libxml2-0:2.12.10- 100% | 94.7 MiB/s | 1.7 MiB | 00m00s [105/172] Installing libarchive-0:3.8.1 100% | 310.2 MiB/s | 953.1 KiB | 00m00s [106/172] Installing bzip2-0:1.0.8-21.f 100% | 7.5 MiB/s | 99.8 KiB | 00m00s [107/172] Installing add-determinism-0: 100% | 128.6 MiB/s | 2.4 MiB | 00m00s [108/172] Installing build-reproducibil 100% | 0.0 B/s | 1.0 KiB | 00m00s [109/172] Installing cpio-0:2.15-6.fc43 100% | 68.7 MiB/s | 1.1 MiB | 00m00s [110/172] Installing diffutils-0:3.12-3 100% | 91.8 MiB/s | 1.6 MiB | 00m00s [111/172] Installing ed-0:1.22.2-1.fc43 100% | 11.3 MiB/s | 150.4 KiB | 00m00s [112/172] Installing patch-0:2.8-2.fc43 100% | 16.9 MiB/s | 224.3 KiB | 00m00s [113/172] Installing libtool-ltdl-0:2.5 100% | 69.6 MiB/s | 71.2 KiB | 00m00s [114/172] Installing gdbm-libs-1:1.23-1 100% | 128.5 MiB/s | 131.6 KiB | 00m00s [115/172] Installing cyrus-sasl-lib-0:2 100% | 120.8 MiB/s | 2.3 MiB | 00m00s [116/172] Installing openldap-0:2.6.10- 100% | 216.0 MiB/s | 663.7 KiB | 00m00s [117/172] Installing gnupg2-dirmngr-0:2 100% | 28.9 MiB/s | 621.1 KiB | 00m00s [118/172] Installing gnupg2-0:2.4.8-4.f 100% | 218.4 MiB/s | 6.6 MiB | 00m00s [119/172] Installing rpm-sign-libs-0:6. 100% | 39.6 MiB/s | 40.6 KiB | 00m00s [120/172] Installing gpgverify-0:2.2-3. 100% | 0.0 B/s | 9.4 KiB | 00m00s [121/172] Installing libgomp-0:15.2.1-4 100% | 265.1 MiB/s | 543.0 KiB | 00m00s [122/172] Installing jansson-0:2.14-3.f 100% | 88.3 MiB/s | 90.5 KiB | 00m00s [123/172] Installing libpkgconf-0:2.3.0 100% | 77.4 MiB/s | 79.2 KiB | 00m00s [124/172] Installing pkgconf-0:2.3.0-3. 100% | 6.8 MiB/s | 91.0 KiB | 00m00s [125/172] Installing pkgconf-pkg-config 100% | 147.8 KiB/s | 1.8 KiB | 00m00s [126/172] Installing xxhash-libs-0:0.8. 100% | 89.4 MiB/s | 91.6 KiB | 00m00s [127/172] Installing libbrotli-0:1.1.0- 100% | 272.0 MiB/s | 835.6 KiB | 00m00s [128/172] Installing libnghttp2-0:1.66. 
100% | 159.5 MiB/s | 163.3 KiB | 00m00s [129/172] Installing keyutils-libs-0:1. 100% | 54.4 MiB/s | 55.7 KiB | 00m00s [130/172] Installing libcom_err-0:1.47. 100% | 0.0 B/s | 64.2 KiB | 00m00s [131/172] Installing libverto-0:0.3.2-1 100% | 26.6 MiB/s | 27.2 KiB | 00m00s [132/172] Installing krb5-libs-0:1.21.3 100% | 286.5 MiB/s | 2.3 MiB | 00m00s [133/172] Installing libssh-0:0.11.3-1. 100% | 277.9 MiB/s | 569.2 KiB | 00m00s [134/172] Installing libcurl-0:8.15.0-3 100% | 294.4 MiB/s | 904.3 KiB | 00m00s [135/172] Installing curl-0:8.15.0-3.fc 100% | 18.9 MiB/s | 464.0 KiB | 00m00s [136/172] Installing rpm-0:6.0.0-1.fc43 100% | 75.7 MiB/s | 2.6 MiB | 00m00s [137/172] Installing efi-srpm-macros-0: 100% | 40.2 MiB/s | 41.1 KiB | 00m00s [138/172] Installing java-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [139/172] Installing lua-srpm-macros-0: 100% | 0.0 B/s | 1.9 KiB | 00m00s [140/172] Installing tree-sitter-srpm-m 100% | 0.0 B/s | 9.3 KiB | 00m00s [141/172] Installing zig-srpm-macros-0: 100% | 0.0 B/s | 1.7 KiB | 00m00s [142/172] Installing filesystem-srpm-ma 100% | 0.0 B/s | 38.9 KiB | 00m00s [143/172] Installing elfutils-default-y 100% | 340.5 KiB/s | 2.0 KiB | 00m00s [144/172] Installing elfutils-libs-0:0. 100% | 224.4 MiB/s | 689.3 KiB | 00m00s [145/172] Installing elfutils-debuginfo 100% | 6.0 MiB/s | 86.3 KiB | 00m00s [146/172] Installing elfutils-0:0.194-1 100% | 146.5 MiB/s | 2.9 MiB | 00m00s [147/172] Installing binutils-0:2.45.50 100% | 314.9 MiB/s | 27.1 MiB | 00m00s [148/172] Installing gdb-minimal-0:16.3 100% | 259.9 MiB/s | 13.3 MiB | 00m00s [149/172] Installing debugedit-0:5.2-3. 
100% | 15.2 MiB/s | 217.3 KiB | 00m00s [150/172] Installing rpm-build-libs-0:6 100% | 262.9 MiB/s | 269.2 KiB | 00m00s [151/172] Installing rust-srpm-macros-0 100% | 0.0 B/s | 5.6 KiB | 00m00s [152/172] Installing perl-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [153/172] Installing package-notes-srpm 100% | 0.0 B/s | 2.0 KiB | 00m00s [154/172] Installing openblas-srpm-macr 100% | 0.0 B/s | 392.0 B | 00m00s [155/172] Installing ocaml-srpm-macros- 100% | 0.0 B/s | 2.1 KiB | 00m00s [156/172] Installing kernel-srpm-macros 100% | 0.0 B/s | 2.3 KiB | 00m00s [157/172] Installing gnat-srpm-macros-0 100% | 0.0 B/s | 1.3 KiB | 00m00s [158/172] Installing ghc-srpm-macros-0: 100% | 0.0 B/s | 1.0 KiB | 00m00s [159/172] Installing gap-srpm-macros-0: 100% | 0.0 B/s | 2.7 KiB | 00m00s [160/172] Installing fpc-srpm-macros-0: 100% | 0.0 B/s | 420.0 B | 00m00s [161/172] Installing ansible-srpm-macro 100% | 0.0 B/s | 36.2 KiB | 00m00s [162/172] Installing rpm-build-0:6.0.0- 100% | 19.3 MiB/s | 296.5 KiB | 00m00s [163/172] Installing pyproject-srpm-mac 100% | 2.4 MiB/s | 2.5 KiB | 00m00s [164/172] Installing redhat-rpm-config- 100% | 92.3 MiB/s | 189.1 KiB | 00m00s [165/172] Installing forge-srpm-macros- 100% | 39.3 MiB/s | 40.3 KiB | 00m00s [166/172] Installing fonts-srpm-macros- 100% | 55.7 MiB/s | 57.0 KiB | 00m00s [167/172] Installing go-srpm-macros-0:3 100% | 61.6 MiB/s | 63.0 KiB | 00m00s [168/172] Installing python-srpm-macros 100% | 25.8 MiB/s | 52.8 KiB | 00m00s [169/172] Installing util-linux-0:2.41. 100% | 94.0 MiB/s | 3.6 MiB | 00m00s [170/172] Installing shadow-utils-2:4.1 100% | 128.0 MiB/s | 4.0 MiB | 00m00s [171/172] Installing which-0:2.23-3.fc4 100% | 6.4 MiB/s | 85.7 KiB | 00m00s [172/172] Installing info-0:7.2-6.fc43. 100% | 203.3 KiB/s | 354.3 KiB | 00m02s Warning: skipped OpenPGP checks for 4 packages from repository: copr_base Complete! 
Finish: installing minimal buildroot with dnf5 Start: creating root cache Finish: creating root cache Finish: chroot init INFO: Installed packages: INFO: add-determinism-0.6.0-2.fc43.x86_64 alternatives-1.33-3.fc43.x86_64 ansible-srpm-macros-1-18.1.fc43.noarch audit-libs-4.1.2-2.fc43.x86_64 bash-5.3.0-2.fc43.x86_64 binutils-2.45.50-9.fc43.x86_64 build-reproducibility-srpm-macros-0.6.0-2.fc43.noarch bzip2-1.0.8-21.fc43.x86_64 bzip2-libs-1.0.8-21.fc43.x86_64 ca-certificates-2025.2.80_v9.0.304-1.1.fc43.noarch coreutils-9.7-6.fc43.x86_64 coreutils-common-9.7-6.fc43.x86_64 cpio-2.15-6.fc43.x86_64 crypto-policies-20250714-5.gitcd6043a.fc43.noarch curl-8.15.0-3.fc43.x86_64 cyrus-sasl-lib-2.1.28-33.fc43.x86_64 debugedit-5.2-3.fc43.x86_64 diffutils-3.12-3.fc43.x86_64 dwz-0.16-2.fc43.x86_64 ed-1.22.2-1.fc43.x86_64 efi-srpm-macros-6-4.fc43.noarch elfutils-0.194-1.fc43.x86_64 elfutils-debuginfod-client-0.194-1.fc43.x86_64 elfutils-default-yama-scope-0.194-1.fc43.noarch elfutils-libelf-0.194-1.fc43.x86_64 elfutils-libs-0.194-1.fc43.x86_64 fedora-gpg-keys-43-1.noarch fedora-release-43-25.noarch fedora-release-common-43-25.noarch fedora-release-identity-basic-43-25.noarch fedora-repos-43-1.noarch file-5.46-8.fc43.x86_64 file-libs-5.46-8.fc43.x86_64 filesystem-3.18-50.fc43.x86_64 filesystem-srpm-macros-3.18-50.fc43.noarch findutils-4.10.0-6.fc43.x86_64 fonts-srpm-macros-2.0.5-23.fc43.noarch forge-srpm-macros-0.4.0-3.fc43.noarch fpc-srpm-macros-1.3-15.fc43.noarch gap-srpm-macros-2-1.fc43.noarch gawk-5.3.2-2.fc43.x86_64 gdb-minimal-16.3-6.fc43.x86_64 gdbm-libs-1.23-10.fc43.x86_64 ghc-srpm-macros-1.9.2-3.fc43.noarch glibc-2.42-4.fc43.x86_64 glibc-common-2.42-4.fc43.x86_64 glibc-gconv-extra-2.42-4.fc43.x86_64 glibc-minimal-langpack-2.42-4.fc43.x86_64 gmp-6.3.0-4.fc43.x86_64 gnat-srpm-macros-6-8.fc43.noarch gnulib-l10n-20241231-1.fc43.noarch gnupg2-2.4.8-4.fc43.x86_64 gnupg2-dirmngr-2.4.8-4.fc43.x86_64 gnupg2-gpg-agent-2.4.8-4.fc43.x86_64 gnupg2-gpgconf-2.4.8-4.fc43.x86_64 
gnupg2-keyboxd-2.4.8-4.fc43.x86_64 gnupg2-verify-2.4.8-4.fc43.x86_64 gnutls-3.8.10-3.fc43.x86_64 go-srpm-macros-3.8.0-1.fc43.noarch gpg-pubkey-c6e7f081cf80e13146676e88829b606631645531-66b6dccf gpgverify-2.2-3.fc43.noarch grep-3.12-2.fc43.x86_64 gzip-1.13-4.fc43.x86_64 ima-evm-utils-libs-1.6.2-6.fc43.x86_64 info-7.2-6.fc43.x86_64 jansson-2.14-3.fc43.x86_64 java-srpm-macros-1-7.fc43.noarch json-c-0.18-7.fc43.x86_64 kernel-srpm-macros-1.0-27.fc43.noarch keyutils-libs-1.6.3-6.fc43.x86_64 krb5-libs-1.21.3-7.fc43.x86_64 libacl-2.3.2-4.fc43.x86_64 libarchive-3.8.1-3.fc43.x86_64 libassuan-2.5.7-4.fc43.x86_64 libattr-2.5.2-6.fc43.x86_64 libblkid-2.41.1-17.fc43.x86_64 libbrotli-1.1.0-10.fc43.x86_64 libcap-2.76-3.fc43.x86_64 libcap-ng-0.8.5-8.fc43.x86_64 libcom_err-1.47.3-2.fc43.x86_64 libcurl-8.15.0-3.fc43.x86_64 libeconf-0.7.9-2.fc43.x86_64 libevent-2.1.12-16.fc43.x86_64 libfdisk-2.41.1-17.fc43.x86_64 libffi-3.5.1-2.fc43.x86_64 libfsverity-1.6-3.fc43.x86_64 libgcc-15.2.1-4.fc43.x86_64 libgcrypt-1.11.1-2.fc43.x86_64 libgomp-15.2.1-4.fc43.x86_64 libgpg-error-1.55-2.fc43.x86_64 libidn2-2.3.8-2.fc43.x86_64 libksba-1.6.7-4.fc43.x86_64 liblastlog2-2.41.1-17.fc43.x86_64 libmount-2.41.1-17.fc43.x86_64 libnghttp2-1.66.0-2.fc43.x86_64 libpkgconf-2.3.0-3.fc43.x86_64 libpsl-0.21.5-6.fc43.x86_64 libselinux-3.9-5.fc43.x86_64 libsemanage-3.9-4.fc43.x86_64 libsepol-3.9-2.fc43.x86_64 libsmartcols-2.41.1-17.fc43.x86_64 libssh-0.11.3-1.fc43.x86_64 libssh-config-0.11.3-1.fc43.noarch libstdc++-15.2.1-4.fc43.x86_64 libtasn1-4.20.0-2.fc43.x86_64 libtool-ltdl-2.5.4-7.fc43.x86_64 libunistring-1.1-10.fc43.x86_64 libusb1-1.0.29-4.fc43.x86_64 libuuid-2.41.1-17.fc43.x86_64 libverto-0.3.2-11.fc43.x86_64 libxcrypt-4.5.2-1.fc43.x86_64 libxml2-2.12.10-5.fc43.x86_64 libzstd-1.5.7-2.fc43.x86_64 lua-libs-5.4.8-3.fc43.x86_64 lua-srpm-macros-1-16.fc43.noarch lz4-libs-1.10.0-3.fc43.x86_64 mpfr-4.2.2-2.fc43.x86_64 ncurses-base-6.5-7.20250614.fc43.noarch ncurses-libs-6.5-7.20250614.fc43.x86_64 
nettle-3.10.1-2.fc43.x86_64 npth-1.8-3.fc43.x86_64 ocaml-srpm-macros-11-2.fc43.noarch openblas-srpm-macros-2-20.fc43.noarch openldap-2.6.10-4.fc43.x86_64 openssl-libs-3.5.4-1.fc43.x86_64 p11-kit-0.25.8-1.fc43.x86_64 p11-kit-trust-0.25.8-1.fc43.x86_64 package-notes-srpm-macros-0.5-14.fc43.noarch pam-libs-1.7.1-3.fc43.x86_64 patch-2.8-2.fc43.x86_64 pcre2-10.46-1.fc43.x86_64 pcre2-syntax-10.46-1.fc43.noarch perl-srpm-macros-1-60.fc43.noarch pkgconf-2.3.0-3.fc43.x86_64 pkgconf-m4-2.3.0-3.fc43.noarch pkgconf-pkg-config-2.3.0-3.fc43.x86_64 popt-1.19-9.fc43.x86_64 publicsuffix-list-dafsa-20250616-2.fc43.noarch pyproject-srpm-macros-1.18.5-1.fc43.noarch python-srpm-macros-3.14-5.fc43.noarch qt5-srpm-macros-5.15.18-1.fc43.noarch qt6-srpm-macros-6.10.0-1.fc43.noarch readline-8.3-2.fc43.x86_64 redhat-rpm-config-343-11.fc43.noarch rpm-6.0.0-1.fc43.x86_64 rpm-build-6.0.0-1.fc43.x86_64 rpm-build-libs-6.0.0-1.fc43.x86_64 rpm-libs-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 rpm-sign-libs-6.0.0-1.fc43.x86_64 rust-srpm-macros-26.4-1.fc43.noarch sed-4.9-5.fc43.x86_64 setup-2.15.0-26.fc43.noarch shadow-utils-4.18.0-3.fc43.x86_64 sqlite-libs-3.50.2-2.fc43.x86_64 systemd-libs-258.2-1.fc43.x86_64 systemd-standalone-sysusers-258.2-1.fc43.x86_64 tar-1.35-6.fc43.x86_64 tpm2-tss-4.1.3-8.fc43.x86_64 tree-sitter-srpm-macros-0.4.2-1.fc43.noarch unzip-6.0-67.fc43.x86_64 util-linux-2.41.1-17.fc43.x86_64 util-linux-core-2.41.1-17.fc43.x86_64 which-2.23-3.fc43.x86_64 xxhash-libs-0.8.3-3.fc43.x86_64 xz-5.8.1-2.fc43.x86_64 xz-libs-5.8.1-2.fc43.x86_64 zig-srpm-macros-1-5.fc43.noarch zip-3.0-44.fc43.x86_64 zlib-ng-compat-2.2.5-2.fc43.x86_64 zstd-1.5.7-2.fc43.x86_64 Start: buildsrpm Start: rpmbuild -bs Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1763424000 Wrote: /builddir/build/SRPMS/composable_kernel-7.1.0-2.fc43.src.rpm Finish: rpmbuild -bs INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: 
/var/lib/mock/fedora-43-x86_64-1763473447.786242/root/var/log/dnf5.log INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz /bin/tar: Removing leading `/' from member names Finish: buildsrpm INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-rra3r7ts/composable_kernel/composable_kernel.spec) Config(child) 0 minutes 22 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. Finish: clean chroot INFO: Start(/var/lib/copr-rpmbuild/results/composable_kernel-7.1.0-2.fc43.src.rpm) Config(fedora-43-x86_64) Start(bootstrap): chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1763473447.786242/root. INFO: reusing tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1763473447.786242/root. INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start(bootstrap): cleaning package manager metadata Finish(bootstrap): cleaning package manager metadata Finish(bootstrap): chroot init Start: chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1763473447.786242/root. 
INFO: calling preinit hooks INFO: enabled root cache Start: unpacking root cache Finish: unpacking root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.2.17.0-2.fc43.x86_64 dnf5-plugins-5.2.17.0-2.fc43.x86_64 Finish: chroot init Start: build phase for composable_kernel-7.1.0-2.fc43.src.rpm Start: build setup for composable_kernel-7.1.0-2.fc43.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1763424000 Wrote: /builddir/build/SRPMS/composable_kernel-7.1.0-2.fc43.src.rpm Updating and loading repositories: Copr repository 100% | 63.8 KiB/s | 1.5 KiB | 00m00s fedora 100% | 56.2 KiB/s | 33.0 KiB | 00m01s updates 100% | 104.4 KiB/s | 32.0 KiB | 00m00s Repositories loaded. Package Arch Version Repository Size Installing: cmake x86_64 3.31.6-4.fc43 fedora 34.5 MiB fdupes x86_64 1:2.4.0-2.fc43 fedora 118.1 KiB gcc-c++ x86_64 15.2.1-4.fc43 copr_base 41.4 MiB git x86_64 2.51.1-1.fc43 updates 56.4 KiB ninja-build x86_64 1.13.1-4.fc43 fedora 480.7 KiB rocm-cmake noarch 7.1.0-1.fc43 copr_base 129.5 KiB rocm-comgr-devel x86_64 20-7.rocm7.1.0.fc43 copr_base 100.5 KiB rocm-compilersupport-macros noarch 20-7.rocm7.1.0.fc43 copr_base 160.0 B rocm-hip-devel x86_64 7.1.0-1.fc43 copr_base 3.1 MiB rocm-rpm-macros x86_64 7.0.0-1.fc43 copr_base 18.6 KiB rocm-runtime-devel x86_64 7.1.0-1.fc43 copr_base 683.4 KiB Installing dependencies: annobin-docs noarch 12.99-1.fc43 fedora 98.9 KiB annobin-plugin-gcc x86_64 12.99-1.fc43 fedora 1.0 MiB cmake-data noarch 3.31.6-4.fc43 fedora 8.5 MiB cmake-filesystem x86_64 3.31.6-4.fc43 fedora 0.0 B cmake-rpm-macros noarch 3.31.6-4.fc43 fedora 7.7 KiB cpp x86_64 15.2.1-4.fc43 copr_base 37.9 MiB emacs-filesystem noarch 1:30.0-5.fc43 fedora 0.0 B environment-modules 
x86_64 5.6.0-1.fc43 fedora 1.9 MiB expat x86_64 2.7.2-1.fc43 fedora 298.6 KiB gcc x86_64 15.2.1-4.fc43 copr_base 111.9 MiB gcc-plugin-annobin x86_64 15.2.1-4.fc43 copr_base 57.2 KiB git-core x86_64 2.51.1-1.fc43 updates 23.6 MiB git-core-doc noarch 2.51.1-1.fc43 updates 17.7 MiB glibc-devel x86_64 2.42-4.fc43 fedora 2.3 MiB groff-base x86_64 1.23.0-10.fc43 fedora 3.8 MiB hipcc x86_64 20-7.rocm7.1.0.fc43 copr_base 634.5 KiB hwdata noarch 0.401-1.fc43 updates 9.6 MiB jsoncpp x86_64 1.9.6-2.fc43 fedora 257.6 KiB kernel-headers x86_64 6.17.4-300.fc43 updates 6.7 MiB less x86_64 679-2.fc43 fedora 406.1 KiB libcbor x86_64 0.12.0-6.fc43 fedora 77.8 KiB libdrm x86_64 2.4.128-3.fc43 updates 399.9 KiB libedit x86_64 3.1-57.20251016cvs.fc43 updates 240.2 KiB libfido2 x86_64 1.16.0-3.fc43 fedora 238.5 KiB libmpc x86_64 1.3.1-8.fc43 fedora 160.6 KiB libpciaccess x86_64 0.16-16.fc43 fedora 44.5 KiB libpipeline x86_64 1.5.8-3.fc43 fedora 145.1 KiB libstdc++-devel x86_64 15.2.1-4.fc43 copr_base 37.3 MiB libtommath x86_64 1.3.1~rc1-6.fc43 fedora 126.4 KiB libuv x86_64 1:1.51.0-2.fc43 fedora 570.2 KiB libxcrypt-devel x86_64 4.5.2-1.fc43 updates 31.1 KiB make x86_64 1:4.4.1-11.fc43 fedora 1.8 MiB man-db x86_64 2.13.1-2.fc43 fedora 2.9 MiB mpdecimal x86_64 4.0.1-2.fc43 fedora 217.2 KiB ncurses x86_64 6.5-7.20250614.fc43 fedora 609.8 KiB numactl-libs x86_64 2.0.19-3.fc43 fedora 56.9 KiB openssh x86_64 10.0p1-5.fc43 fedora 1.4 MiB openssh-clients x86_64 10.0p1-5.fc43 fedora 2.6 MiB pcre2-utf32 x86_64 10.46-1.fc43 fedora 602.2 KiB perl-AutoLoader noarch 5.74-520.fc43 fedora 20.6 KiB perl-B x86_64 1.89-520.fc43 fedora 501.3 KiB perl-Carp noarch 1.54-520.fc43 fedora 46.6 KiB perl-Class-Struct noarch 0.68-520.fc43 fedora 25.4 KiB perl-Data-Dumper x86_64 2.191-521.fc43 fedora 115.6 KiB perl-Digest noarch 1.20-520.fc43 fedora 35.3 KiB perl-Digest-MD5 x86_64 2.59-520.fc43 fedora 59.7 KiB perl-DynaLoader x86_64 1.57-520.fc43 fedora 32.1 KiB perl-Encode x86_64 4:3.21-520.fc43 fedora 4.7 MiB 
perl-Errno x86_64 1.38-520.fc43 fedora 8.4 KiB perl-Error noarch 1:0.17030-2.fc43 fedora 76.7 KiB perl-Exporter noarch 5.79-520.fc43 fedora 54.3 KiB perl-Fcntl x86_64 1.20-520.fc43 fedora 48.8 KiB perl-File-Basename noarch 2.86-520.fc43 fedora 14.0 KiB perl-File-Copy noarch 2.41-520.fc43 fedora 19.7 KiB perl-File-Path noarch 2.18-520.fc43 fedora 63.5 KiB perl-File-Temp noarch 1:0.231.100-520.fc43 fedora 162.3 KiB perl-File-Which noarch 1.27-14.fc43 fedora 30.4 KiB perl-File-stat noarch 1.14-520.fc43 fedora 12.5 KiB perl-FileHandle noarch 2.05-520.fc43 fedora 9.4 KiB perl-Getopt-Long noarch 1:2.58-520.fc43 fedora 144.5 KiB perl-Getopt-Std noarch 1.14-520.fc43 fedora 11.2 KiB perl-Git noarch 2.51.1-1.fc43 updates 64.4 KiB perl-HTTP-Tiny noarch 0.090-521.fc43 fedora 154.4 KiB perl-IO x86_64 1.55-520.fc43 fedora 147.4 KiB perl-IO-Socket-IP noarch 0.43-521.fc43 fedora 100.3 KiB perl-IO-Socket-SSL noarch 2.095-2.fc43 fedora 714.5 KiB perl-IPC-Open3 noarch 1.24-520.fc43 fedora 27.7 KiB perl-MIME-Base32 noarch 1.303-24.fc43 fedora 30.7 KiB perl-MIME-Base64 x86_64 3.16-520.fc43 fedora 42.0 KiB perl-Net-SSLeay x86_64 1.94-11.fc43 fedora 1.3 MiB perl-POSIX x86_64 2.23-520.fc43 fedora 231.4 KiB perl-PathTools x86_64 3.94-520.fc43 fedora 180.0 KiB perl-Pod-Escapes noarch 1:1.07-520.fc43 fedora 24.9 KiB perl-Pod-Perldoc noarch 3.28.01-521.fc43 fedora 163.7 KiB perl-Pod-Simple noarch 1:3.47-3.fc43 fedora 565.3 KiB perl-Pod-Usage noarch 4:2.05-520.fc43 fedora 86.3 KiB perl-Scalar-List-Utils x86_64 5:1.70-1.fc43 fedora 144.9 KiB perl-SelectSaver noarch 1.02-520.fc43 fedora 2.2 KiB perl-Socket x86_64 4:2.040-2.fc43 fedora 120.3 KiB perl-Storable x86_64 1:3.37-521.fc43 fedora 231.2 KiB perl-Symbol noarch 1.09-520.fc43 fedora 6.8 KiB perl-Term-ANSIColor noarch 5.01-521.fc43 fedora 97.5 KiB perl-Term-Cap noarch 1.18-520.fc43 fedora 29.3 KiB perl-TermReadKey x86_64 2.38-26.fc43 fedora 64.0 KiB perl-Text-ParseWords noarch 3.31-520.fc43 fedora 13.6 KiB perl-Text-Tabs+Wrap noarch 
2024.001-520.fc43 fedora 22.6 KiB perl-Time-Local noarch 2:1.350-520.fc43 fedora 69.0 KiB perl-URI noarch 5.34-2.fc43 updates 268.0 KiB perl-base noarch 2.27-520.fc43 fedora 12.6 KiB perl-constant noarch 1.33-521.fc43 fedora 26.2 KiB perl-if noarch 0.61.000-520.fc43 fedora 5.8 KiB perl-interpreter x86_64 4:5.42.0-520.fc43 fedora 118.6 KiB perl-lib x86_64 0.65-520.fc43 fedora 8.5 KiB perl-libnet noarch 3.15-521.fc43 fedora 289.4 KiB perl-libs x86_64 4:5.42.0-520.fc43 fedora 11.5 MiB perl-locale noarch 1.13-520.fc43 fedora 6.1 KiB perl-mro x86_64 1.29-520.fc43 fedora 41.6 KiB perl-overload noarch 1.40-520.fc43 fedora 71.6 KiB perl-overloading noarch 0.02-520.fc43 fedora 4.9 KiB perl-parent noarch 1:0.244-520.fc43 fedora 10.3 KiB perl-podlators noarch 1:6.0.2-520.fc43 fedora 317.5 KiB perl-vars noarch 1.05-520.fc43 fedora 3.9 KiB procps-ng x86_64 4.0.4-7.fc43.1 updates 1.0 MiB python-pip-wheel noarch 25.1.1-18.fc43 fedora 1.2 MiB python3 x86_64 3.14.0-2.fc43 updates 28.9 KiB python3-libs x86_64 3.14.0-2.fc43 updates 43.0 MiB rhash x86_64 1.4.5-3.fc43 fedora 351.1 KiB rocm-clang x86_64 20-7.rocm7.1.0.fc43 copr_base 68.5 MiB rocm-clang-devel x86_64 20-7.rocm7.1.0.fc43 copr_base 26.1 MiB rocm-clang-libs x86_64 20-7.rocm7.1.0.fc43 copr_base 94.1 MiB rocm-clang-runtime-devel x86_64 20-7.rocm7.1.0.fc43 copr_base 8.4 MiB rocm-comgr x86_64 20-7.rocm7.1.0.fc43 copr_base 126.3 MiB rocm-device-libs x86_64 20-7.rocm7.1.0.fc43 copr_base 3.2 MiB rocm-hip x86_64 7.1.0-1.fc43 copr_base 27.0 MiB rocm-libc++ x86_64 20-7.rocm7.1.0.fc43 copr_base 1.3 MiB rocm-libc++-devel x86_64 20-7.rocm7.1.0.fc43 copr_base 15.0 MiB rocm-lld x86_64 20-7.rocm7.1.0.fc43 copr_base 5.9 MiB rocm-llvm x86_64 20-7.rocm7.1.0.fc43 copr_base 52.5 MiB rocm-llvm-devel x86_64 20-7.rocm7.1.0.fc43 copr_base 28.3 MiB rocm-llvm-filesystem x86_64 20-7.rocm7.1.0.fc43 copr_base 0.0 B rocm-llvm-libs x86_64 20-7.rocm7.1.0.fc43 copr_base 91.6 MiB rocm-llvm-static x86_64 20-7.rocm7.1.0.fc43 copr_base 1.9 GiB rocm-runtime 
x86_64 7.1.0-1.fc43 copr_base 3.2 MiB tcl x86_64 1:9.0.2-1.fc43 fedora 4.3 MiB tzdata noarch 2025b-3.fc43 fedora 1.6 MiB vim-filesystem noarch 2:9.1.1914-1.fc43 updates 40.0 B zlib-ng-compat-devel x86_64 2.2.5-2.fc43 fedora 107.0 KiB Transaction Summary: Installing: 138 packages Total size of inbound packages is 537 MiB. Need to download 537 MiB. After this operation, 3 GiB extra will be used (install 3 GiB, remove 0 B). [ 1/138] fdupes-1:2.4.0-2.fc43.x86_64 100% | 3.8 MiB/s | 59.0 KiB | 00m00s [ 2/138] ninja-build-0:1.13.1-4.fc43.x 100% | 7.4 MiB/s | 198.0 KiB | 00m00s [ 3/138] git-0:2.51.1-1.fc43.x86_64 100% | 20.1 MiB/s | 41.1 KiB | 00m00s [ 4/138] rocm-cmake-0:7.1.0-1.fc43.noa 100% | 5.3 MiB/s | 38.2 KiB | 00m00s [ 5/138] gcc-c++-0:15.2.1-4.fc43.x86_6 100% | 231.3 MiB/s | 15.3 MiB | 00m00s [ 6/138] rocm-comgr-devel-0:20-7.rocm7 100% | 721.8 KiB/s | 33.2 KiB | 00m00s [ 7/138] cmake-0:3.31.6-4.fc43.x86_64 100% | 126.1 MiB/s | 12.2 MiB | 00m00s [ 8/138] rocm-compilersupport-macros-0 100% | 1.0 MiB/s | 15.8 KiB | 00m00s [ 9/138] rocm-rpm-macros-0:7.0.0-1.fc4 100% | 5.0 MiB/s | 15.3 KiB | 00m00s [ 10/138] pcre2-utf32-0:10.46-1.fc43.x8 100% | 16.0 MiB/s | 228.9 KiB | 00m00s [ 11/138] rocm-hip-devel-0:7.1.0-1.fc43 100% | 5.7 MiB/s | 263.3 KiB | 00m00s [ 12/138] cmake-data-0:3.31.6-4.fc43.no 100% | 117.5 MiB/s | 2.5 MiB | 00m00s [ 13/138] cmake-filesystem-0:3.31.6-4.f 100% | 1.9 MiB/s | 15.5 KiB | 00m00s [ 14/138] expat-0:2.7.2-1.fc43.x86_64 100% | 38.7 MiB/s | 118.9 KiB | 00m00s [ 15/138] jsoncpp-0:1.9.6-2.fc43.x86_64 100% | 32.9 MiB/s | 101.1 KiB | 00m00s [ 16/138] libuv-1:1.51.0-2.fc43.x86_64 100% | 86.6 MiB/s | 266.1 KiB | 00m00s [ 17/138] make-1:4.4.1-11.fc43.x86_64 100% | 142.9 MiB/s | 585.2 KiB | 00m00s [ 18/138] rhash-0:1.4.5-3.fc43.x86_64 100% | 48.3 MiB/s | 197.9 KiB | 00m00s [ 19/138] libmpc-0:1.3.1-8.fc43.x86_64 100% | 34.4 MiB/s | 70.4 KiB | 00m00s [ 20/138] perl-File-Basename-0:2.86-520 100% | 5.6 MiB/s | 17.2 KiB | 00m00s [ 21/138] 
perl-Getopt-Long-1:2.58-520.f 100% | 20.7 MiB/s | 63.6 KiB | 00m00s [ 22/138] perl-PathTools-0:3.94-520.fc4 100% | 28.4 MiB/s | 87.2 KiB | 00m00s [ 23/138] perl-IPC-Open3-0:1.24-520.fc4 100% | 5.8 MiB/s | 23.9 KiB | 00m00s [ 24/138] perl-interpreter-4:5.42.0-520 100% | 35.3 MiB/s | 72.4 KiB | 00m00s [ 25/138] perl-TermReadKey-0:2.38-26.fc 100% | 11.5 MiB/s | 35.2 KiB | 00m00s [ 26/138] rocm-runtime-devel-0:7.1.0-1. 100% | 1.9 MiB/s | 117.2 KiB | 00m00s [ 27/138] perl-lib-0:0.65-520.fc43.x86_ 100% | 2.9 MiB/s | 15.0 KiB | 00m00s [ 28/138] perl-Git-0:2.51.1-1.fc43.noar 100% | 4.7 MiB/s | 38.2 KiB | 00m00s [ 29/138] git-core-doc-0:2.51.1-1.fc43. 100% | 144.4 MiB/s | 3.0 MiB | 00m00s [ 30/138] git-core-0:2.51.1-1.fc43.x86_ 100% | 179.0 MiB/s | 5.0 MiB | 00m00s [ 31/138] perl-File-Copy-0:2.41-520.fc4 100% | 9.8 MiB/s | 20.1 KiB | 00m00s [ 32/138] perl-File-Which-0:1.27-14.fc4 100% | 7.0 MiB/s | 21.4 KiB | 00m00s [ 33/138] perl-Getopt-Std-0:1.14-520.fc 100% | 7.7 MiB/s | 15.7 KiB | 00m00s [ 34/138] perl-Scalar-List-Utils-5:1.70 100% | 24.4 MiB/s | 75.0 KiB | 00m00s [ 35/138] environment-modules-0:5.6.0-1 100% | 97.1 MiB/s | 795.3 KiB | 00m00s [ 36/138] rocm-hip-0:7.1.0-1.fc43.x86_6 100% | 140.2 MiB/s | 10.2 MiB | 00m00s [ 37/138] emacs-filesystem-1:30.0-5.fc4 100% | 3.7 MiB/s | 7.5 KiB | 00m00s [ 38/138] perl-Carp-0:1.54-520.fc43.noa 100% | 9.3 MiB/s | 28.7 KiB | 00m00s [ 39/138] perl-Exporter-0:5.79-520.fc43 100% | 7.5 MiB/s | 30.9 KiB | 00m00s [ 40/138] perl-Pod-Usage-4:2.05-520.fc4 100% | 19.8 MiB/s | 40.5 KiB | 00m00s [ 41/138] perl-Text-ParseWords-0:3.31-5 100% | 5.3 MiB/s | 16.3 KiB | 00m00s [ 42/138] perl-base-0:2.27-520.fc43.noa 100% | 5.3 MiB/s | 16.2 KiB | 00m00s [ 43/138] perl-constant-0:1.33-521.fc43 100% | 11.1 MiB/s | 22.8 KiB | 00m00s [ 44/138] perl-overload-0:1.40-520.fc43 100% | 44.5 MiB/s | 45.6 KiB | 00m00s [ 45/138] perl-Fcntl-0:1.20-520.fc43.x8 100% | 29.1 MiB/s | 29.8 KiB | 00m00s [ 46/138] perl-IO-0:1.55-520.fc43.x86_6 100% | 40.1 MiB/s | 82.2 KiB 
| 00m00s [ 47/138] perl-POSIX-0:2.23-520.fc43.x8 100% | 47.8 MiB/s | 97.8 KiB | 00m00s [ 48/138] perl-Symbol-0:1.09-520.fc43.n 100% | 13.9 MiB/s | 14.2 KiB | 00m00s [ 49/138] perl-Errno-0:1.38-520.fc43.x8 100% | 14.6 MiB/s | 14.9 KiB | 00m00s [ 50/138] rocm-runtime-0:7.1.0-1.fc43.x 100% | 7.0 MiB/s | 641.3 KiB | 00m00s [ 51/138] perl-libs-4:5.42.0-520.fc43.x 100% | 255.8 MiB/s | 2.6 MiB | 00m00s [ 52/138] perl-DynaLoader-0:1.57-520.fc 100% | 8.5 MiB/s | 26.0 KiB | 00m00s [ 53/138] perl-vars-0:1.05-520.fc43.noa 100% | 12.7 MiB/s | 13.0 KiB | 00m00s [ 54/138] less-0:679-2.fc43.x86_64 100% | 63.6 MiB/s | 195.3 KiB | 00m00s [ 55/138] openssh-clients-0:10.0p1-5.fc 100% | 182.3 MiB/s | 746.7 KiB | 00m00s [ 56/138] perl-Error-1:0.17030-2.fc43.n 100% | 9.8 MiB/s | 40.2 KiB | 00m00s [ 57/138] numactl-libs-0:2.0.19-3.fc43. 100% | 10.1 MiB/s | 31.1 KiB | 00m00s [ 58/138] perl-Pod-Perldoc-0:3.28.01-52 100% | 27.4 MiB/s | 84.3 KiB | 00m00s [ 59/138] perl-podlators-1:6.0.2-520.fc 100% | 25.1 MiB/s | 128.3 KiB | 00m00s [ 60/138] perl-mro-0:1.29-520.fc43.x86_ 100% | 9.7 MiB/s | 29.9 KiB | 00m00s [ 61/138] perl-overloading-0:0.02-520.f 100% | 12.6 MiB/s | 12.9 KiB | 00m00s [ 62/138] perl-File-stat-0:1.14-520.fc4 100% | 8.3 MiB/s | 17.1 KiB | 00m00s [ 63/138] perl-SelectSaver-0:1.02-520.f 100% | 5.7 MiB/s | 11.7 KiB | 00m00s [ 64/138] perl-Socket-4:2.040-2.fc43.x8 100% | 17.9 MiB/s | 54.9 KiB | 00m00s [ 65/138] perl-locale-0:1.13-520.fc43.n 100% | 13.2 MiB/s | 13.5 KiB | 00m00s [ 66/138] libfido2-0:1.16.0-3.fc43.x86_ 100% | 32.0 MiB/s | 98.5 KiB | 00m00s [ 67/138] openssh-0:10.0p1-5.fc43.x86_6 100% | 82.9 MiB/s | 339.6 KiB | 00m00s [ 68/138] groff-base-0:1.23.0-10.fc43.x 100% | 157.0 MiB/s | 1.1 MiB | 00m00s [ 69/138] libpipeline-0:1.5.8-3.fc43.x8 100% | 19.5 MiB/s | 59.9 KiB | 00m00s [ 70/138] perl-File-Temp-1:0.231.100-52 100% | 28.8 MiB/s | 59.0 KiB | 00m00s [ 71/138] perl-HTTP-Tiny-0:0.090-521.fc 100% | 27.5 MiB/s | 56.3 KiB | 00m00s [ 72/138] perl-Pod-Simple-1:3.47-3.fc43 100% 
| 71.6 MiB/s | 219.9 KiB | 00m00s [ 73/138] man-db-0:2.13.1-2.fc43.x86_64 100% | 23.0 MiB/s | 1.4 MiB | 00m00s [ 74/138] perl-parent-1:0.244-520.fc43. 100% | 4.8 MiB/s | 14.8 KiB | 00m00s [ 75/138] perl-Term-ANSIColor-0:5.01-52 100% | 15.5 MiB/s | 47.6 KiB | 00m00s [ 76/138] perl-Term-Cap-0:1.18-520.fc43 100% | 10.7 MiB/s | 21.9 KiB | 00m00s [ 77/138] perl-Class-Struct-0:0.68-520. 100% | 21.6 MiB/s | 22.1 KiB | 00m00s [ 78/138] libcbor-0:0.12.0-6.fc43.x86_6 100% | 16.4 MiB/s | 33.5 KiB | 00m00s [ 79/138] perl-File-Path-0:2.18-520.fc4 100% | 17.1 MiB/s | 35.1 KiB | 00m00s [ 80/138] perl-IO-Socket-SSL-0:2.095-2. 100% | 75.4 MiB/s | 231.5 KiB | 00m00s [ 81/138] perl-MIME-Base64-0:3.16-520.f 100% | 14.5 MiB/s | 29.7 KiB | 00m00s [ 82/138] perl-Time-Local-2:1.350-520.f 100% | 33.6 MiB/s | 34.4 KiB | 00m00s [ 83/138] perl-Net-SSLeay-0:1.94-11.fc4 100% | 122.0 MiB/s | 374.8 KiB | 00m00s [ 84/138] perl-Pod-Escapes-1:1.07-520.f 100% | 19.3 MiB/s | 19.8 KiB | 00m00s [ 85/138] perl-Text-Tabs+Wrap-0:2024.00 100% | 10.6 MiB/s | 21.6 KiB | 00m00s [ 86/138] perl-if-0:0.61.000-520.fc43.n 100% | 4.6 MiB/s | 14.0 KiB | 00m00s [ 87/138] ncurses-0:6.5-7.20250614.fc43 100% | 104.1 MiB/s | 426.2 KiB | 00m00s [ 88/138] perl-IO-Socket-IP-0:0.43-521. 100% | 20.5 MiB/s | 42.1 KiB | 00m00s [ 89/138] perl-AutoLoader-0:5.74-520.fc 100% | 20.7 MiB/s | 21.2 KiB | 00m00s [ 90/138] perl-Encode-4:3.21-520.fc43.x 100% | 210.3 MiB/s | 1.1 MiB | 00m00s [ 91/138] perl-Storable-1:3.37-521.fc43 100% | 19.2 MiB/s | 98.5 KiB | 00m00s [ 92/138] perl-URI-0:5.34-2.fc43.noarch 100% | 72.9 MiB/s | 149.3 KiB | 00m00s [ 93/138] perl-Data-Dumper-0:2.191-521. 
100% | 18.3 MiB/s | 56.3 KiB | 00m00s [ 94/138] perl-MIME-Base32-0:1.303-24.f 100% | 9.9 MiB/s | 20.4 KiB | 00m00s [ 95/138] perl-libnet-0:3.15-521.fc43.n 100% | 41.8 MiB/s | 128.3 KiB | 00m00s [ 96/138] perl-B-0:1.89-520.fc43.x86_64 100% | 43.4 MiB/s | 177.7 KiB | 00m00s [ 97/138] perl-Digest-MD5-0:2.59-520.fc 100% | 17.5 MiB/s | 35.8 KiB | 00m00s [ 98/138] perl-FileHandle-0:2.05-520.fc 100% | 15.1 MiB/s | 15.5 KiB | 00m00s [ 99/138] perl-Digest-0:1.20-520.fc43.n 100% | 12.1 MiB/s | 24.8 KiB | 00m00s [100/138] libedit-0:3.1-57.20251016cvs. 100% | 34.3 MiB/s | 105.2 KiB | 00m00s [101/138] python3-0:3.14.0-2.fc43.x86_6 100% | 13.5 MiB/s | 27.7 KiB | 00m00s [102/138] mpdecimal-0:4.0.1-2.fc43.x86_ 100% | 31.6 MiB/s | 97.1 KiB | 00m00s [103/138] python-pip-wheel-0:25.1.1-18. 100% | 120.5 MiB/s | 1.2 MiB | 00m00s [104/138] tzdata-0:2025b-3.fc43.noarch 100% | 116.2 MiB/s | 713.9 KiB | 00m00s [105/138] vim-filesystem-2:9.1.1914-1.f 100% | 5.0 MiB/s | 15.5 KiB | 00m00s [106/138] procps-ng-0:4.0.4-7.fc43.1.x8 100% | 59.3 MiB/s | 364.4 KiB | 00m00s [107/138] python3-libs-0:3.14.0-2.fc43. 100% | 218.2 MiB/s | 9.8 MiB | 00m00s [108/138] tcl-1:9.0.2-1.fc43.x86_64 100% | 64.9 MiB/s | 1.2 MiB | 00m00s [109/138] libtommath-0:1.3.1~rc1-6.fc43 100% | 9.0 MiB/s | 64.3 KiB | 00m00s [110/138] libpciaccess-0:0.16-16.fc43.x 100% | 8.5 MiB/s | 26.2 KiB | 00m00s [111/138] libdrm-0:2.4.128-3.fc43.x86_6 100% | 31.6 MiB/s | 162.0 KiB | 00m00s [112/138] hipcc-0:20-7.rocm7.1.0.fc43.x 100% | 4.8 MiB/s | 133.3 KiB | 00m00s [113/138] rocm-device-libs-0:20-7.rocm7 100% | 3.6 MiB/s | 496.6 KiB | 00m00s [114/138] rocm-clang-devel-0:20-7.rocm7 100% | 15.6 MiB/s | 2.5 MiB | 00m00s [115/138] rocm-clang-0:20-7.rocm7.1.0.f 100% | 116.0 MiB/s | 15.9 MiB | 00m00s [116/138] rocm-clang-libs-0:20-7.rocm7. 
100% | 148.7 MiB/s | 23.1 MiB | 00m00s [117/138] rocm-clang-runtime-devel-0:20 100% | 7.9 MiB/s | 637.8 KiB | 00m00s [118/138] rocm-libc++-devel-0:20-7.rocm 100% | 10.9 MiB/s | 1.2 MiB | 00m00s [119/138] rocm-comgr-0:20-7.rocm7.1.0.f 100% | 39.2 MiB/s | 31.2 MiB | 00m01s [120/138] rocm-llvm-filesystem-0:20-7.r 100% | 1.3 MiB/s | 25.7 KiB | 00m00s [121/138] rocm-libc++-0:20-7.rocm7.1.0. 100% | 5.4 MiB/s | 374.1 KiB | 00m00s [122/138] rocm-lld-0:20-7.rocm7.1.0.fc4 100% | 9.7 MiB/s | 1.6 MiB | 00m00s [123/138] rocm-llvm-devel-0:20-7.rocm7. 100% | 18.6 MiB/s | 4.0 MiB | 00m00s [124/138] rocm-llvm-libs-0:20-7.rocm7.1 100% | 23.9 MiB/s | 21.2 MiB | 00m01s [125/138] gcc-0:15.2.1-4.fc43.x86_64 100% | 271.4 MiB/s | 39.6 MiB | 00m00s [126/138] libstdc++-devel-0:15.2.1-4.fc 100% | 191.0 MiB/s | 5.2 MiB | 00m00s [127/138] cpp-0:15.2.1-4.fc43.x86_64 100% | 287.2 MiB/s | 12.9 MiB | 00m00s [128/138] glibc-devel-0:2.42-4.fc43.x86 100% | 61.4 MiB/s | 565.9 KiB | 00m00s [129/138] hwdata-0:0.401-1.fc43.noarch 100% | 237.0 MiB/s | 1.7 MiB | 00m00s [130/138] kernel-headers-0:6.17.4-300.f 100% | 154.3 MiB/s | 1.7 MiB | 00m00s [131/138] libxcrypt-devel-0:4.5.2-1.fc4 100% | 14.6 MiB/s | 30.0 KiB | 00m00s [132/138] zlib-ng-compat-devel-0:2.2.5- 100% | 18.7 MiB/s | 38.3 KiB | 00m00s [133/138] annobin-plugin-gcc-0:12.99-1. 
100% | 138.9 MiB/s | 996.0 KiB | 00m00s [134/138] annobin-docs-0:12.99-1.fc43.n 100% | 21.9 MiB/s | 89.5 KiB | 00m00s [135/138] cmake-rpm-macros-0:3.31.6-4.f 100% | 3.6 MiB/s | 14.8 KiB | 00m00s [136/138] gcc-plugin-annobin-0:15.2.1-4 100% | 19.2 MiB/s | 59.0 KiB | 00m00s [137/138] rocm-llvm-0:20-7.rocm7.1.0.fc 100% | 16.8 MiB/s | 13.5 MiB | 00m01s [138/138] rocm-llvm-static-0:20-7.rocm7 100% | 33.8 MiB/s | 281.9 MiB | 00m08s -------------------------------------------------------------------------------- [138/138] Total 100% | 57.6 MiB/s | 537.5 MiB | 00m09s Running transaction [ 1/140] Verify package files 100% | 59.0 B/s | 138.0 B | 00m02s [ 2/140] Prepare transaction 100% | 1.2 KiB/s | 138.0 B | 00m00s [ 3/140] Installing cmake-filesystem-0 100% | 3.7 MiB/s | 7.6 KiB | 00m00s [ 4/140] Installing vim-filesystem-2:9 100% | 4.6 MiB/s | 4.7 KiB | 00m00s [ 5/140] Installing less-0:679-2.fc43. 100% | 26.7 MiB/s | 409.4 KiB | 00m00s [ 6/140] Installing libmpc-0:1.3.1-8.f 100% | 158.3 MiB/s | 162.1 KiB | 00m00s [ 7/140] Installing expat-0:2.7.2-1.fc 100% | 21.0 MiB/s | 300.7 KiB | 00m00s [ 8/140] Installing rocm-llvm-filesyst 100% | 6.2 MiB/s | 19.1 KiB | 00m00s [ 9/140] Installing rocm-libc++-0:20-7 100% | 44.4 MiB/s | 1.3 MiB | 00m00s [ 10/140] Installing rocm-llvm-libs-0:2 100% | 70.7 MiB/s | 91.6 MiB | 00m01s [ 11/140] Installing rocm-clang-libs-0: 100% | 68.8 MiB/s | 94.1 MiB | 00m01s [ 12/140] Installing groff-base-0:1.23. 100% | 106.8 MiB/s | 3.8 MiB | 00m00s [ 13/140] Installing numactl-libs-0:2.0 100% | 56.4 MiB/s | 57.8 KiB | 00m00s [ 14/140] Installing emacs-filesystem-1 100% | 0.0 B/s | 544.0 B | 00m00s [ 15/140] Installing rocm-comgr-0:20-7. 
100% | 67.3 MiB/s | 126.3 MiB | 00m02s [ 16/140] Installing make-1:4.4.1-11.fc 100% | 90.0 MiB/s | 1.8 MiB | 00m00s [ 17/140] Installing rocm-lld-0:20-7.ro 100% | 64.7 MiB/s | 5.9 MiB | 00m00s [ 18/140] Installing rocm-libc++-devel- 100% | 108.9 MiB/s | 15.4 MiB | 00m00s [ 19/140] Installing cpp-0:15.2.1-4.fc4 100% | 321.6 MiB/s | 38.0 MiB | 00m00s [ 20/140] Installing zlib-ng-compat-dev 100% | 106.0 MiB/s | 108.5 KiB | 00m00s [ 21/140] Installing annobin-docs-0:12. 100% | 32.6 MiB/s | 100.1 KiB | 00m00s [ 22/140] Installing kernel-headers-0:6 100% | 196.4 MiB/s | 6.9 MiB | 00m00s [ 23/140] Installing glibc-devel-0:2.42 100% | 168.1 MiB/s | 2.4 MiB | 00m00s [ 24/140] Installing libxcrypt-devel-0: 100% | 32.6 MiB/s | 33.4 KiB | 00m00s [ 25/140] Installing gcc-0:15.2.1-4.fc4 100% | 368.4 MiB/s | 112.0 MiB | 00m00s [ 26/140] Installing hwdata-0:0.401-1.f 100% | 457.8 MiB/s | 9.6 MiB | 00m00s [ 27/140] Installing libpciaccess-0:0.1 100% | 44.8 MiB/s | 45.9 KiB | 00m00s [ 28/140] Installing libdrm-0:2.4.128-3 100% | 197.1 MiB/s | 403.7 KiB | 00m00s [ 29/140] Installing rocm-runtime-0:7.1 100% | 460.0 MiB/s | 3.2 MiB | 00m00s [ 30/140] Installing rocm-runtime-devel 100% | 223.8 MiB/s | 687.6 KiB | 00m00s [ 31/140] Installing libstdc++-devel-0: 100% | 451.6 MiB/s | 37.5 MiB | 00m00s [ 32/140] Installing rocm-clang-runtime 100% | 128.5 MiB/s | 8.5 MiB | 00m00s [ 33/140] Installing libtommath-0:1.3.1 100% | 124.5 MiB/s | 127.5 KiB | 00m00s [ 34/140] Installing tcl-1:9.0.2-1.fc43 100% | 154.9 MiB/s | 4.3 MiB | 00m00s [ 35/140] Installing procps-ng-0:4.0.4- 100% | 48.1 MiB/s | 1.0 MiB | 00m00s [ 36/140] Installing tzdata-0:2025b-3.f 100% | 61.0 MiB/s | 1.9 MiB | 00m00s [ 37/140] Installing python-pip-wheel-0 100% | 622.6 MiB/s | 1.2 MiB | 00m00s [ 38/140] Installing mpdecimal-0:4.0.1- 100% | 35.6 MiB/s | 218.8 KiB | 00m00s [ 39/140] Installing python3-libs-0:3.1 100% | 314.1 MiB/s | 43.3 MiB | 00m00s [ 40/140] Installing python3-0:3.14.0-2 100% | 2.1 MiB/s | 30.6 KiB | 00m00s 
[ 41/140] Installing cmake-rpm-macros-0 100% | 8.1 MiB/s | 8.3 KiB | 00m00s [ 42/140] Installing rocm-llvm-0:20-7.r 100% | 65.9 MiB/s | 52.5 MiB | 00m01s [ 43/140] Installing rocm-llvm-devel-0: 100% | 88.9 MiB/s | 28.7 MiB | 00m00s [ 44/140] Installing rocm-llvm-static-0 100% | 89.9 MiB/s | 1.9 GiB | 00m22s [ 45/140] Installing libedit-0:3.1-57.2 100% | 118.1 MiB/s | 241.9 KiB | 00m00s [ 46/140] Installing ncurses-0:6.5-7.20 100% | 37.6 MiB/s | 616.4 KiB | 00m00s [ 47/140] Installing perl-Digest-0:1.20 100% | 36.2 MiB/s | 37.1 KiB | 00m00s [ 48/140] Installing perl-Digest-MD5-0: 100% | 60.1 MiB/s | 61.6 KiB | 00m00s [ 49/140] Installing perl-FileHandle-0: 100% | 0.0 B/s | 9.8 KiB | 00m00s [ 50/140] Installing perl-B-0:1.89-520. 100% | 246.4 MiB/s | 504.7 KiB | 00m00s [ 51/140] Installing perl-libnet-0:3.15 100% | 143.9 MiB/s | 294.7 KiB | 00m00s [ 52/140] Installing perl-Data-Dumper-0 100% | 114.8 MiB/s | 117.5 KiB | 00m00s [ 53/140] Installing perl-MIME-Base32-0 100% | 31.4 MiB/s | 32.2 KiB | 00m00s [ 54/140] Installing perl-URI-0:5.34-2. 
100% | 91.7 MiB/s | 281.8 KiB | 00m00s [ 55/140] Installing perl-AutoLoader-0: 100% | 0.0 B/s | 21.0 KiB | 00m00s [ 56/140] Installing perl-IO-Socket-IP- 100% | 99.8 MiB/s | 102.2 KiB | 00m00s [ 57/140] Installing perl-IO-Socket-SSL 100% | 233.9 MiB/s | 718.6 KiB | 00m00s [ 58/140] Installing perl-Net-SSLeay-0: 100% | 226.4 MiB/s | 1.4 MiB | 00m00s [ 59/140] Installing perl-if-0:0.61.000 100% | 0.0 B/s | 6.2 KiB | 00m00s [ 60/140] Installing perl-Text-Tabs+Wra 100% | 0.0 B/s | 23.9 KiB | 00m00s [ 61/140] Installing perl-Pod-Escapes-1 100% | 0.0 B/s | 25.9 KiB | 00m00s [ 62/140] Installing perl-Time-Local-2: 100% | 68.9 MiB/s | 70.6 KiB | 00m00s [ 63/140] Installing perl-File-Path-0:2 100% | 0.0 B/s | 64.5 KiB | 00m00s [ 64/140] Installing perl-locale-0:1.13 100% | 0.0 B/s | 6.5 KiB | 00m00s [ 65/140] Installing perl-HTTP-Tiny-0:0 100% | 152.8 MiB/s | 156.4 KiB | 00m00s [ 66/140] Installing perl-Pod-Simple-1: 100% | 187.1 MiB/s | 574.9 KiB | 00m00s [ 67/140] Installing perl-File-Temp-1:0 100% | 160.2 MiB/s | 164.1 KiB | 00m00s [ 68/140] Installing perl-Class-Struct- 100% | 0.0 B/s | 25.9 KiB | 00m00s [ 69/140] Installing perl-IPC-Open3-0:1 100% | 0.0 B/s | 28.5 KiB | 00m00s [ 70/140] Installing perl-Term-Cap-0:1. 
100% | 0.0 B/s | 30.6 KiB | 00m00s [ 71/140] Installing perl-Term-ANSIColo 100% | 96.9 MiB/s | 99.2 KiB | 00m00s [ 72/140] Installing perl-POSIX-0:2.23- 100% | 227.2 MiB/s | 232.6 KiB | 00m00s [ 73/140] Installing perl-Pod-Perldoc-0 100% | 11.0 MiB/s | 169.2 KiB | 00m00s [ 74/140] Installing perl-podlators-1:6 100% | 22.4 MiB/s | 321.4 KiB | 00m00s [ 75/140] Installing perl-File-stat-0:1 100% | 0.0 B/s | 13.1 KiB | 00m00s [ 76/140] Installing perl-Socket-4:2.04 100% | 119.4 MiB/s | 122.3 KiB | 00m00s [ 77/140] Installing perl-SelectSaver-0 100% | 0.0 B/s | 2.6 KiB | 00m00s [ 78/140] Installing perl-Symbol-0:1.09 100% | 0.0 B/s | 7.3 KiB | 00m00s [ 79/140] Installing perl-Pod-Usage-4:2 100% | 6.6 MiB/s | 87.9 KiB | 00m00s [ 80/140] Installing perl-IO-0:1.55-520 100% | 148.1 MiB/s | 151.7 KiB | 00m00s [ 81/140] Installing perl-overloading-0 100% | 0.0 B/s | 5.6 KiB | 00m00s [ 82/140] Installing perl-mro-0:1.29-52 100% | 41.7 MiB/s | 42.7 KiB | 00m00s [ 83/140] Installing perl-Fcntl-0:1.20- 100% | 48.7 MiB/s | 49.9 KiB | 00m00s [ 84/140] Installing perl-base-0:2.27-5 100% | 0.0 B/s | 13.0 KiB | 00m00s [ 85/140] Installing perl-Text-ParseWor 100% | 0.0 B/s | 14.6 KiB | 00m00s [ 86/140] Installing perl-File-Basename 100% | 0.0 B/s | 14.6 KiB | 00m00s [ 87/140] Installing perl-Getopt-Long-1 100% | 143.8 MiB/s | 147.2 KiB | 00m00s [ 88/140] Installing perl-Storable-1:3. 100% | 227.4 MiB/s | 232.8 KiB | 00m00s [ 89/140] Installing perl-MIME-Base64-0 100% | 43.2 MiB/s | 44.3 KiB | 00m00s [ 90/140] Installing perl-parent-1:0.24 100% | 0.0 B/s | 11.0 KiB | 00m00s [ 91/140] Installing perl-overload-0:1. 100% | 0.0 B/s | 72.0 KiB | 00m00s [ 92/140] Installing perl-vars-0:1.05-5 100% | 0.0 B/s | 4.3 KiB | 00m00s [ 93/140] Installing perl-Errno-0:1.38- 100% | 0.0 B/s | 8.8 KiB | 00m00s [ 94/140] Installing perl-constant-0:1. 
100% | 0.0 B/s | 27.4 KiB | 00m00s [ 95/140] Installing perl-Scalar-List-U 100% | 145.2 MiB/s | 148.7 KiB | 00m00s [ 96/140] Installing perl-Getopt-Std-0: 100% | 0.0 B/s | 11.8 KiB | 00m00s [ 97/140] Installing perl-Encode-4:3.21 100% | 187.8 MiB/s | 4.7 MiB | 00m00s [ 98/140] Installing perl-DynaLoader-0: 100% | 0.0 B/s | 32.5 KiB | 00m00s [ 99/140] Installing perl-PathTools-0:3 100% | 180.2 MiB/s | 184.6 KiB | 00m00s [100/140] Installing perl-Exporter-0:5. 100% | 0.0 B/s | 55.6 KiB | 00m00s [101/140] Installing perl-Carp-0:1.54-5 100% | 23.3 MiB/s | 47.7 KiB | 00m00s [102/140] Installing perl-libs-4:5.42.0 100% | 284.1 MiB/s | 11.6 MiB | 00m00s [103/140] Installing perl-interpreter-4 100% | 8.4 MiB/s | 120.3 KiB | 00m00s [104/140] Installing perl-TermReadKey-0 100% | 64.6 MiB/s | 66.2 KiB | 00m00s [105/140] Installing perl-lib-0:0.65-52 100% | 0.0 B/s | 8.9 KiB | 00m00s [106/140] Installing perl-File-Copy-0:2 100% | 0.0 B/s | 20.2 KiB | 00m00s [107/140] Installing perl-File-Which-0: 100% | 0.0 B/s | 31.4 KiB | 00m00s [108/140] Installing perl-Error-1:0.170 100% | 78.1 MiB/s | 80.0 KiB | 00m00s [109/140] Installing libcbor-0:0.12.0-6 100% | 77.3 MiB/s | 79.2 KiB | 00m00s [110/140] Installing libfido2-0:1.16.0- 100% | 234.4 MiB/s | 240.0 KiB | 00m00s [111/140] Installing libpipeline-0:1.5. 100% | 11.9 MiB/s | 146.6 KiB | 00m00s [112/140] Installing man-db-0:2.13.1-2. 
100% | 76.7 MiB/s | 2.9 MiB | 00m00s [113/140] Installing environment-module 100% | 60.9 MiB/s | 1.9 MiB | 00m00s [114/140] Installing openssh-0:10.0p1-5 100% | 81.9 MiB/s | 1.4 MiB | 00m00s [115/140] Installing openssh-clients-0: 100% | 96.6 MiB/s | 2.6 MiB | 00m00s [116/140] Installing git-core-0:2.51.1- 100% | 324.4 MiB/s | 23.7 MiB | 00m00s [117/140] Installing git-core-doc-0:2.5 100% | 344.2 MiB/s | 17.9 MiB | 00m00s [118/140] Installing git-0:2.51.1-1.fc4 100% | 56.4 MiB/s | 57.7 KiB | 00m00s [119/140] Installing perl-Git-0:2.51.1- 100% | 63.8 MiB/s | 65.4 KiB | 00m00s [120/140] Installing rocm-clang-0:20-7. 100% | 71.9 MiB/s | 68.5 MiB | 00m01s [121/140] Installing rocm-clang-devel-0 100% | 116.7 MiB/s | 26.3 MiB | 00m00s [122/140] Installing rocm-device-libs-0 100% | 88.2 MiB/s | 3.3 MiB | 00m00s [123/140] Installing rocm-comgr-devel-0 100% | 49.7 MiB/s | 101.9 KiB | 00m00s [124/140] Installing hipcc-0:20-7.rocm7 100% | 29.6 MiB/s | 635.9 KiB | 00m00s [125/140] Installing rocm-hip-0:7.1.0-1 100% | 337.2 MiB/s | 27.0 MiB | 00m00s [126/140] Installing rhash-0:1.4.5-3.fc 100% | 20.5 MiB/s | 356.4 KiB | 00m00s [127/140] Installing libuv-1:1.51.0-2.f 100% | 279.8 MiB/s | 573.0 KiB | 00m00s [128/140] Installing jsoncpp-0:1.9.6-2. 100% | 126.5 MiB/s | 259.2 KiB | 00m00s [129/140] Installing cmake-0:3.31.6-4.f 100% | 287.5 MiB/s | 34.5 MiB | 00m00s [130/140] Installing cmake-data-0:3.31. 
100% | 109.2 MiB/s | 9.1 MiB | 00m00s [131/140] Installing pcre2-utf32-0:10.4 100% | 294.5 MiB/s | 603.1 KiB | 00m00s [132/140] Installing fdupes-1:2.4.0-2.f 100% | 8.4 MiB/s | 120.0 KiB | 00m00s [133/140] Installing rocm-cmake-0:7.1.0 100% | 18.8 MiB/s | 134.6 KiB | 00m00s [134/140] Installing rocm-hip-devel-0:7 100% | 157.6 MiB/s | 3.2 MiB | 00m00s [135/140] Installing rocm-rpm-macros-0: 100% | 0.0 B/s | 19.1 KiB | 00m00s [136/140] Installing ninja-build-0:1.13 100% | 31.5 MiB/s | 483.8 KiB | 00m00s [137/140] Installing gcc-c++-0:15.2.1-4 100% | 311.0 MiB/s | 41.4 MiB | 00m00s [138/140] Installing annobin-plugin-gcc 100% | 65.8 MiB/s | 1.0 MiB | 00m00s [139/140] Installing gcc-plugin-annobin 100% | 3.8 MiB/s | 58.8 KiB | 00m00s [140/140] Installing rocm-compilersuppo 100% | 2.8 KiB/s | 440.0 B | 00m00s Warning: skipped OpenPGP checks for 28 packages from repository: copr_base Complete! Finish: build setup for composable_kernel-7.1.0-2.fc43.src.rpm Start: rpmbuild composable_kernel-7.1.0-2.fc43.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1763424000 Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.D55Z8O Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.YfAKd4 + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + rm -rf composable_kernel-rocm-7.1.0 + /usr/lib/rpm/rpmuncompress -x /builddir/build/SOURCES/composable_kernel-7.1.0.tar.gz + STATUS=0 + '[' 0 -ne 0 ']' + cd composable_kernel-rocm-7.1.0 + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w . 
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f + /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/0001-composable_kernel-per-dir-build.patch + sed -i -e 's@add_compile_options(-Werror)@#add_compile_options(-Werror)@' CMakeLists.txt + sed -i -e /-Werror/d cmake/EnableCompilerWarnings.cmake + sed -i -e 's@add_compile_options(-Weverything)@#add_compile_options(-Weverything)@' CMakeLists.txt + sed -i -e /-Wextra/d cmake/EnableCompilerWarnings.cmake + sed -i -e /-Wunused/d cmake/EnableCompilerWarnings.cmake + sed -i -e /-Weverything/d cmake/EnableCompilerWarnings.cmake + sed -i -e 's@-Wno-unknown-warning-option@-Wno-unknown-warning-option -Wno-unused-parameter@' cmake/EnableCompilerWarnings.cmake + sed -i -e 's@CK_TIME_KERNEL 1@CK_TIME_KERNEL 0@' include/ck/ck.hpp + sed -i -e 's@add_subdirectory(example)@#add_subdirectory(example)@' CMakeLists.txt + sed -i -e 's@add_subdirectory(profiler)@#add_subdirectory(profiler)@' CMakeLists.txt + sed -i -e s@STATIC@SHARED@ library/src/utility/CMakeLists.txt library/src/tensor_operation_instance/gpu/CMakeLists.txt + sed -i -e 's@POSITION_INDEPENDENT_CODE ON@POSITION_INDEPENDENT_CODE ON SOVERSION \"7.1.0\"@' library/src/utility/CMakeLists.txt library/src/tensor_operation_instance/gpu/CMakeLists.txt + RPM_EC=0 ++ jobs -p + exit 0 Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.qC5pXW + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=thin -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + cd composable_kernel-rocm-7.1.0 ++ cat /proc/cpuinfo ++ grep -m 1 'cpu cores' ++ awk '{ print $4 }' + COMPILE_JOBS=2 + '[' 2x = x ']' + '[' 2 = 1 ']' + BUILD_MEM=6 + MEM_KB=0 ++ cat /proc/meminfo ++ awk '{ print $2 }' ++ grep MemTotal + MEM_KB=7953344 ++ eval 'expr 7953344 / 1024' 
+++ expr 7953344 / 1024 + MEM_MB=7766 ++ eval 'expr 7766 / 1024' +++ expr 7766 / 1024 + MEM_GB=7 ++ eval 'expr 1 + 7 / 6' +++ expr 1 + 7 / 6 + COMPILE_JOBS_MEM=2 + '[' 2 -lt 2 ']' + LINK_MEM=12 ++ eval 'expr 1 + 7 / 12' +++ expr 1 + 7 / 12 + LINK_JOBS=1 + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=thin -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer 
-I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + /usr/bin/cmake -S . -B redhat-linux-build -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON -G Ninja -DBUILD_TESTING=OFF -DCK_BUILD_DEVICE_CONV=ON -DCK_BUILD_DEVICE_CONTRACTION=ON '-DCK_BUILD_DEVICE_GEMM=%{build_ck_gem}' -DCK_BUILD_DEVICE_MHA=ON -DCK_BUILD_DEVICE_OTHER=ON -DCK_BUILD_DEVICE_REDUCTION=ON -DCK_PARALLEL_COMPILE_JOBS=2 -DCK_PARALLEL_LINK_JOBS=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_COMPILER=/usr/lib64/rocm/llvm/bin/clang++ -DCMAKE_CXX_FLAGS=-fuse-ld=bfd -DCMAKE_EXPORT_COMPILE_COMMANDS=OFF '-DCMAKE_HIP_ARCHITECTURES=gfx11-generic;gfx12-generic' -DCMAKE_HIP_COMPILER=/usr/lib64/rocm/llvm/bin/clang++ -DCMAKE_INSTALL_LIBDIR=/usr/lib64 -DENABLE_CLANG_CPP_CHECKS=OFF '-DGPU_ARCHS=gfx11-generic;gfx12-generic' -DHIP_PLATFORM=amd -DROCM_SYMLINK_LIBS=OFF -- The CXX compiler identification is Clang 20.0.0 -- The HIP compiler identification is Clang 20.0.0 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: 
/usr/lib64/rocm/llvm/bin/clang++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Detecting HIP compiler ABI info -- Detecting HIP compiler ABI info - done -- Check for working HIP compiler: /usr/lib64/rocm/llvm/bin/clang++ - skipped -- Detecting HIP compile features -- Detecting HIP compile features - done -- Found Python3: /usr/bin/python3.14 (found suitable version "3.14.0", minimum required is "3.8") found components: Interpreter -- Found Git: /usr/bin/git (found version "2.51.1") fatal: not a git repository (or any of the parent directories): .git CMake Deprecation Warning at /usr/share/rocm/cmake/ROCMConfig.cmake:12 (message): Use of find_package(ROCM) is deprecated as of ROCm 6.4. Please use find_package(ROCmCMakeBuildTools) Call Stack (most recent call first): CMakeLists.txt:148 (find_package) -- GPU_TARGETS= -- GPU_ARCHS= gfx11-generic;gfx12-generic -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success -- hip_version_flat=700125436 -- checking which targets are supported -- Performing Test COMPILER_HAS_TARGET_ID_gfx11_generic -- Performing Test COMPILER_HAS_TARGET_ID_gfx11_generic - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx12_generic -- Performing Test COMPILER_HAS_TARGET_ID_gfx12_generic - Success -- Building CK for the following targets: gfx11-generic;gfx12-generic -- Enabling WMMA instances -- Enabling WMMA FP8 gemms on native architectures -- Performing Test HAS_NO_OFFLOAD_UNIFORM_BLOCK -- Performing Test HAS_NO_OFFLOAD_UNIFORM_BLOCK - Success -- Adding the fno-offload-uniform-block compiler flag -- Performing Test HAS_LSR_DROP_SOLUTION -- Performing Test HAS_LSR_DROP_SOLUTION - Success -- Adding the lsr-drop-solution=1 compiler flag -- Performing Test HAS_ENABLE_POST_MISCHED -- Performing Test HAS_ENABLE_POST_MISCHED - 
Success -- Adding the enable-post-misched=0 compiler flag -- Performing Test check-coerce -- Performing Test check-coerce - Success -- Adding the amdgpu-coerce-illegal-types=1 -- Adding -amdgpu-early-inline-all=true and -amdgpu-function-calls=false -- CMAKE_CXX_COMPILER: /usr/lib64/rocm/llvm/bin/clang++ -- CMAKE_HIP_COMPILER: /usr/lib64/rocm/llvm/bin/clang++ -- OpenMP_CXX_LIB_NAMES: libomp;libgomp;libiomp5 -- OpenMP_gomp_LIBRARY: -- OpenMP_pthread_LIBRARY: -- OpenMP_CXX_FLAGS: -fopenmp=libomp -Wno-unused-command-line-argument -- Build with HIP -- CMAKE_CXX_FLAGS: -fuse-ld=bfd -- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_ngchw_gkcyx_ngkhw_bf16_instances -- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_ngchw_gkcyx_ngkhw_f16_instances -- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_ngchw_gkcyx_ngkhw_bf16_comp_instances -- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_inter_instances -- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_intra_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_bf16_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f16_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_bf16_mem_inter_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f16_mem_inter_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f32_mem_inter_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_bf16_mem_intra_instances -- Generating sharded instantiations for target: 
device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f16_mem_intra_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f32_mem_intra_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_comp_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f16_comp_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_bf16_comp_instances -- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f16_comp_instances -- Could NOT find Python3 (missing: Python3_INCLUDE_DIRS Python3_LIBRARIES Development Development.Module Development.Embed) (found version "3.14.0") -- Configuring done (8.5s) -- Generating done (0.7s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_CXX_FLAGS_RELEASE CMAKE_C_FLAGS_RELEASE CMAKE_Fortran_FLAGS_RELEASE CMAKE_INSTALL_DO_STRIP CMAKE_VERBOSE_MAKEFILE LIB_SUFFIX SHARE_INSTALL_PREFIX SYSCONF_INSTALL_DIR -- Build files have been written to: /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build + /usr/bin/cmake --build redhat-linux-build --verbose Change Dir: '/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build' Run Build Command(s): /usr/bin/ninja-build -v [1/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_f16_instance.cpp [2/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp [3/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_f32_instance.cpp [4/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_f8_instance.cpp [5/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_int8_instance.cpp [6/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp [7/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp [8/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp [9/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp [10/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp [11/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp [12/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp [13/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp [14/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp [15/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp [16/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp [17/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp [18/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS 
-DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp [19/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp [20/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 
-DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp [21/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp [22/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp [23/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp [24/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp [25/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs 
-Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp [26/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp [27/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp [28/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp [29/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp [30/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp [31/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized 
-Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp [32/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_forward_f16_instance.cpp [33/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp [34/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS 
-DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_forward_f32_instance.cpp [35/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_forward_bf16_instance.cpp [36/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_forward_f64_instance.cpp [37/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_f16_instance.cpp [38/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_f32_instance.cpp [39/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 
-DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_bf16_instance.cpp [40/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_f64_instance.cpp [41/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_infer_f16_instance.cpp [42/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_infer_f32_instance.cpp [43/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_infer_f64_instance.cpp [44/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_gnwc_1d_instance.cpp [45/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_infer_bf16_instance.cpp [46/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_gnhwc_2d_instance.cpp [47/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_gndhwc_3d_instance.cpp [48/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_nwgc_1d_instance.cpp [49/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_nhwgc_2d_instance.cpp [50/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG 
-std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_ndhwgc_3d_instance.cpp [51/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter 
-Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx11-generic. 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx12-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for host. [52/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These 
instances are getting deprecated" | ^ 1 warning generated when compiling for gfx11-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx12-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for host. [53/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp:75:9: warning: These 
instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx11-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx12-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for host. [54/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/elementwise/device_normalize_instance.cpp [55/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/elementwise_normalization/device_elementwise_normalization_f16_instance.cpp [56/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp [57/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp [58/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp [59/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp [60/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp [61/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp [62/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp [63/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp [64/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp [65/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp [66/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored 
-Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp [67/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp [68/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp [69/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp [70/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp [71/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp [72/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp [73/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp 
[74/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm 
-amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp [75/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp [76/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp [77/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp [78/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp [79/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp [80/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp [81/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments 
-Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp [82/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp [83/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments 
-Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp [84/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd 
-O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp [85/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter 
-Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp [86/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point 
-Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp [87/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp [88/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp [89/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp [90/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored 
-Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp [91/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp [92/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp [93/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 
-DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp [94/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp [95/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp [96/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp [97/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp [98/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp [99/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp [100/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp [101/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp [102/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp [103/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp [104/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp [105/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp [106/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp [107/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp [108/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments 
-Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp [109/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 
-fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp [110/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp [111/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels 
-Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp [112/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp [113/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp [114/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp [115/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp [116/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp [117/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 
-DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp [118/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp [119/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp [120/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp [121/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp [122/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp [123/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp [124/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp [125/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp [126/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp [127/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp [128/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_bilinear/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp [129/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct 
-Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_b_scale/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp [130/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_bilinear/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp [131/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_bilinear/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp [132/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_bilinear/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp [133/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp [134/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 
-DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp [135/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp [136/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp [137/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp [138/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch 
-Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp [139/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp [140/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp [141/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp [142/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 
-DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp [143/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp [144/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp [145/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp [146/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp [147/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 
-DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp [148/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o 
-c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp [149/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp [150/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp [151/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp [152/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp [153/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp [154/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp [155/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp [156/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp [157/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp [158/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp [159/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp [160/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp [161/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp [162/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp [163/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp [164/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp [165/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp [166/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp [167/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp [168/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 
-DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp [169/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp [170/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp [171/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp [172/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp [173/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp [174/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp [175/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp [176/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp [177/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp [178/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp [179/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp [180/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp [181/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp [182/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp [183/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp [184/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp [185/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp [186/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp [187/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp [188/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp [189/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp [190/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp [191/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp [192/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp [193/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp [194/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp [195/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp [196/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp [197/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp [198/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp [199/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp [200/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp [201/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp [202/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp [203/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp [204/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp [205/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp [206/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp [207/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp [208/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp [209/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self 
-Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp 
[210/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm 
-amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp [211/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp [212/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture 
-Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp [213/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self 
-Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp [214/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion 
-Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp [215/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp [216/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp [217/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp [218/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp [219/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 
-DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp [220/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp [221/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp [222/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp [223/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 
-Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp [224/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp [225/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi 
-Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp [226/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp [227/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp [228/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp [229/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat 
-Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp [230/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 
-DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp [231/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch 
-Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp [232/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp [233/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp [234/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp [235/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp [236/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp [237/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp [238/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp [239/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef 
-Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp [240/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp [241/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef 
-Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp [242/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp [243/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point 
-Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp [244/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp [245/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp [246/455] /usr/lib64/rocm/llvm/bin/clang++ 
-DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true 
-mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp [247/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self 
-Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp [248/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp [249/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels 
-Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp [250/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture 
-Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp [251/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat 
-Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp [252/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 
-Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp [253/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp [254/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp [255/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 
-DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp [256/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized 
-Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp [257/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp [258/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp [259/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture 
-Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp [260/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp [261/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 
-Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp [262/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp [263/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp [264/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA 
-DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp [265/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef 
-Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp [266/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp [267/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp [268/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture 
-Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp [269/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 
-fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp [270/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 
-Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp [271/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp [272/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp [273/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp [274/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp [275/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 
-DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic 
-MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp [276/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp [277/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp [278/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp [279/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp [280/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp [281/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp [282/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp [283/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp [284/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point 
-Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp [285/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion 
-Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp [286/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnwc_1d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnwc_1d_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnwc_1d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_gnwc_1d_instance.cpp [287/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnhwc_2d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnhwc_2d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnhwc_2d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_gnhwc_2d_instance.cpp [288/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized 
-Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gndhwc_3d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gndhwc_3d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gndhwc_3d_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_gndhwc_3d_instance.cpp [289/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nwgc_1d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nwgc_1d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nwgc_1d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_nwgc_1d_instance.cpp [290/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nhwgc_2d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nhwgc_2d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nhwgc_2d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_nhwgc_2d_instance.cpp [291/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_ndhwgc_3d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_ndhwgc_3d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_ndhwgc_3d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_ndhwgc_3d_instance.cpp [292/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_f16_instance.cpp [293/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_bf16_instance.cpp [294/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_f32_instance.cpp [295/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f8_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_f8_instance.cpp [296/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_int8_instance.cpp [297/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_groupnorm_bwd_data_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_groupnorm_bwd_data_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_groupnorm_bwd_data_f32_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_data/device_groupnorm_bwd_data_f32_instance.cpp [298/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_data/device_layernorm2d_bwd_data_f16_instance.cpp [299/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f32_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_data/device_layernorm2d_bwd_data_f32_instance.cpp [300/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_groupnorm_bwd_gamma_beta_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_groupnorm_bwd_gamma_beta_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_groupnorm_bwd_gamma_beta_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/device_groupnorm_bwd_gamma_beta_f32_instance.cpp [301/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp [302/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/tensor_operation_instance/gpu/CMakeFiles/device_conv_operations.dir/link.d -shared -Wl,-soname,libdevice_conv_operations.so.1 -o lib/libdevice_conv_operations.so.1.1.0 library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp.o -Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link /usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [303/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp [304/455] /usr/bin/cmake -E cmake_symlink_library lib/libdevice_conv_operations.so.1.1.0 lib/libdevice_conv_operations.so.1 lib/libdevice_conv_operations.so && : [305/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point 
-Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp [306/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_layernorm2d_fwd_f16_instance.cpp [307/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_layernorm4d_fwd_f16_instance.cpp [308/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_f16_instance.cpp [309/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi 
-Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_swish_f16_instance.cpp [310/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f32_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_layernorm2d_fwd_f32_instance.cpp [311/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp [312/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f32_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_layernorm4d_fwd_f32_instance.cpp [313/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_f32_instance.cpp [314/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp16_instances.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_1d_fp16_instances.cpp [315/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_swish_f32_instance.cpp [316/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_2d_fp16_instances.cpp [317/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_3d_fp16_instances.cpp [318/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_4d_fp16_instances.cpp [319/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp16_instances.cpp.o -MF 
library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_5d_fp16_instances.cpp [320/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_1d_fp32_instances.cpp [321/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp16_instances.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_6d_fp16_instances.cpp [322/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_2d_fp32_instances.cpp [323/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_3d_fp32_instances.cpp [324/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_4d_fp32_instances.cpp [325/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_5d_fp32_instances.cpp [326/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_instances.cpp.o -MF 
library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_6d_fp32_instances.cpp [327/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_f16_instance.cpp [328/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_f16_instance.cpp [329/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_f32_instance.cpp [330/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_f32_instance.cpp [331/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp [332/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_fp8_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_fp8_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_fp8_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_6d_fp32_fp8_instances.cpp [333/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_bf16_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_bf16_instance.cpp [334/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_i8_instance.cpp [335/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_i8_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_i8_instance.cpp [336/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_f8_instance.cpp [337/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_f8_instance.cpp [338/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp [339/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_f16_instance.cpp [340/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f8_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_f8_instance.cpp [341/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp [342/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_i8_instance.cpp [343/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp [344/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp [345/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_f32_instance.cpp [346/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp [347/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp [348/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp [349/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp [350/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp [351/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp [352/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp [353/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp [354/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp [355/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp [356/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/tensor_operation_instance/gpu/CMakeFiles/device_gemm_operations.dir/link.d -shared -Wl,-soname,libdevice_gemm_operations.so.1 -o lib/libdevice_gemm_operations.so.1.1.0 library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o -Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link /usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [357/455] /usr/bin/cmake -E cmake_symlink_library lib/libdevice_gemm_operations.so.1.1.0 lib/libdevice_gemm_operations.so.1 lib/libdevice_gemm_operations.so && : [358/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_min.cpp [359/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_max.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_max.cpp [360/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp [361/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_add.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_add.cpp [362/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp [363/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp [364/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_add.cpp [365/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp [366/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp [367/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_min.cpp [368/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_max.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_max.cpp [369/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_add.cpp [370/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp [371/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp [372/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_add.cpp [373/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp [374/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp [375/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp [376/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_min.cpp [377/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_max.cpp [378/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 
-DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i32_i8_add.cpp [379/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp [380/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp [381/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_min.cpp [382/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_max.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_max.cpp [383/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_add.cpp [384/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp [385/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp [386/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp [387/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_min.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_min.cpp [388/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_max.cpp [389/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_min.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_min.cpp [390/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_max.cpp [391/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp [392/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_add.cpp [393/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp [394/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp [395/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp [396/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_add.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_add.cpp [397/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp [398/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp [399/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_min.cpp [400/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_max.cpp [401/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp [402/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_add.cpp [403/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp [404/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp [405/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_add.cpp [406/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp [407/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp [408/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_min.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_min.cpp [409/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_max.cpp [410/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp [411/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i32_i8_add.cpp [412/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp [413/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 
-DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_min.cpp [414/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_max.cpp [415/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp [416/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_add.cpp [417/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp [418/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp [419/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_min.cpp [420/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_max.cpp [421/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp [422/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp [423/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp [424/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp [425/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp [426/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp [427/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp [428/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp [429/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp [430/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp [431/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp [432/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce1.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce1.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce1.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce1.cpp [433/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce2.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce2.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce2.cpp [434/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce1.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce1.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce1.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce1.cpp [435/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce3.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce3.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce3.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce3.cpp [436/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce3.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce3.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce3.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce3.cpp [437/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce2.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce2.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce2.cpp [438/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce1.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce1.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce1.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce1.cpp [439/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce4.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce4.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce4.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce4.cpp [440/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce2.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce2.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce2.cpp [441/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce1.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce1.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce1.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce1.cpp [442/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce3.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce3.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce3.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce3.cpp [443/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce2.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce2.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce2.cpp [444/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce3.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce3.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce3.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce3.cpp [445/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o -MF library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o.d -o library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/transpose/device_transpose_instances_3d.cpp [446/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce4.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce4.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce4.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce4.cpp [447/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dutility_EXPORTS -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter 
-Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -x hip -MD -MT library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o -MF library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o.d -o library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/utility/device_memory.cpp [448/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dutility_EXPORTS 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -x hip -MD -MT library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o -MF library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o.d -o library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/utility/host_tensor.cpp [449/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dutility_EXPORTS -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point 
-Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -x hip -MD -MT library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o -MF library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o.d -o 
library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/utility/convolution_parameter.cpp [450/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/tensor_operation_instance/gpu/CMakeFiles/device_other_operations.dir/link.d -shared -Wl,-soname,libdevice_other_operations.so.1 -o lib/libdevice_other_operations.so.1.1.0 library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o 
library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnwc_1d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnhwc_2d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gndhwc_3d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nwgc_1d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nhwgc_2d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_ndhwgc_3d_instance.cpp.o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_bf16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_groupnorm_bwd_data_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_groupnorm_bwd_gamma_beta_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp32_instances.cpp.o 
library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_fp8_instances.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_i8_instance.cpp.o 
library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o 
-Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link /usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [451/455] /usr/bin/cmake -E cmake_symlink_library lib/libdevice_other_operations.so.1.1.0 lib/libdevice_other_operations.so.1 lib/libdevice_other_operations.so && : [452/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/tensor_operation_instance/gpu/CMakeFiles/device_reduction_operations.dir/link.d -shared -Wl,-soname,libdevice_reduction_operations.so.1 -o lib/libdevice_reduction_operations.so.1.1.0 library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_add.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_add.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_min.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce1.cpp.o 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce2.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce3.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce1.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce2.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce3.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce4.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce1.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce2.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce3.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce1.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce2.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce3.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce4.cpp.o -Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link 
/usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [453/455] /usr/bin/cmake -E cmake_symlink_library lib/libdevice_reduction_operations.so.1.1.0 lib/libdevice_reduction_operations.so.1 lib/libdevice_reduction_operations.so && : [454/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/utility/CMakeFiles/utility.dir/link.d -shared -Wl,-soname,libutility.so.1 -o lib/libutility.so.1.1.0 library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o -Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link /usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [455/455] /usr/bin/cmake -E cmake_symlink_library lib/libutility.so.1.1.0 lib/libutility.so.1 lib/libutility.so && : + RPM_EC=0 ++ jobs -p + exit 0 Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.MvHr7D + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + '[' /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT '!=' / ']' + rm -rf /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT ++ dirname /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT + mkdir -p /builddir/build/BUILD/composable_kernel-7.1.0-build + mkdir /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall 
-Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=thin -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed 
-Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + cd composable_kernel-rocm-7.1.0 + DESTDIR=/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT + /usr/bin/cmake --install redhat-linux-build -- Install configuration: "RelWithDebInfo" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/remod.py -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ref -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ref/naive_attention.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ref/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/pipeline/topk_softmax_warp_per_row_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/pipeline/topk_softmax_warp_per_row_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/pipeline/topk_softmax_warp_per_row_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/kernel -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/kernel/topk_softmax_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk/block/block_topk_stream_2d_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk/block/block_topk_stream_2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax/block/block_softmax_2d_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax/block/block_softmax_2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_two_pass.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_one_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/kernel/smoothquant_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/kernel/moe_smoothquant_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_two_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_one_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_model_sensitive_pass.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/kernel/rmsnorm2d_fwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block/block_reduce2d_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block/block_reduce2d_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block/block_reduce2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block/block_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute/pipeline/generic_petmute_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute/kernel -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute/kernel/generic_permute_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/thread -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/thread/thread_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/block/block_norm_reduce_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/block/block_norm_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_pipeline_two_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_pipeline_problem.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_pipeline_one_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/kernel/layernorm2d_fwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/pipeline/tile_image_to_column_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/pipeline/block_image_to_column_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/kernel/image_to_column_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils/transform_conv_fwd_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils/transform_conv_bwd_weight_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils/grouped_convolution_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils/convolution_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/kernel/grouped_convolution_forward_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/kernel/grouped_convolution_backward_weight_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/tile_gemm_aquant_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_group_quant_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_aquant_pipeline_problem.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_aquant_pipeline_ag_bg_cr_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_aquant_pipeline_ag_bg_cr_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_aquant_pipeline_ag_bg_cr_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/kernel/gemm_aquant_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/block/block_universal_gemm_as_aquant_bs_cr.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_smfmac_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_dispatcher.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_attribute_smfmac_impl.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_attribute_smfmac.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_attribute_mfma_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_attribute_mfma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/wp_pipeline_agmem_bgmem_creg_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/wp_pipeline_agmem_bgmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/wp_pipeline_agmem_bgmem_creg_base_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/tile_gemm_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/tile_gemm_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_universal_pipeline_ag_bg_cr_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_agmem_bgmem_creg_v2_default_policy.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_agmem_bgmem_creg_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_agmem_bgmem_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_agmem_bgmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_scheduler.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_mem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v5_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v5.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v4_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/universal_gemm_kernel.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/grouped_gemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/gemm_tile_partitioner.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/gemm_multi_d_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/gemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/batched_gemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_wp_asmem_bsmem_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_wp_asmem_bsmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_universal_gemm_as_bs_cr.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_bsmem_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_bsmem_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_bsmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_breg_creg_v1_default_policy.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_breg_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_breg_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v2r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v2_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v2_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_one_warp_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_breg_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_breg_creg_v1_custom_policy.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_breg_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bgmem_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bgmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/moe_sorting_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/moe_sorting_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_pipeline_flatmm_uk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_pipeline_flatmm_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_pipeline_flatmm_ex.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/moe_sorting_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/moe_sorting_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/fused_moegemm_tile_partitioner.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/fused_moegemm_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/fused_moegemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/tile_fmha_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/tile_fmha_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qx_ks_vs_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qs_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qs_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_whole_k_prefetch_default_policy.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_whole_k_prefetch.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_fp8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_async_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_async.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_enum.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_qr_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_qr_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_nwarp_sshuffle_qr_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_nwarp_sshuffle_qr_ks_vs.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_combine_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_combine_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_pagedkv_pipeline_qr_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_pagedkv_pipeline_qr_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_appendkv_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_appendkv_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_pipeline_enum.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_dq_dk_dv_pipeline_kr_ktr_vr_iglp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_dq_dk_dv_pipeline_kr_ktr_vr.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_dot_do_o.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_convert_dq.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_batch_prefill_pipeline_qr_ks_vs_async_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_batch_prefill_pipeline_qr_ks_vs_async.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_splitkv_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_splitkv_combine_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_pagedkv_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_appendkv_tile_partitioner.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_appendkv_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_bwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_batch_prefill_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/variants.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/page_block_navigator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_rotary_embedding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_position_encoding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_masking.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_dropout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_attention_bias_enum.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/pipeline/tile_flatmm_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/pipeline/flatmm_pipeline_agmem_bgmem_creg_v1_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/pipeline/flatmm_pipeline_agmem_bgmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/kernel/flatmm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk/flatmm_uk_gfx9_32x512x128_1x1x1_16x16x16.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk/flatmm_sn_uk_gfx9_32x128x512_1x4x1_16x16x16_itl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk/flatmm_sn_uk_gfx9_32x128x512_1x4x1_16x16x16.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/flatmm_uk_config.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/flatmm_sn_32x128x512_1x4x1_16x16x32_itl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/flatmm_sn_32x128x512_1x4x1_16x16x32.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/flatmm_32x512x128_1x4x1_16x16x32.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/block_flatmm_asmem_bsmem_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/block_flatmm_asmem_bsmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue/dynamic_quant_epilogue.hpp 
-- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue/default_2d_epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue/default_2d_and_dynamic_quant_epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue/cshuffle_epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/unary_element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/pipeline/elementwise_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/pipeline/elementwise_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/pipeline/elementwise_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/kernel/elementwise_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/binary_elementwise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common/utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common/tensor_layout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common/generic_2d_block_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_lds_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_lds_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_lds_pipeline.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_common_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/kernel/batched_transpose_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline/add_rmsnorm2d_rdquant_fwd_pipeline_three_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline/add_rmsnorm2d_rdquant_fwd_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline/add_rmsnorm2d_rdquant_fwd_pipeline_one_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline/add_rmsnorm2d_rdquant_fwd_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/kernel/add_rmsnorm2d_rdquant_fwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/timer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/stream_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/stream_config.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/rotating_buffers.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_transpose.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_topk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_rowwise_quantization2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_rmsnorm2d_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_moe_sorting.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_layernorm2d_fwd.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_im2col.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_grouped_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_grouped_conv_bwd_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_fused_moe.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_transpose.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_rotary_position_embedding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_masking.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_dropout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/ranges.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/kernel_launch.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/joinable_thread.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/host_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/hip_check_error.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/flush_icache.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/fill.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/device_prop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/device_memory.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/convolution_parameter.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/convolution_host_tensor_descriptor_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/concat.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/check_err.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/arg_parser.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/unary_element_function.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/type_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/transpose_vectors.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/to_sequence.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/static_counter.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/reduce_operator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/random.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/philox_rand.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/magic_div.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/literals.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/ignore.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/functional_with_tuple.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/functional.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/env.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/debug.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/bit_cast.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/update_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/transpose_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_window_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_window_linear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_window_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_window.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_scatter_gather.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_distribution_encoding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_distribution.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_view.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_descriptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_coordinate.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_adaptor_coordinate.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_adaptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/sweep_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/store_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/static_distributed_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/slice_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/shuffle_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/null_tile_window.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/null_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/load_tile_transpose.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/load_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/buffer_view.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/vector_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/type_convert.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/pk_int4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/pk_fp4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/numeric.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/null_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/mxfp_convert.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/math.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/integral_constant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/integer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/int8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/half.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/float8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/bfloat16.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/tuple.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/thread_buffer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/statically_indexed_array.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/span.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/sequence.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/multi_index.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/meta_data_buffer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/map.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/container_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/array.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/config.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/workgroup_barrier.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/utility.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/generic_memory_space_atomic.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/arch.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/amd_transpose_load_encoding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/amd_buffer_addressing_builtins.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/amd_buffer_addressing.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/static_encoding_pattern.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/space_filling_curve.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/indexing_adaptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/coordinate_transform.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/cluster_descriptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/bias.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/mask.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rotary.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/utils.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_other_operationsTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_other_operationsTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_gemm_operationsTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_gemm_operationsTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0 -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_conv_operationsTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_conv_operationsTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_reduction_operationsTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_reduction_operationsTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/ck.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils/tensor_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils/tensor_partition.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils/layout_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils/kernel_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/traits -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/traits/blockwise_gemm_xdl_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/operations -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/operations/gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/operations/copy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/layout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/version.h.in -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/workgroup_synchronization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/workgroup_barrier.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/type_convert.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/tuple_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/tuple.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/transpose_vectors.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/thread_group.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/synchronization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/statically_indexed_array_multi_index.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/statically_indexed_array.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/static_buffer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/span.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/sequence_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/sequence.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/scaled_type_convert.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/reduction_operator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/reduction_functions_accumulate.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/reduction_enums.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/reduction_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/random_gen.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/numeric_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/numeric_limits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/number.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/mxfp_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/mxf8_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/mxf6_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/mxf4_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/multi_index.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/math_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/math.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/magic_division.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/loop_scheduler.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/is_known_at_compile_time.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/is_detected.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/integral_constant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/inner_product_dpp8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/inner_product.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/ignore.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/get_shift.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/get_id.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/generic_memory_space_atomic.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/functional4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/functional3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/functional2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/functional.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/flush_icache.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/filter_tuple.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/f8_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/env.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/enable_if.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/e8m0.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/dynamic_buffer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/dtype_vector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/dtype_fp64.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/debug.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/data_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/container_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/container_element_picker.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/common_header.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/c_style_pointer_cast.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/blkgemmpipe_scheduler.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/array_multi_index.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/array.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_xdlops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_wave_read_first_lane.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_smfmac.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_lds.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_inline_asm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_gemm_dpp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_ck_fp8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_buffer_addressing_builtins.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_buffer_addressing.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_address_space.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_ngchw_to_nhwgc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_fwd_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_bwd_weight_to_gemm_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_bwd_weight_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_bwd_data_to_gemm_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_contraction_to_gemm_arraybase.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_contraction_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp/xdlops_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp/wmma_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp/smfmac_xdlops_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp/dpp_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v7r3_scatter.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v7r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v7r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v7.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v6r3.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v6r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v6r1r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v6r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v5r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v4r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v3r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v3r1_gather.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v3r1_dequant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v3r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_util.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_set.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_gemm_dlops_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_contraction_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/reduction_functions_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_welford_variance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_splitk_2nd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_splitk_1st.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_naive_variance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_bwd_gamma_beta.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_bwd_data.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_tensor_rearrange.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_sparse_embeddings_forward_layernorm_builtins.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_sparse_embeddings_forward_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_set_multiple_buffer_value.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_set_buffer_value.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_put_element_1d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_mx_gemm_bpreshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_mx_gemm_bns.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_mx_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_gemm_blockscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_gemm.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v3r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v3r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v3r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v2r4r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v2r4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v2r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_splitk_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_skip_b_lds_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_bwd_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_waveletmodel_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_layernorm_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_mx_bpreshuffle.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_d_blockscale_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_d_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_d_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_streamk_v3.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_conv_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_wmma_cshuffle_v3_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_wmma_cshuffle_v3_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_wmma_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_waveletmodel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_split_k_multiple_d_xdl_cshuffle_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_split_k_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_reduce_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v4_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v2.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_xdl_splitk_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_xdl_cshuffle_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_multiple_r_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_dpp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_dl_v1r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_dl_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_bias_add_reduce_xdl_cshuffle_v1.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_fpAintB_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_elementwise_layernorm_welford_variance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_elementwise_2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_elementwise_1d_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batchnorm_forward_blockwise_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batchnorm_backward_blockwise_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_softmax_gemm_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_softmax_gemm_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_multiple_d_softmax_gemm_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_multiple_d_gemm_multiple_d_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_gemm_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_reduction_threadwise_multi_d.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_reduction_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_reduction_multiblock.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_multiple_reduction_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_multiple_reduction_multiblock.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gemm_layernorm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gemm_layernorm/gridwise_welford_second_half_layernorm2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gemm_layernorm/gridwise_gemm_multiple_d_welford_first_half_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/block_to_ctile_map.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_welford_second_half_multiblock_reduce_first_half.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_welford_second_half_batchnorm_forward_final_obsolete.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_welford_first_half.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_reduce_second_half_batchnorm_backward_final.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_batchnorm_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/unary_element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/quantization_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/combined_element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/binary_element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/welford_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/tensor_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/tensor_layout.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/reduction_operator_mapping.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/matrix_padder.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/masking_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/split_k_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/split_k_arg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_splitk_contraction_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_sparse_embeddings_forward_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_softmax_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_reduce_threadwise_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_reduce_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_reduce_multiblock.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_reduce_common.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_put_element_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_pool3d_fwd_ndhwc_ndhwc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_pool2d_fwd_nhwc_nhwc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_permute_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_normalization_fwd_splitk_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_normalization_fwd_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_normalization_bwd_gamma_beta_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_normalization_bwd_data_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_multiple_reduce_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_multiple_reduce_multiblock.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_multi_query_attention_forward_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_mx_gemm_bpreshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_mx_gemm_bns.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_mx_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_gemm_blockscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_max_pool_bwd_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_image_to_column_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_query_attention_forward_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_xdl_splitk_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_xdl_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_softmax_gemm_permute_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multiple_d_xdl_cshuffle_tile_loop.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multiple_d_splitk_xdl_cshuffle_two_stage.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multiple_d_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multi_abd_xdl_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_xdl_large_tensor_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_multiple_r_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_multiple_r.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_abd_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_abd_xdl_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_dl_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_dl_multiple_d_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_two_stage_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_explicit_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_data_multiple_d_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_data_multiple_d_wmma_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_contraction_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_waveletmodel_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_splitk_c_shuffle_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_splitk_c_shuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_skip_b_lds.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_layernorm_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_streamk_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_wmma_cshuffle_v3_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_wmma_cshuffle_v3_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_wmma_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_reduce_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_v3_blockscale_bpreshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_v3_b_preshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_v3_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_multiple_r_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_layernorm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_dpp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_bias_add_reduce_xdl_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_fpAintB_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_elementwise_scale_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_elementwise_normalization_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_elementwise_dynamic_vector_dims_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_convnd_bwd_data_nwc_kxc_nwk_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_convnd_bwd_data_nwc_kxc_nwk_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv3d_fwd_naive_ndhwc_kzyxc_ndhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_fwd_xdl_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_fwd_xdl_c_shuffle_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_fwd_xdl_c_shuffle_bias_activation_nhwc_kyxc_nhwk.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_fwd_xdl_c_shuffle_bias_activation_add_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_bwd_data_xdl_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_backward_weight_xdl_c_shuffle_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_contraction_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_contraction_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_contraction_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_column_to_image_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_cgemm_4gemm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batchnorm_forward_impl_obsolete.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batchnorm_forward_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batchnorm_backward_impl.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_xdl_fpAintB_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_wmma_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_softmax_gemm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_softmax_gemm_permute_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_softmax_gemm_permute_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_reduce_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_gemm_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multi_d_xdl.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_gemm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_e_permute_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_contraction_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_contraction_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_avgpool3d_bwd_ndhwc_ndhwc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_avgpool2d_bwd_nhwc_nhwc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/codegen_device_grouped_conv_fwd_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/gemm_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_splitk_contraction_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_reduce_multi_d.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_put_element.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_pool_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_normalization_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_normalization_bwd_gamma_beta.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_normalization_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_multiple_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_max_pool_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_tile_loop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_splitk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_softmax_gemm_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_multi_abd_fixed_nk.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_multi_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_fwd_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_fwd_multiple_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_bwd_weight_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_bwd_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_bwd_data_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_contraction_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_streamk_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_streamk.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_splitk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d_multiple_r.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_dequantB.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_bias_e_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_elementwise_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_elementwise_normalization.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_tensor_rearrange.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_fwd_bias_activation_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_fwd_bias_activation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_contraction_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_contraction_multiple_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_cgemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batchnorm_infer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batchnorm_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batchnorm_backward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_softmax_gemm_permute.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_softmax_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_multiple_d_gemm_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_e_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_contraction_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_avgpool_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/convolution_forward_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/convolution_backward_weight_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/convolution_backward_data_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/conv_tensor_rearrange_op.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v7r3_scatter.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v7r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v7r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v7.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v6r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v6r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v6r1r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v6r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v4r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v4r1_gather.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v4r1_dequant.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v4r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_gather_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/reduction_functions_blockwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_tensor_slice_transfer_v5r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_xdlops_skip_b_lds.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_xdlops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_smfmac_xdlops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v5.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v4_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3_mx_bpreshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v2_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v2_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v1_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v1_b_scale.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v1_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_nbs_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_nbs_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_nbs_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_nbs_gufusion_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_gufusion_v3.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_bpreshuffle_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_gufusion_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_gufusion_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_blockscale_b_preshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_blockscale_b_preshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_blockscale_b_preshuffle_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_scale_selector.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_mx_moe_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_mx_moe_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_mx_moe_gufusion_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_gufusion_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_gufusion_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_gufusion_dequant_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_dequant_v3.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_dequant_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_ab_scale_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmmaops_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmmaops_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmmaops_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmmaops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmma_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_mx_pipeline_xdlops_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_dpp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_dlops_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_dlops_v2r2.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_dl_v2r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/tensor_space_filling_curve.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/tensor_descriptor_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/tensor_descriptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/tensor_adaptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/multi_index_transform_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/multi_index_transform.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/cluster_descriptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor/static_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/stream_config.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/problem_transform -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/problem_transform/transform_forward_convolution3d_into_gemm_v4r4r4_ndhwc_kzyxc_ndhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/thread.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/ranges.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/numeric.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/literals.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/iterator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/host_tensor_generator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/host_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/host_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/host_common_util.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/fill.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/device_memory.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/convolution_parameter.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/convolution_host_tensor_descriptor_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/conv_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/check_err.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/algorithm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/stream_utility.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/kernel_launch.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/io.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/hip_check_error.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/flush_cache.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/device_prop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/filesystem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/config.h.in -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/README.md -- Up-to-date: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck -- Up-to-date: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/transpose_3d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/transpose -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/transpose/device_transpose_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce1.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/reduce.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i32_i8_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i32_i8_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_amax.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_add.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_avg.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_impl_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i32_i8_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i32_i8_add.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_norm2.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_max.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/grouped_convolution_forward_perlayer_quantization.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/grouped_convolution_forward_perchannel_quantization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/grouped_convolution_bias_forward_perlayer_quantization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/grouped_convolution_bias_forward_perchannel_quantization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/gemm_quantization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/pool3d_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/pool2d_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/permute_scale -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/permute_scale/device_permute_scale_instances.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/permute_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/normalization_fwd_swish.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/normalization_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/max_pool_bwd.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/layernorm_bwd_gamma_beta.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/layernorm_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/groupnorm_bwd_gamma_beta.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/groupnorm_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_tile_loop_multiply.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_tile_loop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_multi_abd_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_fastgelu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_bias.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm/device_grouped_gemm_xdl_splitk_instance.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_xdl_merged_groups.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_xdl_large_tensor.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_scaleadd_scaleadd_relu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_scaleadd_ab.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_mem_intra_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_mem_inter_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_dynamic_op.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_dl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_convscale_relu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_convscale_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_convscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_convinvscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_comp_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_clamp_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_clamp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_bias_clamp_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_bias_clamp.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_explicit_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_dl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data_scale.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_scaleadd_scaleadd_relu_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_scaleadd_ab_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_scale_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_outelementop_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_merged_groups_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_mem_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_large_tensor_instance.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_dynamic_op_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_comp_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_binary_outelementop_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_bilinear_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_wmma_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_dl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_xdl_scale_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_xdl_instance.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_xdl_bilinear_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_wmma_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_v3_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_two_stage_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_dl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_exp_gemm_xdl_universal_km_kn_mn_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_xdl_scale_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_xdl_bilinear_instance.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_f16_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_transpose_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_preshuffle.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_batched.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_splitk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_multiply_multiply_wp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_multiply_multiply.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_multiply_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_multi_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_fastgelu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_dpp.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_dl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_blockscale_wp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_bilinear.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_silu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_relu_add_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_relu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_multiply.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_fastgelu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_add_fastgelu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/elementwise_normalization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v2_instance.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_interwave_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_gemm_mean_squaremean_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_elementwise_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/convolution_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/convolution_backward_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/conv_tensor_rearrange -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/conv_tensor_rearrange/device_image_to_column_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/conv_tensor_rearrange/device_column_to_image_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/conv_tensor_rearrange.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/contraction_scale.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/contraction_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/contraction -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/contraction/device_contraction_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batchnorm_infer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batchnorm_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batchnorm_backward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_softmax_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_bias_softmax_gemm_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_bias_permute.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_add_relu_gemm_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/avg_pool3d_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/avg_pool2d_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/device_operation_instance_factory.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/add_grouped_conv_bwd_wei_exp_device_operation_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/add_device_operation_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/gpu -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/gpu/reference_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/gpu/naive_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_sparse_embedding3_forward_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_pool_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_mx_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_mx_gemm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_mx_gemm1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_gemm2_blockscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_gemm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_gemm1_blockscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_maxpool_bwd.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_layernorm_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_image_to_column.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_groupnorm_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_groupnorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_gemm_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_gemm_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_fpAintB_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_fwd_bias_activation_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_fwd_bias_activation.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_bwd_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_contraction.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_column_to_image.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_cgemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_batchnorm_infer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_batchnorm_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_batchnorm_backward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_batched_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_avgpool_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libutility.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libutility.so.1 -- Set non-toolchain portion of runtime path of 
"/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libutility.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libutility.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kernelutilityTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kernelutilityTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kernelConfig.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kernelConfigVersion.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/version.h -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/config.h -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composablekernel/LICENSE + cp -p -r include/ck_tile /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include + _target= + _symlinks=0 + fdupes -q -n -r -p /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr + read _file + test -z '' + _target=/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp + read _file + test -z /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp + test -z 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_f16_instance.hpp + test 0 = 1 + ln -f /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_f16_instance.hpp + read _file + test -z /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp + test -z '' + _target= + continue + read _file + rm -f /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composablekernel/LICENSE + /usr/bin/find-debuginfo -j4 --strict-build-id -m -i --build-id-seed 7.1.0-2.fc43 --unique-debug-suffix -7.1.0-2.fc43.x86_64 --unique-debug-src-base composable_kernel-7.1.0-2.fc43.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0 find-debuginfo: starting Extracting debug info from 5 files debugedit: debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubnames debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubtypes /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubnames debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0: 
Unknown debugging section .debug_gnu_pubtypes debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0: Unit type 4 unhandled debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0: Unit type 4 unhandled debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubnames debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubtypes debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0: Unit type 4 unhandled debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubnames debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubtypes debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0: Unit type 4 unhandled DWARF-compressing 5 files dwz: ./usr/lib64/libdevice_conv_operations.so.1.1.0-7.1.0-2.fc43.x86_64.debug: Unknown debugging section .debug_addr dwz: ./usr/lib64/libdevice_gemm_operations.so.1.1.0-7.1.0-2.fc43.x86_64.debug: Unknown debugging section .debug_addr dwz: ./usr/lib64/libdevice_other_operations.so.1.1.0-7.1.0-2.fc43.x86_64.debug: Unknown debugging section .debug_addr dwz: ./usr/lib64/libdevice_reduction_operations.so.1.1.0-7.1.0-2.fc43.x86_64.debug: Unknown debugging section .debug_addr dwz: ./usr/lib64/libutility.so.1.1.0-7.1.0-2.fc43.x86_64.debug: Unknown debugging section .debug_addr dwz: Too few files for multifile optimization sepdebugcrcfix: Updated 0 CRC32s, 5 CRC32s did match. 
Creating .debug symlinks for symlinks to ELF files
Copying sources found by 'debugedit -l' to /usr/src/debug/composable_kernel-7.1.0-2.fc43.x86_64
find-debuginfo: done
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-ldconfig
+ /usr/lib/rpm/brp-compress
+ /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip
+ /usr/lib/rpm/check-rpaths
+ /usr/lib/rpm/redhat/brp-mangle-shebangs
+ /usr/lib/rpm/brp-remove-la-files
+ /usr/lib/rpm/redhat/brp-python-rpm-in-distinfo
+ env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j4
+ /usr/lib/rpm/redhat/brp-python-hardlink
+ /usr/bin/add-determinism --brp -j4 /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT
Scanned 141 directories and 1232 files, processed 0 inodes, 0 modified (0 replaced + 0 rewritten), 0 unsupported format, 0 errors
Reading /builddir/build/BUILD/composable_kernel-7.1.0-build/SPECPARTS/rpm-debuginfo.specpart
Processing files: composable_kernel-7.1.0-2.fc43.x86_64
Executing(%doc): /bin/sh -e /var/tmp/rpm-tmp.SztDT7
+ umask 022
+ cd /builddir/build/BUILD/composable_kernel-7.1.0-build
+ cd composable_kernel-rocm-7.1.0
+ DOCDIR=/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composable_kernel
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export DOCDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composable_kernel
+ cp -pr /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/README.md /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composable_kernel
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.iOzMNy
+ umask 022
+ cd /builddir/build/BUILD/composable_kernel-7.1.0-build
+ cd composable_kernel-rocm-7.1.0
+ LICENSEDIR=/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/licenses/composable_kernel
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/licenses/composable_kernel
+ cp -pr /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/LICENSE /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/licenses/composable_kernel
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: composable_kernel = 7.1.0-2.fc43 composable_kernel(x86-64) = 7.1.0-2.fc43 libdevice_conv_operations.so.1()(64bit) libdevice_gemm_operations.so.1()(64bit) libdevice_other_operations.so.1()(64bit) libdevice_reduction_operations.so.1()(64bit) libutility.so.1()(64bit)
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libamdhip64.so.7()(64bit) libamdhip64.so.7(hip_4.2)(64bit) libamdhip64.so.7(hip_6.0)(64bit) libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_2.32)(64bit) libc.so.6(GLIBC_2.38)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.5)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.14)(64bit) libstdc++.so.6(GLIBCXX_3.4.15)(64bit) libstdc++.so.6(GLIBCXX_3.4.20)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.26)(64bit) libstdc++.so.6(GLIBCXX_3.4.29)(64bit) libstdc++.so.6(GLIBCXX_3.4.31)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) libstdc++.so.6(GLIBCXX_3.4.9)(64bit) rtld(GNU_HASH)
Processing files: composable_kernel-devel-7.1.0-2.fc43.x86_64
Provides: cmake(composable_kernel) = 1.1.0 composable_kernel-devel = 7.1.0-2.fc43 composable_kernel-devel(x86-64) = 7.1.0-2.fc43 composable_kernel-static = 7.1.0-2.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PartialHardlinkSets) <= 4.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: cmake-filesystem(x86-64) libdevice_conv_operations.so.1()(64bit) libdevice_gemm_operations.so.1()(64bit) libdevice_other_operations.so.1()(64bit) libdevice_reduction_operations.so.1()(64bit) libutility.so.1()(64bit)
Processing files: composable_kernel-debugsource-7.1.0-2.fc43.x86_64
Provides: composable_kernel-debugsource = 7.1.0-2.fc43 composable_kernel-debugsource(x86-64) = 7.1.0-2.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Processing files: composable_kernel-debuginfo-7.1.0-2.fc43.x86_64
Provides: composable_kernel-debuginfo = 7.1.0-2.fc43 composable_kernel-debuginfo(x86-64) = 7.1.0-2.fc43 debuginfo(build-id) = 51ab0f773549b169ee42dd2b1546388b5115e6ff debuginfo(build-id) = b02060f1b8211a6525bec0e79179219e421e3d02 debuginfo(build-id) = bbdf7d4fd2804a3b457bae408eb31df764781739 debuginfo(build-id) = cd8aca73f8c0d30294cb1865ff2a18f3ef3441f0 debuginfo(build-id) = db62cc9baaffec6b9d0032f1d6f7334a74d3d9fc libdevice_conv_operations.so.1.1.0-7.1.0-2.fc43.x86_64.debug()(64bit) libdevice_gemm_operations.so.1.1.0-7.1.0-2.fc43.x86_64.debug()(64bit) libdevice_other_operations.so.1.1.0-7.1.0-2.fc43.x86_64.debug()(64bit) libdevice_reduction_operations.so.1.1.0-7.1.0-2.fc43.x86_64.debug()(64bit) libutility.so.1.1.0-7.1.0-2.fc43.x86_64.debug()(64bit)
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: composable_kernel-debugsource(x86-64) = 7.1.0-2.fc43
Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT
Wrote: /builddir/build/RPMS/composable_kernel-debugsource-7.1.0-2.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/composable_kernel-devel-7.1.0-2.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/composable_kernel-7.1.0-2.fc43.x86_64.rpm
Wrote:
/builddir/build/RPMS/composable_kernel-debuginfo-7.1.0-2.fc43.x86_64.rpm
Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.kz0nIp
+ umask 022
+ cd /builddir/build/BUILD/composable_kernel-7.1.0-build
+ test -d /builddir/build/BUILD/composable_kernel-7.1.0-build
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/composable_kernel-7.1.0-build
+ rm -rf /builddir/build/BUILD/composable_kernel-7.1.0-build
+ RPM_EC=0
++ jobs -p
+ exit 0
Finish: rpmbuild composable_kernel-7.1.0-2.fc43.src.rpm
Finish: build phase for composable_kernel-7.1.0-2.fc43.src.rpm
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-43-x86_64-1763473447.786242/root/var/log/dnf5.log
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
INFO: Done(/var/lib/copr-rpmbuild/results/composable_kernel-7.1.0-2.fc43.src.rpm) Config(child) 1799 minutes 7 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
Finish: clean chroot
Finish: run
Running RPMResults tool
Package info:
{
    "packages": [
        {
            "name": "composable_kernel-devel",
            "epoch": null,
            "version": "7.1.0",
            "release": "2.fc43",
            "arch": "x86_64"
        },
        {
            "name": "composable_kernel",
            "epoch": null,
            "version": "7.1.0",
            "release": "2.fc43",
            "arch": "src"
        },
        {
            "name": "composable_kernel-debuginfo",
            "epoch": null,
            "version": "7.1.0",
            "release": "2.fc43",
            "arch": "x86_64"
        },
        {
            "name": "composable_kernel",
            "epoch": null,
            "version": "7.1.0",
            "release": "2.fc43",
            "arch": "x86_64"
        },
        {
            "name": "composable_kernel-debugsource",
            "epoch": null,
            "version": "7.1.0",
            "release": "2.fc43",
            "arch": "x86_64"
        }
    ]
}
RPMResults finished